mcloud
Featurize SMILES strings, perform dimensionality reduction using UMAP and save
the resulting chemical point cloud. Mordred descriptors that result in errors
or otherwise non-numeric values are dropped with drop_non_numeric_columns.
path : str
The path to the file containing SMILES strings. Should include file extension.
path_out : str, default='cloud_out'
The output file name for the resulting chemical point cloud. Default is 'cloud_out'.
sep : str, default=','
The separator used in the file containing SMILES strings. Default is ','.
position : int, default=0
The index of the column containing the SMILES strings. Default is 0.
exl : bool, default=False
Flag indicating whether the input file is an Excel sheet.
head : bool, default=False
Indicates if there is a header in the input SMILES file.
cloud : numpy.ndarray of shape (N, 3)
Coordinates for the resulting chemical point cloud (UMAP projection).
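The way sep, position and head select the SMILES column can be sketched as follows. This is a minimal illustration, not the package's own reader: read_smiles is a hypothetical helper name, and reading the file with pandas is an assumption. The featurization and UMAP steps are left out.

```python
import io

import pandas as pd


def read_smiles(path, sep=",", position=0, exl=False, head=False):
    """Hypothetical helper: return the SMILES column as a list of strings."""
    # head=True treats the first row as a header; otherwise every row is data.
    header = 0 if head else None
    if exl:
        # Excel input (assumes an Excel engine such as openpyxl is installed).
        table = pd.read_excel(path, header=header)
    else:
        table = pd.read_csv(path, sep=sep, header=header)
    # position is the zero-based index of the column holding the SMILES.
    return table.iloc[:, position].astype(str).tolist()


# In-memory stand-in for a two-column, header-less CSV file.
demo = io.StringIO("CCO,ethanol\nc1ccccc1,benzene\n")
smiles = read_smiles(demo, sep=",", position=0, head=False)
```

With a header row present, passing head=True keeps the column title out of the returned SMILES list.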
Visualize the chemical point cloud in a 3D scatter plot.
f_out : str
The output file name or path (excluding file extension) for the plot.
points : numpy.ndarray of shape (N, 3)
The array of points representing the chemical point cloud.
save : bool, default=False
Flag indicating whether to save the plot. Default is False.
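A 3D scatter plot of an (N, 3) array could be drawn along these lines. This is a sketch under assumptions: plot_cloud is a hypothetical name, matplotlib is assumed as the plotting library, and the '.png' extension and axis labels are illustrative choices rather than the package's actual behavior.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np


def plot_cloud(points, f_out="cloud_plot", save=False):
    """Hypothetical sketch: scatter an (N, 3) point cloud in 3D."""
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.scatter(points[:, 0], points[:, 1], points[:, 2], s=5)
    ax.set_xlabel("UMAP 1")
    ax.set_ylabel("UMAP 2")
    ax.set_zlabel("UMAP 3")
    if save:
        # f_out excludes the extension; '.png' is an assumption made here.
        fig.savefig(f"{f_out}.png")
    return fig, ax


# Random coordinates standing in for a real UMAP projection.
rng = np.random.default_rng(0)
fig, ax = plot_cloud(rng.normal(size=(50, 3)), save=False)
```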
Save the chemical point cloud with corresponding SMILES strings to a human-readable file.
smiles_loc : str
The path/name of the file containing the SMILES to be saved.
f_out : str
The path/name of the file to be saved.
points : str, default='chem_cloud.npy'
The file path/name for the chemical point cloud.
sep : str, default=','
The delimiter to be used when parsing the file with SMILES.
position : int, default=0
The index location to be used when reading SMILES.
indx : str or None, default=None
The .npy file path/name of the downsampling indices. If the
'-d' or '--down' flags were not used on the command line, or
if otherwise left as None, then the full point cloud is saved.
exl : bool, default=False
Flag indicating whether the input file is an Excel sheet. If set
to True, the resulting output file will also be an Excel sheet.
heady : bool, default=False
Flag to indicate if the SMILES file contains a header line or not.
f_out : pd.DataFrame of shape (points, 4)
Dataframe to be saved containing SMILES and 3-D UMAP embeddings of
a chemical point cloud.
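The shape of the saved table (one SMILES column plus three coordinates) can be illustrated with a short sketch. build_cloud_table is a hypothetical name, the column names are assumptions, and applying the downsampling indices to both the SMILES and the coordinates is an assumed reading of indx, not confirmed behavior.

```python
import numpy as np
import pandas as pd


def build_cloud_table(smiles, points, indx=None):
    """Hypothetical sketch: pair SMILES with their 3-D coordinates."""
    table = pd.DataFrame(
        {
            "SMILES": list(smiles),
            "x": points[:, 0],
            "y": points[:, 1],
            "z": points[:, 2],
        }
    )
    if indx is not None:
        # Assumption: indx holds row positions kept after downsampling.
        table = table.iloc[indx].reset_index(drop=True)
    return table


points = np.arange(12, dtype=float).reshape(4, 3)
smiles = ["C", "CC", "CCC", "CCCC"]
full = build_cloud_table(smiles, points)  # indx=None keeps every point
down = build_cloud_table(smiles, points, indx=np.array([0, 2]))
```

The resulting frame could then be written with DataFrame.to_csv, or DataFrame.to_excel when exl is True.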
Drop non-numeric columns from a DataFrame.
df : pd.DataFrame
The input DataFrame.
df_out : pd.DataFrame
The modified DataFrame with non-numeric columns dropped.
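One way such a filter could work is sketched below. It is not the package's implementation: the function name carries a _sketch suffix to mark it as hypothetical, and coercing with pandas.to_numeric and then dropping any column that contains a failed conversion is an assumed approach.

```python
import pandas as pd


def drop_non_numeric_columns_sketch(df):
    """Hypothetical sketch: keep only columns whose values are all numeric."""
    # Values that cannot be parsed as numbers become NaN.
    numeric = df.apply(pd.to_numeric, errors="coerce")
    # Keep a column only if no value in it turned into NaN.
    keep = numeric.notna().all(axis=0)
    return numeric.loc[:, keep]


# Toy descriptor table; the "ABC" column mimics a failed descriptor calculation.
df = pd.DataFrame(
    {
        "MW": [46.07, 78.11],
        "nAcid": [0, 0],
        "ABC": ["error", 4.2],
    }
)
df_out = drop_non_numeric_columns_sketch(df)
```

Note that this sketch also drops columns that already contained NaN before coercion, which may or may not match the original function.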