- IEV2Mol: MAIN/model/
- SMILES: MAIN/data/Druglike_million_canonical_no_dot_dup.smi
  - This file is zipped. To use it, run the following in MAIN/data:
    unzip -j Druglike_million_canonical_no_dot_dup.smi.zip
- SMILES, grid file, interaction file: MAIN/data/{protein}/
- IFP: IFP-RNN/MAIN/{protein}/
- SMILES: MAIN/data/chembl_33_no_dot.smi
  - This file is zipped. To use it, run the following in MAIN/data:
    unzip -j chembl_33_no_dot.smi.zip
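Where the unzip command is not available, the same flattening extraction (what the -j option does) can be sketched in Python with the standard library. The helper name unzip_flat is hypothetical and not part of this repository.

```python
import shutil
import zipfile
from pathlib import Path


def unzip_flat(zip_path, dest_dir):
    """Extract every file in zip_path directly into dest_dir.

    Directory components stored in the archive are dropped, which mirrors
    the behaviour of `unzip -j`. Returns the list of written paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries, only files are extracted
            target = dest / Path(info.filename).name
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

For example, unzip_flat("MAIN/data/chembl_33_no_dot.smi.zip", "MAIN/data") would be the counterpart of the chembl_33 command listed above, assuming the archive sits in MAIN/data.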
conda env create -f=iev_vae_env.yml
If you want to train IFP-RNN, you also need to create additional environments by running the commands below, and install mgltools and other dependencies. See the README in the IFP-RNN directory.
conda env create -f=IFP-RNN.yml
conda env create -f=IFP-RNN_py3_7.yml
conda env create -f=vina.yml
cd MAIN/model
python train_iev2mol.py
- Making training data from SMILES
cd JTVAE/JTVAE/FastJTNNpy3/fast_molvae
sh preprocess_drd2_no_dot.sh
- Training
cd MAIN/model
python train_jtvae.py
- Calculate IFP from SDF
cd IFP-RNN/MAIN_sdf
python ../AIFP/prepare_sdf_glide.py --sdf drd2_all_smiles_MAIN_HTVS_pv.sdf --work_dir ./
python ../AIFP/create_reference_sdf.py --config config_ifp.txt --protein 6cm4.pdb --sdf drd2_all_smiles_MAIN_HTVS_pv_prepared.sdf --n_jobs 50
python ../AIFP/create_IFP_sdf.py --config config_ifp.txt --sdf drd2_all_smiles_MAIN_HTVS_pv_prepared.sdf --n_jobs 50
python split_IFP_ResIFP.py
python ../AIFP/prepare_ddc_input_sdf.py --dataset IFP_ResIFP_train.csv --info Tmp_drd2_all_smiles_MAIN_HTVS_pv_prepared/info.csv --type aifp
python ../AIFP/prepare_ddc_input_sdf.py --dataset IFP_ResIFP_test.csv --info Tmp_drd2_all_smiles_MAIN_HTVS_pv_prepared/info.csv --type aifp
- Training
python ../train_ddc.py --train_csv IFP_ResIFP_train_AIFPsmi.csv --load_pkl 0 --save ./results/
- Results are saved in
MAIN/evaluate_model/results/{protein}/test{i}/raw_csv/iev2mol.csv
cd MAIN/evaluate_model/iev2mol
python make_csv.py
- Results are saved in
MAIN/evaluate_model/results/{protein}/test{i}/raw_csv/jt-vae.csv
cd MAIN/evaluate_model/jt-vae
python make_csv.py
- Generate
cd IFP-RNN/MAIN
python test_model.py --model results/fullBits--80--0.1731--0.0010000 --IFP IFP_ResIFP_test_AIFPsmi.csv --save generated1000smi
Before executing the following command, separate the compounds generated above and save them as MAIN/evaluate_model/ifp-rnn/test{i}/CHEMBL4467359_0.smi.
- Evaluate (results are saved in MAIN/evaluate_model/results/{protein}/test{i}/raw_csv/ifp-rnn.csv)
cd MAIN/evaluate_model/ifp-rnn
python make_csv.py
- Results are saved in
MAIN/evaluate_model/results/{protein}/test{i}/mol_from_rawcsv/
cd MAIN/evaluate_model
python make_figs_from_rawcsv.py
Draw the compounds that satisfy the Tanimoto coefficient and IEV cosine similarity thresholds, together with their distributions.
- Save images of all compounds that meet the thresholds, plus the one compound that meets the Tanimoto similarity threshold and has the highest IEV cosine similarity. Results are saved in
MAIN/evaluate_model/results/{protein}/test{i}/tanimoto{threshold}_ievcos{threshold}/
- The number of compounds that meet the thresholds for each test data point, and the average across the test data, are printed to standard output.
cd MAIN/evaluate_model
python find_high_ievcos_row_tanimoto.py
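The selection described above can be sketched as follows. This is a minimal illustration, not the code in find_high_ievcos_row_tanimoto.py. It assumes that fingerprints are given as sets of on-bit indices, that IEVs are plain numeric vectors, and that a compound "meets the thresholds" when its Tanimoto coefficient to the seed is at most tanimoto_max and its IEV cosine similarity is at least ievcos_min. The direction of the Tanimoto threshold and all function names are assumptions.

```python
import math


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_hits(seed_fp, seed_iev, candidates, tanimoto_max, ievcos_min):
    """Return (hits, best) for candidates given as (name, fingerprint, iev) tuples.

    hits: names meeting both thresholds (assumed directions, see above).
    best: name with the highest IEV cosine similarity among candidates that
          meet the Tanimoto threshold, or None if there is no such candidate.
    """
    hits, best, best_cos = [], None, -1.0
    for name, fp, iev in candidates:
        if tanimoto(seed_fp, fp) > tanimoto_max:
            continue  # too similar in structure to the seed
        cos = cosine_similarity(seed_iev, iev)
        if cos > best_cos:
            best, best_cos = name, cos
        if cos >= ievcos_min:
            hits.append(name)
    return hits, best
```

Counting len(hits) per test data point and averaging over the test data would give the numbers that the script reports on standard output.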
- Results are saved in
MAIN/evaluate_model/results/{protein}/test{i}/chemicalspace.jpeg
- Plot kernel density estimates of ECFP4 reduced to two dimensions by PCA for 10000 randomly selected compounds from the DM-QP-1M dataset and compounds from the DRD2 Active dataset.
- Each test data point and the 100 data points generated by IEV2Mol using it as a seed are plotted over the chemical space above.
cd MAIN/evaluate_model
python plot_chemicalspace.py
Plot distributions of IEV cosine similarity to the seed compound, Tanimoto coefficient, and docking score.
- Results for each of the 10 test data points are saved in
MAIN/evaluate_model/results/{protein}/test{i}/
- Results of all test data are saved in
MAIN/evaluate_model/results/{protein}/
cd MAIN/evaluate_model
python plot_density_graph.py
Calculate validity, uniqueness, diversity, the number of dockable compounds, and the number of compounds with IEV cosine similarity ≥ 0.7, and print them to standard output.
cd MAIN/evaluate_model
python make_metrics_from_raw_csv.py
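For orientation, here is a minimal sketch of how such metrics are commonly defined. It is not taken from make_metrics_from_raw_csv.py, and the definitions are assumptions: validity is the fraction of generated strings that parsed, uniqueness is the fraction of valid compounds that are distinct, and diversity is one minus the mean pairwise Tanimoto coefficient over the distinct valid compounds. Invalid compounds are represented as None, and fingerprints are supplied as sets of on-bit indices, both purely for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def generation_metrics(generated, fingerprints):
    """Compute (validity, uniqueness, diversity) under the assumed definitions.

    generated:    list of canonical SMILES strings, None for invalid outputs.
    fingerprints: dict mapping each valid SMILES to a set of on-bit indices.
    """
    valid = [s for s in generated if s is not None]
    unique = sorted(set(valid))
    validity = len(valid) / len(generated) if generated else 0.0
    uniqueness = len(unique) / len(valid) if valid else 0.0
    pairs = list(combinations(unique, 2))
    if pairs:
        mean_sim = sum(
            tanimoto(fingerprints[a], fingerprints[b]) for a, b in pairs
        ) / len(pairs)
        diversity = 1.0 - mean_sim
    else:
        diversity = 0.0  # fewer than two distinct compounds, nothing to compare
    return validity, uniqueness, diversity


def count_at_least(values, threshold=0.7):
    """Count values at or above a threshold, e.g. IEV cosine similarity >= 0.7."""
    return sum(1 for v in values if v >= threshold)
```

In practice the validity check and the fingerprints would come from a cheminformatics toolkit; they are passed in here as plain data so the sketch stays self-contained.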