Generative AI-assisted Drug Discovery Pipeline

Paper: https://www.biorxiv.org/content/10.1101/2024.12.07.627340v1

Table of Contents

Abstract
Files structure
Environments setup
Usage
Citing
References

Abstract

Drug repurposing presents a valuable strategy to expedite drug discovery by identifying new therapeutic uses for existing compounds, especially for diseases with limited treatment options. We propose a Generative AI-assisted Virtual Screening Pipeline that combines generative modeling, binding pocket prediction, and similarity-based searches within drug databases to achieve a generalizable and efficient approach to drug repurposing. Our pipeline enables blind screening of any protein target without requiring prior structural or functional knowledge, allowing it to adapt to a wide range of diseases, including emerging health threats and novel targets where information is scarce. By rapidly generating potential ligands and efficiently identifying and ranking drug candidates, our approach accelerates the drug discovery process, broadening the scope and impact of repurposing efforts and offering new possibilities for therapeutic development.

Overview of the Generative AI-assisted Drug Repurposing Pipeline. The pipeline consists of two phases: Phase 1 generates potential ligands using generative AI, and Phase 2 identifies promising drug candidates via similarity-based searches within drug databases.
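
As a rough illustration of Phase 2, the sketch below ranks database drugs by cosine similarity to a set of generated-ligand embeddings. It is not the repository's implementation: the toy embedding vectors, the `cosine_similarity` and `rank_drugs` helpers, and the "best match over all generated ligands" scoring rule are assumptions made for this example only. In the actual pipeline, embeddings presumably come from the learned encoders behind the search scripts.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_drugs(
    ligand_embeddings: List[Sequence[float]],
    drug_embeddings: Dict[str, Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score each drug by its best similarity to any generated ligand, highest first."""
    if not ligand_embeddings:
        raise ValueError("at least one ligand embedding is required")
    scores = []
    for drug_id, drug_vec in drug_embeddings.items():
        best = max(cosine_similarity(lig, drug_vec) for lig in ligand_embeddings)
        scores.append((drug_id, best))
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]


if __name__ == "__main__":
    # Toy 3-dimensional embeddings, invented for illustration only.
    ligands = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    drugs = {"DB00114": [0.9, 0.1, 0.0], "DB00116": [0.0, 0.0, 1.0]}
    for drug_id, score in rank_drugs(ligands, drugs, top_k=2):
        print(f"{drug_id}\t{score:.3f}")
```

Taking the maximum over generated ligands favors drugs that closely resemble at least one candidate; averaging over ligands would be an equally plausible choice.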

Files structure

Place the files in the following folder structure:

root
├── assets
│   ├── hiv
│   │   ├── generation
│   │   │   └── generation.csv
│   │   ├── generation_docking
│   │   │   └── ...
│   │   ├── preprocessed_data
│   │   │   └── ...
│   │   └── remove_water
│   │       ├── 2jle.pdb
│   │       └── 2jle.pdbqt
│   ├── covid19
│   │   ├── generation
│   │   │   └── generation.csv
│   │   └── ...
│   └── admet
│       ├── covid_preds.csv
│       └── hiv_preds.csv
├── datasets
│   ├── drugbank.csv
│   └── drugbank_conformation
│       ├── DB00114.sdf
│       ├── DB00116.sdf
│       └── ...
├── search_dgi
├── diffusion_generate
├── e3gnn_utils.py
├── e3gnn.py
├── equiformer.py
├── gat.py
├── pipeline_gnn.py
├── utils.py
└── README.md
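
Before running anything, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repository: the `missing_paths` helper and the `EXPECTED_PATHS` tuple are hypothetical names, and the list contains only the paths spelled out explicitly in the tree (entries shown as `...` are omitted). Some of these files may be produced by the pipeline rather than supplied up front, so a reported gap is not necessarily an error.

```python
from pathlib import Path
from typing import Iterable, List

# Paths copied from the folder tree; entries shown as "..." there are omitted.
EXPECTED_PATHS = (
    "assets/hiv/generation/generation.csv",
    "assets/hiv/remove_water/2jle.pdb",
    "assets/hiv/remove_water/2jle.pdbqt",
    "assets/covid19/generation/generation.csv",
    "assets/admet/covid_preds.csv",
    "assets/admet/hiv_preds.csv",
    "datasets/drugbank.csv",
    "datasets/drugbank_conformation",
)


def missing_paths(root: Path, expected: Iterable[str] = EXPECTED_PATHS) -> List[str]:
    """Return the expected relative paths that do not exist under root."""
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = missing_paths(Path("."))
    if missing:
        print("Missing paths:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("Layout looks complete.")
```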

Environments setup

Create the conda environment with the following command:

conda env create --name pipeline --file=pipeline.yml

Usage

Conformer generation:

cd datasets/
python generate_conformation.py
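
Once conformer generation has finished, a quick check of its output can catch compounds that were skipped. The sketch below assumes, based only on the folder tree, that conformers are stored as one `<DrugBank ID>.sdf` file per compound in `datasets/drugbank_conformation`; the helper names and the example ID list are hypothetical, and the behavior of `generate_conformation.py` itself is not described here.

```python
from pathlib import Path
from typing import Iterable, List, Set


def conformer_ids(conformer_dir: Path) -> Set[str]:
    """Collect IDs from file names such as DB00114.sdf -> 'DB00114'."""
    return {path.stem for path in conformer_dir.glob("*.sdf")}


def ids_without_conformer(drug_ids: Iterable[str], conformer_dir: Path) -> List[str]:
    """Return, in sorted order, the IDs that have no matching .sdf file."""
    return sorted(set(drug_ids) - conformer_ids(conformer_dir))


if __name__ == "__main__":
    # Assumed location relative to the repository root, taken from the folder tree.
    directory = Path("datasets/drugbank_conformation")
    # Hypothetical ID list; in practice it would be read from datasets/drugbank.csv.
    wanted = ["DB00114", "DB00116"]
    print("Found conformers:", len(conformer_ids(directory)))
    print("Missing:", ids_without_conformer(wanted, directory))
```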

Run the pipeline with the GNN-based search methods:
```
python pipeline_gnn.py 0
```
Run the GAT-based search:
```
python gat.py
```
Run the Equiformer-based search:
```
python equiformer.py
```
Run the E3GNN-based search:
```
python e3gnn.py
```
Predict ADMET properties:
```
bash admet.sh
```
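
For convenience, the commands above can be wrapped in a small driver. This is only a sketch under stated assumptions: `run_commands` and `USAGE_COMMANDS` are hypothetical names, and running every script once in the listed order is an assumption, since the GAT, Equiformer, and E3GNN scripts may be intended as alternatives rather than consecutive steps. The default is a dry run that prints the commands without executing them.

```python
import shlex
import subprocess
from typing import List, Sequence

# Commands copied verbatim from the Usage section.
USAGE_COMMANDS = (
    "python pipeline_gnn.py 0",
    "python gat.py",
    "python equiformer.py",
    "python e3gnn.py",
    "bash admet.sh",
)


def run_commands(
    commands: Sequence[str] = USAGE_COMMANDS, dry_run: bool = True
) -> List[List[str]]:
    """Split each command into arguments and, unless dry_run is set, execute it.

    Execution stops at the first failing command because check=True raises
    subprocess.CalledProcessError on a non-zero exit status.
    """
    parsed = [shlex.split(command) for command in commands]
    for args in parsed:
        print("$", " ".join(args))
        if not dry_run:
            subprocess.run(args, check=True)
    return parsed


if __name__ == "__main__":
    # Dry run by default: prints the commands without executing anything.
    run_commands()
```

Call `run_commands(dry_run=False)` from the repository root, inside the `pipeline` conda environment, to actually execute the steps.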

If our work is useful, please cite us!

@article {Pham2024.12.07.627340,
	author = {Pham, Phuc and Nguyen, Viet Thanh Duy and Cho, Kyu Hong and Hy, Truong Son},
	title = {Generative AI-assisted Virtual Screening Pipeline for Generalizable and Efficient Drug Repurposing},
	elocation-id = {2024.12.07.627340},
	year = {2024},
	doi = {10.1101/2024.12.07.627340},
	publisher = {Cold Spring Harbor Laboratory},
	abstract = {Drug repurposing presents a valuable strategy to expedite drug discovery by identifying new therapeutic uses for existing compounds, especially for diseases with limited treatment options. We propose a Generative AI-assisted Virtual Screening Pipeline that combines generative modeling, binding pocket prediction, and similarity-based searches within drug databases to achieve a generalizable and efficient approach to drug repurposing. Our pipeline enables blind screening of any protein target without requiring prior structural or functional knowledge, allowing it to adapt to a wide range of diseases, including emerging health threats and novel targets where information is scarce. By rapidly generating potential ligands and efficiently identifying and ranking drug candidates, our approach accelerates the drug discovery process, broadening the scope and impact of repurposing efforts and offering new possibilities for therapeutic development. Detailed results and implementation can be accessed at https://github.com/HySonLab/DrugPipe. Competing Interest Statement: The authors have declared no competing interest.},
	URL = {https://www.biorxiv.org/content/early/2024/12/11/2024.12.07.627340},
	eprint = {https://www.biorxiv.org/content/early/2024/12/11/2024.12.07.627340.full.pdf},
	journal = {bioRxiv}
}

References

@inproceedings{
velickovic2018deep,
title="{Deep Graph Infomax}",
author={Petar Veli{\v{c}}kovi{\'{c}} and William Fedus and William L. Hamilton and Pietro Li{\`{o}} and Yoshua Bengio and R Devon Hjelm},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=rklz9iAcKQ},
}

@misc{Gordić2020PyTorchGAT,
  author = {Gordić, Aleksa},
  title = {pytorch-GAT},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/gordicaleksa/pytorch-GAT}},
}

@inproceedings{
    liao2023equiformer,
    title={Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs},
    author={Yi-Lun Liao and Tess Smidt},
    booktitle={International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=KwmPfARgOTD}
}

@article{brandstetter2021geometric,
      title={Geometric and Physical Quantities improve E(3) Equivariant Message Passing},
      author={Johannes Brandstetter and Rob Hesselink and Elise van der Pol and Erik Bekkers and Max Welling},
      year={2021},
      eprint={2110.02905},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

@article{doi:10.1021/acs.jcim.1c00203,
author = {Eberhardt, Jerome and Santos-Martins, Diogo and Tillack, Andreas F. and Forli, Stefano},
title = {AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings},
journal = {Journal of Chemical Information and Modeling},
volume = {61},
number = {8},
pages = {3891-3898},
year = {2021},
doi = {10.1021/acs.jcim.1c00203},
    note ={PMID: 34278794},
}

@article{swanson2024admet,
  title={ADMET-AI: a machine learning ADMET platform for evaluation of large-scale chemical libraries},
  author={Swanson, Kyle and Walther, Parker and Leitz, Jeremy and Mukherjee, Souhrid and Wu, Joseph C and Shivnaraine, Rabindra V and Zou, James},
  journal={Bioinformatics},
  volume={40},
  number={7},
  pages={btae416},
  year={2024},
  publisher={Oxford University Press}
}
