Skip to content

Latest commit

 

History

History
87 lines (66 loc) · 2.74 KB

README.md

File metadata and controls

87 lines (66 loc) · 2.74 KB

AMPd-Up

De novo antimicrobial peptide sequence generation with a recurrent neural network

Dependencies

  • Python 3.6
  • PyTorch 1.7.1
  • Numpy
  • Pandas
  • Biopython

Installation

  1. Create a new conda environment:
conda create -n ampd-up python=3.6
  1. Activate the environment:
conda activate ampd-up
  1. Install AMPd-Up in the environment:
conda install -c bioconda ampd-up

AMPd-Up can now be run. See usage information below.

  1. To deactivate an active environment, use:
conda deactivate

Datasets

The training set (antibacterial sequences only) and known AMP sequences for sequence novelty analysis are stored in the data folder.

  • Training set: APD3_ABP_20190320.fa
  • Known AMP sequences: APD3_20220711.fa + DADP_mature_AMP_20181206.fa

Pre-trained models

The 1,000 model instances used to generate the peptide sequences presented in the publication can be accessed through the Zenodo repository. Users can either choose to use the pre-trained models or train their own models for sequence generation.

Sequence generation

Usage: AMPd-Up [-h] [-fm FROM_MODEL] -n NUM_SEQ [-sm SAVE_MODEL] [-od OUT_DIR] [-of {fasta,tsv}]

optional arguments:
  -h, --help            Show this help message and exit
  -fm FROM_MODEL, --from_model FROM_MODEL
                        Directory of the existing models; only specify this
                        argument if you want to sample from existing models
                        (optional)
  -n NUM_SEQ, --num_seq NUM_SEQ
                        Number of sequences to sample
  -sm SAVE_MODEL, --save_model SAVE_MODEL
                        Prefix of the models if you want to save them; only
                        specify this argument if you want to sample by
                        training new models (optional)
  -od OUT_DIR, --out_dir OUT_DIR
                        Output directory (optional)
  -of {fasta,tsv}, --out_format {fasta,tsv}
                        Output format, fasta or tsv (tsv by default, optional)

Examples:

  1. Sample sequences by training new models: AMPd-Up -n 100
  2. Sample sequences from existing models: AMPd-Up -fm ../models/ -n 100

Author

Chenkai Li ([email protected])

Contact

If you have any questions, comments, or would like to report a bug, please file a Github issue or contact us.

Citation

If you use AMPd-Up in your work, please cite our publication:

Li, C., Sutherland, D., Richter, A. et al. De novo synthetic antimicrobial peptide design with a recurrent neural network. Protein Science 33, e5088 (2024). https://doi.org/10.1002/pro.5088