CodonShuffle3

Description

This package is an updated version of the CodonShuffle tool first described by Jorge D M M, Mills R E, and Lauring A S (2015). The updates ensure compatibility with Python 3 and current computing environments while preserving the core functionality: generating and analyzing permuted sequences of an open reading frame from a viral genome.

This updated package can be used to generate permuted sequences by shuffling bases in a way that preserves the protein coding sequence. The resulting sequences contain a large number of synonymous substitutions and may differ in various sequence-determined features (e.g., dinucleotide frequency or free energy of RNA folding). Additional scripts are then used to quantify these differences relative to the unpermuted sequence, and a least squares method is applied to identify permuted sequences that are most similar to the "wild type."
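The idea of shuffling while preserving the protein sequence can be illustrated with a minimal sketch. It is not the package's n3, dn23, dn31, or dn231 algorithm: it simply swaps codons among positions that encode the same amino acid, which keeps both the translation and the overall codon usage unchanged. The function names (`translate`, `shuffle_synonymous`) and the toy sequence are hypothetical and chosen only for illustration.

```python
import random
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an in-frame DNA sequence into a one-letter protein string."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def shuffle_synonymous(seq, rng):
    """Shuffle codons among positions that encode the same amino acid.

    Illustrative only: the protein sequence and codon counts are preserved,
    but this is NOT one of the package's permutation scripts.
    """
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]

    # Group codon positions by the amino acid they encode.
    positions_by_aa = {}
    for index, codon in enumerate(codons):
        positions_by_aa.setdefault(CODON_TABLE[codon], []).append(index)

    # Within each group, shuffle the codons and write them back.
    shuffled = list(codons)
    for positions in positions_by_aa.values():
        pool = [codons[i] for i in positions]
        rng.shuffle(pool)
        for position, codon in zip(positions, pool):
            shuffled[position] = codon
    return "".join(shuffled)


# Toy open reading frame (hypothetical, not taken from any genome).
wild_type = "ATGGCTGCAGCGGCCTTATTGCTGTAA"
permuted = shuffle_synonymous(wild_type, random.Random(0))
print(permuted)
print(translate(permuted) == translate(wild_type))
```

Passing a seeded `random.Random` instance makes the permutation reproducible, which mirrors the purpose of a random seed option.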

For more information, or to cite the original CodonShuffle package, see:

Jorge D M M, Mills R E, Lauring A S. CodonShuffle: a tool for generating and analyzing synonymously mutated sequences. Virus Evolution, 2015, 1(1): vev012.

Instructions

1. Download the files and folders.
2. Check the software requirements.
3. Install dependencies by running the install_dependencies.py script.
4. Run the Python file (CodonShuffle.py).
5. Select an in-frame sequence file (input_file.fas).
6. Select a single permutation script (n3, dn23, dn31, or dn231).
7. Select the desired number of permuted sequences.
8. Select the genomic feature(s) to be evaluated.
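Before running the tool, it can help to confirm that the input really is an in-frame open reading frame. The sketch below is a hypothetical pre-check, not part of the package: `parse_fasta_text` and `check_orf` are illustrative names, and treating an internal stop codon as a problem is an assumption about what a sensible input looks like rather than a documented requirement.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_fasta_text(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def check_orf(seq):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if set(seq) - set("ACGT"):
        problems.append("sequence contains characters other than A, C, G, T")
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3 (not in frame)")
    if not seq.startswith("ATG"):
        problems.append("sequence does not begin with ATG")
    # Assumption: a stop codon before the final codon is treated as a problem.
    full_codons = len(seq) // 3
    internal = [seq[i * 3:i * 3 + 3] for i in range(full_codons - 1)]
    if any(codon in STOP_CODONS for codon in internal):
        problems.append("internal stop codon found")
    return problems


# Example with a toy record (hypothetical sequence).
header, sequence = parse_fasta_text(">demo\nATGGCT\nTAA\n")
print(header, check_orf(sequence))
```

To check a real file, read its contents with `open("input_file.fas").read()` and pass the text to `parse_fasta_text`.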

CodonShuffle.py

Script to shuffle nucleotides and evaluate genomic features

Input Commands:

-i Input
FASTA file (input_file.fas) containing an open reading frame that begins with ATG
-s Permutation script
Permutation method to use; choose one of n3, dn23, dn31, or dn231
-r Number of replicates
Number of permuted sequences to generate (default: 1000)
-m Genomic feature(s) to be used
Genomic features used in the final least squares distance calculation (default: all). To use a subset, list the features separated by spaces: CAI, ENC, VFOLD, UFOLD, DN, CPB. If RNA folding is included, this option also selects the folding algorithm (UFOLD for UNAfold, VFOLD for ViennaRNA).
-g Graphics
Generates graphs of distributions of values for all genomic features
--seed Random seed
Allows setting a random seed for reproducibility.
-h Help
Displays help menu
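The options above could be declared with `argparse` roughly as follows. This is a sketch of an equivalent interface, not the script's actual source: the `dest` names and the use of `None` to mean "all features" are assumptions made for illustration, while the flags and permitted values are taken from the list above.

```python
import argparse

PERMUTATION_SCRIPTS = ["n3", "dn23", "dn31", "dn231"]
FEATURES = ["CAI", "ENC", "VFOLD", "UFOLD", "DN", "CPB"]


def build_parser():
    """Build a parser mirroring the documented CodonShuffle.py options."""
    parser = argparse.ArgumentParser(
        description="Shuffle nucleotides and evaluate genomic features"
    )
    parser.add_argument("-i", dest="input_file", required=True,
                        help="FASTA file of an open reading frame beginning with ATG")
    parser.add_argument("-s", dest="script", required=True,
                        choices=PERMUTATION_SCRIPTS,
                        help="permutation method to use")
    parser.add_argument("-r", dest="replicates", type=int, default=1000,
                        help="number of permuted sequences to generate")
    # Assumption: None stands for "use all features".
    parser.add_argument("-m", dest="features", nargs="*", choices=FEATURES,
                        default=None,
                        help="genomic features for the least squares distance")
    parser.add_argument("-g", dest="graphics", action="store_true",
                        help="generate distribution graphs")
    parser.add_argument("--seed", type=int, default=None,
                        help="random seed for reproducibility")
    return parser


# Parse the same options used in the example invocation of this README.
args = build_parser().parse_args(
    ["-i", "Poliovirus_1_Mahoney_P1.fas", "-s", "dn23",
     "-r", "100", "-m", "CAI", "ENC", "CPB", "-g"]
)
print(args)
```

Because `-h` is added automatically by `argparse`, the help menu needs no explicit declaration in this sketch.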

Outputs:

Graphs of distribution of values for genomic features of permuted sequences. Each file will include the input sequence and permutation algorithm in its title, with suffixes such as:
- _dn.pdf for dinucleotide frequency graphs
- _dnls.pdf for overall dinucleotide bias least squares graphs
- fas.hamming.pdf for Hamming distance graphs
- .fold.pdf for RNA folding free energy graphs
- .out.enc.pdf for effective number of codons (ENC) graphs
- .cai.pdf for codon adaptation index (CAI) graphs
- .cpb.pdf for codon pair bias (CPB) graphs
A graph showing Hamming distance versus least squares distance: fasfinal_graph.pdf
A table (_final_table.txt) with Hamming distance, genomic feature values, and aggregate least squares distance. Headers include:
- Sequence number (with input sequence as 0)
- Distance.ls (overall least squares distance)
- Nucleotide_difference
- CPB (codon pair bias)
- DN_least_square (aggregate dinucleotide least squares value)
- Individual dinucleotide frequencies (e.g., DN..AA.)
- VFOLD.mfe (minimum free energy from ViennaRNA)
- ENC (effective number of codons)
- CAI (codon adaptation index)
A multi-sequence FASTA file (.fas) with all permuted sequences labeled (e.g., replicate_1)
Additional intermediate output files during the CodonShuffle run:
- _least_square.txt
- .blk
- .cpb
- .dn
- .dnls
- fas_distance_table.txt
- .fasfold_table_mfe.txt
- .out
- new_table_final_graph
- .cai
- .fold

Example Usage:

$ python CodonShuffle.py -i Poliovirus_1_Mahoney_P1.fas -s dn23 -r 100 -m CAI ENC CPB -g
