genomedownload.py
is a Python script that can be used to download any file, but it was written specifically to download
genome data files from the EMBL-EBI FTP server. It takes either a single URL or a file containing a
list of URLs to download.
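As a rough illustration of what downloading a single URL into an output directory involves, here is a minimal sketch using only the standard library. The helper name `download` and the use of `urllib.request.urlretrieve` are assumptions made for this example; the script itself may rely on the packages listed in requirements.txt instead.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def download(url, output_dir="."):
    """Download `url` into `output_dir` and return the local path.

    Hypothetical helper for illustration; not taken from genomedownload.py.
    """
    os.makedirs(output_dir, exist_ok=True)
    # Name the local file after the last path component of the URL.
    filename = os.path.basename(urlparse(url).path)
    if not filename:
        raise ValueError(f"URL has no file name: {url}")
    destination = os.path.join(output_dir, filename)
    # urlretrieve handles ftp://, http(s):// and file:// URLs.
    urlretrieve(url, destination)
    return destination
```

Calling `download(url, "output/")` with one of the example URLs in the usage section would place `ERR752938_1.fastq.gz` under `output/`.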
Note: This script requires Python 3 or later. It was cowboyed together and has no tests. It is intended as both a starting point for future development and a useful example.
The following instructions assume you're on a Unix-like system with Python 3 or later.
- Clone the repository:
git clone [email protected]:BigCheeze45/genomedownloader.git
- Create and activate a Python virtual environment:
python3 -m venv env
source /path/to/env/bin/activate
- Install the required packages:
pip install -r /path/to/genomedownloader/requirements.txt
Once installation is complete the script is ready to use!
genomedownload.py
takes a single URL or a list of URLs of genome data files to download. You can also provide an output directory to place the downloaded data.
If using a file make sure each URL is on its own line.
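To make the expected file format concrete, the following sketch reads a URL list with one URL per line. The function name `read_url_list` and the decision to skip blank lines are assumptions for illustration, not necessarily how genomedownload.py parses the file.

```python
from pathlib import Path


def read_url_list(path):
    """Return the URLs in a plain text file, one per line.

    Hypothetical helper for illustration; not taken from genomedownload.py.
    """
    urls = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Assumption: blank lines are ignored rather than treated as errors.
        if line:
            urls.append(line)
    return urls
```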
# Single URL, no output folder specified
python genomedownload.py --url ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR752/ERR752938/ERR752938_1.fastq.gz
# Single URL with output folder
python genomedownload.py --url ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR752/ERR752938/ERR752938_1.fastq.gz -o output/
# List of URLs with output folder
python genomedownload.py -f yeastDNAlinks.txt -o output/
# Complete usage guide
usage: genomedownload.py [-h] [--url URL | -f FILE] [-o OUTPUT]

Tool to download genome data files from EMBL-EBI

optional arguments:
  -h, --help            show this help message and exit
  --url URL             The absolute URL to the genome data you want
                        downloaded
  -f FILE, --file FILE  Path to a plain text file containing full URLs to the
                        genome data you want downloaded
  -o OUTPUT, --output OUTPUT
                        Location you want to store the downloaded genome data.
                        Default to current working directory
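The usage text above can be reproduced with an `argparse` parser along these lines. This is a sketch reconstructed from the help output, so the function name `build_parser`, the use of a mutually exclusive group, and `os.getcwd()` as the default are assumptions; the actual script may be organized differently.

```python
import argparse
import os


def build_parser():
    """Build a parser matching the usage text (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="genomedownload.py",
        description="Tool to download genome data files from EMBL-EBI",
    )
    # "[--url URL | -f FILE]" in the usage line implies the two are exclusive.
    source = parser.add_mutually_exclusive_group()
    source.add_argument(
        "--url",
        help="The absolute URL to the genome data you want downloaded",
    )
    source.add_argument(
        "-f", "--file",
        help="Path to a plain text file containing full URLs to the "
             "genome data you want downloaded",
    )
    parser.add_argument(
        "-o", "--output",
        default=os.getcwd(),  # assumption: the default is resolved at startup
        help="Location you want to store the downloaded genome data. "
             "Default to current working directory",
    )
    return parser
```

With this layout, passing both `--url` and `-f` makes `argparse` exit with an error, which matches the `|` in the usage line.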
- Publish to PyPI for easier end-user installation
- Progress bar
- Refine logging control
Submit a pull request with your changes. Open an issue if you find any bugs!