Reference guided assembly with reference selection
Victoria Cepeda edited this page Sep 1, 2018 · 1 revision
Download and extract the metagenomic sample:
ftp://public-ftp.hmpdacc.org/Illumina/posterior_fornix/SRS044742.tar.bz2
SRS044742/
SRS044742.denovo_duplicates_marked.trimmed.1.fastq
SRS044742.denovo_duplicates_marked.trimmed.2.fastq
SRS044742.denovo_duplicates_marked.trimmed.singleton.fastq
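The download and extraction step can also be scripted. The following is a minimal sketch, assuming the FTP URL above is still reachable and that the archive is a bzip2-compressed tarball; the helper names `download` and `extract_fastq` are hypothetical and not part of MetaCompass.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the step above; availability is not guaranteed.
SAMPLE_URL = "ftp://public-ftp.hmpdacc.org/Illumina/posterior_fornix/SRS044742.tar.bz2"


def download(url, dest):
    """Fetch url to dest unless the file is already present (hypothetical helper)."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_fastq(archive, out_dir):
    """Unpack a .tar.bz2 archive and return the sorted FASTQ member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(str(out_dir))
    return sorted(n for n in names if n.endswith(".fastq"))
```

A typical call would be `extract_fastq(download(SAMPLE_URL, "SRS044742.tar.bz2"), ".")`, which should leave the three FASTQ files listed above under `SRS044742/`.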
Run:
python3 go_metacompass.py -P SRS044742/SRS044742.denovo_duplicates_marked.trimmed.1.fastq,SRS044742/SRS044742.denovo_duplicates_marked.trimmed.2.fastq -U SRS044742/SRS044742.denovo_duplicates_marked.trimmed.singleton.fastq -o SRS044742_2018 -k
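When many samples need to be processed, the command above can be assembled programmatically. This is a sketch only: the function `metacompass_command` is a hypothetical helper, and it uses just the flags that appear in the command above (`-P`, `-U`, `-o`, `-k`).

```python
import shlex


def metacompass_command(paired, unpaired, out_dir, keep_intermediate=True,
                        script="go_metacompass.py"):
    """Build the argument list for the run shown above (hypothetical helper)."""
    cmd = ["python3", script,
           "-P", ",".join(paired),   # paired reads, comma-separated
           "-U", unpaired,           # unpaired (singleton) reads
           "-o", out_dir]            # output folder
    if keep_intermediate:
        cmd.append("-k")             # keep intermediate files
    return cmd


prefix = "SRS044742/SRS044742.denovo_duplicates_marked.trimmed"
cmd = metacompass_command(
    [prefix + ".1.fastq", prefix + ".2.fastq"],
    prefix + ".singleton.fastq",
    "SRS044742_2018",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the MetaCompass directory to start the run.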
Notice that this time we added a new parameter, "-k". This parameter keeps the intermediate files in the final output. You will see messages like the following while MetaCompass runs:
/cbcb/software/Linux-x86_64/packages/ncbi-blast-2.4.0+/bin/blastn
/cbcb/project2-scratch/treangen/kmer/kmer-mask
/cbcb/sw/RedHat-7-x86_64/users/treangen/local/mash/1.1.1/bin/mash
/cbcb/sw/RedHat-7-x86_64/common/local/Python3/common/3.6.0/bin/snakemake
Provided cores: 12
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
1 assemble_unmapped
1 bam_sort
1 bowtie2_map
1 build_contigs
1 create_tsv
1 fastq2fasta
1 join_contigs
1 kmer_mask
1 merge_reads
1 pilon_contigs
1 pilon_map
1 reference_recruitment
1 sam_to_bam
14
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
merge_reads
Selected jobs (1):
merge_reads
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
---merge fastq reads
Reason: Missing output files: SRS044742_2018/SRS044742.merged.fq
Releasing 1 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 12.
1 of 14 steps (7%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
kmer_mask
Selected jobs (1):
kmer_mask
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---kmer-mask fastq
Reason: Missing output files: SRS044742_2018/SRS044742.marker.match.1.fastq; Input files updated by another job: SRS044742_2018/SRS044742.merged.fq
Releasing 12 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 13.
2 of 14 steps (14%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
fastq2fasta
Selected jobs (1):
fastq2fasta
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
---Converting fastq to fasta.
Reason: Missing output files: SRS044742_2018/SRS044742.fasta; Input files updated by another job: SRS044742_2018/SRS044742.marker.match.1.fastq
Releasing 1 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 11.
3 of 14 steps (21%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
reference_recruitment
Selected jobs (1):
reference_recruitment
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---reference recruitment.
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/mc.refseq.fna; Input files updated by another job: SRS044742_2018/SRS044742.fasta
Releasing 12 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 8.
4 of 14 steps (29%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
bowtie2_map
Selected jobs (1):
bowtie2_map
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---Build index .
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/SRS044742.sam; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/mc.refseq.fna, SRS044742_2018/SRS044742.merged.fq
Releasing 12 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 10.
5 of 14 steps (36%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
build_contigs
Selected jobs (1):
build_contigs
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
---Build contigs .
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/contigs.fasta; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/mc.refseq.fna, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.sam
Skipped removing non-empty directory SRS044742_2018/SRS044742.0.assembly.out
Releasing 1 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 6.
6 of 14 steps (43%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
pilon_map
Selected jobs (1):
pilon_map
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---Map reads for pilon polishing.
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam.unmapped.2.fq, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc_unpaired.sam, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam.unmapped.1.fq; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/contigs.fasta
Releasing 12 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 7.
7 of 14 steps (50%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (2):
sam_to_bam
assemble_unmapped
Selected jobs (1):
sam_to_bam
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
---Convert sam to bam .
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam.bam; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc_unpaired.sam, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam
Releasing 1 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 9.
8 of 14 steps (57%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (2):
bam_sort
assemble_unmapped
Selected jobs (1):
bam_sort
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---Sort bam .
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/sorted.bam, SRS044742_2018/SRS044742.0.assembly.out/sorted2.bam; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam.bam
Releasing 12 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 5.
9 of 14 steps (64%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (2):
pilon_contigs
assemble_unmapped
Selected jobs (1):
assemble_unmapped
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---Assemble unmapped reads .
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/SRS044742.megahit/final.contigs.fa; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam.unmapped.2.fq, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.mc.sam.unmapped.1.fq
Releasing 12 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 3.
10 of 14 steps (71%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
pilon_contigs
Selected jobs (1):
pilon_contigs
Resources after job selection: {'_cores': 0, '_nodes': 9223372036854775806}
---Pilon polish contigs .
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/contigs.pilon.fasta; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/sorted.bam, SRS044742_2018/SRS044742.0.assembly.out/sorted2.bam, SRS044742_2018/SRS044742.0.assembly.out/contigs.fasta
Releasing 12 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 2.
11 of 14 steps (79%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
join_contigs
Selected jobs (1):
join_contigs
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
---concanenate reference-guided and de novo contigs
Reason: Missing output files: SRS044742_2018/SRS044742.0.assembly.out/contigs.final.fasta; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/contigs.pilon.fasta, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.megahit/final.contigs.fa
Releasing 1 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 0.
12 of 14 steps (86%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
create_tsv
Selected jobs (1):
create_tsv
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
---information reference-guided and de novo contigs
Reason: Missing output files: SRS044742_2018/metacompass_summary.tsv; Input files updated by another job: SRS044742_2018/SRS044742.0.assembly.out/mc.refseq.fna, SRS044742_2018/SRS044742.0.assembly.out/SRS044742.megahit/final.contigs.fa, SRS044742_2018/SRS044742.0.assembly.out/contigs.final.fasta, SRS044742_2018/SRS044742.0.assembly.out/contigs.pilon.fasta, SRS044742_2018/SRS044742.0.assembly.out/contigs.fasta
Releasing 1 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 4.
13 of 14 steps (93%) done
Resources before job selection: {'_cores': 12, '_nodes': 9223372036854775807}
Ready jobs (1):
all
Selected jobs (1):
all
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
localrule all:
input: SRS044742_2018/metacompass_summary.tsv
jobid: 1
reason: Input files updated by another job: SRS044742_2018/metacompass_summary.tsv
Releasing 1 _cores (now 12).
Releasing 1 _nodes (now 9223372036854775807).
Finished job 1.
14 of 14 steps (100%) done
unlocking
removing lock
removing lock
removed all locks
mv: cannot stat ‘SRS044742_2018/SRS044742.0.assembly.out/*merged.fq.mash*’: No such file or directory
checking for dependencies (Bowtie2, Blast, kmermask, Snakemake, etc)
Bowtie2--->[OK]
Blast+--->[OK]
kmer-mask--->[OK]
mash--->[OK]
Snakemake--->[OK]
MetaCompass finished succesfully!
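A log like the one above can be summarized by pulling out the progress lines and the order in which rules were selected. The sketch below assumes the log format shown here (lines such as "N of M steps (P%) done" and "Selected jobs (1):"); other Snakemake versions may print different messages, and the function names are hypothetical.

```python
import re

STEP_RE = re.compile(r"^(\d+) of (\d+) steps \((\d+)%\) done$")
SELECTED_RE = re.compile(r"^Selected jobs \((\d+)\):$")


def parse_progress(lines):
    """Return (done, total, percent) for every progress line in the log."""
    progress = []
    for line in lines:
        m = STEP_RE.match(line.strip())
        if m:
            progress.append(tuple(int(g) for g in m.groups()))
    return progress


def selected_rules(lines):
    """Return rule names in the order they were selected for execution."""
    rules = []
    remaining = 0  # rule names still expected after a "Selected jobs" header
    for line in lines:
        line = line.strip()
        m = SELECTED_RE.match(line)
        if m:
            remaining = int(m.group(1))
        elif remaining > 0 and line:
            rules.append(line)
            remaining -= 1
    return rules


# Short excerpt in the same format as the log above.
excerpt = """\
Selected jobs (1):
merge_reads
Resources after job selection: {'_cores': 11, '_nodes': 9223372036854775806}
1 of 14 steps (7%) done
Selected jobs (1):
kmer_mask
2 of 14 steps (14%) done
""".splitlines()

print(selected_rules(excerpt))
print(parse_progress(excerpt))
```

A run can be treated as complete when the last progress tuple has `done == total`, as in the final "14 of 14 steps (100%) done" line above.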
The output will be in the folder specified with -o, in this case SRS044742_2018:
ls SRS044742_2018/*
SRS044742_2018/metacompass.final.ctg.fa
SRS044742_2018/intermediate_files:
assembly_output
mapped_reads
megahit_output
pilon_output
reference_selection_output
unmapped_reads
SRS044742_2018/metacompass_logs:
SRS044742.0.bowtie2map.log
SRS044742.0.kmermask.log
SRS044742.0.megahit.log
SRS044742.0.pilon.map.log
SRS044742.0.reference_recruitement.log
SRS044742_2018/metacompass_output:
metacompass_assembly_stats.tsv
metacompass.final.ctg.fa
metacompass_summary.tsv
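To confirm that a run produced the layout listed above, the expected paths can be checked in a few lines. This is a sketch under the assumption that a run with "-k" always produces the entries shown in this listing; the `EXPECTED` list and `missing_outputs` function are hypothetical and only mirror the listing.

```python
from pathlib import Path

# Relative paths taken from the listing above.
EXPECTED = [
    "metacompass_output/metacompass.final.ctg.fa",
    "metacompass_output/metacompass_assembly_stats.tsv",
    "metacompass_output/metacompass_summary.tsv",
    "metacompass_logs",
    "intermediate_files",
]


def missing_outputs(out_dir):
    """Return the expected entries that are absent from out_dir."""
    root = Path(out_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_outputs("SRS044742_2018")
    print("missing:", missing if missing else "none")
```

An empty list means every entry from the listing was found; the `intermediate_files` folder is only expected when "-k" was used.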