No bins created-- did we do something wrong or are there really no bins? #37
Comments
Hi,
That sounds a bit off. Did you take the set of 10-15k contigs and do one binning run using all 48 samples, or 48 binning runs with 1 sample each? The former (1 run, 48 samples, one set of dereplicated contigs) is the correct usage. Were any parameters changed from their default settings?
Kris
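For illustration, a single multi-sample run of that kind might look roughly like the sketch below. The file names are placeholders, and the vRhyme flag names are assumptions based on its README, so verify them with `vRhyme --help` on the installed version.

```bash
# Minimal sketch of one multi-sample binning run (placeholder paths; the vRhyme
# flag names are assumptions, so verify them with `vRhyme --help`).
vRhyme -i derep_viral_contigs.fasta \
       -b *.sorted.bam \
       -o vRhyme_all_samples \
       -t 16
```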
Hi Kris,
OK, so to clarify: we didn't do a co-assembly (that would break our server), so we have 48 separate sets of contigs. Should we combine those contigs into one dereplicated combined fasta file (which would contain something like 500k contigs) and do a vRhyme run on that? For the bam files, should we then map the reads of each sample against that combined fasta file? We used default settings for all the vRhyme runs.
Thanks!
It's possible that using 1 sample (1 coverage value) per contig didn't give vRhyme enough information to bin; it uses coverage and sequence features roughly equally. I've gotten 1 sample to work before, but certainly not with the same quality of results.

My suggestion is to dereplicate your 500k viral contigs and use the dereplicated set as the contig input. Yes, then map the reads of each sample. There are a couple of ways to do that: vRhyme can handle the dereplication itself, or you can use a general method similar to what dRep uses. Then you can either have vRhyme do the mapping by just inputting the fastq files (select either BWA or Bowtie2), or you can map yourself and input the bam files.

This complicates things if you wanted per-sample vMAGs to compare, because at the end of binning you'd have combined vMAGs based on the dereplicated/combined set. For this, vRhyme will generate a coverage file, and you can assess coverage per contig per sample. However, as you know, each of your samples individually won't have the whole picture anyway due to variance in metagenome sequencing/assembly.

I hope that answers your question. The main takeaway is that vRhyme and other coverage-based tools often rely on >1 sample to bin accurately, even though they tend to let you input 1.
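As a rough sketch of that workflow (combine, dereplicate, map each sample, then one binning run), something like the following could work. The paths are placeholders, cd-hit-est stands in here as just one generic dereplication option (vRhyme can also handle that step itself, as noted above), and the vRhyme flags are assumptions to check against `vRhyme --help`.

```bash
# 1) Combine the 48 per-sample viral contig sets into one file (placeholder paths).
cat sample_*/viral_contigs.fasta > all_viral_contigs.fasta

# 2) Dereplicate the combined set. vRhyme can do this itself; cd-hit-est is
#    shown only as one generic stand-in.
cd-hit-est -i all_viral_contigs.fasta -o derep_viral_contigs.fasta \
           -c 0.97 -n 10 -M 0 -T 16

# 3) Map each sample's reads back to the dereplicated set, then sort and index
#    the BAMs. (Alternatively, give vRhyme the fastq files and let it run
#    Bowtie2 or BWA itself.)
bowtie2-build derep_viral_contigs.fasta derep_idx
for s in sample_{01..48}; do
    bowtie2 -x derep_idx -1 ${s}_R1.fastq.gz -2 ${s}_R2.fastq.gz -p 16 \
        | samtools sort -@ 4 -o ${s}.sorted.bam -
    samtools index ${s}.sorted.bam
done

# 4) One vRhyme run on the dereplicated contigs with all 48 sorted BAMs, as in
#    the earlier sketch.
vRhyme -i derep_viral_contigs.fasta -b *.sorted.bam -o vRhyme_derep_run -t 16
```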
Hi Kris,
OK, thanks! The first time we did it, we did have multiple coverage values for each sample (i.e., bam files for sample 1 mapped against sample 2, sample 3, sample 4, and so on) but still found no bins in any of the samples. We'll still give this a try so that we have more contigs to work with in a single binning run: we'll combine all the assembled contigs together and make new bam files. We'll see how it goes.
Thanks,
Hi again Kris et al.,
We tried what you suggested (combine all fasta files together, dereplicate, map reads from each sample to that combined set) and ran vRhyme to bin, but this time it didn't even attempt to bin. Any ideas about what might be going on this time? For reference, here is the log file:
Here is a sample log from the first time we tried it, where we tried to bin each individual sample and got zero bins:
Hi Kris (and Karthik),
My student is trying to bin contigs identified by VirSorter using vRhyme. He is using as input the contigs identified by VirSorter as viral, as well as sorted bam files that were mapped to the entire set of contigs (which includes the ones that VirSorter did not flag as viral). (For reference, we had approximately 2-3 million contigs for each sample, but VirSorter identified about 10,000-15,000 contigs as viral per sample. These samples were microbial metagenomes.) He ran vRhyme for the 48 samples and it ran without errors, but it tells us we have no bins for any of the samples. Is this realistic, or did something likely go wrong? If so, do you have any thoughts?
Thanks!
-Rika
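For reference, the per-sample setup described above corresponds roughly to the sketch below (one run per sample, using that sample's VirSorter-positive contigs and its BAM against the full assembly). Names are placeholders and the vRhyme flags are assumptions to check against `vRhyme --help`.

```bash
# Per-sample runs as described: viral contigs from VirSorter plus a sorted BAM
# of that sample's reads mapped to the full assembly (placeholder paths; vRhyme
# flag names are assumptions, so verify with `vRhyme --help`).
for s in sample_{01..48}; do
    vRhyme -i ${s}/virsorter_viral_contigs.fasta \
           -b ${s}/${s}_vs_full_assembly.sorted.bam \
           -o ${s}/vRhyme_single_sample \
           -t 8
done
```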