High number of bacterial genes in vRhyme bins #19
Comments
Hi, did you input predicted virus sequences, VLP sequencing contigs, or a whole assembly including microbes?
Whole assembly, including microbes.
vRhyme does not identify viral sequences; it expects the input to already be viral. That is the source of the microbial contamination: the microbial contigs in your assembly were binned along with the viral ones. Please see the "important note" in the program description section of the README.
I see. I probably misunderstood the Description section, which says that "vRhyme can take an entire metagenome as input, but the performance for a whole metagenome has not been fully evaluated." I will re-run it with identified viral contigs. Thank you.
You can bin sequences as you did; the next step would then be to filter out the non-viral bins. This approach can help recruit viral fragments into bins that could not otherwise be predicted as viral. But keep in mind that microbes will be binned too, which leads to the CheckV results and large bin sizes you reported.
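As a rough illustration of that filtering step, here is a minimal Python sketch that flags bins in a CheckV quality summary that look microbial. It assumes the tab-separated file has the columns `contig_id`, `contig_length`, `gene_count`, and `host_genes` (check the header of your own file). The function name `flag_suspect_bins` and both cutoffs are hypothetical: the 50% host-gene fraction and the 500 kb length are taken from the numbers reported in this thread, not from vRhyme or CheckV recommendations.

```python
import csv
import io


def flag_suspect_bins(tsv_handle, max_host_fraction=0.5, max_length=500_000):
    """Return {bin_id: [reasons]} for bins that look non-viral.

    Both thresholds are illustrative assumptions, not tool defaults.
    """
    suspects = {}
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    for row in reader:
        gene_count = int(row["gene_count"])
        host_genes = int(row["host_genes"])
        length = int(row["contig_length"])

        reasons = []
        # Most genes called as host (bacterial) genes.
        if gene_count > 0 and host_genes / gene_count > max_host_fraction:
            reasons.append("host_gene_fraction")
        # Concatenated bin is longer than the chosen size cutoff.
        if length > max_length:
            reasons.append("length")

        if reasons:
            suspects[row["contig_id"]] = reasons
    return suspects


if __name__ == "__main__":
    # Made-up rows for demonstration only; not taken from the attached file.
    example = (
        "contig_id\tcontig_length\tgene_count\thost_genes\n"
        "bin_1\t45000\t60\t2\n"
        "bin_2\t2100000\t1900\t1750\n"
    )
    print(flag_suspect_bins(io.StringIO(example)))
```

To run it on a real file, open the summary with `open(path, newline="")` and pass the handle in place of the `io.StringIO` object. Flagged bins are only candidates for removal; a bin over the size cutoff is not necessarily microbial, so the reasons are worth reviewing before discarding anything.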
Yes, got it. I will predict the viral contigs first and then run vRhyme.
Hi,
I ran vRhyme with the default settings on my assembled contigs. I concatenated the contigs from each bin into a single FASTA file using the provided bin sequences.py script. Later, I ran CheckV (with the prodigal -m option enabled) on the concatenated FASTA file. Strangely, the CheckV analysis showed that a large number of the bins contain an extremely high number of host (bacterial) genes, accounting for more than 50% of the total number of genes (more than 90% for many contigs). Surprisingly, CheckV also reports many of these bins as complete and free of contamination. However, the contig/genome sizes (many of them in the 500 kb to 4 Mb range) are too large for a virus/phage. Is it normal to get this kind of result? I have attached the CheckV quality summary file for your reference.
quality_summary.txt