
CheckM2 consistently flags certain species and genera as contaminated. #93

Open
Fabian-Bastiaanssen opened this issue Jan 15, 2024 · 0 comments


Description:
CheckM2 reports contamination scores of 6-9% for isolates of the following species:

  • Blautia sp001304935
  • Enterocloster lavalensis
  • Enterocloster citroniae
  • Enterocloster clostridioformis
  • Hungatella sp005845265
  • Hungatella effluvii

Within each species, the scores differ by no more than 0.5 percentage points across multiple isolates, even when the genomes come from different sources, such as the NCBI reference genomes.

Observations:

  • When analyzed with MetaPhlAn 4, the genomes are reported as a 100% match to the species assigned by GTDB-Tk.
  • Scores are consistent between isolates of the same species.
  • Several of the affected species belong to the same genera.

Suspicion:
Given the agreement between isolates, the 100% match in MetaPhlAn 4, and the recurrence of specific genera, I suspect there is no actual contamination, either from other species or from other strains of the same species. Instead, some set of genes or traits in these species may be causing the contamination flag.
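For anyone who wants to check their own data for the same pattern, here is a minimal sketch of how it could be screened for. It assumes you have already paired each genome's species assignment with its contamination score as (species, contamination) tuples. The function name, the thresholds, and the example values are all hypothetical placeholders, not real results or part of CheckM2.

```python
from collections import defaultdict


def flag_consistent_species(records, min_contamination=5.0,
                            max_spread=0.5, min_isolates=2):
    """Return species whose isolates all show elevated, near-identical contamination.

    records: iterable of (species, contamination_percent) tuples.
    The thresholds are illustrative assumptions, not CheckM2 defaults.
    """
    by_species = defaultdict(list)
    for species, contamination in records:
        by_species[species].append(contamination)

    flagged = {}
    for species, values in by_species.items():
        if len(values) < min_isolates:
            continue  # a single isolate says nothing about consistency
        lowest, highest = min(values), max(values)
        # Elevated in every isolate AND tightly clustered across isolates
        if lowest >= min_contamination and highest - lowest <= max_spread:
            flagged[species] = (lowest, highest)
    return flagged


# Placeholder values for illustration only (not measured data)
example = [
    ("species_A", 7.1), ("species_A", 7.4),   # elevated and consistent
    ("species_B", 0.3), ("species_B", 0.5),   # consistent but low
    ("species_C", 8.0), ("species_C", 2.0),   # elevated in one isolate only
]
print(flag_consistent_species(example))
```

A species that comes back from this screen is only a candidate for a systematic overestimate. It does not rule out real contamination on its own.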

I'm not sure what the right approach for dealing with this is, but I thought I should put a notice out for it.
