
Got different results for the same genome from different runs of CheckM2 #103

quliping opened this issue May 23, 2024 · 3 comments


@quliping

quliping commented May 23, 2024

Hello, I found that some of my MAGs have different completeness and contamination values in different runs of CheckM2. I first obtained 20 original MAGs and ran CheckM2 on them; in this run, for example, the completeness and contamination of MAG bin.SY48.20 were 50.41%/6.91%. Next, I ran the reassembly module of metaWRAP and got reassembled versions of the 20 original MAGs. I then ran CheckM2 on all 40 MAGs, and the completeness and contamination of the original version of bin.SY48.20 changed to 49.87%/6.65%. Why does the quality score of the same genome change between CheckM2 runs with different input MAG sets?

This is the first run, on the 20 original MAGs:
[screenshot: CheckM2 results for the 20 original MAGs]

This is the second run, showing the 20 originals (with 'orig' in the genome ID; e.g., 'bin.SY48.20.orig' is the same genome as 'bin.SY48.20' in the first run):
[screenshot: CheckM2 results for the 20 originals in the 40-MAG run]

Here is another run of CheckM2 using only two MAGs, the original and reassembled versions of bin.SY48.20, and you can see that the completeness and contamination changed again:
[screenshot: CheckM2 results for the two-MAG run]

@chklovski
Owner

Hmm, this could potentially be related to keras-team/keras#12400, or to the BatchNormalization layers that are part of the neural network model, leading to minor fluctuations in output.

Does this affect the 'General' (gradient boost) results or only the 'Specific' (neural network) results for your MAGs? You can test this by running with the --allmodels flag. Are the differences in results in excess of 1% completeness or contamination?

Will see if I can reproduce it with my own MAGs.
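
If it helps to narrow this down, here is a minimal sketch of the batch-composition effect in isolation. It uses a toy Keras model, not CheckM2's actual network (the layer sizes and feature width are made up), and checks whether the prediction for one feature vector changes when it is predicted alone versus inside a larger batch:

```python
# Minimal sketch (not CheckM2 code): check whether a Keras model's
# prediction for one input vector changes when that vector is predicted
# alone versus inside a larger batch. Layer sizes and the 64-feature
# input width are arbitrary; only the comparison pattern matters.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

# Toy stand-in for a feature-vector regression model with BatchNormalization.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1),
])

x = rng.standard_normal((40, 64)).astype("float32")  # 40 "genomes"

alone = model.predict(x[:1], verbose=0)      # the genome on its own
in_batch = model.predict(x, verbose=0)[:1]   # the same genome among 40

# In inference mode BatchNormalization uses fixed moving statistics, so any
# difference here comes from floating-point/batching effects, not the data.
print("max abs difference:", np.abs(alone - in_batch).max())
```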

@quliping
Author

quliping commented May 23, 2024


I ran these three MAG sets again with the --allmodels flag. Both the 'General' and 'Specific' results for bin.SY48.20 changed between the different runs ('2 MAGs' vs. '40 MAGs') of CheckM2.
[screenshots: --allmodels results for the '2 MAGs' and '40 MAGs' runs]
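
A quick way to quantify the drift between two runs is a small pandas sketch like the one below. It assumes CheckM2's default quality_report.tsv output with 'Name', 'Completeness' and 'Contamination' columns; the file paths are placeholders:

```python
# Sketch: diff the Completeness/Contamination of shared bins between two
# CheckM2 runs. Assumes the default quality_report.tsv layout with columns
# "Name", "Completeness", "Contamination"; the paths are placeholders.
import pandas as pd

run_a = pd.read_csv("run_20mags/quality_report.tsv", sep="\t")
run_b = pd.read_csv("run_40mags/quality_report.tsv", sep="\t")

# The 40-MAG run names originals like "bin.SY48.20.orig"; strip the suffix
# so the same genome matches across runs.
run_b["Name"] = run_b["Name"].str.removesuffix(".orig")

merged = run_a.merge(run_b, on="Name", suffixes=("_a", "_b"))
merged["d_comp"] = (merged["Completeness_a"] - merged["Completeness_b"]).abs()
merged["d_cont"] = (merged["Contamination_a"] - merged["Contamination_b"]).abs()

print(merged[["Name", "d_comp", "d_cont"]].sort_values("d_comp", ascending=False))
```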

However, I don't know why I could not reproduce the previous result for the '20 original MAGs' either... It is strange that the completeness was 50.36% in the result I reported the first time, but it became 49.87% this time, whether I use --allmodels or not...
[screenshot: repeated CheckM2 run on the 20 original MAGs]

@quliping
Author


Hello, may I ask whether this issue has been resolved? The fluctuation in CheckM2 results does not matter for high-quality genomes, but it does have an influence for genomes near the quality threshold. I'm worried that others may come to different conclusions from mine when reanalyzing my data, and I'm not sure whether it's appropriate to publish these genomes. Looking forward to your reply.
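
To make the threshold worry concrete, here is a small sketch that flags bins whose quality tier flips between two runs. It makes the same quality_report.tsv assumption as above (paths are placeholders), and the cutoffs (>90% completeness / <5% contamination for high quality, >=50% / <10% for medium) are the published MIMAG standard:

```python
# Sketch: flag bins whose MIMAG tier flips between two CheckM2 runs.
# The >90%/<5% (high) and >=50%/<10% (medium) cutoffs are the MIMAG
# standard; file paths are placeholders, as in the earlier sketch.
import pandas as pd

def mimag_tier(completeness: float, contamination: float) -> str:
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    return "low"

run_a = pd.read_csv("run_a/quality_report.tsv", sep="\t")
run_b = pd.read_csv("run_b/quality_report.tsv", sep="\t")
merged = run_a.merge(run_b, on="Name", suffixes=("_a", "_b"))

tier_a = merged.apply(lambda r: mimag_tier(r["Completeness_a"], r["Contamination_a"]), axis=1)
tier_b = merged.apply(lambda r: mimag_tier(r["Completeness_b"], r["Contamination_b"]), axis=1)
print(merged.loc[tier_a != tier_b, "Name"])  # bins whose reported tier changed
```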
