Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add stop codon to table and other small fixes #138

Merged
merged 10 commits into from
Dec 1, 2021

Conversation

mbhall88
Copy link
Member

@mbhall88 mbhall88 commented Nov 4, 2021

It would be very useful if mykrobe could accept stop codons (*) in catalogue mutations.

This PR is a draft as I can't figure out how to run the test suite in order to add tests for this functionality though. I would love some help @martinghunt

I have tested it on a panel locally and it seems to work, but I'd feel much better with unit tests.

This would address part of #137

@mbhall88 mbhall88 requested a review from martinghunt November 4, 2021 07:05
@martinghunt
Copy link
Member

Nice! This is super useful!

To run the tests, you'll need mongod installed, and running in the background (I run sudo mongod in a separate terminal and leave it there). Then tox should work (or sometimes rm -rf .tox first). Or run a specific test file with eg tox tests/annotation_tests/test_annotations.py.

One of the existing tests is now failing - the travis output is here https://app.travis-ci.com/github/Mykrobe-tools/mykrobe/jobs/546607570. I haven't dug into why, though, other than noticing that it's testing an X - ie any amino acid - change, so I guess there's some extra logic needed to handle stops and Xs.

@mbhall88
Copy link
Member Author

mbhall88 commented Nov 6, 2021

Ok, I'll give that a try and check out the failing test during the week.

@mbhall88 mbhall88 requested a review from iqbal-lab November 7, 2021 01:54
@mbhall88
Copy link
Member Author

mbhall88 commented Nov 7, 2021

One of the existing tests is now failing - the travis output is here https://app.travis-ci.com/github/Mykrobe-tools/mykrobe/jobs/546607570. I haven't dug into why, though, other than noticing that it's testing an X - ie any amino acid - change, so I guess there's some extra logic needed to handle stops and Xs.

Yes, this test failed because, currently, we include the stop codons when X is given. What is the right thing to do here? Does "change to anything" also include stop codons?

@martinghunt
Copy link
Member

One of the existing tests is now failing - the travis output is here https://app.travis-ci.com/github/Mykrobe-tools/mykrobe/jobs/546607570. I haven't dug into why, though, other than noticing that it's testing an X - ie any amino acid - change, so I guess there's some extra logic needed to handle stops and Xs.

Yes, this test failed because, currently, we include the stop codons when X is given. What is the right thing to do here? Does "change to anything" also include stop codons?

I'm not sure what the right thing to do is. Pinging @iqbal-lab ...

@mbhall88
Copy link
Member Author

mbhall88 commented Nov 8, 2021

Either way, we could conceivably add a CLI option for the user to indicate whether X should include stop codons?

@iqbal-lab
Copy link
Collaborator

We need

  1. Ability to express the current catalogues
  2. Users to be able to specify a catalogue, which gets turned into probes, and then when Mykrobe runs, and says what it found, be interpretable. If X is user defined at runtime, the json output is tricky

If we change X to mean including stop, and no other changes, then our current catalogue definition will mean something else, with different probes (adding stop), which we should test on our full 13k. On those grounds I'm tempted to say X should not include stop. But is complicated, maybe I missed something?

@mbhall88
Copy link
Member Author

mbhall88 commented Nov 8, 2021

No you've summed it up pretty well.

So, add some logic that says X is "any other amino acid (i.e., not stop codons)"?

@iqbal-lab
Copy link
Collaborator

agreed

@mbhall88 mbhall88 marked this pull request as ready for review November 16, 2021 06:46
@mbhall88
Copy link
Member Author

Now that this PR is ready for review, here is a clearer picture of the proposed changes.

Added

  • Support for including stop codons in the panel with the * character.
  • Panel version is now included in JSON output [closes Report panel version used in JSON output #119]
  • Shorthand CLI versions for common/long options:
    • -D: --min_proportion_expected_depth
    • -P: --custom_probe_set_path
    • -R: --custom_variant_to_resistance_json
    • -L: --custom_lineage_json
    • -O: --format

Changed

Fixed

@mbhall88
Copy link
Member Author

Note, there is a bit of "noise" in some files as I was running black as I went and obviously some files hadn't been formatted.

@mbhall88 mbhall88 changed the title add stop codon to table add stop codon to table and other small fixes Nov 16, 2021
@martinghunt
Copy link
Member

Looks good. There's one edge case that I don't know if we should worry about... @iqbal-lab ?

I think you're preventing having a specific amino acid change, and also an any amino acid change at the same position? eg A42C and A42X. Sorry if that's not the case and ignore this question!

We may want to allow both A42C and A42X because they could cause resistance to different drugs. IDK withouth checking what's in the existing catalog and whether we already hit this case in any of the species/panels.

@iqbal-lab
Copy link
Collaborator

That kind of thing might happen with fluoroquinolones and with rifampicin versus rifabutin; i don't believe it happens in the current catalogue, but i dont really have it in my head. But surely we can express that another way - at worst with an explicit list

@mbhall88
Copy link
Member Author

mbhall88 commented Nov 17, 2021

Hmmm, well I added that because I actually hit this case with a catalogue Arnold was trying to build with and make probes errors out when, for example, A42C and A42X both exist.

Hapy to take suggestions for how to allow both though.

@mbhall88
Copy link
Member Author

mbhall88 commented Nov 18, 2021

It seems appveyor is failing and tl;dr I think it is because of jamescasbon/PyVCF#332. Ultimately, this is a bigger issue and we should look to move away from pyvcf (which seems to not be maintained anymore - no commits since 2017). I'll open a separate issue for this.

Additionally, the test suite isn't running on travis as we have a negative credit balance. Maybe I should open another issue to move on to github actions?

This was referenced Nov 18, 2021
@iqbal-lab
Copy link
Collaborator

my understanding of this is mykrobe as it stands works with appveyor, and what does not work is your PR.
And your PR would work if you fixed/modified the input VCF.
That seems a more sensible approach than moving away from pyvcf, which would need a lot of testing.

@iqbal-lab
Copy link
Collaborator

Write that fast before going to bed, will reread tomorrow, sorry!

@mbhall88
Copy link
Member Author

mbhall88 commented Nov 30, 2021

my understanding of this is mykrobe as it stands works with appveyor, and what does not work is your PR. And your PR would work if you fixed/modified the input VCF. That seems a more sensible approach than moving away from pyvcf, which would need a lot of testing.

No, that isn't the issue with appveyor. I didn't touch any VCFs in this PR.

The issue (as mentioned above) is related to jamescasbon/PyVCF#332 and jamescasbon/PyVCF#334.

Basically, when the setuptools version is >= 58, the latest pyvcf (v0.6.8) fails to install as it uses old setuptools syntax, and instead, pyvcf (v0.4.3) gets installed, which causes the import error in the appveyor log. Because appveyor is installing things afresh, it is getting a recent-ish version of python/setuptools and therefore the pyvcf version being installed is too old.

This never impacted conda-related installations (which is what I generally use) as conda's dependency solver and install process is different to pip's.

The tests all pass on my local installation (which is in a conda env).

@martinghunt
Copy link
Member

Seeing as @mbhall88 's got the tests all passing locally, happy to merge. Can have another issue to get appveyor passing.

I think that means that having A42X and A42C is not allowed, which is fine. The workaround (which is what we currently have to do before this PR anyway) is to enumerate all the X mutations to handle where eg A42X=>drug1, and A42C=>drug2.

@mbhall88 mbhall88 merged commit b4a19c2 into Mykrobe-tools:master Dec 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants