use ma_qa_metric.type instead of .name #602

Open
wants to merge 1 commit into
base: master
Conversation

arose
Member

@arose arose commented Oct 29, 2022

addressing #597

@arose
Member Author

arose commented Oct 29, 2022

@gtauriello what do you think, does this fix it for you?

@arose
Member Author

arose commented Nov 7, 2022

ping @gtauriello

@gtauriello

@arose Thanks for looking into this. And sorry for the delayed answer. I was travelling last week and could only look at it now.

I cannot fully judge if the code works as intended (I don't have a setup to test it), but from what I see in the proposed code changes, I would assume that it does the job for the currently existing ModelCIF files.

If it's OK, I'll comment here on how to make the scores available for use (as mentioned in #597).

I see 2 separate issues on that:

  1. What colour scheme to use for display. In SWISS-MODEL we chose to offer both colour schemes for any type of model ("Confidence gradient" for our classic "QMEAN" scheme and "Confidence class" for the one used by AlphaFold DB; see here for the documentation and here for an example with both SM and AFDB models). The schemes are scaled to [0,1] or [0,100] depending on the type of score. If you want to use a heuristic to pick between "gradient" and "class" for the user (as done currently), that's of course also fine.
  2. What label to set for the confidence score. Personally, I wouldn't call them "pLDDT Score" and "QMEAN Score" but rather label them as "X (Y)" with "X" being "Pred. lDDT" or "Pred. lDDT-CA" (depending on _ma_qa_metric.type) and "Y" being whatever is in _ma_qa_metric.name. This would be more robust for future scores. To clarify the used colour scheme one can additionally add "gradient" or "class" somewhere in the label.
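The two schemes from point 1 could be sketched roughly as follows. This is a minimal TypeScript sketch and not the actual SWISS-MODEL or Mol* code: the function names are hypothetical, the class thresholds (90/70/50 on a 0-100 scale) are the confidence bands documented by AlphaFold DB, and the handling of a score that falls exactly on a threshold is an assumption.

```typescript
// Hypothetical helpers; `max` is 1 or 100 depending on the score's range.
type ConfidenceClass = 'very high' | 'confident' | 'low' | 'very low';

/** "Confidence class": bucket a score into AlphaFold-DB-style bands. */
function confidenceClass(score: number, max: 1 | 100): ConfidenceClass {
    const s = (score / max) * 100; // bring [0,1] scores onto the 0-100 scale
    if (s > 90) return 'very high';
    if (s > 70) return 'confident';
    if (s > 50) return 'low';
    return 'very low'; // assumption: boundary values fall into the lower band
}

/** "Confidence gradient": position in [0, 1] along a continuous colour ramp. */
function gradientPosition(score: number, max: 1 | 100): number {
    return Math.min(1, Math.max(0, score / max)); // clamp out-of-range values
}
```

Both functions take the range as an argument, so the same colour scheme can be offered for scores in [0,1] and in [0,100].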

Of course the suggestions above come at some amount of extra complexity for the code as one needs to keep track if we do "gradient" or "class" display, whether range is in [0,1] or [0,100] and what label to have for the score (as opposed to all of those being hard coded as it is now).

As a side note: it seems that ESMFold has been using AF-like pLDDT values but scaled to [0,1]. So if one ever sees ModelCIF files from them, those should have "pLDDT in [0,1]" as type.

@dsehnal
Member

dsehnal commented Feb 20, 2025

Is this something we still want to address?

@gtauriello

gtauriello commented Feb 21, 2025

@dsehnal I would be happy if this gets addressed since currently mol* is not able to display quality metrics in ModelCIF files generated with SWISS-MODEL (recent example).

I can't really judge from the code whether the suggested changes solve the problem but as described above it seemed to be fairly hard-coded on specific metrics with specific names.

Let me know if I can add further details on the ModelCIF definitions for quality metrics and how they are used in practice.

@dsehnal
Member

dsehnal commented Feb 21, 2025

@gtauriello Thanks, can you please clarify this:

> Of course the suggestions above come at some amount of extra complexity for the code as one needs to keep track if we do "gradient" or "class" display, whether range is in [0,1] or [0,100] and what label to have for the score (as opposed to all of those being hard coded as it is now).

What criteria do we need to use to determine the gradient/class display and the ranges?

Generally don't mind adding extra logic, just need to know what the criteria are.

@gtauriello

Sure. So here is the proposed logic:

  • Pick range according to _ma_qa_metric.type. The values "pLDDT", "pLDDT all-atom" and "pLDDT to polymer" have a range in [0, 100] while "pLDDT all-atom in [0,1]" and "pLDDT in [0,1]" have it in [0, 1].
  • Either allow the user to switch between gradient and class display, or pick according to _ma_qa_metric.name. This will never be fully robust since any text is valid as the name. In my experience, one can distinguish AlphaFold variants, which use "plddt", and QMEAN variants, which use "qmean" (both case-insensitive), somewhere in the name. So I would use that as the logic to pick the default display: gradient for QMEAN and class for AlphaFold.
  • Use label "[X] ([Y]; [Z])" for the color theme name, where [X] is _ma_qa_metric.name, [Y] is "gradient" or "class", and [Z] is _ma_qa_metric.type (one could also rename the types, but it's not worth the effort I think).
  • Use label "[X] ([Z])" (same [X] and [Z] as above) for the scores shown in mouseover in the 3D view
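The four rules above can be sketched as below. This is a minimal TypeScript illustration, not Mol* code: the `QaMetric` interface and all function names are hypothetical, and checking "qmean" before "plddt" when a name contains both is an assumption the rules do not cover.

```typescript
/** Mirrors _ma_qa_metric.name and _ma_qa_metric.type (hypothetical shape). */
interface QaMetric { name: string; type: string; }
type Display = 'gradient' | 'class';

// Upper bound of the score range for each _ma_qa_metric.type listed above.
const RANGE_MAX = new Map<string, number>([
    ['pLDDT', 100],
    ['pLDDT all-atom', 100],
    ['pLDDT to polymer', 100],
    ['pLDDT all-atom in [0,1]', 1],
    ['pLDDT in [0,1]', 1],
]);

/** Range from the type; undefined for types not covered by the rules. */
function rangeFor(metric: QaMetric): [number, number] | undefined {
    const max = RANGE_MAX.get(metric.type);
    return max === undefined ? undefined : [0, max];
}

/** Default display from the name; undefined means "no hint, let the caller decide". */
function defaultDisplay(metric: QaMetric): Display | undefined {
    const name = metric.name.toLowerCase();
    if (name.includes('qmean')) return 'gradient'; // assumption: QMEAN wins if both match
    if (name.includes('plddt')) return 'class';
    return undefined;
}

/** "[X] ([Y]; [Z])" for the color theme name. */
function themeLabel(metric: QaMetric, display: Display): string {
    return `${metric.name} (${display}; ${metric.type})`;
}

/** "[X] ([Z])" for the mouseover text in the 3D view. */
function hoverLabel(metric: QaMetric): string {
    return `${metric.name} (${metric.type})`;
}
```

For a mock metric `{ name: 'QMEANDisCo', type: 'pLDDT' }`, this gives the theme label "QMEANDisCo (gradient; pLDDT)" and the mouseover label "QMEANDisCo (pLDDT)". Since the name heuristic can never be fully robust, a user-facing switch would still override whatever `defaultDisplay` returns.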

As a general note: it is possible to have multiple scores in _ma_qa_metric with type = "pLDDT" and mode = "local". So far I have only encountered one set of models in ModelArchive (or elsewhere) with 2 per-residue scores (not public yet but I can also generate a mock example if needed). Not sure which per-residue score would be picked by mol* but at least the label texts suggested above make it explicit as one can distinguish them by _ma_qa_metric.name. And being able to switch between them as color theme would of course be great.
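To illustrate the multi-score case, here is a minimal TypeScript sketch that lists every per-residue pLDDT-type metric as a separate selectable entry instead of silently picking one. The row and option shapes and the function name are hypothetical, and matching on the "pLDDT" prefix of the type is an assumption that happens to cover the five type values listed above.

```typescript
/** Hypothetical row shape mirroring _ma_qa_metric.id, .name, .type and .mode. */
interface QaMetricRow { id: number; name: string; type: string; mode: string; }
interface ThemeOption { metricId: number; label: string; }

/** One option per local pLDDT-type metric, labelled so that duplicates stay distinguishable. */
function localPlddtOptions(rows: QaMetricRow[]): ThemeOption[] {
    return rows
        .filter(r => r.mode === 'local' && r.type.startsWith('pLDDT'))
        .map(r => ({ metricId: r.id, label: `${r.name} (${r.type})` }));
}

// Mock data: two per-residue scores of the same type plus one global score.
const mockRows: QaMetricRow[] = [
    { id: 1, name: 'score A', type: 'pLDDT', mode: 'global' },
    { id: 2, name: 'score A', type: 'pLDDT', mode: 'local' },
    { id: 3, name: 'score B', type: 'pLDDT', mode: 'local' },
];
const options = localPlddtOptions(mockRows);
```

With the mock rows, `options` holds two entries (metric ids 2 and 3) whose labels differ only by name, which is what allows switching between them as color themes.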
