Not dealing well with HETATM #387

Open
jeversbioprodict opened this issue Nov 27, 2024 · 1 comment


@jeversbioprodict

The problem

I want to align a PDB file that contains a non-standard residue stored as HETATM records, and foldmason seems to skip over this residue.

Expected Behavior

I would expect a non-standard residue to be treated either as the closest standard residue or as an X.

Current Behavior

The residue is dropped from the output as if it were not there.

Steps to Reproduce (for bugs)

Put the PDB files for 2WSL and 1C7J in one directory, then run:
foldmason easy-msa *.pdb result tmpfolder

Foldseek Output (for bugs)

The input (the PDBs) and the output (result_aa.fa) are in the attached zip file. The relevant part is the missing CSS residue in this stretch of 2WSL:

-DIIIATKNGKVRGMQLTVFGGTVTAFLGIPYAQPPLGRLRFKKPQSLTKWSDIWNATKYANSCQNIDQSFPGFHG
                                                               ^^

The output has ANSCQN here, whereas the sequence in the PDB file reads ANSCCQN (note the double C).

Context

In the PDB it looks like this:

ATOM    495  O   CYS A  65      41.116  19.446  37.391  1.00 20.75           O  
ATOM    496  CB  CYS A  65      38.570  17.457  38.204  1.00 20.24           C  
ATOM    497  SG  CYS A  65      36.929  17.234  38.987  1.00 21.61           S  
HETATM  498  N   CSS A  66      41.684  17.931  38.966  1.00 20.96           N  
HETATM  499  CA  CSS A  66      43.085  17.841  38.551  1.00 21.35           C  
HETATM  500  CB  CSS A  66      43.843  16.792  39.350  1.00 22.05           C  
HETATM  501  SG  CSS A  66      43.932  17.126  41.062  1.00 25.66           S  
HETATM  502  SD  CSS A  66      44.896  18.887  41.169  0.60 28.63           S  
HETATM  503  C   CSS A  66      43.170  17.443  37.101  1.00 21.61           C  
HETATM  504  O   CSS A  66      42.478  16.527  36.663  1.00 21.04           O  
ATOM    505  N   GLN A  67      44.043  18.126  36.370  1.00 21.19           N  
ATOM    506  CA  GLN A  67      44.207  17.892  34.955  1.00 20.72           C  
ATOM    507  C   GLN A  67      45.459  18.619  34.483  1.00 21.04           C  
ATOM    508  O   GLN A  67      45.855  19.653  35.056  1.00 20.90           O  
ATOM    509  CB  GLN A  67      43.000  18.434  34.181  1.00 19.91           C  
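
For readers who want to check the excerpt themselves, here is a minimal Python sketch that reads the residue number and name from the fixed PDB columns. It only shows that residue 66 exists solely in HETATM records, so any reader that looks at ATOM records alone would not see it. It makes no claim about how Foldseek or foldmason parse the file; the `residues` helper and the shortened excerpt are assumptions made for illustration.

```python
# A few lines copied from the 2WSL excerpt above (one or two atoms per residue).
PDB_EXCERPT = """\
ATOM    497  SG  CYS A  65      36.929  17.234  38.987  1.00 21.61           S
HETATM  498  N   CSS A  66      41.684  17.931  38.966  1.00 20.96           N
HETATM  499  CA  CSS A  66      43.085  17.841  38.551  1.00 21.35           C
ATOM    505  N   GLN A  67      44.043  18.126  36.370  1.00 21.19           N
"""


def residues(pdb_text, records=("ATOM", "HETATM")):
    """Return ordered, de-duplicated (resSeq, resName) pairs.

    Hypothetical helper: it relies only on the fixed-width PDB layout.
    """
    seen = []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()       # columns 1-6: record name
        if record not in records:
            continue
        res_name = line[17:20].strip()   # columns 18-20: residue name
        res_seq = int(line[22:26])       # columns 23-26: residue number
        if (res_seq, res_name) not in seen:
            seen.append((res_seq, res_name))
    return seen


print(residues(PDB_EXCERPT))                     # residues 65, 66 and 67
print(residues(PDB_EXCERPT, records=("ATOM",)))  # residue 66 (CSS) is absent
```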

Your Environment

foldmason Version: 63949d980645892a878c8427cbbd124b8dba425f
foldseek Version: 2ad017897d3dab66dd33ea675e92215bdfb4a64d
@milot-mirdita milot-mirdita transferred this issue from steineggerlab/foldmason Nov 27, 2024
@milot-mirdita
Member

Thanks for the issue report. This is an underlying issue in Foldseek, so I transferred it to Foldseek's issue tracker.

GemmiWrapper's threeAA2oneAA doesn't contain a mapping for CSS. Gemmi seems to have a long list of non-standard residues in its find_tabulated_residue function that we can use for this.
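
To illustrate the behaviour requested in the report (closest standard residue, otherwise X), here is a rough Python sketch of a three-letter to one-letter lookup with a fallback. It is not Foldseek's threeAA2oneAA and not Gemmi's table: the `three_to_one` name, the `PARENT` dictionary and its small selection of modified residues are assumptions made for illustration only.

```python
# Standard three-letter to one-letter amino acid codes.
STANDARD = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# Illustrative subset of modified residues and their parent residues.
# A real table (such as the one in Gemmi) is far longer.
PARENT = {
    "CSS": "CYS",  # S-mercaptocysteine
    "MSE": "MET",  # selenomethionine
    "SEP": "SER",  # phosphoserine
    "TPO": "THR",  # phosphothreonine
    "PTR": "TYR",  # phosphotyrosine
}


def three_to_one(res_name):
    """Map a residue name to one letter: parent residue if known, else X."""
    name = res_name.strip().upper()
    name = PARENT.get(name, name)    # fall back to the parent residue
    return STANDARD.get(name, "X")   # unknown residues become X


segment = ["ALA", "ASN", "SER", "CYS", "CSS", "GLN", "ASN"]
print("".join(three_to_one(r) for r in segment))  # prints ANSCCQN
```

With such a fallback, the 2WSL stretch from the report comes out as ANSCCQN. A real implementation would also need to tell modified polymer residues apart from ligands and water stored as HETATM, which this sketch does not attempt.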
