-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem when loading pdbs with no bonds #658
Comments
Thanks for catching this! A PR would be great, but I think we should instead also do try:
array = pdb.get_structure(
pdb_file=self.file,
extra_fields=["b_factor", "occupancy", "charge", "atom_id"],
include_bonds=True,
)
except (ErrorThatIsRaised):
array = pdb.get_structure(
pdb_file=self.file,
extra_fields=["b_factor", "occupancy", "charge", "atom_id"],
include_bonds=False,
)
try:
array.bonds = connect_via_residue_names(array)
except AttributeError:
print("Not able to find bonds via residue.") This is also related to #634 where if the structure is missing a A more general way to handle all of these potential missing bits of data, but I am unsure what the best approach would be. |
biotite is very ready to throw an error if something isn't exactly as it expects it, which is a lot of the time in the realm of PDB files |
It would be nice if biotite raised different types of errors but it seems it just raises an AttributeError in case anything goes wrong. So in practice your suggestion would endup being something like try:
array = pdb.get_structure(
pdb_file=self.file,
extra_fields=["b_factor", "occupancy", "charge", "atom_id"],
include_bonds=True,
)
except AttributeError:
array = pdb.get_structure(
pdb_file=self.file,
extra_fields=["b_factor", "occupancy", "charge", "atom_id"],
include_bonds=False,
)
try:
array.bonds = connect_via_residue_names(array)
except AttributeError:
print("Not able to find bonds via residue.") which seems a bit odd. I guess the most general way that also avoids duplicated code would be to manually check if the extra fields are present in the pdb, raise an error if they aren't, get the structure without bonds, then build the bonds. At least with this pipeline if something goes wrong it would be easier to know if the problem is related with the bonds or the extra fields. However I am not sure how this checking should be done and probably would do that in another PR. |
It might be something that we re-implemented the underlying |
I can try to raise this issue with the biotite devs. Meanwhile do you think I can do a PR as a hot fix here? |
I'm happy to accept a PR fix here, worth also raising it with the biotite devs as well |
The problem
I encoutered some problems when trying to load PDB files that do not have bonds information and more than one chain identifier, such as
When trying to load it, There will be an error from biotite on line 29 of
molecularnodes/entities/molecule/pdb.py
because of the "include_bonds=True" flag.
My solution
I forked the main repository and made the following modifications
Now the array is loaded without bonds, which are built afterwards. In case biotite raises an AtributeError, then the atoms are loaded without any bond information.
If you think this is a good solution I can submit a pull request later. I am also open to suggestions about this solution.
The text was updated successfully, but these errors were encountered: