AtomArray Direct Loading #751

Open
wants to merge 13 commits into
base: main

Conversation

zachcp
Contributor

@zachcp zachcp commented Feb 24, 2025

WIP RE: issue #750

Pencilling out a solution before getting too deep.

@BradyAJohnston
Owner

LGTM, some discussion around overall implementation in the issue

        pass

    def _validate_structure(self, array: AtomArray):
        # TODO: implement entity ID, sec_struct for PDB files
Contributor Author

A complete implementation should handle these fields as well by setting defaults. The assumption here is that the array is being passed in directly and the file is no longer available. I left it unimplemented pending a good solution.
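
One way the defaults could be set, as a minimal sketch only: fill in any missing per-atom annotation with a placeholder. A plain dict of lists stands in for the biotite AtomArray here, and `fill_missing_annotations`, `DEFAULTS`, and the default values are hypothetical names and choices, not part of this PR.

```python
from typing import Any

# Hypothetical defaults for annotations that would normally come from the file.
# The field names follow the TODO above; the values are assumptions.
DEFAULTS: dict[str, Any] = {"entity_id": 0, "sec_struct": 0}


def fill_missing_annotations(
    annotations: dict[str, list], n_atoms: int
) -> dict[str, list]:
    """Return a copy of `annotations` with every missing field set to its default."""
    # Copy so the caller's data is not modified in place.
    filled = {name: list(values) for name, values in annotations.items()}
    for name, default in DEFAULTS.items():
        if name not in filled:
            # One default value per atom.
            filled[name] = [default] * n_atoms
    return filled


# Stand-in for an array that was passed in without file-derived fields.
example = fill_missing_annotations({"chain_id": ["A", "A", "B"]}, n_atoms=3)
```

Fields that are already present are left untouched, so an array that arrives with real values would keep them.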

@@ -65,6 +64,12 @@ def __init__(self, file_path: Union[str, Path, io.BytesIO]):
        self._frames_collection: str | None
        self._entity_type = EntityType.MOLECULE


    @classmethod
Contributor Author

The preferred entry point is now a class method instead of overloading the init. Much cleaner.
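
For illustration, the class-method pattern looks roughly like this. It is a sketch with a stand-in `Molecule` class, not the code in this PR; the method name `from_array` and the attribute names are assumptions.

```python
from pathlib import Path


class Molecule:
    """Stand-in for the real class; only the constructor pattern matters here."""

    def __init__(self, file_path=None):
        # __init__ keeps its single, file-based responsibility.
        self.file_path = Path(file_path) if file_path is not None else None
        self.array = None

    @classmethod
    def from_array(cls, array):
        # Hypothetical alternate constructor: build an instance without a file.
        obj = cls()
        obj.array = array
        return obj


# A list of coordinates stands in for a biotite AtomArray.
mol = Molecule.from_array([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)])
```

Keeping the array path out of `__init__` means the file-based signature does not need extra type checks or optional arguments.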

from ..base import EntityType


class Array(Molecule):
Contributor Author

All the logic is in a new Array class analogous to the others. This has to do a lot less work, though!

@zachcp zachcp marked this pull request as ready for review February 25, 2025 17:30
@zachcp
Contributor Author

zachcp commented Feb 25, 2025

Much simplified PR that is narrower in scope and follows the suggestions from the issue thread. Namely:

  1. Subclass Molecule as you do for other IO
  2. Do not use init

@BradyAJohnston
Owner

After our discussion this morning, I think we concluded that a new, distinct AtomArray class isn't the right way to go. Instead, we should roll more into Molecule, computing and storing all of the array-relevant data on Molecule.array, with a potential init or class method to read a biotite.AtomArray directly.

@zachcp
Contributor Author

zachcp commented Feb 26, 2025

Hey Brady, I'll write up my notes, but we will need a way to handle and validate the inputs, including for Array. So I would merge this, and I can reorganize all of the inputs as a follow-up.
