Skip to content

Versioning #580

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Versioning #580

wants to merge 16 commits into from

Conversation

DavideTisi
Copy link
Contributor

@DavideTisi DavideTisi commented May 7, 2025

add checkpoint versioning. Default is 1. each arch have to implement a transfer function to a version to the other, now it just throw an error

Contributor (creator of pull-request) checklist

  • Tests updated (for new features and bugfixes)?
  • Documentation updated (for new features)?
  • Issue referenced (for PRs that solve an issue)?

Reviewer checklist

  • CHANGELOG updated with public API or any other important changes?

📚 Documentation preview 📚: https://metatrain--580.org.readthedocs.build/en/580/

@Copilot Copilot AI review requested due to automatic review settings May 7, 2025 13:16
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces checkpoint versioning by adding a version key to default hyperparameters and updating the model loading logic to handle version mismatches.

  • Added tests to verify that loading a checkpoint with a mismatched version raises a NotImplementedError.
  • Updated the model_from_checkpoint logic to convert version 0 to version 1 and to trigger checkpoint upgrade via the model’s upgrade_checkpoint method.
  • Added a new find_architectures_version utility function and updated various model and default-hypers.yaml files to include a default version number.

Reviewed Changes

Copilot reviewed 11 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/utils/test_io.py Added test case verifying error for wrong version checkpoint.
src/metatrain/utils/io.py Updated model loading logic to handle version checking and upgrades.
src/metatrain/utils/architectures.py Introduced find_architectures_version to retrieve current version value.
src/metatrain/soap_bpnn/model.py & default-hypers.yaml Added upgrade_checkpoint stub and version config for SOAP-BPNN model.
src/metatrain/pet/model.py & default-hypers.yaml Added upgrade_checkpoint stub and version config for PET model.
src/metatrain/gap/model.py & default-hypers.yaml Added upgrade_checkpoint stub and version config for GAP model.
src/metatrain/deprecated/pet/model.py & default-hypers.yaml Added upgrade_checkpoint stub and version config for deprecated PET.
Files not reviewed (4)
  • src/metatrain/deprecated/pet/schema-hypers.json: Language not supported
  • src/metatrain/gap/schema-hypers.json: Language not supported
  • src/metatrain/pet/schema-hypers.json: Language not supported
  • src/metatrain/soap_bpnn/schema-hypers.json: Language not supported

@DavideTisi DavideTisi force-pushed the versioning branch 2 times, most recently from 500c12b to 7d41219 Compare May 7, 2025 14:28
@DavideTisi DavideTisi requested review from Luthaf and PicoCentauri May 7, 2025 15:00
@@ -314,3 +314,11 @@ def export(
append_metadata_references(metadata, self.__default_metadata__)

return MetatensorAtomisticModel(self.eval(), metadata, capabilities)

def upgrade_checkpoint(checkpoint: Dict) -> Dict:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function should be documented in new-architecture.rst

],
)
def test_load_model_checkpoint_wrong_version(path):
file = RESOURCES_PATH / "model-64-bit-version5000000.ckpt"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't write in the current directory but move around to a tmpdir first

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but all the other test do that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i mean the test for the command-line write in the same folder, from which this reads the checkpoint that transforms

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, most tests start by changing to a temporary directory:

monkeypatch.chdir(tmp_path)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants