change default shard size to 1GB #357

Merged
merged 3 commits into from
Feb 27, 2025

Conversation

spencerschrock
Contributor

Summary

The value was based on the benchmarks in #356.

The exact speed improvement depends on the size of the models being
serialized and of the individual files in the model. Speed improvements
ranged from a 5% improvement to an 87% improvement.

Manifest size is also influenced by shard size (and the model /
individual file size). Increasing shard size from ~1MB to 1GB decreases
manifest size by 3 orders of magnitude (99.9% reduction).
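The manifest arithmetic above can be sketched as follows. This is a rough illustration, not the project's implementation: it assumes the manifest holds one entry per shard, that "~1MB" means 1_000_000 bytes, and that the example file is 10 GB. The helper name `shard_count` is hypothetical.

```python
def shard_count(file_size: int, shard_size: int) -> int:
    """Number of shards needed to cover a file of file_size bytes.

    Assumption for this sketch: one manifest entry per shard, and an
    empty file still counts as a single entry.
    """
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    # Integer ceiling division avoids floating point rounding.
    return max(1, (file_size + shard_size - 1) // shard_size)


FILE_SIZE = 10_000_000_000        # hypothetical 10 GB model file
OLD_SHARD_SIZE = 1_000_000        # assumed meaning of "~1MB"
NEW_SHARD_SIZE = 1_000_000_000    # 1 GB

old_entries = shard_count(FILE_SIZE, OLD_SHARD_SIZE)
new_entries = shard_count(FILE_SIZE, NEW_SHARD_SIZE)
reduction = 1 - new_entries / old_entries

print(old_entries, new_entries, f"{reduction:.1%}")
```

Under these assumptions the entry count drops by a factor of 1000, which matches the three orders of magnitude quoted above. Files smaller than the shard size see less benefit, since they already fit in a single shard.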

Release Note

  • Fine-tuned the shard size used when hashing files, resulting in a modest speed improvement. The exact improvement depends on model size, but we observed improvements of 5-87%. Manifest size was also reduced by 99.9%.

Documentation

NONE

Contributor Author

@spencerschrock spencerschrock left a comment

Lots of diminishing returns on speed after 50 MiB shard size, but manifest size still decreases until we reach the approximate size of the normal file-based manifest (but at that point having shards at all is pointless).

Given HuggingFace has a default shard size of 5GB (5_000_000_000), I'm assuming something in the 1-2G(i?)B range makes sense here. Happy to do either.
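For reference, the candidate sizes in the 1-2G(i?)B range work out to the byte counts below. This is only a unit-conversion sketch; the constant names and the `candidates` dictionary are illustrative and not taken from the project.

```python
GB = 1000 ** 3    # decimal gigabyte
GIB = 1024 ** 3   # binary gibibyte

HF_DEFAULT = 5_000_000_000  # the 5GB figure quoted in the comment above

# Candidate shard sizes, in bytes.
candidates = {
    "1 GB": 1 * GB,
    "1 GiB": 1 * GIB,
    "2 GB": 2 * GB,
    "2 GiB": 2 * GIB,
}

for name, size in candidates.items():
    # Show each candidate in bytes and as a fraction of the 5GB figure.
    print(f"{name:>5}: {size:>13,} bytes ({size / HF_DEFAULT:.1%} of 5GB)")
```

The GB and GiB variants differ by roughly 7%, so the choice between them is mostly about readability rather than shard count.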

@spencerschrock spencerschrock marked this pull request as ready for review February 26, 2025 19:26
@spencerschrock spencerschrock requested review from a team as code owners February 26, 2025 19:26
mihaimaruseac
mihaimaruseac previously approved these changes Feb 26, 2025
The value was based on the benchmarks in b2ddcc0.

The exact speed improvement depends on the size of the models being
serialized and of the individual files in the model. Speed improvements
ranged from a 5% improvement to an 87% improvement.

Manifest size is also influenced by shard size (and the model /
individual file size). Increasing shard size from ~1MB to 1GB decreases
manifest size by 3 orders of magnitude (99.9% reduction).

Signed-off-by: Spencer Schrock <[email protected]>
Signed-off-by: Spencer Schrock <[email protected]>
They're roughly equal in performance; any differences vary either
model to model or run to run. However, multiples of 1000 are slightly
easier for humans to visualize in things like shard names.

Signed-off-by: Spencer Schrock <[email protected]>
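To illustrate the readability point in the commit message above, here is a small sketch of shard boundaries for a decimal versus a binary shard size. The `path:start:end` naming scheme, the file name, and the 2.5 GB file size are assumptions made for illustration; the project's actual shard naming may differ.

```python
from typing import Iterator, Tuple


def shard_ranges(file_size: int, shard_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) byte ranges that cover the file."""
    for start in range(0, file_size, shard_size):
        yield start, min(start + shard_size, file_size)


def shard_name(path: str, start: int, end: int) -> str:
    # Hypothetical naming scheme, used here only to compare readability.
    return f"{path}:{start}:{end}"


FILE_SIZE = 2_500_000_000  # hypothetical 2.5 GB file

names_gb = [shard_name("model.bin", s, e)
            for s, e in shard_ranges(FILE_SIZE, 1000 ** 3)]
names_gib = [shard_name("model.bin", s, e)
             for s, e in shard_ranges(FILE_SIZE, 1024 ** 3)]

for gb, gib in zip(names_gb, names_gib):
    print(f"{gb:<40} {gib}")
```

Both sizes produce three shards for this file, but the decimal boundaries (1000000000, 2000000000) are easier to read at a glance than the binary ones (1073741824, 2147483648).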
@mihaimaruseac mihaimaruseac merged commit 4a610f7 into sigstore:main Feb 27, 2025
33 checks passed
@spencerschrock spencerschrock deleted the shard-opt branch February 27, 2025 15:01
2 participants