Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing test for multipart uploads with checksums #6794

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

avantgardnerio
Copy link
Contributor

Which issue does this PR close?

Closes #6793.

Rationale for this change

Described in issue

What changes are included in this PR?

A failing test

Are there any user-facing changes?

Multipart uploads fail

@github-actions github-actions bot added the object-store Object Store Interface label Nov 25, 2024
@avantgardnerio avantgardnerio changed the title Failing test for multipart uploads with signatures Failing test for multipart uploads with digests Nov 25, 2024
@avantgardnerio avantgardnerio changed the title Failing test for multipart uploads with digests Failing test for multipart uploads with checksums Nov 25, 2024
@thinkharderdev
Copy link
Contributor

thinkharderdev commented Nov 25, 2024

So there are a couple of issues with how multi-part upload is implemented:

  1. The initial PUT request to CreateMultipartUpload must specify the checksum algo or else calling PutPart with the checksum will fail (eg set x-amz-checksum-algorithm: SHA256 on the request)
  2. When calling CompleteMultipartUpload we need to send all the part checksums in the request (see `https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html)

So this means that we need to keep the checksums in the PartId for when we complete the upload

@tustvold
Copy link
Contributor

I'm not sure about this, an AWS specific detail now leaks into the public trait for all stores... It also is not immediately obvious when the checksums are needed or not...

I think this needs a bit more thought, I'll try to find some time in the next few weeks

@tustvold tustvold added the api-change Changes to the arrow API label Nov 27, 2024
@avantgardnerio
Copy link
Contributor Author

this, an AWS specific detail now leaks into the public trait for all stores... It also is not immediately obvious when the checksums are needed or not...

FWIW: I verified in the new tests we are now writing the checksums at the correct times for when the digest functionality is enabled.

Now I am just trying to get the original tests working for when it isn't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change Changes to the arrow API object-store Object Store Interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi-part s3 uploads fail when using checksum
3 participants