AWS Glue Iceberg REST only supports Access Key/Secret Signing #25257

Open
PeterAustinMoore opened this issue Mar 8, 2025 · 4 comments

Comments

@PeterAustinMoore

I have been scratching my head for a little while over why enabling SigV4 was not working when accessing the https://glue.us-west-2.amazonaws.com/iceberg endpoint (for non-S3Table access), with

{"message":"The security token included in the request is invalid."}

as the persistent result.

I compared this with pyiceberg, which has a similar setup. For tasks running in ECS and EKS, the current SigV4 signing appears to use only the access key and secret key from the S3 configuration, neglecting the session token. I saw some other conversations around a similar topic: #6102 & #6102

I am curious whether something similar to pyiceberg would work: if the access key and secret key are provided, great, use them; if not, grab the DefaultCredentialsProvider and use its key, secret, and session token to generate the SigV4 signature. Alternatively, perhaps reuse the S3 client's existing credentials (for example, use-web-identity-token-credentials-provider) instead of the values from the config.
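
To make the proposed fallback concrete, here is a minimal Python sketch. It is not Trino or pyiceberg code: `AwsCredentials` and `resolve_credentials` are hypothetical names, the config keys assume the secret is configured as `s3.aws-secret-key`, and the standard AWS environment variables stand in for a full default provider chain (which would also cover ECS task roles and web identity tokens).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AwsCredentials:
    """Hypothetical container for the three values SigV4 may need."""
    access_key: str
    secret_key: str
    session_token: Optional[str] = None  # only set for temporary credentials


def resolve_credentials(
    catalog_config: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> AwsCredentials:
    """Prefer explicitly configured keys; otherwise use ambient credentials."""
    access_key = catalog_config.get("s3.aws-access-key")
    secret_key = catalog_config.get("s3.aws-secret-key")  # assumed key name
    if access_key and secret_key:
        # Static keys from the catalog config: no session token involved.
        return AwsCredentials(access_key, secret_key)

    # Simplified stand-in for a default provider chain: only the standard
    # AWS environment variables are consulted in this sketch.
    try:
        return AwsCredentials(
            access_key=environ["AWS_ACCESS_KEY_ID"],
            secret_key=environ["AWS_SECRET_ACCESS_KEY"],
            session_token=environ.get("AWS_SESSION_TOKEN"),
        )
    except KeyError as missing:
        raise RuntimeError(
            f"no AWS credentials available: {missing} is not set"
        ) from None
```

The point of the sketch is the ordering: configured keys win, and the session token is carried along whenever the fallback path supplies one.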

@dain
Member

dain commented Mar 8, 2025

As part of the recent PR to remove the deprecated Glue V1 APIs, the Iceberg Glue configs were updated to match the Hive connector, so this may no longer be an issue (BTW, I'm not sure). I suggest trying with a build of the current code, or trying again when 473 is released.

@PeterAustinMoore
Author

Ah, this isn't for the direct Glue integration (that is working great) but for Glue's REST endpoint. The main reason is that we can use the REST catalog for both local and cloud runtimes (getting Glue itself running locally has been quite challenging). However, because the Glue REST endpoint is an AWS endpoint, requests need to be signed in order to work. The current code works if you are using API keys for a user, but if you assume a role, it seems you need to provide the session token as well.
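
A quick way to spot this situation is to look at the access key ID: long-term IAM user keys start with `AKIA`, while temporary credentials from an assumed role start with `ASIA` and are only valid together with their session token. The helper below is a small diagnostic sketch; `check_session_token` is a hypothetical name, not part of any library.

```python
from typing import Optional


def check_session_token(access_key_id: str, session_token: Optional[str]) -> Optional[str]:
    """Return a warning when temporary credentials lack their session token."""
    # ASIA-prefixed key IDs are issued by STS (assumed roles, ECS/EKS task roles).
    is_temporary = access_key_id.startswith("ASIA")
    if is_temporary and not session_token:
        return (
            "temporary access key (ASIA...) supplied without a session token; "
            "the signed request is likely to be rejected"
        )
    return None
```

For example, `check_session_token("ASIAEXAMPLE", None)` returns the warning string, while the same key paired with a token returns `None`.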

@electrum
Member

electrum commented Mar 10, 2025

Is there some documentation that explains the architecture or flow? This is so that the REST catalog can provide Glue credentials in the same way that it provides S3 credentials?

@PeterAustinMoore
Author

> Is there some documentation that explains the architecture or flow?

I wasn't able to find any; I just went into the code itself. In general, this flow is specific to AWS. If you set your rest-catalog.uri to https://glue.us-west-2.amazonaws.com/iceberg, you also have to configure rest-catalog.sigv4-enabled=true and signing-name=glue. At the moment, signing reads from your s3.aws-access-key and s3.aws-secret-key, which we historically haven't had to set because the regular AWS SDK reads from the AWS credentials (in our case, in ECS, via the assumed role).

> This is so that the REST catalog can provide Glue credentials in the same way that it provides S3 credentials?

Not quite. Because this is making an HTTP call (not using the SDK), you have to provide the signature along with the request. If you are using the SDK, it looks for credentials in a few places in order to do the authentication (for S3 and Glue). At least, that is my understanding.
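
To illustrate what providing the signature involves, below is a minimal SigV4 signing sketch using only the Python standard library. It is not the code Trino or pyiceberg runs; `sigv4_headers` is a hypothetical helper, and it assumes a path made of unreserved characters and a query string that is either empty or already in canonical form. The relevant detail is the optional `session_token`: when present, it is sent as `X-Amz-Security-Token` and included in the signed headers.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from typing import Dict, Optional
from urllib.parse import urlsplit


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sigv4_headers(
    method: str,
    url: str,
    region: str,
    service: str,
    access_key: str,
    secret_key: str,
    session_token: Optional[str] = None,
    body: bytes = b"",
    now: Optional[datetime] = None,
) -> Dict[str, str]:
    """Return the headers (including Authorization) for a SigV4-signed request."""
    now = now or datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    parts = urlsplit(url)

    headers = {"host": parts.netloc, "x-amz-date": amz_date}
    if session_token:
        # Temporary credentials: the token travels with the request and is signed.
        headers["x-amz-security-token"] = session_token

    # Step 1: canonical request (headers sorted by lowercase name).
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(f"{k}:{headers[k].strip()}\n" for k in sorted(headers))
    canonical_request = "\n".join([
        method.upper(),
        parts.path or "/",
        parts.query,  # assumed empty or already canonical in this sketch
        canonical_headers,
        signed_headers,
        hashlib.sha256(body).hexdigest(),
    ])

    # Step 2: string to sign, scoped to date/region/service.
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Step 3: derive the signing key and compute the signature.
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, service, "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers
```

A call such as `sigv4_headers("GET", "https://glue.us-west-2.amazonaws.com/iceberg/v1/config", "us-west-2", "glue", key, secret, session_token=token)` (the `/v1/config` path is illustrative) yields headers that can be attached to a plain HTTP request. Leaving out `session_token` while using role-based keys produces a well-formed signature that AWS cannot match to valid credentials.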
