Respect default_bucket_prefix by default for Estimator code_location #5208

Open
@NathanCYee

Describe the feature you'd like
Currently, the code_location attribute of the Estimator class defaults to a location derived from the output bucket, as the documentation states:

If not specified, the default code location is ‘s3://output_bucket/job-name/’.

The Session object also has a parameter default_bucket_prefix that can be configured.

Ideally, if both of the following hold:

  1. The output_bucket part of output_path is the default_bucket
  2. _is_output_path_set_from_default_bucket_and_prefix is False

then the default code location should respect both the default_bucket and the default_bucket_prefix,

e.g. s3://default_bucket/default_bucket_prefix/job-name/

This change would be implemented in _stage_user_code_in_s3.
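
The requested resolution could be sketched roughly as below. This is only an illustration of the intended result, not the SDK's code: `default_code_location` and its parameters are hypothetical names, the values are passed in as plain strings instead of being read from an Estimator or Session, and the `_is_output_path_set_from_default_bucket_and_prefix` condition is not modeled.

```python
from typing import Optional
from urllib.parse import urlparse


def default_code_location(
    output_path: str,
    default_bucket: str,
    default_bucket_prefix: Optional[str],
    job_name: str,
) -> str:
    """Hypothetical helper showing the proposed default code location."""
    # For "s3://bucket/key", urlparse puts the bucket name in netloc.
    output_bucket = urlparse(output_path).netloc

    # Proposed behavior: when the output bucket is the session's default
    # bucket and a prefix is configured, keep code under that prefix.
    if output_bucket == default_bucket and default_bucket_prefix:
        prefix = default_bucket_prefix.strip("/")
        return f"s3://{output_bucket}/{prefix}/{job_name}/"

    # Documented default today: code lands at the root of the output bucket.
    return f"s3://{output_bucket}/{job_name}/"
```

With a default bucket `my-bucket` and prefix `proj/dev`, an output_path in `my-bucket` would resolve to `s3://my-bucket/proj/dev/job-1/`, while an output_path in any other bucket keeps the current `s3://output_bucket/job-name/` form.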

Otherwise, the default behavior creates artifacts at the root of the bucket. In environments where IAM write access to the bucket is restricted to a prefix (e.g. SageMaker Unified Studio), uploads using the default settings fail.

How would this feature be used? Please describe.
If this behavior is implemented, model code assets would be uploaded by default to a prefix where write access is allowed.

Describe alternatives you've considered
Currently, code_location must be configured manually to work in SageMaker Unified Studio. This is poorly documented for features such as ModelStep, where it must be set in repack_model_step_settings because the model.register output populates output_path by default in a pipeline.
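
As an illustration of the manual workaround, the prefix can be assembled from the session and passed as code_location. This is a minimal sketch: `project_code_location` is a hypothetical helper, and it assumes only that the session object exposes a `default_bucket()` method and a `default_bucket_prefix` attribute.

```python
def project_code_location(session) -> str:
    """Hypothetical helper: build an S3 URI under the session's default prefix."""
    bucket = session.default_bucket()
    # default_bucket_prefix may be None when no prefix is configured.
    prefix = (session.default_bucket_prefix or "").strip("/")
    return f"s3://{bucket}/{prefix}" if prefix else f"s3://{bucket}"


# Assumed usage (other Estimator arguments omitted here):
#   estimator = Estimator(..., sagemaker_session=session,
#                         code_location=project_code_location(session))
```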

If this change cannot be implemented in code, the documentation should explicitly describe which parameters to configure so that code is uploaded under the SageMaker Unified Studio project prefix.
