[runless] report_asset_materialization endpoint #16602

Merged

Conversation

@alangenfeld alangenfeld commented Sep 18, 2023

Add a REST endpoint that one can POST to in order to report runless asset materializations.

The endpoint /report_asset_materialization/ takes:

  • asset key, supplied in one of three ways
    • in the URL path, where / is the multipart delimiter, as in from_user_string
    • as the query param asset_key, which expects the db encoding of the key (a JSON array of strings) or a single string
    • in the JSON POST body, with the value passed directly to AssetKey
  • other properties
    • passed as query params or in the JSON-encoded POST body
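
The three ways of supplying the asset key can be sketched as request builders. This is only an illustration: the base URL, the helper names, and the use of asset_key as the JSON body field name are assumptions, not taken from the PR, and nothing is actually sent over the network.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:3000"  # assumption: address of a local webserver
ENDPOINT = "/report_asset_materialization/"


def via_path(parts):
    """Asset key in the URL path; "/" separates the key parts."""
    return BASE_URL + ENDPOINT + "/".join(parts), None


def via_query(parts):
    """Asset key as the asset_key query param, encoded as a JSON array of strings."""
    query = urlencode({"asset_key": json.dumps(parts)})
    return f"{BASE_URL}{ENDPOINT}?{query}", None


def via_body(parts, **other):
    """Asset key plus any other properties in a JSON POST body.

    The "asset_key" field name is assumed to mirror the query param name.
    """
    return BASE_URL + ENDPOINT, json.dumps({"asset_key": parts, **other})
```

Each helper returns a (url, body) pair that could be handed to any HTTP client as a POST. Only the query and body variants can carry a key part that itself contains a "/".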

How I Tested These Changes

added test coverage

@@ -323,6 +405,11 @@ def build_routes(self):
"/download_debug/{run_id:str}",
self.download_debug_file_endpoint,
),
Route(
"/report_asset_materialization/{asset_key:path}",
Member Author

This differs from the report_runless_asset_event naming because we encode the event type as part of the URL path for ease of use.

Member

@schrockn schrockn left a comment

what about multi-part asset keys?

@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch from 46c8ce8 to 85aace8 Compare September 18, 2023 20:03
@alangenfeld alangenfeld dismissed schrockn’s stale review September 18, 2023 20:03

multipart key support added

@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch 2 times, most recently from 36ec391 to 7b6651d Compare September 18, 2023 21:41
capture_output=True,
shell=True,
text=True,
check=False,  # don't raise on non-zero exit codes
@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch from 7b6651d to 807f55b Compare September 19, 2023 17:33
Member

@schrockn schrockn left a comment

Still looking for a test case that has multipart keys.

Comment on lines 297 to 312
# multipart key
long_key = AssetKey(["foo", "bar", "baz"])
response = test_client.post(
    f"/report_asset_materialization/{long_key.to_user_string()}",
)
assert response.status_code == 200
evt = instance.get_latest_materialization_event(long_key)
assert evt
Member Author

multipart key test here

Member

  1. Can you add an assertion against the fully expanded URL (e.g. assert url == "report_asset_materialization/foo/bar/baz")?
  2. Can you add a test where asset key components contain "/"?

I'm concerned about the answer to 2.

Member Author

Opted to also accept asset_key via query params or the POST body, which provides ways to work around / in the asset key content. I could not find a clean way to handle URL-encoded / values in the path, since the web framework resolves those encodings.
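
A minimal sketch of why percent-encoding does not help for the path form, using only the standard library. The decoding step stands in for what the web framework is described as doing; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

# A two-part key whose first part contains a literal "/".
parts = ["a/b", "c"]

# Client side: percent-encode each part, then join the parts with "/".
encoded_path = "/".join(quote(p, safe="") for p in parts)

# Server side (assumption): the path is percent-decoded before the handler runs.
decoded_path = unquote(encoded_path)

# Splitting the decoded path on "/" can no longer tell the two kinds of "/" apart.
recovered = decoded_path.split("/")
```

Here encoded_path is "a%2Fb/c", but recovered has three parts rather than the original two, which is why the query param and POST body forms are needed for such keys.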

@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch 2 times, most recently from eb5f3b3 to b387d7b Compare September 20, 2023 18:54
@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch from b387d7b to 5618d6d Compare September 20, 2023 18:59
@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch 3 times, most recently from ce88e4a to 5473048 Compare September 20, 2023 21:13
Member

@schrockn schrockn left a comment

Cool.

Two questions:

  1. Are we sure we want an endpoint per type instead of a single endpoint for all event types?
  2. Can we put something in place to make sure the REST endpoint stays up-to-date with ExtContext methods?

One possible path to doing this is writing an ExtContext implementation that hits the REST endpoint.

)

asset_key = None
if request.path_params.get("asset_key"):
Member

Can we make sure we write a test or something that ensures that this endpoint stays up-to-date with the upcoming report_asset_materialization method on ExtContext?

@alangenfeld
Member Author

Are we sure we want an endpoint per type instead of a single endpoint for all event types?

Not sure. What pushed me to the current scheme was trying to support the simple base case of only posting to a path, which led me to encode the event type in the path. Some other ideas:

  • we could still have one "endpoint" and encode the type in the path: /report_asset/<event_type>/<asset>/<key>/<parts>
  • we could have one endpoint that assumes materialization by default and takes an optional type arg

Definitely open to other schemes as well.
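
As a rough sketch of the first idea, a single handler could split the event type off the front of the path and treat the rest as the asset key. The function name and the set of event types below are hypothetical; the PR as merged does not do this.

```python
# Hypothetical set of event types; only materialization is covered by this PR.
KNOWN_EVENT_TYPES = {"materialization", "observation"}
PREFIX = "/report_asset/"


def parse_report_path(path):
    """Split /report_asset/<event_type>/<asset>/<key>/<parts> into its pieces."""
    if not path.startswith(PREFIX):
        raise ValueError(f"unexpected path: {path}")
    # The first segment after the prefix is the event type, the rest is the key.
    event_type, _, key_path = path[len(PREFIX):].partition("/")
    if event_type not in KNOWN_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    if not key_path:
        raise ValueError("missing asset key")
    return event_type, key_path.split("/")
```

This keeps the "just post to a path" base case while leaving room for other event types under one route.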

@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch from 5473048 to b169682 Compare September 21, 2023 21:01
@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch 3 times, most recently from f835959 to 27696c0 Compare September 21, 2023 21:37
@alangenfeld alangenfeld force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch from 27696c0 to 7ee1775 Compare September 21, 2023 21:44
Comment on lines 453 to 454
# things that are missing on ExtContext.report_asset_materialization
KNOWN_DIFF = {"partition", "description"}
Member Author

Should these be added to ExtContext.report_asset_materialization, or is the premise of keeping these things in alignment not right?
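
One way such an alignment check could look, sketched with stand-ins. FakeExtContext, its parameter list, and ENDPOINT_PARAMS are hypothetical placeholders; only the KNOWN_DIFF set comes from the diff above.

```python
import inspect


class FakeExtContext:
    """Hypothetical stand-in for ExtContext; the real signature may differ."""

    def report_asset_materialization(self, asset_key=None, metadata=None, data_version=None):
        ...


# Hypothetical set of parameter names accepted by the REST endpoint.
ENDPOINT_PARAMS = {"asset_key", "metadata", "data_version", "partition", "description"}

# Things that are missing on ExtContext.report_asset_materialization.
KNOWN_DIFF = {"partition", "description"}


def unexpected_param_diff(method, endpoint_params, known_diff):
    """Return parameter names that differ beyond the known, accepted differences."""
    method_params = set(inspect.signature(method).parameters) - {"self"}
    return (endpoint_params - known_diff) ^ method_params
```

A test would assert that the returned set is empty, so adding a parameter on either side without updating the other (or KNOWN_DIFF) fails loudly.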

Comment on lines +248 to +256
tags = None
if ReportAssetMatParam.data_version in json_body:
    tags = {
        DATA_VERSION_TAG: json_body[ReportAssetMatParam.data_version],
        DATA_VERSION_IS_USER_PROVIDED_TAG: "true",
    }
elif ReportAssetMatParam.data_version in request.query_params:
    tags = {
        DATA_VERSION_TAG: request.query_params[ReportAssetMatParam.data_version],
        DATA_VERSION_IS_USER_PROVIDED_TAG: "true",
    }
Member Author

@smackesey to confirm this data_version handling
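
For reference, the precedence in the snippet above (JSON body first, then query params) can be expressed as a small standalone helper. The constant values here are placeholders rather than the real Dagster tag names, and the helper itself is hypothetical.

```python
# Placeholder values; the real constants are defined in Dagster.
DATA_VERSION_TAG = "data_version_tag_placeholder"
DATA_VERSION_IS_USER_PROVIDED_TAG = "data_version_is_user_provided_placeholder"
DATA_VERSION_PARAM = "data_version"  # stands in for ReportAssetMatParam.data_version


def data_version_tags(json_body, query_params):
    """Build tags for a user-supplied data version, preferring the JSON body."""
    for source in (json_body, query_params):
        if DATA_VERSION_PARAM in source:
            return {
                DATA_VERSION_TAG: source[DATA_VERSION_PARAM],
                DATA_VERSION_IS_USER_PROVIDED_TAG: "true",
            }
    # No data version supplied in either place.
    return None
```

The loop removes the duplicated dict literal while keeping the same behavior: a value in the body wins over one in the query string, and tags stay None when neither is present.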

@alangenfeld alangenfeld dismissed schrockn’s stale review September 21, 2023 21:46

for conversing on report API sync

@Ramshackle-Jamathon Ramshackle-Jamathon force-pushed the al/09-18-_runless_report_asset_materialization_endpoint branch from 7ee1775 to 22219ad Compare October 3, 2023 18:31
@alangenfeld alangenfeld merged commit cedfc82 into master Oct 6, 2023
@alangenfeld alangenfeld deleted the al/09-18-_runless_report_asset_materialization_endpoint branch October 6, 2023 21:37