Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a "scheduled_execution_timestamp" field to schedule tick data #26822

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

gibsondan
Copy link
Member

Summary:
This paves the way to make schedule retry behavior more like sensor retry behavior. Right now if a schedule fails, it retries for the same time using the same tick, and the timestamp of that tick is synonymous with the scheduled time in the cron schedule. This differs from sensor retries. This divergence in behavior is confusing and leads to issues like schedule failure alerts leading nowhere, since the schedule tick retried and succeeded.

By adding this field, we can eventually change this behavior so that a retried schedule tick keeps the scheduled_execution_timestamp field, but updates the timestamp field and creates a new tick on the timeline.

Insert changelog entry or delete this section.

Summary & Motivation

How I Tested These Changes

Changelog

Insert changelog entry or delete this section.

> Insert changelog entry or delete this section.
> Insert changelog entry or delete this section.
@gibsondan gibsondan changed the title Add a "scheduled_execution_timestamp" field to scheudle tick data Add a "scheduled_execution_timestamp" field to schedule tick data Jan 3, 2025
@gibsondan gibsondan force-pushed the consecutivefailurecount2 branch from 0c2a6ef to ae401a6 Compare January 3, 2025 22:34
Copy link

github-actions bot commented Jan 3, 2025

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-q0wdap46u-elementl.vercel.app
https://consecutivefailurecount2.core-storybook.dagster-docs.io

Built with commit 55088ae.
This pull request is being automatically deployed with vercel-action

Summary:
This paves the way to make schedule retry behavior more like sensor retry behavior. Right now if a schedule fails, it retries for the same time using the same tick, and the timestamp of that tick is synonymous with the scheduled time in the cron schedule. This differs from sensor retries. This divergence in behavior is confusing and leads to issues like schedule failure alerts leading nowhere, since the schedule tick retried and succeeded.

By adding this field, we can eventually change this behavior so that a retried schedule tick keeps the scheduled_execution_timestamp field, but updates the timestamp field and creates a new tick on the timeline.

> Insert changelog entry or delete this section.
@gibsondan gibsondan force-pushed the consecutivefailurecount2 branch from ae401a6 to 55088ae Compare January 6, 2025 22:13
@gibsondan gibsondan requested review from alangenfeld and prha January 6, 2025 22:13
@@ -551,6 +563,7 @@ def automation_condition_evaluation_id(self) -> int:
"instigator_name": "job_name",
"instigator_type": "job_type",
},
skip_when_none_fields={"scheduled_execution_time"},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is optional but i figured why bloat the serdes object for every sensor tick

@prha
Copy link
Member

prha commented Jan 7, 2025

I want to sanity check some assumptions... can there be an unrelated tick for a different scheduled execution time in between a failed schedule tick and its retry?

I'm asking because of this tick fetch, which will always fetch by timestamp and want to verify that we can't get into some perverse state when schedule_execution_time no longer matches timestamp.

@gibsondan
Copy link
Member Author

no, for a given schedule we only ever move forwards in.. scheduled execution time.

it's not quite ready yet, but the next step here will be #26824 - which starts writing the current time instead of the scheduled execution time, so retried ticks will have the same scheduled execution time but a different timestamp. But we would only ever do the most recent scheduled execution timestamp

@gibsondan gibsondan force-pushed the consecutivefailurecount branch 3 times, most recently from 3934081 to df6681f Compare January 12, 2025 22:30
Base automatically changed from consecutivefailurecount to master January 13, 2025 03:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants