Fix/job resume #1704

Merged
einari merged 117 commits into main from fix/job-resume
Mar 12, 2025

Conversation

woksin
Contributor

@woksin woksin commented Feb 12, 2025

Summary

Fixes many internal issues in the JobSystem, which should greatly improve the stability and resilience of the system.

Changed

  • JobState - backwards compatibility cannot be guaranteed, particularly because the values of the enum have changed
  • JobStepState - backwards compatibility cannot be guaranteed, particularly because the values of the enum have changed

Fixed

  • Resuming Jobs
  • Stopping / Pausing Jobs
  • Deleting Jobs
  • Starting Jobs
  • Resilience around jobs and persistence over system shutdown

… of state machine state. We were just lucky that, when starting jobs, we had stored the correct state on the Observer most of the time.
…n integration test because it can occasionally fail, since GetState is an interleaved method
…so rework Start so that it waits for the whole process of starting all job steps to finish before returning. This makes the code simpler and safer, because we don't have to think about being in multiple grain contexts within a single logical method execution, and we no longer need to use task completion sources and manual task continuations. We can do this now because I made JobsManager reentrant, so that we can perform multiple actions on separate jobs at the same time, which in my opinion makes more sense
…ests and then it is nice to wait for the Observer to be finished processing messages
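
To illustrate the Start rework described in the commit messages above, here is a minimal, language-neutral sketch in Python rather than the project's own code. It shows a start method that awaits the start of every job step before returning, instead of wiring up completion sources and continuations by hand. The names `Job`, `JobStep`, `start_job` and the enum values are hypothetical and exist only for this illustration; they are not the project's API.

```python
import asyncio
from enum import Enum


class JobStepState(Enum):
    # Hypothetical values for illustration only; not the project's actual enum.
    PENDING = "pending"
    RUNNING = "running"


class JobStep:
    """Hypothetical stand-in for a single job step."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = JobStepState.PENDING

    async def start(self) -> None:
        # Simulate the asynchronous work involved in starting a step.
        await asyncio.sleep(0)
        self.state = JobStepState.RUNNING


class Job:
    """Hypothetical stand-in for a job that owns several steps."""

    def __init__(self, steps: list[JobStep]) -> None:
        self.steps = steps

    async def start(self) -> None:
        # Await every step start before returning, so the caller observes a
        # fully started job without manually wired futures or continuations.
        await asyncio.gather(*(step.start() for step in self.steps))


async def start_job(step_names: list[str]) -> list[JobStepState]:
    job = Job([JobStep(name) for name in step_names])
    await job.start()
    return [step.state for step in job.steps]


def run_example() -> list[JobStepState]:
    return asyncio.run(start_job(["prepare", "replay", "finalize"]))
```

Because `start` only returns once all steps have started, a caller (or a test) can inspect step state immediately afterwards without extra synchronization. Several independent jobs could be started concurrently in the same way, for example by passing multiple `start_job(...)` calls to `asyncio.gather`, which loosely mirrors the idea of acting on separate jobs at the same time.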
@woksin woksin added the major label Mar 12, 2025
@einari einari merged commit 90d67d8 into main Mar 12, 2025
11 of 12 checks passed
@einari einari deleted the fix/job-resume branch March 12, 2025 11:05
2 participants