Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancel Running Simulations #1243

Merged
merged 7 commits into from
Nov 21, 2023
Merged

Cancel Running Simulations #1243

merged 7 commits into from
Nov 21, 2023

Conversation

Mythicaeda
Copy link
Contributor

@Mythicaeda Mythicaeda commented Nov 14, 2023

Description

Adds a SimulationCancelListener that, upon receiving a signal to cancel a simulation from the DB, sets a flag to tell the Driver to stop before it either picks up or begins to perform more work. With this change, the only time a simulation will not be stoppable aside from shutting down the container is if the user code contains an infinite loop. It also alters how the canceled flag is set in the DB to accurately reflect that "if this flag is set, then the simulation was canceled".

Breakdown of commits:

  • Commit 1 removes our auto-cancel triggers and sends a notification to the simulation_cancel channel when a simulation is canceled. The auto-cancel triggers were removed in order to give the users more control over when their simulations are canceled. This does mean that if anything is updated between a pending request and when it is picked up, it will now fail with an "Request is no longer relevant" message.
  • Commit 2 adds Java code to listen for the new notification added in (1). Rather than using a separate DB connection for this, it was added to the list of subscribed channels in the Worker's Listeners.
  • Commit 3 threads this listener through to the SimulationDriver. This is a breaking change, because it alters the signature of simulate.
  • Commit 4 alters the behavior of the Driver to check if the simulation has been canceled before it fetches or runs a batch, and, if it has, to return the results up until when it was canceled with its reason field set to how far into simulation it was (using the same JSON structure as when simulation fails with an exception).
  • Commits 5-7 update our tests.

Verification

  • A new e2e test was added to check that canceling works and returns incomplete results. This test uses the foo model.
  • DB Tests checking the removed triggers were also removed
  • SimulationDatasets in our e2eTests now contain the information from the status and reason columns.
  • Unit tests have been updated with the new simulate syntax. All of them were set to be uncancelable.

Documentation

No docs need to be updated.

Future work

  • Is there a generic name for SimulationFailure? Prior to this PR, it was exclusively being used to fill in the reason column for failed simulations, but it's now also filling in the reason column for canceled sims.
  • Scheduling ticket: Need a way to stop a scheduling run #1241
  • UI needs to be updated to 1) allow canceling an inprogress simulation and 2) to properly display what a canceled simulation looks like (Useful note: canceled sims can only be incomplete or pending. A failed or success sim with canceled set to true implies that the simulation either finished or failed before it received the canceled signal)

@Mythicaeda Mythicaeda requested a review from a team as a code owner November 14, 2023 01:02
@Mythicaeda Mythicaeda added breaking change A change that will require updating downstream code feature A new feature or feature request simulation Anything related to the simulation domain labels Nov 14, 2023
@Mythicaeda Mythicaeda force-pushed the feat/cancel-simulation branch from 5e9800f to c0293dd Compare November 14, 2023 01:13
@Mythicaeda Mythicaeda force-pushed the feat/cancel-simulation branch from c0293dd to 67e9505 Compare November 14, 2023 17:47
@Mythicaeda Mythicaeda force-pushed the feat/cancel-simulation branch from 67e9505 to ec4e7c2 Compare November 14, 2023 23:15
Copy link
Collaborator

@mattdailis mattdailis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work! No major concerns

Key:
🔻 Issues to address before merge

🔶 Requests that should not block merge, but should at least be discussed

🔵 Recommendations that can be ignored if desired

Copy link
Collaborator

@mattdailis mattdailis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick manual test, re-approving 👍

@Mythicaeda Mythicaeda merged commit 9fa3cd4 into develop Nov 21, 2023
@Mythicaeda Mythicaeda deleted the feat/cancel-simulation branch November 21, 2023 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change A change that will require updating downstream code feature A new feature or feature request simulation Anything related to the simulation domain
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need a way to stop a running simulation
3 participants