Add optional retry to workflow steps #1015

Open
osterman opened this issue Feb 4, 2025 · 0 comments

Describe the Feature

Workflow steps can fail transiently and may need to be retried. We should support an optional, configurable retry mechanism with backoff.

retry:
  max_attempts: 5       # Maximum number of retry attempts
  backoff_strategy: exponential # Options: "exponential", "constant", "linear"
  initial_delay: 2s     # Initial delay between retries (supports "ms", "s", "m", etc.)
  max_delay: 30s        # Maximum delay between retries
  random_jitter: true   # Whether to add jitter to backoff times
  multiplier: 2.0       # Multiplier for exponential backoff
  max_elapsed_time: 5m  # Maximum time to spend retrying before giving up
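To make the interaction between `backoff_strategy`, `initial_delay`, `multiplier`, and `max_delay` concrete, here is a minimal Python sketch of the wait schedule those settings could produce. It is an illustration, not a proposed implementation. Several things are assumptions not stated in this issue: `max_attempts` is read as the total number of attempts (so 5 attempts means 4 waits), "linear" is read as `initial_delay * n`, and jitter is modeled as a uniform draw between 0 and the capped delay. The function name `backoff_delays` is hypothetical.

```python
import random


def backoff_delays(max_attempts=5, strategy="exponential", initial_delay=2.0,
                   max_delay=30.0, multiplier=2.0, jitter=False, rng=None):
    """Return the waits (in seconds) between consecutive attempts.

    Assumption: max_attempts counts total attempts, so there is no wait
    after the final one.
    """
    rng = rng or random.Random()
    delays = []
    for retry in range(max_attempts - 1):
        if strategy == "constant":
            delay = initial_delay
        elif strategy == "linear":
            # Assumed meaning of "linear": grow by initial_delay each retry.
            delay = initial_delay * (retry + 1)
        elif strategy == "exponential":
            delay = initial_delay * multiplier ** retry
        else:
            raise ValueError(f"unknown backoff strategy: {strategy!r}")
        delay = min(delay, max_delay)  # max_delay caps every wait
        if jitter:
            # Assumed jitter model: uniform between 0 and the capped delay.
            delay = rng.uniform(0, delay)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    for name in ("exponential", "constant", "linear"):
        print(name, backoff_delays(strategy=name))
```

With the values from the example above and jitter disabled, the exponential schedule is 2s, 4s, 8s, 16s, and `max_delay` only starts to matter from the fifth wait onward.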

Expected Behavior

When a step exits non-zero and a retry is configured, the step is retried up to max_attempts times. If it still fails, the workflow exits non-zero.
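The behavior described above can be sketched as a small loop. This is a simplified Python illustration under stated assumptions, not how Atmos runs steps: `run_step_with_retry` is a hypothetical name, `max_attempts` is treated as the total number of attempts, and a fixed `delay` stands in for the configurable backoff.

```python
import subprocess
import time


def run_step_with_retry(command, max_attempts=1, delay=0.0, sleep=time.sleep):
    """Run a shell command, retrying while it exits non-zero.

    Returns (exit_code, attempts_made). The exit code of the last attempt
    is what the workflow would propagate.
    """
    attempts = 0
    while True:
        attempts += 1
        code = subprocess.run(command, shell=True).returncode
        if code == 0 or attempts >= max_attempts:
            return code, attempts
        sleep(delay)  # a real implementation would apply the backoff here


if __name__ == "__main__":
    code, attempts = run_step_with_retry("exit 1", max_attempts=3, delay=0.1)
    print(f"exit code {code} after {attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. With the default `max_attempts=1` the step runs once, which matches the retry being optional.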

Use Case

  • Run some raw terraform commands (e.g. terraform import) inside of a component directory.
  • Run some commands in /tmp (e.g. to download files and unpack them).

Describe Ideal Solution

Use https://github.com/cenkalti/backoff for the backoff logic, and allow retry to be configured on each step:

workflows:
  test-1:
    description: "Test workflow"
    steps:
      - command: echo Command 1
        name: step1
        type: shell
      
        # All parameters are optional
        retry:
          max_attempts: 5       # Maximum number of retry attempts
          backoff_strategy: exponential # Options: "exponential", "constant", "linear"
          initial_delay: 2s     # Initial delay between retries (supports "ms", "s", "m", etc.)
          max_delay: 30s        # Maximum delay between retries
          random_jitter: true   # Whether to add jitter to backoff times
          multiplier: 2.0       # Multiplier for exponential backoff
          max_elapsed_time: 5m  # Maximum time to spend retrying before giving up
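Because every parameter is optional and the delays are written as strings such as `2s` or `5m`, a loader would need to fill in missing keys and parse durations. The Python sketch below shows one way that could look. The default values in `DEFAULT_RETRY` are hypothetical (this issue does not specify any), the unit table only covers `ms`, `s`, `m`, and `h`, and the names `parse_duration` and `normalize_retry` are made up for illustration.

```python
import re

# Milliseconds per unit; the issue names "ms", "s", "m" and leaves the rest open.
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}
_DURATION_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")

# Hypothetical defaults: the issue says every parameter is optional but does
# not say what an omitted parameter should fall back to.
DEFAULT_RETRY = {
    "max_attempts": 1,
    "backoff_strategy": "constant",
    "initial_delay": "1s",
    "max_delay": "30s",
    "random_jitter": False,
    "multiplier": 2.0,
    "max_elapsed_time": "5m",
}
DURATION_KEYS = ("initial_delay", "max_delay", "max_elapsed_time")


def parse_duration(text):
    """Convert a single-unit duration string like '2s' or '500ms' to seconds."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_MS[unit] / 1000


def normalize_retry(raw):
    """Fill in omitted keys and convert duration strings to seconds."""
    unknown = set(raw) - set(DEFAULT_RETRY)
    if unknown:
        raise ValueError(f"unknown retry keys: {sorted(unknown)}")
    merged = {**DEFAULT_RETRY, **raw}
    for key in DURATION_KEYS:
        merged[key] = parse_duration(merged[key])
    return merged


if __name__ == "__main__":
    print(normalize_retry({"max_attempts": 5, "initial_delay": "2s"}))
```

Rejecting unknown keys is a design choice made here so that a typo such as `max_attempt` fails loudly instead of silently disabling retries.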

Alternatively, allow retry to be configured at the workflow level:

workflows:
  test-1:
    description: "Test workflow"
    # All parameters are optional
    retry:
      max_attempts: 5       # Maximum number of retry attempts
      backoff_strategy: exponential # Options: "exponential", "constant", "linear"
      initial_delay: 2s     # Initial delay between retries (supports "ms", "s", "m", etc.)
      max_delay: 30s        # Maximum delay between retries
      random_jitter: true   # Whether to add jitter to backoff times
      multiplier: 2.0       # Multiplier for exponential backoff
      max_elapsed_time: 5m  # Maximum time to spend retrying before giving up
    steps:
      - command: echo Command 1
        name: step1
        type: shell
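If both placements were supported, something would have to decide which settings apply to a given step. This issue does not define that, so the Python sketch below assumes one plausible rule: the workflow-level `retry` acts as a default for every step, and keys in a step-level `retry` override it one by one. The function name `effective_retry` and the dictionary shapes are hypothetical stand-ins for the parsed YAML.

```python
def effective_retry(workflow, step):
    """Merge workflow-level and step-level retry settings for one step.

    Assumed precedence: step-level keys win over workflow-level keys.
    Returns None when neither level configures a retry.
    """
    merged = dict(workflow.get("retry") or {})
    merged.update(step.get("retry") or {})
    return merged or None


if __name__ == "__main__":
    workflow = {
        "description": "Test workflow",
        "retry": {"max_attempts": 5, "backoff_strategy": "exponential"},
        "steps": [
            {"name": "step1", "command": "echo Command 1", "type": "shell"},
            {"name": "step2", "command": "echo Command 2", "type": "shell",
             "retry": {"max_attempts": 2}},
        ],
    }
    for step in workflow["steps"]:
        print(step["name"], effective_retry(workflow, step))
```

Here `step1` inherits the workflow settings unchanged, while `step2` keeps the exponential strategy but is limited to 2 attempts.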

Alternatives Considered

We could call an existing retry command from within the shell step. However, that would require the command to be installed on every machine that runs the workflow, and it would still need proper error handling.

Additional Context

No response
