SCRIPT_RUN_ROLLBACK failed when executing multiple SCRIPT_RUN stages. #5163

Open
ffjlabo opened this issue Aug 28, 2024 · 3 comments
kind/bug Something isn't working

Comments


ffjlabo commented Aug 28, 2024

What happened:

When a deployment whose pipeline contains multiple SCRIPT_RUN stages is rolled back, the SCRIPT_RUN_ROLLBACK stage fails.


What you expected to happen:

The SCRIPT_RUN_ROLLBACK stage finishes successfully.

How to reproduce it:

Run a deployment with multiple SCRIPT_RUN stages, and cancel it after some of them have started executing.

apiVersion: pipecd.dev/v1beta1
kind: KubernetesApp
spec:
  name: script-run-like-jenkins
  labels:
    env: example
    team: product
  pipeline:
    stages:
      - name: SCRIPT_RUN
        with:
          run: |
            sh script.sh
          onRollback: |
            echo rollback
      - name: SCRIPT_RUN
        with:
          run: |
            sleep 10
            sh script.sh
          onRollback: |
            echo $SR_DEPLOYMENT_ID
            echo $SR_APPLICATION_ID
            echo $SR_APPLICATION_NAME
            echo $SR_TRIGGERED_AT
            echo $SR_TRIGGERED_COMMIT_HASH
            echo $SR_REPOSITORY_URL
            echo $SR_SUMMARY
            echo $SR_CONTEXT_RAW
            sh script.sh
      - name: SCRIPT_RUN
        with:
          run: |
            sleep 10
            sh script.sh

Environment:

  • piped version:
  • control-plane version:
  • Others:
@ffjlabo ffjlabo added the kind/bug Something isn't working label Aug 28, 2024

ffjlabo commented Aug 30, 2024

[root cause]
The error occurs when piped tries to store a stage log for a SCRIPT_RUN_ROLLBACK stage that has already completed.

piped identifies the target stage by its stage ID when storing the stage log.

The ID of each predefined stage is a constant value.

var predefinedStages = map[string]config.PipelineStage{
	PredefinedStageK8sSync: {
		ID:   PredefinedStageK8sSync,
		Name: model.StageK8sSync,
		Desc: "Sync by applying all manifests",
	},
	PredefinedStageTerraformSync: {
		ID:   PredefinedStageTerraformSync,
		Name: model.StageTerraformSync,
		Desc: "Sync by automatically applying any detected changes",
	},
	PredefinedStageCloudRunSync: {
		ID:   PredefinedStageCloudRunSync,
		Name: model.StageCloudRunSync,
		Desc: "Deploy the new version and configure all traffic to it",
	},
	PredefinedStageLambdaSync: {
		ID:   PredefinedStageLambdaSync,
		Name: model.StageLambdaSync,
		Desc: "Deploy the new version and configure all traffic to it",
	},
	PredefinedStageECSSync: {
		ID:   PredefinedStageECSSync,
		Name: model.StageECSSync,
		Desc: "Deploy the new version and configure all traffic to it",
	},
	PredefinedStageRollback: {
		ID:   PredefinedStageRollback,
		Name: model.StageRollback,
		Desc: "Rollback the deployment",
	},
	PredefinedStageCustomSyncRollback: {
		ID:   PredefinedStageCustomSyncRollback,
		Name: model.StageCustomSyncRollback,
		Desc: "Rollback the custom stages",
	},
	PredefinedStageScriptRunRollback: {
		ID:   PredefinedStageScriptRunRollback,
		Name: model.StageScriptRunRollback,
		Desc: "Rollback the script run stage",
	},
}

So if the pipeline contains multiple predefined stages of the same kind, they all share the same ID, and piped ends up referring to the one that has already completed.


ffjlabo commented Sep 6, 2024

I tried adding a suffix to the stage ID of the SCRIPT_RUN_ROLLBACK stage, like this:
7a475a6

But it failed during rollback.

The error comes from looking up the stage config by stage ID when the stage is executed.

// Load the stage configuration.
var stageConfig config.PipelineStage
var stageConfigFound bool
if ps.Predefined {
	stageConfig, stageConfigFound = pln.GetPredefinedStage(ps.Id)
} else {
	stageConfig, stageConfigFound = s.genericApplicationConfig.GetStage(ps.Index)
}
if !stageConfigFound {
	lp.Error("Unable to find the stage configuration")
	if err := s.reportStageStatus(ctx, ps.Id, model.StageStatus_STAGE_FAILURE, ps.Requires); err != nil {
		s.logger.Error("failed to report stage status", zap.Error(err))
	}
	return model.StageStatus_STAGE_FAILURE
}
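The following sketch shows why a suffixed ID would miss in a lookup like the one above. It assumes the predefined stages live in a map keyed by the exact stage ID, as in the `predefinedStages` snippet earlier in this thread. The types, the `getPredefinedStage` helper, the constant value `"ScriptRunRollback"`, and the `-1` suffix format are illustrative assumptions, not the real piped definitions or the format used in the commit.

```go
package main

import "fmt"

// pipelineStage is a simplified, hypothetical version of config.PipelineStage.
type pipelineStage struct {
	ID   string
	Name string
	Desc string
}

// Assumed constant value; the real one may differ.
const predefinedStageScriptRunRollback = "ScriptRunRollback"

var predefinedStages = map[string]pipelineStage{
	predefinedStageScriptRunRollback: {
		ID:   predefinedStageScriptRunRollback,
		Name: "SCRIPT_RUN_ROLLBACK",
		Desc: "Rollback the script run stage",
	},
}

// getPredefinedStage looks up a predefined stage by its exact ID.
func getPredefinedStage(id string) (pipelineStage, bool) {
	st, ok := predefinedStages[id]
	return st, ok
}

func main() {
	ids := []string{
		predefinedStageScriptRunRollback,        // original ID
		predefinedStageScriptRunRollback + "-1", // hypothetical suffixed ID
	}
	for _, id := range ids {
		_, found := getPredefinedStage(id)
		fmt.Printf("id=%s found=%t\n", id, found)
	}
}
```

Because the lookup is an exact map match, any ID that is not one of the constant keys is reported as not found, which corresponds to the "Unable to find the stage configuration" branch.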


ffjlabo commented Sep 6, 2024

Currently, SCRIPT_RUN_ROLLBACK is a predefined stage, and the pipeline is assumed to be able to contain more than one of them.
But we should change the spec so that only one SCRIPT_RUN_ROLLBACK stage is executed, for the reasons below.

  • When storing the stage log, the target stage is identified by its stage ID.
  • If multiple stages share the same ID, then once one of them completes, writing logs for the others fails, because a stage log cannot be updated after it is completed.
  • The config of a predefined stage is looked up by a unique value (its stage ID), so we can't modify the stage ID.
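One possible shape for a single SCRIPT_RUN_ROLLBACK stage is sketched below. This is only an illustration of the idea, not the actual fix: the `scriptRunStage` type and `collectRollbackScripts` helper are hypothetical, and the sketch assumes the single rollback stage would simply gather the `onRollback` scripts of the SCRIPT_RUN stages in pipeline order (which stages should really be rolled back is not decided in this thread).

```go
package main

import "fmt"

// scriptRunStage is a hypothetical, simplified view of a SCRIPT_RUN stage config.
type scriptRunStage struct {
	Run        string
	OnRollback string
}

// collectRollbackScripts returns the onRollback scripts of the given stages
// in pipeline order, skipping stages that define none.
func collectRollbackScripts(stages []scriptRunStage) []string {
	var scripts []string
	for _, st := range stages {
		if st.OnRollback == "" {
			continue
		}
		scripts = append(scripts, st.OnRollback)
	}
	return scripts
}

func main() {
	// Mirrors the three SCRIPT_RUN stages from the reproduction config (abbreviated).
	stages := []scriptRunStage{
		{Run: "sh script.sh", OnRollback: "echo rollback"},
		{Run: "sleep 10\nsh script.sh", OnRollback: "echo $SR_DEPLOYMENT_ID\nsh script.sh"},
		{Run: "sleep 10\nsh script.sh"},
	}

	// A single rollback stage would run these one after another under one stage ID.
	for i, script := range collectRollbackScripts(stages) {
		fmt.Printf("rollback script %d:\n%s\n", i+1, script)
	}
}
```

With one rollback stage owning all scripts, there is a single stage ID and a single stage log, so the completed-log conflict described in the list above does not arise in this sketch.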

@ffjlabo ffjlabo added this to ROADMAP Oct 4, 2024
@ffjlabo ffjlabo moved this to 📋 New in ROADMAP Oct 4, 2024
@ffjlabo ffjlabo self-assigned this Oct 23, 2024
@ffjlabo ffjlabo added this to v0.50.0 Oct 23, 2024