Production checklist v1 #691

Open: wants to merge 12 commits into base: master

@justinegeffen (Contributor) commented Jul 3, 2025

WIP first draft of production checklist to cover infrastructure, retry strategy, security, and Platform limitations.

Outstanding

  • IAM policies content (to be added to Platform docs and linked)
  • Platform limitations section expansion
  • Security recommendations

@netlify /platform-cloud/getting-started/production-checklist

netlify bot commented Jul 3, 2025

Deploy Preview for seqera-docs ready!

Name Link
🔨 Latest commit 91a145a
🔍 Latest deploy log https://app.netlify.com/projects/seqera-docs/deploys/68779d67823188000807c74c
😎 Deploy Preview https://deploy-preview-691--seqera-docs.netlify.app/platform-cloud/getting-started/production-checklist

@justinegeffen added the labels "1. Editor review" (Needs a language review) and "1. Dev/PM/SME" (Needs a review by a Dev/PM/SME) on Jul 11, 2025
@justinegeffen (Contributor, Author) commented:

@gavinelder, would really appreciate your input on the infra section. The goal is to keep iterating on this document and this is v1.

Stylistic and language changes

Signed-off-by: Justine Geffen <[email protected]>

Infrastructural requirements differ widely based on the workload you’re expecting.

To begin with, build a proof of concept using the recommendations below to establish a baseline of your capacity requirements. Once you’re ready to move to production, account for the increased workload you expect. Here are some starting points for estimating compute and database requirements.

Review comment:

Do we have any examples we can provide as a good template for getting started?


These autoscale for pipeline runs, but the sizing recommendation depends on the workload and can vary significantly with the number of pipelines and the number of concurrent processes you plan to run. Consult the [Azure autoscaling documentation](https://learn.microsoft.com/en-us/azure/azure-monitor/autoscale/autoscale-get-started) for information about scaling in Azure.

## Spot instance retry strategy

Review comment:

Would we provide an example here or link to one?

Reply from the PR author:

The tutorial we link to does cover the retry strategy but you make a good point. Maybe this would be better as a list with links to sections within the tutorial.
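
As one possible shape for the example requested above, here is a minimal Nextflow configuration sketch for retrying work interrupted on Spot instances. It assumes an AWS Batch compute environment; the exit-code range and the retry counts are illustrative assumptions, not values taken from this PR or the linked tutorial.

```groovy
// nextflow.config: illustrative values only, tune them for your workload.

process {
    // Retry tasks whose exit status suggests an interruption or a resource kill
    // (assumed range: 130-145 plus 104); otherwise let running tasks finish, then stop.
    errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' }

    // Upper bound on Nextflow-level retries per task.
    maxRetries = 2
}

aws {
    batch {
        // Number of times AWS Batch itself re-attempts a job after a Spot reclamation.
        maxSpotAttempts = 3
    }
}
```

The two settings work at different levels: `aws.batch.maxSpotAttempts` covers retries handled by AWS Batch when an instance is reclaimed, while `errorStrategy` and `maxRetries` control retries that Nextflow submits itself. The default for `maxSpotAttempts` depends on the Nextflow version in use, so setting it explicitly is the safer choice.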


## Seqera Platform limitations

When cancelling large runs, make sure to check that all jobs were killed in your compute environment (ZOMBIE JOBS). Large runs sometimes leak jobs because Nextflow is killed before it can cancel all of them, which can lead to significant cost overruns.

Review comment:

How can I do this as a user? How do I know what a zombie job is?

Labels: 1. Dev/PM/SME (Needs a review by a Dev/PM/SME), 1. Editor review (Needs a language review), draft/WIP