deploying to kubernetes guide #23978

Merged

Conversation

jamiedemaria
Contributor

@jamiedemaria jamiedemaria commented Aug 27, 2024

Summary & Motivation

Guide for how to deploy Dagster OSS to a Kubernetes cluster. The guide is based on the existing guide here and the notes I took when deploying to a k8s cluster a while ago. I tried to strip out anything from the existing guide that wasn't strictly necessary, but that leaves some content that might need to be moved into other guides, like:

  • adding secrets (used to connect to S3 in the existing guide)
  • setting up your jobs to have a kubernetes job per step

I made a new example project to deploy because the existing guide is op/job based and also requires things like an S3 account, which felt like unnecessary setup for this guide. It's a super simple project, but I figured that the content of the project isn't the main point.

Leaving my other thoughts/questions etc in PR comments

How I Tested These Changes

Changelog [New | Bug | Docs]

NOCHANGELOG



github-actions bot commented Aug 27, 2024

Deploy preview for dagster-docs-beta ready!

✅ Preview
https://dagster-docs-beta-c7o4jcjid-elementl.vercel.app
https://jamie-doc-369-write-deploying-oss-towith-kubernetes.dagster.dagster-docs.io

Built with commit 0fa09ef.
This pull request is being automatically deployed with vercel-action


github-actions bot commented Aug 27, 2024

Deploy preview for dagster-docs ready!

Preview available at https://dagster-docs-ehciadzx8-elementl.vercel.app
https://jamie-doc-369-write-deploying-oss-towith-kubernetes.dagster.dagster-docs.io


```shell
docker build . -t iris_analysis:1
```
This builds the Docker image from Step 2.1 and gives it the name `iris_analysis` and tag `1`. You can set custom values for both the name and the tag. We recommend that each time you rebuild your Docker image, you assign a new value for the tag to ensure that the correct image is used when running your code.
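
One way to follow the tagging recommendation is to derive a fresh tag for every build instead of bumping it by hand. The sketch below is an illustration only: the `image_ref` helper, the git-hash tag, and the timestamp fallback are assumptions made for this example and are not part of the guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the guide): compose "<name>:<tag>" for a build.
# With no explicit tag, use the short git commit hash, else a UTC timestamp.
image_ref() {
  local name="${1:-iris_analysis}"
  local tag="${2:-}"
  if [ -z "$tag" ]; then
    tag="$(git rev-parse --short HEAD 2>/dev/null || date -u +%Y%m%d%H%M%S)"
  fi
  printf '%s:%s\n' "$name" "$tag"
}

# Explicit tag, as in the guide:
image_ref iris_analysis 1

# Build with a freshly derived tag (requires Docker, so left commented out):
# docker build . -t "$(image_ref iris_analysis)"
```

Whichever scheme is used, the same tag then has to appear in the deployment's image settings in `values.yaml`.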
Contributor Author

for the tech reviewer: i'd like to spot check that this statement is correct

Contributor

paging dr @jmsanders

Contributor

@jmsanders jmsanders Aug 29, 2024

That is correct. I also wonder how much prior knowledge we want to assume here? Or how much we want to offload some of the pre-req stuff to third party docs (specifically the Docker and/or K8s docs)?

I think it's fair to assume if you're going to manage your own open source deployment of Dagster on K8s, you:

  • have basic familiarity with containers
  • have basic familiarity with k8s
  • or are comfortable reading up on those two things first

The `values.yaml` file contains configuration options you can set for your deployment. There are comments in the `values.yaml` file explaining these options, and you can learn more about them [here](/todo).

The minimal configuration options you need to set in order to deploy your project are the `deployments.name`, `deployments.image`, and `deployments.dagsterApiGrpcArgs` values. `deployments.name` should be a unique name for your deployment, and `deployments.image` should be set to match the Docker image you built and pushed in Step 2. `dagsterApiGrpcArgs` should be set to NEED HELP WITH HOW TO EXPLAIN THIS.
Contributor Author

I believe this is the minimal set of options you need to set (at least it worked for me locally)

I could also use some help explaining what to set the `dagsterApiGrpcArgs` values to. How should a user decide what to set these to? It looks to me like they should follow the workspace.yaml file, but I don't know if that's true in all cases.


:::note
If you are running an older version of Dagster, pass the --version flag to `helm upgrade` with the version of Dagster you are running. For example, if you are running `dagster==1.7.4` you'll run the command `helm upgrade --install dagster dagster/dagster -f /path/to/values.yaml --version 1.7.4`
Contributor Author

for the tech reviewer: I want to confirm that this is the right command to pass --version to. The existing guide is not very specific


## Next steps
- Forwarding Dagster logs from a Kubernetes deployment to AWS, Azure, GCP
Contributor Author

This is just me jotting down ideas of things we could link to. Other ideas:

  • guide on setting up your project to do k8s-job-per-step
  • k8s config in tags

@jamiedemaria jamiedemaria force-pushed the jamie/doc-369-write-deploying-oss-towith-kubernetes branch 2 times, most recently from 0eb03ea to 89c2ddc Compare August 27, 2024 19:59
@jamiedemaria jamiedemaria marked this pull request as ready for review August 27, 2024 20:00
@graphite-app graphite-app bot added the area: docs Related to documentation in general label Aug 27, 2024

graphite-app bot commented Aug 27, 2024

Graphite Automations

"docs-beta - Assign Reviewers" took an action on this PR • (08/27/24)

3 reviewers were added to this PR based on Pedram Navid's automation.

@jamiedemaria jamiedemaria requested a review from gibsondan August 27, 2024 20:18
@jamiedemaria jamiedemaria force-pushed the jamie/doc-369-write-deploying-oss-towith-kubernetes branch from 89c2ddc to d2d2a98 Compare August 28, 2024 17:31
@jamiedemaria jamiedemaria force-pushed the jamie/doc-369-write-deploying-oss-towith-kubernetes branch 2 times, most recently from c58b8c5 to 1f7047d Compare August 28, 2024 19:32
@jamiedemaria jamiedemaria force-pushed the jamie/doc-369-write-deploying-oss-towith-kubernetes branch from 57c2a34 to 0fa09ef Compare August 28, 2024 20:30
@jamiedemaria
Contributor Author

@gibsondan are you the right platform person to review this? anyone else i should loop in?

---

Deploying Dagster to a Kubernetes cluster is a popular option for maintaining a production Dagster deployment. Dagster uses Helm, a package manager for Kubernetes applications, to help manage deploying Dagster to a Kubernetes cluster.
Contributor

> a production Dagster deployment

I think we should position this doc (and the other OSS deployment docs) as a reference/example for how to run the Dagster-specific parts of a productionized Dagster deployment.

Productionizing it for real will probably require additional decisions around networking, access control, running persistent data stores, monitoring, etc. that are all out of scope for the Dagster project and have too large of a decision matrix for us to fully capture.

Contributor

I think I agree. The goal of this guide shouldn't be to teach you how to build production systems, but rather to show what the components of a Dagster production deployment are.

Contributor Author

ok i can reframe this intro. Do you think there are other systemic changes that need to be made to the doc to position it this way?

Contributor Author

Is a doc that discusses things like networking, access control, data stores, etc. something that we plan to do?

Member

@gibsondan gibsondan left a comment

This looks very nice, left a few comments!

- `deployments.name`, which should be a unique name for your deployment
- `deployments.image.name` and `deployments.image.tag`, which should be set to match the Docker image from Step 2
- `deployments.dagsterApiGrpcArgs`, which should be set to NEED HELP WITH HOW TO EXPLAIN THIS.
Member

https://docs.dagster.io/concepts/code-locations/workspace-files#running-your-own-grpc-server shows the arguments here. They are the arguments you would pass to the `dagster api grpc` command to spin up a code server. The Helm chart does this for you. It's unfortunate that a user has to think about the word "grpc" here.
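
To make that mapping concrete, the sketch below writes a minimal override file for the `iris_analysis:1` image, with `dagsterApiGrpcArgs` holding the arguments that would otherwise follow `dagster api grpc` on the command line. Several details are assumptions that should be checked against your copy of `values.yaml`: the `repository`/`tag`/`pullPolicy` key names under `image`, the `port` value, the deployment name, and the file path `/iris_analysis/definitions.py` are illustrative and not confirmed by this PR.

```shell
#!/usr/bin/env bash
# Write a minimal Helm override file. Key names mirror the chart's sample
# user-deployment entry as best understood; verify against your values.yaml.
write_values() {
  local out="${1:-values-override.yaml}"
  cat > "$out" <<'EOF'
dagster-user-deployments:
  deployments:
    - name: "iris-analysis"          # unique name for this deployment
      image:
        repository: "iris_analysis"  # image name from the docker build step
        tag: "1"                     # image tag from the docker build step
        pullPolicy: IfNotPresent     # assumption: image is already available to the cluster
      # Same arguments you would pass to `dagster api grpc`:
      dagsterApiGrpcArgs:
        - "--python-file"
        - "/iris_analysis/definitions.py"  # hypothetical path inside the image
      port: 3030                     # assumption: port taken from the chart's sample entry
EOF
}

write_values values-override.yaml
```

Under these assumptions, the list corresponds roughly to running `dagster api grpc --python-file /iris_analysis/definitions.py` inside the container. A project loaded as a module would use the module flag instead, mirroring whatever its `workspace.yaml` specifies.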

Contributor Author

@gibsondan what do you think of the description i ended up with?


:::note
If you are running an older version of Dagster, pass the `--version` flag to `helm upgrade` with the version of Dagster you are running. For example, if you are running `dagster==1.7.4` you'll run the command `helm upgrade --install dagster dagster/dagster -f /path/to/values.yaml --version 1.7.4`
Member

this is specifically the version of dagster that the system components will use like the webserver and the daemon (you can have your code running on earlier versions)

Contributor Author

do they need to pass a version flag when copying the values.yaml too then?
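
One plausible way to keep the two in sync, offered as a sketch rather than a confirmed answer from the reviewers, is to pin the same chart version both when fetching the default values and when installing. The `helm_commands` helper below is hypothetical and only prints the commands, so it can be tried without a cluster. It assumes the chart repository is already registered under the name `dagster` and that the chart version matches the Dagster version, as the note in the guide implies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Helm commands for a pinned chart version.
# Nothing is executed; copy the printed commands once they look right.
helm_commands() {
  local version="$1"                 # e.g. 1.7.4, matching dagster==1.7.4
  local values="${2:-values.yaml}"   # path to your values file
  # Fetch the default values that belong to that chart version.
  echo "helm show values dagster/dagster --version ${version} > ${values}"
  # Install or upgrade the release using the same pinned version.
  echo "helm upgrade --install dagster dagster/dagster -f ${values} --version ${version}"
}

helm_commands 1.7.4 /path/to/values.yaml
```

Per the comment above, the pinned version governs the system components (webserver and daemon); user code images can run an earlier Dagster version.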

@jamiedemaria jamiedemaria force-pushed the jamie/doc-369-write-deploying-oss-towith-kubernetes branch from 0fa09ef to e70536e Compare August 30, 2024 14:26
@jamiedemaria jamiedemaria requested a review from gibsondan August 30, 2024 15:42
@jamiedemaria jamiedemaria force-pushed the jamie/doc-369-write-deploying-oss-towith-kubernetes branch from e70536e to 19310e2 Compare August 30, 2024 18:39
@jamiedemaria jamiedemaria force-pushed the jamie/doc-369-write-deploying-oss-towith-kubernetes branch from 19310e2 to dff7acc Compare August 30, 2024 20:35
@jamiedemaria
Contributor Author

I have to head out for the day (and I'm out next week). I give anyone full license to make changes, merge this PR, etc. I'm happy w the current state and don't have any changes I plan to make

cc @erinkcochran87

@shalabhc
Contributor

I'm reviewing this and will approve as soon as I'm done.

Contributor

@shalabhc shalabhc left a comment

Looks good - approving.
I left one minor technical comment. Other comments are just around style and content so can be ignored in favor of the docs-team reviews.

## Step 1: Write and build a Docker image containing your Dagster project
### Step 1.1: Write a Dockerfile
Next, you'll build a Docker image that contains your Dagster project and all of its dependencies. The Dockerfile should:
Contributor

I think we should mention where exactly to put this Dockerfile relative to this project.

Contributor Author

What is the recommendation? I put mine at the top level, but I don't know if that's best practice.

Contributor

Yes, also must be top level for the COPY command to work.

Before you can deploy Dagster, you need to configure `kubectl` to develop against the Kubernetes cluster where you want Dagster to be deployed.

If you are using Docker Desktop and the included Kubernetes server, you will need to create a context first. If you already have a Kubernetes cluster and context created for your Dagster deployment you can skip running this command.
Contributor

I feel we can assume the user knows the basics of Kubernetes here, so we don't need special instructions for Docker Desktop. We should just have them run use-context below.
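
For readers who want the context switch spelled out, a minimal sketch follows. The context name `docker-desktop` is only an example value and the `CONTEXT` variable is introduced here for illustration; substitute the context of the cluster where Dagster should be deployed.

```shell
#!/usr/bin/env bash
# Point kubectl at the cluster that should host Dagster.
# "docker-desktop" is only an example context name; substitute your own.
CONTEXT="${CONTEXT:-docker-desktop}"

if command -v kubectl >/dev/null 2>&1; then
  kubectl config get-contexts             # list the contexts kubectl knows about
  kubectl config use-context "$CONTEXT"   # select the target cluster
  kubectl config current-context          # confirm which context is now active
else
  echo "kubectl not found; install it before continuing" >&2
fi
```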

To deploy your project, you'll need to set the following options:
- `dagster-user-deployments.deployments.name`, which should be a unique name for your deployment
- `dagster-user-deployments.deployments.image.name` and `dagster-user-deployments.deployments.image.tag`, which should be set to match the Docker image from Step 1
Contributor

Maybe include the literal iris_analysis:1?



WORKDIR /iris_analysis/

Contributor

Technical comment (minor):
I think we should just have users install their project directly using

pip install ./iris_analysis

instead of installing the dependencies dagster and pandas explicitly. The advantage is that as they add more dependencies, they don't need to modify their Dockerfile.
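
A Dockerfile along those lines might look like the sketch below, written to a scratch file so it does not overwrite an existing Dockerfile. It is an assumption-laden illustration, not the Dockerfile from this PR: the base image tag is arbitrary, the project is assumed to declare its dependencies (such as dagster and pandas) in a `setup.py` or `pyproject.toml` at the top level, and `pip install .` run from the `WORKDIR` stands in for the suggested `pip install ./iris_analysis`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write an example Dockerfile that installs the project
# itself, so new dependencies need no Dockerfile changes.
write_dockerfile() {
  cat > "${1:-Dockerfile.example}" <<'EOF'
# Assumption: any recent official Python base image would work here.
FROM python:3.11-slim
# Copy the project from the build context (Dockerfile sits at the top level).
COPY . /iris_analysis/
WORKDIR /iris_analysis/
# Install the project along with the dependencies it declares.
RUN pip install .
EOF
}

write_dockerfile Dockerfile.example
```

Because `COPY` reads from the build context, this layout relies on the Dockerfile living at the top level of the project, as noted earlier in the review.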

Contributor

@PedramNavid PedramNavid left a comment

beefy

@PedramNavid PedramNavid merged commit 4841b9e into master Aug 30, 2024
1 of 4 checks passed
@PedramNavid PedramNavid deleted the jamie/doc-369-write-deploying-oss-towith-kubernetes branch August 30, 2024 22:08
Labels
area: docs Related to documentation in general docathon

7 participants