deploying to kubernetes guide #23978

jamiedemaria · 2024-08-27T19:30:54Z

Summary & Motivation

Guide for how to deploy Dagster OSS to a Kubernetes cluster. The guide is based on the existing guide here and the notes I took when deploying to a k8s cluster a while ago. I tried to strip out anything from the existing guide that wasn't strictly necessary, but that leaves some content that might need to go in other guides, like:

adding secrets (used to connect to S3 in the existing guide)
setting up your jobs to have a kubernetes job per step

I made a new example project to deploy because the existing guide is op/job-based and also requires things like an S3 account, which felt like unnecessary setup for this guide. It's a super simple project, but I figured that the content of the project isn't the main point.

Leaving my other thoughts/questions etc in PR comments

How I Tested These Changes

Changelog [New | Bug | Docs]

NOCHANGELOG

jamiedemaria · 2024-08-27T19:31:11Z

deploying to kubernetes guide #23978 👈
master

This stack of pull requests is managed by Graphite. Learn more about stacking.


github-actions · 2024-08-27T19:33:13Z

Deploy preview for dagster-docs-beta ready!

✅ Preview
https://dagster-docs-beta-c7o4jcjid-elementl.vercel.app
https://jamie-doc-369-write-deploying-oss-towith-kubernetes.dagster.dagster-docs.io

Built with commit 0fa09ef.
This pull request is being automatically deployed with vercel-action

github-actions · 2024-08-27T19:35:31Z

Deploy preview for dagster-docs ready!

Preview available at https://dagster-docs-ehciadzx8-elementl.vercel.app
https://jamie-doc-369-write-deploying-oss-towith-kubernetes.dagster.dagster-docs.io

Direct link to changed pages:

docs/docs-beta/docs/guides/deployment/kubernetes.md

jamiedemaria · 2024-08-27T19:47:16Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+docker build . -t iris_analysis:1
+```
+This builds the Docker image from Step 2.1 and gives it the name `iris_analysis` and tag `1`. You can set custom values for both the name and the tag. We recommend that each time you rebuild your Docker image, you assign a new value for the tag to ensure that the correct image is used when running your code.
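As a sketch of the tagging advice in the snippet above: the tag below is derived from a UTC timestamp so that every rebuild gets a new value. The timestamp scheme and the variable names are assumptions for illustration, not something the guide prescribes, and the command is printed rather than executed so the sketch is safe to run without Docker.

```shell
# Image name taken from the guide; the tag scheme is a hypothetical choice.
IMAGE_NAME="iris_analysis"

# A UTC timestamp gives each rebuild a distinct tag.
IMAGE_TAG="$(date -u +%Y%m%d%H%M%S)"

# Assemble the build command; it is printed here, not run.
BUILD_CMD="docker build . -t ${IMAGE_NAME}:${IMAGE_TAG}"
echo "$BUILD_CMD"
```

Any scheme that changes on every build (a counter, a git commit hash) serves the same purpose; the point is only that the tag referenced in `values.yaml` identifies one specific build.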


for the tech reviewer: i'd like to spot check that this statement is correct

paging dr @jmsanders

That is correct. I also wonder how much prior knowledge we want to assume here? Or how much we want to offload some of the pre-req stuff to third party docs (specifically the Docker and/or K8s docs)?

I think it's fair to assume if you're going to manage your own open source deployment of Dagster on K8s, you:

have basic familiarity with containers

have basic familiarity with k8s

or are comfortable reading up on those two things first

jamiedemaria · 2024-08-27T19:51:26Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+The `values.yaml` file contains configuration options you can set for your deployment. There are comments in the `values.yaml` file explaining these options, and you can learn more about them [here](/todo).
+
+The minimal configuration options you need to set in order to deploy your project are the `deployments.name`, `deployments.image`, and `deployments.dagsterApiGrpcArgs` values. `deployments.name` should be a unique name for your deployment, and `deployments.image` should be set to match the Docker image you built and pushed in Step 2. `dagsterApiGrpcArgs` should be set to NEED HELP WITH HOW TO EXPLAIN THIS.


I believe this is the minimal set of options you need to set (at least it worked for me locally)

Also could use some help with explaining what to set the `dagsterApiGrpcArgs` values to. Like, how should a user decide what to set these as? It looks to me like it should follow the workspace.yaml file, but I don't know if that's true in all cases.

jamiedemaria · 2024-08-27T19:54:00Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+
+:::note
+If you are running an older version of Dagster, pass the --version flag to `helm upgrade` with the version of Dagster you are running. For example, if you are running `dagster==1.7.4` you'll run the command `helm upgrade --install dagster dagster/dagster -f /path/to/values.yaml --version 1.7.4`


for the tech reviewer: I want to confirm that this is the right command to pass --version to. The existing guide is not very specific

jamiedemaria · 2024-08-27T19:56:41Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+
+## Next steps
+- Forwarding Dagster logs from a Kubernetes deployment to AWS, Azure, GCP


This is just me jotting down ideas for things we could link to. Other ideas:

guide on setting up your project to do k8s-job-per-step

k8s config in tags

graphite-app · 2024-08-27T20:05:29Z

Graphite Automations

"docs-beta - Assign Reviewers" took an action on this PR • (08/27/24)

3 reviewers were added to this PR based on Pedram Navid's automation.

jamiedemaria · 2024-08-29T16:22:02Z

@gibsondan are you the right platform person to review this? anyone else i should loop in?

jmsanders · 2024-08-29T21:33:08Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+---
+
+Deploying Dagster to a Kubernetes cluster is a popular option for maintaining a production Dagster deployment. Dagster uses Helm, a package manager for Kubernetes applications, to help manage deploying Dagster to a Kubernetes cluster.


a production Dagster deployment

I think we should position this (and the other OSS deployment docs) as reference/examples for how to run the Dagster-specific parts of a productionized Dagster deployment.

Productionizing it for real will probably require additional decisions around networking, access control, running persistent data stores, monitoring, etc. that are all out of scope for the Dagster project and have too large of a decision matrix for us to fully capture.

I think I agree. The goal of this guide shouldn't be to teach you how to build production systems, but rather to show what the components of a Dagster production deployment are.

ok i can reframe this intro. Do you think there are other systemic changes that need to be made to the doc to position it this way?

Is a doc that discusses things like networking, access control, data stores, etc. something that we plan to do?

gibsondan

This looks very nice, left a few comments!

gibsondan · 2024-08-29T22:24:15Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+- `deployments.name`, which should be a unique name for your deployment
+- `deployments.image.name` and `deployments.image.tag`, which should be set to match the Docker image from Step 2
+- `deployments.dagsterApiGrpcArgs`, which should be set to NEED HELP WITH HOW TO EXPLAIN THIS.


https://docs.dagster.io/concepts/code-locations/workspace-files#running-your-own-grpc-server shows the arguments here. They're the arguments you would pass to the `dagster api grpc` command to spin up a code server. The Helm chart does this for you. It's unfortunate that a user has to think about the word "grpc" here.
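To make the explanation above concrete, here is a minimal sketch of a `values.yaml` fragment, written to a scratch file so it can be inspected. The deployment name, the file path `/iris_analysis/definitions.py`, and the port are assumptions for illustration. The key names follow the chart's default `values.yaml` as best understood here, which uses `image.repository` where the draft text says `image.name`; check the chart you are installing before relying on either.

```shell
# Scratch location for the hypothetical fragment.
VALUES_FILE="${TMPDIR:-/tmp}/values-fragment.yaml"

cat > "$VALUES_FILE" <<'EOF'
dagster-user-deployments:
  deployments:
    - name: "iris-analysis"
      image:
        repository: "iris_analysis"
        tag: "1"
      # The same arguments you would pass to `dagster api grpc`,
      # one list item per token. The file path is an assumption.
      dagsterApiGrpcArgs:
        - "--python-file"
        - "/iris_analysis/definitions.py"
      port: 3030
EOF

# Show the fragment that was written.
cat "$VALUES_FILE"
```

If the project is loaded as a module rather than a file, the list would carry `--module-name` and the module path instead of `--python-file`, mirroring whatever the project's workspace file uses.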

@gibsondan what do you think of the description i ended up with?

gibsondan · 2024-08-29T22:25:38Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+
+:::note
+If you are running an older version of Dagster, pass the `--version` flag to `helm upgrade` with the version of Dagster you are running. For example, if you are running `dagster==1.7.4` you'll run the command `helm upgrade --install dagster dagster/dagster -f /path/to/values.yaml --version 1.7.4`


This is specifically the version of Dagster that the system components, like the webserver and the daemon, will use (you can have your code running on earlier versions).

do they need to pass a version flag when copying the values.yaml too then?
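One possible reading of the question above, sketched under assumptions: if the chart is pinned when installing, the default values could be fetched with the same pin so that the file and the chart come from the same release. This assumes the `dagster/dagster` chart repository has already been added to Helm. Whether the pin is actually required on `helm show values` is not settled in this thread, so treat this as an illustration only. The commands are printed rather than executed.

```shell
# Hypothetical pin; per the thread it governs the system components
# (webserver, daemon), not necessarily the user code image.
DAGSTER_VERSION="1.7.4"

# Fetch the chart's default values for the pinned version.
SHOW_CMD="helm show values dagster/dagster --version ${DAGSTER_VERSION}"

# Install or upgrade the release with the same pin.
UPGRADE_CMD="helm upgrade --install dagster dagster/dagster -f /path/to/values.yaml --version ${DAGSTER_VERSION}"

echo "${SHOW_CMD} > /path/to/values.yaml"
echo "$UPGRADE_CMD"
```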

examples/deploy_k8s_beta/workspace.yaml

jamiedemaria · 2024-08-30T20:38:58Z

I have to head out for the day (and I'm out next week). I give anyone full license to make changes, merge this PR, etc. I'm happy w the current state and don't have any changes I plan to make

cc @erinkcochran87

shalabhc · 2024-08-30T20:47:55Z

I'm reviewing this and will approve as soon as I'm done.

shalabhc

Looks good - approving.
I left one minor technical comment. Other comments are just around style and content so can be ignored in favor of the docs-team reviews.

shalabhc · 2024-08-30T21:25:37Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+## Step 1: Write and build a Docker image containing your Dagster project
+### Step 1.1: Write a Dockerfile
+Next, you'll build a Docker image that contains your Dagster project and all of its dependencies. The Dockerfile should:


I think we should mention where exactly to put this Dockerfile relative to this project.

What is the recommendation? I put mine at the top level, but I don't know if that's best practice.

Yes, it also must be at the top level for the `COPY` command to work.

shalabhc · 2024-08-30T21:29:24Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+Before you can deploy Dagster, you need to configure `kubectl` to develop against the Kubernetes cluster where you want Dagster to be deployed.
+
+If you are using Docker Desktop and the included Kubernetes server, you will need to create a context first. If you already have a Kubernetes cluster and context created for your Dagster deployment you can skip running this command.


I feel we can assume the user knows the basics of Kubernetes here, so we don't need special instructions for Docker Desktop. We should just have them run `use-context` below.
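For reference, the context switch being discussed would look roughly like this. The context name is a placeholder, not one from the guide; `kubectl config get-contexts` lists the real ones. The command is printed rather than executed.

```shell
# Hypothetical context name; substitute one listed by `kubectl config get-contexts`.
KUBE_CONTEXT="my-dagster-cluster"

# Point kubectl at the cluster where Dagster will be deployed.
USE_CONTEXT_CMD="kubectl config use-context ${KUBE_CONTEXT}"
echo "$USE_CONTEXT_CMD"
```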

shalabhc · 2024-08-30T21:31:32Z

docs/docs-beta/docs/guides/deployment/kubernetes.md

+To deploy your project, you'll need to set the following options:
+- `dagster-user-deployments.deployments.name`, which should be a unique name for your deployment
+- `dagster-user-deployments.deployments.image.name` and `dagster-user-deployments.deployments.image.tag`, which should be set to match the Docker image from Step 1


Maybe include the literal `iris_analysis:1`?

shalabhc · 2024-08-30T21:40:33Z

examples/docs_beta_snippets/docs_beta_snippets/guides/deployment/kubernetes/Dockerfile

+
+
+WORKDIR /iris_analysis/
+


Technical comment (minor):
I think we should just have users install their project directly using `pip install ./iris_analysis` instead of installing the dependencies `dagster` and `pandas` explicitly. The advantage is that as they add more dependencies, they don't need to modify their Dockerfile.
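A rough sketch of what that suggestion could look like, written to a scratch file for inspection. The base image, the project layout (an `iris_analysis` directory at the top level of the build context, next to the Dockerfile), and the presence of a `setup.py` or `pyproject.toml` declaring `dagster` and `pandas` are all assumptions; the Dockerfile in the PR may differ.

```shell
# Scratch location for the hypothetical Dockerfile.
DOCKERFILE_PATH="${TMPDIR:-/tmp}/Dockerfile.sketch"

cat > "$DOCKERFILE_PATH" <<'EOF'
# Base image is an assumption; use any Python version the project supports.
FROM python:3.11-slim

# Copy the project directory from the top level of the build context.
COPY iris_analysis /iris_analysis/

# Install the project together with the dependencies it declares,
# rather than listing dagster and pandas here.
RUN pip install ./iris_analysis

WORKDIR /iris_analysis/
EOF

# Show the Dockerfile that was written.
cat "$DOCKERFILE_PATH"
```

With this layout, adding a dependency means editing only the project's own package metadata; the Dockerfile stays unchanged.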

PedramNavid

beefy
