deploying to kubernetes guide #23978
```
docker build . -t iris_analysis:1
```
This builds the Docker image from Step 2.1 and gives it the name `iris_analysis` and tag `1`. You can set custom values for both the name and the tag. We recommend that each time you rebuild your Docker image, you assign a new value for the tag to ensure that the correct image is used when running your code.
for the tech reviewer: i'd like to spot check that this statement is correct
paging dr @jmsanders
That is correct. I also wonder how much prior knowledge we want to assume here? Or how much we want to offload some of the pre-req stuff to third party docs (specifically the Docker and/or K8s docs)?
I think it's fair to assume if you're going to manage your own open source deployment of Dagster on K8s, you:
- have basic familiarity with containers
- have basic familiarity with k8s
- or are comfortable reading up on those two things first
The `values.yaml` file contains configuration options you can set for your deployment. There are comments in the `values.yaml` file explaining these options, and you can learn more about them [here](/todo).

The minimal configuration options you need to set in order to deploy your project are the `deployments.name`, `deployments.image`, and `deployments.dagsterApiGrpcArgs` values. `deployments.name` should be a unique name for your deployment, and `deployments.image` should be set to match the Docker image you built and pushed in Step 2. `dagsterApiGrpcArgs` should be set to NEED HELP WITH HOW TO EXPLAIN THIS.
I believe this is the minimal set of options you need to set (at least it worked for me locally).
I could also use some help explaining what to set the `dagsterApiGrpcArgs` values to. How should a user decide what to set these to? It looks to me like it should follow the `workspace.yaml` file, but I don't know if that's true in all cases.

:::note
If you are running an older version of Dagster, pass the --version flag to `helm upgrade` with the version of Dagster you are running. For example, if you are running `dagster==1.7.4` you'll run the command `helm upgrade --install dagster dagster/dagster -f /path/to/values.yaml --version 1.7.4`
for the tech reviewer: I want to confirm that this is the right command to pass `--version` to. The existing guide is not very specific.

## Next steps
- Forwarding Dagster logs from a Kubernetes deployment to AWS, Azure, GCP
this is just me jotting down ideas of things we could link to. Other ideas
- guide on setting up your project to do k8s-job-per-step
- k8s config in tags
Graphite Automations: "docs-beta - Assign Reviewers" took an action on this PR (08/27/24). 3 reviewers were added to this PR based on Pedram Navid's automation.
@gibsondan are you the right platform person to review this? anyone else i should loop in?
---

Deploying Dagster to a Kubernetes cluster is a popular option for maintaining a production Dagster deployment. Dagster uses Helm, a package manager for Kubernetes applications, to help manage deploying Dagster to a Kubernetes cluster.
> a production Dagster deployment
I think we should position this (and other OSS deployment) docs as reference/examples for how to run the Dagster-specific parts of a productionized Dagster deployment.
Productionizing it for real will probably require additional decisions around networking, access control, running persistent data stores, monitoring, etc. that are all out of scope for the Dagster project and have too large of a decision matrix for us to fully capture.
I think I agree. The goal of this guide shouldn't be to teach you how to build production systems, but rather to show what the components of a Dagster production deployment are.
ok i can reframe this intro. Do you think there are other systemic changes that need to be made to the doc to position it this way?
Is a doc that discusses things like networking, access control, data stores, etc. something that we plan to do?
This looks very nice, left a few comments!
- `deployments.name`, which should be a unique name for your deployment
- `deployments.image.name` and `deployments.image.tag`, which should be set to match the Docker image from Step 2
- `deployments.dagsterApiGrpcArgs`, which should be set to NEED HELP WITH HOW TO EXPLAIN THIS.
https://docs.dagster.io/concepts/code-locations/workspace-files#running-your-own-grpc-server shows the arguments here - it's the arguments you would pass into the `dagster api grpc` command to spin up a code server. The helm chart does this for you. It's unfortunate that a user has to think about the word "grpc" here.
@gibsondan what do you think of the description i ended up with?
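For illustration, here is a minimal sketch of how the `dagster api grpc` arguments discussed in this thread might be written in `values.yaml`. The deployment name, the file path, and the module name are hypothetical and assume the `iris_analysis` example project; other keys the chart may require for a deployment (such as the image and port) are left out, so treat this as a fragment rather than a complete configuration.

```yaml
dagster-user-deployments:
  deployments:
    - name: "iris-analysis"   # hypothetical deployment name
      # Each list item is one token of the command line
      #   dagster api grpc --python-file /iris_analysis/definitions.py
      dagsterApiGrpcArgs:
        - "--python-file"
        - "/iris_analysis/definitions.py"   # hypothetical path inside the image
      # Alternative, assuming the project is installed as a package in the image:
      # dagsterApiGrpcArgs:
      #   - "--module-name"
      #   - "iris_analysis"
```

Framing the option as "the arguments that tell the code server where your definitions live" may spare readers from having to reason about gRPC itself.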

:::note
If you are running an older version of Dagster, pass the `--version` flag to `helm upgrade` with the version of Dagster you are running. For example, if you are running `dagster==1.7.4` you'll run the command `helm upgrade --install dagster dagster/dagster -f /path/to/values.yaml --version 1.7.4`
This is specifically the version of Dagster that the system components, like the webserver and the daemon, will use (you can have your code running on earlier versions).
do they need to pass a version flag when copying the `values.yaml` too then?
I have to head out for the day (and I'm out next week). I give anyone full license to make changes, merge this PR, etc. I'm happy w the current state and don't have any changes I plan to make
I'm reviewing this and will approve as soon as I'm done.
Looks good - approving.
I left one minor technical comment. Other comments are just around style and content so can be ignored in favor of the docs-team reviews.
## Step 1: Write and build a Docker image containing your Dagster project
### Step 1.1: Write a Dockerfile
Next, you'll build a Docker image that contains your Dagster project and all of its dependencies. The Dockerfile should:
I think we should mention where exactly to put this Dockerfile relative to this project.
What is the recommendation? I put mine at the top level, but I don't know if that's best practice.
Yes, and it must be at the top level for the `COPY` command to work.
Before you can deploy Dagster, you need to configure `kubectl` to develop against the Kubernetes cluster where you want Dagster to be deployed.

If you are using Docker Desktop and the included Kubernetes server, you will need to create a context first. If you already have a Kubernetes cluster and context created for your Dagster deployment you can skip running this command.
I feel we can assume the user knows the basics of Kubernetes here, so we don't need special instructions for Docker Desktop. We should just have them run `use-context` below.
To deploy your project, you'll need to set the following options:
- `dagster-user-deployments.deployments.name`, which should be a unique name for your deployment
- `dagster-user-deployments.deployments.image.name` and `dagster-user-deployments.deployments.image.tag`, which should be set to match the Docker image from Step 1
Maybe include the literal `iris_analysis:1`?
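As a sketch of what spelling out the literal could look like, the fragment below maps the image `iris_analysis:1` from Step 1 onto the image options. The deployment name and the `pullPolicy` value are assumptions for illustration. Note also that the chart's default `values.yaml`, as far as I know, names the image key `image.repository` rather than `image.name`, which is worth checking against the chart version being installed.

```yaml
dagster-user-deployments:
  deployments:
    - name: "iris-analysis"          # hypothetical; any unique name
      image:
        repository: "iris_analysis"  # the name given to `docker build -t`
        tag: "1"                     # the tag given to `docker build -t`, quoted to stay a string
        pullPolicy: IfNotPresent     # assumption: the image is already available to the cluster
```

Showing the name and tag side by side with the `docker build . -t iris_analysis:1` command makes it clearer which half of `name:tag` goes into which option.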

WORKDIR /iris_analysis/

Technical comment (minor):
I think we should just have users install their project directly using `pip install ./iris_analysis` instead of installing the dependencies `dagster` and `pandas` explicitly. The advantage is that as they add more dependencies, they don't need to modify their Dockerfile.
beefy
Summary & Motivation
Guide for how to deploy Dagster OSS to a Kubernetes cluster. The guide is based on the existing guide here and my own notes I took when deploying to a k8s cluster a while ago. I tried to strip out anything from the existing guide that wasn't strictly necessary, but that leaves some content that might need to be put in other guides like:
I made a new example project to deploy because the existing guide is op/job-based and also requires things like an S3 account, which felt like unnecessary setup for this guide. It's a super simple project, but I figured that the content of the project isn't the main point.
Leaving my other thoughts/questions etc in PR comments
How I Tested These Changes
Changelog [New | Bug | Docs]
NOCHANGELOG