DOC-622 clean up Hybrid docs and add architecture diagram #26549

Merged (9 commits) on Dec 18, 2024
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: Amazon ECS agent
sidebar_position: 50
sidebar_position: 30
---

import DocCardList from '@theme/DocCardList';
@@ -5,25 +5,13 @@ sidebar_position: 10

The Hybrid architecture is the most flexible and secure way to deploy Dagster+. It allows you to run your user code in your environment while leveraging Dagster+'s infrastructure for orchestration and metadata management

<details>
<summary>Pre-requisites</summary>

Before you begin, you should have:

- A [Dagster+ account](/dagster-plus/getting-started)
- [Basic familiarity with Dagster](/getting-started/quickstart)

</details>

---

## Hybrid architecture overview

A **hybrid deployment** utilizes a combination of your infrastructure and Dagster-hosted backend services.

The Dagster backend services - including the web frontend, GraphQL API, metadata database, and daemons (responsible for executing schedules and sensors) - are hosted in Dagster+. You are responsible for running an [agent](/todo) in your environment.
The Dagster backend services - including the web frontend, GraphQL API, metadata database, and daemons (responsible for executing schedules and sensors) - are hosted in Dagster+. You are responsible for running an [agent](index.md#dagster-hybrid-agents) in your environment.

![Dagster+ Hybrid deployment architecture](/img/placeholder.svg)
![Dagster+ Hybrid deployment architecture](/images/dagster-cloud/deployment/hybrid-architecture.png)

Work is enqueued for your agent when:

@@ -35,27 +23,31 @@ The agent polls the agent API to see if any work needs to be done and launches u

All user code runs within your environment, in isolation from Dagster system code.
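
As a rough illustration of the polling model described here, the sketch below shows a long-lived loop that asks an agent API for queued work and launches it locally. It is a minimal sketch, not the Dagster+ agent's implementation: `FakeAgentAPI`, `WorkItem`, `run_agent`, and the `"launch_run"` kind are hypothetical names, and the in-memory queue stands in for the real network API.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class WorkItem:
    """Hypothetical unit of work enqueued by the control plane."""
    kind: str      # illustrative, e.g. "launch_run"
    payload: dict


@dataclass
class FakeAgentAPI:
    """In-memory stand-in for the agent API (assumption, not the real API)."""
    queue: Deque[WorkItem] = field(default_factory=deque)

    def enqueue(self, item: WorkItem) -> None:
        self.queue.append(item)

    def poll(self) -> List[WorkItem]:
        # Return everything currently queued and clear the queue.
        items = list(self.queue)
        self.queue.clear()
        return items


def run_agent(
    api: FakeAgentAPI,
    launch: Callable[[WorkItem], None],
    max_polls: int,
    interval_seconds: float = 0.0,
) -> int:
    """Poll for work `max_polls` times, launching each item in this process.

    The agent only makes outbound calls (api.poll); nothing calls into it.
    Returns the number of work items handled.
    """
    handled = 0
    for _ in range(max_polls):
        for item in api.poll():
            launch(item)  # user code would start here, in your environment
            handled += 1
        time.sleep(interval_seconds)
    return handled
```

A real agent would loop indefinitely rather than for a fixed `max_polls`; the bound is here only so the sketch terminates when run.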

---

## The agent

Because the agent communicates with the Dagster+ control plane over the agent API, it's possible to support agents that operate in arbitrary compute environments.

This means that over time, Dagster+'s support for different user deployment environments will expand and custom agents can take advantage of bespoke compute environments such as HPC.

Refer to the [Agents documentation](/todo) for more info, including the agents that are currently supported.

---
See the [setup page](index.md#dagster-hybrid-agents) for a list of agents that are currently supported.

## Security

This section describes how Dagster+ interacts with user code. To summarize:
Dagster+ Hybrid relies on a shared security model.

The Dagster+ control plane is SOC 2 Type II certified and follows best practices such as:
- encrypting data at rest (AES 256) and in transit (TLS 1.2+)
- highly available, with disaster recovery and backup strategies
- only manages metadata such as pipeline names, execution status, and run duration

The execution environment is managed by the customer:
- Dagster+ doesn't have access to user code—your code never leaves your environment. Metadata about the code is fetched over constrained APIs.
- All connections to databases, file systems, and other resources are made from your environment.
- The execution environment only requires egress access to Dagster+. No ingress is required from Dagster+ to user environments.

- No ingress is required from Dagster+ to user environments
- Dagster+ doesn't have access to user code. Metadata about the code is fetched over constrained APIs.
- The Dagster+ agent is [open source and auditable](https://github.com/dagster-io/dagster-cloud)
Additionally, the Dagster+ agent is [open source and auditable](https://github.com/dagster-io/dagster-cloud)

These highlights are described in more detail below:
The following highlights are described in more detail below:

- [Interactions and queries](#interactions-and-queries)
- [Runs](#runs)
@@ -1,6 +1,6 @@
---
title: Docker agent
sidebar_position: 30
sidebar_position: 40
---

import DocCardList from '@theme/DocCardList';
@@ -6,25 +6,26 @@ sidebar_position: 20

In a Dagster+ Hybrid deployment, the orchestration control plane is run by Dagster+ while your Dagster code is executed within your environment.

[comment]: <> (TODO: Architecture diagram)
:::note
For an overview of the Hybrid design, including security considerations, see [Dagster+ Hybrid architecture](architecture.md).
:::

## Get started

To get started with a Hybrid deployment you'll need to:
To get started with a Hybrid deployment, you'll need to:

1. Create a [Dagster+ organization](https://dagster.cloud/signup)
2. Install a Dagster+ Hybrid Agent
2. [Install a Dagster+ Hybrid agent](#dagster-hybrid-agents)
3. [Add a code location](/dagster-plus/deployment/code-locations), typically using a Git repository and CI/CD

## Dagster+ Hybrid agents

The Dagster+ agent is a long-lived process that polls Dagster+'s API servers for new work.
The Dagster+ agent is a long-lived process that polls Dagster+'s API servers for new work. Currently supported agents include:

See the following guides for setting up an agent:
- [Kubernetes](/dagster-plus/deployment/deployment-types/hybrid/kubernetes)
- [AWS ECS](/dagster-plus/deployment/deployment-types/hybrid/amazon-ecs/new-vpc)
- [Docker](/dagster-plus/deployment/deployment-types/hybrid/docker)
- [Locally](/dagster-plus/deployment/deployment-types/hybrid/local)
- [Local agent](/dagster-plus/deployment/deployment-types/hybrid/local)


## What you'll see in your environment
@@ -44,20 +45,10 @@ When a run needs to be launched, Dagster+ enqueues instructions for your agent t

Your agent will send Dagster+ metadata letting us know the run has been launched. Your run's container will also send Dagster+ metadata informing us of how the run is progressing. The Dagster+ backend services will monitor this stream of metadata to make additional orchestration decisions, monitor for failure, or send alerts.
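
To make the metadata stream concrete, here is a minimal sketch of how a backend could reduce a sequence of run events to each run's latest status and pick out runs that warrant an alert. The `RunEvent` type, the status strings, and both helper functions are assumptions made for illustration; they are not Dagster+ types or APIs.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RunEvent:
    """Hypothetical metadata event reported by the agent or a run container."""
    run_id: str
    status: str  # illustrative values: "LAUNCHED", "STARTED", "SUCCESS", "FAILURE"


def latest_status(events: Iterable[RunEvent]) -> Dict[str, str]:
    """Fold an ordered event stream into the most recent status per run."""
    status: Dict[str, str] = {}
    for event in events:
        status[event.run_id] = event.status  # later events overwrite earlier ones
    return status


def runs_needing_alert(events: Iterable[RunEvent]) -> List[str]:
    """Return the sorted IDs of runs whose latest reported status is a failure."""
    return sorted(
        run_id
        for run_id, status in latest_status(events).items()
        if status == "FAILURE"
    )
```

Only event metadata flows through this sketch; the code that produced the events stays in the execution environment.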

## Security
## Best practices

Dagster+ hybrid relies on a shared security model.
### Security

The Dagster+ control plane is SOC 2 Type II certified and follows best practices such as:
- encrypting data at rest (AES 256) and in transit (TLS 1.2+)
- highly available, with disaster recovery and backup strategies
- only manages metadata such as pipeline names, execution status, and run duration

The execution environment is managed by the customer:
- your code never leaves your environment
- all connections to databases, file systems, and other resources are made from your environment
- the execution environment only requires egress access to Dagster+

Common security considerations in Dagster+ hybrid include:
- [disabling log forwarding](/todo)
- [managing tokens](/todo)
You can do the following to make your Dagster+ Hybrid deployment more secure:
- [Disable log forwarding](/dagster-plus/deployment/management/settings/customizing-agent-settings#disabling-compute-logs)
- [Manage tokens](/dagster-plus/deployment/management/tokens/agent-tokens)
@@ -1,6 +1,6 @@
---
title: Kubernetes agent
sidebar_position: 40
sidebar_position: 20
---

import DocCardList from '@theme/DocCardList';
@@ -1,6 +1,6 @@
---
title: Running a local agent
sidebar_position: 20
sidebar_position: 50
sidebar_label: Local agent
---

@@ -4,4 +4,8 @@ sidebar_position: 80
unlisted: true
---

{/* TODO move from https://docs.dagster.io/dagster-plus/deployment/agents/customizing-configuration */}
{/* TODO move from https://docs.dagster.io/dagster-plus/deployment/agents/customizing-configuration */}

## Disabling compute logs

{/* NOTE this is a placeholder section so the Hybrid deployment index page has somewhere to link to */}