DOC 600 ecs agent docs migration #26676

Merged
merged 5 commits into from
Dec 25, 2024
@@ -1,7 +1,260 @@
---
title: Configuration reference
sidebar_position: 400
unlisted: true
---

{/* TODO copy from https://docs.dagster.io/dagster-plus/deployment/agents/amazon-ecs/configuration-reference */}
:::note
This guide is applicable to Dagster+.
:::

This reference describes the various configuration options Dagster+ currently supports for [Amazon ECS agents](/dagster-plus/deployment/deployment-types/hybrid/amazon-ecs).

---

## Per-location configuration

When [adding a code location](/dagster-plus/deployment/code-locations) to Dagster+ with an Amazon ECS agent, you can use the `container_context` key on the location configuration to add ECS-specific configuration that will be applied to any ECS tasks associated with that code location.

**Note**: If you're using the Dagster+ GitHub action, the `container_context` key can also be set for each location in your `dagster_cloud.yaml` file.

The following example [`dagster_cloud.yaml`](/dagster-plus/deployment/code-locations/dagster-cloud-yaml) file illustrates the available fields:

```yaml
locations:
  - location_name: cloud-examples
    image: dagster/dagster-cloud-examples:latest
    code_source:
      package_name: dagster_cloud_examples
    container_context:
      ecs:
        env_vars:
          - DATABASE_NAME=staging
          - DATABASE_PASSWORD
        secrets:
          - name: "MY_API_TOKEN"
            valueFrom: "arn:aws:secretsmanager:us-east-1:123456789012:secret:FOO-AbCdEf:token::"
          - name: "MY_PASSWORD"
            valueFrom: "arn:aws:secretsmanager:us-east-1:123456789012:secret:FOO-AbCdEf:password::"
        secrets_tags:
          - "my_tag_name"
        server_resources: # Resources for code servers launched by the agent for this location
          cpu: 256
          memory: 512
          replica_count: 1
        run_resources: # Resources for runs launched by the agent for this location
          cpu: 4096
          memory: 16384
        execution_role_arn: arn:aws:iam::123456789012:role/MyECSExecutionRole
        task_role_arn: arn:aws:iam::123456789012:role/MyECSTaskRole
        mount_points:
          - sourceVolume: myEfsVolume
            containerPath: "/mount/efs"
            readOnly: True
        volumes:
          - name: myEfsVolume
            efsVolumeConfiguration:
              fileSystemId: fs-1234
              rootDirectory: /path/to/my/data
        server_sidecar_containers:
          - name: DatadogAgent
            image: public.ecr.aws/datadog/agent:latest
            environment:
              - name: ECS_FARGATE
                value: true
        run_sidecar_containers:
          - name: DatadogAgent
            image: public.ecr.aws/datadog/agent:latest
            environment:
              - name: ECS_FARGATE
                value: true
        server_ecs_tags:
          - key: MyEcsTagKey
            value: MyEcsTagValue
        run_ecs_tags:
          - key: MyEcsTagKeyWithoutValue
        repository_credentials: MyRepositoryCredentialsSecretArn
```

### Environment variables and secrets

Using the `container_context.ecs.env_vars` and `container_context.ecs.secrets` properties, you can configure environment variables and secrets for a specific code location.

```yaml
# dagster_cloud.yaml

locations:
  - location_name: cloud-examples
    image: dagster/dagster-cloud-examples:latest
    code_source:
      package_name: dagster_cloud_examples
    container_context:
      ecs:
        env_vars:
          - DATABASE_NAME=testing
          - DATABASE_PASSWORD
        secrets:
          - name: "MY_API_TOKEN"
            valueFrom: "arn:aws:secretsmanager:us-east-1:123456789012:secret:FOO-AbCdEf:token::"
          - name: "MY_PASSWORD"
            valueFrom: "arn:aws:secretsmanager:us-east-1:123456789012:secret:FOO-AbCdEf:password::"
        secrets_tags:
          - "my_tag_name"
```

| Property | Description |
|----------|-------------|
| container_context.ecs.env_vars | A list of keys or key-value pairs to include in the task. If a value is not specified, it is pulled from the agent task. Example: `FOO_ENV_VAR=foo_value` sets `FOO_ENV_VAR` to `foo_value`, while a bare `BAR_ENV_VAR` entry takes the value set on the agent task. |
| container_context.ecs.secrets | Individual secrets specified using the [same structure as the ECS API.](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_Secret.html) |
| container_context.ecs.secrets_tags | A list of tag names. Each secret tagged with any of those tag names in AWS Secrets Manager will be included in the launched tasks as environment variables. The name of the environment variable will be the name of the secret, and the value of the environment variable will be the value of the secret. |


Refer to the following guides for more info about environment variables:

- [Dagster+ environment variables and secrets](/dagster-plus/deployment/management/environment-variables/)
- [Using environment variables and secrets in Dagster code](/guides/deploy/secrets)
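
The `env_vars` rules in the table above can be illustrated with a short Python sketch. This is not how Dagster+ implements the feature; `resolve_env_vars` and its `agent_env` argument are hypothetical names used only to show how a `KEY=VALUE` entry differs from a bare `KEY` entry. What happens to a bare key that is missing from the agent task is an assumption here.

```python
from typing import Dict, List, Mapping


def resolve_env_vars(entries: List[str], agent_env: Mapping[str, str]) -> Dict[str, str]:
    """Illustrative only: resolve `env_vars` entries as the table describes.

    `KEY=VALUE` entries use the given value; bare `KEY` entries take their
    value from the agent task's environment (modeled here as `agent_env`).
    """
    resolved: Dict[str, str] = {}
    for entry in entries:
        if "=" in entry:
            # Split on the first "=" only, so values may contain "=".
            key, value = entry.split("=", 1)
            resolved[key] = value
        elif entry in agent_env:
            resolved[entry] = agent_env[entry]
        # Assumption: a bare key missing from the agent environment is skipped.
    return resolved


# Hypothetical agent environment for the example.
example_agent_env = {"DATABASE_PASSWORD": "value-from-agent-task"}
print(resolve_env_vars(["DATABASE_NAME=testing", "DATABASE_PASSWORD"], example_agent_env))
```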

---

## Per-job configuration: Resource limits

You can use job tags to customize the CPU and memory of every run for that job:

```python
from dagster import job, op


@op()
def my_op(context):
    context.log.info('running')


# The ecs/cpu and ecs/memory tags apply to every run of this job.
@job(
    tags={
        "ecs/cpu": "256",
        "ecs/memory": "512",
    }
)
def my_job():
    my_op()
```

[Fargate tasks only support certain combinations of CPU and memory.](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html)
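
As a rough aid, the following sketch checks whether an `ecs/cpu` and `ecs/memory` pair is one of the Fargate combinations. The table of sizes is transcribed from the AWS page linked above and is an assumption that may go out of date, so verify it against AWS before relying on it. `is_supported_fargate_size` and `check_job_tags` are hypothetical helpers, not part of Dagster.

```python
from typing import Dict, Mapping, Set


def _steps(start: int, stop: int, step: int) -> Set[int]:
    """All memory values (MiB) from start to stop inclusive, in fixed steps."""
    return set(range(start, stop + 1, step))


# CPU units -> allowed memory values in MiB (assumption: copied from AWS docs).
FARGATE_SIZES: Dict[int, Set[int]] = {
    256: {512, 1024, 2048},
    512: _steps(1024, 4096, 1024),
    1024: _steps(2048, 8192, 1024),
    2048: _steps(4096, 16384, 1024),
    4096: _steps(8192, 30720, 1024),
    8192: _steps(16384, 61440, 4096),
    16384: _steps(32768, 122880, 8192),
}


def is_supported_fargate_size(cpu: int, memory: int) -> bool:
    """True if the (cpu, memory) pair appears in the table above."""
    return memory in FARGATE_SIZES.get(cpu, set())


def check_job_tags(tags: Mapping[str, str]) -> bool:
    """Job tag values are strings, so convert them before checking."""
    return is_supported_fargate_size(int(tags["ecs/cpu"]), int(tags["ecs/memory"]))


print(check_job_tags({"ecs/cpu": "256", "ecs/memory": "512"}))
```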

If the `ecs/cpu` or `ecs/memory` tags are set, they will override any defaults set on the code location or the deployment.
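
The override order can be pictured with a small sketch. `resolve_run_resources` is a hypothetical helper, not a Dagster API, and it assumes that code location settings take precedence over deployment settings before job tags are applied on top.

```python
from typing import Any, Dict, Mapping


def resolve_run_resources(
    deployment: Mapping[str, Any],
    location: Mapping[str, Any],
    job_tags: Mapping[str, str],
) -> Dict[str, str]:
    """Illustrative precedence: deployment < code location < job tags."""
    resolved: Dict[str, str] = {}
    # Assumption: code location values replace deployment values.
    for layer in (deployment, location):
        for key in ("cpu", "memory"):
            if layer.get(key) is not None:
                resolved[key] = str(layer[key])
    # Job tags, when present, override whatever the defaults produced.
    for key in ("cpu", "memory"):
        tag_value = job_tags.get(f"ecs/{key}")
        if tag_value is not None:
            resolved[key] = tag_value
    return resolved


print(
    resolve_run_resources(
        deployment={"cpu": 1024, "memory": 2048},
        location={"cpu": 4096, "memory": 16384},
        job_tags={"ecs/cpu": "256"},
    )
)
```

In this example only `ecs/cpu` is tagged, so the memory value still comes from the code location.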

---

## Per-deployment configuration

This section describes the properties of the `dagster.yaml` configuration file used by Amazon ECS agents. Typically, this file is created by the CloudFormation template that deploys the agent and can be found within the agent task definition's command.

To change these properties, edit the CloudFormation template and redeploy the CloudFormation stack.

```yaml
instance_class:
  module: dagster_cloud
  class: DagsterCloudAgentInstance

dagster_cloud_api:
  agent_token: <Agent Token String>
  deployments:
    - <Deployment Name>
    - <Optional Additional Deployment Name>
  branch_deployments: <true|false>

user_code_launcher:
  module: dagster_cloud.workspace.ecs
  class: EcsUserCodeLauncher
  config:
    cluster: <Cluster Name>
    subnets:
      - <Subnet Id 1>
      - <Subnet Id 2>
    security_group_ids:
      - <Security Group ID>
    service_discovery_namespace_id: <Service Discovery Namespace Id>
    execution_role_arn: <Task Execution Role Arn>
    task_role_arn: <Task Role Arn>
    log_group: <Log Group Name>
    launch_type: <"FARGATE"|"EC2">
    server_process_startup_timeout: <Timeout in seconds>
    server_resources:
      cpu: <CPU value>
      memory: <Memory value>
    server_sidecar_containers:
      - name: SidecarName
        image: SidecarImage
        <Additional container fields>
    run_resources:
      cpu: <CPU value>
      memory: <Memory value>
    run_sidecar_containers:
      - name: SidecarName
        image: SidecarImage
        <Additional container fields>
    mount_points:
      - <List of mountPoints to pass into register_task_definition>
    volumes:
      - <List of volumes to pass into register_task_definition>
    server_ecs_tags:
      - key: MyEcsTagKey
        value: MyEcsTagValue
    run_ecs_tags:
      - key: MyEcsTagKeyWithoutValue
    repository_credentials: MyRepositoryCredentialsSecretArn

isolated_agents:
  enabled: <true|false>
agent_queues:
  include_default_queue: <true|false>
  additional_queues:
    - <queue name>
    - <additional queue name>
```

### dagster_cloud_api properties

| Property | Description |
|----------|-------------|
| dagster_cloud_api.agent_token | An agent token for the agent to use for authentication. |
| dagster_cloud_api.deployments | The names of full deployments for the agent to serve. |
| dagster_cloud_api.branch_deployments | Whether the agent should serve all branch deployments. |


### user_code_launcher properties

| Property | Description |
|----------|-------------|
| config.cluster | The name of an ECS cluster with a Fargate or EC2 capacity provider. |
| config.launch_type | The ECS launch type: <br/>• `FARGATE`<br/>• `EC2` - **Note:** Using this launch type requires an EC2 capacity provider and adds operational overhead to running the agent. |
| config.subnets | **At least one subnet is required**. Dagster+ tasks require a route to the internet so they can access our API server. How this requirement is satisfied depends on the type of subnet provided: <br/>• **Public subnets** - The ECS agent will assign each task a public IP address. Note that [ECS tasks on EC2](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-networking-awsvpc.html) launched within public subnets do not have access to the internet, so a public subnet will only work for Fargate tasks. <br/>• **Private subnets** - The ECS agent assumes you've configured a NAT gateway for the subnet. Tasks will **not** be assigned a public IP address. |
| config.security_group_ids | A list of [security groups](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-securitygroup.html) to use for tasks launched by the agent. |
| config.service_discovery_namespace_id | The name of a [private DNS namespace](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-servicediscovery-privatednsnamespace.html).<br/>The ECS agent launches each code location as its own ECS service. The agent communicates with these services via [AWS CloudMap service discovery.](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-discovery.html) |
| config.execution_role_arn | The ARN of the [Amazon ECS task execution IAM role](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_execution_IAM_role.html). This role allows ECS to interact with AWS resources on your behalf, such as getting an image from ECR or pushing logs to CloudWatch. <br/>**Note**: This role must include a trust relationship that allows ECS to use it. |
| config.task_role_arn | The ARN of the [Amazon ECS task IAM role](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html). This role allows the containers running in the ECS task to interact with AWS. <br/> **Note**: This role must include a trust relationship that allows ECS to use it. |
| config.log_group | The name of a CloudWatch [log group](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html#Create-Log-Group). |
| config.server_process_startup_timeout | The amount of time, in seconds, to wait for code to import when launching a new service for a code location. If your code takes an unusually long time to load after your ECS task starts up and results in timeouts in the **Deployment** tab, you can increase this setting above the default. **Note**: This setting isn't applicable to the time it takes for a job to execute. <br/>• **Default** - 180 (seconds) |
| config.ecs_timeout | How long (in seconds) to wait for ECS to spin up a new service and task for a code server. If your ECS tasks take an unusually long time to start and result in timeouts, you can increase this setting above the default. <br/>• **Default** - 300 (seconds) |
| config.ecs_grace_period | How long (in seconds) to continue polling if an ECS API endpoint fails during creation of a new code server (because the ECS API is eventually consistent). <br/>• **Default** - 30 (seconds)|
| config.server_resources | The resources that the agent should allocate to the ECS service for each code location that it creates. If set, must be a dictionary with a `cpu` and/or `memory` key. **Note**: [Fargate tasks only support certain combinations of CPU and memory.](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html) |
| config.server_sidecar_containers | Additional sidecar containers to include along with the Dagster container. If set, must be a list of dictionaries with valid ECS container definitions. |
| config.run_resources | The resources that the agent should allocate to the ECS task that it creates for each run. If set, must be a dictionary with a `cpu` and/or `memory` key. **Note**: [Fargate tasks only support certain combinations of CPU and memory.](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html) |
| config.run_sidecar_containers | Additional sidecar containers to include along with the Dagster container. If set, must be a list of dictionaries with valid ECS container definitions. |
| config.mount_points | Mount points to include in the Dagster container. If set, should be a list of dictionaries matching the `mountPoints` field when specifying a container definition to boto3. |
| config.volumes | Additional volumes to include in the task definition. If set, should be a list of dictionaries matching the volumes argument to `register_task_definition` in boto3. |
| config.server_ecs_tags | Additional ECS tags to include in the service for each code location. If set, must be a list of dictionaries, each with a `key` key and optional `value` key. |
| config.run_ecs_tags | Additional ECS tags to include in the task for each run. If set, must be a list of dictionaries, each with a `key` key and optional `value` key. |
| config.repository_credentials | The optional ARN of the secret used to authenticate to your private container registry. This does not apply if you use ECR for your images. See the [AWS private auth guide](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/private-auth.html) for details. |
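
The constraints in the table above can be checked before redeploying. The sketch below validates a `user_code_launcher.config` mapping that has already been loaded into a Python dictionary. `validate_launcher_config` and `effective_timeouts` are hypothetical helpers written for this illustration; they cover only a few of the documented rules and are not part of the agent.

```python
from typing import Any, Dict, List

# Defaults listed in the table above, in seconds.
DEFAULT_TIMEOUTS = {
    "server_process_startup_timeout": 180,
    "ecs_timeout": 300,
    "ecs_grace_period": 30,
}


def validate_launcher_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a `user_code_launcher.config` dict."""
    errors: List[str] = []

    if not config.get("subnets"):
        errors.append("subnets: at least one subnet is required")

    launch_type = config.get("launch_type")
    if launch_type is not None and launch_type not in ("FARGATE", "EC2"):
        errors.append(f"launch_type: expected FARGATE or EC2, got {launch_type!r}")

    for field in ("server_resources", "run_resources"):
        resources = config.get(field)
        if resources is None:
            continue
        if not isinstance(resources, dict) or not {"cpu", "memory"} & set(resources):
            errors.append(f"{field}: must be a dictionary with a cpu and/or memory key")

    for field in ("server_ecs_tags", "run_ecs_tags"):
        for tag in config.get(field) or []:
            if not isinstance(tag, dict) or "key" not in tag or set(tag) - {"key", "value"}:
                errors.append(f"{field}: each tag needs a key and an optional value")

    return errors


def effective_timeouts(config: Dict[str, Any]) -> Dict[str, int]:
    """Merge explicitly configured timeouts over the documented defaults."""
    return {name: config.get(name, default) for name, default in DEFAULT_TIMEOUTS.items()}
```

An empty list from `validate_launcher_config` means none of the checked rules were violated; it does not guarantee that the agent will accept the file.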

### isolated_agents properties

| Property | Description |
|----------|-------------|
| isolated_agents.enabled | When enabled, agents are isolated and will not be able to access each other's resources. See the [Running multiple agents guide](/dagster-plus/deployment/deployment-types/hybrid/multiple) for more information. |

### agent_queues properties

These settings specify the queue(s) the agent will obtain requests from. See [Routing requests to specific agents](/dagster-plus/deployment/deployment-types/hybrid/multiple).

| Property | Description |
|----------|-------------|
| agent_queues.include_default_queue | If true, the agent processes requests from the default queue. |
| agent_queues.additional_queues | A list of additional queues for the agent to process requests from. |
@@ -1,7 +1,89 @@
---
title: Existing VPC setup
sidebar_position: 200
unlisted: true
---

{/* TODO copy from https://docs.dagster.io/dagster-plus/deployment/agents/amazon-ecs/creating-ecs-agent-existing-vpc */}
:::note
This guide is applicable to Dagster+.
:::

In this guide, you'll set up and deploy an Amazon Elastic Container Service (ECS) agent in an existing VPC using CloudFormation. Amazon ECS agents are used to launch user code in ECS tasks.

Our CloudFormation template allows you to quickly spin up the ECS agent stack in an existing VPC. It also supports using a new or existing ECS cluster. The template code can be found [here](https://s3.amazonaws.com/dagster.cloud/cloudformation/ecs-agent.yaml). Refer to the [CloudFormation docs](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html) for more info about CloudFormation.

**For info about deploying an ECS agent in a new VPC**, check out the [ECS agents in new VPCs guide](/dagster-plus/deployment/deployment-types/hybrid/amazon-ecs/new-vpc).

---

## Prerequisites

To complete the steps in this guide, you'll need:

- **In Dagster+**:

  - **Your organization and deployment names.**
  - **Permissions in Dagster+ that allow you to manage agent tokens**. Refer to the [User permissions documentation](/dagster-plus/features/authentication-and-access-control/rbac/users) for more info.

- **In Amazon Web Services (AWS)**:
  - **An existing VPC with the following:**
    - **Subnets with access to the public internet**. Refer to the [AWS Work with VPCs guide](https://docs.aws.amazon.com/vpc/latest/userguide/working-with-vpcs.html) for more info.
    - **Enabled `enableDnsHostnames` and `enableDnsSupport` DNS attributes**. Refer to the [AWS DNS attributes documentation](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html#vpc-dns-support) for more info.
  - **Optional**: An existing ECS cluster with a [Fargate or EC2 capacity provider](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cluster-capacity-providers.html). The CloudFormation template will create a cluster for you if one isn't specified.
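
One way to confirm the two DNS attributes before launching the stack is to query them with boto3. The sketch below assumes you pass in a boto3 EC2 client; `vpc_dns_attributes` is a hypothetical helper name and the VPC ID in the usage comment is a placeholder.

```python
from typing import Any, Dict


def vpc_dns_attributes(ec2_client: Any, vpc_id: str) -> Dict[str, bool]:
    """Return the two DNS attributes for a VPC.

    `ec2_client` is expected to be a boto3 EC2 client. The API returns one
    attribute per call, so two calls are needed.
    """
    hostnames = ec2_client.describe_vpc_attribute(
        VpcId=vpc_id, Attribute="enableDnsHostnames"
    )
    support = ec2_client.describe_vpc_attribute(
        VpcId=vpc_id, Attribute="enableDnsSupport"
    )
    return {
        "enableDnsHostnames": hostnames["EnableDnsHostnames"]["Value"],
        "enableDnsSupport": support["EnableDnsSupport"]["Value"],
    }


# Usage (requires boto3 and AWS credentials; the VPC ID is a placeholder):
#   import boto3
#   print(vpc_dns_attributes(boto3.client("ec2"), "vpc-0123456789abcdef0"))
```

Both values should be `True` for the VPC to meet the prerequisite.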

---

## Step 1: Generate a Dagster+ agent token

In this step, you'll generate a token for the Dagster+ agent. The Dagster+ agent will use this to authenticate to the agent API.

1. Sign in to your Dagster+ instance.
2. Click the **user menu (your icon) > Organization Settings**.
3. In the **Organization Settings** page, click the **Tokens** tab.
4. Click the **+ Create agent token** button.
5. After the token has been created, click **Reveal token**.

Keep the token somewhere handy - you'll need it to complete the setup.

Review comment (Contributor): Horizontal rules can be removed.

---

## Step 2: Install the CloudFormation stack in AWS

Click the **Launch Stack** button to install the CloudFormation stack in your AWS account:

[<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png"/>](https://console.aws.amazon.com/cloudformation/home#/stacks/create/review?templateURL=https://s3.amazonaws.com/dagster.cloud/cloudformation/ecs-agent.yaml)

**Note**: Creating the CloudFormation stack may take a few minutes. Refresh the [AWS console **Stacks** page](https://console.aws.amazon.com/cloudformation/home#/stacks) to check the status.
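
As an alternative to refreshing the console, the stack status can be read with boto3. This is a minimal sketch that assumes you pass in a boto3 CloudFormation client; `stack_status` is a hypothetical helper, and the stack name in the usage comment is a placeholder for whatever name you chose when launching the stack.

```python
from typing import Any


def stack_status(cloudformation_client: Any, stack_name: str) -> str:
    """Return the current status string of a CloudFormation stack.

    `cloudformation_client` is expected to be a boto3 CloudFormation client.
    """
    response = cloudformation_client.describe_stacks(StackName=stack_name)
    return response["Stacks"][0]["StackStatus"]


# Usage (requires boto3 and AWS credentials; the stack name is a placeholder):
#   import boto3
#   print(stack_status(boto3.client("cloudformation"), "my-dagster-agent-stack"))
```

A status of `CREATE_COMPLETE` indicates that stack creation has finished.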

---

## Step 3: Configure the agent

After the stack is installed, you'll be prompted to configure it. In the ECS wizard, fill in the following fields:

- **Dagster+ Organization**: Enter the name of your Dagster+ organization.
- **Dagster+ Deployment**: Enter the name of the Dagster+ deployment you want to use. Leave this field empty if the agent will only serve Branch deployments.
- **Enable Branch Deployments**: Whether to have this agent serve your ephemeral [Branch deployments](/dagster-plus/features/ci-cd/branch-deployments). Only a single agent should have this setting enabled.
- **Agent Token**: Paste the agent token you generated in [Step 1](#step-1-generate-a-dagster-agent-token).
- **Deploy VPC**: The existing VPC to deploy the agent into.
- **Deploy VPC Subnet**: A public subnet of the existing VPC to deploy the agent into.
- **Existing ECS Cluster**: Optionally, the name of an existing ECS cluster to deploy the agent in. Leave blank to create a new cluster.
- **Task Launch Type**: Optionally, the launch type to use for new tasks created by the agent (FARGATE or EC2). Defaults to FARGATE.

The page should look similar to the following image. In this example, our organization name is `hooli` and our deployment is `prod`:

![Example Configuration for the ECS Agent CloudFormation Template](/images/dagster-cloud/agents/aws-ecs-stack-wizard-existing.png)

After you've finished configuring the stack in AWS, you can view the agent in Dagster+. To do so, navigate to the **Status** page and click the **Agents** tab. You should see the agent running in the **Agent statuses** section:

![Instance Status](/images/dagster-cloud/agents/dagster-cloud-instance-status.png)

---

## Next steps

Now that you've got your agent running, what's next?

- **If you're getting Dagster+ set up**, the next step is to [add a code location](/dagster-plus/deployment/code-locations) using the agent.

- **If you're ready to load your Dagster code**, refer to the [Adding Code to Dagster+](/dagster-plus/deployment/code-locations) guide for more info.

If you need to upgrade your ECS agent's CloudFormation template, refer to the [upgrade guide](/dagster-plus/deployment/deployment-types/hybrid/amazon-ecs/upgrading-cloudformation) for more info.