[azure-docs] refine AzureBlobComputeLogManager guide
Additions and modifications noted from dogfooding the guide
mlarose committed Dec 11, 2024
1 parent ba9f895 commit 7196119
Showing 2 changed files with 33 additions and 34 deletions.
4 changes: 4 additions & 0 deletions docs/content/dagster-plus/deployment/azure/acr-user-code.mdx
@@ -169,3 +169,7 @@
alt="Dagster+ code locations page showing the new code location"
width={1152}
height={320}
/>

## Next steps

Now that you have your code location deployed, you can follow the guide [here](/dagster-plus/deployment/azure/blob-compute-logs) to set up logging in your AKS cluster.
63 changes: 29 additions & 34 deletions docs/content/dagster-plus/deployment/azure/blob-compute-logs.mdx
@@ -25,66 +25,56 @@
First, we'll enable the cluster to use workload identity. This will allow the AK…
az aks update --resource-group <resource-group> --name <cluster-name> --enable-workload-identity
```

Then, we'll create a new managed identity for the AKS agent.

```bash
az identity create --resource-group <resource-group> --name agent-identity
```

Next, we'll need to find the name of the service account used by the Dagster+ Agent. If you used the [Dagster+ Helm chart](/dagster-plus/deployment/agents/kubernetes/configuring-running-kubernetes-agent), it should be `user-cloud-dagster-cloud-agent`. You can confirm with this command:

```bash
kubectl get serviceaccount -n <dagster-agent-namespace>
```

Now we need to federate the managed identity with the service account used by the Dagster+ Agent.

```bash
az identity federated-credential create \
--name dagster-agent-federated-id \
--identity-name agent-identity \
--resource-group <resource-group> \
--issuer $(az aks show -g <resource-group> -n <aks-cluster-name> --query "oidcIssuerProfile.issuerUrl" -otsv) \
--subject system:serviceaccount:<dagster-agent-namespace>:<dagster-agent-service-account>
```
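The `--subject` value always follows the pattern `system:serviceaccount:<namespace>:<serviceaccount>`. As a minimal sketch, assuming the hypothetical namespace `dagster-agent` and the Helm chart's default service account name:

```shell
# Hypothetical values for illustration; substitute your own namespace
# and the service account name found in the previous step.
NAMESPACE="dagster-agent"
SERVICE_ACCOUNT="user-cloud-dagster-cloud-agent"

# The --subject passed to `az identity federated-credential create`
# is always system:serviceaccount:<namespace>:<serviceaccount>.
SUBJECT="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
echo "$SUBJECT"
```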

To enable workload identity, new labels need to be added to the pods the agent launches. If you're using the Dagster+ Helm chart, modify your `values.yaml` to add the following lines:

```yaml
dagsterCloudAgent:
  labels:
    azure.workload.identity/use: "true"

workspace:
  labels:
    azure.workload.identity/use: "true"
```
Note: if you need to retrieve the current values used by your Helm deployment, you can run `helm get values user-cloud > current-values.yaml`.


If everything is set up correctly, you should be able to run the following command and see an access token returned:

```bash
kubectl exec -n <dagster-agent-namespace> -it <pod-in-cluster> -- bash
# in the pod
apt update && apt install -y curl
curl -H "Metadata:true" "http://169.254.169.254/metadata/identity/oauth2/token?resource=https://storage.azure.com/&api-version=2018-02-01"
```
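On success, the metadata endpoint returns a JSON document containing an `access_token` field. As a sketch of how to pull out the token without extra tooling, assuming a shortened, fake response for illustration:

```shell
# A fake, truncated response shaped like the metadata endpoint's JSON;
# the real call returns a much longer bearer token.
RESPONSE='{"access_token":"eyJ0eXAi...","expires_in":"86399","token_type":"Bearer"}'

# Extract the access_token field using only POSIX sed (no jq required).
TOKEN=$(printf '%s' "$RESPONSE" | sed -n 's/.*"access_token":"\([^"]*\)".*/\1/p')
echo "$TOKEN"
```

An empty `$TOKEN` here would suggest the identity federation from step 1 is not wired up correctly.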

## Step 2: Configure Dagster to use Azure Blob Storage

Once again, you need to update the Helm values to use Azure Blob Storage for logs. You can do this by editing the `values.yaml` file for your user-cloud deployment to include the following lines:

```yaml
computeLogs:
  # ... (remaining configuration collapsed in the diff view)
```
Finally, update your deployment with the new values:

```bash
helm upgrade user-cloud dagster-cloud/dagster-cloud-agent -n <dagster-agent-namespace> -f values.yaml
```

## Step 3: Update your code location to enable the use of the AzureBlobComputeLogManager

- Add `dagster-azure` to your `setup.py` file. This will allow you to import the `AzureBlobComputeLogManager` class.
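As a sketch, assuming a `requirements.txt`-based code location (if you use `setup.py`, add `dagster-azure` to `install_requires` instead):

```shell
# Add the dependency to the code location's requirements file so the
# AzureBlobComputeLogManager class is importable at runtime.
echo "dagster-azure" >> requirements.txt
```

After updating the dependency, redeploy the code location so the new image picks it up.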


## Step 4: Verify logs are being written to Azure Blob Storage

It's time to kick off a run in Dagster to test your new configuration. If you're following along with the quickstart repo, you should be able to kick off a run of the `all_assets_job`, which will generate logs for you to test against. Otherwise, use any job that emits logs. When you go to the stdout/stderr window of the run page, you should see a log file that directs you to the Azure Blob Storage container.

