[dagster-powerbi] Add basic docs (#24269)
## Summary

Introduce initial docs for Power BI, to send to initial design partners.
These are hidden from nav for now since the integration is not being
widely publicized.
benpankow authored Sep 23, 2024
1 parent 90073c8 commit 3c3c32a
Showing 13 changed files with 377 additions and 1 deletion.
Binary file modified docs/content/api/modules.json.gz
Binary file not shown.
Binary file modified docs/content/api/searchindex.json.gz
Binary file not shown.
Binary file modified docs/content/api/sections.json.gz
Binary file not shown.
167 changes: 167 additions & 0 deletions docs/content/integrations/powerbi.mdx
@@ -0,0 +1,167 @@
---
title: "Using Dagster with Power BI"
description: Represent your Power BI assets in Dagster
---

# Using Dagster with Power BI

<ExperimentalCallout />

This guide provides instructions for using Dagster with Power BI. Your Power BI assets, such as semantic models, data sources, reports, and dashboards, can be represented in the Dagster asset graph, allowing you to track lineage and dependencies between Power BI assets and upstream data assets you are already modeling in Dagster. You can also use Dagster to orchestrate Power BI semantic models, allowing you to trigger refreshes of these models on a cadence or based on upstream data changes.

## What you'll learn

- How to represent Power BI assets in the Dagster asset graph, including lineage to other Dagster assets.
- How to customize asset definition metadata for these Power BI assets.
- How to materialize Power BI semantic models from Dagster.
- How to customize how Power BI semantic models are materialized.

<details>
<summary>Prerequisites</summary>

- Familiarity with asset definitions and the Dagster asset graph
- Familiarity with Dagster resources
- Familiarity with Power BI concepts, like semantic models, data sources, reports, and dashboards
- A Power BI workspace
- A service principal configured to access Power BI, or an API access token. For more information, see [Embed Power BI content with service principal and an application secret](https://learn.microsoft.com/en-us/power-bi/developer/embedded/embed-service-principal) in the Power BI documentation.

</details>

## Represent Power BI assets in the asset graph

To load Power BI assets into the Dagster asset graph, you must first construct a `PowerBIWorkspace` resource, which allows Dagster to communicate with your Power BI workspace. You'll need to supply your workspace ID and credentials. You may configure a service principal or use an API access token, which can be passed directly or accessed from the environment using `EnvVar`.

Dagster can automatically load all semantic models, data sources, reports, and dashboards from your Power BI workspace. Call the `build_defs()` function, which returns a `Definitions` object containing all the asset definitions for these Power BI assets.

```python file=/integrations/power-bi/representing-power-bi-assets.py
import uuid
from http import client
from typing import cast

from dagster_powerbi import PowerBIServicePrincipal, PowerBIToken, PowerBIWorkspace

from dagster import Definitions, EnvVar, asset, define_asset_job

# Connect using a service principal
resource = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)

# Alternatively, connect directly using an API access token
resource = PowerBIWorkspace(
    credentials=PowerBIToken(api_token=EnvVar("POWER_BI_API_TOKEN")),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)

defs = resource.build_defs()
```
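
Since the credentials and workspace ID above are read from the environment, it can help to confirm the variables are set before loading your definitions. The following is a minimal sketch of such a check using only the standard library. The helper `missing_power_bi_env_vars` is hypothetical and not part of `dagster_powerbi`; the variable names are the ones used in the service principal example above.

```python
import os

# Environment variable names taken from the service principal example above.
REQUIRED_POWER_BI_ENV_VARS = (
    "POWER_BI_CLIENT_ID",
    "POWER_BI_CLIENT_SECRET",
    "POWER_BI_TENANT_ID",
    "POWER_BI_WORKSPACE_ID",
)


def missing_power_bi_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_POWER_BI_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_power_bi_env_vars()
    if missing:
        print("Missing Power BI environment variables: " + ", ".join(missing))
```

If you connect with an API access token instead, the list would contain `POWER_BI_API_TOKEN` and `POWER_BI_WORKSPACE_ID`.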

### Customize asset definition metadata for Power BI assets

By default, Dagster will generate asset keys for each Power BI asset based on its type and name and populate default metadata. You can further customize asset properties by passing a custom `DagsterPowerBITranslator` subclass to the `build_defs()` function. This subclass can implement methods to customize the asset keys or specs for each Power BI asset type.

```python file=/integrations/power-bi/customize-power-bi-asset-defs.py
from dagster_powerbi import (
    DagsterPowerBITranslator,
    PowerBIServicePrincipal,
    PowerBIWorkspace,
)
from dagster_powerbi.translator import PowerBIContentData

from dagster import EnvVar
from dagster._core.definitions.asset_key import AssetKey
from dagster._core.definitions.asset_spec import AssetSpec

resource = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)


# A translator class lets us customize properties of the built
# Power BI assets, such as the owners or asset key
class MyCustomPowerBITranslator(DagsterPowerBITranslator):
    def get_report_spec(self, data: PowerBIContentData) -> AssetSpec:
        # We add a team owner tag to all reports
        return super().get_report_spec(data)._replace(owners=["my_team"])

    def get_semantic_model_spec(self, data: PowerBIContentData) -> AssetSpec:
        return super().get_semantic_model_spec(data)._replace(owners=["my_team"])

    def get_dashboard_spec(self, data: PowerBIContentData) -> AssetSpec:
        return super().get_dashboard_spec(data)._replace(owners=["my_team"])

    def get_dashboard_asset_key(self, data: PowerBIContentData) -> AssetKey:
        # We prefix all dashboard asset keys with "powerbi" for organizational
        # purposes
        return super().get_dashboard_asset_key(data).with_prefix("powerbi")


defs = resource.build_defs(dagster_powerbi_translator=MyCustomPowerBITranslator)
```

### Load Power BI assets from multiple workspaces

Definitions from multiple Power BI workspaces can be combined by instantiating multiple `PowerBIWorkspace` resources and merging their definitions. This lets you view all your Power BI assets in a single asset graph:

```python file=/integrations/power-bi/multiple-power-bi-workspaces.py
from dagster_powerbi import PowerBIServicePrincipal, PowerBIWorkspace

from dagster import Definitions, EnvVar

credentials = PowerBIServicePrincipal(
    client_id=EnvVar("POWER_BI_CLIENT_ID"),
    client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
    tenant_id=EnvVar("POWER_BI_TENANT_ID"),
)

sales_team_workspace = PowerBIWorkspace(
    credentials=credentials,
    workspace_id="726c94ff-c408-4f43-8edf-61fbfa1753c7",
)

marketing_team_workspace = PowerBIWorkspace(
    credentials=credentials,
    workspace_id="8b7f815d-4e64-40dd-993c-cfa4fb12edee",
)

# We use Definitions.merge to combine the definitions from both workspaces
# into a single set of definitions to load
defs = Definitions.merge(
    sales_team_workspace.build_defs(),
    marketing_team_workspace.build_defs(),
)
```

## Materialize Power BI semantic models from Dagster

By default, Dagster pulls in Power BI semantic models as external assets, which appear in the asset graph but can't be materialized. To make them materializable, so that materializing a semantic model triggers a refresh in Power BI, pass `enable_refresh_semantic_models=True` to the `build_defs()` function:

```python file=/integrations/power-bi/materialize-semantic-models.py
import uuid
from typing import cast

from dagster_powerbi import PowerBIServicePrincipal, PowerBIWorkspace

from dagster import Definitions, EnvVar, asset, define_asset_job

resource = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)
defs = resource.build_defs(enable_refresh_semantic_models=True)
```

You can then add these semantic models to jobs or as targets of Dagster sensors or schedules to trigger refreshes of the models on a cadence or based on other conditions.
Binary file modified docs/next/public/objects.inv
Binary file not shown.
13 changes: 13 additions & 0 deletions docs/vale/styles/config/vocabularies/Dagster/accept.txt
@@ -78,6 +78,7 @@ PingOne
Polars
Postgres
Prometheus
Power BI
Pydantic
RBAC
RDS
@@ -133,3 +134,15 @@ uncomment
unpartitioned
vCPU
vCPUs
we have


SLA
SLAs
performant
SOC
GDPR
HIPAA
IAM
ECS
AWS
3 changes: 2 additions & 1 deletion examples/docs_beta_snippets/tox.ini
@@ -45,10 +45,11 @@ deps =
integrations: -e ../../python_modules/libraries/dagster-census
integrations: -e ../../python_modules/libraries/dagster-msteams
integrations: -e ../../python_modules/libraries/dagster-msteams
integrations: -e ../../python_modules/libraries/dagster-sdf
integrations: -e ../../python_modules/libraries/dagster-sdf
integrations: -e ../../python_modules/libraries/dagster-looker
integrations: -e ../../python_modules/libraries/dagster-prometheus
integrations: -e ../../python_modules/libraries/dagster-openai
integrations: -e ../../python_modules/libraries/dagster-powerbi
-e .
allowlist_externals =
/bin/bash
@@ -0,0 +1,41 @@
from dagster_powerbi import (
    DagsterPowerBITranslator,
    PowerBIServicePrincipal,
    PowerBIWorkspace,
)
from dagster_powerbi.translator import PowerBIContentData

from dagster import EnvVar
from dagster._core.definitions.asset_key import AssetKey
from dagster._core.definitions.asset_spec import AssetSpec

resource = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)


# A translator class lets us customize properties of the built
# Power BI assets, such as the owners or asset key
class MyCustomPowerBITranslator(DagsterPowerBITranslator):
    def get_report_spec(self, data: PowerBIContentData) -> AssetSpec:
        # We add a team owner tag to all reports
        return super().get_report_spec(data)._replace(owners=["my_team"])

    def get_semantic_model_spec(self, data: PowerBIContentData) -> AssetSpec:
        return super().get_semantic_model_spec(data)._replace(owners=["my_team"])

    def get_dashboard_spec(self, data: PowerBIContentData) -> AssetSpec:
        return super().get_dashboard_spec(data)._replace(owners=["my_team"])

    def get_dashboard_asset_key(self, data: PowerBIContentData) -> AssetKey:
        # We prefix all dashboard asset keys with "powerbi" for organizational
        # purposes
        return super().get_dashboard_asset_key(data).with_prefix("powerbi")


defs = resource.build_defs(dagster_powerbi_translator=MyCustomPowerBITranslator)
@@ -0,0 +1,16 @@
import uuid
from typing import cast

from dagster_powerbi import PowerBIServicePrincipal, PowerBIWorkspace

from dagster import Definitions, EnvVar, asset, define_asset_job

resource = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)
defs = resource.build_defs(enable_refresh_semantic_models=True)
@@ -0,0 +1,26 @@
from dagster_powerbi import PowerBIServicePrincipal, PowerBIWorkspace

from dagster import Definitions, EnvVar

credentials = PowerBIServicePrincipal(
    client_id=EnvVar("POWER_BI_CLIENT_ID"),
    client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
    tenant_id=EnvVar("POWER_BI_TENANT_ID"),
)

sales_team_workspace = PowerBIWorkspace(
    credentials=credentials,
    workspace_id="726c94ff-c408-4f43-8edf-61fbfa1753c7",
)

marketing_team_workspace = PowerBIWorkspace(
    credentials=credentials,
    workspace_id="8b7f815d-4e64-40dd-993c-cfa4fb12edee",
)

# We use Definitions.merge to combine the definitions from both workspaces
# into a single set of definitions to load
defs = Definitions.merge(
    sales_team_workspace.build_defs(),
    marketing_team_workspace.build_defs(),
)
@@ -0,0 +1,25 @@
import uuid
from http import client
from typing import cast

from dagster_powerbi import PowerBIServicePrincipal, PowerBIToken, PowerBIWorkspace

from dagster import Definitions, EnvVar, asset, define_asset_job

# Connect using a service principal
resource = PowerBIWorkspace(
    credentials=PowerBIServicePrincipal(
        client_id=EnvVar("POWER_BI_CLIENT_ID"),
        client_secret=EnvVar("POWER_BI_CLIENT_SECRET"),
        tenant_id=EnvVar("POWER_BI_TENANT_ID"),
    ),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)

# Alternatively, connect directly using an API access token
resource = PowerBIWorkspace(
    credentials=PowerBIToken(api_token=EnvVar("POWER_BI_API_TOKEN")),
    workspace_id=EnvVar("POWER_BI_WORKSPACE_ID"),
)

defs = resource.build_defs()
1 change: 1 addition & 0 deletions examples/docs_snippets/tox.ini
@@ -35,6 +35,7 @@ deps =
-e ../../python_modules/libraries/dagster-k8s
-e ../../python_modules/libraries/dagster-pandas
-e ../../python_modules/libraries/dagster-postgres
-e ../../python_modules/libraries/dagster-powerbi
-e ../../python_modules/libraries/dagster-pyspark
-e ../../python_modules/libraries/dagster-slack
-e ../../python_modules/libraries/dagster-gcp-pandas

2 comments on commit 3c3c32a

@github-actions github-actions bot commented on 3c3c32a Sep 23, 2024

Deploy preview for dagster-docs-beta ready!

✅ Preview
https://dagster-docs-beta-pek28w3wu-elementl.vercel.app

Built with commit 3c3c32a.
This pull request is being automatically deployed with vercel-action

@github-actions

Deploy preview for dagster-docs ready!

✅ Preview
https://dagster-docs-2j1omip1h-elementl.vercel.app
https://master.dagster.dagster-docs.io

Built with commit 3c3c32a.
This pull request is being automatically deployed with vercel-action
