From 0f9129c30cbf6dc4247d78223fd78c3d9be82b5a Mon Sep 17 00:00:00 2001 From: Colton Padden Date: Mon, 14 Oct 2024 14:26:25 -0400 Subject: [PATCH 01/12] [docs,beta] guide for writing a multi-asset integration --- .../docs/tutorial/multi-asset-integration.md | 278 ++++++++++++++++++ docs/docs-beta/sidebars.ts | 2 +- .../multi-asset-integration/integration.py | 93 ++++++ 3 files changed, 372 insertions(+), 1 deletion(-) create mode 100644 docs/docs-beta/docs/tutorial/multi-asset-integration.md create mode 100644 examples/docs_beta_snippets/docs_beta_snippets/guides/tutorials/multi-asset-integration/integration.py diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md new file mode 100644 index 0000000000000..dbd69aaa4de18 --- /dev/null +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -0,0 +1,278 @@ +--- +title: Create a multi-asset integration +description: Create a decorator based multi-asset integration +--- + +# Create a multi-asset integration + +When working in the Dagster ecosystem, you may have noticed that decorators are frequently used. For example, assets, jobs, and ops use decorators. If you have a service that produces many assets, it's possible to define it as a multi-asset decorator — offering a consistent and intuitive developer experience to existing Dagster APIs. + +In the context of Dagster, decorators are helpful because they often wrap some form of processing. For example, when writing an asset, you define your processing code and then annotate the function with the `@asset` decorator. Then, the internal Dagster code can register the asset, assign metadata, pass in context data, or perform any other variety of operations that are required to integrate your asset code with the Dagster platform. + +In this guide, you'll learn how to develop a multi-asset integration for a hypothetical replication tool. 
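To make the decorator idea concrete, here is a minimal plain-Python sketch of a parameterized decorator that registers the function it wraps. It is not how Dagster implements `@asset` or `@multi_asset`; the `REGISTRY` dictionary and the `register` function are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical registry standing in for whatever bookkeeping a framework does.
REGISTRY: Dict[str, Dict[str, Any]] = {}


def register(*, name: Optional[str] = None, group_name: Optional[str] = None):
    """Return a decorator that records the wrapped function in REGISTRY."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        key = name or fn.__name__
        REGISTRY[key] = {"fn": fn, "group_name": group_name}
        # The function itself is returned unchanged; only the registry is updated.
        return fn

    return decorator


@register(name="users", group_name="replication")
def replicate_users() -> str:
    return "replicated users"


print(sorted(REGISTRY))
print(REGISTRY["users"]["fn"]())
```

The outer function takes configuration and returns the real decorator, which is the same shape that a custom multi-asset decorator takes when it accepts arguments such as a name or a group.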
+ +--- + +## Prerequisites + +To follow the steps in this guide, you'll need: + +- Familiarity with Dagster +- An understanding of Python decorators — [Real Python's Primer on Python Decorators](https://realpython.com/primer-on-python-decorators/) is a fantastic introduction + +--- + +## Step 1: Input + +For this guide, let's imagine a tool that replicates data between two databases. It's configured using a `replication.yaml` configuration file, in which a user is able to define source and destination databases, along with the tables that they would like to replicate between these systems. + +```yaml +connections: + source: + type: duckdb + connection: example.duckdb + destination: + type: postgres + connection: postgresql://postgres:postgres@localhost/postgres + +tables: + - name: users + primary_key: id + - name: products + primary_key: id + - name: activity + primary_key: id +``` + +For the integration we're building, we want to provide a multi-asset that encompasses this replication process, and generates an asset for each table being replicated. + +We will define a dummy function named `replicate` that will mock the replication process, and yield one dictionary per table containing the table name and its replication status. In the real world, this could be a function in a library, or a call to a command-line tool. + +```python +import yaml + +from pathlib import Path +from typing import Mapping, Iterator, Any + + +def replicate(replication_configuration_yaml: Path) -> Iterator[Mapping[str, Any]]: + data = yaml.safe_load(replication_configuration_yaml.read_text()) + for table in data.get("tables"): + # < perform replication here, and get status > + # the "name" key and "SUCCESS" status are what the consuming code checks for + yield {"name": table.get("name"), "status": "SUCCESS"} +``` + +--- + +## Step 2: Implementation + +First, let's define a `ReplicationProject` object that takes in the path of our configuration YAML file. This will allow us to encapsulate the logic that gets metadata and table information from our project configuration. 
+ +```python +import yaml +from pathlib import Path + + +class ReplicationProject(): + def __init__(self, replication_configuration_yaml: str): + self.replication_configuration_yaml = replication_configuration_yaml + + def load(self): + return yaml.safe_load(Path(self.replication_configuration_yaml).read_text()) +``` + +Next, define a function that returns a `multi_asset` function. The `multi_asset` function is a decorator itself, so this allows us to customize the behavior of `multi_asset` and create a new decorator of our own: + +```python +def custom_replication_assets( + *, + replication_project: ReplicationProject, + name: Optional[str] = None, + group_name: Optional[str] = None, +) -> Callable[[Callable[..., Any]], AssetsDefinition]: + project = replication_project.load() + + return multi_asset( + name=name, + group_name=group_name, + specs=[ + AssetSpec( + key=table.get("name"), + ) + for table in project.get("tables") + ], + ) +``` + +Let's review what this code does: + +- Defines a function that returns a `multi_asset` function +- Loads our replication project and iterates over the tables defined in the input YAML file +- Uses the tables to create a list of `AssetSpec` objects and passes them to the `specs` parameter, thus defining assets that will be visible in the Dagster UI + +Next, we'll show you how to perform the execution of the replication function. + +Recall that decorators allow us to wrap a function that performs some operation. In the case of our `multi_asset`, we defined `AssetSpec` objects for our tables, and the actual processing that takes place will be in the body of the decorated function. + +In this function, we will perform the replication, and then yield `AssetMaterialization` objects indicating that the replication was successful for a given table. 
+ +```python +from dagster import AssetExecutionContext, AssetMaterialization + + +replication_project_path = "replication.yaml" +replication_project = ReplicationProject(replication_project_path) + + +@custom_replication_assets( + replication_project=replication_project, + name="my_custom_replication_assets", + group_name="replication", +) +def my_assets(context: AssetExecutionContext): + results = replicate(Path(replication_project_path)) + for table in results: + if table.get("status") == "SUCCESS": + yield AssetMaterialization(asset_key=str(table.get("name")), metadata=table) +``` + +There are a few limitations to this approach: + +- **We have not encapsulated the logic for replicating tables.** This means that users who use the `custom_replication_assets` decorator would be responsible for yielding asset materializations themselves. +- **Users can't customize the attributes of the asset**. + +For the first limitation, we can resolve this by refactoring the code in the body of our asset function into a Dagster resource. + +--- + +## Step 3: Moving the replication logic into a resource + +Refactoring the replication logic into a resource enables us to support better configurability and reusability of our logic. + +To accomplish this, we will extend the `ConfigurableResource` object to create a custom resource. 
Then, we will define a `run` method that will perform the replication operation: + +```python +from dagster import AssetMaterialization, ConfigurableResource +from dagster._annotations import public + + +class ReplicationResource(ConfigurableResource): + @public + def run( + self, replication_project: ReplicationProject + ) -> Iterator[AssetMaterialization]: + results = replicate(Path(replication_project.replication_configuration_yaml)) + for table in results: + if table.get("status") == "SUCCESS": + # NOTE: this assumes that the table name is the same as the asset key + yield AssetMaterialization( + asset_key=str(table.get("name")), metadata=table + ) +``` + +Now, we can refactor our `custom_replication_assets` instance to use this resource: + +```python +@custom_replication_assets( + replication_project=replication_project, + name="my_custom_replication_assets", + group_name="replication", +) +def my_assets(replication_resource: ReplicationResource): + # `run` is a generator, so its materializations must be re-yielded here + yield from replication_resource.run(replication_project) +``` + +--- + +## Step 4: Using translators + +At the end of [Step 2](#step-2-implementation), we mentioned that end users were unable to customize asset attributes, like the asset key, generated by our decorator. Translator classes are the recommended way of defining this logic, and they provide users with the option to override the default methods used to convert a concept from your tool (e.g. a table name) to the corresponding concept in Dagster (e.g. asset key). + +To start, we will define a translator method to map the table specification to a Dagster asset key. **Note**: in a real world integration you will want to define methods for all common attributes like dependencies, group names, and metadata. 
+ +```python +from dagster import AssetKey, _check as check + +from dataclasses import dataclass + + +@dataclass +class ReplicationTranslator: + @public + def get_asset_key(self, table_definition: Mapping[str, str]) -> AssetKey: + return AssetKey(str(table_definition.get("name"))) +``` + +Next, we'll update `custom_replication_assets` to use the translator when defining the `key` on the `AssetSpec`. Note that we took this opportunity to also include the replication project and translator instance on the `AssetSpec` metadata. This is a workaround that we tend to employ in this approach, as it makes it possible to define these objects once and then access them on the context of our asset. + +```python +def custom_replication_assets( + *, + replication_project: ReplicationProject, + name: Optional[str] = None, + group_name: Optional[str] = None, + translator: Optional[ReplicationTranslator] = None, +) -> Callable[[Callable[..., Any]], AssetsDefinition]: + project = replication_project.load() + + translator = ( + check.opt_inst_param(translator, "translator", ReplicationTranslator) + or ReplicationTranslator() + ) + + return multi_asset( + name=name, + group_name=group_name, + specs=[ + AssetSpec( + key=translator.get_asset_key(table), + metadata={ + # store the ReplicationProject instance, not the loaded dict, + # so the resource can type-check it when reading it back + "replication_project": replication_project, + "replication_translator": translator, + }, + ) + for table in project.get("tables") + ], + ) +``` + +Finally, we have to update our resource to use the translator and project provided in the metadata. We are using the `check.inst` function provided by `dagster._check` to ensure that the type of the object is appropriate as we retrieve it from the metadata. 
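Before updating the resource, it may help to see how a user could customize key generation by subclassing the translator. The following is a plain-Python sketch under stated assumptions: `ReplicationTranslator` here is a stand-in that returns the key as a list of path components instead of a Dagster `AssetKey` (so it runs without Dagster installed), and `PrefixedReplicationTranslator` with its `prefix` field is a hypothetical name introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Mapping


@dataclass
class ReplicationTranslator:
    """Stand-in for the guide's translator.

    Returns the key as a list of path components rather than a Dagster
    AssetKey, so the sketch has no third-party dependencies.
    """

    def get_asset_key(self, table_definition: Mapping[str, str]) -> List[str]:
        return [str(table_definition.get("name"))]


@dataclass
class PrefixedReplicationTranslator(ReplicationTranslator):
    """Hypothetical subclass that puts every key under a common prefix."""

    prefix: str = "warehouse"

    def get_asset_key(self, table_definition: Mapping[str, str]) -> List[str]:
        # Reuse the default behaviour, then prepend the prefix component.
        return [self.prefix, *super().get_asset_key(table_definition)]


tables = [{"name": "users"}, {"name": "products"}]
default_keys = [ReplicationTranslator().get_asset_key(t) for t in tables]
prefixed_keys = [PrefixedReplicationTranslator().get_asset_key(t) for t in tables]
print(default_keys)
print(prefixed_keys)
```

With the real classes, the override would return an `AssetKey` built from the same components, and the instance would be passed through the `translator` argument of `custom_replication_assets`.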
+ +Now, we can use the same `translator.get_asset_key` when yielding the asset materialization, thus ensuring that our asset declarations match our asset materializations: + +```python +class ReplicationResource(ConfigurableResource): + @public + def run(self, context: AssetExecutionContext) -> Iterator[AssetMaterialization]: + metadata_by_key = context.assets_def.metadata_by_key + first_asset_metadata = next(iter(metadata_by_key.values())) + + project = check.inst( + first_asset_metadata.get("replication_project"), + ReplicationProject, + ) + + translator = check.inst( + first_asset_metadata.get("replication_translator"), + ReplicationTranslator, + ) + + results = replicate(Path(project.replication_configuration_yaml)) + for table in results: + if table.get("status") == "SUCCESS": + yield AssetMaterialization( + asset_key=translator.get_asset_key(table), metadata=table + ) +``` + +--- + +## Conclusion + +In this guide we walked through how to define a custom multi-asset decorator, a resource for encapsulating tool logic, and a translator for defining the logic to translate a specification to Dagster concepts. + +Defining integrations with this approach aligns nicely with the overall development paradigm of Dagster, and is suitable for tools that generate many assets. 
+ +The code in its entirety can be seen below: + + diff --git a/docs/docs-beta/sidebars.ts b/docs/docs-beta/sidebars.ts index 6b77bccad07b2..8f65a786199a0 100644 --- a/docs/docs-beta/sidebars.ts +++ b/docs/docs-beta/sidebars.ts @@ -11,7 +11,7 @@ const sidebars: SidebarsConfig = { type: 'category', label: 'Tutorial', collapsed: false, - items: ['tutorial/tutorial-etl'], + items: ['tutorial/tutorial-etl', 'tutorial/multi-asset-integration'], }, { type: 'category', diff --git a/examples/docs_beta_snippets/docs_beta_snippets/guides/tutorials/multi-asset-integration/integration.py b/examples/docs_beta_snippets/docs_beta_snippets/guides/tutorials/multi-asset-integration/integration.py new file mode 100644 index 0000000000000..78de81b5173b3 --- /dev/null +++ b/examples/docs_beta_snippets/docs_beta_snippets/guides/tutorials/multi-asset-integration/integration.py @@ -0,0 +1,93 @@ +from dataclasses import dataclass +from pathlib import Path +from typing import Any, Callable, Iterator, Mapping, Optional + +import yaml + +from dagster import ( + AssetExecutionContext, + AssetKey, + AssetMaterialization, + AssetsDefinition, + AssetSpec, + ConfigurableResource, + _check as check, + multi_asset, +) +from dagster._annotations import public + + +def replicate(replication_configuration_yaml: Path) -> Iterator[Mapping[str, Any]]: + data = yaml.safe_load(replication_configuration_yaml.read_text()) + for table in data.get("tables"): + # < perform replication here, and get status > + # the "name" key and "SUCCESS" status are what ReplicationResource.run checks for + yield {"name": table.get("name"), "status": "SUCCESS"} + + +class ReplicationProject: + def __init__(self, replication_configuration_yaml: str): + self.replication_configuration_yaml = replication_configuration_yaml + + def load(self): + return yaml.safe_load(Path(self.replication_configuration_yaml).read_text()) + + +class ReplicationResource(ConfigurableResource): + @public + def run(self, context: AssetExecutionContext) -> Iterator[AssetMaterialization]: + metadata_by_key = 
context.assets_def.metadata_by_key + first_asset_metadata = next(iter(metadata_by_key.values())) + + project = check.inst( + first_asset_metadata.get("replication_project"), + ReplicationProject, + ) + + translator = check.inst( + first_asset_metadata.get("replication_translator"), + ReplicationTranslator, + ) + + results = replicate(Path(project.replication_configuration_yaml)) + for table in results: + if table.get("status") == "SUCCESS": + yield AssetMaterialization( + asset_key=translator.get_asset_key(table), metadata=table + ) + + +@dataclass +class ReplicationTranslator: + @public + def get_asset_key(self, table_definition: Mapping[str, str]) -> AssetKey: + return AssetKey(str(table_definition.get("name"))) + + +def custom_replication_assets( + *, + replication_project: ReplicationProject, + name: Optional[str] = None, + group_name: Optional[str] = None, + translator: Optional[ReplicationTranslator] = None, +) -> Callable[[Callable[..., Any]], AssetsDefinition]: + project = replication_project.load() + + translator = ( + check.opt_inst_param(translator, "translator", ReplicationTranslator) + or ReplicationTranslator() + ) + + return multi_asset( + name=name, + group_name=group_name, + specs=[ + AssetSpec( + key=translator.get_asset_key(table), + metadata={ + # store the ReplicationProject instance, not the loaded dict, + # so ReplicationResource.run can type-check it with check.inst + "replication_project": replication_project, + "replication_translator": translator, + }, + ) + for table in project.get("tables") + ], + ) From e8d7ca91dca7fb599085ecc0dd81a6cae6a968fb Mon Sep 17 00:00:00 2001 From: Colton Padden Date: Mon, 14 Oct 2024 14:34:02 -0400 Subject: [PATCH 02/12] vale --- .../docs-beta/docs/tutorial/multi-asset-integration.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index dbd69aaa4de18..0b430f4a43065 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -5,7 +5,7 
@@ description: Create a decorator based multi-asset integration # Create a multi-asset integration -When working in the Dagster ecosystem, you may have noticed that decorators are frequently used. For example, assets, jobs, and ops use decorators. If you have a service that produces many assets, it's possible to define it as a multi-asset decorator — offering a consistent and intuitive developer experience to existing Dagster APIs. +When working in the Dagster ecosystem, you may have noticed that decorators are frequently used. For example, assets, jobs, and ops use decorators. If you have a service that produces many assets, it's possible to define it as a multi-asset decorator-offering a consistent and intuitive developer experience to existing Dagster APIs. In the context of Dagster, decorators are helpful because they often wrap some form of processing. For example, when writing an asset, you define your processing code and then annotate the function with the `@asset` decorator. Then, the internal Dagster code can register the asset, assign metadata, pass in context data, or perform any other variety of operations that are required to integrate your asset code with the Dagster platform. 
@@ -18,7 +18,7 @@ In this guide, you'll learn how to develop a multi-asset integration for a hypot To follow the steps in this guide, you'll need: - Familiarity with Dagster -- An understanding of Python decorators — [Real Python's Primer on Python Decorators](https://realpython.com/primer-on-python-decorators/) is a fantastic introduction +- An understanding of Python decorators—[Real Python's Primer on Python Decorators](https://realpython.com/primer-on-python-decorators/) is a fantastic introduction --- @@ -147,7 +147,7 @@ For the first limitation, we can resolve this by refactoring the code in the bod ## Step 3: Moving the replication logic into a resource -Refactoring the replication logic into a resource enables us to support better configurability and reusability of our logic. +Refactoring the replication logic into a resource enables us to support better configuration and re-use of our logic. To accomplish this, we will extend the `ConfigurableResource` object to create a custom resource. Then, we will define a `run` method that will perform the replication operation: @@ -186,7 +186,7 @@ def my_assets(replication_resource: ReplicationProject): ## Step 4: Using translators -At the end of [Step 2](#step-2-implementation), we mentioned that end users were unable to customize asset attributes, like the asset key, generated by our decorator. Translator classes are the recommended way of defining this logic, and they provide users with the option to override the default methods used to convert a concept from your tool (e.g. a table name) to the corresponding concept in Dagster (e.g. asset key). +At the end of [Step 2](#step-2-implementation), we mentioned that end users were unable to customize asset attributes, like the asset key, generated by our decorator. 
Translator classes are the recommended way of defining this logic, and they provide users with the option to override the default methods used to convert a concept from your tool (for example, a table name) to the corresponding concept in Dagster (for example, asset key). To start, we will define a translator method to map the table specification to a Dagster asset key. **Note**: in a real world integration you will want to define methods for all common attributes like dependencies, group names, and metadata. @@ -203,7 +203,7 @@ class ReplicationTranslator: return AssetKey(str(table_definition.get("name"))) ``` -Next, we'll update `custom_replication_assets` to use the translator when defining the `key` on the `AssetSpec`. Note that we took this opportunity to also include the replication project and translator instance on the `AssetSpec` metadata. This is a workaround that we tend to employ in this approach, as it makes it possible to define these objects once and then access them on the context of our asset. +Next, we'll update `custom_replication_assets` to use the translator when defining the `key` on the `AssetSpec`. **Note** that we took this opportunity to also include the replication project and translator instance on the `AssetSpec` metadata. This is a workaround that we tend to employ in this approach, as it makes it possible to define these objects once and then access them on the context of our asset. 
```python def custom_replication_assets( From 7701d0a21b0b090f2993a93afc51bf2a9c7ddda9 Mon Sep 17 00:00:00 2001 From: Colton Padden Date: Mon, 14 Oct 2024 14:36:09 -0400 Subject: [PATCH 03/12] fix yml highlighting --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index 0b430f4a43065..b6ca0e86c903f 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -26,7 +26,7 @@ To follow the steps in this guide, you'll need: For this guide, let's imagine a tool that replicates data between two databases. It's configured using a `replication.yaml` configuration file, in which a user is able to define source and destination databases, along with the tables that they would like to replicate between these systems. -```yaml +```yml connections: source: type: duckdb From f33bbb7af92e1d860805d16212bde8227c07457c Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:54:15 -0500 Subject: [PATCH 04/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index b6ca0e86c903f..c146eb9ff5a09 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -11,7 +11,6 @@ In the context of Dagster, decorators are helpful because they often wrap some f In this guide, you'll learn how to develop a multi-asset integration for a hypothetical replication tool. 
---- ## Prerequisites From fa399eb7c47b0ac43bda7914192d15bef71975bf Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:54:23 -0500 Subject: [PATCH 05/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index c146eb9ff5a09..08e2762919e84 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -1,5 +1,5 @@ --- -title: Create a multi-asset integration +title: Creating a multi-asset integration description: Create a decorator based multi-asset integration --- From d2f954bc1f701a0c1d27cdbbfa8ff4bfd729793e Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:54:29 -0500 Subject: [PATCH 06/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index 08e2762919e84..42fe17621a1c3 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -3,7 +3,6 @@ title: Creating a multi-asset integration description: Create a decorator based multi-asset integration --- -# Create a multi-asset integration When working in the Dagster ecosystem, you may have noticed that decorators are frequently used. For example, assets, jobs, and ops use decorators. If you have a service that produces many assets, it's possible to define it as a multi-asset decorator-offering a consistent and intuitive developer experience to existing Dagster APIs. 
From 2c816e6d24edb94515d2f029db03a260dfa46b6e Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:54:37 -0500 Subject: [PATCH 07/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index 42fe17621a1c3..e0f3cb99f8a03 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -141,7 +141,6 @@ There are a few limitations to this approach: For the first limitation, we can resolve this by refactoring the code in the body of our asset function into a Dagster resource. ---- ## Step 3: Moving the replication logic into a resource From 9b48c249c10926bcdb717ecf85ffae8ca0a9c885 Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:54:48 -0500 Subject: [PATCH 08/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index e0f3cb99f8a03..ea776f1e52e96 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -185,7 +185,11 @@ def my_assets(replication_resource: ReplicationProject): At the end of [Step 2](#step-2-implementation), we mentioned that end users were unable to customize asset attributes, like the asset key, generated by our decorator. 
Translator classes are the recommended way of defining this logic, and they provide users with the option to override the default methods used to convert a concept from your tool (for example, a table name) to the corresponding concept in Dagster (for example, asset key). -To start, we will define a translator method to map the table specification to a Dagster asset key. **Note**: in a real world integration you will want to define methods for all common attributes like dependencies, group names, and metadata. +To start, we will define a translator method to map the table specification to a Dagster asset key. + +:::note +In a real world integration, you will want to define methods for all common attributes like dependencies, group names, and metadata. +::: ```python from dagster import AssetKey, _check as check From 56be87673cf93e37cb3a09099880eda0c6e354da Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:54:56 -0500 Subject: [PATCH 09/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index ea776f1e52e96..0990291a2c4d3 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -179,7 +179,6 @@ def my_assets(replication_resource: ReplicationProject): replication_resource.run(replication_project) ``` ---- ## Step 4: Using translators From 3200f44d1dcbb83ed620b556e57bde70b942cc02 Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:55:03 -0500 Subject: [PATCH 10/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git 
a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index 0990291a2c4d3..d5dc7f91c26a7 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -203,7 +203,11 @@ class ReplicationTranslator: return AssetKey(str(table_definition.get("name"))) ``` -Next, we'll update `custom_replication_assets` to use the translator when defining the `key` on the `AssetSpec`. **Note** that we took this opportunity to also include the replication project and translator instance on the `AssetSpec` metadata. This is a workaround that we tend to employ in this approach, as it makes it possible to define these objects once and then access them on the context of our asset. +Next, we'll update `custom_replication_assets` to use the translator when defining the `key` on the `AssetSpec`. + +:::note +Note that we took this opportunity to also include the replication project and translator instance on the `AssetSpec` metadata. This is a workaround that we tend to employ in this approach, as it makes it possible to define these objects once and then access them on the context of our asset. 
+::: ```python def custom_replication_assets( From d6ca7c97231c6b626f47394c3eb1bd5cfe22e9db Mon Sep 17 00:00:00 2001 From: colton Date: Fri, 20 Dec 2024 13:55:09 -0500 Subject: [PATCH 11/12] Update docs/docs-beta/docs/tutorial/multi-asset-integration.md Co-authored-by: Nikki Everett --- docs/docs-beta/docs/tutorial/multi-asset-integration.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index d5dc7f91c26a7..ecffa724fdd6c 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -269,7 +269,6 @@ class ReplicationResource(ConfigurableResource): ) ``` ---- ## Conclusion From c28b0c50220c9b0dbe8b049fa7e666bc0000448b Mon Sep 17 00:00:00 2001 From: Colton Padden Date: Fri, 20 Dec 2024 14:19:51 -0500 Subject: [PATCH 12/12] use note for prereq --- .../docs/tutorial/multi-asset-integration.md | 18 +++--------------- 1 file changed, 3 insertions(+), 15 deletions(-) diff --git a/docs/docs-beta/docs/tutorial/multi-asset-integration.md b/docs/docs-beta/docs/tutorial/multi-asset-integration.md index ecffa724fdd6c..92ec6f133c47b 100644 --- a/docs/docs-beta/docs/tutorial/multi-asset-integration.md +++ b/docs/docs-beta/docs/tutorial/multi-asset-integration.md @@ -3,22 +3,15 @@ title: Creating a multi-asset integration description: Create a decorator based multi-asset integration --- - When working in the Dagster ecosystem, you may have noticed that decorators are frequently used. For example, assets, jobs, and ops use decorators. If you have a service that produces many assets, it's possible to define it as a multi-asset decorator-offering a consistent and intuitive developer experience to existing Dagster APIs. In the context of Dagster, decorators are helpful because they often wrap some form of processing. 
For example, when writing an asset, you define your processing code and then annotate the function with the `@asset` decorator. Then, the internal Dagster code can register the asset, assign metadata, pass in context data, or perform any other variety of operations that are required to integrate your asset code with the Dagster platform. In this guide, you'll learn how to develop a multi-asset integration for a hypothetical replication tool. - -## Prerequisites - -To follow the steps in this guide, you'll need: - -- Familiarity with Dagster -- An understanding of Python decorators—[Real Python's Primer on Python Decorators](https://realpython.com/primer-on-python-decorators/) is a fantastic introduction - ---- +:::note +This guide assumes basic familiarity with Dagster and Python decorators. +::: ## Step 1: Input @@ -60,8 +53,6 @@ def replicate(replication_configuration_yaml: Path) -> Iterator[Mapping[str, Any yield {"table": table.get("name"), "status": "success"} ``` ---- - ## Step 2: Implementation First, let's define a `Project` object that takes in the path of our configuration YAML file. This will allow us to encapsulate the logic that gets metadata and table information from our project configuration. @@ -141,7 +132,6 @@ There are a few limitations to this approach: For the first limitation, we can resolve this by refactoring the code in the body of our asset function into a Dagster resource. - ## Step 3: Moving the replication logic into a resource Refactoring the replication logic into a resource enables us to support better configuration and re-use of our logic. @@ -179,7 +169,6 @@ def my_assets(replication_resource: ReplicationProject): replication_resource.run(replication_project) ``` - ## Step 4: Using translators At the end of [Step 2](#step-2-implementation), we mentioned that end users were unable to customize asset attributes, like the asset key, generated by our decorator. 
Translator classes are the recommended way of defining this logic, and they provide users with the option to override the default methods used to convert a concept from your tool (for example, a table name) to the corresponding concept in Dagster (for example, asset key). @@ -269,7 +258,6 @@ class ReplicationResource(ConfigurableResource): ) ``` - ## Conclusion In this guide we walked through how to define a custom multi-asset decorator, a resource for encapsulating tool logic, and a translator for defining the logic to translate a specification to Dagster concepts.