Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[11/n][dagster-fivetran] Implement materialization method in FivetranWorkspace #25961

Merged
merged 7 commits into from
Dec 5, 2024

Conversation

maximearmstrong
Copy link
Contributor

@maximearmstrong maximearmstrong commented Nov 15, 2024

Summary & Motivation

This PR implements FivetranWorkspace.sync_and_poll, the materialization method for Fivetran assets. This method:

  • calls FivetranClient.sync_and_poll
  • takes the FivetranOutput returned by FivetranClient.sync_and_poll and generates the asset materializations
  • yields MaterializeResult for each expected asset and AssetMaterialization for each unexpected asset
    • a connector table that was not in the connector at definitions loading time can be in the FivetranOutput. Eg. the table was added after definitions loading time and before sync.
  • logs a warning for each unmaterialized table
    • a connector table can be created at definitions loading time, but can be missing in the FivetranOutput. Eg. the table was deleted after definitions loading time and before sync.

Can be leveraged like:

from dagster_fivetran import FivetranWorkspace, fivetran_assets

import dagster as dg

fivetran_workspace = FivetranWorkspace(
    account_id=dg.EnvVar("FIVETRAN_API_KEY"),
    api_key=dg.EnvVar("FIVETRAN_API_KEY"),
    api_secret=dg.EnvVar("FIVETRAN_API_SECRET"),
)


@fivetran_assets(
    connector_id="connector_id",
    name="connector_id",
    group_name="connector_id",
    workspace=fivetran_workspace,
)
def fivetran_connector_assets(context: dg.AssetExecutionContext, fivetran: FivetranWorkspace):
    yield from fivetran.sync_and_poll(context=context)


defs = dg.Definitions(
    assets=[fivetran_connector_assets],
    resources={"fivetran": fivetran_workspace},
)

How I Tested These Changes

Additional tests with BK

Tested with a live Fivetran instance:

Screenshot 2024-11-26 at 5 46 31 PM

Changelog

[dagster-fivetran]

Copy link
Contributor Author

maximearmstrong commented Nov 15, 2024

@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-10 branch from 0521c7f to 6e115f1 Compare November 19, 2024 18:13
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from 0e0b1e2 to 78fe129 Compare November 19, 2024 18:13
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-10 branch from 6e115f1 to e07232e Compare November 20, 2024 00:48
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from 78fe129 to d3f1bf6 Compare November 20, 2024 00:48
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-10 branch from e07232e to 3be662b Compare November 21, 2024 23:55
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from d3f1bf6 to b9e54d6 Compare November 21, 2024 23:55
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-10 branch from 3be662b to 5a87eb7 Compare November 22, 2024 13:54
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch 4 times, most recently from b417e4e to 3150e3e Compare November 22, 2024 14:22
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-10 branch from ebe8457 to 748298f Compare November 27, 2024 19:28
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from c16e280 to 2df7622 Compare November 27, 2024 19:28
@maximearmstrong maximearmstrong changed the base branch from maxime/rework-fivetran-10 to maxime/use-translator-instance-in-load-fivetran-asset-specs November 27, 2024 19:28
@maximearmstrong maximearmstrong force-pushed the maxime/use-translator-instance-in-load-fivetran-asset-specs branch from d2fa78b to 67d0fcd Compare November 27, 2024 20:15
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from 2df7622 to 3317a7d Compare November 27, 2024 20:15
@maximearmstrong maximearmstrong force-pushed the maxime/use-translator-instance-in-load-fivetran-asset-specs branch from 67d0fcd to 6cf70a0 Compare November 27, 2024 21:18
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from 3317a7d to b0e8961 Compare November 27, 2024 21:18
materialized_asset_keys.add(materialization.asset_key)
else:
context.log.warning(
f"An unexpected asset was materialized: {materialization.asset_key}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe note that we're still going to yield a materialization event.

@@ -501,3 +515,25 @@ def all_api_mocks_fixture(
status=200,
)
yield fetch_workspace_data_api_mocks


@pytest.fixture(name="sync_and_poll")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poll_and_sync vs sync_and_poll

connector_id: str,
fetch_workspace_data_api_mocks: responses.RequestsMock,
sync_and_poll: MagicMock,
) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: mention that this is actually testing sync_and_poll. Calling it the materialization method feels a bit overly specified to me.

assert len(materialized_asset_keys) == 4
assert my_fivetran_assets.keys == materialized_asset_keys

# Mocked FivetranClient.sync_and_poll returns API response
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to have some way of determining that the expected log messages actually happened - this just kinda makes sure it doesn't error. Could potentially directly invoke the fivetran asset and assert against the materializeresult objects + AssetMaterializations when appropriate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with capsys in 36e9da9

@maximearmstrong maximearmstrong force-pushed the maxime/use-translator-instance-in-load-fivetran-asset-specs branch from 6cf70a0 to 47cb47a Compare December 5, 2024 18:40
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from b0e8961 to c94dc6b Compare December 5, 2024 18:40
@maximearmstrong maximearmstrong force-pushed the maxime/use-translator-instance-in-load-fivetran-asset-specs branch from 47cb47a to 79f0f78 Compare December 5, 2024 20:13
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from c94dc6b to 36e9da9 Compare December 5, 2024 20:13
Base automatically changed from maxime/use-translator-instance-in-load-fivetran-asset-specs to master December 5, 2024 20:46
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-11 branch from 36e9da9 to 99e73bd Compare December 5, 2024 20:46
@maximearmstrong maximearmstrong merged commit f6a914b into master Dec 5, 2024
1 check failed
@maximearmstrong maximearmstrong deleted the maxime/rework-fivetran-11 branch December 5, 2024 21:13
pskinnerthyme pushed a commit to pskinnerthyme/dagster that referenced this pull request Dec 16, 2024
…Workspace (dagster-io#25961)

## Summary & Motivation

This PR implements `FivetranWorkspace.sync_and_poll`, the
materialization method for Fivetran assets. This method:
- calls `FivetranClient.sync_and_poll`
- takes the FivetranOutput returned by `FivetranClient.sync_and_poll`
and generates the asset materializations
- yields `MaterializeResult` for each expected asset and
`AssetMaterialization` for each unexpected asset
- a connector table that was not in the connector at definitions loading
time can be in the FivetranOutput. Eg. the table was added after
definitions loading time and before sync.
- logs a warning for each unmaterialized table
- a connector table can be created at definitions loading time, but can
be missing in the FivetranOutput. Eg. the table was deleted after
definitions loading time and before sync.

Can be leveraged like:

```python
from dagster_fivetran import FivetranWorkspace, fivetran_assets

import dagster as dg

fivetran_workspace = FivetranWorkspace(
    account_id=dg.EnvVar("FIVETRAN_API_KEY"),
    api_key=dg.EnvVar("FIVETRAN_API_KEY"),
    api_secret=dg.EnvVar("FIVETRAN_API_SECRET"),
)


@fivetran_assets(
    connector_id="connector_id",
    name="connector_id",
    group_name="connector_id",
    workspace=fivetran_workspace,
)
def fivetran_connector_assets(context: dg.AssetExecutionContext, fivetran: FivetranWorkspace):
    yield from fivetran.sync_and_poll(context=context)


defs = dg.Definitions(
    assets=[fivetran_connector_assets],
    resources={"fivetran": fivetran_workspace},
)
```

## How I Tested These Changes

Additional tests with BK

Tested with a live Fivetran instance:

<img width="1251" alt="Screenshot 2024-11-26 at 5 46 31 PM"
src="https://github.com/user-attachments/assets/e4253119-045f-4ac7-8b98-eb805e24a843">

## Changelog

[dagster-fivetran]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants