[10/n][dagster-fivetran] Implement fivetran_assets and build_fivetran_assets_definitions #25944
    def sync_and_poll(
        self, context: Optional[Union[OpExecutionContext, AssetExecutionContext]] = None
    ):
        raise NotImplementedError()

    def __eq__(self, other):
        return (
            isinstance(other, FivetranWorkspace)
            and self.account_id == other.account_id
            and self.api_key == other.api_key
            and self.api_secret == other.api_secret
        )

    def __hash__(self):
        return hash((self.account_id + self.api_key + self.api_secret))


@lru_cache(maxsize=None)
Caching `load_fivetran_asset_specs` with `functools.lru_cache` requires `FivetranWorkspace` to be hashable.
An alternative to caching `load_fivetran_asset_specs` would be to call it in a `FivetranWorkspace` cached property.
class FivetranWorkspace(ConfigurableResource):
    ...

    @cached_property
    def asset_specs(
        self,
        dagster_fivetran_translator: Type[DagsterFivetranTranslator] = DagsterFivetranTranslator
    ) -> Sequence[AssetSpec]:
        return load_fivetran_asset_specs(
            workspace=self,
            dagster_fivetran_translator=dagster_fivetran_translator
        )
For context: after two earlier discussions, we decided to detach spec loading from the resources, which is why I cached `load_fivetran_asset_specs` instead of caching a property.
Tagging @benpankow @dpeng817 @schrockn @alangenfeld @OwenKephart for feedback on this specific comment and the PR description about caching the asset specs calls.
I bias towards `@cached_method` on an object instance instead of `@lru_cache(maxsize=None)` on a top-level function - I think it's much easier to reason about when things are expected to be cached.
We went with `cached_property` for the `FivetranWorkspace` equivalent in Airlift - I think that's likely the right move here as well.
Updated to a cached method in `FivetranWorkspace` in 501caf7.
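To make the difference from a module-level `lru_cache` concrete, here is a minimal sketch of per-instance method caching. The `cached_method` decorator below is a simplified, hypothetical version written for illustration, not Dagster's internal helper, and `Workspace.load_asset_specs` is likewise an assumed name.

```python
import functools


def cached_method(method):
    """Hypothetical per-instance cache keyed by positional arguments."""

    @functools.wraps(method)
    def wrapper(self, *args):
        # The cache is stored on the instance, so it is scoped to that
        # object and disappears together with it.
        cache = self.__dict__.setdefault("_method_cache", {})
        key = (method.__name__, args)
        if key not in cache:
            cache[key] = method(self, *args)
        return cache[key]

    return wrapper


class Workspace:
    """Hypothetical stand-in for FivetranWorkspace."""

    def __init__(self, account_id: str):
        self.account_id = account_id
        self.api_calls = 0  # stands in for requests to the Fivetran API

    @cached_method
    def load_asset_specs(self, translator_name: str = "default"):
        self.api_calls += 1
        return [f"{translator_name}:{self.account_id}/table_a"]


ws = Workspace("acct")
ws.load_asset_specs()
ws.load_asset_specs()  # second call is served from the instance cache
```

Because the cache is keyed per instance, the object itself does not need to be hashable, and it is clear from the call site which object owns the cached result.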
I... don't think we should add these. Reasoning:
- All other cases where we added new decorators were before we converged upon a new API design - and I think likely weren't considered holistically. The `@dbt_assets` decorator is an exception, but we didn't have `AssetSpec` yet. If we could go back and do it again, I have some doubts we would have added any of them (although there is some messy transformational stuff that it needs to do, so maybe I'm wrong).
- We can always add it later. But if we add it now we're stuck supporting it and eventually adding arguments for all the asset spec things (like `group_name` here). Eventually you'll need to support some sort of splitting API, etc. I just think it's better to point users towards a more generalizable solution.
I'm also not sure I understand why we need both the decorator and `build_fivetran_assets_definitions`. Is the idea that it just provides progressively easier starting points?
If the actual function implementations were doing more complicated stuff, maybe I would see the additional value, but these are pretty simple wrappers, so I think we should hold off.
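One way to read the "more generalizable solution" argument: if specs are plain data, users can adjust attributes such as `group_name` with ordinary functions instead of waiting for a decorator argument. The sketch below uses a hypothetical `Spec` dataclass and `load_specs` loader as stand-ins; it does not use Dagster's `AssetSpec` or the real `load_fivetran_asset_specs`.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for an asset spec (not dagster's AssetSpec)."""

    key: str
    group_name: Optional[str] = None


def load_specs() -> Sequence[Spec]:
    # Stand-in for a spec loader; a real loader would query the workspace.
    return [Spec("fivetran/table_a"), Spec("fivetran/table_b")]


def with_group(specs: Sequence[Spec], group_name: str) -> List[Spec]:
    # Return copies with group_name set, leaving the loaded specs untouched.
    return [replace(spec, group_name=group_name) for spec in specs]


grouped = with_group(load_specs(), "fivetran")
```

Any other attribute could be handled the same way, so no new decorator parameter is needed per attribute.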
…_assets_definitions (#25944)

## Summary & Motivation

This PR implements the `fivetran_assets` decorator and the `build_fivetran_assets_definitions` factory.
- `fivetran_assets` can be used to create all assets for a given Fivetran connector, i.e. one asset per table in the connector.
- `build_fivetran_assets_definitions` can be used to create all Fivetran assets defs, one per connector. It uses `fivetran_assets`.

Both the asset decorator and factory use `load_fivetran_asset_specs`. This is motivated by the current implementation of `dagster-dbt`, `dagster-dlt` and `dagster-sling` - each leverages an asset decorator that loads the asset specs by itself.

To avoid calling the Fivetran API each time `load_fivetran_asset_specs` is called, it is cached using `functools.lru_cache`. `load_fivetran_asset_specs` uses the state-backed defs, so reloading the code won't make additional calls to the Fivetran API, but calling `load_fivetran_asset_specs` N times in a script will make N calls to the Fivetran API.

The goals here are:
- make the Fivetran integration as similar as possible to the other ELT integrations by using the same patterns, eg. asset decorator
- make the user experience as simple as possible and avoid having users manage the asset specs and number of calls to the Fivetran API.

## How I Tested These Changes

Additional unit tests with BK.

## Changelog

[dagster-fivetran] The `fivetran_assets` decorator is added. It can be used with the `FivetranWorkspace` resource and `DagsterFivetranTranslator` translator to load Fivetran tables for a given connector as assets in Dagster. The `build_fivetran_assets_definitions` factory can be used to create assets for all the connectors in your Fivetran workspace.
…and state-backed defs (#26133)

## Summary & Motivation

Updates `load_fivetran_asset_specs()` and state-backed definitions to accept an instance of `DagsterFivetranTranslator`. See more about the motivation in the original thread [here](#25944 (comment)).

## How I Tested These Changes

Additional unit tests to test custom translators with BK

## Changelog

[dagster-fivetran] `load_fivetran_asset_specs` is updated to accept an instance of `DagsterFivetranTranslator` or custom subclass.
Summary & Motivation
This PR implements the `fivetran_assets` decorator and the `build_fivetran_assets_definitions` factory.
- `fivetran_assets` can be used to create all assets for a given Fivetran connector, i.e. one asset per table in the connector.
- `build_fivetran_assets_definitions` can be used to create all Fivetran assets defs, one per connector. It uses `fivetran_assets`.
Both the asset decorator and factory use `load_fivetran_asset_specs`. This is motivated by the current implementation of `dagster-dbt`, `dagster-dlt` and `dagster-sling` - each leverages an asset decorator that loads the asset specs by itself.
To avoid calling the Fivetran API each time `load_fivetran_asset_specs` is called, it is cached using `functools.lru_cache`. `load_fivetran_asset_specs` uses the state-backed defs, so reloading the code won't make additional calls to the Fivetran API, but calling `load_fivetran_asset_specs` N times in a script will make N calls to the Fivetran API.
The goals here are:
- make the Fivetran integration as similar as possible to the other ELT integrations by using the same patterns, e.g. the asset decorator
- make the user experience as simple as possible and avoid having users manage the asset specs and the number of calls to the Fivetran API.
How I Tested These Changes
Additional unit tests with BK.
Changelog
[dagster-fivetran] The `fivetran_assets` decorator is added. It can be used with the `FivetranWorkspace` resource and `DagsterFivetranTranslator` translator to load Fivetran tables for a given connector as assets in Dagster. The `build_fivetran_assets_definitions` factory can be used to create assets for all the connectors in your Fivetran workspace.