Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5/n][dagster-fivetran] Implement FivetranWorkspaceData to FivetranConnectorTableProps method #25797

Merged
merged 8 commits into from
Nov 12, 2024

Conversation

maximearmstrong
Copy link
Contributor

@maximearmstrong maximearmstrong commented Nov 7, 2024

Summary & Motivation

This PR implements FivetranWorkspaceData.to_fivetran_connector_table_props_data(), a method that converts a FivetranWorkspaceData object to a list of FivetranConnectorTableProps.

To create the asset spec, we need one FivetranConnectorTableProps object per connector table. This method parses the API raw data to create the FivetranConnectorTablePropsobject, which is compatible with theDagsterFivetranTranslator`.

This will be used in the defs_from_state method of the stated-backed defs loader in a subsequent PR.

How I Tested These Changes

Additional unit test

Copy link
Contributor Author

maximearmstrong commented Nov 7, 2024

@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-4 branch from b17bd45 to 252be0c Compare November 8, 2024 14:02
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-5 branch from 19493b4 to d5a03b3 Compare November 8, 2024 14:02
@maximearmstrong maximearmstrong changed the title [5/n][dagster-fivetran] Implement FivetranWorkspaceData to FivetranConnectorTableProps method [5/n][dagster-fivetran] Implement FivetranWorkspaceData to FivetranConnectorTableProps method Nov 8, 2024
@maximearmstrong maximearmstrong marked this pull request as ready for review November 8, 2024 14:20
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-4 branch from 252be0c to 56205b0 Compare November 8, 2024 16:44
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-5 branch from 1a573eb to 16e3497 Compare November 8, 2024 16:44
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-4 branch from 56205b0 to 6c50270 Compare November 8, 2024 17:00
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-5 branch from 16e3497 to aad712a Compare November 8, 2024 17:00
Copy link
Contributor

@dpeng817 dpeng817 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some more design stuff

@@ -0,0 +1,30 @@
import uuid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; why is this in translator tests? Shouldn't it be in the same place as the previous workspace method tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 697e460

raise NotImplementedError()
data: List[FivetranConnectorTableProps] = []

for connector_id, connector_data in self.connectors_by_id.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In airlift, from the raw API data we retrieved, we constructed a "lookup" object which built up the actual usable asset data from a set of cacheable properties. See

for an example of what I mean.

Since this is over the "cacheable" boundary, I'm wondering if there's a likelihood if it being called a bunch, and if so I think the cacheable structure likely makes sense. At the very least, I think that the conversion to FivetranConnectorTableProps should probably be cached, but it might also make sense to cache the stuff from 73-83 here as properties if there's potential for reuse (but if not, then maybe that's not worth the effort)

Copy link
Contributor Author

@maximearmstrong maximearmstrong Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. This method is meant to be used only in FivetranWorkspaceDefsLoader.defs_from_state, see implement in next PR #25807.

But I think caching it makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cached the method in b272d07

for schema in schemas_data.values():
if schema["enabled"]:
schema_name = schema["name_in_destination"]
schema_tables: Dict[str, Dict[str, Any]] = cast(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strong code smell from this one. The complexity from the previous PR feels like it's spilling over into this one with all the casting and raw string munging we need to do. Let's strongly type all of these objects, I think things will feel much better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. This comment applies here as well - this is taken from the legacy code to replicate the behavior. It will be updated in the same PR refactoring how we are storing destinations and connectors.

table_name = table["name_in_destination"]
data.append(
FivetranConnectorTableProps(
table=f"{schema_name}.{table_name}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it should be a standalone method / something reusable so that we're consistent about how it's constructed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 0040d59

@@ -80,7 +117,7 @@ def get_asset_spec(self, props: FivetranConnectorTableProps) -> AssetSpec:
schema_name, table_name = props.table.split(".")
schema_entry = next(
schema
for schema in props.schemas["schemas"].values()
for schema in props.schema_config["schemas"].values()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the name change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reflect Fivetran's ontology, see here

Copy link
Contributor

@dpeng817 dpeng817 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits for your consideration.

@@ -381,6 +382,16 @@
}


@pytest.fixture(name="api_key")
def api_key_fixture() -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; I feel like this makes more sense to be a constant than a fixture, personally. I feel like when I see a mountain of fixtures, I get overwhelmed by a test. I know that this doesn't originate with this PR, but I personally prefer TEST_API_KEY, TEST_API_SECRET, etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair - done in ad9fd4e

Comment on lines 6 to 10
api_key: str,
api_secret: str,
connector_id: str,
destination_id: str,
group_id: str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea these all feel like they could be constants and then the test feels a little less overwhelming, lol.

Base automatically changed from maxime/rework-fivetran-4 to master November 12, 2024 18:11
@maximearmstrong maximearmstrong force-pushed the maxime/rework-fivetran-5 branch from c407973 to 4913f6e Compare November 12, 2024 18:20
@maximearmstrong maximearmstrong merged commit d6f4f52 into master Nov 12, 2024
1 check passed
@maximearmstrong maximearmstrong deleted the maxime/rework-fivetran-5 branch November 12, 2024 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants