[9/n][dagster-fivetran] Implement base sync methods in FivetranClient #25911

maximearmstrong · 2024-11-13T22:19:58Z

Summary & Motivation

This PR reworks legacy sync methods and implements them in the FivetranClient:

- update_schedule_type_for_connector is added based on legacy update_schedule_type and update_connector
- _start_sync is based on the legacy start_sync and start_resync
  - it avoids code duplication
  - start_resync will be added in a subsequent PR
  - the order of some steps has been reversed - we verify that a connector is syncable before updating the state of its schedule
- start_sync is added based on legacy start_sync

Tests mock the request API calls and make sure that all calls are made.
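
The reordered steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeFivetranClient`, its `calls` list, and the assumption that the schedule is always switched to "manual" before the request are all hypothetical and exist only to show the ordering.

```python
from typing import Any, Callable, List, Mapping


class FakeFivetranClient:
    """Hypothetical stand-in that records call order; not the real FivetranClient."""

    def __init__(self, syncable: bool) -> None:
        self.syncable = syncable
        self.calls: List[str] = []

    def validate_syncable(self, connector_id: str) -> None:
        self.calls.append("validate_syncable")
        if not self.syncable:
            raise RuntimeError(f"Connector '{connector_id}' is not syncable.")

    def update_schedule_type_for_connector(self, connector_id: str, schedule_type: str) -> None:
        self.calls.append(f"update_schedule_type:{schedule_type}")

    def _start_sync(self, request_fn: Callable[[], Mapping[str, Any]], connector_id: str) -> None:
        # Validate first: a connector that cannot sync keeps its schedule untouched.
        self.validate_syncable(connector_id)
        # Assumption for this sketch: the schedule is switched to "manual" before triggering.
        self.update_schedule_type_for_connector(connector_id, "manual")
        request_fn()

    def start_sync(self, connector_id: str) -> None:
        def request_fn() -> Mapping[str, Any]:
            # Stands in for the API request that forces a sync.
            self.calls.append("force")
            return {}

        self._start_sync(request_fn=request_fn, connector_id=connector_id)
```

With this ordering, a connector that fails validation never reaches the schedule update, which is the behavior change called out in the list above.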

How I Tested These Changes

Additional unit tests with BK

maximearmstrong · 2024-11-13T22:20:23Z

This stack of pull requests is managed by Graphite.

graphite-app · 2024-11-13T22:20:55Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/resources.py

+    def update_schedule_type(
+        self, connector_id: str, schedule_type: Optional[str] = None
+    ) -> Mapping[str, Any]:
+        """Updates the schedule type property of the connector to either "auto" or "manual".
+
+        Args:
+            connector_id (str): The Fivetran Connector ID. You can retrieve this value from the
+                "Setup" tab of a given connector in the Fivetran UI.
+            schedule_type (Optional[str]): Either "auto" (to turn the schedule on) or "manual" (to
+                turn it off).
+
+        Returns:
+            Dict[str, Any]: Parsed json data representing the API response.
+        """
+        if schedule_type not in ["auto", "manual"]:
+            check.failed(f"schedule_type must be either 'auto' or 'manual': got '{schedule_type}'")
+        return self.update_connector(connector_id, properties={"schedule_type": schedule_type})


The schedule_type parameter is marked as Optional[str] but the validation logic will raise an error if None is passed. Consider either removing the Optional type hint or adding a None check before the schedule_type validation to align the type hints with the actual behavior.

Spotted by Graphite Reviewer

this is a good one.
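
One way to align the type hint with the validation is to drop `Optional` entirely. The sketch below is only an illustration: `build_schedule_update` is a hypothetical helper that returns the properties payload instead of calling `update_connector`, and it raises `ValueError` rather than using `check.failed` so that it runs without dagster.

```python
from typing import Any, Mapping

VALID_SCHEDULE_TYPES = ("auto", "manual")


def build_schedule_update(schedule_type: str) -> Mapping[str, Any]:
    """Hypothetical helper: validate the schedule type and build the update payload."""
    # The parameter is a plain str, so None is rejected by the same check as any other bad value.
    if schedule_type not in VALID_SCHEDULE_TYPES:
        raise ValueError(
            f"schedule_type must be either 'auto' or 'manual': got '{schedule_type}'"
        )
    return {"schedule_type": schedule_type}
```

The signature now says what the body enforces: callers must pass one of the two supported strings.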

python_modules/libraries/dagster-fivetran/dagster_fivetran/translator.py

python_modules/libraries/dagster-fivetran/dagster_fivetran/resources.py

dpeng817 · 2024-11-18T16:45:09Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/translator.py

+    @classmethod
+    def has_value(cls, value) -> bool:
+        return value in cls._value2member_map_
+
+    @classmethod
+    def values(cls) -> Sequence[str]:
+        return list(cls._value2member_map_.keys())


highly surprised you need to add these

Enum lets you iterate over members and use __contains__ with the symbolic names, but it has no public function that exposes _value2member_map_. These helpers avoid calling _value2member_map_ directly in the resource.

I see - so this is entirely due to the difference in casing between key and value? In that case maybe we should just make them match and add a comment

Updated FivetranConnectorScheduleType to subclass both str and Enum in 8e607df. Callsites have been updated too.
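
For readers following along, here is a small sketch of what subclassing both str and Enum buys. The class name `ScheduleType`, its member names, and the two helper functions are assumptions made for illustration; they are not copied from 8e607df.

```python
from enum import Enum
from typing import List


class ScheduleType(str, Enum):
    """Illustrative enum whose members are also plain strings."""

    AUTO = "auto"
    MANUAL = "manual"


def is_valid_schedule_type(value: str) -> bool:
    """Look a value up through the public constructor instead of _value2member_map_."""
    try:
        ScheduleType(value)
    except ValueError:
        return False
    return True


def schedule_type_values() -> List[str]:
    """List the raw values by iterating over the members."""
    return [member.value for member in ScheduleType]
```

Because members compare equal to their string values (`ScheduleType.AUTO == "auto"`), callsites can pass them wherever the API expects a string, and no private attribute is needed for validation.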

python_modules/libraries/dagster-fivetran/dagster_fivetran_tests/experimental/conftest.py

dpeng817 · 2024-11-18T16:49:53Z

python_modules/libraries/dagster-fivetran/dagster_fivetran/resources.py

+        )
+        self._start_sync(request_fn=request_fn, connector_id=connector_id)
+
+    def _start_sync(self, request_fn: Callable, connector_id: str) -> None:


nit: Can we be more explicit about the callable signature pls

Updated in d341d4b
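
As an illustration of a more explicit callable signature, the sketch below types the request function as a zero-argument callable returning a mapping. The exact signature chosen in d341d4b is not shown in this thread, so `Callable[[], Mapping[str, Any]]`, `run_sync_request`, and `fake_force_sync` are assumptions.

```python
from functools import partial
from typing import Any, Callable, Mapping


def run_sync_request(
    request_fn: Callable[[], Mapping[str, Any]], connector_id: str
) -> Mapping[str, Any]:
    """Hypothetical runner: invoke the prepared request and tag the response."""
    response = request_fn()
    return {"connector_id": connector_id, **response}


def fake_force_sync(connector_id: str) -> Mapping[str, Any]:
    """Stand-in for an API call; returns a canned response instead of hitting Fivetran."""
    return {"code": "Success", "endpoint": f"{connector_id}/force"}


# The caller binds the arguments up front, so the runner only needs a zero-argument callable.
result = run_sync_request(partial(fake_force_sync, "my_connector"), "my_connector")
```

With the parameter list spelled out, a type checker can flag a request function that expects arguments the runner never supplies.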

dpeng817

I'm finding this PR hard to review:

- a ton of methods are getting added, but it seems like only the base layer (start_sync, start_resync, poll_sync) are under test, and the assertion is kinda bare. See additional comments regarding testing
- I don't know what the actual public API surface area is here that we're introducing, or what it's going to look like when it's used. For example, all this stuff about the previous_sync_completed_at: I was at first very concerned that this was a pretty awkward surface area, but then I inferred (and I could be wrong) that this isn't actually exposed to users. But I did that through context clues.
- In general, I don't really understand what's coming from legacy code and what's new. I think some comments elucidating that could be quite useful.

More broadly, this is likely our last chance to make significant changes to these APIs until 2.0. So I think we should try our best to fix any changes in the APIs now rather than later.

dpeng817 · 2024-11-18T17:02:41Z

python_modules/libraries/dagster-fivetran/dagster_fivetran_tests/experimental/test_resources.py

+    all_api_mocks.calls.reset()
+    client.start_sync(connector_id=connector_id)
+    assert len(all_api_mocks.calls) == 3
+    assert f"{connector_id}/force" in all_api_mocks.calls[2].request.url
+
+    # resync calls
+    all_api_mocks.calls.reset()
+    client.start_resync(connector_id=connector_id, resync_parameters=None)
+    assert len(all_api_mocks.calls) == 3
+    assert f"{connector_id}/resync" in all_api_mocks.calls[2].request.url
+
+    # resync calls with parameters
+    all_api_mocks.calls.reset()
+    client.start_resync(connector_id=connector_id, resync_parameters={"property1": ["string"]})
+    assert len(all_api_mocks.calls) == 3
+    assert f"{connector_id}/schemas/tables/resync" in all_api_mocks.calls[2].request.url
+
+    # poll calls
+    all_api_mocks.calls.reset()
+    client.poll_sync(
+        connector_id=connector_id, previous_sync_completed_at=parser.parse(MIN_TIME_STR)
+    )


I'm finding these tests to be pretty insufficient, this is OK maybe for the underlying start_xsync fxns, but sync_and_poll, and resync_and_poll have pretty complicated logic that I think is worth testing the state changes on.

I can add tests for sync_and_poll and resync_and_poll - the initial goal was to test the full behavior with the FivetranWorkspace.sync_and_poll method that will be introduced in PR 11 of this stack.

I tend to prefer PRs to be relatively atomic in their additions + tests - so I think yea makes sense to add some tests here as well.

For what it's worth, with only 9/n and 10/n open, there was no way for me to know that context. I think it's useful to try to think about things from that perspective when writing up PR description, adding comments etc.

That's totally fair - I will update the tests here and make sure to apply this feedback in the future as well.

sounds good! Thanks
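
A test of the state changes discussed in this thread could look roughly like the sketch below. It does not use the real client: `sync_and_poll` here is a simplified hypothetical function, and `get_sync_status` and the status strings are invented for the example.

```python
from unittest.mock import Mock, call


def sync_and_poll(client, connector_id: str, max_polls: int = 5) -> str:
    """Hypothetical helper: start a sync, then poll until it leaves the 'syncing' state."""
    client.start_sync(connector_id=connector_id)
    for _ in range(max_polls):
        status = client.get_sync_status(connector_id=connector_id)
        if status != "syncing":
            return status
    raise TimeoutError(f"Sync for '{connector_id}' did not finish in {max_polls} polls.")


def test_sync_and_poll_state_changes() -> None:
    client = Mock()
    # Each poll returns the next state in the sequence.
    client.get_sync_status.side_effect = ["syncing", "syncing", "succeeded"]

    assert sync_and_poll(client, "my_connector") == "succeeded"
    # Assert the full call sequence, not just the number of calls.
    assert client.mock_calls == [
        call.start_sync(connector_id="my_connector"),
        call.get_sync_status(connector_id="my_connector"),
        call.get_sync_status(connector_id="my_connector"),
        call.get_sync_status(connector_id="my_connector"),
    ]
```

Asserting on the ordered call list catches regressions in sequencing that a bare `len(calls) == 3` check would miss.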

maximearmstrong · 2024-11-18T17:44:55Z

> a ton of methods are getting added, but it seems like only the base layer (start_sync, start_resync, poll_sync) are under test, and the assertion is kinda bare. See additional comments regarding testing

@dpeng817 The focus of this PR is mainly to take what we had in the legacy code, implement it in the new FivetranClient, and rework the methods to:

- make the code cleaner
- correct the behavior where errors were introduced. For example, the schedule state of the connector was previously updated before we confirmed that the connector was syncable.

> I don't know what the actual public API surface area is here that we're introducing, or what it's going to look like when it's used. For example, all this stuff about the previous_sync_completed_at: I was at first very concerned that this was a pretty awkward surface area, but then I inferred (and I could be wrong) that this isn't actually exposed to users. But I did that through context clues.

In most cases, users will call fivetran_workspace.sync_and_poll() (PR 11 in this stack), which syncs and polls the connector and materializes the assets.

> In general, I don't really understand what's coming from legacy code and what's new. I think some comments elucidating that could be quite useful.

I can add comments to clarify where changes are made.

> More broadly, this is likely our last chance to make significant changes to these APIs until 2.0. So I think we should try our best to fix any changes in the APIs now rather than later.

The first effort with this stack of PRs is to reproduce the current behavior using the new API pattern. Once we have that, the goal is to work with a design partner that is already using dagster-fivetran, make the transition from the previous pattern to the new one, then improve the behavior with the design partner.

…6059) ## Summary & Motivation This PR reworks legacy resync method and implements it in the `FivetranClient`: - `start_resync` is added based on legacy `start_resync` - `start_resync` leverages `_start_sync` introduced in #25911 - a [resync in Fivetran](https://fivetran.com/docs/rest-api/api-reference/connectors/resync-connector) is historical data sync - the endpoint and result is different, but logic around how to call and handle a resync is the same as a sync. - a resync can be done with or without resync parameters, using a different endpoint. Tests mock the request API calls and make sure that all calls are made. ## How I Tested These Changes Additional unit tests with BK

This was referenced Nov 13, 2024

[7/n][dagster-fivetran] Implement load_fivetran_asset_specs #25808

Merged

[8/n][dagster-fivetran] Implement FivetranConnector and FivetranDestination #25889

Merged

graphite-app bot reviewed Nov 13, 2024

maximearmstrong force-pushed the maxime/rework-fivetran-8 branch from 079dff2 to a8daee3 Compare November 14, 2024 23:48

maximearmstrong force-pushed the maxime/rework-fivetran-9 branch from c152a82 to 8562731 Compare November 14, 2024 23:48

graphite-app bot reviewed Nov 14, 2024

maximearmstrong force-pushed the maxime/rework-fivetran-8 branch from a8daee3 to 283f3d8 Compare November 15, 2024 02:12

maximearmstrong force-pushed the maxime/rework-fivetran-9 branch from 8562731 to 0eb21fe Compare November 15, 2024 02:12

maximearmstrong mentioned this pull request Nov 15, 2024

[10/n][dagster-fivetran] Implement fivetran_assets and build_fivetran_assets_definitions #25944

Merged

graphite-app bot reviewed Nov 15, 2024

maximearmstrong marked this pull request as ready for review November 15, 2024 14:46

maximearmstrong self-assigned this Nov 15, 2024

maximearmstrong requested review from benpankow and dpeng817 November 15, 2024 14:46

Base automatically changed from maxime/rework-fivetran-8 to master November 15, 2024 14:53

maximearmstrong force-pushed the maxime/rework-fivetran-9 branch from bd27402 to 357ff3f Compare November 15, 2024 14:54

maximearmstrong mentioned this pull request Nov 15, 2024

[11/n][dagster-fivetran] Implement materialization method in FivetranWorkspace #25961

Merged

dpeng817 reviewed Nov 18, 2024

dpeng817 reviewed Nov 18, 2024

dpeng817 requested changes Nov 18, 2024

dpeng817 reviewed Nov 18, 2024

maximearmstrong force-pushed the maxime/rework-fivetran-9 branch 2 times, most recently from 72d546e to a069876 Compare November 25, 2024 16:50

maximearmstrong mentioned this pull request Nov 25, 2024

[dagster-fivetran] Use Fivetran translator instance in load specs fn and state-backed defs #26133

Merged

maximearmstrong force-pushed the maxime/rework-fivetran-9 branch 2 times, most recently from cec7700 to 4e940a4 Compare November 26, 2024 22:49

maximearmstrong added 16 commits November 26, 2024 18:20

[9/n][dagster-fivetran] Implement sync and poll methods in FivetranCl…

0396877

…ient

Update Fivetran client and tests

7729ab3

Clean code

cf038a2

Add comments in test

ce1c6e6

Assert connector syncable before updating schedule

3057481

Split PRs; Implement base sync method

51a35a3

lint

70f0858

lint

343fe31

Update callable signature type hints

4fb0ae3

Update FivetranConnectorScheduleType to subclass str

027a1c5

Remove MIN_STR_TIME

ad13141

Rename assert_syncable to is_syncable

de46adc

Update description for disable_schedule_on_trigger

77d4395

Add connector_id in comments

66ee705

Make is_syncable a property

72025e8

Revert is_syncable property; rename to validate_syncable

5381ecd

maximearmstrong force-pushed the maxime/rework-fivetran-9 branch from 4e940a4 to 5381ecd Compare November 26, 2024 23:20

maximearmstrong merged commit 686979c into master Nov 27, 2024
1 check passed

maximearmstrong deleted the maxime/rework-fivetran-9 branch November 27, 2024 13:27

maximearmstrong mentioned this pull request Nov 27, 2024

[dagster-fivetran] Implement get_columns_config_for_table in FivetranClient #26181

Merged
