[9/n][dagster-fivetran] Implement base sync methods in FivetranClient #25911
`FivetranWorkspaceData` to `FivetranConnectorTableProps` method #25797
def update_schedule_type(
    self, connector_id: str, schedule_type: Optional[str] = None
) -> Mapping[str, Any]:
    """Updates the schedule type property of the connector to either "auto" or "manual".

    Args:
        connector_id (str): The Fivetran Connector ID. You can retrieve this value from the
            "Setup" tab of a given connector in the Fivetran UI.
        schedule_type (Optional[str]): Either "auto" (to turn the schedule on) or "manual" (to
            turn it off).

    Returns:
        Dict[str, Any]: Parsed json data representing the API response.
    """
    if schedule_type not in ["auto", "manual"]:
        check.failed(f"schedule_type must be either 'auto' or 'manual': got '{schedule_type}'")
    return self.update_connector(connector_id, properties={"schedule_type": schedule_type})
The `schedule_type` parameter is marked as `Optional[str]`, but the validation logic will raise an error if `None` is passed. Consider either removing the `Optional` type hint or adding a `None` check before the `schedule_type` validation to align the type hints with the actual behavior.
Spotted by Graphite Reviewer
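One way to act on this suggestion, shown as a minimal sketch rather than the change actually made in the PR: drop `Optional` so the annotation matches the validation. `build_schedule_type_properties` and `VALID_SCHEDULE_TYPES` are hypothetical names used only for illustration, and a plain `ValueError` stands in for `check.failed`.

```python
from typing import Any, Mapping

# Hypothetical constant; the PR itself inlines the list ["auto", "manual"].
VALID_SCHEDULE_TYPES = ("auto", "manual")


def build_schedule_type_properties(schedule_type: str) -> Mapping[str, Any]:
    """Validate schedule_type and return the properties payload.

    The parameter is annotated as a plain str (no Optional, no default),
    so the signature no longer advertises None as an accepted value.
    """
    if schedule_type not in VALID_SCHEDULE_TYPES:
        # ValueError stands in for dagster's check.failed in this sketch.
        raise ValueError(
            f"schedule_type must be either 'auto' or 'manual': got '{schedule_type}'"
        )
    return {"schedule_type": schedule_type}
```

With this shape, a type checker flags a `None` argument at the call site, and the runtime check still rejects it if it slips through.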
this is a good one.
    @classmethod
    def has_value(cls, value) -> bool:
        return value in cls._value2member_map_

    @classmethod
    def values(cls) -> Sequence[str]:
        return list(cls._value2member_map_.keys())
highly surprised you need to add these
It is possible to list and use `__contains__` with the symbolic names, but no functions support `_value2member_map_` in Enum. This is to avoid calling `_value2member_map_` in the resource.
I see - so this is entirely due to the difference in casing between key and value? In that case maybe we should just make them match and add a comment
Updated `FivetranConnectorScheduleType` to subclass both `str` and `Enum` in 8e607df. Callsites have been updated too.
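For readers who have not seen the pattern, here is a rough sketch of what subclassing both `str` and `Enum` buys. `ScheduleType` is an illustrative stand-in, not the actual `FivetranConnectorScheduleType` from 8e607df, and the two helper functions are hypothetical; they only show that value lookups work without touching `_value2member_map_`.

```python
from enum import Enum
from typing import List


class ScheduleType(str, Enum):
    """Illustrative stand-in for FivetranConnectorScheduleType."""

    AUTO = "auto"
    MANUAL = "manual"


def is_valid_schedule_type(value: str) -> bool:
    """Return True if value matches one of the enum's values."""
    try:
        ScheduleType(value)  # lookup by value; raises ValueError if unknown
    except ValueError:
        return False
    return True


def schedule_type_values() -> List[str]:
    """List the raw string values using only the public Enum iteration API."""
    return [member.value for member in ScheduleType]
```

Because members are also `str` instances, `ScheduleType.AUTO == "auto"` holds, so callsites can pass members wherever the API payload expects a plain string.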
        )
        self._start_sync(request_fn=request_fn, connector_id=connector_id)

    def _start_sync(self, request_fn: Callable, connector_id: str) -> None:
nit: Can we be more explicit about the callable signature pls
Updated in d341d4b
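The contents of d341d4b are not shown in this thread, so the following is only a sketch of what a more explicit annotation could look like. It assumes `request_fn` is a zero-argument callable that returns the parsed API response; `SyncRequestFn`, `start_sync_sketch`, and `fake_request` are hypothetical names.

```python
from functools import partial
from typing import Any, Callable, Mapping

# Assumed shape: no arguments in, parsed JSON response out.
SyncRequestFn = Callable[[], Mapping[str, Any]]


def start_sync_sketch(request_fn: SyncRequestFn, connector_id: str) -> Mapping[str, Any]:
    """Run the supplied request; the alias documents what callers must pass."""
    # The real method performs connector checks first; they are omitted here.
    return request_fn()


def fake_request(method: str, endpoint: str) -> Mapping[str, Any]:
    """Stand-in for an HTTP call so the sketch runs without a network."""
    return {"method": method, "endpoint": endpoint}


# Bind the arguments up front so the callable matches SyncRequestFn.
force_sync = partial(fake_request, "POST", "connectors/abc/force")
response = start_sync_sketch(request_fn=force_sync, connector_id="abc")
```

A named alias keeps the signature readable at every callsite and lets a type checker reject callables that expect arguments.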
I'm finding this PR hard to review;
- a ton of methods are getting added, but it seems like only the base layer (start_sync, start_resync, poll_sync) are under test, and the assertion is kinda bare. See additional comments regarding testing
- I don't know what the actual public API surface area is here that we're introducing, or what it's going to look like when it's used. For example, all this stuff about the `previous_sync_completed_at`: I was at first very concerned that this was a pretty awkward surface area, but then I inferred (and I could be wrong) that this isn't actually exposed to users. But I did that through context clues.
- In general, I don't really understand what's coming from legacy code and what's new. I think some comments elucidating that could be quite useful.
More broadly, this is likely our last chance to make significant changes to these APIs until 2.0. So I think we should try our best to fix any changes in the APIs now rather than later.
    all_api_mocks.calls.reset()
    client.start_sync(connector_id=connector_id)
    assert len(all_api_mocks.calls) == 3
    assert f"{connector_id}/force" in all_api_mocks.calls[2].request.url

    # resync calls
    all_api_mocks.calls.reset()
    client.start_resync(connector_id=connector_id, resync_parameters=None)
    assert len(all_api_mocks.calls) == 3
    assert f"{connector_id}/resync" in all_api_mocks.calls[2].request.url

    # resync calls with parameters
    all_api_mocks.calls.reset()
    client.start_resync(connector_id=connector_id, resync_parameters={"property1": ["string"]})
    assert len(all_api_mocks.calls) == 3
    assert f"{connector_id}/schemas/tables/resync" in all_api_mocks.calls[2].request.url

    # poll calls
    all_api_mocks.calls.reset()
    client.poll_sync(
        connector_id=connector_id, previous_sync_completed_at=parser.parse(MIN_TIME_STR)
    )
I'm finding these tests to be pretty insufficient, this is OK maybe for the underlying start_xsync fxns, but sync_and_poll, and resync_and_poll have pretty complicated logic that I think is worth testing the state changes on.
I can add tests for `sync_and_poll` and `resync_and_poll` - the initial goal was to test the full behavior with the `FivetranWorkspace.sync_and_poll` method that will be introduced in PR 11 of this stack.
I tend to prefer PRs to be relatively atomic in their additions + tests - so I think yea makes sense to add some tests here as well.
For what it's worth, with only 9/n and 10/n open, there was no way for me to know that context. I think it's useful to try to think about things from that perspective when writing up PR description, adding comments etc.
That's totally fair - I will update the tests here and make sure to apply this feedback in the future as well.
sounds good! Thanks
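As a rough illustration of what "testing the state changes" could mean for a sync-then-poll flow, the sketch below drives a polling loop with a scripted sequence of statuses and asserts on the sequence observed. Everything here is hypothetical: `poll_until_done`, `scripted_statuses`, and the status strings are made up for the example and are not the `FivetranClient` API or Fivetran's actual status values.

```python
from typing import Callable, List

# Hypothetical terminal states for the sketch.
TERMINAL_STATES = ("succeeded", "failed")


def poll_until_done(get_status: Callable[[], str], max_polls: int = 10) -> List[str]:
    """Call get_status until a terminal state appears; return every state seen."""
    seen: List[str] = []
    for _ in range(max_polls):
        status = get_status()
        seen.append(status)
        if status in TERMINAL_STATES:
            return seen
    raise TimeoutError(f"no terminal state after {max_polls} polls")


def scripted_statuses(states: List[str]) -> Callable[[], str]:
    """Build a fake status source that replays the given states in order."""
    iterator = iter(states)
    return lambda: next(iterator)


def test_poll_records_state_changes() -> None:
    get_status = scripted_statuses(["scheduled", "syncing", "succeeded"])
    assert poll_until_done(get_status) == ["scheduled", "syncing", "succeeded"]


def test_poll_times_out_without_terminal_state() -> None:
    get_status = scripted_statuses(["syncing", "syncing", "syncing"])
    try:
        poll_until_done(get_status, max_polls=3)
    except TimeoutError:
        return
    raise AssertionError("expected TimeoutError")
```

Scripting the status source keeps the test deterministic and avoids sleeping, while still exercising both the success path and the timeout path.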
@dpeng817 The focus of this PR is mainly to take what we had in the legacy code, implement it in the new
In most cases, users will call `fivetran_workspace.sync_and_poll()` (PR 11 in this stack), which syncs and polls the connector and materializes the assets.
I can add comments to clarify where changes are made.
The first effort with this stack of PRs is to reproduce the current behavior using the new API pattern. Once we have that, the goal is to work with a design partner that is already using dagster-fivetran, make the transition from the previous pattern to the new one, and then improve the behavior with the design partner.
…6059) ## Summary & Motivation This PR reworks legacy resync method and implements it in the `FivetranClient`: - `start_resync` is added based on legacy `start_resync` - `start_resync` leverages `_start_sync` introduced in #25911 - a [resync in Fivetran](https://fivetran.com/docs/rest-api/api-reference/connectors/resync-connector) is historical data sync - the endpoint and result is different, but logic around how to call and handle a resync is the same as a sync. - a resync can be done with or without resync parameters, using a different endpoint. Tests mock the request API calls and make sure that all calls are made. ## How I Tested These Changes Additional unit tests with BK
…#25911) ## Summary & Motivation This PR reworks legacy sync methods and implements them in the `FivetranClient`: - `update_schedule_type_for_connector` is added based on legacy `update_schedule_type` and `update_connector` - `_start_sync` is based on the legacy `start_sync` and `start_resync` - it avoids code duplication - `start_resync` will be added in a subsequent PR - the order of some steps has been reversed - we verify that a connector is syncable before updating the state of its schedule - `start_sync` is added based on legacy `start_sync` Tests mock the request API calls and make sure that all calls are made. ## How I Tested These Changes Additional unit tests with BK
## Summary & Motivation
This PR reworks legacy sync methods and implements them in the `FivetranClient`:
- `update_schedule_type_for_connector` is added based on legacy `update_schedule_type` and `update_connector`
- `_start_sync` is based on the legacy `start_sync` and `start_resync`
- `start_resync` will be added in a subsequent PR
- `start_sync` is added based on legacy `start_sync`

Tests mock the request API calls and make sure that all calls are made.

## How I Tested These Changes
Additional unit tests with BK
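The "mock the request API calls and make sure that all calls are made" approach can be sketched with the standard library alone. The three-step flow and the endpoint paths below are assumptions made for illustration (the PR's tests only show that three calls happen and that the last one targets `{connector_id}/force`); `start_sync_sketch` is a hypothetical function, not the `FivetranClient` method.

```python
from typing import Any, Callable
from unittest.mock import MagicMock


def start_sync_sketch(make_request: Callable[[str, str], Any], connector_id: str) -> None:
    """Assumed flow: read the connector, update its schedule, then force a sync."""
    make_request("GET", f"connectors/{connector_id}")
    make_request("PATCH", f"connectors/{connector_id}")
    make_request("POST", f"connectors/{connector_id}/force")


# A MagicMock records every call, so the test can assert on count and order.
mock_request = MagicMock()
start_sync_sketch(mock_request, "abc")

assert mock_request.call_count == 3
last_method, last_endpoint = mock_request.call_args_list[2].args
assert last_method == "POST"
assert last_endpoint.endswith("abc/force")
```

Asserting on the recorded calls checks that the right requests are issued in the right order, but says nothing about how responses are handled, which is the gap the review comments point at.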