Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create time filtering for bulk actions table #23560

Merged
merged 13 commits into from
Aug 22, 2024

Conversation

jamiedemaria
Copy link
Contributor

@jamiedemaria jamiedemaria commented Aug 9, 2024

Summary & Motivation

Adds ability to filter on create time to the BulkActions table so that we dont need to filter out results manually when serving the RunsFeed. Does this by adding a BulkActionsFilter like RunsFilter so that if we want to add more filtering capabilities we can add them to this object rather than as new parameters on the get_backfills method

companion internal pr https://github.com/dagster-io/internal/pull/11031

How I Tested These Changes

@jamiedemaria jamiedemaria force-pushed the jamie/graphene-mega-runs branch from e9bda94 to 55ab112 Compare August 12, 2024 18:27
@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from 738c84f to da4bd56 Compare August 12, 2024 18:27
Copy link

vercel bot commented Aug 12, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
dagster-docs-next ❌ Failed (Inspect) Aug 12, 2024 9:38pm

@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from da4bd56 to f36fa63 Compare August 12, 2024 21:07
@jamiedemaria jamiedemaria force-pushed the jamie/graphene-mega-runs branch from 1b3f915 to 89a653e Compare August 12, 2024 21:38
@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from f36fa63 to bc64137 Compare August 12, 2024 21:38
@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from bc64137 to 89321ea Compare August 13, 2024 20:55
Base automatically changed from jamie/graphene-mega-runs to master August 14, 2024 13:53
@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from 89321ea to b75f5db Compare August 14, 2024 15:56
@jamiedemaria jamiedemaria marked this pull request as ready for review August 14, 2024 16:26
Copy link

github-actions bot commented Aug 14, 2024

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-64ffca6a5-elementl.vercel.app
https://jamie-bulk-actions-table-filters.core-storybook.dagster-docs.io

Built with commit c2976ee.
This pull request is being automatically deployed with vercel-action

@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from 4143158 to cf67d4a Compare August 15, 2024 18:13
if status:
query = query.where(BulkActionsTable.c.status == status.value)
query = db_select([BulkActionsTable.c.body, BulkActionsTable.c.timestamp])
if status or (filters and filters.status):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunate that we have to do all of this checking around status. I assume we'd have to do a deprecation warning cycle if we wanted to have status filtering be done via the BulkActionsFilter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so? I don't have a great sense for how often people are manually calling this from user code with the status arg.

It's not marked as a @public method on the instance, but it is exposed on the cloud graphql implementation. If you wanted to be super-conservative, you could log some volume metrics in Cloud first before deprecating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, should we take this opportunity to make it a statuses (plural) filter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Will we need them to render the different tabs in the Runs view?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we need the plural statuses, but RunsFilter does statuses and this internal fn also uses statuses so it could be nice to align

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the Runs page supports filtering for multiple statuses, so i'm updating the BulkActions filter to support multiple statuses in this stacked PR #23772

Copy link
Member

@prha prha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but can you split apart the core / graphql changes into separate PRs?

We'd also want to add tests for the storage implementations in our abstract test suite that flexes the backfill filtering. This ensures that our cloud implementations are also under test.

There should be accompanying internal PRs for this, right?

if status:
query = query.where(BulkActionsTable.c.status == status.value)
query = db_select([BulkActionsTable.c.body, BulkActionsTable.c.timestamp])
if status or (filters and filters.status):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so? I don't have a great sense for how often people are manually calling this from user code with the status arg.

It's not marked as a @public method on the instance, but it is exposed on the cloud graphql implementation. If you wanted to be super-conservative, you could log some volume metrics in Cloud first before deprecating.

@jamiedemaria
Copy link
Contributor Author

There should be accompanying internal PRs for this, right?

yeah. i'm finishing it up and will post soon

@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from c2976ee to ecb72d4 Compare August 15, 2024 19:40
@jamiedemaria jamiedemaria changed the title basic create filtering for bulk actions table create time filtering for bulk actions table Aug 15, 2024
Copy link
Member

@prha prha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapping out that getting filter parity with runs is still a big (incomplete endeavor). I don't think any of that should block this PR though.

@@ -38,6 +40,13 @@ def from_graphql_input(graphql_str):
return BulkActionStatus(graphql_str)


@record
class BulkActionsFilter:
status: Optional[BulkActionStatus] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we will need to expose most of the fields that we currently do on the RunsFilter. Support for those fields is probably out of scope for this PR though.

The question about supporting single vs multiple statuses though will probably depend on the mapping of BulkActionStatus to the various types in the Runs view UI status filters.

We should flag that as a thing we need to figure out (status), as well as some of the other filters (e.g. job_name, tags, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - to add more context if someone ends up looking at this PR in the future: the main goal of adding this filter now is to remove this block of code https://github.com/dagster-io/dagster/pull/23560/files#diff-c46424223e58f81b69323dee30256f0a38733eac79198279ebbcf5cec19ee16eL461-L492

I noted these things down for the filtering discussion

@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from 0261690 to fba1cfa Compare August 16, 2024 15:07
@@ -373,6 +373,7 @@ def get_backfills(
status: Optional[BulkActionStatus] = None,
cursor: Optional[str] = None,
limit: Optional[int] = None,
filters: Optional[BulkActionsFilter] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider changing the order of args? And maybe (in a separate PR) switching callers from using status to using filters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i think ideally filters would be before status, but there could be callsites that dont use kwargs right? get_backfills(BulkActionStatus.COMPLETED)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any callsites that don't use kwargs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would potentially worry about user code callsites, but get_backfills is not marked as @public, so I think you could add a changelog note and be fine about changing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll stack a branch to do this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

branch for re-ordering #23773

@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from b484605 to b341c13 Compare August 19, 2024 14:34
if cursor:
cursor_query = db_select([BulkActionsTable.c.id]).where(
BulkActionsTable.c.key == cursor
)
query = query.where(BulkActionsTable.c.id < cursor_query)
if filters and filters.created_after:
query = query.where(BulkActionsTable.c.timestamp > filters.created_after)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prha the time-based filters for the RunsTable replace the timezone filters.created_after.replace(tzinfo=None) but the columns in the RunsTable are DATETIME. In the BulkActionsTable they are TIMESTAMP and i found that the postgres run storage tests failed when i replaced the timezone with None. The tests all pass when i leave the tzinfo as is, but i haven't been able to find a reason why online. Do you forsee any issues with not doing the timezone conversion here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more info -
when we add the backfill to the DB we call datetime_from_timestamp which will return a UTC datetime

timestamp=datetime_from_timestamp(partition_backfill.backfill_timestamp),

In the graphene layer, I'm doing the same conversion so anything coming from the UI will also get converted to UTC and filtering will be correct https://github.com/dagster-io/dagster/pull/23682/files#diff-61fefa33db2a378b1c50f360e3aa830124007f7bc8f4195005286b529b6e60cdR388

But maybe the right thing to do is to have a custom constructor on BulkActionFilters that takes timestamp instead of datetime and does the conversion there so that all of the datetimes on the filter are guranteed to be converted to UTC

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either way works...

@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch 4 times, most recently from 68d0651 to 1e70e2b Compare August 20, 2024 20:57
@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from 1e70e2b to 489a7ae Compare August 21, 2024 14:11
@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch 2 times, most recently from c58431c to d215c69 Compare August 22, 2024 15:21
@jamiedemaria jamiedemaria force-pushed the jamie/bulk-actions-table-filters branch from d215c69 to 42de3cc Compare August 22, 2024 16:06
@jamiedemaria jamiedemaria merged commit 8fd3c25 into master Aug 22, 2024
1 check failed
@jamiedemaria jamiedemaria deleted the jamie/bulk-actions-table-filters branch August 22, 2024 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants