feat: Filter collections #36
…rue` - should not matter as we define a default value anyway
…her than override
tap_mongodb/connector.py
Outdated
collections = self.database.list_collection_names(
    authorizedCollections=True,
    nameOnly=True,
    filter={"$or": [{"name": c} for c in self._collections]} if self._collections else None,
- filter={"$or": [{"name": c} for c in self._collections]} if self._collections else None,
+ filter={"name": {"$in": self._collections}} if self._collections else None,
This would be ideal, but I get the following error:
pymongo.errors.OperationFailure: can't get regex from filter doc not a regex, full error: {'ok': 0, 'errmsg': "can't get regex from filter doc not a regex", 'code': 8000, 'codeName': 'AtlasError'}
I can't see any indication in the docs that this isn't supported.
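For illustration, the `$or` workaround discussed above can be pulled into a small helper. This is only a sketch: `build_collection_filter` is a hypothetical name that does not appear in the PR, and it assumes the result is passed as the `filter` argument of `list_collection_names`, as in the diff.

```python
from typing import Any, Dict, List, Optional


def build_collection_filter(names: Optional[List[str]]) -> Optional[Dict[str, Any]]:
    """Build a collection-name filter document, or None when nothing is selected.

    Hypothetical helper (not part of the PR). It mirrors the `$or` form from
    the diff, since the `$in` form was reported in this thread to fail with an
    AtlasError.
    """
    if not names:
        # No names configured: return None so every collection is listed.
        return None
    # One exact-match clause per collection name, combined with $or.
    return {"$or": [{"name": name} for name in names]}


# Assumed usage, where `database` is a pymongo Database object:
#   database.list_collection_names(filter=build_collection_filter(["users"]))
```

Keeping the filter construction in one place would make it easy to switch back to `$in` if the backend starts accepting it.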
It seems there's significant bitrot in the CI 😞. If we can fix it I'm happy to merge this PR. We may also need/want to drop support for a few EOL Python versions.
@edgarrmondragon I've got a CI fix branch in the works (trying not to touch too much here)
tap_mongodb/tap.py
Outdated
th.OneOf(
    th.StringType,
    th.ArrayType(th.StringType),
),
Since Meltano doesn't yet support setting union types and that is arguably how most of the time this tap will be used, wdyt of only accepting an array of strings for this?
I don't feel that strongly about the union type - this just started out as a POC for @melgazar9 who wanted to filter one collection at a time, so I thought it made sense to support that pattern as a string also. I think the Meltano setting definition needs to be `kind: array` as you pointed out, but that shouldn't have any bearing here. If you think it's confusing, I'm all for removing it in favour of a better UX/DX. 🙂
I think we should remove it. It adds little benefit and complicates things for automatic settings generation from the tap's `--about` output.
…only type that can support both single and multiple values
Thanks @ReubenFrankel!
Implements collection filtering for improved discover performance. New `filter_collections` setting accepts a string or array of strings to filter collection names by.
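As a rough sketch of how a setting that accepts either a string or an array of strings might be normalized before use: `normalize_filter_collections` is a hypothetical helper, not taken from the tap's code, and it assumes the raw config value arrives as `None`, a `str`, or a list of `str`.

```python
from typing import List, Optional, Union


def normalize_filter_collections(value: Optional[Union[str, List[str]]]) -> List[str]:
    """Coerce a `filter_collections` value into a list of collection names.

    Hypothetical helper for illustration only. A single string becomes a
    one-element list, so downstream code only has to handle lists.
    """
    if value is None:
        # Setting not provided: no filtering requested.
        return []
    if isinstance(value, str):
        # Single collection name given as a plain string.
        return [value]
    # Already a sequence of names: copy it to avoid mutating the config.
    return list(value)
```

With a normalizer like this, the rest of the code can treat the setting as a list whichever form the user supplied.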