Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow to set optional range #23

Merged
merged 15 commits into from
Apr 24, 2024

Conversation

lancondrej
Copy link
Contributor

No description provided.

README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@ReubenFrankel ReubenFrankel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments - we can finalise the README changes last, as I have some questions about behaviour.

tap_google_sheets/tap.py Outdated Show resolved Hide resolved
tap_google_sheets/streams.py Outdated Show resolved Hide resolved
tap_google_sheets/tests/test_first_line_range.py Outdated Show resolved Hide resolved
tap_google_sheets/tap.py Outdated Show resolved Hide resolved
@ReubenFrankel
Copy link
Contributor

Since I don't have edit access, please can you add

        th.Property(
            "range",
            th.StringType,
            description=(
                "Optionally choose a range of data using cell start and end coordinates"
                " - see [A1 notation](https://developers.google.com/sheets/api/guides/concepts#expandable-1)"  # noqa: E501
                " for more information"
            ),
            required=False,
        ),

to tap_google_sheets/tap.py here, and

        - name: range

to meltano.yml here.

@lancondrej
Copy link
Contributor Author

Since I don't have edit access, please can you add

        th.Property(
            "range",
            th.StringType,
            description=(
                "Optionally choose a range of data using cell start and end coordinates"
                " - see [A1 notation](https://developers.google.com/sheets/api/guides/concepts#expandable-1)"  # noqa: E501
                " for more information"
            ),
            required=False,
        ),

to tap_google_sheets/tap.py here, and

        - name: range

to meltano.yml here.

Sure I'm going to add it as well as all other comments. Thanks.

@ReubenFrankel
Copy link
Contributor

ReubenFrankel commented Apr 23, 2024

@lancondrej I appreciate the effort but I'd rather not drop Python 3.7 support just yet for this feature PR in order to get property pattern validation with the newer SDK version.

I think your regex just needed tweaking slightly to require both start and end cell coordinates:

Current

^([A-Za-z]*)(\d*):([A-Za-z]*)(\d*)$

image

Proposed

^([A-Za-z]+|\d+|([A-Za-z]\d+)):([A-Za-z]+|\d+|([A-Za-z]\d+))$

image

@lancondrej
Copy link
Contributor Author

I remake the range to support all valid A1 notation inputs. Use pattern validation for StringType -> this required singer-sdk update -> so stop supporting python 3.7, I also need to update other libs cause without it test starts failing for py 3.10 and more. So I do this update and also setup CI to test all supported python versions.

@ReubenFrankel
Copy link
Contributor

Also for the record, in #23 (comment) I was talking about manually raising a ConfigValidationError (i.e. from singer_sdk.exceptions import ConfigValidationError) - sorry, should have been clearer.

@ReubenFrankel
Copy link
Contributor

I remake the range to support all valid A1 notation inputs. Use pattern validation for StringType -> this required singer-sdk update -> so stop supporting python 3.7, I also need to update other libs cause without it test starts failing for py 3.10 and more. So I do this update and also setup CI to test all supported python versions.

Yeah, I understand why you did it, but I think we should just focus on the feature in this PR and not introduce breaking changes.

@lancondrej
Copy link
Contributor Author

I remake the range to support all valid A1 notation inputs. Use pattern validation for StringType -> this required singer-sdk update -> so stop supporting python 3.7, I also need to update other libs cause without it test starts failing for py 3.10 and more. So I do this update and also setup CI to test all supported python versions.

Yeah, I understand why you did it, but I think we should just focus on the feature in this PR and not introduce breaking changes.

I understand, I just suppose that stopping supporting python3.7 is not such a big deal considering it's almost a year after the end of life. But if you persist on supporting 3.7 I will revert those libs updates and just keep the validation directly here https://github.com/Matatika/tap-google-sheets/pull/23/files#diff-d43a9cad9a8c2d3bd8e4ddc0b914dcd00de39830c6bccbc6bcc10f25a63d509fR173

@lancondrej
Copy link
Contributor Author

Proposed

^([A-Za-z]+|\d+|([A-Za-z]\d+)):([A-Za-z]+|\d+|([A-Za-z]\d+))$

image

I was also trying this one, but unfortunatelly it's not fully true cause e.g. A:5 match but it's not valid, and e.g. AA1:AF20 is valid but not match.

@ReubenFrankel
Copy link
Contributor

I understand, I just suppose that stopping supporting python3.7 is not such a big deal considering it's almost a year after the end of life.

Dropping support for 3.7 is definitely on the to-do list - I just don't think this PR is the place to do it. I need to have a discussion with the team about how best to do that, especially here because this is the default Google Sheets variant for Meltano that I am aware quite a few people use, and I don't want to break their pipelines without a migration plan in place first. 😅

But if you persist on supporting 3.7 I will revert those libs updates and just keep the validation directly here https://github.com/Matatika/tap-google-sheets/pull/23/files#diff-d43a9cad9a8c2d3bd8e4ddc0b914dcd00de39830c6bccbc6bcc10f25a63d509fR173

This sounds good. 👍

@ReubenFrankel
Copy link
Contributor

AA1:AF20 is valid but not match

Ah, I missed some +s

^([A-Za-z]+|\d+|([A-Za-z]+\d+)):([A-Za-z]+|\d+|([A-Za-z]+\d+))$

image

A:5 match but it's not valid

Right, maybe we could perform that validation after the regex has been evaluated? To be honest, I think we will have caught most legitimate config errors with the above, so I'm not that concerned if a user hits a 400 Bad Request for the edge-case ranges like A:5 and 5:A (that should be enough of an indication to prompt them to check their config and realise their range value doesn't make sense). As long as it's not some programmatic exception like IndexError.

@ReubenFrankel
Copy link
Contributor

ReubenFrankel commented Apr 24, 2024

It's not pretty, but I think this regex satisfies all cases FWIW:

^([A-Za-z]+(?!:\d+$)|\d+(?!:[A-Za-z]+$)|[A-Za-z]+\d+):([A-Za-z]|\d+|[A-z]+\d+)$

image

@lancondrej
Copy link
Contributor Author

It's not pretty, but I think this regex satisfies all cases FWIW:

^([A-Za-z]+(?!:\d+$)|\d+(?!:[A-Za-z]+$)|[A-Za-z]+\d+):([A-Za-z]|\d+|[A-z]+\d+)$

image

Yes true, it satisfies almost all we need, I would rather limit the amount of [A-Za-z] to max 3, and number should be max 7 digits. But actually I would prefer to split it to multiple regex as I have it, cause the main purpose of it is that thanks to empty groups () in regexp it helps to parse an input to proper parts (columns and rows).

@lancondrej
Copy link
Contributor Author

I reverted the update and kept just validation in get_first_line_range

Copy link
Contributor

@ReubenFrankel ReubenFrankel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much there now, just some last bits

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
lancondrej and others added 2 commits April 24, 2024 14:14
@ReubenFrankel ReubenFrankel merged commit 55162e5 into Matatika:master Apr 24, 2024
4 checks passed
@ReubenFrankel
Copy link
Contributor

LGTM! Thanks @lancondrej

@lancondrej lancondrej deleted the sheet-range branch April 25, 2024 07:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants