feat: Allow to set optional range #23

Merged on Apr 24, 2024 (15 commits)
5 changes: 4 additions & 1 deletion README.md
@@ -64,6 +64,7 @@ Setting | Required | Type | Description |
 `sheet_id` | Required | String | Your target google sheet id
 `output_name` | Optional | String | Optionally rename the stream and output file or table from the tap
 `child_sheet_name` | Optional | String | Optionally choose a different sheet from your Google Sheet file
+`range` | Optional | String | Optionally choose a range of data from your Google Sheet file (empty means the whole sheet). The range is defined using [A1 notation](https://developers.google.com/sheets/api/guides/concepts#expandable-1). Both the start and end cell/column/row need to be specified. Examples: `B5:G45` (starts at B5 and ends at G45, inclusive), `A:T` (same as `A1:T`, whole columns A to T), `3:5` (rows 3 to 5, all columns), `D3:ZZZ` (starts at D3 and goes to the end of the sheet; ZZZ is the last column in Google Sheets)
 `key_properties` | Optional | Array of Strings | Optionally choose primary key column(s) from your Google Sheet file. Example: `["column_one", "column_two"]`
 `sheets` | Optional | Array of Objects | Optionally provide a list of configs for each sheet/stream. See "Per Sheet Config" below. Overrides the `sheet_id` provided at the root level.

@@ -74,6 +75,7 @@ Setting | Required | Type | Description |
 `sheet_id` | Required | String | Your target google sheet id
 `output_name` | Optional | String | Optionally rename the stream and output file or table from the tap
 `child_sheet_name` | Optional | String | Optionally choose a different sheet from your Google Sheet file
+`range` | Optional | String | Optionally choose a range of data from your Google Sheet file (empty means the whole sheet). The range is defined using [A1 notation](https://developers.google.com/sheets/api/guides/concepts#expandable-1). Both the start and end cell/column/row need to be specified. Examples: `B5:G45` (starts at B5 and ends at G45, inclusive), `A:T` (same as `A1:T`, whole columns A to T), `3:5` (rows 3 to 5, all columns), `D3:ZZZ` (starts at D3 and goes to the end of the sheet; ZZZ is the last column in Google Sheets)
 `key_properties` | Optional | Array of Strings | Optionally choose primary key column(s) from your Google Sheet file. Example: `["column_one", "column_two"]`

### Environment Variable
@@ -85,6 +87,7 @@ These settings expand into environment variables of:
 - `TAP_GOOGLE_SHEETS_SHEET_ID`
 - `TAP_GOOGLE_SHEETS_OUTPUT_NAME`
 - `TAP_GOOGLE_SHEETS_CHILD_SHEET_NAME`
+- `TAP_GOOGLE_SHEETS_RANGE`
 - `TAP_GOOGLE_SHEETS_KEY_PROPERTIES`
 - `TAP_GOOGLE_SHEETS_SHEETS`

@@ -122,7 +125,7 @@ These settings expand into environment variables of:

 ## Roadmap

-- [ ] Add setting to optionally allow the selection of a range of data from a sheet. (Add an optional range setting).
+- [x] Add setting to optionally allow the selection of a range of data from a sheet. (Add an optional range setting).

 - [ ] Improve default behavior of a sheet with multiple columns of the same name and `target-postgres`.

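To illustrate the README change, here is a minimal sketch of a config that uses the new `range` setting, together with the naming pattern the environment variable list follows. The `sheet_id` and `child_sheet_name` values are placeholders, and `env_var_name` is a hypothetical helper written only for this illustration; it is not part of the tap.

```python
import json

# Hypothetical config: the sheet_id and child_sheet_name values are placeholders.
config = {
    "sheet_id": "<your-sheet-id>",
    "child_sheet_name": "Sheet1",
    "range": "B5:G45",  # A1 notation; omit the key to read the whole sheet
    "key_properties": ["column_one"],
}


def env_var_name(setting: str) -> str:
    """Mirror the TAP_GOOGLE_SHEETS_* naming pattern listed in the README."""
    return "TAP_GOOGLE_SHEETS_" + setting.upper()


if __name__ == "__main__":
    print(json.dumps(config, indent=2))
    print(env_var_name("range"))
```

Under this pattern the new setting corresponds to `TAP_GOOGLE_SHEETS_RANGE`, which is the variable the diff adds to the list.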
6 changes: 5 additions & 1 deletion tap_google_sheets/streams.py
@@ -24,7 +24,11 @@ class GoogleSheetsStream(GoogleSheetsBaseStream):
     @property
     def path(self):
         """Set the path for the stream."""
-        return f"/{self.stream_config['sheet_id']}/values/{self.child_sheet_name}"
+        path = f"/{self.stream_config['sheet_id']}/values/{self.child_sheet_name}"
+        sheet_range = self.stream_config.get("range", "")
+        if sheet_range:
+            path += f"!{sheet_range}"
+        return path

     def parse_response(self, response: requests.Response) -> Iterable[dict]:
         """Parse response, build response back up into json, update stream schema."""
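The effect of the `path` change can be seen in isolation with a small standalone sketch. `build_values_path` is a hypothetical free function that mirrors the logic of the property above; the sheet ID and sheet name in the usage lines are placeholders.

```python
def build_values_path(sheet_id: str, child_sheet_name: str, sheet_range: str = "") -> str:
    """Build the values path, appending "!<range>" only when a range is set."""
    path = f"/{sheet_id}/values/{child_sheet_name}"
    if sheet_range:
        path += f"!{sheet_range}"
    return path


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(build_values_path("abc123", "Sheet1"))
    print(build_values_path("abc123", "Sheet1", "B5:G45"))
```

With no range configured the path is unchanged from the previous behavior, so existing configs should keep working.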
19 changes: 18 additions & 1 deletion tap_google_sheets/tap.py
@@ -137,6 +137,22 @@ def get_first_visible_child_sheet_name(self, google_sheet_data: requests.Respons

         return sheet_in_sheet_name

+    @staticmethod
+    def get_first_line_range(stream_config):
+        """Get the range of the first line in the google sheet."""
+        first_line_range = "1:1"
+        range = stream_config.get("range")
+        if range:
+            start_column, start_line, end_column, end_line = re.findall(
+                r"^([A-Za-z]*)(\d*):([A-Za-z]*)(\d*)$", range
+            )[0]
+            start_column = start_column or ""
+            start_line = start_line or "1"
+            end_column = end_column or ""
+
+            first_line_range = start_column + start_line + ":" + end_column + start_line
+        return first_line_range
+
     def get_sheet_data(self, stream_config):
         """Get the data from the selected or first visible sheet in the google sheet."""
         config_stream = GoogleSheetsBaseStream(
@@ -147,7 +163,8 @@ def get_sheet_data(self, stream_config):
             + stream_config["sheet_id"]
             + "/values/"
             + stream_config.get("child_sheet_name", "")
-            + "!1:1",
+            + "!"
+            + self.get_first_line_range(stream_config),
         )

         prepared_request = config_stream.prepare_request(None, None)
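As a side note on `get_first_line_range`: the merged version indexes `re.findall(...)[0]`, which would raise `IndexError` for a range with no colon (for example `B5`). The sketch below is a hypothetical standalone variant, not the code in this PR, that uses `re.fullmatch` and raises a `ValueError` instead. It also avoids shadowing the built-in `range`. It is intended to return the same header range as the merged method for inputs that do match.

```python
import re
from typing import Optional

# Same shape as the PR's pattern: optional column letters and row digits on each side of ":".
_A1_RANGE = re.compile(r"([A-Za-z]*)(\d*):([A-Za-z]*)(\d*)")


def first_line_range(sheet_range: Optional[str]) -> str:
    """Return the A1 range covering only the first row of sheet_range."""
    if not sheet_range:
        return "1:1"  # no range configured: header is row 1 of the whole sheet
    match = _A1_RANGE.fullmatch(sheet_range)
    if match is None:
        raise ValueError(f"Unsupported range: {sheet_range!r}")
    start_column, start_line, end_column, _end_line = match.groups()
    start_line = start_line or "1"  # column-only ranges such as "C:G" start at row 1
    return f"{start_column}{start_line}:{end_column}{start_line}"


if __name__ == "__main__":
    for example in ("", "A6:K38", "C:G"):
        print(repr(example), "->", first_line_range(example))
```

Whether malformed ranges should fail loudly or fall back to `1:1` is a design choice for the maintainers; this variant only shows one option.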
28 changes: 28 additions & 0 deletions tap_google_sheets/tests/test_first_line_range.py
@@ -0,0 +1,28 @@
+import unittest
+
+from tap_google_sheets.tap import TapGoogleSheets
+
+
+class TestFirstLineRange(unittest.TestCase):
+    def test_first_line_range(self):
+        """Test first line range."""
+        test_pairs = [
+            ("", "1:1"),
+            ("1:1", "1:1"),
+            ("A1:G", "A1:G1"),
+            ("A5:G", "A5:G5"),
+            ("A6:GE56", "A6:GE6"),
+            ("A6:K38", "A6:K6"),
+            ("A4:", "A4:4"),
+            ("C:G", "C1:G1"),
+        ]
+        for test_input, expected in test_pairs:
+            stream_config = {"range": test_input}
+            self.assertEqual(
+                expected, TapGoogleSheets.get_first_line_range(stream_config)
+            )
+
+    def test_empty_range(self):
+        """Test empty range."""
+        stream_config = {}
+        self.assertEqual("1:1", TapGoogleSheets.get_first_line_range(stream_config))