- #290: Boolean data-type casting was buggy somewhere between `v1.0.0` and `v1.0.2` and resulted in all non-null strings being given a `True` value. We now handle boolean conversion explicitly by mapping the strings `false` and `true` to Python `False` and `True`, which pandas can actually understand and expose to the database appropriately. NOTE: If the user asks for a column to be cast to boolean but this column contains any string other than the aforementioned ones (and capitalised variants), `sheetwork` will throw a `ColumnNotBooleanCompatibleError` that helps the user locate the offending column as well as the offending value(s).
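
The explicit mapping described in #290 can be sketched roughly as follows. This is a minimal, pure-Python illustration and not sheetwork's actual implementation: the function name `cast_column_to_boolean`, the error message wording, and the stand-in error class are all assumptions made for the example.

```python
from typing import Iterable, List


class ColumnNotBooleanCompatibleError(Exception):
    """Stand-in for sheetwork's error of the same name (assumed shape)."""


# Only these two strings are accepted; matching ignores capitalisation.
BOOLEAN_MAP = {"true": True, "false": False}


def cast_column_to_boolean(column_name: str, values: Iterable[str]) -> List[bool]:
    """Map "true"/"false" strings to Python booleans, or fail loudly."""
    values = list(values)
    # Collect every value that cannot be mapped so the error can list them all.
    offending = [v for v in values if str(v).lower() not in BOOLEAN_MAP]
    if offending:
        raise ColumnNotBooleanCompatibleError(
            f"Column '{column_name}' contains non-boolean value(s): {offending}"
        )
    return [BOOLEAN_MAP[str(v).lower()] for v in values]
```

The key design point is that nothing is cast by truthiness: a string such as `"no"` raises an error instead of silently becoming `True`.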
- #282: Sheetwork now retries obtaining Google Sheets up to 3 times (max duration 10s) if it hits an `APIError` because the end user or service account was rate limited, or because of other common service availability errors. Check the PR to see the exact set of `APIError`s that sheetwork will retry on.
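
The retry behaviour from #282 could look something like the sketch below. It is only an illustration under stated assumptions: "up to 3 times" is read as three attempts in total, the one-second wait is invented, `fetch_with_retry` is a hypothetical helper, and the local `APIError` class stands in for gspread's exception so the example runs without gspread installed. The real code also filters on specific `APIError` variants, which this sketch does not.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class APIError(Exception):
    """Local stand-in for gspread's APIError (assumption for this sketch)."""


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    max_duration: float = 10.0,
    wait: float = 1.0,  # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on APIError within attempt and time budgets."""
    start = time.monotonic()
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except APIError:
            out_of_attempts = attempt == max_attempts
            # Would waiting again push us past the overall time budget?
            out_of_time = (time.monotonic() - start) + wait > max_duration
            if out_of_attempts or out_of_time:
                raise
            sleep(wait)
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test, since a no-op can be injected instead of really waiting.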
- #252: The logger file handler now always prints `DEBUG`-and-up level messages.
- #231: Users can now use their end user/personal account. Previously, it was only possible to authenticate using a service_account. Thanks to recent changes in `gspread` (burnash/gspread#762), the auth flow for differentiating between end user and service account is much simpler, so we ported it here.
  - Internal Note/Curiosity: To make `oauth` work we had to patch `gspread`'s default credential paths (see burnash/gspread#826). Hopefully, this is temporary.
- #253: Success or failure messages for database table creation now fully qualify the table (`<database>.<schema>.<table>`). This makes the messages a lot more usable for a user who might want to copy-paste and check that the table has been correctly created. (Also, some ugly hardcoding in the catalog queries has been squashed.)
- #257: Allows more granular control over table (and schema) creation via the `sheetwork_project.yml` file:
  - `always_create_table` is now the way to ensure that tables always get created. Whether the table is replaced or truncated is governed by the `destructive_create_table` flag. Closes #251.
  - IMPORTANT NOTE: `always_create` is now internally remapped to `always_create_table` and will be deprecated in a future major release.
- #262: The target schema now gets created if `always_create_schema` is `True`. Under the hood: SnowflakeAdaptor checks whether the schema already exists on the database before creating it.
- #265: To override (or set on demand) object creation (tables and schemas), we now provide the following CLI arguments:
  - `create-table`
  - `create-schema`
  - `destructive-create-table`

  These arguments are "companions" which can override the following project configuration arguments:
  - `always_create_table`
  - `always_create_schema`
  - `destructive_create_table`

  IMPORTANT: CLI flags will override whatever is already present in the project config.
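
The precedence rule in #265 (CLI flags win over the project config) can be illustrated with a small sketch. The function `resolve_creation_options` and the convention that an unset CLI flag is `None` are assumptions made for this example, as is the default of `False` for missing config keys. Only the flag and config names come from the entries above.

```python
from typing import Dict, Optional

# CLI argument (as a Python-friendly key) -> project config key it overrides.
CLI_TO_CONFIG = {
    "create_table": "always_create_table",
    "create_schema": "always_create_schema",
    "destructive_create_table": "destructive_create_table",
}


def resolve_creation_options(
    project_config: Dict[str, bool],
    cli_flags: Dict[str, Optional[bool]],
) -> Dict[str, bool]:
    """Merge project config with CLI flags; a CLI flag that is set wins."""
    # Start from the project config, assuming False when a key is absent.
    resolved = {
        config_key: bool(project_config.get(config_key, False))
        for config_key in CLI_TO_CONFIG.values()
    }
    for cli_key, config_key in CLI_TO_CONFIG.items():
        value = cli_flags.get(cli_key)
        if value is not None:  # None means the flag was not passed
            resolved[config_key] = value
    return resolved
```

For instance, a project config with `always_create_table: False` combined with the `create-table` flag on the command line would resolve to table creation being enabled.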
- #229: Uses the new `service_account()` wrapper from `gspread` to authenticate to a service account. This brings some stability with regard to the deprecation of `oauth2` by Google, as the conversion is now handled by `gspread`.
- #150: Columns mentioned in the sheet.yml are now first checked for presence in the sheet, and skipped with a warning if not present.
- #155: Schema specification hierarchy is fixed: Flags > Config > Project.
- #206: Pandas dataframe casting is disabled due to issues with mixed ints and strings (see #205, #204)
- #221: Attempts to reintroduce datatype casting to solve an issue with date conversion (see #216 for the issue). Since the mixed str and int issue is not solved on the pandas side, int conversion doesn't actually happen (for now, Snowflake deals with it fine and converts to the requested int format).
- #151: Raises errors when a sheet contains duplicate columns
- #156: Interactive cleanup is a bit more interactive
- #169: Adds `InitTask` to `sheetwork` to help users set up their projects.
- #195: Sheetwork now checks for available updates on start (provided you have an internet connection)
- #154: Logging to file is always at debug level; logging messages in the CLI look more like pretty prints.
- #161: Simplify `SheetBag` internals: `check_table` is moved to the db adapter
- #163: Fixes broken interactive flow of asking whether to push to db.
- #171: CLI logging/progress messages are now timed
- #173: Sheetwork now uses an adaptor/plugin design to allow and facilitate extensions of the tool to other databases.
- #193: CLI arguments are now POSIX
- #207: A proper sheetwork error is thrown when you do not provide a command to `sheetwork` in the CLI
- #208: Profile error messages are now a bit more helpful and more nicely formatted
- #210: Use and try to fix most warnings from Pylance in an attempt to have more strict typing
- #215: Poetry is now used as the package and dependency manager
- #218: When passing `--log-level debug` in the CLI, the console output is formatted like proper logs instead of pretty prints, which makes the logs easier to follow
There have been releases before. But at the time we were managing things differently. The old changelog can be consulted in _old_changelog.md