- Bump pipelinewise-tap-github from 1.0.2 to 1.0.3
- Make a postfix for Snowflake schemas in end-to-end tests.
- Bump google-cloud-bigquery from 1.24.0 to 2.31.0
New
- Added cleanup method for state file.
- Bump pytest-cov from 2.12.1 to 3.0.0
- Bump joblib from 1.0.0 to 1.1.0
- Bump flake8 from 3.9.2 to 4.0.1
- Bump jinja2 from 3.0.1 to 3.0.2
- Bump python-dotenv from 0.19.0 to 0.19.1
- Bump target-snowflake from 1.14.0 to 1.14.1
- Bump ansible from 4.4.0 to 4.7.0
- Bump pytest from 6.2.4 to 6.2.5
Changes
- Fully migrate CI to Github Actions.
- Update ujson requirement from ==4.1.* to >=4.1,<4.3
- Update tzlocal requirement from <2.2,>=2.0 to >=2.0,<4.1
Fixes
- Make process in docker-compose file.
- Fix proc.info parsing when cmdline is None.
New
- Add new transformation type: MASK-STRING-SKIP-ENDS
- Bump pipelinewise-target-snowflake from 1.13.1 to 1.14.0
  - Support date property format
  - Don't log record on failure to avoid exposing data
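To illustrate the idea behind the MASK-STRING-SKIP-ENDS transformation type listed above, here is a minimal sketch that masks the middle of a string and leaves a fixed number of characters visible at each end. The function name `mask_string_skip_ends`, the `*` mask character and the rule of masking the whole value when it is too short are assumptions made for illustration; they are not taken from the transform-field implementation.

```python
def mask_string_skip_ends(value: str, skip_ends: int, mask_char: str = "*") -> str:
    """Mask the middle of `value`, keeping `skip_ends` characters at each end.

    Hypothetical helper: values too short to keep both ends visible are
    masked entirely (an assumed rule, not confirmed by the changelog).
    """
    if skip_ends < 0:
        raise ValueError("skip_ends must be non-negative")
    if len(value) <= 2 * skip_ends:
        return mask_char * len(value)
    middle_len = len(value) - 2 * skip_ends
    # Slice from an explicit index so that skip_ends == 0 keeps no tail.
    tail = value[len(value) - skip_ends:]
    return value[:skip_ends] + mask_char * middle_len + tail


if __name__ == "__main__":
    print(mask_string_skip_ends("4111111111111111", 4))
```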
Changes
- Use Makefile for installation
- Enforce PEP8
Fixes
- Dates out of range (with year > 9999) in FastSync from PG.
- Bump pipelinewise-tap-postgres from 1.8.0 to 1.8.1
  - LOG_BASED: Handle dates with year > 9999.
  - INCREMENTAL & FULL_TABLE: Avoid processing timestamps arrays as timestamp
- Decimal not JSON serializable in FastSync MongoDB
- Don't use non-existent FastSync for MongoDB-Redshift pipelines.
- Bump pipelinewise-tap-github from 1.0.1 to 1.0.2
- Update a few vulnerable or outdated dependencies to latest
- Bump pipelinewise-tap-github from 1.0.0 to 1.0.1
- Bump pipelinewise-tap-kafka from 4.0.0 to 4.0.1
- Bump tap-jira from 2.0.0 to 2.0.1
- Bump pipelinewise-target-s3-csv from 1.4.0 to 1.5.0
- Support "none" as a value for --connectors in install.sh script to install a stripped down Pipelinewise without any connectors.
- Optimize Dockerfile
- Do not log invalid json objects if they fail validation against json schema.
- Replace github-tap with fork pipelinewise-tap-github version 1.0.0
- Add schema validation for github tap
- Increase batch_size_rows from 1M to 5M
- Increase split_file_chunk_size_mb from 2500 to 5000
- Add latest tag to docker image
- Bump pipelinewise-tap-s3-csv from 1.2.1 to 1.2.2
- Update pymongo requirement from <3.12,>=3.10 to >=3.10,<3.13
- Bump pipelinewise-target-snowflake from 1.13.0 to 1.13.1
- Fixed an issue with S3 metadata required for decryption not being included in archived load files
- Fixed an issue in fastsync to BigQuery data type mapping
- Add location config parameter to fastsync to BigQuery
- Add split_large_files option to FastSync target-snowflake to load large files in parallel into Snowflake
- Add archive_load_files option to FastSync target-snowflake to archive load files on S3
- Bump pipelinewise-tap-postgres from 1.7.1 to 1.8.0
- Add discovering of partitioned table
- Bump pipelinewise-target-snowflake from 1.12.0 to 1.13.0
  - Add archive_load_files parameter to optionally archive load files on S3
- Add batch_wait_limit_seconds option to every tap/target combination
- Bump pipelinewise-target-snowflake from 1.11.1 to 1.12.0
  - Add batch_wait_limit_seconds option
- Bump pipelinewise-tap-mysql from 1.4.2 to 1.4.3
- Bump a few vulnerable and security outdated packages
- Bump pipelinewise-target-snowflake from 1.11.0 to 1.11.1
- Add transformation validation post import check to detect and deny load time transformations that change data types
- Fixed an issue when fastsync to Postgres and Snowflake were failing if multiple load time transformations were defined on the same column
- Fixed an issue when fastsync was not using unique file names, causing table name collisions in the target database
- Bump pipelinewise-tap-mysql from 1.4.0 to 1.4.2
  - Fixed an issue when data was sometimes lost during LOG_BASED replication
- Bump pipelinewise-tap-twilio from 1.1.1 to 1.1.2
- Fix missing elements for streams without ordered response
- Bump pipelinewise-target-snowflake from 1.10.1 to 1.11.0
- Add support for AWS profile based authentication to FastSync tap-s3-csv.
- Update TransferWise references to Wise
- Bump pipelinewise-tap-twilio to 1.1.1
- Bump psycopg-binary from 2.8.5 to 2.8.6
- Drop postgres replication slot in case of full re-sync of a tap
- Add fastsync_parallelism optional parameter to customize the number of cores to use for parallelisation in FastSync
- Bump pipelinewise-tap-twilio to 1.0.2
- Add tap-twilio
- Patch pipelinewise-tap-snowflake
- Bumping dependencies of Pipelinewise
New
- Support environment variables in tap YAML files, rendered with jinja2 templates.
Fixes
- Bump pipelinewise-target-snowflake to 1.10.1
- Map MySQL's tinyint(1) unsigned column type to targets' number column type
- Bumping dependencies of Pipelinewise
- Detect the copyright year dynamically
- Bumping snowflake-connector-python to 2.3.6 across all components that use it
- Tagging all queries issued to Snowflake by FastSync Snowflake and singer target-snowflake.
- Add ssl support to mongodump in FastSync mongodb.
- Add support for MySQL spatial types.
- Fix issues building PPW docker images
- Update documentation.
- Add tap-mixpanel
- Bump joblib to 0.16.0 to fix some issues when running on python 3.8
- Add --profiler optional parameter to pipelinewise commands
- Use --debug logging in every subprocess
- Fixed an issue when fastsync was not extracting NULL characters correctly from MySQL
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.7.1
  - Parse data from json(b) when converting a row to a record message in log based replication method.
Tap MySQL
- Bump pipelinewise-tap-mysql to 1.3.8
  - Fix mapping bit to boolean values
Tap Slack
- Bump pipelinewise-tap-slack to 1.1.0
  - Extract user profiles from users.list API endpoint
  - Extract message attachments from conversations.history API endpoint
  - Fixed an issue when incremental bookmarks were not sent correctly in the STATE messages
- Exit as failure when another instance of the tap is running or the tap is not enabled
Tap Slack
- Bump pipelinewise-tap-slack to 1.0.1
  - Fixed an issue when thread_ts values were not populated correctly in messages and threads streams
- Add tap-slack
- Add tap-shopify
Tap MongoDB
- Bump pipelinewise-tap-mongodb to 1.2.0
  - Add support for SRV urls
- Fixed an issue when missing empty breadcrumb in tap properties file didn't raise an exception
- Add option to build docker images only with selected tap and target connectors
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.7.0
  - Option to enable SSL mode
  - Fixed an issue when timestamps out of the ISO-8601 range caused some failures
  - Fixed an issue when postgres replication slot name was not generated correctly and contained invalid characters
Target Postgres
- Bump pipelinewise-target-postgres to 2.1.0
  - Option to enable SSL mode
Tap MySQL
- Bump pipelinewise-tap-mysql to 1.3.7
  - Fixed an issue when tap-mysql was logging every extracted record on INFO level
  - Fixed an issue when TIME column types replaced the whole record
Target S3 CSV
- Bump pipelinewise-target-s3-csv to 1.4.0
  - Fixed an issue when target-s3-csv created temp files in system /tmp instead of PPW specific ~/.pipelinewise/tmp
FastSync
- Fixed an issue when MySQL TIME column type mapping was not in sync with target-postgres and target-snowflake TIME type mappings
- Fixed an issue when Postgres TIMESTAMP WITH TIME ZONE columns were not mapped correctly to the UTC equivalent data types in the target
Tap Kafka
- Performance improvements
- Change the syntax of primary_keys option from JSONPath to /slashed/paths ala XPath
- Fixed an issue when tap was not started if stream buffer size is greater than 1G
- Increase max batch_size_rows to 1000k from 500k
- Increase max stream_buffer_size to 2500
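As a rough illustration of the /slashed/paths style mentioned for the primary_keys option, the sketch below resolves such a path against a decoded JSON message. The helper name `get_by_slashed_path` and the sample message are hypothetical, and the lookup rules (plain key-by-key descent, `None` when a key is missing) are assumptions rather than the tap's documented behaviour.

```python
from collections.abc import Mapping
from typing import Any, Optional


def get_by_slashed_path(message: Mapping, path: str) -> Optional[Any]:
    """Resolve a path such as "/payload/user/id" against a nested mapping.

    Hypothetical helper: returns None when any path segment is missing.
    """
    current: Any = message
    for key in path.strip("/").split("/"):
        if not isinstance(current, Mapping) or key not in current:
            return None
        current = current[key]
    return current


if __name__ == "__main__":
    sample = {"id": 7, "payload": {"user": {"id": "u-1"}}}
    print(get_by_slashed_path(sample, "/payload/user/id"))
```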
Tap MySQL
- Fix two issues when a new discovery is done after detecting new changes in binlogs.
- Improve alert messages to include botocore and generic python exception and error patterns in the alerts
Tap S3 CSV, Target Snowflake, Target S3 CSV, Target Redshift
- Add aws_profile option to support Profile based authentication to S3
- Add option to authenticate to S3 using AWS_PROFILE, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN environment variables
Target Snowflake
- Fixed an issue when target-snowflake was failing when QUOTED_IDENTIFIERS_IGNORE_CASE snowflake parameter set to True
- Fixed an issue when new SCHEMA singer message triggered a flush event even if the newly received SCHEMA message is the same as the previous one
- Add s3_endpoint_url option to support non-native S3 accounts
Target S3 CSV
- Add naming_convention option to create custom and dynamically named files on S3
Tap Snowflake
- Eliminate some warning messages of optional pandas and/or pyarrow not installed.
Tap Zendesk
- Fixed an issue when rate_limit, max_workers and batch_size were not configurable via the tap-zendesk YAML file
- Fixed an issue when stop_tap command didn't kill running tap and child processes
Tap MySQL
- Revert pipelinewise-tap-mysql back to 1.3.2
  - 1.3.3 is breaking the replication
- Fixed an issue when stop_tap command doesn't kill child processes, only the parent PPW executable
Tap MongoDB
- Bump pipelinewise-tap-mongodb to 1.1.0
  - Add await_time_ms parameter to control how long the log_based method would wait for new change streams before stopping, default is 1000ms=1s which is the default anyway in the server.
  - Add update_buffer_size parameter to control how many update operation we should keep in the memory before having to make a call to find operation to get the documents from the server. The default value is 1, i.e every detected update will be sent to stdout right away.
Tap MySQL
- Bump pipelinewise-tap-mysql to 1.3.3
  - During LOG_BASED runtime, detect new columns, incl renamed ones, by comparing the columns in the binlog event to the stream schema, and if there are any additional columns, run discovery and send a new SCHEMA message to target. This helps avoid data loss.
Tap Zendesk
- Bump pipelinewise-tap-zendesk to 1.2.1
  - Use start_time query parameter to load satisfaction_ratings stream incrementally
Target Snowflake
- Bump pipelinewise-target-snowflake to 1.7.0
  - Add s3_acl option to support ACL for S3 upload
Target Redshift
- Bump pipelinewise-target-redshift to 1.5.0
  - Add s3_acl option to support ACL for S3 upload
- Add tap-github
- Extract and send known error patterns from logs to alerts
Tap MongoDB
- Bump pipelinewise-tap-mongodb to 1.0.1
  - Fix case where resume tokens are not json serializable by extracting and saving _data only
Tap Zendesk
- Bump pipelinewise-tap-zendesk to 1.2.0
  - Configurable rate_limit, max_workers and batch_size parameters
- Fixed an issue when vault encrypted values were not loaded from config.yml
- Add generic alert sender with Slack and VictorOps integration
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.6.3
  - Fixed a data loss issue when running LOG_BASED the tap not sending new SCHEMA
- Fixed an issue when using FastSync on big MongoDB collections caused memory errors
- Fixed an issue when sync_tables command was not working and failed with exception
- Fixed an issue when custom stream_buffer_size option produced unreadable log files
- Add tap-mongodb with FastSync components to Snowflake and Postgres
- Add tap-google-analytics (as an optional extra connector, with no FastSync)
- Add configurable stream_buffer_size option to use large buffers between taps and targets to avoid taps being blocked by long running targets.
FastSync
- Fixed an issue when some bad but valid MySQL dates are not loaded correctly into Snowflake
Tap MySQL
- Bump pipelinewise-tap-mysql to 1.3.2
  - Fixed some dependency issues and bump pymysql to 0.9.3
  - Full changelog at https://github.com/transferwise/pipelinewise-tap-mysql/blob/master/CHANGELOG.md#132-2020-06-15
Target Snowflake
- Bump pipelinewise-target-snowflake to 1.6.6
  - Fixed an issue when new columns were sometimes not added to target table
  - Fixed an issue when the query runner returned incorrect value when multiple queries were running in one transaction
  - Full changelog at https://github.com/transferwise/pipelinewise-target-snowflake/blob/master/CHANGELOG.md#166-2020-06-26
- Support reserved words as table and column names across every component, including fastsync and singer executables
- Support loading tables with space in the name
- Add tap-zuora
- Switch to psycopg-binary 2.8.5 in every component, including fastsync and singer executables
FastSync
- Fixed an issue when composite primary keys were not created correctly by fastsync
- Create database specific unique replication slot names from tap-postgres
- Fixed an issue when parallel running CREATE SCHEMA IF NOT EXISTS commands caused deadlock in PG
- Support fastsync between tap-mysql, tap-postgres, tap-s3-csv to target-snowflake, target-postgres and target-redshift
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.6.2
  - Fixed issue when JSON type not converted to dictionary
  - Fixed an issue when existing replication slot not found
Tap MySQL
- Bump pipelinewise-tap-mysql to 1.3.0
  - Add optional session_sqls connection parameter
  - Support MySQL JSON column type
Tap Oracle
- Bump pipelinewise-tap-oracle to 1.0.1
  - Fixed an issue when output messages were not compatible with pipelinewise-transform-field component
Target Snowflake
- Bump pipelinewise-target-snowflake to 1.6.4
  - Fix loading tables with space in the name
Target Postgres
- Bump pipelinewise-target-postgres to 2.0.0
  - Implement missing and equivalent features of pipelinewise-target-snowflake
  - Full changelog at https://github.com/transferwise/pipelinewise-target-postgres/blob/master/CHANGELOG.md#200-2020-05-02
Target Redshift
- Bump pipelinewise-target-redshift to 2.0.0
  - Implement missing and equivalent features of pipelinewise-target-snowflake
  - Full changelog at https://github.com/transferwise/pipelinewise-target-redshift/blob/master/CHANGELOG.md#140-2019-05-11
FastSync
- To Snowflake: Support for IAM roles, AWS Session Tokens and to pass credentials as environment variables
Tap Kafka
- Bump pipelinewise-tap-kafka to 3.0.0
  - Add local storage of consumed messages and instant commit kafka offsets
  - Add more configurable options: consumer_timeout_ms, session_timeout_ms, heartbeat_interval_ms, max_poll_interval_ms
  - Add two new fixed output columns: MESSAGE_PARTITION and MESSAGE_OFFSET
Tap Snowflake
- Bump pipelinewise-tap-snowflake to 2.0.0
  - Discover only the required tables to avoid issues when too many tables in the database causing SHOW COLUMNS column to return more than the maximum 10000 rows
Target Snowflake
- Bump pipelinewise-target-snowflake to 1.6.3
  - Generate compressed CSV files by default. Optionally can be disabled by the no_compression config option
- Support tap/target config files with .yaml extension when importing config
- Fixed dependency conflict in install script
- Fixed an issue when add_metadata_columns was not defined in inheritable_config.json
FastSync
- From MySQL: Increased default batch size to 50.000 rows when fastsync exporting data from MySQL tables
- To Snowflake: Log inserts, updates and csv file sizes in the same format to target-snowflake connector
Tap Kafka
- Bump pipelinewise-tap-kafka to 2.1.1
  - Commit offset from the state file and not from the consumed messages
Tap Snowflake
- Bump pipelinewise-tap-snowflake to 1.1.2
  - Fixed some dependency conflicts
Target Snowflake
- Bump pipelinewise-target-snowflake to 1.6.2
  - Log inserts, updates and csv file sizes in a more consumable format
Singer transformation
- Make transformation consistent between FastSync and Singer by updating transform-field to transform without trimming.
tap-snowflake
- Remove PIPELINEWISE.COLUMNS cache table.
FastSync S3-csv to Snowflake
- Fix bug when date_overrides is present.
FastSync and singer target-snowflake
- Remove PIPELINEWISE.COLUMNS cache table.
FastSync Postgres
- Support reserved words as table names.
Install script
- update script to search full name plugins.
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.5.1
  - Support per session wal_sender_timeout
FastSync Postgres & Mysql
- fix "'NoneType' object has no attribute 'upper'" that happens when table has no PK.
- fix "Information schema query returned too much data".
FastSync Postgres
- Handle reserved words in column names in FastSync from PostgreSQL
- Bump ansible to 2.7.16
FastSync MySQL
- Handle reserved words in column names in FastSync from MySQL
- Fixed issue when parallelism and parallelism_max parameters were not used in tap YAML files
Tap Postgres
- Bump tap-postgres to 1.4.1
  - Remove unused timestamps in log
Logging refactoring:
- Structured logs in Pipelinewise, FastSync and majority of plugins.
- Include a logging config file in Pipelinewise repository and package here.
- Ability to provide a custom logging config by setting the env variable LOGGING_CONF_FILE to be the path to the .conf file
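The general pattern behind a LOGGING_CONF_FILE override can be sketched with the Python standard library alone. This is only an illustration of the idea, not Pipelinewise's actual loader: the function name `configure_logging`, the fallback to `basicConfig` and the returned marker string are all assumptions.

```python
import logging
import logging.config
import os


def configure_logging(default_level: int = logging.INFO) -> str:
    """Load a logging config from LOGGING_CONF_FILE if it points to a file.

    Hypothetical helper: returns "file" when the custom config was loaded
    and "default" when falling back to logging.basicConfig.
    """
    conf_path = os.environ.get("LOGGING_CONF_FILE")
    if conf_path and os.path.isfile(conf_path):
        # fileConfig expects the INI-style format described in the stdlib docs.
        logging.config.fileConfig(conf_path, disable_existing_loggers=False)
        return "file"
    logging.basicConfig(level=default_level)
    return "default"


if __name__ == "__main__":
    print(configure_logging())
```

With this shape, running a command as `LOGGING_CONF_FILE=/path/to/logging.conf <command>` would pick up the custom file, and leaving the variable unset keeps the default configuration.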
Tap Jira
- Bump tap-jira to 2.0.0
  - Update key property for stream users
FastSync MySQL
- Fix bug: map BINARY MySQL column to BINARY type in SF
Transform field
- Bump pipelinewise-transform-field to 1.1.2
  - Make validation turned off by default.
- FastSync: Changed the default /tmp folder for snowflake encryption
Target Snowflake
- Bump pipelinewise-target-snowflake to 1.4.1
  - Changed the default /tmp folder for encryption
- FastSync: Support BINARY and VARBINARY column types from MySQL sources
- FastSync: Fixed an issue when MASK-HIDDEN type of transformations were not applied in Snowflake targets
- Write temporary files to ~/.pipelinewise/tmp directory
- Add stop_tap command
- Fixed an issue when post import Primary Keys check was not working correctly
- Fixed an issue when discover_tap command sometimes was failing
Tap MySQL
- Bump pipelinewise-tap-mysql to 1.1.3
  - Support to extract BINARY and VARBINARY column types
  - Improved performance of reading data from MySQL binary log
  - Increase default session wait_timeout to 28800
  - Increase default session innodb_lock_wait_timeout to 3600
Tap S3 CSV
- Bump pipelinewise-tap-s3-csv to 1.0.7
  - Improved column type guesser
Tap Kafka
- Bump pipelinewise-tap-kafka to 2.0.0
  - Revamp output schema, export the consumed JSON messages from Kafka topics to fixed columns
  - Disable data flattening
Target Snowflake
- Bump pipelinewise-target-snowflake to 1.3.0
  - Load binary data into Snowflake BINARY column types
  - Adjust timestamps from taps automatically to the max allowed 9999-12-31 23:59:59 when it's required
  - Add validate_record optional parameter and default to False
  - Add temp_dir optional parameter to overwrite system defaults
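The timestamp adjustment described above can be pictured as a simple clamp on the year. The sketch below works on timestamp strings because Python's own `datetime` cannot represent years above 9999. The helper name `clamp_timestamp` and the string-based approach are assumptions for illustration and do not reproduce the connector's code.

```python
import re

MAX_TIMESTAMP = "9999-12-31 23:59:59"
_YEAR_RE = re.compile(r"^\s*(\d+)-")


def clamp_timestamp(value: str) -> str:
    """Return MAX_TIMESTAMP when the leading year of `value` exceeds 9999.

    Hypothetical helper: any other value is passed through unchanged.
    """
    match = _YEAR_RE.match(value)
    if match and int(match.group(1)) > 9999:
        return MAX_TIMESTAMP
    return value


if __name__ == "__main__":
    print(clamp_timestamp("10000-01-01 00:00:00"))
```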
- FastSync: Add fastsync support from S3-CSV to Snowflake
- Add post import checks to detect tables with no primary key early
- Add optional --connectors to the install script to install taps and targets selectively
Tap Zendesk
- Forked singer connector to pipelinewise-tap-zendesk==1.0.0
- Improved performance by getting data from Zendesk API in parallel
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.3.0
  - Add max_run_seconds configurable option
  - Add break_at_end_lsn configurable option
  - Only send feedback when lsn_comitted has increased
Tap Snowflake
- Bump pipelinewise-tap-snowflake to 1.0.5
  - Bump snowflake-connector-python to 2.0.4
Tap Kafka
- Bump pipelinewise-tap-kafka to 1.0.2
  - Add encoding configurable option
Target Redshift
- Bump pipelinewise-target-redshift to 1.1.0
  - Emit new state message as soon as data flushed to Redshift
  - Add flush_all_streams option
  - Add max_parallelism option
- Save state message as soon as received from a target connector
- Fixed issue when docker executable not started on non bash enabled systems
- Exit gracefully on SIGINT (CTRL+C) and SIGTERM (kill)
- Add tap run summary table when tap run finished
- Add --extra_log optional parameter to run_tap command in CLI
- Add validate command to CLI
- Optimised string formatting
- More accurate logging of number of exported rows in MySQL FastSync
- Fixed an issue when Snowflake cache table was not refreshed after FastSync completed from MySQL to Snowflake
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.2.0
  - Bump to psycopg2 2.8.4 with auto keep-alive feature
  - Remove LOG_BASED stream bookmarks from state if it has been de-selected
  - Convert time with timezone columns to UTC
  - Updating stream to lsn position before sending STATE message
  - Removed database name from stream-id
- FastSync: Convert time with timezone columns to UTC
Target Snowflake
- Bump snowflake-connector-python to 2.0.3
- Bump pipelinewise-target-snowflake to 1.1.6
  - Emit state message as soon as new data flushed and loaded into Snowflake
  - Enforce autocommit and secure connection
  - Optional flush_all_streams option
  - Configurable parallelism option
  - Configurable parallelism_max option
  - Fixed issue when updating bookmarks failed when no STATE message received from tap
- FastSync: Enforce autocommit and secure connection
Target Redshift
- Bump pipelinewise-target-redshift to 1.0.7
- Configurable COPY option
- Configurable parallelism option
- Grant permissions to users and groups individually
- FastSync: Grant permissions to users and groups individually
Target Postgres
- Bump pipelinewise-target-postgres to 1.0.4
- Fixed issue when permission not granted correctly on newly created tables
- Updated Tap Postgres, Tap Redshift pages with new features
- Removed sync_period references
Transform Field
- Bump pipelinewise-transform-field to 1.1.1
- Add MASK-HIDDEN transformation type
Tap S3-CSV
- Bump pipelinewise-tap-s3-csv to 1.0.5
- Add non-AWS S3 support
Tap Postgres
- Bump pipelinewise-tap-postgres to 1.1.6
- FastSync: Fixed issue when 24:00:00 formatted timestamps not loaded from Postgres to Snowflake
Target Redshift
- Bump pipelinewise-target-redshift to 1.0.6
- Fixed issue when AWS credentials sometimes were visible in logs
- Updated Tap S3 CSV pages
- Add contribution page
Tap Postgres
- Bump tap-postgres to 1.1.5
- Lowercase pg_replication slot name
- FastSync: Lowercase pg_replication slot name
Target Redshift
- Bump pipelinewise-target-redshift to 1.0.5
- Set varchar column length dynamically
- FastSync: Set varchar column length dynamically
Tap Oracle
- Add Tap Oracle singer connector
- Add Oracle Instant Client to docker image
- Fixed sample YAML files for multiple connectors
- Fixed typos in multiple pages
- Fixed hard_delete option
- Updated contributors
- Add Tap Oracle
- Build docker image with no pipelinewise user
- Fixed issue when arguments were not passed correctly to docker container
- Initial release