This repository has been archived by the owner on Sep 23, 2024. It is now read-only.

Do not overwrite data with null when updating (#275) #276

Open
wants to merge 3 commits into
base: master
from


Tolsto
Contributor

@Tolsto Tolsto commented May 10, 2022

Problem

For some updates, certain taps emit record messages that contain
only the metadata and primary-key columns, e.g. Postgres with
log-based replication when processing deletes. The current MERGE
logic then sets every column that was not included in the message
to null in the target.

See #275

Proposed changes

Set columns that were not included in the record message to a random sentinel string instead of None. Then, in the MERGE statement, skip any column whose value equals that sentinel, so the target keeps its existing value.
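
A minimal sketch of the idea described above, not the code from this PR. The names `SENTINEL`, `pad_record` and `update_set_clause` are hypothetical, the column names are illustrative, and the `CASE` expression is only one possible way to skip sentinel values in the `UPDATE SET` part of a MERGE:

```python
import uuid

# Assumption: a per-run random marker stands in for "column not in the message".
SENTINEL = f"__absent_{uuid.uuid4().hex}__"


def pad_record(record, columns):
    """Return a row with every column, using SENTINEL for columns the message omitted.

    An explicit None in the record is kept, so a real null still reaches the target.
    """
    return {col: record.get(col, SENTINEL) for col in columns}


def update_set_clause(columns, primary_keys):
    """Build a SET clause that keeps the target value when the staged value is SENTINEL."""
    assignments = [
        f"t.{col} = CASE WHEN s.{col} = '{SENTINEL}' THEN t.{col} ELSE s.{col} END"
        for col in columns
        if col not in primary_keys
    ]
    return ", ".join(assignments)


# Illustrative delete message: only the primary key and a metadata column are present.
columns = ["id", "name", "_sdc_deleted_at"]
row = pad_record({"id": 1, "_sdc_deleted_at": "2022-05-10T00:00:00"}, columns)
clause = update_set_clause(columns, ["id"])
print(row)
print(clause)
```

The point of the sentinel is to distinguish "the tap did not send this column" from "the tap sent null", which a plain `None` default cannot do.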

Types of changes

What types of changes does your code introduce to PipelineWise?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

  • Description above provides context of the change
  • I have added tests that prove my fix is effective or that my feature works
  • Unit tests for changes (not needed for documentation changes)
  • CI checks pass with my changes
  • Bumping version in setup.py is an individual PR and not mixed with feature or bugfix PRs
  • Commit message/PR title starts with [AP-NNNN] (if applicable. AP-NNNN = JIRA ID)
  • Branch name starts with AP-NNN (if applicable. AP-NNN = JIRA ID)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions

Tolsto added a commit to Tolsto/pipelinewise that referenced this pull request May 10, 2022
Currently, deleting a record when using Postgres with log-based replication
will overwrite all non-PK and non-metadata columns in Snowflake with `null`.

This is an integration test for the fix in
transferwise/pipelinewise-target-snowflake#276
Tolsto added 2 commits May 11, 2022 17:42
Record messages for some updates from certain taps, e.g.
Postgres with log-based replication when doing deletes,
will only contain metadata and pkey columns. The current
MERGE logic would then set all non-included columns in the
target to null.
@Tolsto Tolsto force-pushed the fix-partial-updates branch from 3c3f4ae to 962f687 Compare May 11, 2022 15:42
We cannot assume here that the latest version of a record
will always contain all columns. For instance, for delete
records coming from Postgres via log-based replication,
the record message will only contain the primary key columns.
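
To illustrate the concern in this commit message: if a batch is collapsed to one row per primary key by simply keeping the last message, a partial delete record would discard columns seen earlier in the same batch. The sketch below shows one possible alternative that merges messages column by column. `merge_by_primary_key` is a hypothetical helper and is not taken from this PR:

```python
def merge_by_primary_key(records, primary_keys):
    """Collapse a batch to one row per key.

    Later messages override only the columns they actually carry,
    so a partial record does not erase columns seen earlier.
    """
    merged = {}
    for record in records:
        key = tuple(record[pk] for pk in primary_keys)
        merged.setdefault(key, {}).update(record)
    return merged


# Illustrative batch: a full insert followed by a delete that carries only
# the primary key and a metadata column.
batch = [
    {"id": 1, "name": "alice", "_sdc_deleted_at": None},
    {"id": 1, "_sdc_deleted_at": "2022-05-11T15:42:00"},
]
rows = merge_by_primary_key(batch, ["id"])
print(rows[(1,)])
# {'id': 1, 'name': 'alice', '_sdc_deleted_at': '2022-05-11T15:42:00'}
```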
@Tolsto Tolsto force-pushed the fix-partial-updates branch from 2b0ad7b to b4307f1 Compare May 11, 2022 23:36