[Bug] dbt-redshift microbatch materialization strategy does not use unique_key in config block #927
Open
Labels
feature:microbatch
Issues related to the microbatch incremental strategy
pkg:dbt-redshift
Issue affects dbt-redshift
type:bug
Something isn't working as documented
Is this a new bug?
Which packages are affected?
Current Behavior
Microbatch incremental models built on Redshift do not respect the unique_key provided in the config block. When a microbatch incremental model is run, the unique_key is ignored: it is not reflected in the Data Manipulation Language (DML) that the microbatch incremental strategy generates to identify which records to delete from the existing model before inserting new records. As a result, microbatch incremental models can contain duplicate values of a unique_key.
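To make the failure mode concrete, here is a minimal sketch in Python using the standard-library sqlite3 module instead of Redshift. It assumes, per the description above, that the generated DML deletes only the rows whose event time falls inside the batch window. The table name `target`, the columns `id` and `updated_at`, and the `use_unique_key` switch are all hypothetical, and the SQL is a simplification, not the adapter's actual output.

```python
import sqlite3


def run_batch(conn, batch_start, batch_end, new_rows, use_unique_key):
    """Simulate one microbatch: delete from the target, then insert the batch rows."""
    # Clear the batch's event-time window (the behavior described in this report).
    conn.execute(
        "DELETE FROM target WHERE updated_at >= ? AND updated_at < ?",
        (batch_start, batch_end),
    )
    if use_unique_key:
        # Hypothetical extra step: also delete any existing row whose key
        # reappears in this batch, regardless of its old event time.
        conn.executemany(
            "DELETE FROM target WHERE id = ?", [(row[0],) for row in new_rows]
        )
    conn.executemany("INSERT INTO target (id, updated_at) VALUES (?, ?)", new_rows)


def duplicate_keys(use_unique_key):
    """Return the ids that appear more than once after the latest batch runs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER, updated_at TEXT)")
    # Row 1 was loaded by an earlier batch (2025-01-01).
    conn.execute("INSERT INTO target VALUES (1, '2025-01-01 09:00:00')")
    # Its event time later changes, so it reappears in the 2025-01-02 batch.
    run_batch(
        conn,
        "2025-01-02 00:00:00",
        "2025-01-03 00:00:00",
        [(1, "2025-01-02 10:00:00")],
        use_unique_key,
    )
    rows = conn.execute(
        "SELECT id FROM target GROUP BY id HAVING COUNT(*) > 1"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(duplicate_keys(use_unique_key=False))  # [1]
print(duplicate_keys(use_unique_key=True))   # []
```

With the window-only delete, the old copy of row 1 sits outside the batch window and survives, so the insert creates a duplicate. Deleting by key as well removes it.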
Expected Behavior
Microbatch incremental models built on Redshift respect the unique_key provided in the config block. When a microbatch incremental model is run, the unique_key is reflected in the Data Manipulation Language generated by the microbatch incremental strategy to identify which records to delete from the existing model before inserting new records. As a result, microbatch incremental models cannot contain duplicate values of a unique_key.
Steps To Reproduce
Latest.
3. Define the configuration in the model similar to the below:
{{
    config(
        materialized='incremental',
        incremental_strategy='microbatch',
        unique_key='model_unique_key',
        event_time='model_event_time',
        begin='2025-01-01',
        batch_size='model_batch_size'
    )
}}
model_unique_key.
model_event_time that will have them run in the latest batch. Ideally, your model_event_time is pointed at a timestamp that CAN change, such as record_updated_at.
dbt build --select your_model
model_unique_key will fail.
Relevant log output
Environment
Additional Context
Expected Adapter-specific Behavior based on Documentation
The Adapter-specific Behavior section in the incremental microbatch documentation specifies that:
Which implies to a user that the microbatch materialization strategy for Redshift will mirror the expected behavior of the delete+insert materialization strategy.
The Incremental materialization strategies section in the Redshift configurations documentation specifies that:
Which implies to a user that delete+insert relies on a unique_key and will respect this unique_key.
Issue in Source Code for dbt-redshift adapter
The issue is found in the incremental_merge.sql file in the dbt-adapters repository. The specific section to reference begins on Line 57.
It appears that handling for the unique_key config was intentionally removed from this microbatch strategy. I don't think this is correct: unique_key should be an optional config value that can be provided to microbatch incremental models that are run in Redshift.
Furthermore, I'll note that the function cannot just call out to the default implementation of delete+insert. Calling out to the default__get_delete_insert_merge_sql macro would generate Data Manipulation Language that uses AND conditions in the WHERE clause of the DELETE statement. In my testing, this can cause issues with microbatch models that use this strategy. Instead, I had to roll a custom microbatch incremental strategy, which is as follows: