-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request - support for column encryption with pgcrypto
module
#181
Comments
Curious one for sure. Possible of course but not sure the best implementation method. My first idea hops to stream maps but that doesnt' work here as stream maps are in memory transforms. It's like instead of stream maps, you want SQL column maps. Or something. The idea being you could modify the SQL used to populate the TEMP table before getting upsertted. Maybe the config would look something like sql_column_transform:
- "sink_name":
"column_name": pgp_sym_encrypt(column_name, encryption_key) This would theoretically allow us to let this work with a number of other use cases that I can't imagine right now but also seem a bit overkill. Easiest thing to do today with a postgres feature is probably to encrypt using a stream map with python which bypasses the need to develop this but I don't know how important using postgres native functionality is to you and others! |
This is basically what we have today "out of the box" - the issue is that it's "easy" to decrypt with Python in a stream map, but it's not possible to decrypt these fields after they've been written to the database. I think if we were hosting Posgtgres ourselves and had full access to create new Python UDFs, that it would be possible, but AWS RDS does not allow that access. |
You can use any encryption mechanism you'd like right so some of them you can decrypt like AES or something and you could definitely decrypt it. When you pull the data out of the postgres db do you need to be able to decrypt it right in the DB? Or can you consume it with python again to decrypt it? https://stackoverflow.com/questions/56311405/aws-rds-postgres-crypto-functions-doesnt-work-even-with-the-pgcrypto-extension is interesting |
We are using
target-postgres
with Postgres 13 on AWS RDS Aurora.A lot of the data that we are moving from upstream sources into Postgres has been encrypted in the source system, and can't be decrypted within Postgres after it is loaded.
I have written a custom mapper/transform plugin using the SDK that can decrypt all of these records during
meltano run
, but I don't want to write them to the database as plain text - there is sensitive data in there (PII) and I would like to keep this data encrypted at rest.What I'm hoping to achieve is something like this: for a given target-postgres plugin instance, configure it with a column name and an encryption key, and have the tap use the
pgp_sym_encrypt
function of thepgcrypto
module.I'm happy to try to contribute this feature myself but I wanted to open an issue first and see if there is a best way to make this change.
The text was updated successfully, but these errors were encountered: