ability to add partitions in Athena to resources #1349
Thanks for the contribution :)
We need:
- Move to adapter (see comment)
- A test for creating a table with partitions
- A test for adding a partition to an existing table
- A small update in the Athena docs about the partitions (see how it is done for BigQuery)
@@ -91,6 +91,7 @@ class TColumnType(TypedDict, total=False):
    data_type: Optional[TDataType]
    precision: Optional[int]
    scale: Optional[int]
    partition: Optional[bool]
Please have a look at how we add partitions in BigQuery with the BigQuery adapter. It's very easy to do, and you can probably copy most of the code from there.
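To make the discussion concrete, here is a minimal sketch of how a boolean `partition` hint on a column could end up in Athena DDL. It is not dlt's implementation and does not use dlt's `TColumnType`: the `ColumnSketch` type, the `build_create_table` helper, the table name, and the S3 path are all hypothetical. It assumes a Hive-style external table stored as Parquet, where partition columns are listed in `PARTITIONED BY` rather than in the regular column list.

```python
from typing import List, Optional, TypedDict


class ColumnSketch(TypedDict, total=False):
    """Hypothetical stand-in for a column schema (not dlt's TColumnType)."""

    name: str
    data_type: Optional[str]
    partition: Optional[bool]


def _format_columns(columns: List[ColumnSketch]) -> str:
    # Render "`name` type" pairs separated by commas.
    return ", ".join(f"`{c['name']}` {c['data_type']}" for c in columns)


def build_create_table(table: str, columns: List[ColumnSketch], location: str) -> str:
    """Build a CREATE EXTERNAL TABLE statement from column hints (sketch only)."""
    # Hive-style tables keep partition columns out of the regular column list.
    regular = [c for c in columns if not c.get("partition")]
    partitions = [c for c in columns if c.get("partition")]

    ddl = f"CREATE EXTERNAL TABLE `{table}` ({_format_columns(regular)})"
    if partitions:
        ddl += f" PARTITIONED BY ({_format_columns(partitions)})"
    ddl += f" STORED AS PARQUET LOCATION '{location}'"
    return ddl


if __name__ == "__main__":
    # Hypothetical table, columns, and bucket, for illustration only.
    example_columns: List[ColumnSketch] = [
        {"name": "id", "data_type": "bigint"},
        {"name": "day", "data_type": "string", "partition": True},
    ]
    print(build_create_table("events", example_columns, "s3://bucket/events/"))
```

Whether the hint lives on the column type, as in the diff, or is set through an adapter, as suggested in this review, the DDL generation step would look broadly similar.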
A few comments from my side, because there is something missing:
Not sure if dlt has integration tests available to run, but you should add integration tests to verify that the result of adding data to a partitioned table is correct. @sh-rp can for sure help you out here.
@nicor88 yes, we have end-to-end tests and we can test the layout. Ideally we could somehow test this via some SQL command. We'll put you as a reviewer here, OK? Also, looking at the documentation: if I use ADD PARTITION and LOCATION, I can have any file layout I want. I just need to provide a file name to LOCATION and that's it, right?
@Vitalii0-o we are moving to #1403
In theory yes, but in practice it's bad practice, because it introduces inconsistency between the S3 layout and the partition definition, and at scale it could slow the engine. Also, feel free to add me anywhere or ping me when necessary :) Happy to help.
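The point about keeping the S3 layout consistent with the partition definition can be sketched as follows. This is only an illustration, not code from the PR: the helper names, table name, and bucket are hypothetical. It derives the `LOCATION` from the partition values using a Hive-style `key=value` prefix, so the two cannot drift apart, and emits the `ALTER TABLE ... ADD PARTITION` statement discussed above.

```python
from typing import Dict


def partition_location(table_root: str, partition: Dict[str, str]) -> str:
    """Return a Hive-style prefix: <root>/<key>=<value>/.../ (sketch only)."""
    path = "/".join(f"{key}={value}" for key, value in partition.items())
    return f"{table_root.rstrip('/')}/{path}/"


def add_partition_sql(table: str, table_root: str, partition: Dict[str, str]) -> str:
    """Build an ADD PARTITION statement whose LOCATION mirrors the partition spec."""
    spec = ", ".join(f"{key} = '{value}'" for key, value in partition.items())
    location = partition_location(table_root, partition)
    return (
        f"ALTER TABLE `{table}` ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical table and bucket, for illustration only.
    print(add_partition_sql("events", "s3://bucket/events/", {"year": "2024", "month": "05"}))
```

A statement generated this way could also serve as the SQL-level check mentioned above: an end-to-end test might add a partition and then query it to confirm that the expected rows are visible.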
dlt version
0.4.10
Describe the problem
In the Athena destination code, there is no way to add partitions to Athena tables.
Expected behavior
It should be possible to define partitions for Athena tables in the schema.
Steps to reproduce
Try to create a partitioned table in Athena.
Operating system
Linux
Runtime environment
Airflow
Python version
3.11
dlt data source
No response
dlt destination
AWS Athena / Glue Catalog
Other deployment details
No response
Additional information
No response