Example scripts for working with AWS S3 and Athena
pip install -r requirements.txt
AWS_ACCESS: AWS_ACCESS_KEY
AWS_SECRET: AWS_SECRET_KEY
BUCKET_NAME: NAME_THE_BUCKET_UNIQUELY
python setup.py
python query_csv.py
python teardown.py
- Athena does not support multiple file types within a directory
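- Multi-line JSON files are not supported; each JSON record must sit on a single line
- AWS Glue has an automatic schema discovery feature to help with table definitions
- Query results are written to S3, but queries run asynchronously, so results may not be available immediately after a query is submitted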
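
Because results can lag behind the query submission, a script usually has to wait for the query to finish before reading the output from S3. The following is a minimal sketch of that polling step, not necessarily how the scripts in this repository do it. The helper name `wait_for_query` and its `poll_seconds` and `max_polls` parameters are hypothetical; the only Athena call it relies on is boto3's `get_query_execution`.

```python
import time

# States after which Athena will not change the query status again.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(client, query_execution_id, poll_seconds=1.0, max_polls=60):
    """Poll get_query_execution until the query reaches a terminal state.

    `client` is assumed to behave like boto3.client("athena").
    Returns a (state, output_location) tuple; output_location is the S3 path
    of the result file, or None if the response does not include it.
    Raises TimeoutError if the query is still running after max_polls checks.
    """
    for _ in range(max_polls):
        response = client.get_query_execution(QueryExecutionId=query_execution_id)
        execution = response["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            location = execution.get("ResultConfiguration", {}).get("OutputLocation")
            return state, location
        # Still QUEUED or RUNNING: wait before checking again.
        time.sleep(poll_seconds)
    raise TimeoutError(
        f"Query {query_execution_id} still running after {max_polls} polls"
    )
```

In use, the `QueryExecutionId` returned by `client.start_query_execution(...)` would be passed to `wait_for_query`, and the result file read from the returned S3 location only when the state is `SUCCEEDED`.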
- Generating Mock Data - Mockaroo
- Boto3 Documentation
- AWS Big Data Blog: Analyzing Data in S3 using Amazon Athena
- Medium Article: Automating AWS Athena batch jobs with Python 3
- Break up the scripts to use separate keys, in adherence to the principle of least privilege
- Build Nested JSON example - AWS Blog
- AWS Docs - Regex Example