This Python project implements an Extract, Transform, Load (ETL) system designed to cleanse raw Netflix data.
Note: The data used in this project was created for study purposes and is not official Netflix data.
The system retrieves raw Netflix data.
The ETL system cleanses and transforms the extracted data to prepare it for analysis.
Dealing with inconsistencies: Address inconsistencies in data formats, units, or naming conventions.
Create new features from existing data to enhance analysis.
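The two transformation steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the column names (`title`, `duration`, `date_added`), the sample rows, and the accepted date formats are all assumptions made up for the example.

```python
from datetime import datetime

# Hypothetical raw rows; column names and values are illustrative only.
raw_rows = [
    {"title": "  the example show ", "duration": "90 min", "date_added": "2021-03-05"},
    {"title": "Another Title", "duration": "2 h", "date_added": "05/03/2021"},
]

# Assumed input date formats; adjust to whatever the real data contains.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each assumed format and return a date, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def duration_to_minutes(text):
    """Convert strings such as '90 min' or '2 h' to whole minutes."""
    value, unit = text.strip().lower().split()
    return int(value) * 60 if unit.startswith("h") else int(value)


def transform(row):
    """Return a cleaned copy of one row with one derived feature added."""
    added = parse_date(row["date_added"])
    return {
        # Naming conventions: trim whitespace and use title case.
        "title": row["title"].strip().title(),
        # Units: express every duration in minutes.
        "duration_min": duration_to_minutes(row["duration"]),
        # Formats: store every date as ISO 8601 text.
        "date_added": added.isoformat() if added else None,
        # New feature derived from existing data.
        "year_added": added.year if added else None,
    }


clean_rows = [transform(row) for row in raw_rows]
```

Keeping each rule in its own small function makes it easier to test the rules separately and to add new ones as further inconsistencies turn up in the data.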
The transformed data is loaded into an .xlsx (Excel) file.
- Install Python. (This project was made with Python 3.11.6)
- Create a virtual environment to isolate project dependencies (recommended):
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
- Install required dependencies using pip:
pip install -r requirements.txt
- Run the ETL script:
python src/scripts/main.py
Feel free to contribute, report issues, or suggest improvements.