MauricioDolacio/netflix-data-etl

Netflix Data ETL

This Python project implements an Extract, Transform, Load (ETL) system designed to cleanse raw Netflix data.
Note: The data used in this project was created for study purposes and is not official Netflix data.

Built with Python and pandas.

Features

Extract

The system retrieves raw Netflix data.

Transform

The system cleans and transforms the extracted data to prepare it for analysis:

  • Resolve inconsistencies in data formats, units, and naming conventions.
  • Create new features from existing data to enhance analysis.
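
The cleaning steps above could look roughly like the pandas sketch below. It is an illustration only, not the project's actual code: the column names, the sample rows, and the `transform` function are all hypothetical, since the README does not describe the dataset's schema.

```python
import pandas as pd

# Hypothetical raw rows; column names and values are invented for illustration.
raw = pd.DataFrame(
    {
        "Title ": ["  stranger things", "THE CROWN", "Bird Box"],
        "Type": ["tv show", "TV Show", "movie"],
        "Date Added": ["2019-07-04", "2020-11-15", "2018-12-21"],
        "Duration": ["3 Seasons", "4 Seasons", "124 min"],
    }
)


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of df (assumed schema, see note above)."""
    out = df.copy()
    # Naming conventions: snake_case column names without stray whitespace.
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_")
    # Format inconsistencies: trim whitespace and normalize casing.
    out["title"] = out["title"].str.strip().str.title()
    out["type"] = out["type"].str.strip().str.lower()
    # Parse date strings into a real datetime column.
    out["date_added"] = pd.to_datetime(out["date_added"], format="%Y-%m-%d")
    # New features derived from existing columns.
    out["year_added"] = out["date_added"].dt.year
    out["duration_value"] = (
        out["duration"].str.extract(r"(\d+)", expand=False).astype(int)
    )
    out["duration_unit"] = (
        out["duration"].str.extract(r"\d+\s*(\w+)", expand=False).str.lower()
    )
    return out


clean = transform(raw)
print(clean[["title", "type", "year_added", "duration_value", "duration_unit"]])
```

Working on a copy leaves the raw frame untouched, which makes it easier to compare the data before and after cleaning.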

Load

The transformed data is written to an Excel (.xlsx) file.
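
As a minimal sketch of this step, assuming the output is an Excel workbook written with pandas and that the `openpyxl` package is installed: the `load_to_excel` helper, the sheet name, and the output path are hypothetical and not taken from the project.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_to_excel(df: pd.DataFrame, path: Path, sheet_name: str = "netflix") -> Path:
    """Write df to an .xlsx workbook and return the path (hypothetical helper)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # index=False keeps the pandas row index out of the sheet.
    df.to_excel(path, sheet_name=sheet_name, index=False)
    return path


# Hypothetical cleaned data standing in for the output of the transform step.
clean = pd.DataFrame({"title": ["Bird Box", "The Crown"], "year_added": [2018, 2020]})

with tempfile.TemporaryDirectory() as tmp:
    written = load_to_excel(clean, Path(tmp) / "output" / "netflix_clean.xlsx")
    # Read the file back to confirm the data survived the round trip.
    round_trip = pd.read_excel(written, sheet_name="netflix")

print(round_trip)
```

The example writes to a temporary directory so it leaves no files behind; a real run would target a path inside the project.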

Installation

  • Install Python (this project was developed with Python 3.11.6).
  • Create and activate a virtual environment to isolate project dependencies (recommended):
  python -m venv venv
  source venv/bin/activate  # Windows: venv\Scripts\activate
  • Install the required dependencies using pip:
  pip install -r requirements.txt

Running

  python src/scripts/main.py

Contributing

Feel free to contribute, report issues, or suggest improvements.
