Pinned
- Realtime_Data_Streaming (Public)
  This project is a detailed guide for building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenAI LLM, Kafka, and Elasticsearch. It explains every stage from data acqu…
  Python
- Reddit_Data_Engineering_project (Public)
  This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and serv…
  Python
- Nashville-Housing-Data (Public)
  This repository contains SQL queries used for data preprocessing and transformation on the Nashville Housing dataset. These queries aim to standardize data formats, populate missing values, break d…
- ELT-Data-Pipeline (Public)
  This project implements a data pipeline using industry-standard tools such as dbt, Snowflake, and Airflow. It facilitates the extraction, loading, and transformation (ELT) process of data, enabling…
- citibike_user_prediction (Public)
  This project aims to predict the user type of a bike-sharing service based on trip duration. Additionally, the project explores whether a customer could be a potential subscriber by analyzing the f…
  Jupyter Notebook
- Movies-Correlation-Using-Python (Public)
  This repository contains Python code for performing data analysis on a movie dataset. The analysis includes visualizations and statistical insights derived from the data.
  Jupyter Notebook