Merge pull request #2 from vishalmhjn/dev
feat: merge major changes

Showing 13 changed files with 895 additions and 73 deletions.

@@ -0,0 +1,40 @@

# Traffic Waves

Traffic-Waves is a voluntary project focused on daily traffic predictions in Paris, using data from [Open Data Paris](https://opendata.paris.fr/explore/dataset/comptages-routiers-permanents/information).

## Overview

The project leverages deep learning techniques to analyze historical traffic data and predict daily traffic patterns in Paris, providing insights for commuters and city planners alike.

### 1. Deployed System on AWS

The system is deployed on AWS, utilizing various services to process and analyze the traffic data efficiently.


1. __AWS EventBridge Trigger for EC2 Instance Start__

   This Lambda function starts an EC2 instance of type t2.large upon receiving an event trigger from AWS EventBridge. It automates the daily start of the instance according to the specified configuration, as sketched below.
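
   A minimal sketch of such a handler, assuming the target instance ID is supplied through a hypothetical `INSTANCE_ID` environment variable:

   ```python
   import os

   import boto3  # AWS SDK for Python

   ec2 = boto3.client("ec2")


   def lambda_handler(event, context):
       # INSTANCE_ID is a hypothetical environment variable holding the
       # ID of the t2.large instance configured for the daily run.
       instance_id = os.environ["INSTANCE_ID"]
       ec2.start_instances(InstanceIds=[instance_id])
       return {"statusCode": 200, "body": f"Started {instance_id}"}
   ```

   The stop Lambda (item 3 below) is symmetric, calling `ec2.stop_instances` with the same instance ID once the daily inference run has finished.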

2. __EC2 Instance Bootstrap with Custom User Data__

   The EC2 instance is bootstrapped with custom user data. Upon boot, it runs a sequence of scripts that set up a Python environment with the necessary dependencies (Python 3, PyTorch, MLflow, and Pandas) and then executes the following scripts in order to initialize the environment and perform further configuration.

   a) __Paris Traffic Data Aggregator__: This Python script aggregates real-time traffic data for specified locations in Paris using the Open Paris API. The aggregated data is merged into a single DataFrame and uploaded to an Amazon S3 bucket as a CSV file for further analysis; a condensed sketch follows.
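
   A condensed sketch of the aggregation step (the endpoint parameters, detector IDs, and bucket name are illustrative, not the project's actual values):

   ```python
   import io

   import boto3
   import pandas as pd
   import requests

   # Illustrative values; the real script's endpoint parameters and bucket differ.
   API_URL = "https://opendata.paris.fr/api/records/1.0/search/"
   BUCKET = "my-traffic-bucket"


   def fetch_location(paris_id: str) -> pd.DataFrame:
       # Query the Open Paris API for one detector's recent counts.
       params = {"dataset": "comptages-routiers-permanents", "q": paris_id, "rows": 100}
       records = requests.get(API_URL, params=params, timeout=30).json()["records"]
       return pd.DataFrame([r["fields"] for r in records])


   def aggregate_and_upload(paris_ids: list[str]) -> None:
       # Merge the per-location frames and upload the result to S3 as a CSV.
       df = pd.concat([fetch_location(pid) for pid in paris_ids], ignore_index=True)
       buf = io.StringIO()
       df.to_csv(buf, index=False)
       boto3.client("s3").put_object(Bucket=BUCKET, Key="raw/traffic.csv", Body=buf.getvalue())
   ```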

   b) __Data Processing and Upload__: This Python script runs on the EC2 instance to process traffic data and upload it to an Amazon S3 bucket. It retrieves data from several CSV files stored in the specified bucket, performs the manipulation and merging steps below (see the sketch after this list), and uploads the processed data back to S3 for further analysis.
   * It merges the static attributes data with the real-time traffic data based on the Paris ID.
   * Timestamps are converted to a numeric time index for temporal analysis.
   * Additional features such as day of the week and hour of the day are extracted from timestamps.
   * Missing values in the real-time traffic data are filled using corresponding values from the historical trends data.
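
   The core manipulations might look roughly like this (the timestamp column name and the shape of the historical-trends table are assumptions; `paris_id`, `q`, `time_idx`, `day`, and `hour` follow the config files in this PR):

   ```python
   import pandas as pd


   def build_features(realtime: pd.DataFrame, static: pd.DataFrame,
                      historical: pd.DataFrame) -> pd.DataFrame:
       # Merge static road attributes onto the real-time counts via the detector ID.
       df = realtime.merge(static, on="paris_id", how="left")
       # Convert timestamps ("t" is an assumed column name) to a numeric hourly index.
       df["t"] = pd.to_datetime(df["t"])
       df["time_idx"] = (df["t"] - df["t"].min()) // pd.Timedelta(hours=1)
       # Calendar features extracted from the timestamp.
       df["day"] = df["t"].dt.dayofweek
       df["hour"] = df["t"].dt.hour
       # Fill gaps in the flow column from historical trends keyed on detector and hour.
       df = df.merge(historical.rename(columns={"q": "q_hist"}),
                     on=["paris_id", "hour"], how="left")
       df["q"] = df["q"].fillna(df["q_hist"])
       return df.drop(columns="q_hist")
   ```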

   c) __Inference Script for Traffic Prediction__: This Python script runs on the EC2 instance to make traffic predictions with a pre-trained machine learning model. It retrieves preprocessed data from an Amazon S3 bucket, loads the pre-trained model using MLflow, performs inference on the test data, and uploads the prediction results back to the S3 bucket; an outline follows.
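
   In outline (the model URI, bucket, and key are placeholders; the real script may load its model differently):

   ```python
   import boto3
   import mlflow
   import pandas as pd

   # Placeholders; the actual URI, bucket, and key differ.
   MODEL_URI = "models:/traffic-model/Production"
   BUCKET, KEY = "my-traffic-bucket", "processed/inference_data.csv"


   def run_inference() -> pd.DataFrame:
       # Fetch the preprocessed features from S3.
       obj = boto3.client("s3").get_object(Bucket=BUCKET, Key=KEY)
       features = pd.read_csv(obj["Body"])
       # Load the model through MLflow's generic pyfunc interface and predict.
       model = mlflow.pyfunc.load_model(MODEL_URI)
       features["prediction"] = model.predict(features)
       return features
   ```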

   d) __Data Aggregation and Normalization Script__: This Python script aggregates, normalizes, and uploads traffic data from several sources to an Amazon S3 bucket. It retrieves real-time traffic data, the previous days' prediction results, and the current day's prediction results from the specified S3 bucket, processes and normalizes the data, and uploads it as a JSON file to another S3 bucket for web visualization; a sketch follows.
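
   A sketch of this final step, assuming min-max normalization per detector (the actual normalization scheme is not specified in this PR):

   ```python
   import json

   import boto3
   import pandas as pd


   def publish_for_web(observed: pd.DataFrame, predicted: pd.DataFrame, bucket: str) -> None:
       # Combine observations and predictions on detector ID and time index.
       merged = observed.merge(predicted, on=["paris_id", "time_idx"], how="outer")
       # Assumed min-max scaling per detector so all plots share a 0-1 range.
       merged["q_norm"] = merged.groupby("paris_id")["q"].transform(
           lambda s: (s - s.min()) / (s.max() - s.min())
       )
       # Serialize to JSON for the static website's D3.js front end.
       payload = merged.to_dict(orient="records")
       boto3.client("s3").put_object(
           Bucket=bucket, Key="data.json", Body=json.dumps(payload, default=str)
       )
   ```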

   e) __Real-time Traffic Visualization Web Application__: An S3 bucket is configured for static website hosting. An HTML file visualizes real-time traffic data using D3.js: the application fetches traffic data from a JSON file stored in the same Amazon S3 bucket, creates dynamic line plots for each detector (Paris ID), and displays them on the web page.
   * You can access the daily batch predictions on the [S3 Website](http://traffiq-paris.s3-website.eu-north-1.amazonaws.com/)

3. __AWS EventBridge Trigger for EC2 Instance Stop__

   This Lambda function stops the running EC2 instance upon receiving an event trigger from AWS EventBridge. It runs after the daily inference task is complete and is automated to avoid unnecessary EC2 costs.

@@ -0,0 +1,42 @@

from flask import Flask, render_template, jsonify

import pandas as pd
from config_interface import *  # provides INFERENCE_INPUT_DATE_FMT and INFERENCE_PREDICTION_DATE_FMT
from frontend import DashboardData

pd.options.mode.chained_assignment = None

# Data sources: yesterday's observations, yesterday's and today's KNN
# predictions, and precomputed 2023 variance data.
dashboard_object = DashboardData(
    path_o_t_1=f"../data/processed_data/inference_data_{INFERENCE_INPUT_DATE_FMT}.csv",
    path_pt_1=f"../predictions/knn_{INFERENCE_INPUT_DATE_FMT}.csv",
    path_pt=f"../predictions/knn_{INFERENCE_PREDICTION_DATE_FMT}.csv",
    path_variance="../data/variance/df_var_2023.csv",
)

dashboard_object.read_data()

app = Flask(__name__)


@app.route("/")
def index():
    return render_template("index.html")


@app.route("/data.json")
def data():
    data_asset = dashboard_object.processing_pipeline()
    return DashboardData.write_to_json("../frontend/data.json", data_asset)


@app.after_request
def add_header(response):
    # Disable caching so the dashboard always reflects the latest predictions.
    response.headers["Cache-Control"] = "no-cache, no-store, must-revalidate"
    response.headers["Pragma"] = "no-cache"
    response.headers["Expires"] = "0"
    return response


if __name__ == "__main__":
    app.run(debug=True)
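
A quick way to exercise the app locally (assuming the file above is saved as `app.py`; the filename is not shown in this diff):

```python
# Start the server first:  python app.py
import requests  # third-party HTTP client

resp = requests.get("http://127.0.0.1:5000/data.json")
resp.raise_for_status()
# The after_request hook should disable caching on every response.
print(resp.headers["Cache-Control"])  # no-cache, no-store, must-revalidate
```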

@@ -0,0 +1,86 @@

from config_data import file_model_train

TRAINING_PARAMS = {
    "metric": "smape",
    "training": True,
    "data_path": file_model_train,
    "model_output_dir": "modeloutput/",
    "seed": 46,
    "test_proportion": 0.15,
    "validation_proportion": 0.15,
    "patience": 25,
    "train_episodes": 1000,
    "batch_size": 512,
}

FORECASTING_PARAMS = {
    "lb": 24,
    "ph": 24,
}

# Month and day are considered static over the forecasting horizon (short-term forecasting).
FEATURE_SETTINGS = {
    "dyn_to_static": ["day", "hour"],
    "include_occupancy": False,
    "dynamic_categorical_features": ["day", "hour"],
    "static_categorical_features": [],
    "dynamic_continous_features": [
        # "speed_kph_mean",
        # "speed_kph_stddev",
        "q",
    ],
    "static_continous_features": [
        "maxspeed",
        "lanes",
        "length",
    ],
    "other_columns": ["time_idx", "paris_id"],
    "occupancy_column": ["k"],
    "target_as_autoregressive_feature": ["q"],
    "target_column": ["qt"],
}

# Flatten the config dictionaries into module-level names.
metric = TRAINING_PARAMS["metric"]
training = TRAINING_PARAMS["training"]
data_path = TRAINING_PARAMS["data_path"]
train_episodes = TRAINING_PARAMS["train_episodes"]

dynamic_continous_features = FEATURE_SETTINGS["dynamic_continous_features"]
dynamic_categorical_features = FEATURE_SETTINGS["dynamic_categorical_features"]
static_continous_features = FEATURE_SETTINGS["static_continous_features"]
static_categorical_features = FEATURE_SETTINGS["static_categorical_features"]

continous_features = [*static_continous_features, *dynamic_continous_features]
categorical_features = [*static_categorical_features, *dynamic_categorical_features]

dyn_to_static = FEATURE_SETTINGS["dyn_to_static"]

occupancy_column = FEATURE_SETTINGS["occupancy_column"]
other_columns = FEATURE_SETTINGS["other_columns"]
target_as_autoregressive_feature = FEATURE_SETTINGS["target_as_autoregressive_feature"]
target_column = FEATURE_SETTINGS["target_column"]

# Append the occupancy column ("k") to the dynamic continuous features when enabled.
if FEATURE_SETTINGS["include_occupancy"]:
    dynamic_continous_features.extend(occupancy_column)

# NOTE: ordering matters for the preprocess-data function:
# categorical features must come first, continuous features second.
dynamic_features = [
    *dynamic_categorical_features,
    *dynamic_continous_features,
    *target_column,
]
static_features = [*static_categorical_features, *static_continous_features]
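
Reading `FORECASTING_PARAMS`, `lb` and `ph` plausibly denote the lookback window and prediction horizon (24 hours each); that interpretation is an assumption, but it implies a sliding-window split along these lines:

```python
import numpy as np

LB, PH = 24, 24  # mirrors FORECASTING_PARAMS["lb"] and ["ph"], interpretation assumed


def make_windows(series: np.ndarray, lb: int = LB, ph: int = PH):
    # For each start offset, take lb steps as input and the next ph steps as target.
    inputs, targets = [], []
    for start in range(len(series) - lb - ph + 1):
        inputs.append(series[start : start + lb])
        targets.append(series[start + lb : start + lb + ph])
    return np.stack(inputs), np.stack(targets)
```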