This repository contains the code and instructions for implementing the experiments described in the journal article *A Framework for Measuring and Benchmarking Fairness of Generative Crowd-Flow Models*, published in the *ACM Journal on Computing and Sustainable Societies*.
Read it here.
To set up the environment, install the dependencies with `pip install -r requirements.txt`.
- DeepGravity:
  - `deepgravity_steep5`: DeepGravity with the steepness parameter set to 5.
  - `deepgravity_steep20`: DeepGravity with the steepness parameter set to 20.
- Gravity Model:
  - `gravity_model_steep5`: Gravity model with the steepness parameter set to 5.
  - `gravity_model_steep20`: Gravity model with the steepness parameter set to 20.
- Statsmodels Library: This is used by the gravity models. Minor fixes have been applied to prevent training overflow issues in rare cases.
`$LOCATION` can be either `NY` or `WA`.
- Convert the raw data located in `data/$LOCATION/raw/` using the notebook `data/$LOCATION/raw/0_raw_to_original.ipynb`. This generates the following files in the `data/$LOCATION/` folder:
  - `boundary.geojson`
  - `tessellation.geojson`
  - `flow.csv`
- Add external data to the `data/$LOCATION/` folder:
  - `demographics.csv`: Obtained from the Census Bureau.
  - `features.csv`: Extracted from OpenStreetMap.
- Run the scripts `experiment/1_train_test.py` and `experiment/2_sampling.py`. These scripts process the raw data and write the processed data to the `processed_data/` folder. For example:
  - `processed_data/NY_steep5/`
  - `processed_data/NY_steep20/`
  - `processed_data/WA_steep5/`
  - `processed_data/WA_steep20/`
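Before moving on, it can help to confirm that the files and folders named in the steps above are in place. The following is a minimal sketch, not part of the repository: the function name `missing_paths` is hypothetical, and it assumes that `data/` and `processed_data/` sit under a single root directory.

```python
from pathlib import Path

# File and folder names are taken from the steps above. The layout
# (data/<LOCATION>/ and processed_data/<LOCATION>_steep<N>/ under one root)
# is an assumption, and missing_paths is a hypothetical helper.
LOCATIONS = ("NY", "WA")
STEEPNESS = (5, 20)
DATA_FILES = (
    "boundary.geojson",
    "tessellation.geojson",
    "flow.csv",
    "demographics.csv",
    "features.csv",
)


def missing_paths(root="."):
    """Return the expected files and folders that do not exist under root."""
    root = Path(root)
    expected = [
        root / "data" / loc / name for loc in LOCATIONS for name in DATA_FILES
    ]
    expected += [
        root / "processed_data" / f"{loc}_steep{s}"
        for loc in LOCATIONS
        for s in STEEPNESS
    ]
    return [path for path in expected if not path.exists()]


if __name__ == "__main__":
    for path in missing_paths():
        print(f"missing: {path}")
```

Running the file from the repository root prints one line per missing path and prints nothing when everything is present.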
Run the script `experiment/3_data_integration.py` to integrate data into the `processed_data` folder. This script generates datasets such as `new_york0` to `new_york24` and `washington0` to `washington24`.
Each folder contains datasets sampled with different fairness sampling strategies. Below is a detailed explanation of dataset naming conventions:
Dataset Name | Sampling Method | Random Seed |
---|---|---|
new_york0 / washington0 | Unbiased | NA |
new_york1 / washington1 | Ascending Demographic Disparity Sampling | 1 |
new_york2 / washington2 | Ascending Demographic Disparity Sampling | 2 |
new_york3 / washington3 | Ascending Demographic Disparity Sampling | 3 |
new_york4 / washington4 | Ascending Demographic Disparity Sampling | 4 |
new_york5 / washington5 | Ascending Demographic Disparity Sampling | 5 |
new_york6 / washington6 | Ascending Demographic Disparity No Sampling | NA |
new_york7 / washington7 | Descending Demographic Disparity Sampling | 1 |
new_york8 / washington8 | Descending Demographic Disparity Sampling | 2 |
new_york9 / washington9 | Descending Demographic Disparity Sampling | 3 |
new_york10 / washington10 | Descending Demographic Disparity Sampling | 4 |
new_york11 / washington11 | Descending Demographic Disparity Sampling | 5 |
new_york12 / washington12 | Descending Demographic Disparity No Sampling | NA |
new_york13 / washington13 | Ascending Disparity Sampling | 1 |
new_york14 / washington14 | Ascending Disparity Sampling | 2 |
new_york15 / washington15 | Ascending Disparity Sampling | 3 |
new_york16 / washington16 | Ascending Disparity Sampling | 4 |
new_york17 / washington17 | Ascending Disparity Sampling | 5 |
new_york18 / washington18 | Ascending Disparity No Sampling | NA |
new_york19 / washington19 | Descending Disparity Sampling | 1 |
new_york20 / washington20 | Descending Disparity Sampling | 2 |
new_york21 / washington21 | Descending Disparity Sampling | 3 |
new_york22 / washington22 | Descending Disparity Sampling | 4 |
new_york23 / washington23 | Descending Disparity Sampling | 5 |
new_york24 / washington24 | Descending Disparity No Sampling | NA |
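The numbering in the table is regular: index 0 is the unbiased dataset, and each of the four strategies then takes six consecutive indices (five seeded samples followed by one "No Sampling" variant). The sketch below decodes a dataset name on that basis. The function `describe_dataset` and the constant `STRATEGIES` are hypothetical names introduced for illustration and are not part of the repository.

```python
import re

# Group order and seeds follow the naming table above; the function and
# constant names are hypothetical.
STRATEGIES = (
    "Ascending Demographic Disparity",
    "Descending Demographic Disparity",
    "Ascending Disparity",
    "Descending Disparity",
)


def describe_dataset(name):
    """Map a dataset name such as 'new_york7' to (sampling method, seed)."""
    match = re.fullmatch(r"(new_york|washington)(\d+)", name)
    if match is None or int(match.group(2)) > 24:
        raise ValueError(f"unknown dataset name: {name!r}")
    index = int(match.group(2))
    if index == 0:
        return ("Unbiased", None)
    # Indices 1..24 form four blocks of six: five seeds, then "No Sampling".
    group, offset = divmod(index - 1, 6)
    if offset == 5:
        return (f"{STRATEGIES[group]} No Sampling", None)
    return (f"{STRATEGIES[group]} Sampling", offset + 1)
```

For example, `describe_dataset("washington14")` returns `("Ascending Disparity Sampling", 2)`, matching the corresponding table row; a seed of `None` stands for the table's `NA`.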
- Manually copy the results to the respective `deepgravity_steep{X}/data` folder.
- For Gravity Models, run the script `4_g_run.py` in the `gravity_model_steep{X}` folder.
- For DeepGravity Models, run the script `5_run.py` in the `deepgravity_steep{X}` folder.
- Run `experiment/6_evaluation.py` to evaluate model performance.
- Generate evaluation plots using `experiment/7_evaluation_plot.py`.
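The order of the numbered scripts can be summarized as a small driver. This is only a sketch under stated assumptions: `build_plan` and `run_plan` are hypothetical helpers, the scripts are assumed to take no command-line arguments, the `experiment/` scripts are assumed to run from the repository root, and the model scripts are assumed to run from inside their own folders. The manual copy step is not automated here.

```python
import subprocess
import sys


def build_plan(steepness):
    """Return (working directory, script path) pairs in the documented order."""
    if steepness not in (5, 20):
        raise ValueError("steepness must be 5 or 20")
    return [
        (".", "experiment/1_train_test.py"),
        (".", "experiment/2_sampling.py"),
        (".", "experiment/3_data_integration.py"),
        # The manual copy into deepgravity_steep{X}/data happens before the
        # model runs and is deliberately left to the user.
        (f"gravity_model_steep{steepness}", "4_g_run.py"),
        (f"deepgravity_steep{steepness}", "5_run.py"),
        (".", "experiment/6_evaluation.py"),
        (".", "experiment/7_evaluation_plot.py"),
    ]


def run_plan(steepness, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, script in build_plan(steepness):
        print(f"[{cwd}] python {script}")
        if not dry_run:
            # Assumption: each script runs without arguments.
            subprocess.run([sys.executable, script], cwd=cwd, check=True)


if __name__ == "__main__":
    run_plan(5)  # dry run: lists the steps without executing them
```

The dry-run default keeps the sketch safe to try: it lists the steps so the sequence can be checked against the instructions before anything is executed.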
The final results and evaluation plots are saved in the `evaluation/steep{X}` folder and help benchmark the fairness of the generative crowd-flow models. Use the generated results to analyze disparities and evaluate model effectiveness under the different sampling strategies.