Evaluate the effectiveness of a water consumption campaign across 11 districts
- A campaign is running to reduce water consumption in District 1.
- The city has 11 districts (the campaign runs only in District 1).
- Let's see how effective this campaign is.
- Each row is one day's water consumption (in liters). There are 364 rows in total (364 consecutive days, starting from 01-January).
- The columns are the 11 districts.
- The campaign ran only in District 1 for the last 28 days (02-Dec to 30-Dec)
- Data samples are shown below
Day | District_1 | District_2 | District_3 | District_4 | District_5 | District_6 | District_7 | District_8 | District_9 | District_10 | District_11 |
---|---|---|---|---|---|---|---|---|---|---|---|
01-01 | 30000.00 | 25188.97 | 28538.15 | 31483.59 | 30486.67 | 30892.30 | 30613.86 | 27324.14 | 25658.25 | 28994.79 | 27645.36 |
02-01 | 31859.96 | 32538.84 | 38301.84 | 28500.64 | 33390.60 | 30254.08 | 24096.06 | 30740.24 | 28504.75 | 32948.60 | 32895.67 |
03-01 | 31516.08 | 36534.43 | 24865.96 | 37001.22 | 30877.25 | 26671.67 | 23436.28 | 30992.98 | 27555.64 | 30934.14 | 31562.64 |
04-01 | 28790.81 | 19551.51 | 32441.73 | 35832.19 | 40637.68 | 35049.81 | 32555.86 | 28242.00 | 27142.70 | 31642.02 | 27085.91 |
05-01 | 27434.27 | 33289.90 | 30563.99 | 36903.76 | 36365.24 | 27596.44 | 19360.99 | 28404.61 | 33131.36 | 29676.91 | 23879.14 |
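
The campaign window described above can be encoded as a 0/1 indicator column, which is what a "treatment" regressor usually is. The sketch below is a minimal illustration, assuming the treatment variable is simply 1 on the last 28 of the 364 rows and 0 elsewhere; the function name `treatment_indicator` is hypothetical and not taken from the repository.

```python
from typing import List


def treatment_indicator(n_days: int = 364, campaign_days: int = 28) -> List[int]:
    """Return a 0/1 list that is 1 only on the last `campaign_days` rows.

    Assumption: the campaign covers the final rows of the dataset, as the
    description above states (last 28 of 364 days).
    """
    if not 0 <= campaign_days <= n_days:
        raise ValueError("campaign_days must be between 0 and n_days")
    return [0] * (n_days - campaign_days) + [1] * campaign_days


treatment = treatment_indicator()
print(len(treatment), sum(treatment))  # number of rows, number of campaign days
```

The coefficient fitted on such a column is then the average shift in daily consumption during the campaign, holding the other regressors fixed.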
- Evaluate whether the campaign has a significant impact on water consumption in District 1, and if so, how large it is.
- H0: The campaign had no impact on District 1 (the treatment coefficient in the linear model of consumption is 0)
- H1: The campaign reduced water consumption in District 1 (the treatment coefficient in the linear model is significantly negative)
- The test is one-tailed
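
Regression software typically reports two-sided p-values, so a one-tailed test needs a conversion step. The sketch below shows the standard conversion for a symmetric test statistic such as the t-statistic; the function name `one_tailed_p` is hypothetical, and the inputs are the two-sided p-value and coefficient that appear in the regression output further down.

```python
def one_tailed_p(two_sided_p: float, coef: float, alternative: str = "less") -> float:
    """Convert a two-sided p-value into a one-sided one.

    Valid for symmetric null distributions (e.g. Student's t).
    alternative="less" tests H1: coefficient < 0,
    alternative="greater" tests H1: coefficient > 0.
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = two_sided_p / 2.0
    # If the estimate points in the direction of H1, the one-sided p-value
    # is half the two-sided one; otherwise it is the complement.
    sign_matches = coef < 0 if alternative == "less" else coef > 0
    return half if sign_matches else 1.0 - half


# Two-sided p-value and coefficient of the treatment term from the output below.
print(round(one_tailed_p(0.018, -1192.1777), 3))
```

Comparing the two-sided p-value against 0.05 / 2, as done later in this document, is a stricter criterion than comparing the one-sided p-value against 0.05.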
preproc.py is used to plot the data, perform EDA, and decompose the time series into cyclic seasonality features
- There may be a shift in the consumption distribution after the campaign launched in District 1
- However, we cannot yet tell whether the change is due to the campaign, a seasonal effect, or just random variation
- Weekly cyclic features, since water-use habits can repeat on the same weekday
- Yearly cyclic features, since water use can vary with the season (e.g., more water is used in summer than in winter)
- The Fourier series is expanded up to order 3 (the `_1n`, `_2n`, and `_3n` terms in the regression output)
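
The cyclic features described above can be sketched as sine/cosine pairs at the first three harmonics of each cycle. This is an illustration only, not the contents of `preproc.py`: the periods (7 days and 365.25 days), the zero-based day index, and the function name `fourier_features` are assumptions; only the feature names are taken from the regression output.

```python
import math
from typing import Dict


def fourier_features(day_index: int, order: int = 3) -> Dict[str, float]:
    """Build weekly and yearly Fourier features for one day.

    day_index: days elapsed since the first row (0 for 01-January).
    order:     number of harmonics per cycle (3 in this analysis).
    """
    # Assumed periods, in days, for the two seasonal cycles.
    periods = (("week", 7.0), ("year", 365.25))
    feats: Dict[str, float] = {}
    for name, period in periods:
        for n in range(1, order + 1):
            angle = 2.0 * math.pi * n * day_index / period
            feats[f"sin_{name}_{n}n"] = math.sin(angle)
            feats[f"cos_{name}_{n}n"] = math.cos(angle)
    return feats


# 2 cycles x 3 harmonics x (sin, cos) = 12 seasonal regressors per day.
print(sorted(fourier_features(0)))
```

Higher orders let the model follow sharper seasonal shapes at the cost of more coefficients to estimate.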
OLS Regression Results
==============================================================================
Dep. Variable: District_1 R-squared: 0.936
Model: OLS Adj. R-squared: 0.932
Method: Least Squares F-statistic: 216.1
Date: Sun, 24 Oct 2021 Prob (F-statistic): 1.83e-187
Time: 10:04:31 Log-Likelihood: -3102.9
No. Observations: 364 AIC: 6254.
Df Residuals: 340 BIC: 6347.
Df Model: 23
Covariance Type: nonrobust
=======================================================================================
coef std err t P>|t| [0.025 0.975]
---------------------------------------------------------------------------------------
Intercept 6353.9583 1368.048 4.645 0.000 3663.054 9044.862
sin_week_1n 1326.6134 121.505 10.918 0.000 1087.617 1565.610
cos_week_1n 1052.4853 113.875 9.242 0.000 828.498 1276.473
sin_week_2n -338.5787 97.675 -3.466 0.001 -530.703 -146.455
cos_week_2n 736.4401 103.287 7.130 0.000 533.278 939.602
sin_week_3n -385.2357 98.001 -3.931 0.000 -578.001 -192.471
cos_week_3n -172.2709 95.951 -1.795 0.073 -361.003 16.461
sin_year_1n 240.9986 107.444 2.243 0.026 29.659 452.338
cos_year_1n -1610.5931 188.649 -8.538 0.000 -1981.659 -1239.527
sin_year_2n 376.8972 123.336 3.056 0.002 134.300 619.495
cos_year_2n 605.0006 143.525 4.215 0.000 322.692 887.309
sin_year_3n -61.6749 124.646 -0.495 0.621 -306.850 183.500
cos_year_3n -457.5879 128.098 -3.572 0.000 -709.551 -205.624
District_2 0.0176 0.014 1.262 0.208 -0.010 0.045
District_3 0.0243 0.014 1.793 0.074 -0.002 0.051
District_4 0.0100 0.014 0.690 0.491 -0.018 0.038
District_5 0.0318 0.015 2.124 0.034 0.002 0.061
District_6 0.0666 0.015 4.544 0.000 0.038 0.095
District_7 0.0483 0.015 3.307 0.001 0.020 0.077
District_8 0.0548 0.017 3.226 0.001 0.021 0.088
District_9 0.1075 0.018 5.863 0.000 0.071 0.144
District_10 0.1373 0.022 6.180 0.000 0.094 0.181
District_11 0.3010 0.026 11.588 0.000 0.250 0.352
Treament_District_1 -1192.1777 503.144 -2.369 0.018 -2181.845 -202.510
==============================================================================
Omnibus: 11.642 Durbin-Watson: 1.427
Prob(Omnibus): 0.003 Jarque-Bera (JB): 17.340
Skew: -0.233 Prob(JB): 0.000172
Kurtosis: 3.962 Cond. No. 2.06e+06
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.06e+06. This might indicate that there are
strong multicollinearity or other numerical problems.
- The regression output shows that the treatment coefficient for District 1 (-1192.1777) has a two-sided p-value of 0.018 (the p-value summary table further below lists 0.005), which is below 0.05 / 2 = 0.025
- This means we can reject the null hypothesis: the treatment coefficient is negative and significantly different from 0
- Daily water consumption in District 1 is predicted to be lower by about 1192 litres (compared with a usual day) because of the campaign
- Also, the p-values of `cos_week_3n`, `sin_year_3n`, and the consumption of Districts 2 to 4 are > 0.05, so these coefficients are not statistically significantly different from 0 and the variables can be removed from the model
- The consumption of Districts 5 to 11 significantly impacts the consumption in District 1
- Adjusted R squared = 0.932
- RMSE = 2020.5
- MAE = 1723.7
- MAPE = 0.056
- So we conclude that the campaign in District 1 has a substantial impact on consumption in that district. Based on the treatment coefficient, the campaign reduces daily water consumption by about 1192 litres, on average
- But does the campaign in District 1 also impact other districts?
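
The fit metrics quoted above (RMSE, MAE, MAPE) follow their standard definitions, which can be sketched as below. The function name `regression_metrics` and the toy numbers are hypothetical and are not the project's data; MAPE is returned as a fraction, matching the 0.056 format used above.

```python
import math
from typing import Dict, Sequence


def regression_metrics(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute RMSE, MAE, and MAPE (as a fraction) for paired values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides each absolute error by the actual value, so actuals must be non-zero.
    mape = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"rmse": rmse, "mae": mae, "mape": mape}


# Toy daily consumption values in litres (illustrative only).
metrics = regression_metrics([30000.0, 32000.0, 28000.0], [29000.0, 33000.0, 30000.0])
print({name: round(value, 3) for name, value in metrics.items()})
```

RMSE penalizes large misses more heavily than MAE, so a gap between the two hints at a few days with unusually large errors.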
Evaluate whether the campaign in District 1 also impacts other districts
We now refit the regression with each of Districts 2 to 11 in turn as the dependent variable.
The p-values of the coefficients in each district's regression are summarized below:
Output | Coeff. Treatment District 1 | Coeff. District 1 | Coeff. District 2 | Coeff. District 3 | Coeff. District 4 | Coeff. District 5 | Coeff. District 6 | Coeff. District 7 | Coeff. District 8 | Coeff. District 9 | Coeff. District 10 | Coeff. District 11 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
District 1 | 0.005 | - | 0.516 | 0.238 | 0.260 | 0.016 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
District 2 | 0.252 | 0.516 | - | 0.374 | 0.860 | 0.463 | 0.403 | 0.448 | 0.562 | 0.046 | 0.138 | 0.961 |
District 3 | 0.909 | 0.238 | 0.374 | - | 0.914 | 0.430 | 0.772 | 0.985 | 0.154 | 0.328 | 0.977 | 0.748 |
District 4 | 0.335 | 0.260 | 0.860 | 0.914 | - | 0.753 | 0.617 | 0.831 | 0.783 | 0.215 | 0.311 | 0.241 |
District 5 | 0.568 | 0.016 | 0.463 | 0.430 | 0.753 | - | 0.490 | 0.553 | 0.863 | 0.427 | 0.700 | 0.350 |
District 6 | 0.363 | 0.000 | 0.403 | 0.772 | 0.617 | 0.490 | - | 0.087 | 0.413 | 0.978 | 0.498 | 0.242 |
District 7 | 0.773 | 0.000 | 0.448 | 0.985 | 0.831 | 0.553 | 0.087 | - | 0.249 | 0.675 | 0.126 | 0.422 |
District 8 | 0.730 | 0.000 | 0.562 | 0.154 | 0.763 | 0.863 | 0.413 | 0.249 | - | 0.886 | 0.433 | 0.380 |
District 9 | 0.876 | 0.000 | 0.046 | 0.328 | 0.215 | 0.427 | 0.978 | 0.675 | 0.886 | - | 0.787 | 0.447 |
District 10 | 0.988 | 0.000 | 0.138 | 0.977 | 0.311 | 0.700 | 0.498 | 0.126 | 0.433 | 0.787 | - | 0.409 |
District 11 | 0.985 | 0.000 | 0.961 | 0.748 | 0.241 | 0.350 | 0.242 | 0.422 | 0.380 | 0.447 | 0.409 | - |
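
The treatment column of the table above can be screened programmatically. The sketch below copies the treatment p-values from that column and flags the districts that fall below a chosen threshold; the names `treatment_p` and `significant_districts` are hypothetical, and the 0.05 threshold is an assumption (the stricter 0.05 / 2 used earlier flags the same district).

```python
from typing import Dict, List

ALPHA = 0.05  # assumed significance level

# p-values of the District 1 treatment term, one per dependent district,
# copied from the first column of the summary table above.
treatment_p: Dict[str, float] = {
    "District 1": 0.005, "District 2": 0.252, "District 3": 0.909,
    "District 4": 0.335, "District 5": 0.568, "District 6": 0.363,
    "District 7": 0.773, "District 8": 0.730, "District 9": 0.876,
    "District 10": 0.988, "District 11": 0.985,
}


def significant_districts(p_values: Dict[str, float], alpha: float = ALPHA) -> List[str]:
    """Return the districts whose treatment p-value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]


print(significant_districts(treatment_p))  # ['District 1']
```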
- The campaign does not significantly affect the other districts (the treatment p-values for Districts 2 to 11 are all above 0.05)
- However, it's interesting to see that some pairs of districts are correlated in water consumption:
  - District 1 with Districts 5 to 11
  - District 2 with District 9
  - District 6 with District 7 (p = 0.087, so only at the 10% level)
- The campaign has a significant impact on water consumption in District 1.
- Daily water consumption in District 1 is predicted to be about 1192 litres lower than on a usual day, thanks to the campaign effect
- As a next step, we could evaluate the post-campaign effect in District 1, to see whether the effect is a one-off or a lasting change in water-use behavior
- If the campaign in District 1 has long-term effects, we could design similar campaigns for the other districts