Conway Wong edited this page Mar 23, 2016 · 3 revisions

PEMS Pivot Data

PEMS 5-minute station readings for conventional highways (CH) have been pivoted and converted to Parquet files stored on S3. The parent location is s3://dse-team2-2014/parquet. Yearly 5-minute pivot files are stored under it with the naming convention s3://dse-team2-2014/parquet/pivot_YYYY, where YYYY is a year from 2008 to 2015.

  • For example, s3://dse-team2-2014/parquet/pivot_2010
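The naming convention can be captured in a small helper. This is only a sketch: the names `pivot_uri` and `load_year` are hypothetical, the year range is assumed to include both 2008 and 2015, and reading directly from S3 with pandas assumes that `pandas`, `pyarrow`, and `s3fs` are installed and that you have read access to the bucket.

```python
BASE_URI = "s3://dse-team2-2014/parquet"
FIRST_YEAR, LAST_YEAR = 2008, 2015  # assumed to be inclusive on both ends


def pivot_uri(year: int) -> str:
    """Return the S3 location of the pivot files for one year."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(
            f"no pivot data for {year}; expected {FIRST_YEAR} to {LAST_YEAR}"
        )
    return f"{BASE_URI}/pivot_{year}"


def load_year(year: int, columns=None):
    """Read one year of pivot data into a pandas DataFrame.

    Assumption: pandas, pyarrow and s3fs are available and the caller
    has credentials that can read the bucket.
    """
    import pandas as pd  # imported lazily so pivot_uri works without pandas

    return pd.read_parquet(pivot_uri(year), columns=columns)
```

For example, `pivot_uri(2010)` yields the location shown in the bullet above. Because each record carries several hundred columns, passing a short `columns` list to `load_year` is likely to be much cheaper than reading every column.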

Each record holds all of the 5-minute readings for one station on a single day, with one column per 5-minute interval for each measurement. The columns of the Parquet files are described below.

| Column | Type | Description | Unit |
|---|---|---|---|
| station_id | int | Unique station identifier | |
| district_id | int | District number | |
| year | int | Year of the readings | |
| day_of_year | int | Julian day (day of the year) | |
| day_of_week | int | 1=Sunday, 2=Monday, ... 7=Saturday | |
| direction | int | Direction of travel of the station's highway: 1=N, 2=S, 3=E, 4=W | |
| total_flow_HHMM (x288) | double | 5-minute station reading for total flow. For example, the column total_flow_1305 holds the station's total flow between 1:05 PM and 1:10 PM | # of vehicles |
| occupancy_HHMM (x288) | double | Average occupancy across all lanes over the 5-minute period, expressed as a decimal number between 0 and 1 | fraction (0 to 1) |
| speed_HHMM (x288) | double | 5-minute station reading for speed | MPH |
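The column layout above can be generated and decoded with a few lines of Python. This is a sketch under stated assumptions rather than a description of how the files were produced: the helper names are hypothetical, the HHMM suffix is assumed to be zero-padded 24-hour time starting at 0000 (consistent with total_flow_1305 meaning 1:05 PM), and day_of_year is assumed to start at 1 for January 1.

```python
from datetime import date, timedelta

MEASURES = ("total_flow", "occupancy", "speed")
DIRECTIONS = {1: "N", 2: "S", 3: "E", 4: "W"}


def slot_columns(measure: str) -> list:
    """Return the 288 column names for one measure, in time order.

    Assumption: suffixes are zero-padded 24-hour HHMM, from 0000 to 2355.
    """
    if measure not in MEASURES:
        raise ValueError(f"unknown measure: {measure!r}")
    return [
        f"{measure}_{minute // 60:02d}{minute % 60:02d}"
        for minute in range(0, 24 * 60, 5)
    ]


def record_date(year: int, day_of_year: int) -> date:
    """Convert (year, day_of_year) to a calendar date.

    Assumption: day_of_year is 1-based, so 1 means January 1.
    """
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


def expected_day_of_week(d: date) -> int:
    """Day of week in the table's coding: 1=Sunday ... 7=Saturday."""
    return d.isoweekday() % 7 + 1
```

`slot_columns("speed")` can be passed as a column selection when reading a file, and `expected_day_of_week(record_date(year, day_of_year))` offers a quick sanity check against the stored day_of_week value.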

For more information about the data, please refer to http://pems.dot.ca.gov/.

