docs: adds unreferenced datasets to sources.md #630

Merged 1 commit on Nov 26, 2024
26 changes: 20 additions & 6 deletions SOURCES.md
@@ -147,9 +147,7 @@ Generated using `/scripts/github.py`.

Combined Land-Surface Air and Sea-Surface Water Temperature Anomalies (Land-Ocean Temperature Index, L-OTI), 1880-2023. Source: NASA's Goddard Institute for Space Studies https://data.giss.nasa.gov/gistemp/

## `population_engineers_hurricanes.csv`

Data about engineers from https://www.bls.gov/oes/tables.htm. Hurricane data from http://www.nhc.noaa.gov/paststate.shtml. Income data from https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_07_3YR_S1901&prodType=table.
## `income.json`

## `iowa-electricity.csv`

@@ -205,6 +203,8 @@ Calculated from `londongBoroughs.json` using `d3.geoCentroid`.

Selected rail lines simplified from `tfl_lines.json` at https://github.com/oobrien/vis/tree/master/tube/data

## `lookup_groups.csv`, `lookup_people.csv`

## `miserables.json`

## `monarchs.json`
@@ -237,19 +237,20 @@ The `commonwealth` field is used to flag the period from 1649 to 1660, which inc
#### Recent updates

The dataset was revised in Aug. 2024. James II's reign now ends in 1688 (previously 1689).


### Data Source and Licensing
Source data has been verified against the [kings & queens](https://www.royal.uk/kings-and-queens-1066) and [interregnum](https://www.royal.uk/interregnum-1649-1660) pages of the [official website of the British royal family](https://www.royal.uk) (retrieved in Aug. 2024). Content on the site is protected by Crown Copyright. Under the [UK Government Licensing Framework](https://www.nationalarchives.gov.uk/information-management/re-using-public-sector-information/uk-government-licensing-framework/crown-copyright/), most Crown copyright information is available under the [Open Government Licence v3.0](https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/).



## `movies.json`

The dataset has well-known, intentionally included errors. It is used for instructional purposes, including the need to reckon with dirty data.

## `normal-2d.json`

## `obesity.json`

## `ohlc.json`

This dataset contains the performance of the Chicago Board Options Exchange [Volatility Index](https://en.wikipedia.org/wiki/VIX) ([VIX](https://finance.yahoo.com/chart/%5EVIX?ltr=1#eyJpbnRlcnZhbCI6ImRheSIsInBlcmlvZGljaXR5IjoxLCJ0aW1lVW5pdCI6bnVsbCwiY2FuZGxlV2lkdGgiOjgsInZvbHVtZVVuZGVybGF5Ijp0cnVlLCJhZGoiOnRydWUsImNyb3NzaGFpciI6dHJ1ZSwiY2hhcnRUeXBlIjoibGluZSIsImV4dGVuZGVkIjpmYWxzZSwibWFya2V0U2Vzc2lvbnMiOnt9LCJhZ2dyZWdhdGlvblR5cGUiOiJvaGxjIiwiY2hhcnRTY2FsZSI6ImxpbmVhciIsInN0dWRpZXMiOnsidm9sIHVuZHIiOnsidHlwZSI6InZvbCB1bmRyIiwiaW5wdXRzIjp7ImlkIjoidm9sIHVuZHIiLCJkaXNwbGF5Ijoidm9sIHVuZHIifSwib3V0cHV0cyI6eyJVcCBWb2x1bWUiOiIjMDBiMDYxIiwiRG93biBWb2x1bWUiOiIjRkYzMzNBIn0sInBhbmVsIjoiY2hhcnQiLCJwYXJhbWV0ZXJzIjp7IndpZHRoRmFjdG9yIjowLjQ1LCJjaGFydE5hbWUiOiJjaGFydCJ9fX0sInBhbmVscyI6eyJjaGFydCI6eyJwZXJjZW50IjoxLCJkaXNwbGF5IjoiXlZJWCIsImNoYXJ0TmFtZSI6ImNoYXJ0IiwidG9wIjowfX0sInNldFNwYW4iOnt9LCJsaW5lV2lkdGgiOjIsInN0cmlwZWRCYWNrZ3JvdWQiOnRydWUsImV2ZW50cyI6dHJ1ZSwiY29sb3IiOiIjMDA4MWYyIiwiZXZlbnRNYXAiOnsiY29ycG9yYXRlIjp7ImRpdnMiOnRydWUsInNwbGl0cyI6dHJ1ZX0sInNpZ0RldiI6e319LCJzeW1ib2xzIjpbeyJzeW1ib2wiOiJeVklYIiwic3ltYm9sT2JqZWN0Ijp7InN5bWJvbCI6Il5WSVgifSwicGVyaW9kaWNpdHkiOjEsImludGVydmFsIjoiZGF5IiwidGltZVVuaXQiOm51bGwsInNldFNwYW4iOnt9fV19)) in the summer of 2009.
@@ -262,6 +263,8 @@ Palmer Archipelago (Antarctica) penguin data collected and made available by [Dr

Assets from the video game [Celeste](http://www.celestegame.com/).

## `points.json`

## `political-contributions.json`

Summary financial information on contributions to candidates for U.S. elections. An updated version of this dataset is available from the "all candidates" files (in pipe-delimited format) on the [bulk data download](https://www.fec.gov/data/browse-data/?tab=bulk-data) page of the U.S. Federal Election Commission, or, alternatively, via [OpenFEC](https://api.open.fec.gov/developers/). Information on each of the 25 columns is available from the [FEC All Candidates File Description](https://www.fec.gov/campaign-finance-data/all-candidates-file-description/). The sample dataset in `political-contributions.json` contains 58 records with dates from 2015.
@@ -295,6 +298,10 @@ When using this dataset, please refer to IPUMS USA [terms of use](https://usa.ip

Steven Ruggles, Katie Genadek, Ronald Goeken, Josiah Grover, and Matthew Sobek. Integrated Public Use Microdata Series: Version 6.0. Minneapolis: University of Minnesota, 2015. http://doi.org/10.18128/D010.V6.0

## `population_engineers_hurricanes.csv`

Data about engineers from https://www.bls.gov/oes/tables.htm. Hurricane data from http://www.nhc.noaa.gov/paststate.shtml. Income data from https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_07_3YR_S1901&prodType=table.

## `seattle-weather.csv`

Data from [NOAA](https://www.ncdc.noaa.gov/cdo-web/datatools/records). Daily weather records with metric units. Transformed using `/scripts/weather.py`. We synthesized the categorical "weather" field from multiple fields in the original dataset. This data is intended for instructional purposes.
@@ -311,6 +318,8 @@ S&P 500 index values from 2000 to 2020, retrieved from [Yahoo Finance](https

## `stocks.csv`

## `udistrict.json`

## `unemployment-across-industries.json`

Industry-level unemployment statistics from the [Current Population Survey](https://www.census.gov/programs-surveys/cps.html) (CPS), published monthly by the U.S. Bureau of Labor Statistics. Includes unemployed persons and unemployment rate across 11 private industries, as well as agricultural, government, and self-employed workers. Covers January 2000 through February 2010. Industry classification follows format of CPS [Table A-31](https://www.bls.gov/web/empsit/cpseea31.htm).
@@ -370,6 +379,9 @@ When using BLS public data API and datasets, users should adhere to the [BLS Ter
3. Do not use the BLS logo without permission.

For detailed methodology and technical information about LAUS estimates, refer to the [BLS Handbook of Methods](https://www.bls.gov/opub/hom/lau/home.htm).

## `uniform-2d.json`

## `us-10m.json`

## `us-employment.csv`
@@ -382,6 +394,8 @@ Totals are included for the [22 "supersectors"](https://download.bls.gov/pub/tim

A calculated "nonfarm_change" column has been appended with the month-to-month change in that supersector's employment. It is useful for illustrating how to make bar charts that report both negative and positive values.
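
As a rough illustration of how a month-to-month change column like "nonfarm_change" can be derived, here is a minimal sketch. It is not the script used to build the dataset: the function name `month_to_month_change`, the sample values, and the choice to leave the first month empty (`None`) are all assumptions made for this example.

```python
from typing import List, Optional


def month_to_month_change(values: List[int]) -> List[Optional[int]]:
    """Return each entry's change from the previous entry.

    Assumption: the first month has no predecessor, so its change is None.
    The real dataset may treat that first row differently.
    """
    if not values:
        return []
    changes: List[Optional[int]] = [None]
    for previous, current in zip(values, values[1:]):
        changes.append(current - previous)
    return changes


# Hypothetical employment totals for four consecutive months.
employment = [130000, 130250, 130100, 130400]
change = month_to_month_change(employment)
print(change)
```

Because the differences can be negative as well as positive, a column computed this way suits bar charts that extend on both sides of a zero baseline.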

## `us-state-capitals.json`

## `volcano.json`

Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic field. This data set gives topographic information for Maunga Whau on a 10m by 10m grid. Digitized from a topographic map by Ross Ihaka, adapted from [R datasets](https://stat.ethz.ch/R-manual/R-patched/library/datasets/html/volcano.html). These data should not be regarded as accurate.