forked from openair-project/openair
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
171 lines (130 loc) · 6.62 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
---
output:
github_document:
toc: FALSE
---
<!-- Edit the README.Rmd only!!! The README.md is generated automatically from README.Rmd. -->
openair: open source tools for air quality data analysis
========================================================
For the main **openair** website, see <http://davidcarslaw.github.io/openair/>.
```{r echo=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
warning = FALSE,
message = FALSE,
eval = FALSE,
cache = TRUE
)
knitr::opts_chunk$set(
fig.path = "tools/"
)
```
[![Travis-CI Build Status](https://travis-ci.org/davidcarslaw/openair.svg?branch=master)](https://travis-ci.org/davidcarslaw/openair)
![](http://cranlogs.r-pkg.org/badges/grand-total/openair)
<img src="inst/plume.png" alt="openair logo" width="35%" />
**openair** is an R package developed for the purpose of analysing air
quality data --- or more generally atmospheric composition data. The
package is extensively used in academia, the public and private
sectors. The project was initially funded by the UK Natural
Environment Research Council ([NERC](http://www.nerc.ac.uk/)), with
additional funds from Defra. The most up to date information on
openair can be found in the package itself and the
[manual](https://www.dropbox.com/s/2n7wdyursdul8dk/openairManual.pdf?dl=0)
which provides an introduction to R with a focus on air quality data
as well as extensive reproducible examples. An archive of newsletters in also available at the same location.
The current newsletter (Issue 18) summarises some of the recent changes to the package and is available [here](http://rpubs.com/carslaw/newsletter18).
## Installation
Installation of openair from GitHub is easy using the devtools
package. Note, because openair contains C++ code a compiler is also
needed. For Windows - for example,
[Rtools](https://cran.r-project.org/bin/windows/Rtools/) is needed.
```{r}
require(devtools)
install_github('davidcarslaw/openair')
```
I also try to keep up to date versions of the package
[here](https://www.dropbox.com/sh/x8mmf5d54wfo5vq/AACpKqsrPkTC0guiu5ftiKNna?dl=0)
if you can't build the package yourself.
## Description
**openair** has developed over several years to help analyse atmospheric composition data; initially focused on air quality data.
This package continues to develop and input from other developers would be welcome. A summary of some of the features are:
- **Access to data** from several hundred UK air pollution monitoring
sites through the `importAURN` and `importKCL` functions.
- **Utility functions** such as `timeAverage` and `selectByDate` to
make it easier to manipulate atmospheric composition data.
- Flexible **wind and pollution roses** through `windRose` and `pollutionRose`.
- Flexible plot conditioning to easily plot data by hour or the day,
day of the week, season etc. through the openair `type` option
available in most functions.
- More sophisticated **bivariate polar plots** and conditional
probability functions to help characterise different sources of
pollution. A paper on the latter is available
[here](http://www.sciencedirect.com/science/article/pii/S1364815214001339).
- Access to NOAA [Hysplit](http://ready.arl.noaa.gov/HYSPLIT.php)
pre-calculated annual 96-hour back **trajectories** and many
plotting and analysis functions e.g. trajectory frequencies,
Potential Source Contribution Function and trajectory clustering.
- Many functions for air quality **model evaluation** using the
flexible methods described above e.g. the `type` option to easily
evaluate models by season, hour of the day etc. These include key
model statistics, Taylor Diagram, Conditional Quantile plots.
## Brief examples
### Import data from the UK Automatic Urban and Rural Network
It is easy to import hourly data from 100s of sites and to import
several sites at one time and several years of data.
```{r, eval=TRUE}
library(openair)
kc1 <- importAURN(site = "kc1", year = 2011:2012)
head(kc1)
```
### Utility functions
Using the `selectByDate` function it is easy to select quite complex
time-based periods. For example, to select weekday (Monday to Friday)
data from June to September for 2012 *and* for the hours 7am to 7pm
inclusive:
```{r, eval = TRUE}
sub <- selectByDate(kc1, day = "weekday", year = 2012, month = 6:9, hour = 7:19)
head(sub)
```
Similarly it is easy to time-average data in many flexible ways. For
example, 2-week means can be calculated as
```{r,eval=TRUE}
sub2 <- timeAverage(kc1, avg.time = "2 week")
```
### The `type` option
One of the key aspects of openair is the use of the `type` option,
which is available for almost all openair functions. The `type` option
partitions data by different categories of variable. There are many
built-in options that `type` can take based on splitting your data by
different date values. A summary of in-built values of type are:
* "year" splits data by year
* "month" splits variables by month of the year
* "monthyear" splits data by year *and* month
* "season" splits variables by season. Note in this case the user can
also supply a `hemisphere` option that can be either "northern"
(default) or "southern"
* "weekday" splits variables by day of the week
* "weekend" splits variables by Saturday, Sunday, weekday
* "daylight" splits variables by nighttime/daytime. Note the user must
supply a `longitude` and `latitude`
* "dst" splits variables by daylight saving time and non-daylight
saving time (see manual for more details)
* "wd" if wind direction (`wd`) is available `type = "wd"` will split
the data up into 8 sectors: N, NE, E, SE, S, SW, W, NW.
* "seasonyear (or "yearseason") will split the data into year-season intervals, keeping the months of a season together. For example, December 2010 is considered as part of winter 2011 (with January and February 2011). This makes it easier to consider contiguous seasons. In contrast, `type = "season"` will just split the data into four seasons regardless of the year.
If a categorical variable is present in a data frame e.g. `site` then
that variables can be used directly e.g. `type = "site"`.
`type` can also be a numeric variable. In this case the numeric
variable is split up into 4 *quantiles* i.e. four partitions
containing equal numbers of points. Note the user can supply the
option `n.levels` to indicate how many quantiles to use.
### Wind roses and pollution roses
**openair** can plot basic wind roses very easily provided the variables
`ws` (wind speed) and `wd` (wind direction) are available.
```{r,eval=TRUE,fig.width=4,fig.height=4.5}
windRose(mydata)
```
However, the real flexibility comes from being able to use the `type` option.
```{r,eval=TRUE}
windRose(mydata, type = "year", layout = c(4, 2))
```