title | author | date
---|---|---
ggplot2 Introduction | Hank Flury | August 02, 2019
-We need to install "ggplot2" in order to use ggplot()
-We will also be using "dplyr" for some data manipulation
if (!"ggplot2" %in% row.names(installed.packages())) {
  install.packages("ggplot2")
}
if (!"dplyr" %in% row.names(installed.packages())) {
  install.packages("dplyr")
}
library(ggplot2)
library(dplyr)
-We will use the dataset "avgs", which contains per-magnitude-group averages of earthquake events in the PNW.
-"quakes", the event-level data, will be used for exercises
# Read the raw events, bin magnitude into three groups (0: <= 3, 1: 3-4, 2: > 4),
# and pull the calendar date out of the timestamp string
quakes <- read.csv("earthquakes_ggplot_demo.csv", stringsAsFactors = FALSE) %>%
  mutate(magGroup = as.factor((mag > 3) + (mag > 4))) %>%
  mutate(date = as.Date(regmatches(time, regexpr("\\d{4}-\\d{2}-\\d{2}", time)))) %>%
  arrange(magGroup) %>%
  select(date, latitude, longitude, depth, mag, nst, gap,
         horizontalError, depthError, magError, magGroup)
# Average the numeric columns within each magnitude group
avgs <- quakes %>%
  select(depth, mag, depthError, magGroup) %>%
  group_by(magGroup) %>%
  summarize_if(is.numeric, list(~mean(., na.rm = TRUE))) %>%
  arrange(magGroup)
head(avgs)
## # A tibble: 3 x 4
## magGroup depth mag depthError
## <fct> <dbl> <dbl> <dbl>
## 1 0 15.2 2.70 4.26
## 2 1 19.7 3.34 1.47
## 3 2 15.2 4.27 0.893
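The magGroup column above is built by adding two logical comparisons, which R coerces to 0/1 before summing. A minimal sketch with made-up magnitudes (the values in `mag` below are illustrative assumptions, not rows from the CSV):

```r
# Hypothetical magnitudes chosen to land in each group; not taken from the dataset
mag <- c(2.5, 3.0, 3.4, 4.0, 4.3)

# TRUE/FALSE become 1/0, so the sum counts how many thresholds each value exceeds
magGroup <- as.factor((mag > 3) + (mag > 4))

# 0: mag <= 3, 1: 3 < mag <= 4, 2: mag > 4
print(data.frame(mag, magGroup))
```

Because the comparisons are strict, a magnitude of exactly 3 falls in group 0 and exactly 4 falls in group 1.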
- Much more customizable than base R graphing functions
- Automation of different aspects such as legend generation
- More robust in its handling of NAs and missing data
- The ability to save your plots as objects
- Very similar to dplyr
- "+" replaces "%>%"
- "ggplot()" is the basis of the plot
- Add geoms to give the graph what you actually want
- Miscellaneous functions help customize the plot to your needs
- ggplot() creates the space in which the plot is created
- all parts of your graph are "added" to ggplot()
- Common Parameters
- data - The data for your plot
- aes() - Set inheritable aesthetic traits for your graph
ggplot()
- Right now our plot is just blank since we have not told it what to plot
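As a rough sketch of the ideas above (a plot is an object, and "+" adds to it much as "%>%" chains data steps), the snippet below builds a canvas and then adds one layer. The data frame `toy` and its values are made up for illustration and merely stand in for "avgs".

```r
library(ggplot2)

# Toy data standing in for "avgs"; values are invented for illustration
toy <- data.frame(group = factor(c(0, 1, 2)), depth = c(15, 20, 15))

# A plot is an ordinary R object; nothing is drawn until it is printed
base <- ggplot(data = toy, aes(x = group, y = depth))
length(base$layers)         # 0: axes are known, but there are no geoms yet

# "+" returns a new plot with one more layer; `base` itself is unchanged
with_points <- base + geom_point()
length(with_points$layers)  # 1

print(with_points)          # draws the plot
```

Keeping `base` around makes it easy to try several geoms on the same data and aesthetics without retyping them.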
- Common geoms
- geom_line()
- geom_point()
- geom_bar()
- geoms are what actually make up your plot
- Multiple geoms can be added to the same plot
- The first geom added will be the bottom layer
ggplot() +
  geom_bar() +
  geom_errorbar()
-Our plot is still blank since we have not given it any data
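One way to see the layer ordering described above, without drawing anything, is to inspect the plot object. This is a small sketch; it relies on the `layers` element of a ggplot object, which holds the geoms in the order they were added.

```r
library(ggplot2)

# Build (but do not print) the same two-geom plot; no data is needed for this
p <- ggplot() +
  geom_bar() +        # added first, so it is the bottom layer
  geom_errorbar()     # added second, so it is drawn on top

# List the geom class of each layer, bottom to top
layer_order <- sapply(p$layers, function(layer) class(layer$geom)[1])
print(layer_order)
```

Swapping the two geom calls reverses the order, which would hide the error bars' lower halves behind the bars once data is supplied.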
- Denoted by aes()
- Aesthetics are used to define how your geoms should look
- Set the x and y positions
- Other common variables
- size
- color
- shape
- Anything that is not defined by a variable can be set outside of the aesthetics
ggplot() +
  geom_bar(aes(x = avgs$magGroup, y = avgs$depth), stat = "identity") +
  geom_errorbar(aes(x = avgs$magGroup, ymin = avgs$depth - avgs$depthError,
                    ymax = avgs$depth + avgs$depthError))
-There's a better way to write this!
ggplot(data = avgs, aes(x = magGroup)) +
  geom_bar(aes(y = depth), stat = "identity") +
  geom_errorbar(aes(ymin = depth - depthError, ymax = depth + depthError))
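The shorter version works because data and aesthetics given to ggplot() are inherited by every geom, while each geom adds only what is specific to it. The sketch below makes that visible by inspecting the stored mappings; the data frame `toy` and its `err` column are invented for illustration.

```r
library(ggplot2)

# Made-up stand-in for "avgs"
toy <- data.frame(group = factor(c(0, 1, 2)),
                  depth = c(15, 20, 15),
                  err   = c(4, 1.5, 1))

p <- ggplot(data = toy, aes(x = group)) +          # data and x are set once here
  geom_bar(aes(y = depth), stat = "identity") +    # inherits data and x, adds y
  geom_errorbar(aes(ymin = depth - err, ymax = depth + err), width = 0.4)

# The plot-level mapping holds only x; each layer stores just what it added
names(p$mapping)
names(p$layers[[1]]$mapping)
names(p$layers[[2]]$mapping)
```

A geom can still override an inherited aesthetic by naming it again in its own aes().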
- Aesthetics that vary based on the data go inside aes()
- aes(x = magGroup, y = depth, fill = magGroup)
- Static aesthetics go inside the geom, but outside the aes().
- geom_bar(aes(x = magGroup, y = depth), stat = "identity", fill = "red")
- Easiest way to add axis labels or a title is through labs()
- Add labs as if it were another geom
- labs(x = "Magnitude Group", y = "Depth", title = "Depth vs. Magnitude Group")
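The difference between mapped and static aesthetics described above can be sketched as follows. The data frame `toy` is made up for illustration; `aes_params` is the element of a layer object where ggplot2 keeps fixed aesthetic values.

```r
library(ggplot2)

# Made-up stand-in for "avgs"
toy <- data.frame(group = factor(c(0, 1, 2)), depth = c(15, 20, 15))

# Mapped: fill varies with the data, so ggplot2 builds a legend for it
mapped <- ggplot(toy, aes(x = group, y = depth)) +
  geom_bar(aes(fill = group), stat = "identity")

# Static: one fixed colour for every bar, set outside aes(); no legend
static <- ggplot(toy, aes(x = group, y = depth)) +
  geom_bar(stat = "identity", fill = "red")

# Mapped aesthetics live in the layer's mapping, static ones in its parameters
"fill" %in% names(mapped$layers[[1]]$mapping)
static$layers[[1]]$aes_params$fill
```

Writing fill = "red" inside aes() would instead treat "red" as a data value and pass it through the default fill scale, so the bars would not necessarily be red.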
ggplot(data = avgs, aes(x = magGroup)) +
  geom_bar(aes(y = depth), stat = "identity", fill = "skyblue", color = "black") +
  geom_errorbar(aes(ymin = depth - depthError, ymax = depth + depthError),
                width = .4, color = "orange", size = 1.5) +
  scale_x_discrete(labels = c("Low Magnitude", "Medium Magnitude", "High Magnitude")) +
  labs(x = "Magnitude Group", y = "Depth (km)") +
  ggtitle("Average Depth by Magnitude Group") +
  theme_bw()
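The plot above sets its title with ggtitle(), but the same title can be passed to labs() alongside the axis labels. A small sketch comparing the two, using a made-up data frame `toy` in place of "avgs":

```r
library(ggplot2)

# Made-up stand-in for "avgs"
toy <- data.frame(group = factor(c(0, 1, 2)), depth = c(15, 20, 15))
base <- ggplot(toy, aes(x = group, y = depth)) + geom_point()

# Title and axis labels in a single labs() call
a <- base +
  labs(x = "Magnitude Group", y = "Depth (km)",
       title = "Average Depth by Magnitude Group")

# Same axis labels, with the title set through ggtitle()
b <- base +
  labs(x = "Magnitude Group", y = "Depth (km)") +
  ggtitle("Average Depth by Magnitude Group")

# Both routes store the same title on the plot object
identical(a$labels$title, b$labels$title)
```

Which form to use is mostly a matter of taste; a single labs() call keeps all the text for a plot in one place.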
- Using the "quakes" dataset, create a scatterplot of magnitude against depth. Add a title and axis labels, and assign a new color to the points.
- On the same graph, vary each point's color based on its magnitude group. Be sure to also modify the legend.
- Suppose we want to emphasize the points whose magnitude we are most confident about. Make each point's opacity depend on its magnitude error, so that points with smaller error are more opaque. If you can, remove the legend for opacity.
- Challenge: Recreate the "lollipop" graph from the beginning of the presentation. Map color to the magnitude group, and come up with your own variable for the shape. Note: the y axis must be in the range [2, 8], and the title should include the date range of the included earthquakes.
head(quakes, n = 2)
## date latitude longitude depth mag nst gap horizontalError
## 1 2018-12-23 44.82633 -123.7868 48.58 2.53 25 123 0.37
## 2 2018-11-23 45.36230 -116.1946 10.00 3.00 NA 65 3.40
## depthError magError magGroup
## 1 0.69 0.09799848 0
## 2 2.00 0.04800000 0