---
title: "ggplot2 Introduction"
author: "Hank Flury"
date: "August 02, 2019"
output:
  ioslides_presentation:
    fig_caption: true
    fig_retina: 1
    fig_width: 5
    fig_height: 3
    keep_md: true
    smaller: true
---

Getting Started

-We need to install "ggplot2" in order to use ggplot

-We will also be using "dplyr" for some data manipulation

if(! "ggplot2" %in% row.names(installed.packages())){
  install.packages("ggplot2")
}
if(! "dplyr" %in% row.names(installed.packages())){
  install.packages("dplyr")
}
library(ggplot2)
library(dplyr)

Data

-We will use the dataset "avgs", which contains per-magnitude-group averages of earthquake events in the PNW (Pacific Northwest).

-"quakes", the event-level data, will be used for exercises

# Read the raw events and bin magnitudes: 0 = mag <= 3, 1 = 3 < mag <= 4, 2 = mag > 4
quakes <- read.csv("earthquakes_ggplot_demo.csv", stringsAsFactors = FALSE) %>% 
  mutate(magGroup = as.factor((mag > 3) + (mag > 4))) %>% 
  # Keep only the YYYY-MM-DD part of the timestamp string
  mutate(date = as.Date(regmatches(time, regexpr("\\d{4}-\\d{2}-\\d{2}", time)))) %>% 
  arrange(magGroup) %>% select(date, latitude, longitude, depth, mag, nst, gap,
    horizontalError, depthError, magError, magGroup)
# Average the numeric columns within each magnitude group
avgs <- quakes %>% 
  select(depth, mag, depthError, magGroup) %>% 
  group_by(magGroup) %>% summarize_if(is.numeric, list(~mean(., na.rm = TRUE))) %>% 
  arrange(magGroup)
head(avgs)
## # A tibble: 3 x 4
##   magGroup depth   mag depthError
##   <fct>    <dbl> <dbl>      <dbl>
## 1 0         15.2  2.70      4.26 
## 2 1         19.7  3.34      1.47 
## 3 2         15.2  4.27      0.893
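
The magGroup coding above comes from adding two logical comparisons, since R treats TRUE as 1 and FALSE as 0. A minimal sketch using only base R, with made-up magnitudes and a hypothetical timestamp string (the exact format of the time column in the CSV is an assumption here):

```r
# Made-up magnitudes, for illustration only
mag <- c(2.5, 3.0, 3.4, 4.0, 4.6)

# (mag > 3) + (mag > 4) counts how many thresholds each value exceeds:
# 0 = at most 3, 1 = above 3 and at most 4, 2 = above 4
mag_group <- as.factor((mag > 3) + (mag > 4))
print(mag_group)

# Hypothetical timestamp; the regex keeps only the YYYY-MM-DD part
time <- "2018-12-23T10:15:30.000Z"
date <- as.Date(regmatches(time, regexpr("\\d{4}-\\d{2}-\\d{2}", time)))
print(date)
```

Note that a magnitude of exactly 3 or exactly 4 falls into the lower group, because the comparisons are strict.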

Advantages to ggplot

  • Much more customizable than base R graphing functions
  • Automation of different aspects such as legend generation
  • More robust in its handling of NAs and missing data
  • The ability to save your plots as objects
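
As a small illustration of the last point, a plot can be stored in a variable and extended later. This sketch types in the three rows from the head(avgs) output by hand; avgs_demo and the plot variable names are introduced here for illustration and are not part of the presentation:

```r
library(ggplot2)

# Values copied from the head(avgs) output; avgs_demo is an illustrative name
avgs_demo <- data.frame(
  magGroup   = factor(c(0, 1, 2)),
  depth      = c(15.2, 19.7, 15.2),
  depthError = c(4.26, 1.47, 0.893)
)

# Store the plot in a variable instead of drawing it right away
p <- ggplot(data = avgs_demo, aes(x = magGroup, y = depth)) +
  geom_bar(stat = "identity")

# Extend the saved object later; p itself is left unchanged
p_labeled <- p + labs(x = "Magnitude Group", y = "Depth (km)")
```

Typing p_labeled at the console, or calling print(p_labeled), draws the plot.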

ggplot Structure

  • Very similar to dplyr
    • "+" replaces "%>%"
  • "ggplot()" is the basis of the plot
  • Add geoms to draw what you actually want on the graph
  • Miscellaneous functions help customize the plot to your needs
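
The parallel with dplyr can be sketched side by side: a dplyr pipeline chains steps with %>%, while a ggplot chains components with +. The data frame below is made up for illustration, and the variable names are assumptions:

```r
library(ggplot2)
library(dplyr)

# Made-up events, for illustration only
demo <- data.frame(
  depth = c(48.6, 10.0, 22.3, 5.1),
  mag   = c(2.5, 3.0, 3.6, 4.4)
)

# dplyr: each step is piped into the next with %>%
shallow <- demo %>%
  filter(depth < 30) %>%
  arrange(mag)

# ggplot2: each component is added to the plot with +
p <- ggplot(data = shallow, aes(x = depth, y = mag)) +  # the basis of the plot
  geom_point() +                                        # a geom draws the data
  theme_bw()                                            # a helper customizes the look
```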

ggplot()

  • ggplot() creates the space in which the plot is created
  • All parts of your graph are "added" to ggplot()
  • Common Parameters
    • data - The data for your plot
    • aes() - Set inheritable aesthetic traits for your graph

ggplot()

ggplot()

  • Right now our plot is just blank since we have not told it what to plot
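
Supplying data and aesthetics to ggplot() still draws no data on its own, because no geom has been added yet. A minimal sketch, with values typed in from the head(avgs) output (avgs_demo and p_empty are illustrative names):

```r
library(ggplot2)

# Values copied from the head(avgs) output; avgs_demo is an illustrative name
avgs_demo <- data.frame(
  magGroup = factor(c(0, 1, 2)),
  depth    = c(15.2, 19.7, 15.2)
)

# Data and aesthetics are set, but there is still no geom
p_empty <- ggplot(data = avgs_demo, aes(x = magGroup, y = depth))

# The plot object holds no layers, so only the empty axes would be drawn
length(p_empty$layers)
```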

geoms

  • Common geoms

    • geom_line()
    • geom_point()
    • geom_bar()
  • geoms are what actually make up your plot

  • Multiple geoms can be added to the same plot

    • The first geom added will be the bottom layer
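
Layer order can be inspected directly, since a plot object keeps its geoms in the order they were added. A sketch with made-up data:

```r
library(ggplot2)

# Made-up data, for illustration only
demo <- data.frame(x = 1:5, y = c(2, 4, 3, 6, 5))

p <- ggplot(data = demo, aes(x = x, y = y)) +
  geom_line(color = "grey40") +  # added first: bottom layer
  geom_point(size = 3)           # added second: drawn on top of the line

# Layers are stored in the order they were added
class(p$layers[[1]]$geom)[1]
class(p$layers[[2]]$geom)[1]
```

Swapping the two geom calls would draw the line over the points instead.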

geoms

ggplot() +
  geom_bar() +
  geom_errorbar()

-Our plot is still blank since we have not given it any data

Aesthetics

  • Denoted by aes()
  • Aesthetics are used to define how your geoms should look
    • Set the x and y positions
    • Other common variables
      • size
      • color
      • shape
  • Anything that is not defined by a variable can be set outside of the aesthetics
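
A sketch of the difference between an aesthetic defined by a variable and one set to a fixed value, on a scatterplot with made-up data (the column names mirror quakes, but the values are invented):

```r
library(ggplot2)

# Made-up events, for illustration only
demo <- data.frame(
  depth    = c(48.6, 10.0, 22.3, 5.1, 30.2, 12.7),
  mag      = c(2.5, 3.0, 3.6, 4.4, 3.2, 4.1),
  magGroup = factor(c(0, 0, 1, 2, 1, 2))
)

p <- ggplot(data = demo, aes(x = depth, y = mag)) +
  # shape is defined by a variable, so it goes inside aes()
  # size is the same for every point, so it is set outside aes()
  geom_point(aes(shape = magGroup), size = 3)
```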

Aesthetics

ggplot() +
  geom_bar(aes(x = avgs$magGroup, y = avgs$depth), stat = "identity") +
  geom_errorbar(aes(x = avgs$magGroup, ymin = avgs$depth - avgs$depthError, 
                    ymax = avgs$depth + avgs$depthError))

Aesthetics

-There's a better way to write this!

ggplot(data = avgs, aes(x = magGroup)) +
  geom_bar(aes(y = depth), stat = "identity") +
  geom_errorbar(aes(ymin = depth - depthError, ymax = depth + depthError))

Non-Essential Aesthetics

  • Aesthetics that vary based on the data go inside aes()
    • aes(x = magGroup, y = depth, fill = magGroup)
  • Static aesthetics go inside the geom, but outside the aes().
    • geom_bar(aes(x = magGroup, y = depth), stat = "identity", fill = "red")
  • Easiest way to add axis labels or a title is through labs()
    • Add labs() as if it were another geom
    • labs(x = "Magnitude Group", y = "Depth", title = "Depth vs. Magnitude Group")
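
The same bar chart can be sketched both ways, once with fill driven by the data and once with a static fill plus labels. The values are typed in from the head(avgs) output; avgs_demo and the plot variable names are illustrative:

```r
library(ggplot2)

# Values copied from the head(avgs) output; avgs_demo is an illustrative name
avgs_demo <- data.frame(
  magGroup = factor(c(0, 1, 2)),
  depth    = c(15.2, 19.7, 15.2)
)

# fill varies with the data: it goes inside aes() and a legend is generated
p_mapped <- ggplot(avgs_demo, aes(x = magGroup, y = depth, fill = magGroup)) +
  geom_bar(stat = "identity")

# fill is static: it goes inside the geom but outside aes()
p_static <- ggplot(avgs_demo, aes(x = magGroup, y = depth)) +
  geom_bar(stat = "identity", fill = "red") +
  # labs() is added like another component; the plot title is set with title =
  labs(x = "Magnitude Group", y = "Depth", title = "Depth vs. Magnitude Group")
```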

Non-Essential Aesthetics

ggplot(data = avgs, aes(x = magGroup)) +
  geom_bar(aes(y = depth), stat = "identity", fill = "skyblue", color = "black") +
  geom_errorbar(aes(ymin = depth - depthError, ymax = depth + depthError), 
                width = .4, color = "orange", size = 1.5) +
  scale_x_discrete(labels = c("Low Magnitude", "Medium Magnitude", "High Magnitude")) +
  labs(x = "Magnitude Group", y = "Depth (km)") + 
  ggtitle("Average Depth by Magnitude Group") +
  theme_bw()

Exercises

  1. Using the "quakes" dataset, create a scatterplot of magnitude against depth. Create a title, axis labels and assign a new color to the points.

  2. On the same graph, vary each point's color based on its Magnitude Group. Be sure to also modify the legend.

  3. Suppose we want to emphasize the points whose magnitude we are most confident about. Make each point's opacity depend on its magnitude error, so that points with smaller errors are more opaque. If you can, remove the legend for opacity.

  4. Challenge: Recreate the "lollipop" graph from the beginning of the presentation. Map the color to the magnitude group, and come up with your own variable for the shape. Note: the y axis must be in the range [2,8], and the title should include the date range of the included earthquakes.

head(quakes, n = 2)
##         date latitude longitude depth  mag nst gap horizontalError
## 1 2018-12-23 44.82633 -123.7868 48.58 2.53  25 123            0.37
## 2 2018-11-23 45.36230 -116.1946 10.00 3.00  NA  65            3.40
##   depthError   magError magGroup
## 1       0.69 0.09799848        0
## 2       2.00 0.04800000        0