```{r echo=FALSE, results='hide'}
library(rblocks)
```
# Lists
In this chapter, we're going to learn about lists. Lists can be a bit confusing the first time you use them. Heaven knows it took me ages to get comfortable with them. However, they're a very powerful way to structure data and, once mastered, will give you control over pretty much anything the world can throw at you. If vectors are R's atoms, lists are its molecules.
By the end of this chapter, you will know:
* What is a list and how is one created?
* What is the difference between a list and a vector?
* When and how do I use `lapply`?
Lists can hold data of arbitrary complexity: each element may be of any type and any length. Note the new `[[ ]]` double bracket operator.
```{r }
x <- list()
typeof(x)
x[[1]] <- c("Hello", "there", "this", "is", "a", "list")
x[[2]] <- c(pi, exp(1))
summary(x)
str(x)
```
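One way to see the difference between a list and a vector is to combine values of different types. The sketch below (the names `asVector` and `asList` are only for illustration) suggests that `c()` coerces everything to a single type, while `list()` keeps each element as it is.

```{r}
# c() builds an atomic vector, so every value is coerced to one common type
asVector <- c(1, "two", TRUE)
typeof(asVector)      # "character"

# list() keeps each element's own type
asList <- list(1, "two", TRUE)
typeof(asList)        # "list"
typeof(asList[[1]])   # "double"
typeof(asList[[3]])   # "logical"

# str() shows the type of every element at once
str(asList)
```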
## Lists Overview
```{r echo=FALSE}
make_block(x)
```
### `[` vs. `[[`
`[` is (almost always) used to set and return an element of the same type as the _containing_ object.
`[[` is used to set and return an element of the same type as the _contained_ object.
This is why we use `[[` to set an item in a list.
Don't worry if this doesn't make sense yet. It's difficult for most R programmers.
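A small sketch may help here. It builds its own two-element list so that it runs on its own; `demo`, `oneBracket` and `twoBrackets` are hypothetical names used only for this illustration.

```{r}
demo <- list(c("Hello", "there"), c(pi, exp(1)))

# Single brackets return a list (the containing type) of length one
oneBracket <- demo[2]
typeof(oneBracket)     # "list"
length(oneBracket)     # 1

# Double brackets return the element itself (the contained type)
twoBrackets <- demo[[2]]
typeof(twoBrackets)    # "double"
length(twoBrackets)    # 2
```

Put another way, `demo[2]` is still a list that happens to hold one item, whereas `demo[[2]]` is the numeric vector stored inside it.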
### Recursive storage
Lists can contain other lists as elements.
```{r }
y <- list()
y[[1]] <- "Lou Reed"
y[[2]] <- 45
x[[3]] <- y
```
```{r echo=FALSE}
make_block(x)
```
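To reach into a nested list, the `[[` operator can be chained. The sketch below builds a small stand-alone list; `innerList` and `outerList` are illustrative names, not objects used elsewhere in this chapter.

```{r}
innerList <- list("Lou Reed", 45)
outerList <- list(c(pi, exp(1)), innerList)

# The second element of outerList is itself a list
typeof(outerList[[2]])   # "list"

# Chain [[ to pull a value out of the inner list
outerList[[2]][[1]]      # "Lou Reed"
outerList[[2]][[2]]      # 45

# str() shows the nesting at a glance
str(outerList)
```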
### List metadata
As with vectors, the most common metadata is names, and names become especially important for lists. A named element can be accessed with the special `$` operator, which returns a single element. (A single element of a list can itself be a vector!)
```{r}
y[[1]] <- c("Lou Reed", "Patti Smith")
y[[2]] <- c(45, 63)
names(y) <- c("Artist", "Age")
y$Artist
y$Age
```
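Names can also be supplied when the list is created, and a named element can be fetched with `[[` and the name as a string. A minimal sketch, using a hypothetical list called `band`:

```{r}
band <- list(Artist = c("Lou Reed", "Patti Smith"), Age = c(45, 63))
names(band)              # "Artist" "Age"

# These two expressions return the same character vector
band$Artist
band[["Artist"]]

# Single brackets with a name keep the list wrapper
typeof(band["Age"])      # "list"
typeof(band[["Age"]])    # "double"
```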
### `lapply`
`lapply` is one of many functions which may be applied to lists. It can be difficult at first, but it is very powerful: it applies the same function to each element of a list and returns the results as a list.
```{r }
myList <- list(firstVector = 1:10,
               secondVector = c(89, 56, 84, 298, 56),
               thirdVector = c(7, 3, 5, 6, 2, 4, 2))
lapply(myList, mean)
lapply(myList, median)
lapply(myList, sum)
```
### Why `lapply`?
Two reasons:
1. It's expressive. A loop is a lot of code which does little to clarify intent. `lapply` indicates that we want to apply the same function to each element of a list. Think of a formula that exists as a column in a spreadsheet.
2. It's easier to type at an interactive console. In its very early days, `S` was fully interactive. Typing a `for` loop at the console is a tedious and unnecessary task.
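
For comparison, here is a sketch of the same calculation written both ways. The list `scores` and the result names `loopMeans` and `applyMeans` are made up for this illustration.

```{r}
scores <- list(first = 1:10, second = c(89, 56, 84, 298, 56))

# With a for loop: set up storage, iterate, and copy the names by hand
loopMeans <- vector("list", length(scores))
for (i in seq_along(scores)) {
  loopMeans[[i]] <- mean(scores[[i]])
}
names(loopMeans) <- names(scores)

# With lapply: one line, and the names come along automatically
applyMeans <- lapply(scores, mean)

# Both approaches give the same named list
identical(loopMeans, applyMeans)
```

The loop needs four lines of bookkeeping before the intent (take the mean of each element) is visible, while the `lapply` call states it directly.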
### Summary functions
Because a list can hold elements of any type, we can't expect functions like `sum` or `mean` to work on the list as a whole. Use `lapply` to summarize particular list elements.
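
A short sketch of what goes wrong, and of the `lapply` alternative. The list `claims` is hypothetical, and `tryCatch` is used here only so that the chunk keeps running after the error.

```{r}
claims <- list(small = c(100, 250), large = c(5000, 12000, 8000))

# sum() refuses a list outright; capture the error message rather than stopping
sumMessage <- tryCatch(sum(claims), error = function(e) conditionMessage(e))
sumMessage

# Summarize each element instead
lapply(claims, sum)    # small: 350, large: 25000

# Or pull out one element and summarize it directly
mean(claims$large)
```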
## Exercises
* Create a list with two elements. Have the first element be a vector with 100 numbers. Have the second element be a vector with 100 dates. Give your list the names: "Claim" and "AccidentDate".
* What is the average value of a claim?
## Answers
```{r }
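myList <- list()
# 100 simulated claim amounts from a lognormal distribution with median 10,000
myList$Claim <- rlnorm(100, log(10000))
# 100 dates sampled from a sequence spanning 2000 through 2009
myList$AccidentDate <- sample(seq.Date(as.Date('2000-01-01'), as.Date('2009-12-31'), length.out = 1000), 100)
# Average value of a claim
mean(myList$Claim)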
```