generated from jtr13/cctemplate
-
Notifications
You must be signed in to change notification settings - Fork 139
/
Treemapify_Tutorial_CC.Rmd
135 lines (115 loc) · 6.49 KB
/
Treemapify_Tutorial_CC.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
# ggplot2_treemapify
```{r, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Download treemapify
In August 2017, R language has added new geometric objects supporting ggplot2 dendrograms. From then, there is no need to resort to the assistance of third-party packages. And it can be downloaded either using "install.package("treemapify")" or "devtools::install_github("wilkox/treemapify")".
After downloading this package, we will have an additional functon in our ggplot2 which is named "geom_treemap()".
```{r}
## Load packages needed for this instruction
library("ggplot2")
library("treemapify")
library("RColorBrewer")
```
## Basic Visualization of Treemap
Before we give example using treemapify package to draw treemap figure, we can first get to know this kind of figures:
In the tree diagram, the numerical variables are converted into rectangular area sizes, and the types of variables are distinguished by labels.
```{r}
example <- data.frame(value<-c(100, 23, 1300, 1083, 260, 17, 30),
kind<-c('A','B','C','D','E','F','G'))
example$kind <- as.factor(example$kind)
ggplot(example, aes(area=value,fill=kind,label=kind)) +
geom_treemap()+
geom_treemap_text(place='center')
```
We can also use the treemap() function in the treemap() package.
```{r}
library(treemap)
group=c(rep("group-1", 3), rep("group-2",2), rep("group-3",4))
subgroup=paste("subgroup" , c(1,2,3,1,2,1,2,3,4), sep="-")
value=c(10, 6, 7, 9, 10, 7, 2, 2,20)
data=data.frame(group,subgroup,value)
# treemap
treemap(data,
index=c("group","subgroup"), #sub plots two groups
vSize="value", # Sizes are allocated according to the numeric variable
type="index") # Color according to the classification
```
In this function, we can set color palette from the RColorBrewer presets or make our own
```{r}
treemap(data, index=c("group","subgroup"), vSize="value",
type="index",
palette = "Set3",
title="TreemapExample")
```
## Visualization using treemapify package
We will have an example using the data set G20 included in treemapfiy package. We can first have a preview of this data set:
```{r}
str(G20)
head(G20)
```
This data set describes the economic indicators of the 20 countries participating in the summit. It contains five fields, namely the global region (region), country name (country), and GDP indicator (gdp_mil_usd) (should be some kind of secondary calculation Indicator), human development index (hdi), already economic development level (econ_classification).
The tree diagram is a special type of graph without an explicit coordinate system. Relying on the square algorithm, the sample population square is divided into a single rectangular box according to the actual observation value accounted for in the population. Therefore, it needs at least one numeric variable as an input parameter.
### Plot a simple Treemap figure
```{r}
ggplot(G20, aes(area = gdp_mil_usd)) +
geom_treemap()
```
Because area only defines the square size of a numeric variable, the fill color can be defined separately. But color can often be used alone as a way of expressing a numerical metric.
```{r}
## Draw the plot with simple blue color
ggplot(G20, aes(area = gdp_mil_usd)) +
geom_treemap(fill="light blue")
## Draw the plot using scale_fill_distiller with palette
ggplot(G20, aes(area = gdp_mil_usd, fill = hdi)) +
geom_treemap()+
scale_fill_distiller(palette="Blues")
```
### Add labels onto the treemap
The author of the package wrote an optimized text label function geom_treemap_text for the ggplot treemap (Because treemap exceeds the category of the three traditional coordinate systems. There is no explicit coordinate system. The algorithm is relatively special and one cannot use Geom_text() to add tags).
The **place** parameter controls the position of the label in each box relative to the surroundings, and grow (which is set default to be True) controls whether the label is adaptive to the box size (largely scaled up and down)
```{r}
ggplot(G20, aes(area = gdp_mil_usd, fill = hdi, label = country)) +
geom_treemap() +
geom_treemap_text(colour = "Blue",
grow = TRUE, # default value
place = "center") +
scale_fill_distiller(palette="Blues")
```
### Subgroups
This package supports sub-groups, which is extensively applied in practical application scenarios. For example, while observing the size of national indicators, we also want to obtain the overall indicators of the region to which the country belongs. By adding sub-groups, We can obtain information in two dimensions. By setting the subgroup parameter (a categorical variable) in the aesthetic mapping, the function can automatically complete the variable aggregation calculation of the subgroup within the function, and use a frame to show the size of the subcategory in the graph formation.
The **reflow** parameter is used to control whether the label adapts to the size of the rectangular block. If the original size exceeds the rectangular block, it will be automatically displayed in a new line.
```{r}
ggplot(G20, aes(area = gdp_mil_usd,
fill = hdi,
label = country,
subgroup = region)) +
geom_treemap() +
geom_treemap_subgroup_border() +
geom_treemap_subgroup_text(place = "center",
alpha = 0.3,
colour ="black",
min.size = 0) +
geom_treemap_text(colour = "Blue",
place = "topleft",
reflow = TRUE)+
scale_fill_distiller(palette="Blues")
```
When you feel that using sub-group cannot achieve a good visualization, geom_treemap also supports the facet_grid function in ggplot2. This is the benefit of all ggplot2 extension functions and can inherit the advanced graphics properties derived from ggplot2.
```{r}
ggplot(G20, aes(area = gdp_mil_usd,
fill = region,
label = country)) +
geom_treemap() +
geom_treemap_text(colour = "black") +
facet_wrap( ~ econ_classification) +
scale_fill_brewer(palette="Blues")+
labs(title = "The G-20 major economies by economic classification",
caption = "The area of each country is proportional to its relative GDP within the economic group (advanced or developing)",
fill = "Region" ) +
theme(legend.position = "bottom",
plot.caption=element_text(hjust=0))
```
## Source
1. https://cran.r-project.org/web/packages/treemapify/vignettes/introduction-to-treemapify.html\
2. https://github.com/wilkox/treemapify