geom_text_repel when using size aesthetic #14

Closed
ghost opened this issue Jan 14, 2016 · 12 comments

Comments

@ghost

ghost commented Jan 14, 2016

When a variable is mapped to the size aesthetic, points in a scatter plot can become quite large. If the labels do not need to be repelled, they may overlap their own points.
With geom_text I used hjust/vjust to correct that. What can I do now?
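For readers unfamiliar with the older workaround mentioned here, the following is a minimal sketch of shifting labels off their points with `hjust`/`vjust` in plain `geom_text`. The data (`mtcars`) and the specific justification values are assumptions chosen for illustration; they are not taken from the reporter's plot.

```r
library(ggplot2)

# Illustration data only (assumption): mtcars ships with R.
dat <- mtcars
dat$car <- rownames(dat)

p <- ggplot(dat, aes(wt, mpg, label = car)) +
  geom_point(aes(size = cyl), alpha = 0.6) +
  # vjust = 0 would rest the bottom of the text on the point;
  # a negative value pushes the label further above it.
  # hjust = 0 left-aligns the text at the point's x position.
  geom_text(vjust = -1, hjust = 0, size = 3)

print(p)
```

The offset is the same for every label, so it cannot adapt to points of different sizes, which is the limitation this issue is about.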

@slowkow
Owner

slowkow commented Jan 14, 2016

Thanks for the comment! Please post a minimal code example and an image of the plot.

@ghost
Author

ghost commented Jan 14, 2016

Not exactly minimal, but handy.

library(ggrepel)

# Note: my.data is defined further below; run that definition first.

ggplot(my.data, aes(V2,V5,label=V10)) +
    geom_text_repel(data=my.data[my.data$V4=="yes",], size=3.0, alpha = 0.85, na.rm = TRUE) +
    geom_point(aes(color = V4, size = V3)) +
    scale_colour_manual(values=c("no" = rgb(145, 207, 96, max = 255), "yes" = rgb(215, 48, 39, max = 255)),labels = c("No", "Yes")) +
    scale_fill_manual(values=c(rgb(145, 207, 96, max = 255), rgb(215, 48, 39, max = 255)),labels = c("No", "Yes")) +
    stat_smooth(aes(colour = V4,fill =V4), method="lm", se=TRUE,alpha = 0.10, show.legend=FALSE) +
    xlab("X label") + ylab("Y label") +
    labs(size="Size legend: ", colour="Colour legend: ") +
    guides(fill="none",colour = guide_legend(override.aes = list(size=5))) +
    ggtitle('Main Title') +
    theme_bw() +
    theme(legend.position = "top",
          legend.key = element_rect(linetype = 0),
          legend.text = element_text(size = rel(1.0)),
          legend.title = element_text(size = rel(1.0), hjust = 0),
          plot.title = element_text(size = rel(1.05)),
          panel.margin = unit(c(1, 1, 2, 0.5), "lines"),
          axis.title.x = element_text(vjust = .5),
          axis.title.y = element_text(vjust = .8)
    )



my.data <-
    structure(
        list(
            V1 = c(
                "100", "101", "114", "133", "137", "146",
                "148", "158", "159", "160", "165", "170", "174", "192", "196",
                "215", "224", "245", "251", "264", "265", "278", "279", "280",
                "282", "283", "286", "287", "288", "289", "290", "293", "299",
                "311", "378", "392", "395", "404", "407", "455", "463", "465",
                "466", "469", "470", "472", "473", "474", "477", "480", "494",
                "498", "504", "510", "511", "514", "528", "529", "533", "534",
                "535", "540", "544", "608", "628", "740", "458", "512", "539",
                "541", "563", "579", "635", "700", "705", "715", "10004", "10005",
                "10012"
            ), V2 = c(
                44.6, 15.2, 16.3, 12.2, 36.7, 12.2, 12.2, 24.4,
                24.4, 22.3, 28.4, 18.3, 18.2, 28.3, 39.3, 10.1, 32.4, 10.1, 18.3,
                18.2, 36.5, 33.5, 16.3, 22.3, 24.4, 12.2, 24.3, 12.2, 18.3, 18.3,
                8.2, 20.3, 27.4, 18.2, 10.1, 17.3, 36.5, 10.2, 18.2, 4.1, 22.3,
                34.5, 22.3, 12.2, 48.7, 13.1, 24.4, 18.3, 18.3, 18.3, 27.4, 36.5,
                18.3, 30.5, 30.5, 18.3, 7.1, 11.2, 36.5, 30, 40.6, 18.4, 18.3,
                26.4, 12.2, 12.2, 48.7, 30.4, 24.4, 36.5, 24.3, 36.5, 32.5, 16.2,
                18.3, 24.4, 22.3, 30.4, 24.4
            ), V3 = c(
                110.83, 2.18, 4.43, 1.51,
                56.21, 1.9, 2.42, 12.55, 8.62, 5.48, 15.92, 14.47, 9.44, 35.02,
                38.04, 1.84, 29.69, 2.86, 10.09, 84.54, 326.2, 37.12, 105.24,
                21.03, 3.62, 4.15, 7.98, 0.87, 6.35, 5.95, 3.87, 41.96, 38.47,
                8.11, 1.2, 8.26, 1.49, 1.28, 12.08, 29.13, 42.4, 53.57, 48.02,
                16.99, 209.73, 17.03, 14.58, 4.54, 30.29, 2.64, 35.79, 0.99,
                7.02, 37.66, 150.54, 11.42, 2.93, 11.47, 71.82, 45.25, 178.76,
                59.35, 11.95, 17.18, 0.19, 3.11, 156.19, 79.05, 10.4, 86.2, 13.92,
                293.13, 374.24, 21.22, 2.05, 227.04, 25.03, 17.17, 57.23
            ), V4 = c(
                "no",
                "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no",
                "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no",
                "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no",
                "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no",
                "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no",
                "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "yes",
                "yes", "no", "yes", "yes", "yes", "yes", "yes", "yes", "yes",
                "yes", "yes", "yes"
            ), V5 = c(
                33.5, 4.1, 7.9, 5.1, 24, 12.2, 20.8,
                8.8, 14.2, 5, 8.5, 13.2, 3.6, 5.5, 15.3, 5.4, 50.9, 3.3, 16.2,
                71.2, 107.1, 39.5, 35.8, 14.1, 3.6, 3.5, 3.1, 14.2, 7.5, 10.2,
                3.1, 10.2, 27.3, 2, 3.9, 26.4, 2.7, 18.3, 3.6, 24.4, 50.2, 27.7,
                14.8, 9.7, 44.9, 3.7, 14.5, 3.6, 10.1, 18.7, 2.6, 36.5, 3.5,
                25.2, 14, 8.1, 2, 4, 43.6, 14.3, 44.6, 25.6, 9.8, 8.1, 4.1, 5.3,
                49.4, 29, 24.3, 6.3, 68.5, 21.3, 18.3, 14.9, 18, 10.6, 88.3,
                92.2, 95.2
            ), V6 = c(
                75, 27, 48, 42, 65, 100, 170, 36, 58, 22,
                30, 72, 20, 19, 39, 53, 157, 33, 89, 391, 293, 118, 220, 63,
                15, 29, 13, 116, 41, 56, 38, 50, 100, 11, 39, 153, 7, 179, 20,
                595, 225, 80, 66, 80, 92, 28, 59, 20, 55, 102, 9, 100, 19, 83,
                46, 44, 28, 36, 119, 48, 110, 139, 54, 31, 34, 43, 101, 95, 100,
                17, 282, 58, 56, 92, 98, 43, 396, 303, 390
            ), V7 = c(
                1997, 1998,
                1998, 1998, 1999, 1999, 1999, 1999, 2002, 2003, 1999, 2003, 1999,
                2001, 2001, 2000, 2004, 2000, 2001, 2002, 2002, 2005, 2003, 2003,
                2006, 2003, 2005, 2006, 2003, 2006, 2003, 2001, 2002, 2002, 2002,
                2007, 2003, 2003, 2004, 2004, 2007, 2006, 2007, 2006, 2007, 2006,
                2006, 2006, 2006, 2006, 2006, 2005, 2006, 2006, 2007, 2006, 2006,
                2007, 2006, 2006, 2006, 2007, 2008, 2009, 2009, 2013, 2007, 2012,
                2007, 2013, 2009, 2013, 2013, 2013, 2013, 2014, 2007, 2007, 2006
            ), V8 = c(
                2.48497757847534, 0.143421052631579, 0.271779141104294,
                0.123770491803279, 1.53160762942779, 0.155737704918033, 0.198360655737705,
                0.514344262295082, 0.35327868852459, 0.245739910313901, 0.56056338028169,
                0.790710382513661, 0.518681318681319, 1.23745583038869, 0.96793893129771,
                0.182178217821782, 0.916358024691358, 0.283168316831683, 0.551366120218579,
                4.64505494505495, 8.93698630136986, 1.10805970149254, 6.45644171779141,
                0.94304932735426, 0.148360655737705, 0.34016393442623, 0.328395061728395,
                0.0713114754098361, 0.346994535519126, 0.325136612021858, 0.471951219512195,
                2.06699507389163, 1.40401459854015, 0.445604395604396, 0.118811881188119,
                0.477456647398844, 0.0408219178082192, 0.125490196078431, 0.663736263736264,
                7.10487804878049, 1.90134529147982, 1.55275362318841, 2.15336322869955,
                1.39262295081967, 4.30657084188912, 1.3, 0.597540983606557, 0.248087431693989,
                1.6551912568306, 0.144262295081967, 1.30620437956204, 0.0271232876712329,
                0.383606557377049, 1.23475409836066, 4.93573770491803, 0.624043715846994,
                0.412676056338028, 1.02410714285714, 1.96767123287671, 1.50833333333333,
                4.40295566502463, 3.22554347826087, 0.653005464480874, 0.650757575757576,
                0.0155737704918033, 0.254918032786885, 3.20718685831622, 2.60032894736842,
                0.426229508196721, 2.36164383561644, 0.57283950617284, 8.03095890410959,
                11.5150769230769, 1.30987654320988, 0.112021857923497, 9.30491803278689,
                1.12242152466368, 0.564802631578948, 2.34549180327869
            ), V9 = c(
                0.751121076233184,
                0.269736842105263, 0.484662576687117, 0.418032786885246, 0.653950953678474,
                1, 1.70491803278689, 0.360655737704918, 0.581967213114754, 0.224215246636771,
                0.299295774647887, 0.721311475409836, 0.197802197802198, 0.19434628975265,
                0.389312977099237, 0.534653465346535, 1.57098765432099, 0.326732673267327,
                0.885245901639344, 3.91208791208791, 2.93424657534247, 1.17910447761194,
                2.19631901840491, 0.632286995515695, 0.147540983606557, 0.286885245901639,
                0.127572016460905, 1.16393442622951, 0.409836065573771, 0.557377049180328,
                0.378048780487805, 0.502463054187192, 0.996350364963504, 0.10989010989011,
                0.386138613861386, 1.52601156069364, 0.073972602739726, 1.79411764705882,
                0.197802197802198, 5.95121951219512, 2.25112107623318, 0.802898550724638,
                0.663677130044843, 0.795081967213115, 0.921971252566735, 0.282442748091603,
                0.594262295081967, 0.19672131147541, 0.551912568306011, 1.02185792349727,
                0.0948905109489051, 1, 0.191256830601093, 0.826229508196721,
                0.459016393442623, 0.442622950819672, 0.28169014084507, 0.357142857142857,
                1.19452054794521, 0.476666666666667, 1.09852216748768, 1.39130434782609,
                0.53551912568306, 0.306818181818182, 0.336065573770492, 0.434426229508197,
                1.01437371663244, 0.953947368421053, 0.995901639344262, 0.172602739726027,
                2.81893004115226, 0.583561643835616, 0.563076923076923, 0.919753086419753,
                0.983606557377049, 0.434426229508197, 3.95964125560538, 3.03289473684211,
                3.9016393442623
            ), V10 = c(
                "", "", "", "", "", "", "", "", "",
                "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
                "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
                "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
                "", "", "", "", "", "", "", "", "", "458", "512", "", "541",
                "563", "579", "635", "700", "705", "715", "10004", "10005", "10012"
            ), V11 = c(
                "100", "101", "114", "133", "137", "146", "148", "158",
                "159", "160", "165", "170", "174", "192", "196", "215", "224",
                "245", "251", "264", "265", "278", "279", "280", "282", "283",
                "286", "287", "288", "289", "290", "293", "299", "311", "378",
                "392", "395", "404", "407", "455", "463", "465", "466", "469",
                "470", "472", "473", "474", "477", "480", "494", "498", "504",
                "510", "511", "514", "528", "529", "533", "534", "535", "540",
                "544", "608", "628", "740", "458", "512", "539", "541", "563",
                "579", "635", "700", "705", "715", "10004", "10005", "10012"
            )
        ), .Names = c("V1",
                      "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10", "V11"), row.names = c(
                          1L,
                          2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
                          16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
                          29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L,
                          42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L,
                          55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L,
                          68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 86L, 87L, 88L
                      ), class = "data.frame"
    )

[image: rplot]

@slowkow
Owner

slowkow commented Jan 14, 2016

Thanks for sharing the interesting use case! In the future, I'll consider adding back hjust and vjust, or maybe nudge_x and nudge_y. I think it might be useful for situations like this and others.

Since I deleted those parameters, try using box.padding = unit(0.75, "lines") like this:

library(ggrepel)

set.seed(423)
ggplot(my.data, aes(V2,V5,label=V10)) +
  geom_point(aes(color = V4, size = V3)) +
  scale_colour_manual(values=c("no" = rgb(145, 207, 96, max = 255), "yes" = rgb(215, 48, 39, max = 255)),labels = c("No", "Yes")) +
  scale_fill_manual(values=c(rgb(145, 207, 96, max = 255), rgb(215, 48, 39, max = 255)),labels = c("No", "Yes")) +
  stat_smooth(aes(colour = V4,fill =V4), method="lm", se=TRUE,alpha = 0.10, show.legend=FALSE) +
  geom_text_repel(
    data=my.data[my.data$V4=="yes",],
    size=3.0, alpha = 0.85, na.rm = TRUE,
    box.padding = unit(0.75, "lines")
  ) +
  xlab("X label") + ylab("Y label") +
  labs(size="Size legend: ", colour="Colour legend: ") +
  guides(fill="none",colour = guide_legend(override.aes = list(size=5))) +
  ggtitle('Main Title') +
  theme_bw() +
  theme(legend.position = "top",
        legend.key = element_rect(linetype = 0),
        legend.text = element_text(size = rel(1.0)),
        legend.title = element_text(size = rel(1.0), hjust = 0),
        plot.title = element_text(size = rel(1.05)),
        panel.margin = unit(c(1, 1, 2, 0.5), "lines"),
        axis.title.x = element_text(vjust = .5),
        axis.title.y = element_text(vjust = .8)
  )

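[image: image]
https://cloud.githubusercontent.com/assets/209714/12327795/5a846cda-baa6-11e5-9e90-5f2989970dc9.png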

@ghost
Author

ghost commented Jan 14, 2016

It looks OK with box.padding.
You can close the issue.

Thank you very much.

@slowkow closed this as completed Jan 14, 2016

@glomek

glomek commented May 5, 2020

Hi, I'm afraid this issue should not be closed, as box.padding does not do the job. When neighbouring points differ substantially in size, the label for a small point x might be visually closer to a large point y. When I set a large box.padding to keep the labels of large points from overlapping their points, the labels of small points land really far from them, which doesn't look good.
Example:
library(dplyr)    # provides tibble() and %>%
library(ggplot2)
library(ggrepel)

tibble(
  x = c(1, 2, 3, 1, 2, 3),
  y = c(1, 1, 1, 2, 2, 2),
  size = c(1, 5, 30, 1000, 5, 500),
  label = 'this is quite \n long label'
) %>%
  ggplot() +
  geom_point(aes(x, y, size = size), alpha = 0.5) +
  geom_text_repel(aes(x, y, label = label), box.padding = 1) +
  scale_size(range = c(5, 30)) +
  xlim(0, 5) + ylim(0, 3) +
  theme_bw() +
  theme(legend.position = "none")

[image: Rplot02]

@slowkow
Owner

slowkow commented May 6, 2020

@glomek

Could I please ask if you might be able to try installing the development version of ggrepel from Github and then running this example?

devtools::install_github("slowkow/ggrepel")

I copied the example code here:

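library(ggplot2)
library(ggrepel)

# Palette function: maps values rescaled to [0, 1] onto a range of point sizes
my_pal <- function(range = c(1, 6)) {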
  force(range)
  function(x) scales::rescale(x, to = range, from = c(0, 1))
}

dat <- mtcars
dat$car <- rownames(dat)

ggplot(dat, aes(wt, mpg, label = car)) +
  geom_point(aes(size = cyl), alpha = 0.6) + # data point size
  continuous_scale(
    aesthetics = c("size", "point.size"), scale_name = "size",
    palette = my_pal(c(2, 15)),
    guide = guide_legend(override.aes = list(label = "")) # hide "a" in legend
  ) +
  geom_text_repel(
    aes(point.size = cyl), # data point size
    size = 5, # font size in the text labels
    point.padding = 0, # additional padding around each point
    min.segment.length = 0, # draw all line segments
    max.time = 1, max.iter = 1e5, # stop after 1 second, or after 100,000 iterations
    box.padding = 0.3 # additional padding around each text label
  ) +
  theme(legend.position = "right")

@glomek

glomek commented May 7, 2020

It surely looks much better, but it is not perfect.
Code after small modifications:

my_pal <- function(range = c(1, 6)) {
  force(range)
  function(x) scales::rescale(x, to = range, from = c(0, 1))
}

dat <- mtcars
dat$car <- rownames(dat)

ggplot(dat, aes(wt, mpg, label = car)) +
  geom_point(aes(size = cyl^2), alpha = 0.6) + # data point size
  continuous_scale(
    aesthetics = c("size", "point.size"), scale_name = "size",
    palette = my_pal(c(3,15)),
    guide = guide_legend(override.aes = list(label = "")) # hide "a" in legend
  ) +
  geom_text_repel(
    aes(point.size = cyl^2), # data point size
    size = 4, # font size in the text labels
    point.padding = 0, # additional padding around each point
    min.segment.length = 0, # draw all line segments
    max.time = 1, max.iter = 1e5, # stop after 1 second, or after 100,000 iterations
    box.padding = 0.3 # additional padding around each text label
  ) +
  theme(legend.position = "none")

There are still some overlaps. It looks like each label is 'aware' of the size of its own point, but ignores that neighbouring points may also be oversized.
I've also noticed a problem with adjusting the position of multiline labels, but that is probably a separate issue.

[image: plot]

@z3tt

z3tt commented Aug 4, 2021

@slowkow, this is awesome, thank you so much! It worked perfectly for my use case for several plots with varying bubble sizes.

Any plan to include this as an argument in geom_text|label_repel()?

@slowkow
Owner

slowkow commented Aug 4, 2021

@z3tt Sorry, but I don't understand. Could I please ask if you might elaborate?

@z3tt

z3tt commented Aug 6, 2021

@slowkow Sorry for being imprecise. I used your dynamic scaling of the range to label bubbles where a variable is mapped to size.
As far as I know, there is no option to make this work without adding the my_pal() and continuous_scale() manually and I wondered if it could be implemented in the package as well as an argument, e.g. the suggested point.size aesthetic, distance = "dynamic" or whatever looks attractive to you. Hope this makes sense now.

@slowkow
Owner

slowkow commented Aug 6, 2021

@z3tt Unfortunately, I am not aware of any approach to simplify the dynamic point size functionality in the above example 🙁

The challenge is that we need two independent layers (geom_point and geom_text_repel) to "see" the same point size. The best solution I could imagine is shown in the example above. We need to define the point size outside of the two layers (with my_pal()), and then tell each of the layers to use that definition (with continuous_scale()).
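To make the role of my_pal() more concrete, here is a minimal sketch of the arithmetic it performs, outside of any plot. It assumes (as I understand ggplot2's continuous scales) that the scale first rescales the raw values to [0, 1] over its limits and then hands them to the palette; the `cyl` values and variable names below are illustrative only.

```r
library(scales)

# Same helper as in the example above: returns a palette function that maps
# values in [0, 1] onto a range of point sizes.
my_pal <- function(range = c(1, 6)) {
  force(range)
  function(x) rescale(x, to = range, from = c(0, 1))
}

pal <- my_pal(c(2, 15))

# Step 1 (assumed scale behaviour): raw values are rescaled to [0, 1]
# over the range of the data.
cyl <- c(4, 6, 8)
unit_values <- rescale(cyl, to = c(0, 1), from = range(cyl))

# Step 2: the palette converts those values into point sizes,
# here 2, 8.5 and 15.
point_sizes <- pal(unit_values)
print(point_sizes)
```

Because one scale is registered for both the size and point.size aesthetics, both layers receive the same computed sizes, which is what lets geom_text_repel account for the drawn size of each point.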

I realize that this example is going to be difficult for newcomers to ggplot2 to fully understand. In my experience, I need to re-read the code each time I want to understand it, and I need to copy-paste it to get it right. This makes me think it may be worthwhile to write up a blog post or a new documentation article that tries to clarify each of the steps.

I wish we could change the ggplot2 design a little bit, so that each layer can "see" the previous layers somehow... but it's not obvious to me how to implement this or whether this kind of feature would be useful for other applications.

It is possible that I have overlooked a simpler way to implement dynamic point size repulsion... maybe someone can develop a simpler code example? Many users seem interested in this feature, so improvements would be welcome.

There's a bit of a rabbit hole if you want to keep reading about this theme:

@z3tt

z3tt commented Aug 7, 2021

I am definitely not a newcomer to ggplot2; however, I will always need to look this up, given that I don't need it very often ^^
Thanks for the feedback, I understand now that it is problematic/impossible with the current ggplot2 layer system.
