Init an article about wrapping glue #338

Merged
merged 7 commits into from
Aug 30, 2024
Changes from 1 commit
151 changes: 86 additions & 65 deletions vignettes/wrappers.Rmd
@@ -25,7 +25,7 @@ Therefore, you'd prefer to use `<<` and `>>` as the opening and closing delimite
Spoiler alert: here's the correct way to write such a wrapper:

```{r}
myglue <- function(..., .envir = parent.frame()) {
my_glue <- function(..., .envir = parent.frame()) {
Member

Might be worth a footnote pointing out that this is the same pattern you use in abort()/cli_abort() wrappers? And in defer()?

Member Author

I was thinking the same! I added a sentence in the intro. I think it's worthy to point this out.

glue(..., .open = "<<", .close = ">>", .envir = .envir)
}
```
@@ -36,45 +36,71 @@ This is the key move:

If you'd like to understand why this is the way, keep reading.

## A first attempt that does not work
## Working example

Here's a simple attempt at writing a wrapper around `glue()`:
Here's an abbreviated excerpt of the roxygen comment that generates the documentation for the starwars dataset in dplyr:

```r
#' \describe{
Member

This example is much better!

#' \item{name}{Name of the character}
#' \item{height}{Height (cm)}
#' \item{mass}{Weight (kg)}
#' \item{species}{Name of species}
#' \item{films}{List of films the character appeared in}
#' }
```

To produce such text programmatically, the first step might be to generate the `\item` lines from a named list of column names and descriptions.
Notice that `{` and `}` are important to the `\describe{...}` syntax, so this is an example where it is nice for glue to use different delimiters for expressions.

Put the metadata in a suitable list:

```{r}
myglue0 <- function(...) {
glue(..., .open = "@@", .close = "~~")
}
sw_meta <- list(
name = "Name of the character",
height = "Height (cm)",
mass = "Weight (kg)",
species = "Name of species",
films = "List of films the character appeared in"
)
```

From superficial experimentation, `myglue0()` appears to work:
Define a custom glue wrapper and use it inside another helper that generates `\item` entries[^1]:

[^1]: We're using `<@` and `@>` as the delimiters because this vignette is authored using R Markdown and knitr. The delimiters `<<` and `>>` actually have special meaning in knitr, so it was easiest to use something else.
jennybc marked this conversation as resolved.

```{r}
fn_def <- "
@@NAME~~ <- function(x) {
@@BODY~~
}"
my_glue <- function(..., .envir = parent.frame()) {
glue(..., .open = "<@", .close = "@>", .envir = .envir)
}

myglue0(fn_def, NAME = "one_plus_one", BODY = "1 + 1")
named_list_to_items <- function(x) {
my_glue("\\item{<@names(x)@>}{<@x@>}")
Member

Now that I think of it, is there a reason to not do just < and >?

Member Author

That would work in this exact example, but I don't want to model that in a vignette. I know some folks will copy exactly what they see here and I think < and > are too likely to collide with text in the template that should not get evaluated. Generally speaking.

Member

What about (( or [[? I'm just thinking that <@ looks hard to type.

Member Author

Well all of those come up a lot in real R code (like, the code you might want to evaluate) so again I shied away from them. It really sucks that you can't use <<. You can't even triple them <<< because that also matches knitr's regex (unintentionally, I think).

}
```

Apply `named_list_to_items()` to starwars metadata:

```{r}
named_list_to_items(sw_meta)
```
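As a possible next step, the generated `\item` lines could be collapsed and wrapped in the surrounding `\describe{...}` block. The sketch below is only an illustration: `items_to_describe()` is a hypothetical helper that is not part of the vignette, and the metadata list is abbreviated to two entries. It assumes the glue package is installed and relies on `glue_collapse()` to join the lines.

```r
library(glue)

# Wrapper that forwards its caller's environment to glue()
my_glue <- function(..., .envir = parent.frame()) {
  glue(..., .open = "<@", .close = "@>", .envir = .envir)
}

# One \item{name}{description} string per list element
named_list_to_items <- function(x) {
  my_glue("\\item{<@names(x)@>}{<@x@>}")
}

# Hypothetical helper: join the items and wrap them in \describe{...}
items_to_describe <- function(x) {
  items <- glue_collapse(named_list_to_items(x), sep = "\n")
  my_glue("\\describe{\n<@items@>\n}")
}

# Abbreviated metadata, for illustration only
sw_meta_short <- list(
  name = "Name of the character",
  height = "Height (cm)"
)

items_to_describe(sw_meta_short)
```

Because `my_glue()` forwards `.envir`, both helpers can refer to their own local variables (`x` and `items`) inside the template.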

However, slightly more sophisticated and realistic usage of `myglue0()` reveals a big problem.
Here we use `myglue0()` inside another function, `fn_builder0()`:
Here's how this would fail if we did *not* handle `.envir` correctly in our wrapper function:

```{r, error = TRUE}
fn_builder0 <- function(NAME, BODY) {
fn_def <- "
@@NAME~~ <- function(x) {
@@BODY~~
}"
myglue0(fn_def)
my_glue_WRONG <- function(...) {
glue(..., .open = "<@", .close = "@>")
}

named_list_to_items_WRONG <- function(x) {
my_glue_WRONG("\\item{<@names(x)@>}{<@x@>}")
}

fn_builder0("two_times_two", "2 * 2")
named_list_to_items_WRONG(sw_meta)
```

What do you mean `NAME` is not found?!?
It's one of the arguments of `fn_builder0()`.
It should be "two_times_two".
It can be hard to understand why `x` can't be found, when it is clearly available inside `named_list_to_items_WRONG()`.
Why isn't `x` available to `my_glue_WRONG()`?

## Where does `glue()` evaluate code?

@@ -87,65 +113,60 @@ glue(..., .envir = parent.frame(), ...)

The expressions inside a glue string are evaluated with respect to `.envir`, which defaults to the environment where `glue()` is called from.
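To make the role of `.envir` concrete, here is a minimal sketch, assuming the glue package is installed. The names `report_x()` and `e` are made up for this illustration. The first call relies on the default `.envir`, and the second supplies an environment explicitly.

```r
library(glue)

x <- "global"

# glue() is called inside report_x(), so the default .envir is
# report_x()'s execution environment, where x is "local"
report_x <- function() {
  x <- "local"
  glue("x is {x}")
}
report_x()

# Supplying .envir explicitly makes glue() look up x in that environment
e <- new.env()
e$x <- "from e"
glue("x is {x}", .envir = e)
```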

Let's make new artificial versions of our functions that make it easy to tell where the inner `glue()` call is getting its values.
Everything is simple when evaluating `glue()` in the global environment.
jennybc marked this conversation as resolved.

```{r}
myglue1 <- function(...) {
NAME <- "myglue_execution_env"
glue(..., .open = "@@", .close = "~~")
}
x <- 0
y <- 0
z <- 0

fn_builder1 <- function(NAME, BODY) {
fn_def <- "
@@NAME~~ <- function(x) {
@@BODY~~
}"
myglue1(fn_def)
}
glue("{x} {y} {z}")
```

Let's also strategically define `BODY` in the global environment:

```{r}
BODY <- "global_env"
```
Now we wrap `glue()` in our own simple function, `myglue1()`.
Notice that `myglue1()` doesn't capture its caller environment and pass that along.

Now let's call our function builder and observe which values are being used for `NAME` and `BODY`:
When we execute `myglue1()` in the global environment, there's no obvious problem.

```{r}
fn_builder1(NAME = "user_NAME", BODY = "user_BODY")
```
myglue1 <- function(...) {
x <- 1
glue(...)
}

Neither `NAME` nor `BODY` is getting the values that our user provided in the call to `fn_builder1()`!
myglue1("{x} {y} {z}")
```

That's because the innermost call to `glue()` is looking in these places, in this order, for `NAME` and `BODY`:
However, if we call our `myglue1()` inside another function, we see that all is not well.

1. The ephemeral execution environment of our glue wrapper, `myglue1()`. `NAME` has the value "myglue_execution_env" there, which explains part of our result.
2. The environment where `myglue1()` is defined, which is the global environment. `BODY` has the value "global_env" there and that explains the rest of our result.
```{r}
myglue2 <- function(...) {
x <- 2
y <- 2
myglue1(...)
}

Note that the execution environment of `fn_builder1()`, which holds the `NAME` and `BODY` specified by the user, is not consulted at all.
This is obviously very bad.
myglue2("{x} {y} {z}")
```

## A wrapper that works
Why do `x` and `y` not have the value 2?
Because `myglue1()` and its eventual call to `glue()` have no access to the execution environment of `myglue2()`.
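The same distinction can be seen in base R alone, without glue. The sketch below uses made-up function names (`show_x_lexical()`, `show_x_caller()`, `outer_fn()`) to contrast ordinary lexical lookup, which starts where a function was defined, with lookup in the caller's environment obtained from `parent.frame()`.

```r
x <- "global"

# Ordinary lexical scoping: x is looked up where the function was defined
show_x_lexical <- function() x

# Lookup in the caller's environment, captured via parent.frame()
show_x_caller <- function(env = parent.frame()) get("x", envir = env)

outer_fn <- function() {
  x <- "outer"
  c(lexical = show_x_lexical(), caller = show_x_caller())
}

outer_fn()
```

Only `show_x_caller()` sees the `x` defined inside `outer_fn()`. A wrapper that does not forward its caller environment behaves like `show_x_lexical()`.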

We fix our `glue()` wrapper by capturing its caller environment and passing that along to `glue()` to use for evaluation:
If you want your glue wrapper to behave like `glue()` itself and to work as expected inside other functions, make sure it captures its caller environment and passes that along to `glue()`.

```{r}
myglue <- function(..., .envir = parent.frame()) {
glue(..., .open = "@@", .close = "~~", .envir = .envir)
myglue3 <- function(..., .envir = parent.frame()) {
x <- 3
glue(..., .envir = .envir)
}

fn_builder <- function(NAME, BODY) {
fn_def <- "
@@NAME~~ <- function(x) {
@@BODY~~
}"
myglue(fn_def)
}
```
myglue3("{x} {y} {z}")

Now our function builder works as intended:
myglue4 <- function(...) {
x <- 4
y <- 4
myglue3(...)
}

```{r}
fn_builder(NAME = "one_plus_one", BODY = "1 + 1")
myglue4("{x} {y} {z}")
```