-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.qmd
700 lines (462 loc) · 22.2 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
---
title: "Test that! No literally, testthat..."
author: "Cam Race"
format:
revealjs:
logo: logo.png
smaller: true
incremental: true
footer: Automated pass / fail QA in R, November 2024
theme: white
class: smaller
fig-cap-location: top
layout-valign: center
css: "theme.css"
---
## What I hope to cover
- A quick overview of what is possible when using testthat for quick pass / fail checks in
- R code (e.g. in analysis or pipelines)
- R Shiny applications
- R packages
- Show a number of examples of testing with R code that
- R users can steal, adapt, and reuse
- Non R users or senior managers can use as inspiration when directing their teams quality assurance (QA)
- Show some starting points for you to get going with
## Disclaimer: I am not going to
- Show you through one specific project
- Instead, I'll compensate by scatter-gunning some examples in the hope they provide useful inspiration
- Teach you to QA
- There are many aspects of QA that require analytical skills, experience and insight to assess and make judgement calls, you can't do all QA with just pass / fail tests
- Lots of QA is specific to the analysis you're doing
- Teach you how to build R Shiny apps or R packages
- I will skim over some of the details of these to focus on the testing
- Both have had multiple [coffee and coding sessions](https://educationgovuk.sharepoint.com/sites/sarpi/g/AC/Coffee%20and%20Coding.aspx?xsdata=MDV8MDJ8fGVmYmNjMzE0N2ExNjRhNTg4NTQ1MDhkZDA5NGQzY2RifGZhZDI3N2M5YzYwYTRkYTFiNWYzYjNiOGIzNGE4MmY5fDB8MHw2Mzg2NzY5NDk0OTQ2Mjk3MDF8VW5rbm93bnxWR1ZoYlhOVFpXTjFjbWwwZVZObGNuWnBZMlY4ZXlKV0lqb2lNQzR3TGpBd01EQWlMQ0pRSWpvaVYybHVNeklpTENKQlRpSTZJazkwYUdWeUlpd2lWMVFpT2pFeGZRPT18MXxMMk5vWVhSekx6RTVPakF3T1RGa05EVmxMVEl6TTJFdE5EUTFNeTFoWkdWaUxUUTVObVF5WlRBd09XWmpZMTh6WWpKaU1qSTNNeTFoWmpJM0xUUTVZMk10WVRFMk15MHpaakJrWlRGaU1ESXlaV1pBZFc1eExtZGliQzV6Y0dGalpYTXZiV1Z6YzJGblpYTXZNVGN6TWpBNU9ERTBPRE14TXc9PXw3OWI5ZTBkOGY5YTc0OWVmMjUxYzA4ZGQwOTRkM2NkOXw0ZGQzYzYyMGVjMTc0YTZlYWVmMDY0NDhjZTNmNmM1ZA%3D%3D&sdata=U2t3UjEwVHBYY081NkZHMmk2ZzFEUlZ6SGtVUjdtb2NWSlZ0YmJtR2NqWT0%3D&ovuser=fad277c9-c60a-4da1-b5f3-b3b8b34a82f9%2CCameron.RACE%40EDUCATION.GOV.UK&OR=Teams-HL&CT=1732098230181&clickparams=eyJBcHBOYW1lIjoiVGVhbXMtRGVza3RvcCIsIkFwcFZlcnNpb24iOiI0OS8yNDEwMjAwMTMxOCIsIkhhc0ZlZGVyYXRlZFVzZXIiOmZhbHNlfQ%3D%3D) in their own right - have a watch back of them
- Have a look at the [Analysts' Guide R Shiny guidance](https://dfe-analytical-services.github.io/analysts-guide/writing-visualising/dashboards_rshiny.html#learning-r-shiny)
- For R packages look no further than [R packages by Hadley Wickham and Jennifer Bryan](https://r-pkgs.org/)
## If you want to get more confident using R
- See our [unit's R workshop](https://dfe-analytical-services.github.io/analysts-guide/learning-development/learning-support.html#technical-workshops) offer
- OR some of the many, [great free online resources for getting started with R](https://dfe-analytical-services.github.io/analysts-guide/learning-development/r.html)
- Remember, the best coders don't have lots of unique knowledge, they're just usually the most willing to search for the answers to problems themselves instead of relying on others, so get stuck in yourself and get curious!
## Basic example of what we're going to look at
Many of you who've written R code before will have come across something like this:
```{r echo=TRUE}
my_table <- as.data.frame("")
if(nrow(my_table) == 0){
stop("There are no rows in the table, this can't be right!")
}
```
- This is an example of pass / fail automated QA!
- It checks for a condition and then if it meets a criteria it stops the code and tries to give a semi-informative error
- I'm going to show how you can use testthat for this
- and then how you can expand on QA in Shiny and R packages
## Why use automated pass / fail checks as a part of QA?
- They are incredibly...
- fast
- reliable
- easy to understand
- They are the best way to start including simple automated checks into your code and data with minimal extra time or overhead to your processes
<div>
![Crafting and pun credit: Mark Pearson](img/qa-lity_street.png){width="25%"}
</div>
## Working as a part of a team?
- Adding in quick automated pass fail checks will
- Make it easier for everyone to run the same code reliably
- Make it easier for everyone to understand what the code is supposed to be doing
- Cut down the time when investigating issues
- Gives assurance when reviewing each others code that you're making the same assumptions
- All because it clearly and simply documents and tests your assumptions
## Working as a part of a team?
You can also use them to add in catches for steps that require manual intervention e.g.
``` r
# Check the above and flag discrepancies with master table
if(length(newPublications) != 0){
warning(
message("-----------------------"),
message("WARNING: There are new publications on the EES find stats page that are not in the master table."),
message("-----------------------"),
message("The following publications are new:"),
print(newPublications),
message(""),
message("Please update the master table in SQL using the updateMasterTable.R script in this repo."),
message(""),
)
}
if(length(newlyOnEES) != 0){
warning(
message("---------------------"),
message("WARNING: A publication marked as unpublished in the master table is now published on EES."),
message("---------------------"),
message("The following publications are newly published to EES:"),
print(newlyOnEES),
message(""),
message("Please update the rows for these publications in the master table using the updateMasterTable.R script in this repo."),
message(""),
message("---------------------")
)
}
if(length(newPublications) == 0 && length(newlyOnEES) == 0){
message("-----------------------")
message("PASS - Master table is up to date with the live service!")
message("-----------------------")
}
```
## Working alone?
If you're like me, and ol' Frederick here, you might end up having conversations with yourself about the code...
!['You talkin’ to me? You talkin' to me? Well I'm the only one here. Who the duck do you think you're talking to?' Frederick Bickles, 2024](img/frederick_taxi_driver.png){fig-alt="'You talkin’ to me? You talkin' to me? Well I'm the only one here. Who the duck do you think you're talking to?'" width="70%"}
<!-- -->
- Adding in quick automated pass fail checks will
- Make it easier for you to run the same code reliably
- Make it easier for you to remember what the code is supposed to be doing
- Cut down the time when investigating issues
- Give assurance when writing code that it and the outputs are being QA'd consistently
## Packages we'll use
- For everyone
- [testthat](https://testthat.r-lib.org/)
- For R Shiny developers
- [shinytest2](https://rstudio.github.io/shinytest2/) (Side note: Are you still using the original shinytest? Stop. Migrate to shinytest2 now.)
- For R package developers
- [usethis](https://usethis.r-lib.org/)
- [devtools](https://devtools.r-lib.org/)
- Bonus level:
- [GitHub Actions](https://github.com/features/actions) (not a package, nor specifically for R, but something that is genuinely excellent and you should make use of if you can)
# A whole new world...
![](img/frederick_r_magic_carpet.png){fig-alt="A mediocre visual pun smushing R with Disney's Aladdin" fig-align="center"}
# Testing R code with testthat
## Testing principles
::: fragment
You should always think of automated tests as three phases
1. GIVEN
2. WHEN
3. THEN
:::
## Testing principles
- GIVEN
- Two numbers
- WHEN
- You run a function on those numbers to add them together
- THEN
- You get the sum of those two numbers
## Testing principles - testthat example
``` r
library(testthat)
# GIVEN
x <- 2
y <- 3
# WHEN
z <- sum(x, y)
# THEN
# Instead of an if statement we use the expect_* functions from testthat
expect_equal(z, 5)
# If this passes nothing happens, it doesn't print to the console, and the rest of your code will run as normal
```
## Testing principles - tidied example
``` r
library(testthat)
# GIVEN
x <- 2
y <- 3
# WHEN & THEN (often written this way for conciseness)
expect_equal(sum(x, y), 5)
```
## What if it fails?
```{r eval=FALSE, echo=TRUE}
library(testthat)
# GIVEN
x <- 2
y <- 2 # uh-oh spaghetti-o
# WHEN & THEN
expect_equal(sum(x, y), 5)
```
```{r}
library(testthat)
# GIVEN
x <- 2
y <- 2 # uh-oh spaghetti-o
# WHEN & THEN
try(expect_equal(sum(x, y), 5))
```
This throws an error and stops your code - this is particularly helpful if you have certain expectations that later code relies on, and allows you to fix the problem before it compounds and causes lots of later errors.
## Example data you can use yourself
```{r echo=TRUE}
head(dplyr::starwars)
# Could also do it this way
#
# library(dplyr)
# head(starwars)
```
## Examples on star wars data
```{r eval=FALSE, echo=TRUE}
# Check that df_starwars is a data frame
testthat::expect_true(dplyr::starwars |> is.data.frame())
# An alternative that I know will fail
testthat::expect_true(dplyr::starwars |> is.vector())
```
```{r}
try(testthat::expect_true(dplyr::starwars |> is.vector()))
```
- These kind of class / type checks are particularly useful for
- Catching issues early in processes to prevent compound errors
- Quality checking output data
## Examples on star wars data
```{r echo=TRUE, eval=FALSE}
# Check that the height column is numeric
testthat::expect_true(dplyr::starwars$height |> is.numeric())
# An alternative that I know will fail
testthat::expect_true(dplyr::starwars$height |> is.character())
```
```{r}
try(testthat::expect_true(dplyr::starwars$height |> is.character()))
```
## Examples on star wars data
```{r echo=TRUE, eval=FALSE}
# Check that no characters have a height over 500cm
testthat::expect_lt(dplyr::starwars$height |> max(na.rm=TRUE), 500)
# An example I know will fail
testthat::expect_gt(dplyr::starwars$height |> max(na.rm=TRUE), 500)
```
```{r}
try(testthat::expect_gt(dplyr::starwars$height |> max(na.rm=TRUE), 500))
```
- Again, this kind of check can be easily applied to sense-check your data ranges, before, during and after processing
## Examples on star wars data
```{r echo=TRUE, eval=FALSE}
# Check that there are at least unique 50 characters
testthat::expect_gte(
dplyr::starwars |>
dplyr::distinct(name) |>
nrow(),
50
)
# An example I know will fail
testthat::expect_lte(
dplyr::starwars |>
dplyr::distinct(name) |>
nrow(),
50
)
```
```{r}
try(
testthat::expect_lte(
dplyr::starwars |>
dplyr::distinct(name) |>
nrow(),
50
)
)
```
- This could be particularly useful if you want to have a quick check to make sure you have the right number of LA's and haven't accidentally lost any in a join or aggregation
## Examples on star wars data
```{r echo=TRUE, eval=FALSE}
# Check that a common problematic rogue entry is there
testthat::expect_in(
"Darth Vader",
dplyr::starwars$name
)
# Check that if all items in a vector are present
testthat::expect_contains(
dplyr::starwars$name,
c("Professor Charles Xavier", "C-3PO", "R2-D2")
)
```
```{r}
try(testthat::expect_contains(
dplyr::starwars$name,
c("Professor Charles Xavier", "C-3PO", "R2-D2")
))
```
## Example on locations data
```{r echo=TRUE, eval=TRUE}
head(dfeR::wd_pcon_lad_la_rgn_ctry)
```
## Example on locations data
```{r echo=TRUE, eval=FALSE}
lookup <- dfeR::wd_pcon_lad_la_rgn_ctry
# Check that the lookup file contains all LAs in England in 2024
expect_contains(
lookup$la_name,
dfeR::fetch_las(year = 2024, countries = "England") |>
pull(la_name)
)
# Example of it failing
expect_contains(
lookup$la_name,
"Westeros"
)
```
```{r}
lookup <- dfeR::wd_pcon_lad_la_rgn_ctry
try(
expect_contains(
lookup$la_name,
"Westeros"
)
)
```
- An example where this might be helpful is when doing analysis across boundary changes, and you've had a specific issue with a rogue location not appearing in your latest year of data when it should be there
- It's quick and easy to add specific checks to give you and others relying on your analysis peace of mind that those issues can't be repeated!
## Most common expect\_\* functions
- expect_equal() / expect_identical()
- expect_true() / expect_false()
- expect_lt() / expect_lte() / expect_gt / expect_gte()
- expect_contains() / expect_in()
- expect_error() / expect_warning() / expect_message()
- expect_no_error() / expect_no_warning() / expect_no_message()
- You can even build [custom expectations](https://testthat.r-lib.org/articles/custom-expectation.html)
- Full list and documentation on the [testthat site](https://testthat.r-lib.org/reference/index.html)
## When and where should I do this?
Really you could be doing this before, during, and after your analysis code! For example, you might already do this kind of thing in SQL scripts
``` sql
SELECT * FROM my_table
-- 1,551 rows
SELECT DISTINCT (la_name) FROM my_table
-- 153 rows
```
- This is how you should think of these checks, a way of regularly documenting your assumptions through the code!
## When and where should I do this?
``` {.r .R}
library(dplyr)
library(testthat)
# Import data
raw_data <- starwars
# Check raw data has at least 50 rows
expect_gte(nrow(raw_data), 50)
# Process data to aggregate by species
species_summary <- raw_data |>
group_by(species) |>
summarise(
avg_height = mean(height, na.rm = TRUE),
avg_mass = mean(mass, na.rm = TRUE)
)
# Check the processing
expect_equal(ncol(species_summary), 3)
expect_equal(
nrow(species_summary),
raw_data$species |>
unique() |>
length()
)
# Imagine the final output is the summary table
# Sense check the output data
expect_true(species_summary$avg_mass |> is.numeric())
expect_true(all(!is.nan(species_summary$avg_mass)))
expect_gte(species_summary$avg_mass |> min(), 20)
```
## When and where should I do this?
For more examples of code using this have a look at:
- [RAP knowledge share on QA](https://educationgovuk.sharepoint.com/:f:/r/sites/lvewp00086/SSU%20%20open%20sharing/Reproducible%20Analytical%20Pipeline%20(RAP)%20resources/Knowledge-share%20series?csf=1&web=1&e=S76Ld6) had a few good examples of this
- [Coffee and coding on testing R code](https://web.microsoftstream.com/video/6806b4ba-7b13-4c3c-96f1-9b91c0f1d85b) (Rachel Tadd - 2020)
# Testing R Shiny applications
## Different R Shiny tests
Testing an application is different to dotting expectations through your code, you'll often have a 'suite' or 'suites' of tests that you run periodically in bulk to quickly check the latest version of the code still does all the things it used to.
Test types are:
1. Unit tests
- Low level
- Check the behaviour of specific functions
- testthat is ideal for this
- Very quick to create and to run
2. Integration tests
- Medium level, integrating a number of moving parts
- Shiny has testServer() built in already
- Pretty quick to run, bit more effort to set up
3. UI (User Interface) or 'End to end' tests
- Full test of the user experience and all components
- shinytest2 gives you the full end to end tests (also known as UI tests)
- Take longer to run, and a lot more effort to set up
## Different forms of tests
![](img/testing-pyramid.webp)
## Running the tests
- In an R Shiny application you'll usually have a tests/ folder that contains all of your test scripts
- Assuming you have shinytest2 installed, you can automatically run all of the above types of tests together using `shinytest2::test_app()`
- Our [DfE shiny template](https://github.com/dfe-analytical-services/shiny-template) has examples of this - demo
# Bonus: Continuous integration (CI)
## Bonus: Continuous integration (CI)
- Who doesn't love bonus content? This is here as a bonus as it actually applies to multiple types of projects, not just R Shiny applications.
- Continuous integration is
- the practice of integrating code changes frequently and ensuring that the integrated codebase is in a workable state
- For our purposes, this means that every time we make a new version of the code, tests are run automatically against that code, speeding up the process!
- If your code is open and hosted in GitHub (i.e. is safe to share publicly), then you get GitHub Actions for free, and it's incredibly powerful
- Free access to virtual machines that will run and test your code for you
- A free way to automate code scripts on Git events (commits / pushes / PRs) or on a scheduled job (e.g. once everyday at midnight)
- Integrate your tests with version control processes, so you have a clear audit trail and can proactively prevent issues from getting to your live application
- This is particularly useful where you have specific test folders or scripts that you want to run periodically against a 'product' but not inline in the actual code, e.g. R Shiny applications, R packages, deployed R models etc
## Example of CI
For an example Shiny app that runs continuous integration, have a look at the [explore education statistics data screener on GitHub](https://github.com/dfe-analytical-services/dfe-published-data-qa).
# Testing R packages
## What is an R package
Every R user will have used some R packages at some point. For example:
```{r echo=TRUE}
library(dplyr)
```
- This loads all of the functions that are premade in dplyr, ready into your R session for you to use!
- It's similar conceptually to having a `functions.R` script in your project and then running `source("functions.R")` so you can make use of the functions!
## What does a package actually look like?
Think of it as being able to reuse others functions and having documentation on how they work
```{r echo=TRUE}
library(dfeR)
comma_sep(3000)
comma_sep
```
## What does code in a package actually look like?
```{r echo=TRUE}
#' Comma separate
#'
#' @description
#' Adds separating commas to big numbers. If a value is not numeric it will
#' return the value unchanged and as a string.
#'
#' @param number number to be comma separated
#' @param nsmall minimum number of digits to the right of the decimal point
#'
#' @return string
#' @export
#'
#' @examples
#' comma_sep(100)
#' comma_sep(1000)
#' comma_sep(3567000)
comma_sep <- function(number,
nsmall = 0L) {
format(number,
big.mark = ",", nsmall = nsmall, trim = TRUE,
scientific = FALSE
)
}
```
The documentation ([roxygen2](https://roxygen2.r-lib.org/)) and easy automation built around R packages using [devtools](https://devtools.r-lib.org/) and [usethis](https://usethis.r-lib.org/) is a great reason why structuring code as an R package can be helpful even just for pipelines within teams
- See the [RAP knowledge share 7 'Beyond statistics'](https://educationgovuk.sharepoint.com/:v:/r/sites/lvewp00086/SSU%20%20open%20sharing/Reproducible%20Analytical%20Pipeline%20(RAP)%20resources/Knowledge-share%20series/Knowledge%20share%207%20-%20RAP%20projects%20OUTSIDE%20of%20statistics%20production!.mp4?csf=1&web=1&e=yO4Maq), Matt Jago talks through his team's move from Excel to using an R package for their pipeline
## Package tests examples
Most commonly in an R package you have unit tests against your functions so that you can ensure they reliably behave as expected for users of the package.
Example - [dfeR tests folder](https://github.com/dfe-analytical-services/dfeR/tree/main/tests)
Example - [dfeshiny tests folder](https://github.com/dfe-analytical-services/dfeshiny/tree/main/tests)
# Wrap up
## Additional tips
1. Whenever you hit an issue, add a quick testthat test for it!
- It's a great way to prevent issues repeating themselves
- Also allows you to slowly and steadily build up a set of checks that cover things you know could go wrong!
2. Start small
- Don't feel like you need to write all of the tests immediately - it's a great way to lose lots of time in rabbit holes
- Fewer, reliable tests that check a small amount of things well, are more valuable than a whopping suite of complicated and temperamental checks that check lots of things but add on lots of overhead
3. Don't feel like you need to re-test other packages functions
- that's overkill, most packages already test their functions, so you can (mostly) trust them
- however, it can be wise to do some research on the quality of the packages you're using...
- has it been updated recently?
- is it deployed on CRAN?
- does it have good test coverage?
- if in doubt get in touch with us and we can take a look (explore.statistics\@education.gov.uk)
## Remember
1. GIVEN - WHEN - THEN
2. Pass / fail tests are super quick to add
- fast, reliable feedback on your code
- help to speed up development by catching issues early
- help to document expectations through your code
3. Pass / fail tests will never replace all QA
4. If you can think of it, you can probably code it, get coding!
## Any questions?
cameron.race\@education.gov.uk
Slides made with Quarto:
::: {.nonincremental}
- [GitHub repository](https://github.com/dfe-analytical-services/pass-fail-qa-cc-20241120)
- [Slides themselves (deployed to GitHub pages)](https://dfe-analytical-services.github.io/pass-fail-qa-cc-20241120/)
- [Get started on your own Quarto slides](https://quarto.org/docs/presentations/)
:::