---
aliases:
- building.html
---
# Packaging Guide {#building}
```{block, type="summaryblock"}
rOpenSci accepts packages that meet our guidelines via a streamlined [Software Peer Review process](#whatissoftwarereview). To ensure a consistent style across all of our tools we have written this chapter highlighting our guidelines for package development. Please also read and apply our [chapter about continuous integration (CI)](#ci). Further guidance for after the review process is provided in the third section of this book starting with [a chapter about collaboration](#collaboration).
We recommend that package developers read Hadley Wickham and Jenny Bryan's thorough book on package development which is available for [free online](https://r-pkgs.org/). Our guide is partially redundant with other resources but highlights rOpenSci's guidelines.
To read why submitting a package to rOpenSci is worth the effort to meet guidelines, have a look at [reasons to submit](#whysubmit).
```
## Package name and metadata {#package-name-and-metadata}
### Naming your package {#naming-your-package}
- We strongly recommend short, descriptive names in lower case. If your package deals with one or more commercial services, please make sure the name does not violate branding guidelines. You can check if your package name is available, informative and not offensive by using the [`pak::pkg_name_check()` function](https://pak.r-lib.org/reference/pkg_name_check.html); also run the name through a search engine, which can reveal whether it is offensive in a language other than English. In particular, do *not* choose a package name that's already used on CRAN or Bioconductor.
- There is a trade-off between the advantages of a unique package name and a less original package name.
- A more unique package name might be easier to track (for you and us to assess package use for instance, with fewer false positives when typing its name in GitHub code search) and search (for users to ask "how to use package blah" in a search engine).
- On the other hand, a *too* unique package name might make the package less discoverable (that is to say, harder to find by searching "how to do this-thing in R"). That might be an argument for naming your package something very close to its topic, such as [geojson](https://github.com/ropensci/geojson).
- Find other interesting aspects of naming your package [in this blog post by Nick Tierney](https://www.njtierney.com/post/2018/06/20/naming-things/), and in case you change your mind, find out [how to rename your package in this other blog post of Nick's](https://www.njtierney.com/post/2017/10/27/change-pkg-name/).
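Before checking availability, it can help to confirm that a candidate name is syntactically valid. The sketch below is a minimal illustration, assuming the rules stated in "Writing R Extensions" (ASCII letters, digits and dots only, at least two characters, starting with a letter, not ending with a dot) plus the lower-case recommendation above. `check_pkg_name()` is a hypothetical helper, not part of pak or base R, and it does not replace `pak::pkg_name_check()`.

```r
# Hypothetical helper: checks only the syntax of a candidate package name.
# It does NOT check availability on CRAN or Bioconductor.
check_pkg_name <- function(name) {
  stopifnot(is.character(name), length(name) == 1L)
  c(
    # ASCII letters, digits and dots; starts with a letter; does not end
    # with a dot; at least two characters.
    valid = grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name),
    # rOpenSci recommendation: lower case.
    lower_case = identical(name, tolower(name))
  )
}

check_pkg_name("geojson")
check_pkg_name("2fast")
check_pkg_name("qualtRics")
```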
### Creating metadata for your package {#creating-metadata-for-your-package}
We recommend using the [`codemetar` package](https://github.com/ropensci/codemetar) for creating and updating a JSON [CodeMeta](https://codemeta.github.io/) metadata file for your package via `codemetar::write_codemeta()`. It will automatically include all useful information, including [GitHub topics](#grooming). CodeMeta uses [Schema.org terms](https://schema.org/) so, as it gains popularity, the JSON metadata of your package might be used by third-party services, maybe even search engines.
## Platforms {#platforms}
- Packages should run on all major platforms (Windows, macOS, Linux). Exceptions may be granted to packages that interact with system-specific functions, or wrappers for utilities that only operate on limited platforms, but authors should make every effort for cross-platform compatibility, including system-specific compilation, or containerization of external utilities.
## Package API {#package-api}
### Function and argument naming {#function-and-argument-naming}
- Function and argument names should be chosen to work together to form a common, logical programming API that is easy to read and auto-complete.
- Consider an `object_verb()` naming scheme for functions in your package that take a common data type or interact with a common API. `object` refers to the data/API and `verb` the primary action. This scheme helps avoid namespace conflicts with packages that may have similar verbs, and makes code readable and easy to auto-complete. For instance, in **stringi**, functions starting with `stri_` manipulate strings (`stri_join()`, `stri_sort()`), and in **googlesheets**, functions starting with `gs_` are calls to the Google Sheets API (`gs_auth()`, `gs_user()`, `gs_download()`).
- For functions that manipulate an object/data and return an object/data of the same type, make the object/data the first argument of the function so as to enhance compatibility with the pipe operators (base R's `|>`, magrittr's `%>%`).
- We strongly recommend `snake_case` over all other styles unless you are porting over a package that is already in wide use.
- Avoid function name conflicts with base packages or other popular ones (e.g. `ggplot2`, `dplyr`, `magrittr`, `data.table`).
- Argument naming and order should be consistent across functions that use similar inputs.
- Package functions importing data should not import data to the global environment, but instead must return objects. Assignments to the global environment are to be avoided in general.
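As an illustration of the points above, here is a minimal sketch of an `object_verb()` API. The `goober_` prefix and both functions are hypothetical. The data argument comes first so the functions compose with the pipe, and the import function returns its result rather than assigning to the global environment. The example uses base R's `|>`, which assumes R 4.1 or later.

```r
# All names below are hypothetical and only illustrate the naming scheme.

# Reads sightings from CSV text and returns them as a data frame;
# nothing is assigned to the global environment.
goober_read <- function(text) {
  utils::read.csv(text = text, stringsAsFactors = FALSE)
}

# Data first, so the function works with pipes; returns the same type.
goober_filter <- function(sightings, min_count = 1) {
  sightings[sightings$count >= min_count, , drop = FALSE]
}

csv <- "species,count\nred goober,3\nblue goober,1"
frequent <- goober_read(csv) |> goober_filter(min_count = 2)
frequent
```

Typing `goober_` in an editor with auto-completion then lists every function of the package that acts on sightings.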
### Console messages {#console-messages}
- Use either the [cli package](https://cli.r-lib.org/), or base R's tools (`message()` and `warning()`) to communicate with the user in your functions.
- Highlights of the cli package include: automatic wrapping, respect of the [NO_COLOR convention](https://cli.r-lib.org/articles/cli-config-user.html?q=no#no_color), many [semantic elements](https://cli.r-lib.org/articles/semantic-cli.html), and extensive documentation. Read more in a [blog post](https://blog.r-hub.io/2023/11/30/cliff-notes-about-cli/).
- Please do not use `print()` or `cat()` unless it's for a `print.*()` or `str.*()` method, as these ways of printing messages are harder for users to suppress.
- Provide a way for users to opt out of verbosity, preferably at the package level: make message creation dependent on an environment variable or option (like ["usethis.quiet"](https://usethis.r-lib.org/reference/ui.html?q=usethis.quiet#silencing-output) in the usethis package), rather than on a function parameter. The control of messages could be on several levels ("none", "inform", "debug") rather than logical (no messages at all / all messages). Control of verbosity is useful for end users but also in tests. More interesting comments can be found in an [issue of the tidyverse design guide](https://github.com/tidyverse/design/issues/42).
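A package-level opt-out can be sketched with base R only. In the example below, the package name `foobar`, the option name `foobar.quiet` and both functions are assumptions made for illustration; the pattern is loosely modelled on the `usethis.quiet` option mentioned above.

```r
# Hypothetical package "foobar": option name and helpers are assumptions.

# Internal helper: every user-facing message goes through this function.
foobar_inform <- function(...) {
  # Package-level opt-out: options(foobar.quiet = TRUE) silences messages.
  if (!isTRUE(getOption("foobar.quiet", default = FALSE))) {
    message(...)
  }
  invisible(NULL)
}

foobar_download <- function(url) {
  foobar_inform("Downloading ", url)
  invisible(url)
}

# Emits a message by default.
foobar_download("https://example.org/data.csv")

# Silent once the user (or a test) sets the option.
old <- options(foobar.quiet = TRUE)
foobar_download("https://example.org/data.csv")
options(old)
```

Because the helper uses `message()`, users can still wrap individual calls in `suppressMessages()`.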
### Interactive/Graphical Interfaces {#interactive-graphical-interfaces}
If providing a graphical user interface (GUI) (such as a Shiny app) to facilitate workflow, include a mechanism to automatically reproduce steps taken in the GUI. This could include auto-generation of code to reproduce the same outcomes, output of intermediate values produced in the interactive tool, or simply clear and well-documented mapping between GUI actions and scripted functions. (See also ["Testing"](#testing) below.)
For example, the [`tabulizer` package](https://github.com/ropensci/tabulizer) has an interactive workflow to extract tables, but can also extract only the coordinates so one can re-run things as a script. In addition, two examples of Shiny apps that do code generation are [https://gdancik.shinyapps.io/shinyGEO/](https://gdancik.shinyapps.io/shinyGEO/), and [https://github.com/wallaceEcoMod/wallace/](https://github.com/wallaceEcoMod/wallace/).
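One way to approach code auto-generation is to build the call from the user's selections and deparse it into a string that can be shown or saved. The sketch below uses base R only; `gui_code_for_preview()` and its arguments are hypothetical stand-ins for values that would come from GUI inputs, and it is not how the apps linked above are implemented.

```r
# Hypothetical helper: turns GUI selections into reproducible R code.
# `dataset_name` and `n_rows` stand in for values read from GUI widgets.
gui_code_for_preview <- function(dataset_name, n_rows) {
  stopifnot(is.character(dataset_name), length(dataset_name) == 1L)
  # Build the call the GUI would run, then convert it to text.
  call <- bquote(head(.(as.name(dataset_name)), n = .(n_rows)))
  paste(deparse(call), collapse = "\n")
}

code <- gui_code_for_preview("mtcars", 3L)
code

# Running the generated code reproduces what the GUI displayed.
result <- eval(parse(text = code))
result
```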
### Input checking
We recommend your package use a consistent method of your choice for [checking inputs](https://blog.r-hub.io/2022/03/10/input-checking/) -- either base R, an R package, or custom helpers.
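As a sketch of the "custom helpers" option with base R only: a single internal checker is reused by exported functions so that error messages stay consistent. `check_count()` and `goober_head()` are hypothetical names.

```r
# Hypothetical internal helper, reused by every function that takes a count.
check_count <- function(x, arg = deparse(substitute(x))) {
  if (!is.numeric(x) || length(x) != 1L || is.na(x) || x < 0 || x != round(x)) {
    # call. = FALSE keeps the helper's own call out of the error message.
    stop("`", arg, "` must be a single non-negative whole number.", call. = FALSE)
  }
  invisible(x)
}

# Hypothetical exported function that validates its input first.
goober_head <- function(sightings, n = 6) {
  check_count(n)
  utils::head(sightings, n)
}

goober_head(mtcars, n = 2)

# Invalid input fails early with an informative message.
msg <- tryCatch(goober_head(mtcars, n = "two"), error = conditionMessage)
msg
```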
### Packages wrapping web resources (API clients)
If your package accesses a web API or another web resource,
- Make sure requests send a [user agent](https://httr2.r-lib.org/articles/wrapping-apis.html#user-agent), that is, a way to identify what (your package) or who sent the request. The users should be able to override the package's default user agent. Ideally the user agent should be different on continuous integration services, and in development (based on, for instance, the GitHub usernames of the developers).
- You might choose different (better) defaults than the API, in which case you should document them.
- Your package should help with pagination, by allowing the users to not worry about it at all since your package does all necessary requests.
- Your package should help with rate limiting according to the API rules.
- Your package should reproduce API errors, and possibly explain them in informative error messages.
- Your package could export high-level functions and low-level functions, the latter allowing users to call API endpoints directly with more control (like `gh::gh()`).
For more information refer to the blog post [Why You Should (or Shouldn't) Build an API Client](https://ropensci.org/blog/2022/06/16/publicize-api-client-yes-no).
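The user agent guidance above could be sketched as follows, using base R only. The package name `foobar`, its version, the repository URL and the option name `foobar.user_agent` are all assumptions; the check on the `CI` environment variable relies on the common convention that CI services set it to `"true"`. The resulting string could then be passed to an HTTP client, for instance via `httr2::req_user_agent()`.

```r
# Hypothetical package "foobar": option name, version and URL are assumptions.
foobar_user_agent <- function() {
  # Users can override the default with options(foobar.user_agent = "...").
  custom <- getOption("foobar.user_agent")
  if (!is.null(custom)) {
    return(custom)
  }
  # Many CI services set the CI environment variable to "true".
  on_ci <- identical(tolower(Sys.getenv("CI")), "true")
  paste0(
    "foobar/0.1.0 (https://github.com/example/foobar)",
    if (on_ci) " (CI)" else ""
  )
}

foobar_user_agent()

# A user overriding the default:
options(foobar.user_agent = "my-analysis (me@example.org)")
foobar_user_agent()
options(foobar.user_agent = NULL)
```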
## Code Style {#code-style}
- For more information on how to style your code, name functions, and R scripts inside the `R/` folder, we recommend reading the [code chapter in The R Packages book](https://r-pkgs.org/Code.html). We recommend the [`styler` package](https://github.com/r-lib/styler) for automating part of the code styling. We suggest reading the [Tidyverse style guide](https://style.tidyverse.org/).
- You can choose to use `=` over `<-` as long as you are consistent with one choice within your package. We recommend avoiding the use of `->` for assignment within a package. If you do use `<-` throughout your package, and you also use `R6` in that package, you'll be forced to use `=` for assignment within your `R6Class` construction - this is not considered an inconsistency because you can't use `<-` in this case.
## CITATION file {#citation-file}
- If your package does not yet have a CITATION file, you can create one with `usethis::use_citation()`, and populate it with values generated by the `citation()` function.
- CRAN requires CITATION files to be declared as [`bibentry` items](https://stat.ethz.ch/R-manual/R-devel/library/utils/html/bibentry.html), and not in the previously-accepted form of [`citEntry()`](https://stat.ethz.ch/R-manual/R-devel/library/utils/html/citEntry.html).
- If you archive each release of your GitHub repo on Zenodo, add the [Zenodo top-level DOI](https://help.zenodo.org/#versioning) to the CITATION file.
- If one day [**after** review at rOpenSci](#authors-guide) you publish a software publication about your package, add it to the CITATION file.
- Less related to your package itself but to what supports it: if your package wraps a particular resource such as data source or, say, statistical algorithm, remind users of how to cite that resource via e.g. `citHeader()`. [Maybe even add the reference for the resource](https://discuss.ropensci.org/t/citation-of-original-article-when-implementing-specific-methods/2312).
As an example see [the dynamite CITATION file](https://github.com/ropensci/dynamite/blob/main/inst/CITATION) which refers to the R manual as well as other associated publications.
```r
citHeader("To cite dynamite in publications use:")

bibentry(
  key = "dynamitepaper",
  bibtype = "Misc",
  doi = "10.48550/ARXIV.2302.01607",
  url = "https://arxiv.org/abs/2302.01607",
  author = c(person("Santtu", "Tikka"), person("Jouni", "Helske")),
  title = "dynamite: An R Package for Dynamic Multivariate Panel Models",
  publisher = "arXiv",
  year = "2023"
)

bibentry(
  key = "dmpmpaper",
  bibtype = "Misc",
  title = "Estimating Causal Effects from Panel Data with Dynamic Multivariate Panel Models",
  author = c(person("Santtu", "Tikka"), person("Jouni", "Helske")),
  publisher = "SocArxiv",
  year = "2022",
  url = "https://osf.io/preprints/socarxiv/mdwu5/"
)

bibentry(
  key = "dynamite",
  bibtype = "Manual",
  title = "Bayesian Modeling and Causal Inference for Multivariate Longitudinal Data",
  author = c(person("Santtu", "Tikka"), person("Jouni", "Helske")),
  note = "R package version 1.0.0",
  year = "2022",
  url = "https://github.com/ropensci/dynamite"
)
```
- You could also create and store a `CITATION.cff` thanks to the [cffr package](https://docs.ropensci.org/cffr/). It also provides a [GitHub Action workflow](https://docs.ropensci.org/cffr/reference/cff_gha_update.html) to keep the `CITATION.cff` file up-to-date.
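To see what `citation()` generates before copying values into a CITATION file, you can format its result as R code. This is a small sketch using the base package `utils` as a stand-in for your own package name; `format()` with `style = "R"` is one way to obtain `bibentry()`-style code, and the output should be reviewed and edited by hand.

```r
# "utils" is only a stand-in; use your own package name once it is installed.
cit <- citation("utils")

# The result is a bibentry object...
class(cit)

# ...which can be rendered as R code suitable as a starting point
# for a CITATION file.
r_code <- format(cit, style = "R")
cat(r_code, sep = "\n")
```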
## README {#readme}
- All packages should have a README file, named `README.md`, in the root of the repository. The README should include, from top to bottom:
- The package name.
- Badges for continuous integration and test coverage, the badge for rOpenSci peer-review once it has started (see below), a repostatus.org badge, and any other badges (e.g. [R-universe](https://ropensci.org/blog/2021/10/14/runiverse-badges/)).
- Short description of goals of package (what does it do? why should a potential user care?), with descriptive links to all vignettes unless the package is small and there's only one vignette repeating the README. Please also ensure the vignettes are rendered and readable (see [the "documentation website" section](#website)).
- Installation instructions using e.g. the [remotes package](https://remotes.r-lib.org/), [pak package](https://pak.r-lib.org/), or [R-universe](https://ropensci.org/blog/2021/06/22/setup-runiverse/).
- Any additional setup required (authentication tokens, etc).
- Brief demonstration usage.
- If applicable, how the package compares to other similar packages and/or how it relates to other packages.
- Citation information, i.e. direct users to the preferred citation in the README by adding boilerplate text such as "here's how to cite my package". See e.g. [ecmwfr README](https://github.com/bluegreen-labs/ecmwfr#how-to-cite-this-package-in-your-article).
If you use another repo status badge such as a [lifecycle](https://www.tidyverse.org/lifecycle/) badge, please also add a [repostatus.org](https://www.repostatus.org/) badge. [Example of a repo README with two repo status badges](https://github.com/ropensci/ijtiff#ijtiff-).
- Once you have submitted a package and it has passed editor checks, add a peer-review badge via
```
[![](https://badges.ropensci.org/<issue_id>_status.svg)](https://github.com/ropensci/software-review/issues/<issue_id>)
```
where issue\_id is the number of the issue in the software-review repository. For instance, the badge for [`rtimicropem`](https://github.com/ropensci/rtimicropem) review uses the number 126 since it's the [review issue number](https://github.com/ropensci/software-review/issues/126). The badge will first indicate "under review" and then "peer-reviewed" once your package has been onboarded (issue labelled "approved" and closed), and will link to the review issue.
- If your README has many badges consider ordering them in an html table to make it easier for newcomers to gather information at a glance. See examples in [`drake` repo](https://github.com/ropensci/drake) and in [`qualtRics` repo](https://github.com/ropensci/qualtRics/). Possible sections are
- Development (CI statuses cf [CI chapter](#ci), Slack channel for discussion, repostatus)
- Release/Published ([CRAN version and release date badges from METACRAN](https://www.r-pkg.org/services#badges), [CRAN checks API badge](https://github.com/r-hub/cchecksbadges), Zenodo badge)
- Stats/Usage (downloads e.g. [download badges from r-hub/cranlogs](https://github.com/r-hub/cranlogs.app#badges))
The table should be wider than it is long so that it does not mask the rest of the README.
- If your package connects to a data source or online service, or wraps other software, consider that your package README may be the first point of entry for users. It should provide enough information for users to understand the nature of the data, service, or software, and provide links to other relevant data and documentation. For instance,
a README should not merely read, "Provides access to GooberDB," but also include,
"..., an online repository of Goober sightings in South America. More
information about GooberDB, and documentation of database structure and metadata
can be found at *link*".
- We recommend not creating `README.md` directly, but from a `README.Rmd` file (an R Markdown file) if you have any demonstration code. The advantage of the `.Rmd` file is you can combine text with code that can be easily updated whenever your package is updated.
- Consider using `usethis::use_readme_rmd()` to get a template for a `README.Rmd` file and to automatically set up a pre-commit hook to ensure that `README.md` is always newer than `README.Rmd`.
- Extensive examples should be kept for a vignette. If you want to make the vignettes more accessible before installing the package, we suggest [creating a website for your package](#website).
- Add a [code of conduct and contribution guidelines](#friendlyfiles).
- See the [`gistr` README](https://github.com/ropensci/gistr#gistr) for a good example README to follow for a small package, and [`bowerbird` README](https://github.com/ropensci/bowerbird) for a good example README for a larger package.
## Documentation {#documentation}
### General {#docs-general}
- All exported package functions should be fully documented with examples.
- If there is potential overlap or confusion with other packages providing similar functionality or having a similar name, add a note in the README, main vignette and potentially the Description field of DESCRIPTION. Examples in [rtweet README](https://docs.ropensci.org/rtweet/), [rebird README](https://docs.ropensci.org/rebird/#auk-vs-rebird), and the non-rOpenSci package [slurmR](https://uscbiostats.github.io/slurmR/index.html#vs).
- The package should contain top-level documentation for `?foobar` (or ``?`foobar-package` `` if there is a naming conflict). Optionally, you can use both `?foobar` and ``?`foobar-package` `` for the package level manual file, using the `@aliases` roxygen tag. [`usethis::use_package_doc()`](https://usethis.r-lib.org/reference/use_package_doc.html) adds the template for the top-level documentation.
- The package should contain at least one **HTML** vignette providing a substantial coverage of package functions, illustrating realistic use cases and how functions are intended to interact. If the package is small, the vignette and the README may have very similar content.
- As is the case for a README, top-level documentation or vignettes may be the first point of entry for users. If your package connects to a data source or online service, or wraps other software, it should provide enough information for users to understand the nature of the data, service, or software, and provide links to other relevant data and documentation. For instance, a vignette intro or documentation should not merely read, "Provides access to GooberDB," but also include, "..., an online repository of Goober sightings in South America. More information about GooberDB, and documentation of database structure and metadata can be found at *link*". Every vignette should state upfront the prerequisite knowledge needed to understand it.
The general vignette should present a series of examples progressing in complexity from basic to advanced usage.
- Functionality likely to be used by only more advanced users or developers might be better put in a separate vignette (e.g. programming/NSE with dplyr).
- The README, the top-level package docs, vignettes, websites, etc., should all have enough information at the beginning to get a high-level overview of the package and the services/data it connects to, and provide navigation to other relevant pieces of documentation. This is to follow the principle of *multiple points of entry* i.e. to take into account the fact that any piece of documentation may be the first encounter the user has with the package and/or the tool/data it wraps.
- The vignette(s) should include citations to software and papers where appropriate.
- If your package provides access to a data source, we require that DESCRIPTION contains both (1) A brief identification and/or description of the organisation responsible for issuing data; and (2) The URL linking to public-facing page providing, describing, or enabling data access (which may often differ from URL leading directly to data source).
- Only use package startup messages when necessary (function masking for instance). Avoid package startup messages like "This is foobar 2.4-0" or citation guidance because they can be annoying to the user. Rely on documentation for such guidance.
- You can choose to have a README section about use cases of your package (other packages, blog posts, etc.), [example](https://github.com/ropensci/vcr#example-packages-using-vcr).
### roxygen2 use {#roxygen-2-use}
- We request all submissions to use [roxygen2](https://roxygen2.r-lib.org/) for documentation. roxygen2 is an R package that compiles `.Rd` files to your `man` folder in your package from tags written above each function. roxygen2 has [support for Markdown syntax](https://roxygen2.r-lib.org/articles/rd-formatting.html). One key advantage of using roxygen2 is that your `NAMESPACE` will always be automatically generated and up to date.
- More information on using roxygen2 documentation is available in the [R packages book](https://r-pkgs.org/man.html) and in [roxygen2 website itself](https://roxygen2.r-lib.org/).
- If you were writing Rd directly without roxygen2, the [Rd2roxygen](https://cran.r-project.org/web/packages/Rd2roxygen/index.html) package contains functions to convert Rd to roxygen documentation.
- All functions should document the type of object returned under the `@return` heading.
- The default value for each parameter should be clearly documented. For example, instead of writing `A logical value determining if ...`, you should write ``A logical value (default `TRUE`) determining if ...``. It is also good practice to indicate the default values directly in your function definition:
```{r, eval=FALSE}
f <- function(a = TRUE) {
  # function code
}
```
- Documentation should support user navigation by including useful [cross-links](https://roxygen2.r-lib.org/reference/tags-index-crossref.html) between related functions and documenting related functions together in groups or in common help pages. In particular, the `@family` tag, which automatically creates "See also" links and [can help group](https://pkgdown.r-lib.org/reference/build_reference.html) functions together on pkgdown sites, is recommended for this purpose. See [the "manual" section of The R Packages book](https://r-pkgs.org/man.html) and [the "function grouping" section of the present chapter](#function-grouping) for more details.
- You can re-use documentation pieces (e.g. details about authentication, related packages) across the vignettes/README/man pages. Refer to [roxygen2 vignette on documentation reuse](https://roxygen2.r-lib.org/articles/reuse.html).
- For including examples, you can use the classic `@examples` tag (plural "examples") but also the `@example <path>` tag (singular "example") for storing the example code in a separate R script (ideally under `man/`), and the `@exampleIf` tag for running examples conditionally and avoiding R CMD check failures. Refer to [roxygen2 documentation about examples](https://roxygen2.r-lib.org/articles/rd.html#examples).
- Add `#' @noRd` to internal functions. You might be interested in the [devtag experimental package](https://github.com/moodymudskipper/devtag) for getting local manual pages when using `#' @noRd`.
- Starting from roxygen2 version 7.0.0, `R6` classes are officially supported. See the [roxygen2 docs](https://roxygen2.r-lib.org/articles/rd-other.html#r6) for details on how to document `R6` classes.
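The roxygen2 recommendations above can be combined as in the following sketch. The functions, the family name and the data are hypothetical; the tags used (`@param`, `@return`, `@family`, `@examples`, `@export`, `@noRd`) are standard roxygen2 tags. Since roxygen comments are ordinary R comments, the block runs as plain R code.

```r
#' Count goober sightings
#'
#' @param sightings A data frame with a numeric `count` column.
#' @param na_rm A logical value (default `TRUE`) determining if missing
#'   counts are dropped before summing.
#' @return A single numeric value: the total number of sightings.
#' @family summary functions
#' @examples
#' goober_total(data.frame(count = c(2, NA, 5)))
#' @export
goober_total <- function(sightings, na_rm = TRUE) {
  sum(sightings$count, na.rm = na_rm)
}

# Internal helper: documented for developers, but no manual page is built.
#' Check that a data frame has a `count` column
#' @noRd
goober_has_counts <- function(sightings) {
  "count" %in% names(sightings)
}

goober_total(data.frame(count = c(2, NA, 5)))
```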
### URLs in documentation {#ur-ls-in-documentation}
This subsection is particularly relevant to authors wishing to submit their package to CRAN.
CRAN will check URLs in your documentation and does not allow redirect status codes such as 301.
You can use the [urlchecker](https://github.com/r-lib/urlchecker) package to reproduce these checks and, in particular, replace URLs with the URLs they redirect to.
Others have used the option to escape some URLs (change `<https://ropensci.org/>` to `https://ropensci.org/`, or `\url{https://ropensci.org/}` to `https://ropensci.org/`), but if you do so, you will need to implement some sort of URL checking yourself to prevent them from getting broken without your noticing. Furthermore, links would not be clickable from local docs.
## Documentation website {#website}
We recommend creating a documentation website for your package using [`pkgdown`](https://github.com/r-lib/pkgdown). The R packages book features a [chapter on pkgdown](https://r-pkgs.org/website.html), and of course `pkgdown` has [its own documentation website](https://pkgdown.r-lib.org/).
There are a few elements we'd like to underline here.
### Automatic deployment of the documentation website {#docsropensci}
You only need to worry about automatic deployment of your website until approval and transfer of your package repo to the ropensci organization; indeed, after that a pkgdown website will be built for your package after each push to the GitHub repo. You can find the status of these builds at `https://dev.ropensci.org/job/package_name`, e.g. [for `magick`](https://dev.ropensci.org/job/magick); and the website at `https://docs.ropensci.org/package_name`, e.g. [for `magick`](https://docs.ropensci.org/magick). The website build will use your pkgdown config file if you have one, except for the styling that will use the [`rotemplate` package](https://github.com/ropensci-org/rotemplate/). The resulting website will have a local search bar. Please report bugs, questions and feature requests about the central builds at [https://github.com/ropensci/docs/](https://github.com/ropensci/docs/) and about the template at [https://github.com/ropensci/rotemplate/](https://github.com/ropensci/rotemplate/).
*If your package vignettes need credentials (API keys, tokens, etc.) to knit, you might want to [precompute them](https://ropensci.org/technotes/2019/12/08/precompute-vignettes/) since credentials cannot be used on the docs server.*
Before submission and before transfer, you could use the [approach documented by `pkgdown`](https://pkgdown.r-lib.org/reference/deploy_site_github.html) or the [`tic` package](https://docs.ropensci.org/tic/) for automatic deployment of the package's website. This would save you the hassle of running (and remembering to run) `pkgdown::build_site()` yourself every time the site needs to be updated. First refer to our [chapter on continuous integration](#ci) if you're not familiar with continuous integration. In any case, do not forget to update all occurrences of the website URL after transfer to the ropensci organization.
### Grouping functions in the reference {#function-grouping}
When your package has many functions, use grouping in the reference, which you can do more or less automatically.
If you use roxygen2 above version 6.1.1, you should use the `@family` tag in your functions' documentation to indicate grouping. This will give you links between functions in the local documentation of the installed package ("See also" section) *and* allow you to use the `pkgdown` `has_concept` function in the config file of your website. Non-rOpenSci example courtesy of [`optiRum`](https://github.com/lockedata/optiRum): [family tag](https://github.com/lockedata/optiRum/blob/master/R/APR.R#L17), [`pkgdown` config file](https://github.com/lockedata/optiRum/blob/master/_pkgdown.yml) and [resulting reference section](https://itsalocke.com/optirum/reference/).
To customize the text of the cross-reference title created by roxygen2 (`Other {family}:`), refer to [roxygen2 docs regarding how to provide a `rd_family_title` list in `man/roxygen/meta.R`](https://roxygen2.r-lib.org/articles/rd.html#cross-references).
Less automatically, see the example of [`drake` website](https://docs.ropensci.org/drake/) and [associated config file](https://github.com/ropensci/drake/blob/master/_pkgdown.yml).
### Branding of authors {#branding-of-authors}
You can make the names of (some) authors clickable by adding their URL, and you can even replace their names with a logo (think rOpenSci... or your organisation/company!). See [`pkgdown` documentation](https://pkgdown.r-lib.org/reference/build_home.html?q=authors#yaml-config-authors).
### Tweaking the navbar {#tweaking-the-navbar}
You can make your website content easier to browse by tweaking the navbar, refer to [`pkgdown` documentation](https://pkgdown.r-lib.org/articles/pkgdown.html#navigation-bar). In particular, note that if you name the main vignette of your package "pkg-name.Rmd", it'll be accessible from the navbar as a `Get started` link instead of via `Articles > Vignette Title`.
### Math rendering {#mathjax}
Please refer to [pkgdown documentation](https://pkgdown.r-lib.org/dev/articles/customise.html#math-rendering).
Our template is compatible with this configuration.
### Package logo {#package-logo}
To use your package logo in the pkgdown homepage, refer to [`usethis::use_logo()`](https://usethis.r-lib.org/reference/use_logo.html).
If your package doesn't have any logo, the [rOpenSci docs builder](#docsropensci) will use the rOpenSci logo instead.
## Authorship {#authorship}
The `DESCRIPTION` file of a package should list the package's authors and contributors, using the `Authors@R` syntax to indicate their roles (author/creator/contributor etc.) if there is more than one author, and using the comment field to indicate the ORCID ID of each author, if they have one (cf [this post](https://ropensci.org/technotes/2018/10/08/orcid/)). See [this section of "Writing R Extensions"](https://cran.rstudio.com/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file) for details. If you feel that your reviewers have made a substantial contribution to the development of your package, you may list them in the `Authors@R` field with a Reviewer contributor type (`"rev"`), like so:
```
person("Bea", "Hernández", role = "rev",
comment = "Bea reviewed the package (v. X.X.XX) for rOpenSci, see <https://github.com/ropensci/software-review/issues/116>"),
```
Only include reviewers after asking for their consent. Read more in this blog post ["Thanking Your Reviewers: Gratitude through Semantic Metadata"](https://ropensci.org/blog/2018/03/16/thanking-reviewers-in-metadata/). Please do not list editors as contributors. Your participation in and contribution to rOpenSci is thanks enough!
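As a minimal sketch of the `Authors@R` syntax described above, the following builds a vector of `person()` objects with roles and an ORCID comment. All names and the email address are hypothetical, and the ORCID iD is a placeholder (the one ORCID uses for its fictitious example researcher); replace them with real values in your own `DESCRIPTION`.

```r
# Hypothetical authors, for illustration only.
# In DESCRIPTION, this expression is the value of the Authors@R field.
authors <- c(
  person(
    given = "Jane", family = "Doe",
    email = "jane.doe@example.org",
    # "aut" = author, "cre" = creator (maintainer)
    role = c("aut", "cre"),
    # The ORCID iD goes in a named element of the comment field
    comment = c(ORCID = "0000-0002-1825-0097")
  ),
  person(
    given = "John", family = "Smith",
    # "ctb" = contributor
    role = "ctb"
  )
)

# Inspect how the entries are rendered
format(authors, include = c("given", "family", "role"))
```

Evaluating the expression in an R session before pasting it into `DESCRIPTION` is a quick way to catch syntax errors such as a missing comma.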
### Authorship of included code {#authorship-included-code}
Many packages include code from other software. Whether entire files or single functions are included from other packages, rOpenSci packages should follow [the CRAN *Repository Policy*](https://cran.r-project.org/web/packages/policies.html):
> The ownership of copyright and intellectual property rights of all components of the package must be clear and unambiguous (including from the authors specification in the DESCRIPTION file). Where code is copied (or derived) from the work of others (including from R itself), care must be taken that any copyright/license statements are preserved and authorship is not misrepresented.
>
> Preferably, an ‘Authors@R' field would be used with ‘ctb' roles for the authors of such code. Alternatively, the ‘Author' field should list these authors as contributors.
>
> Where copyrights are held by an entity other than the package authors, this should preferably be indicated via ‘cph' roles in the ‘Authors@R' field, or using a ‘Copyright' field (if necessary referring to an inst/COPYRIGHTS file).
>
> Trademarks must be respected.
## Licence {#licence}
The package needs to have a [CRAN](https://svn.r-project.org/R/trunk/share/licenses/license.db) or [OSI](https://opensource.org/licenses) accepted license.
For more explanations around licensing, refer to the [R packages book](https://r-pkgs.org/license.html).
## Testing {#testing}
- All packages should pass `R CMD check`/`devtools::check()` on all major platforms.
- All packages should have a test suite that covers major functionality of the package. The tests should also cover the behavior of the package in case of errors.
- It is good practice to write unit tests for all functions, and all package code in general, ensuring key functionality is covered. Test coverage below 75% will likely require additional tests or explanation before being sent for review.
- We recommend using [testthat](https://testthat.r-lib.org/) for writing tests. Strive to write tests as you write each new function. This serves the obvious need to have proper testing for the package, but also allows you to think about various ways in which a function can fail, and to *defensively* code against those. [More information](https://r-pkgs.org/tests.html).
- Tests should be easy to understand. We suggest reading the blog post [*"Why Good Developers Write Bad Unit Tests"*](https://mtlynch.io/good-developers-bad-tests/) by Michael Lynch.
- Packages with Shiny apps should use a unit-testing framework such as [`shinytest2`](https://rstudio.github.io/shinytest2/) or [`shinytest`](https://rstudio.github.io/shinytest/articles/shinytest.html) to test that interactive interfaces behave as expected.
- For testing your functions creating plots, we suggest using [vdiffr](https://vdiffr.r-lib.org/), an extension of the testthat package that relies on [testthat snapshot tests](https://testthat.r-lib.org/articles/snapshotting.html).
- If your package interacts with web resources (web APIs and other sources of data on the web) you might find the [HTTP testing in R book by Scott Chamberlain and Maëlle Salmon](https://books.ropensci.org/http-testing/) relevant. Packages helping with HTTP testing (corresponding HTTP clients):
- [httptest2](https://enpiar.com/httptest2/) ([httr2](https://httr2.r-lib.org/));
- [httptest](https://enpiar.com/r/httptest/) ([httr](https://httr.r-lib.org/));
- [vcr](https://docs.ropensci.org/vcr/) ([httr](https://httr.r-lib.org/), [crul](https://docs.ropensci.org/crul));
- [webfakes](https://webfakes.r-lib.org/) ([httr](https://httr.r-lib.org/), [httr2](https://httr2.r-lib.org/), [crul](https://docs.ropensci.org/crul), [curl](https://jeroen.r-universe.dev/curl#)).
- testthat has a function `skip_on_cran()` that you can use to not run tests on CRAN. We recommend using it in all tests that make API calls, since they are quite likely to fail on CRAN. These tests should still run on continuous integration. Note that from testthat 3.1.2 `skip_if_offline()` automatically calls `skip_on_cran()`. More info on [CRAN preparedness for API wrappers](https://books.ropensci.org/http-testing/cran-preparedness.html).
- If your package interacts with a database you might find [dittodb](https://docs.ropensci.org/dittodb) useful.
- Once you've set up [continuous integration (CI)](#ci), use your package's code coverage report (cf [this section of our book](#coverage)) to identify untested lines, and to add further tests.
- Even if you use [continuous integration](#ci), we recommend that you run tests locally prior to submitting your package (you might need to set `Sys.setenv(NOT_CRAN="true")`).
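The points above about covering error behaviour and skipping tests on CRAN can be sketched as follows. This is only an illustration: `parse_count()` is a hypothetical function standing in for your package code, and the last test uses a stand-in list where a real package would call a web API.

```r
library(testthat)

# Mimic a non-CRAN environment so that skip_on_cran() does not skip locally
Sys.setenv(NOT_CRAN = "true")

# Hypothetical package function, defined here to keep the sketch self-contained
parse_count <- function(x) {
  if (!is.character(x) || length(x) != 1) {
    stop("`x` must be a single string.", call. = FALSE)
  }
  as.integer(x)
}

test_that("parse_count() converts a string to an integer", {
  expect_identical(parse_count("42"), 42L)
})

test_that("parse_count() fails informatively on invalid input", {
  # Tests should also cover behaviour in case of errors
  expect_error(parse_count(42), "single string")
})

test_that("response has the expected shape", {
  # Skipped on CRAN, still run locally and on continuous integration
  skip_on_cran()
  # Stand-in for the result of a real API call (assumption for illustration)
  response <- list(status = "ok")
  expect_type(response, "list")
})
```

In a real package these tests would live in files under `tests/testthat/`, and the function under test would come from the package itself rather than being defined in the test file.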
## Examples {#examples}
- Include extensive examples in the documentation. In addition to demonstrating how to use the package, these can act as an easy way to test package functionality before there are proper tests. However, keep in mind we require tests in contributed packages.
- You can run examples with `devtools::run_examples()`. Note that when you run `R CMD check` or equivalent (e.g., `devtools::check()`) your examples that are not wrapped in `\dontrun{}` or `\donttest{}` are run. Refer to the [summary table](https://roxygen2.r-lib.org/articles/rd.html#functions) in roxygen2 docs.
- To prevent examples (e.g. those requiring authentication) from being run on CRAN, you need to use `\dontrun{}`. However, for a first submission CRAN won't let you have all examples escaped this way. In this case you might add some small toy examples, or wrap example code in `try()`. Also refer to the `@examplesIf` tag present, at the time of writing, in roxygen2 development version.
- In addition to running examples locally on your own computer, we strongly advise that you run examples on one of the [continuous integration systems](#ci). Again, examples that are not wrapped in `\dontrun{}` or `\donttest{}` will be run, but for those that are you can configure your continuous integration builds to run them via R CMD check arguments `--run-dontrun` and/or `--run-donttest`.
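A possible layout for examples that follows the advice above is sketched below. The function `greet()`, the helper `fetch_user_name()` mentioned in the escaped example, and the `MYPKG_TOKEN` environment variable are all hypothetical; the roxygen2 tags themselves (`@examples`, `@examplesIf`) and the `\dontrun{}` macro are real.

```r
#' Build a greeting
#'
#' @param name A single string.
#' @return A single string.
#' @export
#' @examples
#' # Small toy example: runs during R CMD check
#' greet("world")
#'
#' \dontrun{
#' # Hypothetical call that needs authentication: not run by default
#' greet(fetch_user_name(token = Sys.getenv("MYPKG_TOKEN")))
#' }
greet <- function(name) {
  paste0("Hello, ", name, "!")
}

#' Build a loud greeting
#'
#' @param name A single string.
#' @return A single string.
#' @export
#' @examplesIf nzchar(Sys.getenv("MYPKG_TOKEN"))
#' # Only run when the (hypothetical) token is available
#' greet_loudly("world")
greet_loudly <- function(name) {
  toupper(greet(name))
}
```

Keeping at least one small example outside `\dontrun{}` means some code is exercised on every check, while the conditional example only runs where its requirements are met.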
## Package dependencies {#pkgdependencies}
- Consider the trade-offs involved in relying on a package as a dependency. On one hand,
using dependencies reduces coding effort, and can build on useful functionality developed by
others, especially if the dependency performs complex tasks, is high-performance,
and/or is well vetted and tested. On the other hand, having many dependencies
places a burden on the maintainer to keep up with changes in those packages, at risk
to your package's long-term sustainability. It also
increases installation time and size, primarily a consideration on your and others' development cycle, and in automated build systems. "Heavy" packages - those with many dependencies themselves, and those with large amounts of compiled code - increase this cost. Here are some approaches to reducing
dependencies:
- Small, simple functions from a dependency package may be better copied into
your own package if you are using only a few functions
from an otherwise large or heavy dependency. (See [*Authorship* section
above](#authorship-included-code) for how to acknowledge original authors
of copied code.) On the other hand, complex functions with many edge
cases (e.g. parsers) require considerable testing and vetting, so they are usually better left in the dependency.
- A common example of this is in returning tidyverse-style "tibbles" from package
functions that provide data.
One can avoid the modestly heavy **tibble** package dependency by returning
a tibble created by modifying a data frame like so:
```
class(df) <- c("tbl_df", "tbl", "data.frame")
```
(Note that this approach is [not universally endorsed](https://twitter.com/krlmlr/status/1067856118385381377).)
- Ensure that you are using the package where the function is defined,
rather than one where it is re-exported. For instance many functions in **devtools** can be found in smaller specialty packages such as **sessioninfo**. The `%>%` function
should be imported from **magrittr**, where it is defined, rather than the heavier
**dplyr**, which re-exports it.
- Some dependencies are preferred because they provide easier to interpret
function names and syntax than base R solutions. If this is the primary
reason for using a function in a heavy dependency, consider wrapping
the base R approach in a nicely-named internal function in your package. See e.g. the [rlang R script providing functions with a syntax similar to purrr functions](https://github.com/r-lib/rlang/blob/9b50b7a86698332820155c268ad15bc1ed71cc03/R/standalone-purrr.R).
- If dependencies have overlapping functionality, see if you can rely on only one.
- More dependency-management tips can be found in the chapter ["Dependencies: Mindset and Background" of the R packages book](https://r-pkgs.org/dependencies-mindset-background.html) and in a [post by
Scott Chamberlain](https://recology.info/2018/10/limiting-dependencies/).
- Use `Imports` instead of `Depends` for packages providing functions from other packages. Make sure to list packages used for testing (`testthat`), and documentation (`knitr`, roxygen2) in your `Suggests` section of package dependencies (if you use `usethis` for adding testing infrastructure via [`usethis::use_testthat()`](https://usethis.r-lib.org/reference/use_testthat.html) or a vignette via [usethis::use\_vignette()](https://usethis.r-lib.org/reference/use_vignette.html), the necessary packages will be added to `DESCRIPTION`). If you use any package in the examples or tests of your package, make sure to list it in `Suggests`, if not already listed in `Imports`.
- If your (not Bioconductor) package depends on Bioconductor packages, make sure the installation instructions in the README and vignette are clear enough even for a user who is not familiar with the Bioconductor release cycle.
- Should the user use [`BiocManager`](https://www.bioconductor.org/install/index.html#why-biocmanagerinstall) (recommended)? Document this.
- Is the automatic installation of Bioconductor packages by `install.packages()` enough? In that case, mention that the user needs to run `setRepositories()` if they haven't set the necessary Bioconductor repositories yet.
- If your package depends on Bioconductor after a certain version, mention it in DESCRIPTION and in the installation instructions.
- Specifying minimum dependencies (e.g. `glue (>= 1.3.0)` instead of just `glue`) should be a conscious choice. If you know for a fact that your package will break below a certain dependency version, specify it explicitly.
But if you don't, then there is no need to specify a minimum dependency. In that case, if a user reports a bug that is explicitly related to an older version of a dependency, address it at that point.
An example of bad practice would be for a developer to consider the versions of their current state of dependencies to be the minimal version. That would needlessly force everyone to upgrade (causing issues with other packages) when there is no good reason behind that version choice.
- For most cases where you must expose functions from dependencies to the user, you should import and re-export those individual functions rather than listing them in the `Depends` field. For instance, if functions in your package produce `raster` objects, you might re-export only printing and plotting functions from the **raster** package.
- If your package uses a *system* dependency, you should
- Indicate it in DESCRIPTION;
- Check that it is listed by [`sysreqsdb`](https://github.com/r-hub/sysreqsdb#sysreqs) to allow automatic tools to install it, and [submit a contribution](https://github.com/r-hub/sysreqsdb#contributing) if not;
- Check for it in a `configure` script ([example](https://github.com/ropensci/magick/blob/c116b2b8505f491db72a139b61cd543b7a2ce873/DESCRIPTION#L19)) and give a helpful error message if it cannot be found ([example](https://github.com/cran/webp/blob/master/configure)).
`configure` scripts can be challenging as they often require hacky solutions
to make diverse system dependencies work across systems. Use examples ([more here](https://github.com/search?q=org%3Acran+anticonf&type=Code)) as a starting point, but note that it is common to encounter bugs and edge cases, and that such scripts often end up violating CRAN policies. Do not hesitate to [ask for help on our forum](https://discuss.ropensci.org/).
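One of the approaches listed above is to wrap a base R solution in a nicely-named internal function instead of importing a heavy dependency. A minimal sketch, assuming you only need simple mapping and filtering helpers: the names `map_chr()` and `compact()` mirror purrr, but the implementations below are written for illustration in base R and are not copied from the rlang script linked above.

```r
# Internal helpers (not exported), relying on base R only.

# Apply .f to each element of .x and return a character vector,
# keeping the names of .x
map_chr <- function(.x, .f, ...) {
  out <- vapply(.x, .f, character(1), ..., USE.NAMES = FALSE)
  names(out) <- names(.x)
  out
}

# Drop NULL elements from a list
compact <- function(.x) {
  Filter(Negate(is.null), .x)
}

# Usage
versions <- list(a = 1, b = 2)
map_chr(versions, function(x) paste0("v", x))
compact(list(a = 1, b = NULL))
```

Because `vapply()` checks that every result is a single string, the helper fails early on unexpected return values, which is part of what makes the purrr-style interface attractive in the first place.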
## Recommended scaffolding {#recommended-scaffolding}
- For HTTP requests we recommend using [httr2](https://httr2.r-lib.org), [httr](https://httr.r-lib.org), [curl](https://jeroen.r-universe.dev/curl#), or [crul](http://docs.ropensci.org/crul/) over [RCurl](https://cran.rstudio.com/web/packages/RCurl/). If you like low level clients for HTTP, curl is best, whereas httr2, httr and crul are better for higher level access.
- For parsing JSON, use [jsonlite](https://github.com/jeroen/jsonlite) instead of [rjson](https://cran.rstudio.com/web/packages/rjson/) or [RJSONIO](https://cran.rstudio.com/web/packages/RJSONIO/).
- For parsing, creating, and manipulating XML, we strongly recommend [xml2](https://cran.rstudio.com/web/packages/xml2/) for most cases. [You can refer to Daniel Nüst's notes about migration from XML to xml2](https://gist.github.com/nuest/3ed3b0057713eb4f4d75d11bb62f2d66).
- For spatial data, the [sp](https://github.com/edzer/sp/) package should be considered deprecated in favor of [sf](https://r-spatial.github.io/sf/), and the packages rgdal, maptools, and rgeos will be retired by the end of 2023. We recommend use of the spatial suites developed by the [r-spatial](https://github.com/r-spatial) and [rspatial](https://github.com/rspatial) communities. See [this GitHub issue](https://github.com/ropensci/software-review-meta/issues/47) for relevant discussions.
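As a small illustration of the JSON recommendation above, the sketch below parses a JSON string with jsonlite and converts the result back. The JSON content is made up for the example; `fromJSON()` and `toJSON()` are the package's main entry points.

```r
library(jsonlite)

# Made-up JSON payload, e.g. what a web API might return
json <- '{"name": "mypkg", "downloads": [10, 20, 30]}'

# Parse JSON into R objects (a named list here)
parsed <- fromJSON(json)
parsed$name
sum(parsed$downloads)

# Convert back to JSON; auto_unbox = TRUE writes length-one vectors as scalars
toJSON(parsed, auto_unbox = TRUE)
```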
## Version Control {#version-control}
- Your package source files have to be under version control, more specifically tracked with [Git](https://happygitwithr.com/). You might find the [gert package](https://docs.ropensci.org/gert/) relevant, as well as some of [usethis Git/GitHub related functionality](https://usethis.r-lib.org/reference/index.html#section-git-and-github); you can however use git as you want.
- Make sure to list "scrap" such as `.DS_Store` files in .gitignore. You might find the [`usethis::git_vaccinate()` function](https://usethis.r-lib.org/reference/git_vaccinate.html), and the [gitignore package](https://docs.ropensci.org/gitignore/) relevant.
- A later section of this book contains some [git workflow tips](#gitflow).
## Miscellaneous CRAN gotchas {#crangotchas}
This is a collection of CRAN gotchas that are worth avoiding at the outset.
- Make sure your package title is in Title Case.
- Do not put a period on the end of your title.
- Do not put 'in R' or 'with R' in your title as this is obvious from packages hosted on CRAN. If you would like this information to be displayed on your website nonetheless, check the [`pkgdown` documentation](https://pkgdown.r-lib.org/reference/build_home.html#yaml-config-home) to learn how to override this.
- Avoid starting the description with the package name or "This package ...".
- Make sure you include links to websites if you wrap a web API, scrape data from a site, etc. in the `Description` field of your `DESCRIPTION` file. URLs should be enclosed in angle brackets, e.g. `<https://www.r-project.org>`.
- In both the `Title` and `Description` fields, the names of packages or other external software must be quoted using single quotes (e.g., *'Rcpp' Integration for the 'Armadillo' Templated Linear Algebra Library*).
- Avoid long-running tests and examples. Consider `testthat::skip_on_cran()` in tests to skip things that take a long time but still test them locally and on [continuous integration](#ci).
- Include top-level files such as `paper.md` and continuous integration configuration files in your `.Rbuildignore` file.
For further gotchas, refer to the collaborative list maintained by ThinkR, ["Prepare for CRAN"](https://github.com/ThinkR-open/prepare-for-cran).
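Some of the title-related gotchas above can be checked mechanically. The helper below is a hypothetical sketch, not part of any package: `check_title()` is a name invented here, and it relies on `tools::toTitleCase()` from base R, whose notion of Title Case is a heuristic, so treat its output as a hint rather than a verdict.

```r
# Hypothetical helper flagging a few common problems in a package title
check_title <- function(title) {
  problems <- character()
  # CRAN asks for no period at the end of the title
  if (grepl("\\.$", title, perl = TRUE)) {
    problems <- c(problems, "ends with a period")
  }
  # 'in R' / 'with R' is redundant for a CRAN package
  if (grepl("\\b(in|with) R\\b", title, perl = TRUE)) {
    problems <- c(problems, "mentions 'in R' or 'with R'")
  }
  # Compare against base R's Title Case heuristic
  if (!identical(title, tools::toTitleCase(title))) {
    problems <- c(problems, "may not be in Title Case")
  }
  problems
}

# Usage: returns a character vector of detected problems (empty if none)
check_title("Plotting with R.")
```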
### CRAN checks {#cranchecks}
Once your package is on CRAN, it will be [regularly checked on different platforms](https://blog.r-hub.io/2019/04/25/r-devel-linux-x86-64-debian-clang/#cran-checks-101). Failures of such checks, when not false positives, can lead to the CRAN team reaching out. You can monitor the state of the CRAN checks via
- the [`foghorn` package](https://fmichonneau.github.io/foghorn/).
- the [CRAN checks badges](https://github.com/r-hub/cchecksbadges).
## Bioconductor gotchas {#bioconductor-gotchas}
If you intend your package to be submitted to, or if your package is on, Bioconductor, refer to [Bioconductor packaging guidelines](https://www.bioconductor.org/developers/package-guidelines/) and the [updated developer book](https://contributions.bioconductor.org/).
## Further guidance {#further-guidance}
- If you are submitting a package to rOpenSci via the [software-review repo](https://github.com/ropensci/software-review), you can direct further questions to the rOpenSci team in the issue tracker, or in our [discussion forum](https://discuss.ropensci.org/).
- Read the [authors guide](#authors-guide).
- Read, incorporate, and act on advice from the [*Collaboration Guide* chapter](#collaboration).
### Learning about package development {#learning-about-package-development}
#### Books {#books}
- [Hadley Wickham and Jenny Bryan's *R packages* book](https://r-pkgs.org/) is an excellent, readable resource on package development which is available for [free online](https://r-pkgs.org/) (and can be bought in [print](https://www.oreilly.com/library/view/r-packages/9781491910580/)).
- [Writing R Extensions](https://cran.r-project.org/doc/manuals/r-release/R-exts.html) is the canonical, usually most up-to-date, reference for creating R packages.
- [*Mastering Software Development in R* by Roger D. Peng, Sean Kross, and Brooke Anderson](https://bookdown.org/rdpeng/RProgDA/).
- [*Advanced R* by Hadley Wickham](https://adv-r.hadley.nz/)
- [*Tidyverse style guide*](https://style.tidyverse.org/)
- [*Tidyverse design guide*](https://design.tidyverse.org/) (WIP) and the accompanying [newsletter](http://tidydesign.substack.com/).
#### Tutorials {#tutorials}
- [Your first R package in 1 hour](https://www.pipinghotdata.com/posts/2020-10-25-your-first-r-package-in-1-hour/) by Shannon Pileggi.
- [This workflow description by Emil Hvitfeldt](https://www.emilhvitfeldt.com/post/2018-09-02-usethis-workflow-for-package-development/).
- [This pictorial by Matthew J Denny](https://www.mjdenny.com/R_Package_Pictorial.html).
#### Blogs {#blogs}
- [R-hub blog](https://blog.r-hub.io/post).
- Some posts of the [rOpenSci blog](https://ropensci.org/archive/) e.g. ["How to precompute package vignettes or pkgdown articles"](https://ropensci.org/blog/2019/12/08/precompute-vignettes/).
- Package Development Corner section of [rOpenSci newsletter](https://ropensci.org/news/).
- Some posts of the [tidyverse blog](https://www.tidyverse.org) e.g. ["Upgrading to testthat edition 3"](https://www.tidyverse.org/blog/2022/02/upkeep-testthat-3/).
#### MOOCs {#moo-cs}
There is a [Coursera specialization corresponding to the book by Roger Peng, Sean Kross and Brooke Anderson](https://fr.coursera.org/specializations/r), with a course specifically about R packages.