# Advanced topics
In this workshop we have worked through the basic steps required to create an
R package. In this section we introduce some of the advanced topics that may be
useful for you as you develop more complex packages. These are included here to
give you an idea of what is possible for you to consider when planning a
package. Most of these topics are covered in Hadley Wickham's "Advanced R" book
(https://adv-r.hadley.nz) but there are many other guides and tutorials
available.
## Including datasets
It can be useful to include small datasets in your R package which can be used
for testing and examples in your vignettes. You may also have reference data
that is required for your package to function. If you already have the data as
an object in R it is easy to add it to your package with `usethis::use_data()`.
The `usethis::use_data_raw()` function can be used to write a script that reads
a raw data file, manipulates it in some way and adds it to the package with
`usethis::use_data()`. This is useful for keeping a record of what you have done
to the data and updating the processing or dataset if necessary. See the
"Data" section of "R packages" (http://r-pkgs.had.co.nz/data.html) for more
details about including data in your package.
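
As a rough sketch of that workflow, the snippet below builds a tiny data frame
and adds it to a package. The object name `example_scores` and its contents are
made up for illustration, and the calls are guarded so nothing happens unless
the script is run from the root of a package (a directory with a `DESCRIPTION`
file).

```r
# A small, made-up dataset; substitute your own object here
example_scores <- data.frame(
  sample = c("a", "b", "c"),
  score  = c(2.5, 3.1, 4.7)
)

# Only touch the file system when run from the root of a package
if (file.exists("DESCRIPTION")) {
  # Saves the object as data/example_scores.rda
  usethis::use_data(example_scores, overwrite = TRUE)

  # Creates data-raw/example_scores.R, a script where the steps used to
  # build the dataset can be recorded
  usethis::use_data_raw("example_scores", open = FALSE)
}
```

Datasets saved this way should also be documented like the functions in your
package.
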
## Designing objects
If you work with data types that don't easily fit into a table or matrix you may
find it convenient to design specific objects to hold them. Objects can also
be useful for holding the output of functions such as those that fit models or
perform tests. R has several different object systems. The S3 system is the
simplest and probably the most commonly used. Packages in the Bioconductor
ecosystem make use of the more formal S4 system. If you want to learn more about
designing R objects a good place to get started is the "Object-oriented
programming" chapter of Hadley Wickham's "Advanced R" book
(https://adv-r.hadley.nz/oo.html). Other useful guides include
Nicholas Tierney's "A Simple Guide to S3 Methods"
(https://arxiv.org/abs/1608.07161) and Stuart Lee's "S4: a short guide for the
perplexed" (https://stuartlee.org/post/content/post/2019-07-09-s4-a-short-guide-for-perplexed/).
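
To give a flavour of the S3 system, here is a minimal sketch of a class for
holding the output of a model-fitting function. The class name `fit_result`,
the constructor `new_fit_result()` and the fields are hypothetical and chosen
only for illustration.

```r
# Constructor: checks the inputs and attaches the class attribute
new_fit_result <- function(estimate, std_error) {
  stopifnot(is.numeric(estimate), is.numeric(std_error))
  structure(
    list(estimate = estimate, std_error = std_error),
    class = "fit_result"
  )
}

# S3 method: print() dispatches here for objects of class "fit_result"
print.fit_result <- function(x, ...) {
  cat("<fit_result> estimate:", x$estimate, "std. error:", x$std_error, "\n")
  invisible(x)
}

res <- new_fit_result(estimate = 1.5, std_error = 0.2)
print(res)
```

An S3 object is an ordinary R object with a `class` attribute, and a method is
a function named `generic.class`. In a package, methods also need to be
registered in the `NAMESPACE` file, which **roxygen2** can do through the
`@export` tag.
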
## Integrating other languages
If software for completing a task already exists but is in another language it
might make sense to write an R package that provides an interface to the
existing implementation rather than reimplementing it from scratch. Here are some
of the R packages that help you integrate code from other languages:
* **Rcpp** (C++) http://www.rcpp.org/
* **reticulate** (Python) https://rstudio.github.io/reticulate/
* **RStan** (Stan) https://mc-stan.org/users/interfaces/rstan
* **rJava** (Java) http://www.rforge.net/rJava/
Another common reason to include code from another language is to improve
performance. While it is often possible to make code faster by reconsidering
how things are done within R, sometimes there is no alternative. The **Rcpp**
package makes it very easy to write snippets of C++ code that are called from R.
Depending on what you are doing, moving even very small bits of code to C++ can
have big impacts on performance. Using **Rcpp** can also provide access to
existing C libraries for specialised tasks. The "Rewriting R code in C++"
section of "Advanced R" (https://adv-r.hadley.nz/rcpp.html) explains when and
how to use **Rcpp**. You can find other resources including a gallery of
examples on the official **Rcpp** website (http://www.rcpp.org/).
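
As a small illustration of calling C++ from R, the sketch below uses
`Rcpp::cppFunction()` to compile a function from a string and make it available
in the R session. The function name `sum_cpp` is made up for this example, and
running it assumes that **Rcpp** and a working C++ compiler are installed.

```r
# Compile a C++ function and expose it to R under the same name
Rcpp::cppFunction("
  double sum_cpp(NumericVector x) {
    double total = 0;
    for (int i = 0; i < x.size(); ++i) {
      total += x[i];
    }
    return total;
  }
")

# Call it like any other R function
sum_cpp(c(1, 2, 3))
```

This inline approach is convenient for experimenting. In a package the C++
code would normally live in files under `src/` instead, and
`usethis::use_rcpp()` can help with setting that up.
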
## Metaprogramming
Metaprogramming refers to code that reads and modifies other code. This may
seem like an obscure topic but it is important in R because of its
relationship to non-standard evaluation (another fairly obscure topic). You
may not have heard of non-standard evaluation before but it is likely you have
used it. This is what happens whenever you provide a function with a bare name
instead of a string or a variable. Metaprogramming particularly becomes
relevant to package development if you want to have functions that make use of
packages in the Tidyverse such as **dplyr**, **tidyr** and **purrr**. The
"Metaprogramming" chapter of "Advanced R"
(https://adv-r.hadley.nz/metaprogramming.html) covers the topic in more detail
and the "Tidy evaluation" book (https://tidyeval.tidyverse.org/) may be useful
for learning how to write functions that use Tidyverse packages.
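
The sketch below shows both ideas in a simplified form. The function names
`show_name()` and `mean_by()` are hypothetical. The second part assumes a
reasonably recent version of **dplyr** (1.0.0 or later) is installed and uses
the built-in `mtcars` dataset.

```r
# Base R: capture the code the caller typed rather than its value
show_name <- function(x) {
  deparse(substitute(x))
}

show_name(my_variable) # returns "my_variable", even though no such object exists

# Tidy evaluation: pass bare column names through to dplyr using {{ }}
mean_by <- function(data, group, value) {
  grouped <- dplyr::group_by(data, {{ group }})
  dplyr::summarise(grouped, mean_value = mean({{ value }}), .groups = "drop")
}

result <- mean_by(mtcars, cyl, mpg)
result
```

Without the `{{ }}` (embrace) operator, **dplyr** would look for columns
literally named `group` and `value` instead of the columns the user supplied.
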