5 Tips and Tricks
- Spell check functionality in RStudio
Although often overlooked, RStudio can check your spelling
- New line for each sentence
5.1 Global chunk options
It can be useful to set chunk options globally for all subsequent chunks to avoid retyping or copying and pasting them:
# Save all plots as 600 DPI TIFF files
knitr::opts_chunk$set(dev = "tiff", dpi = 600)

# Do not evaluate subsequent chunks (debugging or fine-tuning)
knitr::opts_chunk$set(eval = FALSE)
See the knitr chunk options and package options for an overview of the available settings.
5.2 Meaningful chunk names
…
5.3 Text references
Because papaja extends bookdown, you can use text references in any papaja document.
A text reference consists of a unique label, defined as (ref:unique-label) somewhere in the body of the document (not inside a code chunk), and the text that the label stands in for.
Text references must be defined in a separate single-line paragraph with empty lines above and below:
(ref:my-caption) This is a caption for my table.
The definition of a text reference must be on a single line and should not end with whitespace. Hence, the following will not work as expected:
(ref:my-caption)
This is a caption for my table.
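A malformed definition like the one above produces no error, so it can be easy to miss. As a minimal sketch, the following base R helper scans the lines of a document for text-reference definitions that carry no text on the same line or that end in whitespace. The function check_text_refs() and the example lines are hypothetical and not part of papaja or bookdown, and the pattern only covers definitions that start at the beginning of a line.

```r
# Hypothetical helper (not part of papaja or bookdown): return the indices of
# text-reference definitions that have no text on the same line or that end
# in whitespace
check_text_refs <- function(lines) {
  is_def      <- grepl("^\\(ref:[^)]+\\)", lines)
  label_only  <- grepl("^\\(ref:[^)]+\\)[[:space:]]*$", lines)
  trailing_ws <- is_def & grepl("[[:space:]]$", lines)
  which(label_only | trailing_ws)
}

# Hypothetical document lines: one valid and two problematic definitions
rmd <- c(
  "(ref:my-caption) This is a caption for my table.",
  "",
  "(ref:broken-caption)",
  "This is a caption for my table.",
  "",
  "(ref:trailing) Ends with a space. "
)

check_text_refs(rmd)
#> [1] 3 6
```

In practice, the lines could come from readLines() applied to the manuscript file.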
Using text references for table and figure captions has several advantages:
- Markdown and \(\LaTeX\) syntax is not well supported in chunk options, such as fig.cap, or elsewhere inside a code chunk, e.g., in the caption argument of apa_table(). Specifically, Markdown formatting and citation syntax are ignored (rendered as-is) and, for example, \ and _ must be escaped to prevent errors in either R or \(\LaTeX\). None of these limitations apply to text references.
- Long captions can impair the readability of the document when they are part of chunk options or R code.
- knitr considers modifications of the text in fig.cap as changes to the code chunk and hence may invalidate the cache of a chunk. As a consequence, cached results of the code chunk need to be recomputed. Changes to the text of a text reference do not invalidate the cached computations.
- It’s straightforward to include inline code chunks in text references.
Text references can be used to duplicate information throughout the document. Consider the following example.
(ref:aesthetics) Points represent condition means, error bars represent 955% confidence intervals.
(ref:caption1) An interesting plot.
```{r fig.cap = paste("(ref:caption1)", "(ref:aesthetics)")}
apa_beeplot(data = npk, id = "block", dv = "yield", factors = "N")
```
(ref:caption2) Another interesting plot.
```{r fig.cap = paste("(ref:caption2)", "(ref:aesthetics)")}
apa_beeplot(data = npk, id = "block", dv = "yield", factors = "N")
```
The resulting figure caption combines the two text references (see Figure 5.1).
Because the information about what points and error bars represent is repeated using a text reference, and not by literal repetition throughout the document, it’s easy to correct the typo (955% confidence intervals) and be sure that it is corrected in every instance.
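To see why the combination works, it may help to look at what the chunk option evaluates to. The following is a small sketch in plain R that assumes nothing beyond the labels from the example above: paste() returns a single string that still contains both labels, and each label is replaced by its text only later, when the text references are resolved.

```r
# fig.cap is evaluated as R code, so paste() joins the two labels
# with a single space (its default separator)
caption <- paste("(ref:caption1)", "(ref:aesthetics)")
caption
#> [1] "(ref:caption1) (ref:aesthetics)"
```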
5.4 Useful RStudio Addins
- citr: Insert Markdown Citations
- remedy: Keyboard shortcuts for Markdown formatting
- splitChunk: Split R Markdown code chunks
- gramr: Write-good linter
- wordcount: Word counts and readability statistics
Set up keyboard shortcuts via Tools > Modify keyboard shortcuts.
Suggested keyboard shortcuts
Package | Addin | Keyboard shortcut |
---|---|---|
citr | Insert citation | Shift + Alt + R |
wordcount | Word count | Shift + Alt + C |
splitChunk | Chunk split | Shift + Alt + S |
remedy | Bold | Shift + Alt + B |
remedy | Italic | Shift + Alt + I |
remedy | Backtick | Shift + Alt + P |
remedy | URL | Shift + Alt + U |
- statcheck: Extract Statistics from Articles and Recompute p Values
- retractcheck: Check DOIs in a paper for retractions
5.5 Reproducible software environments
To ensure mid- to long-term computational reproducibility, we highly recommend conserving the software environment used to write a manuscript (e.g., R and all R packages) either in a software container or a virtual machine. This helps to avoid code rot (that is, your R code breaking because of updates to, for example, R or any R package) and ensures you can reproduce your analysis in the years to come. For a brief primer on containers and virtual machines, see the supplementary material by Klein et al. (2018).
5.5.1 Docker
Docker is probably the most widely used containerization approach.
Docker containers are similar to virtual machines: insulated software environments (system libraries, R, R packages, RStudio, LaTeX, LaTeX packages, etc.) that run inside your host system.
Docker works on most operating systems and is widely used, free, and open source.
It just requires some disk space.
For a concise hands-on introduction see the ROpenSci Docker tutorial; a more detailed introduction is available from the Docker project.
Docker containers are configured using so-called Docker files that act as a recipe for the software environment.
With the Docker file, anyone can automatically recreate the software environment that you used and rerun your analysis.
As a starting point for your container you can build on the following Docker file, which sets up everything that is needed for creating a manuscript with papaja, including an instance of RStudio that you can access through your browser:
# Look up available R versions at https://github.com/rocker-org/rocker-versioned/tree/master/verse
FROM rocker/verse:3.6.3
# Install papaja dependencies
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
    libgsl0-dev \
    libnlopt-dev
RUN install2.r --error \
  --skipinstalled \
  --deps TRUE \
  rmdfiltr
# Required by broom -- obsolete once newer versions are available from MRAN
RUN Rscript -e "remotes::install_version('rlang', '0.4.7', repos = 'http://cran.us.r-project.org', upgrade = FALSE, Ncpus = 3)"
RUN Rscript -e "remotes::install_version('tidyselect', '1.1.0', repos = 'http://cran.us.r-project.org', upgrade = FALSE, Ncpus = 3)"
RUN Rscript -e "remotes::install_version('vctrs', '0.3.2', repos = 'http://cran.us.r-project.org', upgrade = FALSE, Ncpus = 3)"
RUN Rscript -e "remotes::install_version('dplyr', '1.0.0', repos = 'http://cran.us.r-project.org', upgrade = FALSE, Ncpus = 3)"
# Latest papaja development version
RUN Rscript -e "remotes::install_github('crsh/papaja', quick = FALSE, build = TRUE, dependencies = c('Depends', 'Imports'), Ncpus = 3, upgrade = FALSE)"
Place this Docker file in your project directory alongside the following shell script (for macOS or Linux):
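#!/bin/sh
# Build the image from the Docker file in the current directory
docker build \
  --build-arg RSTUDIO_VERSION=1.3.1093 \
  -t container-name .
# Start the container in the background and share the project directory with it
docker run -d \
  -p 8787:8787 \
  -e DISABLE_AUTH=true \
  -e ROOT=TRUE \
  -v "$(pwd)":/home/rstudio \
  container-name
# Give RStudio a moment to start, then open it in the browser
# (open and ipconfig getifaddr, as used here, are macOS commands)
sleep 1
open "http://$(ipconfig getifaddr en0):8787"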
Execute this script in your project directory to set up and run the container. The script will take a little while to finish the first time around (it downloads the base container and installs all needed R packages), but should be fast the next time. Finally, a browser window with an instance of RStudio should open, and all files from your project directory should be shared between your container and the host system. You can work in that RStudio instance in your browser just as you usually would.
If you install additional R packages inside the running container (e.g., using install.packages()), they will be lost once you stop the container.
While this may seem inconvenient, it ensures that your Docker file (that is, your recipe) is complete.
To permanently install new R packages in your container, add them to the Docker file. For example,
RUN install2.r --error \
  --skipinstalled \
  --deps TRUE \
  rmdfiltr \
  afex \
  emmeans
Note that all R packages are installed from MRAN, which serves packages as they were available from CRAN on a particular date in the past. For the rocker images used here, this date is the last day on which the desired version of R was the most recent release; see the rocker version information for details.
To speed up package installation by using several processes, add --ncpus 3 \ to the above RUN instructions.
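To check which R version and package repository a running container actually uses, you could run something like the following in the RStudio instance inside the container. This is only a sketch using base R; the repository URL it prints depends on the image.

```r
# R version provided by the container image
R.version.string

# Repositories that install.packages() would use in this session
getOption("repos")

# Version of an installed package (here a package that ships with R)
packageVersion("stats")
```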
5.5.2 CodeOcean
CodeOcean is a commercial service that builds on Docker, facilitates setting up and sharing containers, and lets you run computations in the cloud.
In case you prefer CodeOcean over plain Docker, you may be interested in the minimal papaja example capsule that CodeOcean’s Seth Green has kindly prepared.
If you want to use papaja in your next CodeOcean project, you may use this capsule as a starting point.
5.6 RStudio
- Document outline
RStudio provides a handy document outline view
5.7 Splitting an R Markdown document
Some authors may prefer to split long manuscripts into multiple component files for better clarity.
There are two basic strategies to split R Markdown documents that can be combined or used in isolation: sourcing R scripts and splitting the R Markdown document.
If the R Markdown document contains a lot of code, it may be helpful to move parts of the code, such as reading, merging, restructuring, and relabeling data files, into separate R scripts.
The R scripts can then be executed at the respective section of the document using source().
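As a minimal sketch of the first strategy, the following example writes a small helper script and then runs it with source(). The script contents and the object name npk_means are hypothetical; in a real project the script would be a regular file in the project directory (for example, a data-preparation script) rather than a temporary file.

```r
# Hypothetical helper script; written to a temporary file only to keep
# this sketch self-contained
script <- tempfile(fileext = ".R")
writeLines(
  "npk_means <- aggregate(yield ~ N, data = npk, FUN = mean)",
  script
)

# In the manuscript, the chunk at the relevant section would only contain
# a call like this one
source(script)

# Objects created by the script are now available to subsequent chunks
npk_means
```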
Some authors may prefer to split long manuscripts into a master or parent document and multiple children.
The master document, for example, consists of the YAML front matter and includes the children, which are themselves R Markdown documents without a YAML front matter.
To include a child document, insert an empty chunk and provide the path to the R Markdown document in the chunk option child.
It may be preferable to split long documents into multiple files
```{r child = "introduction.Rmd"}
```
```{r child = "method.Rmd"}
```
```{r child = "results.Rmd"}
```
```{r child = "discussion.Rmd"}
```
Search all files with Ctrl + Shift + F.
5.8 Best practices
- Load all R packages in the first code chunk
  - Never include install.packages()
- Set a seed for random number generators (e.g., set.seed())
- Never use setwd()!
- Use relative paths or load files from a permanent location
- Use meaningful chunk names
- Keep R code close to the corresponding prose
- Document R and R-package versions (e.g., devtools::session_info())
- Try to ensure you can knit without errors before going home
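Several of these practices can be combined in the first code chunk of a manuscript. The following is a sketch under a few assumptions: only packages that ship with R are loaded, the seed value is arbitrary, and the data file path is hypothetical. It uses sessionInfo() from base R in place of devtools::session_info() to avoid an additional dependency.

```r
# Load all packages up front (here only packages that ship with R)
library("stats")
library("utils")

# Fix the state of the random number generator so that simulated values
# can be reproduced exactly
set.seed(42)
x <- rnorm(5)
set.seed(42)
y <- rnorm(5)
identical(x, y)
#> [1] TRUE

# Build paths relative to the project directory instead of calling setwd();
# the file name is hypothetical
data_file <- file.path("data", "raw_data.csv")

# Record the R version and the versions of all attached packages
sessionInfo()
```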
5.9 Troubleshooting
As detailed in Document compilation, rendering a papaja document involves several software packages.
This layered software design grants the package its capabilities, but it comes at a cost:
When compilation of a papaja document throws an error, it may not be immediately obvious to an inexperienced user which part of the process failed.
However, the error message usually gives some indication of which portion of the process failed:
- Parsing of the YAML front matter
  - Error in yaml::yaml.load(enc2utf8(string), ...) :
- R code execution
  - Error: object 'x' not found
- bookdown adds cross- and text-references
  - No error messages; look for unresolved references in the text
- pandoc document conversion
  - Error: pandoc document conversion failed with error 1
  - Error running filter /path/to/filter/filter.lua
- pandoc-citeproc reference generation
  - pandoc-citeproc: Cannot decode byte '\xfc'
  - pandoc-citeproc: reference X not found (shows up as ??? in text)
- \(\LaTeX\) PDF generation
  - ! Missing $ inserted
It is often helpful to search the internet for the error messages or portions thereof. Many times others will have encountered the same problem and may have documented their solution. In the following we provide some general advice.
5.9.1 YAML
TBD
Double check indentation and white space.
5.9.2 R
Fixing bugs in R Markdown documents can be challenging because the code is run in a new non-interactive R session. This makes it a little harder to play around to pinpoint what’s causing the problem. Hence, do your best to recreate the problem in your interactive session:
- Restart R (Session > Restart R or Ctrl + Shift + F10 in RStudio)
- Compare the working directories (e.g., use getwd() in the console and in a code chunk of your knitted document)
- Run every chunk individually until you get the error
If you can’t reproduce the problem in your interactive R session, there must be some difference between it and the R environment of your document. Once you identify that difference, you will often know what is causing the problem. If you don’t, you can at least now recreate the problem in your interactive R session and start debugging. To learn more about debugging techniques, refer to Advanced R (Wickham, 2019).
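One way to look for such differences is to run the same few diagnostic calls in the console and in a code chunk of the document and compare the results. The selection below is only a suggestion and uses base R exclusively; which differences matter depends on your code.

```r
# Working directory; chunks are typically evaluated in the directory
# of the R Markdown document
getwd()

# Attached packages; a package attached only in the interactive session
# is a common source of discrepancies
search()

# Objects in the global environment; a freshly started session
# should not list any leftovers from earlier work
ls(globalenv())
```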
5.9.3 bookdown
TBD
Don’t use _ in chunk names!
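If a document already contains many chunks, a quick check of the labels can help. The following sketch uses hypothetical chunk labels; it flags labels that contain an underscore and suggests hyphenated replacements.

```r
# Hypothetical chunk labels
labels <- c("fig-results", "tab_descriptives")

# Which labels contain an underscore?
has_underscore <- grepl("_", labels, fixed = TRUE)
labels[has_underscore]
#> [1] "tab_descriptives"

# Suggested replacements that use hyphens instead of underscores
fixed_labels <- gsub("_", "-", labels, fixed = TRUE)
fixed_labels
#> [1] "fig-results"      "tab-descriptives"
```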
5.9.5 \(\LaTeX\)
TBD
Inspect the log file.