Introduction

I like to set up all of my projects in the form of an R package. This imposes a consistent structure to everything so I always know where to find things. R packages aren’t perfectly suited for everything, but I’m willing to sacrifice custom file organization for consistency. I also like to reduce the choice paralysis that can set in with trying to figure out how to organize a project. For more on using R packages for setting up projects - and some extensions and alternatives – see “Packaging Data Analytical Work Reproducibly Using R (and Friends)” by Marwick et al (2018)..

The key elements of this approach are

  1. Use the R package folder structure to organize everything
  2. Write all your analyses and documentation as .Rmd Vignettes
  3. Put any custom functions in the R directory as .R files.
  4. raw data
  5. data
  6. install

One of the major drawback of this approach is that the “install” folder is …

The following code outlines the key steps I use to set up the framework for a package, principally using the “how could I do this with out it” usethis package.

Required packages

I use these packages for setting things up:

  1. devtools: for general developer stuff
  2. usethis: for functions to automatically build key package components
  3. here: for setting up paths to directories and files.
library(devtools)
library(usethis)
library(here)
#library(spelling)

Preliminaries

Create a repository using GitHub

I create the GitHub repo for a package directly via the GitHub website. (There’s probably a way to do this from within R this but I haven’t bothered to figure it out).

Associate an R project with the GitHub using RStudio

I then create an R project and directory for the packing from within RStudio using

  1. File -> New Project -> Version Control -> Git
  2. Paste in link to the repo from GitHub

Set up the package with usethis

I then use usethis::create_package() within my project directory to build the basic package infrastructure This over rights the initial project (after a handy prompt in the console).

usethis::create_package(path = getwd())

Set up Vignettes for analyses and documentation

Create vignette infrastructure

I create the vignette infrastructure and a dummy vignette like this:

usethis::use_vignette("z",
                      title = "Adding data to an R Package")

I then write my vignettes using this template and change the file name. I label all my vignettes with a leading “a)”, “b)”, etc to keep them easily organized in the vignettes folder and in the dropdown menu of my rendered pkgdown site.

Copy in package-making script

I copy this script and related ones for maintaining the package into the vignettes folder. I call this script something like “x) building_this_package.Rmd”. (I always include this script in my packages so I can easily find the most up to date version just by going to my most recent project).

Create the readme.md file

The readme.md file gives anyone viewing the package material a basic intro to the package and how to load it. On GitHub, the readme is rendered to HTML to serve as a basic landing page.

Create the “readme” file.

These contain basic information about the

usethis::use_readme_md()

Populating the readme file

I add basic information to the readme fill and add sample code for a call to devtools::install_github() to install the package. The chunk below is an example of this boilerplate:

Example materials for readme

The development version of biodata is on GitHub. If you don’t already have it, you will need to install the devtools package

install.packages("devtools")

You can then install the biodata from GitHub with:

devtools::install_github("brouwern/biodata")

Once I have added data and/or functions to the package its good to present a typical or motivating use of the package.

Add required packages to

These are the packages I typically use. I should vectorize this so it would look nicer : )

# admin
usethis::use_package("here",   "Imports")
usethis::use_package("devtools",   "Imports")

# visualization
usethis::use_package("ggplot2", "Imports")
usethis::use_package("ggpubr",  "Imports")
usethis::use_package("cowplot",   "Imports")

#usethis::use_package("tint",   "Imports")
#usethis::use_package("formatR",   "Imports")
#usethis::use_package("readxl",   "Imports")
#usethis::use_package("dplyr",   "Imports")
#usethis::use_package("pander",   "Imports")
#usethis::use_package("plotrix",   "Imports")

#usethis::use_package("reshape2",   "Imports")

#usethis::use_package("dplyr",   "Imports")
#usethis::use_package("tidyr",   "Imports")

# use_package("RCurl", "Imports")
# use_package("GGally", "Imports")
# use_package("broom", "Imports")

Other files

Create git ignore file

The git ignore file tells your local git installation to not track and push unnecessary file.

usethis::git_vaccinate() “Adds .DS_Store, .Rproj.user, .Rdata, .Rhistory, and .httr-oauth to your global (a.k.a. user-level) .gitignore. This is good practice as it decreases the chance that you will accidentally leak credentials to GitHub.”

usethis::git_vaccinate()

I always have trouble with gitingore and haven’t taken the time to learn more about it. I am not sure how to use this properly yet.

usethis::use_git_ignore(".pdf", directory = ".")
usethis::use_git_ignore(".xls", directory = ".")
usethis::use_git_ignore(".xlsx", directory = ".")
usethis::use_git_ignore(".docx", directory = ".")

Create package news files

usethis::use_news_md()

Other thinks

Don’t save/load user workspace between sessions

usethis::use_blank_slate()

Set up documentation files

Set roxygen as your framework documentation

Where would we be without roxygen?!?

usethis::use_roxygen_md()

Package-level documents

“Adds a dummy .R file that will prompt roxygen to generate basic package-level documentation.”

usethis::use_package_doc()

Set up pkgdown for creating a website for the project

For making front end website

usethis::use_pkgdown()

Set up data for the project

Create folder for external data.

R packages often have a folder called “/inst” which stands for “installed.” This folder is usually for miscellaneous files associated with the package. This includes external data (“/extdata”) such as .csv files.

dir.create(here::here("/inst"))
dir.create(here::here("/inst/extdata"))

This could be done with use_directory()

Look at data in my extdata file

External data is stored in “/inst/extdata”

list.files(here::here("/inst/extdata"))

Raw data

I copy raw data files into the “/inst/extdata”.

Raw data prep

If there are any data processing steps that I don’t want to include in the vignettes I put them into the directory “packagename/data-raw”. This structure for this directory and data prep script can be made using use_data_raw()

usethis::use_data_raw()

In a separate article I summarize the steps for setting up an actual data set.

License

For information on licenses see http://kbroman.org/pkg_primer/pages/licenses.html

Plain text versions of licenses can be found at https://creativecommons.org/2011/04/15/plaintext-versions-of-creative-commons-licenses-and-cc0/

usethis::use_gpl_license(version = 3, include_future = TRUE)

An error can occur if you have a separate license file. e.g.

File LICENSE is not mentioned in the DESCRIPTION file.

This can be fixed by changing the description from “License: GPL-3” to “License: GPL-3 + file LICENSE”

http://r-pkgs.had.co.nz/description.html#license

Spell check

“Adds a unit test to automatically run a spell check on documentation and, optionally, vignettes during R CMD check, using the spelling package. Also adds a WORDLIST file to the package, which is a dictionary of whitelisted words. See spelling::wordlist for details.”

Run devtools::check() to trigger spell check


usethis::use_spell_check(vignettes = TRUE, lang = "en-US", error = FALSE)

Documenting datasets

All datasets saved in /data must be documented.

# usethis::use_r(name = "fledgedata")

A minimal R dataset .R documentation file looks like this #’ Full dataset for Meillere et al. 2015 “telosful”

A standard R dataset .R file looks like this (https://r-pkgs.org/data.html)

#’ Prices of 50,000 round cut diamonds. #’ #’ A dataset containing the prices and other attributes of almost 54,000 #’ diamonds. #’ #’ @format A data frame with 53940 rows and 10 variables: #’ #’ @source “diamonds”

Other potentially useful usethis functions

  • use_r() Create or edit a .R file
  • use_build_ignore() Add files to .Rbuildignore
  • use_version() use_dev_version() Increment package version
  • edit_r_profile()
  • edit_r_environ()
  • edit_r_makevars()
  • edit_rstudio_snippets()
  • edit_git_config()
  • edit_git_ignore()