USGS - science for a changing world

Getting Started

Many advanced R developers across the USGS write scripts to perform important analyses. They often want to share the workflows and steps behind an analysis, but passing around scripts that require users to change hard-coded inputs is cumbersome and inefficient. This course gives advanced R programmers the skills they need to turn their scripted workflows into an R package. R packages bundle code into functions that users can easily download and install. Packages also contain documentation and help files, which should reduce the number of questions users need to ask package authors.

Course objectives

  1. Improve R programming skills.
  2. Create a manageable, testable, and version-controlled codebase for your analysis.
  3. Easily and openly share your workflows and methods with others as R packages.

Software setup

Software installation:

  • R (latest version)
  • RStudio (>1.0)
  • Rtools (Windows only; compatible with your version of R)
  • R packages: devtools, roxygen2, testthat, knitr
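
One possible way to install the R packages listed above is sketched below. It assumes a CRAN mirror is reachable from your machine; the helper name `missing_packages` is hypothetical and not part of the course materials.

```r
# Packages named in the software setup list.
course_packages <- c("devtools", "roxygen2", "testthat", "knitr")

# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

to_install <- missing_packages(course_packages)

# Install only what is missing. The interactive() guard keeps
# non-interactive runs (e.g. batch scripts) from downloading anything.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids reinstalling packages you already have; drop the `interactive()` guard if you want the same lines to run from a script.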

Suggested prerequisite knowledge

This course discusses advanced topics in programming. Transitioning from a scripting mentality to package development can be challenging, so we recommend that you have advanced knowledge of R programming before using this curriculum. The list below can help you gauge whether you are ready. In addition, the time and effort you put into package development pays off most if you already have a script that you and others would benefit from having as a package.

I am comfortable …

  • loading files into data.frames.
  • differentiating data structures and data types.
  • indexing vectors, data.frames, or lists.
  • creating scatter, line, or boxplots and saving the output as a PNG/JPG.
  • writing for loops.
  • using logical statements (>, >=, ==).
  • writing conditional statements (if, if-else).
  • installing, loading, and using additional packages.
  • troubleshooting and deciphering error messages.
  • writing and using my own functions.
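
As a rough self-check, the short sketch below touches several of the skills in the list: building and indexing a data.frame, a logical statement, a for loop, an if-else statement, and a user-defined function. The data values and the function name `classify_discharge` are made up for illustration. If every line reads as familiar, you are likely ready for the course.

```r
# Build a small data.frame in memory (stands in for a file read with read.csv()).
flow <- data.frame(
  site = c("A", "A", "B", "B"),
  discharge = c(12.5, 15.0, 3.2, 4.8),
  stringsAsFactors = FALSE
)

# Index rows with a logical statement and select a column by name.
site_a <- flow[flow$site == "A", "discharge"]

# A user-defined function containing a for loop and an if-else statement.
classify_discharge <- function(values, threshold = 10) {
  labels <- character(length(values))
  for (i in seq_along(values)) {
    if (values[i] >= threshold) {
      labels[i] <- "high"
    } else {
      labels[i] <- "low"
    }
  }
  labels
}

# Add the classification as a new column.
flow$class <- classify_discharge(flow$discharge)
```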

Course overview

Table 1. Summary of available modules and their objectives.

What Is a Package?
  1. Distinguish scripts and packages.
  2. Compare benefits and challenges of package creation.
  3. Identify alternatives to packages.
  4. Recall USGS and DOI policies related to publishing and maintaining code.

Package Mechanics
  1. List the structural components of an R package.
  2. Understand package dependency trees.
  3. Be familiar with different ways data can be included in packages.
  4. Correctly define what licenses and disclaimers are needed for USGS software.
  5. Apply the build and check features to a package.
  6. Define internal functions and know their benefits.

Version Control
  1. Define version control and give examples of how it is useful.
  2. Navigate the GitHub interface.
  3. Summarize a typical GitHub-to-R workflow.

Documentation
  1. Distinguish the types of documentation for R packages.
  2. Develop documentation for individual functions.
  3. Create a vignette to highlight the top-level package uses.
  4. Edit and update README files.

Debugging
  1. Identify different types of errors.
  2. Describe the available debugging tools in R and RStudio, namely `traceback()`, `debug()`, breakpoints, and `browser()`.
  3. Apply debugging tools to locate a particular error in your code.
  4. Use debugging tools to find errors in unfamiliar functions.

Defensive Programming
  1. Define defensive programming and give examples of problems to defend against.
  2. List common techniques for defensive programming.
  3. Construct and execute defensive programming functions.

Writing Tests
  1. Recognize the purpose and value of writing tests.
  2. Understand different types of testing and when to apply them.
  3. Describe good test writing practices.
  4. Identify appropriate testing frequency.

Maintenance
  1. Define various levels of maintenance and user groups.
  2. Discuss strategies for short- and long-term package maintenance.
  3. Explain how to communicate your level of support.
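
To give a flavor of what the Documentation, Defensive Programming, and Writing Tests modules build toward, here is a minimal sketch of a single package-style function. The function name `cfs_to_cms` and its checks are hypothetical and not taken from the course materials. The `#'` comments follow roxygen2 conventions, and the test runs only if testthat is installed.

```r
#' Convert discharge from cubic feet per second to cubic meters per second
#'
#' @param cfs numeric vector of discharge values in cubic feet per second.
#' @return numeric vector of the same length, in cubic meters per second.
#' @examples
#' cfs_to_cms(c(10, 100))
#' @export
cfs_to_cms <- function(cfs) {
  # Defensive checks: fail early with a clear message on bad input.
  if (!is.numeric(cfs)) {
    stop("`cfs` must be numeric, not ", class(cfs)[1], call. = FALSE)
  }
  if (any(cfs < 0, na.rm = TRUE)) {
    stop("`cfs` must not contain negative values", call. = FALSE)
  }
  cfs * 0.3048^3  # one cubic foot expressed in cubic meters
}

# A small unit test, skipped when testthat is not available.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("cfs_to_cms converts values and validates input", {
    testthat::expect_equal(cfs_to_cms(0), 0)
    testthat::expect_equal(cfs_to_cms(1), 0.028316846592)
    testthat::expect_error(cfs_to_cms("10"), "must be numeric")
    testthat::expect_error(cfs_to_cms(-1), "negative")
  })
}
```

In a real package the function would live in the `R/` directory and the test in `tests/testthat/`. To step through the function line by line, call `debug(cfs_to_cms)` before running it.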

Lindsay R. Carr