Difference between revisions of "R Coding Practices"

Revision as of 13:14, 3 June 2025

This article lays out some best practices for coding in R. Though it is possible to use R without it, the RStudio integrated development environment makes R easier to use and is the standard among R users. There is no single set of best practices, and the guidelines below are suggestions that can and should be adapted to each project's needs, as well as to users' preferences.


Read First

  • RStudio
  • Comments
  • Object names

Package installation

R packages are collections of functions, data, and documentation that extend the functionality of base R. They are essential for everything from data cleaning to statistical modeling and visualization.

Installing CRAN Packages

CRAN (The Comprehensive R Archive Network) is the primary repository for R packages.

To install a single package from CRAN:

install.packages("tidyverse")

To install multiple packages:

install.packages(c("tidyverse", "haven", "data.table"))

To load an installed package into the current session:

library(tidyverse)
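
Because install.packages() downloads and reinstalls a package every time it is called, scripts that are run repeatedly often check first whether the package is already available. The following is a minimal sketch of that pattern, not a prescribed DIME Analytics method. The helper names missing_packages() and install_if_missing() are hypothetical and only used for illustration; requireNamespace() and install.packages() are standard R functions.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install only the packages that are missing.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # Return the names of newly installed packages without printing them
  invisible(to_install)
}

# "stats" and "utils" ship with R, so this call installs nothing.
install_if_missing(c("stats", "utils"))
```

In a project script, the same call could be made with the packages the project needs, for example install_if_missing(c("tidyverse", "haven", "data.table")), followed by the relevant library() calls.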

Comments and script structure

Running code that returns the right result is only the first half of the job. The other half is making sure your code is easy to follow, test, and reuse. This helps teams catch mistakes, audit decisions, and collaborate more effectively. Poorly structured code increases the risk of errors.


1. Use header comments to introduce each script:

##################################################
# Script: 01_clean_data.R
# Purpose: Clean raw baseline data
# Author: First Last
# Date: 2025-05-30
# Inputs: data/raw/baseline.csv
# Outputs: data/clean/baseline_clean.rds
##################################################

2. Use section headers to structure code within each script:

# Load packages -------------------------------------------------------
# Import data ---------------------------------------------------------
# Clean variables -----------------------------------------------------
# Save outputs --------------------------------------------------------
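
To show how these section headers might look once filled in, here is a short sketch of a cleaning script. It is an illustration under stated assumptions: the file is written to a temporary folder as a stand-in for data/raw/baseline.csv, and the variable names (hh_id, age) and the -99 missing-value code are hypothetical.

```r
# Load packages -------------------------------------------------------
# Only base R is used in this sketch, so nothing needs to be loaded.

# Import data ---------------------------------------------------------
# Hypothetical stand-in for data/raw/baseline.csv, written to a temp folder
raw_path <- file.path(tempdir(), "baseline.csv")
write.csv(
  data.frame(hh_id = c(1, 2, 2, 3), age = c(34, 51, 51, -99)),
  raw_path,
  row.names = FALSE
)
baseline_raw <- read.csv(raw_path)

# Clean variables -----------------------------------------------------
# Drop exact duplicate rows
baseline_clean <- unique(baseline_raw)
# Assumption: -99 is the survey code for a missing age
baseline_clean$age[baseline_clean$age == -99] <- NA

# Save outputs --------------------------------------------------------
clean_path <- file.path(tempdir(), "baseline_clean.rds")
saveRDS(baseline_clean, clean_path)
```

In RStudio, comment lines that end in four or more dashes are treated as code sections, so they appear in the document outline and can be folded.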

Good syntax makes it easy to understand what the code is doing and why. You should:

  • Use clear, expressive names for variables and objects (e.g., baseline_data instead of bd).
  • Avoid deeply nested code and one-liners that sacrifice clarity.
  • Write logic in small chunks; long chains of operations should be broken down or commented carefully.
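
The points above can be illustrated with a small base R sketch. The data set and its variable names (household_id, income, treatment) are made up for this example; the point is only to contrast a dense one-liner with the same logic written as named, commented steps.

```r
# Hypothetical example data
baseline_data <- data.frame(
  household_id = c(1, 2, 3, 4),
  income       = c(1200, NA, 800, 1500),
  treatment    = c(1, 0, 1, 0)
)

# Harder to follow: short object name and everything nested in one line
bd <- baseline_data
r <- aggregate(income ~ treatment, data = bd[!is.na(bd$income), ], FUN = mean)

# Easier to follow: expressive names and one step at a time
# Step 1: keep only households with non-missing income
complete_households <- baseline_data[!is.na(baseline_data$income), ]

# Step 2: compute mean income by treatment status
mean_income_by_arm <- aggregate(
  income ~ treatment,
  data = complete_households,
  FUN  = mean
)
```

Both versions produce the same table, but the second one makes each decision (dropping missing incomes, grouping by treatment) visible and easy to review.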

Naming objects

Style and white space

Loops in R

Tidyverse

Version control

RStudio projects

Additional Resources

  • R-bloggers post on best practices: https://www.r-bloggers.com/r-code-best-practices/
  • DIME Analytics, World Bank shiny training: https://github.com/dime-wb-trainings/shiny-training?tab=readme-ov-file/

Related DIME Analytics Trainings