Difference between revisions of "R Coding Practices"
Revision as of 13:14, 3 June 2025
This article lays out some best practices for coding in R. Though it is possible to use R without it, the RStudio integrated development environment makes R easier to use and is the standard among R users. There is no single set of best practices, and the guidelines below are suggestions that can and should be adapted to each project's needs, as well as to users' preferences.
Read First
- RStudio
- Comments
- Object names
Package installation
R packages are collections of functions, data, and documentation that extend the functionality of base R. They are essential for everything from data cleaning to statistical modeling and visualization.
Installing CRAN Packages
CRAN (The Comprehensive R Archive Network) is the primary repository for R packages.
For example, to install a single package from CRAN:
install.packages("tidyverse")
To install multiple packages:
install.packages(c("tidyverse", "haven", "data.table"))
To load an installed package into the current session:
library(tidyverse)
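Calling install.packages() every time a script runs reinstalls packages that are already available. One common pattern is to install only what is missing and then load everything. The sketch below is one possible way to do this, not a prescribed DIME method; the helper names missing_packages() and install_and_load() are hypothetical and introduced only for illustration.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install anything missing, then attach every package.
install_and_load <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # character.only = TRUE lets library() accept package names stored as strings
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Usage (needs internet access if any package is missing):
# install_and_load(c("tidyverse", "haven", "data.table"))
```

Keeping the package list in one place at the top of the main script makes it easy for collaborators to see what a project depends on.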
Comments and script structure
Running code that returns the right result is only the first half of the job. The other half is making sure your code is easy to follow, test, and reuse. This helps teams catch mistakes, audit decisions, and collaborate more effectively. Poorly structured code increases the risk of errors.
1. Use header comments to introduce each script:
##################################################
# Script: 01_clean_data.R
# Purpose: Clean raw baseline data
# Author: First Last
# Date: 2025-05-30
# Inputs: data/raw/baseline.csv
# Outputs: data/clean/baseline_clean.rds
##################################################
2. Use section headers to structure code within each script:
# Load packages -------------------------------------------------------
# Import data ---------------------------------------------------------
# Clean variables -----------------------------------------------------
# Save outputs --------------------------------------------------------
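To show how the header comment and section headers fit together, here is a minimal sketch of a complete script. It uses only base R and writes toy data to a temporary folder so that it runs anywhere; the variables hh_id and income, the toy values, and the cleaning steps are assumptions made for illustration, not part of any real baseline dataset.

```r
##################################################
# Script:  01_clean_data.R
# Purpose: Clean raw baseline data (illustrative sketch with toy data)
# Inputs:  data/raw/baseline.csv
# Outputs: data/clean/baseline_clean.rds
##################################################

# Setup for illustration only ----------------------------------------
# A real project would already contain these folders and the raw file.
project_dir <- tempdir()
dir.create(file.path(project_dir, "data", "raw"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_dir, "data", "clean"),
           recursive = TRUE, showWarnings = FALSE)
write.csv(
  data.frame(hh_id = c(1, 2, 2, 3), income = c(100, NA, 250, 80)),
  file.path(project_dir, "data", "raw", "baseline.csv"),
  row.names = FALSE
)

# Load packages -------------------------------------------------------
# None needed: this sketch uses base R only.

# Import data ---------------------------------------------------------
baseline_raw <- read.csv(file.path(project_dir, "data", "raw", "baseline.csv"))

# Clean variables -----------------------------------------------------
# Drop observations with missing income, then add a log-income variable.
baseline_clean <- baseline_raw[!is.na(baseline_raw$income), ]
baseline_clean$income_log <- log(baseline_clean$income)

# Save outputs --------------------------------------------------------
clean_path <- file.path(project_dir, "data", "clean", "baseline_clean.rds")
saveRDS(baseline_clean, clean_path)
```

In RStudio, comment lines that end in four or more dashes are treated as code sections, so they appear in the document outline and can be folded.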
Good syntax makes it easy to understand what the code is doing and why. You should:
- Use clear, expressive names for variables and objects (e.g., baseline_data instead of bd).
- Avoid deeply nested code and one-liners that sacrifice clarity.
- Write logic in small chunks: long chains of operations should be broken down or commented carefully.
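As an illustration of the points above, the sketch below computes the same result twice: once as a nested one-liner with a terse name, and once in small, named steps. The data frame and its columns (village, income) are toy values assumed for this example.

```r
# Toy data, assumed for illustration
baseline_data <- data.frame(
  village = c("A", "A", "B", "B"),
  income  = c(100, NA, 250, 80)
)

# Harder to follow: a terse name and a nested one-liner
m <- round(tapply(baseline_data$income[!is.na(baseline_data$income)], baseline_data$village[!is.na(baseline_data$income)], mean), 1)

# Easier to follow: expressive names and one step per line
has_income <- !is.na(baseline_data$income)
complete_cases <- baseline_data[has_income, ]

# Mean income by village, rounded to one decimal place
mean_income_by_village <- tapply(
  complete_cases$income,
  complete_cases$village,
  mean
)
mean_income_by_village <- round(mean_income_by_village, 1)
```

Both versions return the same values, but the second one can be checked line by line, and each intermediate object can be inspected when something looks wrong.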
Naming objects
Style and white space
Loops in R
Tidyverse
Version control
RStudio projects
Additional Resources
- R-bloggers post on best practices
- DIME Analytics, World Bank shiny training
Related DIME Analytics Trainings
- Training session recording on Introduction to R Shiny
- Training session recording on Big data workflows with R
data.table