---
name: R Best Practices
description: Idiomatic R coding standards for statistical analysis. Use when writing R functions, using tidyverse/data.table, or structuring analysis scripts.
metadata:
  labels: [r-language, best-practices, tidyverse, vectorization]
  triggers:
    files: ['/*.R', '/*.Rmd']
    keywords: [function, library, source, vectorize, tidyverse, pipe]
---
R Best Practices
Priority: P0 (FOUNDATIONAL)
Package Management
- `library()` only: Never use `require()` in scripts. Call `library()` at the top of the script.
- Explicit namespacing: Use `pkg::fun()` to avoid namespace collisions, and always for functions you call infrequently.
- Minimal dependencies: Prefer base R when an equivalent exists. Import only what you use.
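
As a minimal sketch of the rules above, the snippet below attaches a package at the top of the script and uses `pkg::fun()` for a one-off call. The `stats` and `tools` packages ship with R; `summarise_scores` and the sample values are hypothetical names chosen for illustration.

```r
# Attach packages at the top of the script with library().
# library() stops with an error if the package is missing, whereas
# require() only returns FALSE with a warning, so failures show up late.
library(stats)

# Hypothetical helper: summarises a numeric vector of scores.
summarise_scores <- function(scores) {
  summary_values <- c(
    median = median(scores),     # found via the attached stats package
    spread = stats::mad(scores)  # explicit namespace, no ambiguity
  )
  return(summary_values)
}

# Infrequent call: use tools:: directly instead of attaching `tools`.
script_ext <- tools::file_ext("analysis.R")

score_summary <- summarise_scores(c(2, 4, 4, 5, 9))
```

Calling `tools::file_ext()` with its namespace keeps the search path short and makes the origin of the function obvious to a reader.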
Vectorization
- Rule: If you write a `for` loop over data, there is probably a vectorized alternative. Look for it first.
- `apply` family: `sapply`, `vapply`, `lapply`, `mapply` for list/matrix operations. Prefer `vapply` over `sapply` when the return type must be guaranteed.
- `purrr`: `map()`, `map_dbl()`, `map_dfr()` for type-stable functional programming.
- `data.table`: Use `[i, j, by]` syntax for grouped operations on large datasets.
- `Vectorize()`: Wrap scalar functions for vector input when needed.
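
The following base-R sketch contrasts a loop with its vectorized equivalent, then shows `vapply()` and `Vectorize()`. The data and the names `temps_c`, `samples`, and `clamp_scalar` are hypothetical, made up for this example.

```r
# Hypothetical data: temperatures in Celsius.
temps_c <- c(-5, 0, 12.5, 30)

# Loop version: works, but is verbose and slow on large inputs.
temps_f_loop <- numeric(length(temps_c))
for (i in seq_along(temps_c)) {
  temps_f_loop[i] <- temps_c[i] * 9 / 5 + 32
}

# Vectorized version: arithmetic operators already work element-wise.
temps_f <- temps_c * 9 / 5 + 32

# vapply() on a list: FUN.VALUE declares the expected return type, so a
# result of the wrong type or length raises an error.
samples <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)
sample_means <- vapply(samples, mean, FUN.VALUE = numeric(1))

# A scalar-only function: `if` does not accept a vector condition.
clamp_scalar <- function(x, lower, upper) {
  if (x < lower) {
    return(lower)
  }
  if (x > upper) {
    return(upper)
  }
  return(x)
}

# Vectorize() wraps it so that `x` may be a vector.
clamp <- Vectorize(clamp_scalar, vectorize.args = "x")
clamped <- clamp(c(-3, 0.5, 7), lower = 0, upper = 1)
```

Note that `Vectorize()` is a convenience wrapper around `mapply()`, so it tidies the call site without making the computation itself faster. For this particular case, `pmin(pmax(x, lower), upper)` would be the truly vectorized form.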
Functional Programming
- Pure functions: No side effects. Same input → same output. Explicit `return()`.
- No global assignment: Never use `<<-` in functions. Return values instead.
- Closures: Use factory functions for parameterized behavior.
- `purrr::safely()`: Wrap functions that may fail to capture errors without stopping.
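
A short sketch of these ideas in base R follows. `make_power`, `add_total`, and `capture_errors` are hypothetical names. `capture_errors` is a base-R stand-in that mimics the `result`/`error` list returned by `purrr::safely()`; it is not purrr's implementation.

```r
# Factory function: returns a closure that remembers `exponent`.
make_power <- function(exponent) {
  # Evaluate the argument now, so the closure captures its current value.
  force(exponent)
  power <- function(x) {
    return(x^exponent)
  }
  return(power)
}

square <- make_power(2)
cube <- make_power(3)

# Pure function: the result depends only on the arguments, and nothing
# outside the function is modified (no `<<-`).
add_total <- function(counts, increment) {
  return(counts + increment)
}

# Hypothetical stand-in for the purrr::safely() pattern: the wrapped
# function returns list(result, error) instead of stopping on failure.
capture_errors <- function(f) {
  wrapped <- function(...) {
    outcome <- tryCatch(
      list(result = f(...), error = NULL),
      error = function(e) list(result = NULL, error = e)
    )
    return(outcome)
  }
  return(wrapped)
}

safe_log <- capture_errors(log)
ok <- safe_log(100, base = 10)
failed <- safe_log("not a number")
```

With this shape, a batch job can run every input, then inspect the `error` element of each outcome afterwards rather than aborting on the first failure.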
Code Style (tidyverse Guide)
- `snake_case` for all names. `PascalCase` only for R6 classes.
- `<-` for assignment. Reserve `=` for function arguments only.
- Spacing: Space after commas, around `<-`, after `#`. No space before `(` in function calls.
- Braces: `{` on same line, `}` on own line. Always use braces, even for one-liners.
- Line length: Max 80 characters.
- Tools: Enforce with `styler::style_file()` and `lintr::lint()`.
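
To make the conventions concrete, here is a small function written to follow them. The function name `scale_to_unit` and the file path in the comments are hypothetical. The `styler` and `lintr` calls are left commented out because they assume those packages are installed.

```r
# snake_case names, `<-` for assignment, `=` only for arguments.
scale_to_unit <- function(values, na_rm = TRUE) {
  value_range <- range(values, na.rm = na_rm)
  span <- value_range[2] - value_range[1]

  # Braces even for a one-line body: `{` ends the line, `}` has its own.
  if (span == 0) {
    return(rep(0, length(values)))
  }

  return((values - value_range[1]) / span)
}

scaled <- scale_to_unit(c(10, 15, 20))

# To enforce the style on a script (path is hypothetical):
# styler::style_file("analysis.R")
# lintr::lint("analysis.R")
```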
Memory Management
- Remove large objects: `rm(large_df); gc()` after processing.
- Chunked reading: `data.table::fread()` with `select` to read only the columns you need. Use `nrows` and `skip` to read rows in chunks.
- `.RData` avoidance: Never save/load `.RData`. Use explicit `saveRDS()`/`readRDS()`.
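
A minimal base-R sketch of the save, remove, and restore cycle is shown below. `large_df` is a hypothetical stand-in for a real dataset, and a temporary file is used so the example is self-contained. The `fread()` call is shown only as a comment, with a hypothetical file name, since it assumes `data.table` is installed.

```r
# Hypothetical large object standing in for a real dataset.
large_df <- data.frame(id = seq_len(1e5), value = sqrt(seq_len(1e5)))

# Keep only the small result that later steps need.
value_total <- sum(large_df$value)

# Persist one object explicitly with saveRDS().
rds_path <- tempfile(fileext = ".rds")
saveRDS(large_df, rds_path)

# Drop the large object, then let the garbage collector reclaim memory.
rm(large_df)
invisible(gc())

# readRDS() returns the object, so the caller chooses its name.
# load() on an .RData file restores objects under their saved names and
# can silently overwrite variables that already exist.
restored_df <- readRDS(rds_path)

# Reading only selected columns from a CSV (file name is hypothetical):
# data.table::fread("measurements.csv", select = c("id", "value"))
```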