good practice

Apply best practices when coding!

  • Write readable code that is easy to maintain (see below).
  • Use version control with a single source of truth (SSOT). Do not keep duplicate versions of the same code in both .qmd and .R / .py scripts.
  • For reports / presentations, use Quarto instead of separate files for scripts, Jupyter notebooks, images, code outputs and Word documents.

good code

  • clear
  • simple (one job per function)
  • documented
  • performant
  • bug-free
  • well tested

The following points are written for R, but apply (in spirit) to Python and other languages as well.
See also the RStudio best practice cheatsheet.

organisation / workflow

  • Use RStudio projects (File -> New project). They set the working directory and manage settings & open scripts.
  • Never use setwd(): others don’t have that path and neither do you, after rearranging folders.
  • Use relative path names, e.g. read.table("datafolder/file.txt") instead of "C:/Users/berry/Desktop/Project/datafolder/file.txt".
  • Put source("functions.R") in your main (quarto) script (see project), or write your own package.
  • Use short informative script names without spaces.
  • Reference the source / authors of copied code (including chatbot model info).
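
A minimal sketch of the relative-path advice above, assuming the working directory is the project folder (as set by an RStudio project). The folder and file names are taken from the bullet; the object names and header = TRUE are assumptions for illustration.

```r
# Relative path, resolved against the project folder (the working directory).
# file.path() joins the parts with "/" on every operating system.
data_file <- file.path("datafolder", "file.txt")

# Read the file only if it exists; otherwise report where R looked.
if (file.exists(data_file)) {
  measurements <- read.table(data_file, header = TRUE)  # header = TRUE is an assumption
} else {
  message("File not found: ", normalizePath(data_file, mustWork = FALSE))
}
```

Because the path contains no user name or drive letter, the same script should run unchanged on another computer, as long as the project folder is copied as a whole.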

code format

  • Follow a style guide consistently (example).
  • Choose short but descriptive object names. df, data, X are not descriptive!
  • Use expressive verbs for function names. Functions do something.
  • Functions should call each other, instead of being one big multi-purpose monster.
  • Use RStudio script sections # 1 clean data ---- for an outline (CTRL+SHIFT+O).
  • Use line breaks (with indentation) to avoid horizontal scrolling (margin settings).
  • In qmd text sections, use line breaks for version control.
  • In qmd documents, use short code chunk names (labels) with no spaces.
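
The formatting points above could look like the following sketch. All function and object names are hypothetical and only illustrate the naming advice: verbs for functions, descriptive nouns for objects, and small functions that call each other.

```r
# 1 functions ----

# Convert temperatures from Fahrenheit to Celsius (one job only).
convert_to_celsius <- function(temp_fahrenheit) {
  (temp_fahrenheit - 32) * 5 / 9
}

# Summarise temperatures in Celsius. Calls convert_to_celsius()
# instead of repeating the conversion formula.
summarise_temperature <- function(temp_fahrenheit) {
  temp_celsius <- convert_to_celsius(temp_fahrenheit)
  c(mean = mean(temp_celsius), max = max(temp_celsius))
}

# 2 analysis ----

station_temp_f <- c(32, 50, 68)
temp_summary <- summarise_temperature(station_temp_f)
temp_summary  # mean 10, max 20
```

The lines ending in ---- show up in the RStudio outline, so a long script can be navigated by section.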

code quality

  • Vectorize code whenever possible.
  • If not, use lapply/sapply instead of for loops (lesson 4.3 and 8.3).
  • DRY: don’t repeat yourself.
  • Write defensive code that checks inputs (lesson 8.1).
  • Use arrays for all-numeric data (lesson 4.4).
  • Do not load more than two packages with library(); instead call functions as pack::fun.
  • Install packages conditionally.
  • Do not create more objects than needed; clean up with rm().
  • Make sure your code runs in a clean session:
    • CTRL+SHIFT+F10 to restart R with a clean workspace (Rdata settings)
    • source() the entire script with CTRL+SHIFT+S.
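
Several of the points above can be combined in one short sketch. The function name scale_to_unit() and the rainfall values are made up for illustration, and the base package stats only stands in for whatever package a script really needs.

```r
# Defensive code: check the input before computing anything.
scale_to_unit <- function(values) {
  if (!is.numeric(values)) stop("values must be numeric, not ", class(values)[1])
  if (length(values) < 2) stop("values must have at least 2 elements")
  value_range <- range(values, na.rm = TRUE)
  if (value_range[1] == value_range[2]) stop("values must not all be equal")
  # Vectorized: the whole vector is processed at once, no loop needed.
  (values - value_range[1]) / (value_range[2] - value_range[1])
}

scale_to_unit(c(2, 4, 6))  # 0.0 0.5 1.0

# sapply instead of a for loop: one sum per list element.
station_rain <- list(north = c(0, 3, 12), south = c(5, 1), east = c(8, 8, 2, 0))
rain_total <- sapply(station_rain, sum)  # north 15, south 6, east 18

# Install a package only if it is missing, then call it as pack::fun.
if (!requireNamespace("stats", quietly = TRUE)) install.packages("stats")
rain_median <- stats::median(rain_total)  # 15

# Clean up objects that are no longer needed.
rm(station_rain)
```

Calling scale_to_unit("a") stops with an informative message instead of failing somewhere deeper in the computation, which is the purpose of the input checks.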

To practice writing good R code, improve the examples in elegant code.