project
Start a research project under version control
- install R, RStudio and Git
- create a well-named GitHub repository (https://github.com/new), initialize it with a README
- on the repository page: Code -> copy URL (SSH)
- RStudio -> File -> New Project -> Version Control -> Git: paste URL, set subdirectory, create project
- RStudio -> File -> New File -> R Script / Quarto Document
- follow good practices
- work, then commit changes and push to GitHub
Organize a research project
In the Git folder, you could have something like:
project/
├── raw_data_large/
├── reduce_data_size.R
├── raw_data_small/
│ ├── file1.csv
│ └── file2.csv
├── process_data.R
├── data_full.csv
├── functions.R
├── test_functions.R
└── main_file.qmd
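The layout above could be scaffolded from R. This is only a minimal sketch: `scaffold_project()` is a hypothetical helper (not part of any package), the temporary directory is just for illustration, and `data_full.csv` is left out because it is produced later by `process_data.R`. The `.gitignore` entry assumes that `raw_data_large/` should stay out of the repository.

```r
# Hypothetical helper: create the example layout under `root`.
scaffold_project <- function(root) {
  dirs  <- c("raw_data_large", "raw_data_small")
  files <- c("reduce_data_size.R", "process_data.R", "functions.R",
             "test_functions.R", "main_file.qmd")

  for (d in dirs) {
    dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  }
  for (f in files) {
    path <- file.path(root, f)
    if (!file.exists(path)) file.create(path)  # never overwrite existing work
  }

  # Assumption: the large raw data is only kept locally, so ignore it in Git.
  ignore_path <- file.path(root, ".gitignore")
  entry <- "raw_data_large/"
  existing <- if (file.exists(ignore_path)) {
    readLines(ignore_path, warn = FALSE)
  } else {
    character(0)
  }
  if (!entry %in% existing) writeLines(c(existing, entry), ignore_path)

  invisible(root)
}

# Try it in a throwaway location first.
root <- file.path(tempdir(), "project")
scaffold_project(root)
list.files(root, all.files = TRUE, no.. = TRUE)
```

Calling the helper a second time changes nothing, so it should be safe to run inside a project that is already partly set up.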
- raw_data_large/ (only locally, i.e. listed in .gitignore)
- reduce_data_size.R: read big files, select interesting bits, store in raw_data_small/ with write.csv. If you have (many) text entries with commas but no tabs, use write.table(..., sep="\t", row.names=FALSE, fileEncoding="UTF-8") instead.
- process_data.R with
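# assumed: the CSV files are the ones in raw_data_small/
csvfiles <- list.files("raw_data_small", pattern = "\\.csv$", full.names = TRUE)
data_csv <- lapply(csvfiles, read.csv)
data_full <- Reduce(merge, data_csv)
write.csv(data_full, "data_full.csv")
- functions.R with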
helper <- function(x) x
analyze <- function(df) sapply(df, helper)
# data_full must be loaded first, e.g. data_full <- read.csv("data_full.csv")
visualize <- function(column) plot(analyze(data_full)[, column])
- test_functions.R with
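source("functions.R")
# input, example, example_df and expected are placeholders for your own test data
stopifnot(helper(input) == expected)
checkmate::assert_number(helper(example))
testthat::expect_equal(analyze(example_df), expected)
res <- analyze(example_df)
# all.equal() also works when res has more than one element
if (!isTRUE(all.equal(res, expected))) stop("analyze(example_df) should be ", expected, ", not ", res)
- main_file.qmd with code chunks for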