Appendix C — Templates

Copy-paste templates for common project files.

C.1 .gitignore

.gitignore

# Generated outputs (reproducible from code)
outs/

# R artifacts
.Rhistory
.RData
.Rproj.user/
renv/library/
renv/staging/
renv/local/
*_cache/

# Python artifacts
__pycache__/
*.py[cod]
*.egg-info/
.eggs/
*.egg
.venv/
venv/

# Quarto rendering
*_files/
.quarto/

# OS files
.DS_Store
Thumbs.db

# IDE settings
.vscode/
.positron/
*.Rproj

# Secrets
.env
*.pem
credentials.json

C.2 environment.yml (Conda)

environment.yml

name: project-name
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pandas
  - numpy
  - matplotlib
  - seaborn
  - ipykernel
  - pip
  - pip:
    - session-info

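Generate the file with conda env export --from-history > environment.yml. The result lists only the packages you installed explicitly, which keeps it portable across platforms. Packages installed through pip are not recorded by --from-history, so add the pip: section by hand.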

C.3 README.md

README.md

# Project Name

Brief description of what this project does.

## Setup

### Prerequisites
- [Positron](https://positron.posit.co/) (recommended) or VS Code
- [rig](https://github.com/r-lib/rig) for R installation
- [Conda](https://github.com/conda-forge/miniforge) (Miniforge recommended)
- [Git](https://git-scm.com/)

### Installation

1. Clone the repository:
   ```bash
   git clone https://github.com/username/project-name.git
   cd project-name
   ```

2. Create the conda environment (Python):
   ```bash
   conda env create -f environment.yml
   conda activate project-name
   ```

3. Install R packages (if using R):
   ```r
   renv::restore()
   ```

4. Configure Positron interpreters via the Command Palette (Cmd+Shift+P / Ctrl+Shift+P):
   - R: “R: Select Interpreter” → select R 4.x
   - Python: “Python: Select Interpreter” → select project-name conda env

## Project Structure

project-name/
├── data/            # External inputs (read-only)
├── scripts/         # Quarto analysis scripts (.qmd)
│   └── exploratory/ # One-off analyses
├── outs/            # All script-generated outputs
├── R/               # Shared R helper functions
├── python/          # Shared Python helper functions
├── environment.yml  # Python dependencies
└── renv.lock        # R dependencies

## Data

Describe data sources:
- data/file.csv: Description, source, date obtained

## Authors

Your Name (@github-username)

## License

[Choose appropriate license]


## .claude/CLAUDE.md

```{.markdown filename=".claude/CLAUDE.md"}
# Project: Project Name

## Overview
Brief description of the project and its goals.

## Environment

### Python
```bash
conda activate project-name
```

### R

Uses renv. Activates automatically via .Rprofile.

## Key Files and Directories

- data/: External input data (read-only; scripts never write here)
- scripts/: Quarto analysis scripts (.qmd)
- scripts/exploratory/: One-off analyses
- outs/: All script-generated outputs
- R/: Shared R helper functions
- python/: Shared Python helper functions

## Workflows

### Running the full analysis

quarto render scripts/

### Rendering a single script

# R scripts
quarto render scripts/01_analysis.qmd

# Python scripts (the conda env must be active; adjust the conda.sh path to your install, e.g. ~/miniforge3 for Miniforge)
source ~/miniconda3/etc/profile.d/conda.sh && conda activate project-name && quarto render scripts/02_plots.qmd

## Data

### Input data

- data/counts.csv: Gene expression counts, 20000 genes × 12 samples
- data/metadata.csv: Sample metadata

## Conventions

- Scripts declare a status: lifecycle field in their YAML header (development → finalized → deprecated)
- Every script writes BUILD_INFO.txt to its output folder
- Scripts communicate through files in outs/, not shared memory
- Cross-language data uses Parquet format
- R code follows the tidyverse style guide
- Commit after completing each logical unit of work

## Project Document Registry

[Planning documents go here as the project grows]
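
One way to check the BUILD_INFO.txt convention listed above is a small audit script. The sketch below is not part of the CLAUDE.md template. The function name find_missing_build_info is hypothetical, and the code assumes each script writes to its own subfolder of outs/, as the QMD templates in this appendix do.

```python
from pathlib import Path


def find_missing_build_info(outs_dir: Path) -> list[str]:
    """Return names of output folders under outs_dir that lack BUILD_INFO.txt.

    Hypothetical helper: assumes one subfolder per script inside outs/.
    """
    if not outs_dir.is_dir():
        # No outs/ directory yet, so there is nothing to audit.
        return []
    return sorted(
        folder.name
        for folder in outs_dir.iterdir()
        if folder.is_dir() and not (folder / "BUILD_INFO.txt").is_file()
    )


if __name__ == "__main__":
    # Assumes the script is run from the project root.
    for name in find_missing_build_info(Path("outs")):
        print(f"outs/{name}: BUILD_INFO.txt missing")
```

Run from the project root, this prints one line per output folder that has no BUILD_INFO.txt and prints nothing when every folder complies.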


## R QMD Template

```{.markdown filename="scripts/01_analysis.qmd"}
---
title: "Analysis Title"
subtitle: "Brief description"
author: "Your Name"
date: today
status: development
format:
  html:
    toc: true
    toc-depth: 2
    number-sections: true
    code-overflow: wrap
    code-fold: false
    code-tools: true
    highlight-style: github
    theme: cosmo
    fontsize: 1rem
    linestretch: 1.5
    embed-resources: true
execute:
  echo: true
  message: false
  warning: false
  cache: false
---

## Introduction

Brief description of the analysis.

## Setup

```{r}
#| label: setup
#| include: false

# ---- Libraries ----
suppressPackageStartupMessages({
  library(tidyverse)
  library(here)
})

# ---- Paths ----
dir_data <- here::here("data")
dir_out <- here::here("outs", "01_analysis")
dir.create(dir_out, recursive = TRUE, showWarnings = FALSE)

# ---- Options ----
options(stringsAsFactors = FALSE)
set.seed(42)

# ---- Provenance ----
git_hash <- system("git rev-parse --short HEAD", intern = TRUE)
cat("Rendered from commit:", git_hash, "\n")
```

## Inputs

```{r}
#| label: inputs

# --- Inputs (external data) ---
data <- read_csv(file.path(dir_data, "data.csv"))
cat("Loaded", nrow(data), "rows\n")
```

## Analysis

Your analysis code here.

## Results

Summary of results.

## Build Info

```{r}
#| label: build-info

writeLines(
  c(
    paste("script:", "01_analysis.qmd"),
    paste("commit:", git_hash),
    paste("date:", format(Sys.time(), "%Y-%m-%d %H:%M:%S"))
  ),
  file.path(dir_out, "BUILD_INFO.txt")
)

sessionInfo()
```


## Python QMD Template

```{.markdown filename="scripts/01_analysis.qmd"}
---
title: "Analysis Title"
subtitle: "Brief description"
author: "Your Name"
date: today
status: development
jupyter: python3
format:
  html:
    toc: true
    toc-depth: 2
    number-sections: true
    code-overflow: wrap
    code-fold: false
    code-tools: true
    highlight-style: github
    theme: cosmo
    fontsize: 1rem
    linestretch: 1.5
    embed-resources: true
execute:
  echo: true
  warning: false
  cache: false
---

## Introduction

Brief description of the analysis.

## Setup

```{python}
#| label: setup

import subprocess
import sys
import random
from pathlib import Path
from datetime import datetime

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

PROJECT_ROOT = Path(subprocess.check_output(
    ["git", "rev-parse", "--show-toplevel"]
).decode().strip())
sys.path.insert(0, str(PROJECT_ROOT / "python"))

# ---- Options ----
random.seed(42)
np.random.seed(42)
pd.set_option("display.max_columns", None)
sns.set_theme(style="whitegrid")

# ---- Paths ----
out_dir = PROJECT_ROOT / "outs/01_analysis"
out_dir.mkdir(parents=True, exist_ok=True)

# ---- Provenance ----
git_hash = subprocess.check_output(
    ["git", "rev-parse", "--short", "HEAD"]
).decode().strip()
print(f"Rendered from commit: {git_hash}")
```

## Inputs

```{python}
#| label: inputs

# --- Inputs (external data) ---
data = pd.read_csv(PROJECT_ROOT / "data/data.csv")
print(f"Loaded {len(data)} rows")
```

## Analysis

Your analysis code here.

## Results

Summary of results.

## Build Info

```{python}
#| label: build-info

(out_dir / "BUILD_INFO.txt").write_text(
    f"script: 01_analysis.qmd\n"
    f"commit: {git_hash}\n"
    f"date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n"
)

import session_info
session_info.show()
```
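
If several Python scripts repeat the build-info chunk above, the file-writing part could be moved into python/helpers.py. The following is a minimal sketch under that assumption. The name write_build_info is hypothetical, and the three fields simply mirror those written by the chunk above.

```python
from datetime import datetime
from pathlib import Path


def write_build_info(out_dir: Path, script: str, commit: str) -> Path:
    """Write BUILD_INFO.txt into out_dir and return the path of the file.

    Hypothetical helper: records the script name, the git commit, and the
    render time, one "key: value" pair per line.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "BUILD_INFO.txt"
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    target.write_text(
        f"script: {script}\n"
        f"commit: {commit}\n"
        f"date: {timestamp}\n"
    )
    return target
```

Inside a script, the call would then be write_build_info(out_dir, "01_analysis.qmd", git_hash), using the out_dir and git_hash variables defined in the setup chunk.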


## Python Script Template (.py)

For standalone utilities, CLI tools, and library code (not for data analysis — use QMD for that):

```{.python filename="python/helpers.py"}
"""
Shared helper functions for this project.
"""

from pathlib import Path
import pandas as pd


def load_data(filename: str) -> pd.DataFrame:
    """Load a data file from the project data/ directory."""
    project_root = Path(__file__).parent.parent
    return pd.read_csv(project_root / "data" / filename)
```

C.21 R Script Template (.R)

For shared helper functions (not for data analysis — use QMD for that):

R/helpers.R

# Shared helper functions for this project.

#' Load and validate a data file
#'
#' @param filename Name of file in data/ directory
#' @return A tibble
load_data <- function(filename) {
  path <- here::here("data", filename)
  stopifnot(file.exists(path))
  readr::read_csv(path, show_col_types = FALSE)
}
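
The R helper above stops early when the file is missing, whereas the Python template in python/helpers.py does not check. A minimal sketch of an equivalent check on the Python side follows. The function name resolve_data_path and its project_root parameter are hypothetical and not part of the templates.

```python
from pathlib import Path


def resolve_data_path(filename: str, project_root: Path) -> Path:
    """Return the path to data/<filename>, raising if the file is absent.

    Hypothetical helper: mirrors the stopifnot(file.exists(path)) check
    in R/helpers.R.
    """
    path = project_root / "data" / filename
    if not path.is_file():
        raise FileNotFoundError(f"Data file not found: {path}")
    return path
```

The returned path can be passed straight to pd.read_csv, for example pd.read_csv(resolve_data_path("counts.csv", PROJECT_ROOT)), so a missing input fails with a clear message before any parsing starts.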