17  Quick Reference: Project Setup

This page is a concise reference for setting up a new analysis project. For a full step-by-step walkthrough with explanations, see Setting Up a Lab Project. For the rationale behind each convention, see Project Organization.

17.1 Standard Structure

17.1.1 Flat layout (single topic, < 10 scripts)

project-name/
├── .claude/
│   └── CLAUDE.md         # Claude Code instructions
├── .git/
├── .gitignore
├── R/                    # Shared R helper functions
├── python/               # Shared Python helper functions
├── scripts/
│   └── exploratory/      # One-off analyses (no numbering)
├── data/                 # External inputs only — scripts never write here
├── outs/                 # All script-generated outputs
├── environment.yml       # Conda environment (Python)
├── renv.lock             # R package versions
└── README.md

17.1.2 Sectioned layout (multiple analysis threads)

For projects with multiple analysis sections (e.g., phosphoproteomics + transcriptomics), mirror the sections across scripts/, data/, and outs/:

scripts/
├── phosphoproteomics/
│   ├── 01_qc.qmd
│   └── 02_analysis.qmd
├── transcriptomics/
│   └── 01_heatmaps.qmd
└── exploratory/

data/
├── phosphoproteomics/
└── transcriptomics/

outs/
├── phosphoproteomics/
├── transcriptomics/
└── exploratory/

Each section has its own numbering sequence.
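The mirrored directories above can be created in one pass. This is a minimal sketch, not part of the lab's tooling: the function name make_sections is hypothetical, and the section names passed to it are whatever your project uses.

```shell
# Hypothetical helper: create one subdirectory per analysis section under
# scripts/, data/, and outs/ so the three trees stay mirrored.
make_sections() {
  root=$1
  shift
  for section in "$@"; do
    mkdir -p "$root/scripts/$section" "$root/data/$section" "$root/outs/$section"
  done
  # exploratory/ holds one-off analyses; in the layout above it appears
  # under scripts/ and outs/ but not under data/
  mkdir -p "$root/scripts/exploratory" "$root/outs/exploratory"
}
```

Calling make_sections . phosphoproteomics transcriptomics from the project root produces the directory trees shown above (without the .qmd files).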

17.2 Setup Commands

# 1. Create project
mkdir my-analysis && cd my-analysis

# 2. Git
git init
# Create .gitignore (see template below)

# 3. Directory structure
mkdir -p R python scripts/exploratory data outs/exploratory .claude

# 4. Python environment (if using Python)
conda create -n my-analysis python=3.11 pandas numpy matplotlib seaborn ipykernel -y
conda activate my-analysis
pip install session-info
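# Note: --from-history records only the conda packages you explicitly requested.
# pip-installed packages such as session-info are not captured; add them to environment.yml by hand.
conda env export --from-history > environment.yml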

# 5. R packages (if using R)
# In R console: renv::init(); install.packages(c("tidyverse", "here")); renv::snapshot()

# 6. First commit
git add .
git commit -m "Initial project setup"

# 7. Push to GitHub
gh repo create my-analysis --private --source=. --push
# Or for lab: gh repo create MusserLab/my-analysis --private --source=. --push
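
After running the commands above and adding the templates from the following sections, a quick check can confirm that nothing from the flat layout is missing. This is a sketch under assumptions: the function name check_layout is hypothetical, and the list of expected paths is taken from the flat layout in 17.1.1.

```shell
# Hypothetical helper: report which pieces of the standard flat layout are missing.
# environment.yml and renv.lock are not checked because each is only needed
# when the project uses Python or R, respectively.
check_layout() {
  root=${1:-.}
  missing=0
  for dir in R python scripts/exploratory data outs .claude; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing directory: $dir"
      missing=1
    fi
  done
  for file in .gitignore README.md .claude/CLAUDE.md; do
    if [ ! -f "$root/$file" ]; then
      echo "missing file: $file"
      missing=1
    fi
  done
  # Exit status 0 when everything is present, 1 otherwise
  return "$missing"
}
```

Run check_layout from the project root, or pass a path as the first argument to check a project elsewhere.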

17.3 .gitignore Template

# Generated outputs (reproducible from code)
outs/

# R artifacts
.Rhistory
.RData
.Rproj.user/
renv/library/
renv/staging/
renv/local/
*_cache/

# Python artifacts
__pycache__/
*.py[cod]
*.egg-info/
.eggs/
*.egg
.venv/
venv/

# Quarto rendering
*_files/
.quarto/

# OS files
.DS_Store
Thumbs.db

# IDE settings
.vscode/
.positron/
*.Rproj

# Secrets
.env
*.pem
credentials.json

17.4 CLAUDE.md Template

Create .claude/CLAUDE.md:

# Project: My Analysis

## Overview
[Brief description of what this project does]

## Environment
- Python: `conda activate my-analysis`
- R: Uses renv (auto-activates)

## Key Directories
- `data/` — External input data (read-only, scripts never write here)
- `scripts/` — Quarto analysis scripts (.qmd)
- `scripts/exploratory/` — One-off analyses
- `outs/` — All script-generated outputs
- `R/` — Shared R helper functions
- `python/` — Shared Python helper functions

## Workflows
[How to run the analysis]

## Data
[Description of data files and their sources]

## Conventions
- Scripts use `status:` lifecycle field (development → finalized → deprecated)
- Every script writes `BUILD_INFO.txt` to its output folder
- Scripts communicate through files in `outs/`, not shared memory
- Cross-language data uses Parquet format

## Project Document Registry
[Planning documents go here as the project grows]

17.5 README Template

# My Analysis

## Overview
[What this project does]

## Setup

### Python
conda env create -f environment.yml
conda activate my-analysis

### R
# renv will auto-activate; run renv::restore() if needed

## Running the Analysis
[Instructions]

## Data
[Data sources and descriptions]

17.6 Data Rules

  • Put external data files in data/. Never modify these files — treat them as read-only.
  • Scripts never write to data/.
  • If data files are large, don’t commit them to Git. Add them to .gitignore and document where to obtain the data in your README.
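
The rules above can be backed by file permissions and a size check. This is a minimal sketch, not an established lab convention: the function names lock_data and large_data_files are hypothetical, the 50 MB default threshold is an arbitrary choice, and the M size suffix assumes GNU or BSD find.

```shell
# Hypothetical helper: strip write permission from every file under data/
# so that most accidental overwrites fail instead of silently changing an input.
lock_data() {
  find "${1:-data}" -type f -exec chmod a-w {} +
}

# Hypothetical helper: list files under data/ larger than a threshold in MB
# (default 50, an arbitrary choice) as candidates for .gitignore.
large_data_files() {
  find "${1:-data}" -type f -size "+${2:-50}M"
}
```

Run lock_data after copying new inputs into data/, and large_data_files before the first commit to see which files should be ignored and documented in the README instead.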

17.7 Using Claude Code

With your .claude/CLAUDE.md in place, start Claude Code from the project root:

claude
Claude Code

Claude Code can scaffold project configuration files based on your specific setup.

I’m starting a new project analyzing proteomics data from cell cultures. The data is in data/, I’m using R with renv, and I’ll render with Quarto. Can you create a starter .claude/CLAUDE.md for this project?

Claude will create a CLAUDE.md tailored to your project type, data location, and tooling, so you don’t have to start from a blank template.

17.8 Checklist

Before starting analysis: