17 Quick Reference: Project Setup
This page is a concise reference for setting up a new analysis project. For a full step-by-step walkthrough with explanations, see Setting Up a Lab Project. For the rationale behind each convention, see Project Organization.
17.1 Standard Structure
17.1.1 Flat layout (single topic, < 10 scripts)
project-name/
├── .claude/
│ └── CLAUDE.md # Claude Code instructions
├── .git/
├── .gitignore
├── R/ # Shared R helper functions
├── python/ # Shared Python helper functions
├── scripts/
│ └── exploratory/ # One-off analyses (no numbering)
├── data/ # External inputs only — scripts never write here
├── outs/ # All script-generated outputs
├── environment.yml # Conda environment (Python)
├── renv.lock # R package versions
└── README.md
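The flat layout above can be checked mechanically before the first commit. The following is a minimal sketch, not part of the lab's tooling: the function name `missing_entries` and the two lists are hypothetical, and the lists simply restate the tree above. `environment.yml` and `renv.lock` are left out because each applies only to projects using that language.

```python
from pathlib import Path

# Entries copied from the flat layout above (assumed to be the required set).
EXPECTED_DIRS = [".claude", "R", "python", "scripts/exploratory", "data", "outs"]
EXPECTED_FILES = [".claude/CLAUDE.md", ".gitignore", "README.md"]


def missing_entries(project_root):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Run from the project root; prints one line per missing entry.
    for entry in missing_entries("."):
        print(f"missing: {entry}")
```

An empty result means every directory and file in the sketch's lists is present; it says nothing about their contents.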
17.1.2 Sectioned layout (multiple analysis threads)
For projects with multiple analysis sections (e.g., phosphoproteomics + transcriptomics), mirror the sections across scripts/, data/, and outs/:
scripts/
├── phosphoproteomics/
│ ├── 01_qc.qmd
│ └── 02_analysis.qmd
├── transcriptomics/
│ └── 01_heatmaps.qmd
└── exploratory/
data/
├── phosphoproteomics/
└── transcriptomics/
outs/
├── phosphoproteomics/
├── transcriptomics/
└── exploratory/
Each section has its own numbering sequence.
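Mirroring a section by hand is easy to get wrong when a project grows. As a rough sketch, assuming a hypothetical helper named `add_section` (it is not an existing lab utility), the same section folder can be created under all three top-level directories in one call:

```python
from pathlib import Path

# Top-level directories that carry one subfolder per analysis section.
TOP_LEVEL = ("scripts", "data", "outs")


def add_section(project_root, section):
    """Create <section>/ under scripts/, data/, and outs/; return the paths."""
    created = []
    for top in TOP_LEVEL:
        path = Path(project_root) / top / section
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created


if __name__ == "__main__":
    # Section names taken from the example layout above.
    for name in ("phosphoproteomics", "transcriptomics"):
        add_section(".", name)
```

Note that `exploratory/` in the layout above appears only under scripts/ and outs/, so it would still be created separately rather than through this helper.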
17.2 Setup Commands
# 1. Create project
mkdir my-analysis && cd my-analysis
# 2. Git
git init
# Create .gitignore (see template below)
# 3. Directory structure
mkdir -p R python scripts/exploratory data outs/exploratory .claude
# 4. Python environment (if using Python)
conda create -n my-analysis python=3.11 pandas numpy matplotlib seaborn ipykernel -y
conda activate my-analysis
pip install session-info
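conda env export --from-history > environment.yml
# --from-history records only explicitly requested conda packages; add session-info under a pip: entry by hand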
# 5. R packages (if using R)
# In R console: renv::init(); install.packages(c("tidyverse", "here")); renv::snapshot()
# 6. First commit
git add .
git commit -m "Initial project setup"
# 7. Push to GitHub
gh repo create my-analysis --private --source=. --push
# Or for lab: gh repo create MusserLab/my-analysis --private --source=. --push
17.3 .gitignore Template
# Generated outputs (reproducible from code)
outs/
# R artifacts
.Rhistory
.RData
.Rproj.user/
renv/library/
renv/staging/
renv/local/
*_cache/
# Python artifacts
__pycache__/
*.py[cod]
*.egg-info/
.eggs/
*.egg
.venv/
venv/
# Quarto rendering
*_files/
.quarto/
# OS files
.DS_Store
Thumbs.db
# IDE settings
.vscode/
.positron/
*.Rproj
# Secrets
.env
*.pem
credentials.json
17.4 CLAUDE.md Template
Create .claude/CLAUDE.md:
# Project: My Analysis
## Overview
[Brief description of what this project does]
## Environment
- Python: `conda activate my-analysis`
- R: Uses renv (auto-activates)
## Key Directories
- `data/` — External input data (read-only, scripts never write here)
- `scripts/` — Quarto analysis scripts (.qmd)
- `scripts/exploratory/` — One-off analyses
- `outs/` — All script-generated outputs
- `R/` — Shared R helper functions
- `python/` — Shared Python helper functions
## Workflows
[How to run the analysis]
## Data
[Description of data files and their sources]
## Conventions
- Scripts use `status:` lifecycle field (development → finalized → deprecated)
- Every script writes `BUILD_INFO.txt` to its output folder
- Scripts communicate through files in `outs/`, not shared memory
- Cross-language data uses Parquet format
## Project Document Registry
[Planning documents go here as the project grows]
17.5 README Template
# My Analysis
## Overview
[What this project does]
## Setup
### Python
conda env create -f environment.yml
conda activate my-analysis
### R
# renv will auto-activate; run renv::restore() if needed
## Running the Analysis
[Instructions]
## Data
[Data sources and descriptions]
17.6 Data Rules
- Put external data files in data/. Never modify these files; treat them as read-only.
- Scripts never write to data/.
- If data files are large, don’t commit them to Git. Add them to .gitignore and document where to obtain the data in your README.
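The rules above can be backed by small helpers in shared code. The sketch below is illustrative only: `output_path` and `large_files` are hypothetical names, and the 50 MiB default threshold is an assumption, since this page does not define what counts as "large".

```python
from pathlib import Path


def output_path(project_root, relative_path):
    """Resolve a path under outs/, refusing anything that escapes it."""
    outs = (Path(project_root) / "outs").resolve()
    target = (outs / relative_path).resolve()
    if target != outs and outs not in target.parents:
        raise ValueError(f"{target} is outside {outs}; scripts write only to outs/")
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


def large_files(project_root, min_bytes=50 * 1024 * 1024):
    """List files under data/ at or above min_bytes (threshold is an assumption)."""
    data = Path(project_root) / "data"
    return sorted(
        p for p in data.rglob("*") if p.is_file() and p.stat().st_size >= min_bytes
    )
```

A script would call `output_path(".", "qc/summary.csv")` to get a writable location, while a path such as `"../data/raw.csv"` raises `ValueError`. Files returned by `large_files(".")` are candidates for .gitignore and for a download note in the README.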
17.7 Using Claude Code
With your .claude/CLAUDE.md in place:
claude
Claude Code can scaffold project configuration files based on your specific setup. For example:
I’m starting a new project analyzing proteomics data from cell cultures. The data is in data/, I’m using R with renv, and I’ll render with Quarto. Can you create a starter .claude/CLAUDE.md for this project?
Claude will create a CLAUDE.md tailored to your project type, data location, and tooling—saving you from starting with a blank template.
17.8 Checklist
Before starting analysis: