8  Conda

In Your First R Project, renv gave your project its own private library of R packages — isolated from everything else on your computer. If you installed Seurat for one project and ggplot2 for another, neither could interfere with the other. That isolation is what made your analysis reproducible.

Conda does the same thing for Python. It creates environments: self-contained spaces, each with its own Python version and its own set of packages. When you start a new Python project, you create a new conda environment for it, install the packages you need, and work inside that environment. Nothing you do in one environment affects another.

If you’ve never used Python before, that’s fine. This chapter assumes no prior Python experience — just the concepts you picked up from working with renv in Chapter 3. The commands are different, but the idea is the same: each project gets its own tools, and you record what you installed so others can reproduce your setup.

8.1 Why Environments Matter

Imagine you’re six months into a project analyzing microscopy images. You’ve been using a Python package called scikit-image version 0.21, and everything works. Then a labmate asks for help with their project, which needs scikit-image version 0.23 — a newer version with a different API. If Python packages all live in one place (the way they do with a bare pip install), updating for their project would break yours.

This is exactly the problem renv solved for R, and it’s the problem conda solves for Python. Each project gets its own environment with its own packages, so updating one project can’t break another. You can even use different Python versions — one project on Python 3.11, another on 3.12 — without conflict.

In the lab, every project gets its own conda environment. This isn’t optional — it’s how we keep analyses reproducible and avoid the kind of subtle, hard-to-diagnose bugs that come from shared package installations.

8.2 Installing Miniforge

Conda itself is just a program — you need to install it before you can use it. We use Miniforge, a lightweight distribution that comes pre-configured with the conda-forge package channel (the community-maintained repository where most scientific Python packages live). Miniforge gives you conda without the bloat of the full Anaconda distribution.

The easiest way to install Miniforge on macOS is with Homebrew, the macOS package manager. If you already have Homebrew installed (you can check by running brew --version in Terminal), run:

brew install miniforge

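If you don’t have Homebrew, you can download the Miniforge installer directly from the Miniforge GitHub releases page. Download the macOS installer for your chip (Apple Silicon or Intel), then run it. The command below is for Apple Silicon; on an Intel Mac the file is named Miniforge3-MacOSX-x86_64.sh instead: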

bash Miniforge3-MacOSX-arm64.sh

Follow the prompts, accepting the defaults. When the installer asks whether to initialize conda, say yes.

On Windows, download the Miniforge installer from the Miniforge GitHub releases page. Choose the Windows installer (.exe file), run it, and follow the prompts. Accept the default installation location (C:\Users\YourName\miniforge3).

After installation, close and reopen your terminal (Terminal on macOS, PowerShell on Windows) so the new conda command becomes available. You can verify the installation worked by running:

conda --version

If you see a version number (like conda 24.7.1), you’re set. If you get “command not found,” see the Troubleshooting section at the end of this chapter.

8.3 One-Time Setup

Before creating your first environment, there are a few configuration steps that make conda faster and more reliable. You only need to do these once — they apply globally to all environments you create.

Note: Windows Users: Initialize Your Shell

On Windows, conda needs to hook into PowerShell before conda activate will work. Open PowerShell and run:

conda init powershell

Then close and reopen PowerShell. You only need to do this once. (If you use Command Prompt instead, run conda init cmd.exe.)

8.3.1 Set conda-forge as the Default Channel

Conda downloads packages from channels — think of them as package repositories, like CRAN for R. The most important channel is conda-forge, a community-maintained collection of well-tested scientific packages. We want conda to always look here first:

conda config --set channel_priority strict
conda config --add channels conda-forge

The first command tells conda to respect channel priority strictly (no mixing versions from different sources). The second makes conda-forge the top-priority channel. This avoids subtle version mismatches that can cause hard-to-debug problems.

8.3.2 Speed Up the Solver

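Conda figures out which package versions are compatible before installing anything; this is called “solving the environment.” Older conda releases used a slower “classic” solver. The much faster libmamba solver is built into modern conda and has been the default since conda 23.10, so on a fresh Miniforge install this command simply makes the setting explicit: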

conda config --set solver libmamba

With these three commands done, conda is configured for reliable, fast use. You won’t need to think about this again.

8.4 Creating Your First Environment

Let’s create an environment for a hypothetical data analysis project. The command looks like this:

conda create -n my-project python=3.11 numpy pandas matplotlib ipykernel

Here’s what each piece means:

  • conda create — make a new environment
  • -n my-project — name it “my-project” (you’ll use this name every time you activate it)
  • python=3.11 — install Python 3.11 (pinning the version so it doesn’t change unexpectedly)
  • numpy pandas matplotlib ipykernel — install these packages into the environment from the start

Conda will show you a list of packages it plans to install (including dependencies) and ask you to confirm. Type y and hit Enter.

The packages in this example are a typical starting point for data science work. numpy handles numerical computing, pandas gives you dataframes (similar to R’s dataframes or tibbles), matplotlib is the core plotting library, and ipykernel is required for Quarto to execute Python .qmd files. Without ipykernel, you’ll get an error when trying to render — so always include it.

You might also want seaborn (statistical visualization built on matplotlib) or scikit-learn (machine learning). You can add these during creation or install them later — there’s no penalty for adding packages after the environment exists.
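
Once the environment exists, a quick way to confirm that the packages really are importable is to ask Python itself. The sketch below is a minimal check using only the standard library; the function name missing_modules is made up for this example, and the list of names simply mirrors the example environment above. Note that a few packages import under a different name than they install under (scikit-learn imports as sklearn, scikit-image as skimage).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    missing = []
    for name in module_names:
        # find_spec locates a top-level module without running its code
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Import names for the packages in the example environment (an assumption:
# adjust this list to whatever your own project needs).
required = ["numpy", "pandas", "matplotlib", "ipykernel"]
print("Missing:", missing_modules(required) or "nothing")
```

If anything is listed as missing, the usual cause is that the code is running in a different environment from the one you installed into.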

Note: What About Jupyter?

You don’t need jupyter or jupyterlab for our workflow — Quarto handles document rendering directly through ipykernel. Only install Jupyter if you receive .ipynb notebook files from collaborators and want to open them in the traditional notebook interface.

Warning: Claude Code

When you’re starting a new type of analysis and aren’t sure which Python packages you need, Claude Code can recommend packages and generate the environment creation command.

I’m starting a Python project to analyze fluorescence microscopy images. I’ll need to load TIFF stacks, segment cells, and measure intensities. What packages should I include in my conda environment?

Claude will suggest the right packages (scikit-image for segmentation, tifffile for TIFF loading, etc.), generate the full conda create command, and flag any packages that need to be installed via pip instead of conda.

Tip: Claude Code Does This Automatically

In practice, you’ll rarely build environments from scratch. The Musser Lab’s /new-project skill in Claude Code (covered in Part 3) creates a conda environment for your project automatically, installs the packages you need, and configures Positron to use it. But understanding these commands is important for troubleshooting and for working outside the lab’s automated workflow.

8.5 Using Your Environment

8.5.1 Activating and Deactivating

An environment doesn’t do anything until you activate it. Activation tells your terminal to use that environment’s Python and packages instead of the system defaults:

conda activate my-project

When an environment is active, your terminal prompt changes to show the environment name in parentheses:

(my-project) $
(my-project) PS C:\Users\yourname>

This is your visual confirmation that the right environment is active. If you don’t see the environment name in your prompt, your code will use the wrong Python — a common source of “package not found” errors.
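
The prompt is one check; Python can also report which interpreter is actually running your code. The following is a small sketch, not an official diagnostic: the function name describe_interpreter is hypothetical. It relies on the CONDA_DEFAULT_ENV variable that conda activate sets in the terminal, which may be absent when code is launched some other way.

```python
import os
import sys


def describe_interpreter():
    """Collect a few facts about the Python that is running this code."""
    return {
        # Full path to the python program; inside a conda environment this
        # normally sits under .../envs/<environment-name>/
        "executable": sys.executable,
        # Root folder of the environment this Python belongs to
        "prefix": sys.prefix,
        # Name of the activated conda environment, or None if not set
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

If the executable path does not contain your environment’s name, the code is running with a different Python than you intended.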

[TODO: screenshot of terminal showing (my-project) in the prompt after activation]

To leave an environment and return to the default state, run:

conda deactivate

8.5.2 Installing More Packages

Once an environment is active, you can install additional packages into it:

conda install seaborn scikit-learn

Conda installs packages into whichever environment is currently active, so always check your prompt before installing. If you accidentally install into the wrong environment, it’s not the end of the world — just activate the right one and install again there.

To see what’s installed in your current environment:

conda list

And to see all the environments you’ve created:

conda env list
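
The same list is available in machine-readable form through conda env list --json, which can be handy inside a script. The sketch below assumes conda is on your PATH and returns an empty list otherwise; the function name conda_env_paths is made up for illustration.

```python
import json
import shutil
import subprocess


def conda_env_paths():
    """Return the folders of all conda environments, or [] if conda is unavailable."""
    conda = shutil.which("conda")
    if conda is None:
        return []
    try:
        result = subprocess.run(
            [conda, "env", "list", "--json"],
            capture_output=True,
            text=True,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    try:
        # The JSON output holds the environment folders under the "envs" key
        return json.loads(result.stdout).get("envs", [])
    except json.JSONDecodeError:
        return []


for path in conda_env_paths():
    print(path)
```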

8.5.3 Connecting to Positron

Creating and activating an environment in the terminal is only half the story. You also need to tell Positron which environment to use, so that when you run Python code from a .qmd file, it uses the right Python with the right packages.

  1. Open Positron and navigate to your project folder
  2. Open the Command Palette (Cmd+Shift+P on macOS, Ctrl+Shift+P on Windows)
  3. Type “Python: Select Interpreter”
  4. Choose your environment from the list (it will show something like Python 3.11 (my-project))

[TODO: screenshot of Positron Command Palette showing “Python: Select Interpreter” with conda environments listed]

Once selected, Positron remembers this choice for the project — you won’t need to do it again unless you create a new environment. The status bar at the bottom of Positron shows which Python interpreter is active, so you can always verify at a glance.

[TODO: screenshot of Positron status bar showing the active Python environment]

Note: Environment Not Showing Up?

If you just created a new conda environment and it doesn’t appear in Positron’s interpreter list, try restarting Positron. It sometimes needs a fresh start to detect new environments.

If you use the Musser Lab’s /new-project skill in Claude Code, it configures the Positron interpreter automatically as part of project setup — one less thing to remember.

8.6 Sharing Environments

8.6.1 environment.yml

Just as renv uses renv.lock to record your R packages, conda uses a file called environment.yml to record your Python environment. This file lists the environment name, which channels to use, and which packages are installed — everything someone else needs to recreate your setup.

To create this file from your current environment:

conda env export --from-history > environment.yml

The --from-history flag is important. It records only the packages you explicitly asked for, not every dependency that was pulled in automatically. This makes the file portable across operating systems: a colleague on Windows can recreate an environment you built on macOS, because conda resolves platform-specific dependencies at install time. One caveat: --from-history does not record packages you installed with pip, so list those by hand (environment.yml supports a nested pip: list under dependencies).

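The resulting file looks like this (conda also appends a prefix: line recording where the environment lives on your machine, which you can delete before sharing):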

name: my-project
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - matplotlib
  - ipykernel
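
To make the file’s structure concrete, here is a minimal sketch that reads the dependencies list back out using only Python’s standard library. It is deliberately simple and only understands the flat layout shown above (no nested pip: section); for anything more complex a real YAML library would be the better tool. The function name read_dependencies is hypothetical.

```python
def read_dependencies(yaml_text):
    """Collect the '- item' entries listed under the top-level 'dependencies:' key."""
    deps = []
    in_deps = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if not line.startswith((" ", "\t", "-")):
            # An unindented line is a top-level key: it opens or closes the section
            in_deps = stripped == "dependencies:"
            continue
        if in_deps and stripped.startswith("- "):
            deps.append(stripped[2:].strip())
    return deps


example = """\
name: my-project
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - pandas
  - matplotlib
  - ipykernel
"""

print(read_dependencies(example))
```

Only the entries under dependencies are returned; the channel list is skipped because it sits under a different key.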

8.6.2 Recreating an Environment from a File

When a collaborator shares a project with an environment.yml, they can recreate the environment with one command:

conda env create -f environment.yml

This creates the environment, installs all the packages, and sets it up so they can activate it and start working. It’s the conda equivalent of running renv::restore() in R.

Tip: Keep environment.yml in Git

Commit environment.yml to your Git repository alongside your code and renv.lock. When you install new packages, re-export the file and commit the update. The Claude Code /done command can help manage this as part of your session wrap-up.

8.6.3 Full Export (When Needed)

For exact reproducibility on the same platform, you can do a full export:

conda env export > environment.yml

This includes exact version numbers and platform-specific build strings for every package, including dependencies. It’s useful for archiving a precise state or debugging, but the file won’t work on a different operating system. Use --from-history for sharing and full export for archiving.
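
In a full export each dependency line has the shape name=version=build, where the build string is the platform-specific part. The sketch below shows that structure by trimming the build string off; the build strings in the sample are invented placeholders, not real conda builds, and the function name strip_build is made up. Conda’s own --no-builds option for conda env export leaves the build strings out at export time.

```python
def strip_build(spec):
    """Turn 'name=version=build' into 'name=version'; leave shorter specs alone."""
    parts = spec.split("=")
    return "=".join(parts[:2])


# Invented sample lines in the shape a full export uses
full_specs = [
    "numpy=1.26.4=py311h0000000_0",
    "python=3.11.9=h0000000_0_cpython",
    "pandas",
]

print([strip_build(spec) for spec in full_specs])
```

Dropping build strings alone does not make a full export portable, since it still lists dependencies that exist on only one operating system.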

8.7 Best Practices

8.7.1 One Environment Per Project

Every project in the lab gets its own conda environment, named after the project. This keeps things clean and avoids the situation where updating a package for one project silently breaks another. It also means environment.yml accurately reflects what that specific project needs.

conda create -n tihkal-analysis python=3.11 numpy pandas matplotlib ipykernel
conda create -n imaging-project python=3.11 numpy pandas scikit-image ipykernel

Name your environments after the project — short, descriptive, and easy to type. Avoid generic names like env1 or test that tell you nothing when you see them in conda env list six months later.

8.7.2 Never Install into Base

When you open a terminal with conda configured, you’ll see (base) in your prompt. This is the base environment — it’s where conda itself lives. Never install project packages into base. It’s tempting (it’s right there, already active), but packages in base affect every project on your machine. Keep base clean and always create a dedicated environment before installing anything.

8.7.3 Pin Your Python Version

Always specify a Python version when creating an environment:

conda create -n my-project python=3.11

Without a version, conda installs whatever Python is newest — which means your environment might get Python 3.12 today and someone recreating it might get 3.13 next year. Pinning the version ensures consistency. The specific patch version (3.11.x) will be the latest available within that minor version.
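
If you want a script to confirm that it is running on the Python version you pinned, a check along these lines is one option. This is a sketch under simple assumptions: the pin is written as major.minor (such as "3.11"), and the function name matches_pin is hypothetical.

```python
import sys


def matches_pin(pin, version_info=None):
    """True if the major.minor of version_info equals a pin such as '3.11'."""
    if version_info is None:
        version_info = sys.version_info  # the Python running this code
    major, minor = (int(part) for part in pin.split(".")[:2])
    return (version_info[0], version_info[1]) == (major, minor)


print("Running the pinned Python:", matches_pin("3.11"))
```

The patch number is ignored on purpose, matching the way python=3.11 accepts any 3.11.x release.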

Note: Both Conda and renv

In the lab, every project has both an environment.yml (for Python) and an renv.lock (for R), even if the project primarily uses one language. Projects evolve — what starts as an R-only analysis might later need a Python script for preprocessing, or vice versa. Having both environment files in place from the start means you don’t have to retrofit them later when you add a second language.

8.8 Conda vs pip

Both conda and pip install Python packages, and you’ll sometimes need both. The general rule is simple: use conda first, fall back to pip for packages conda doesn’t have.

# Prefer conda for scientific packages
conda install pandas numpy scipy

# Use pip for packages not in conda
pip install session-info

Most scientific packages (pandas, numpy, scipy, scikit-learn, matplotlib) are in conda-forge and install faster and more reliably through conda. Some specialized or newer packages are only on PyPI (Python’s main package repository) and need pip.

When using both in the same environment, install all your conda packages first, then pip packages. This avoids dependency conflicts. Conda is aware of pip-installed packages but can’t manage them, so installing conda packages after pip can sometimes overwrite or break pip-installed ones.
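
When an environment mixes both tools, it can help to see which one installed each Python package. Python’s standard importlib.metadata module can read the INSTALLER file that installers leave next to a package. pip records "pip" there; packages built for conda usually record "conda", but treat that as a hint rather than a guarantee. The function name installers is made up for this sketch, and tools that are not Python packages (samtools, for example) will not appear at all.

```python
from importlib import metadata


def installers():
    """Map each installed Python distribution to the tool named in its INSTALLER file."""
    found = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata["Name"]
            recorded = dist.read_text("INSTALLER")
        except Exception:
            continue  # skip distributions with broken or unreadable metadata
        if not name:
            continue
        found[name] = recorded.strip() if recorded else "unknown"
    return found


for name, tool in sorted(installers().items()):
    print(f"{name}: {tool}")
```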

8.9 Channels and Bioconda

For most data science work, the default conda-forge channel (which you configured in One-Time Setup) has everything you need. But if you work with bioinformatics command-line tools — things like samtools, bedtools, bwa, or STAR — you’ll need an additional channel called bioconda.

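Bioconda is a channel specifically for bioinformatics software, maintained by the bioinformatics community. To add it below conda-forge in priority (using --add instead would put bioconda at the top of the list, ahead of conda-forge):

conda config --append channels bioconda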

After adding bioconda, you can install bioinformatics tools directly into a conda environment:

conda create -n genomics-tools python=3.11 samtools bedtools

You can also specify channels in your environment.yml so collaborators get the right packages automatically:

name: genomics-project
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.11
  - pandas
  - samtools
  - bedtools

The order of channels matters — channels listed first have higher priority. With conda-forge first, general scientific packages come from the well-tested community repository, while bioinformatics-specific tools come from bioconda.
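
As a rough mental model of strict channel priority, conda takes a package from the first channel in the list that carries it and ignores lower channels for that package. The toy sketch below illustrates only that rule; the catalog is a tiny hypothetical stand-in for the real channels, the function name pick_channel is invented, and the real solver also weighs versions and dependencies.

```python
def pick_channel(package, channels, catalog):
    """Return the first channel, in priority order, that carries the package."""
    for channel in channels:
        if package in catalog.get(channel, set()):
            return channel
    return None  # no configured channel has it


# Hypothetical, heavily simplified catalog for illustration only
catalog = {
    "conda-forge": {"python", "pandas", "numpy"},
    "bioconda": {"samtools", "bedtools"},
}
channels = ["conda-forge", "bioconda"]  # same order as the environment.yml above

for package in ["pandas", "samtools", "not-a-package"]:
    print(package, "->", pick_channel(package, channels, catalog))
```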

Warning: Claude Code

Claude Code can diagnose environment issues by checking which environment is active, what’s installed, and where packages are missing.

I’m getting ModuleNotFoundError: No module named 'pandas' when I run my Python script in Positron, but I installed pandas in my conda environment. Here’s what conda list shows: [paste output]. What’s wrong?

Claude will check whether the package is installed in the right environment, whether Positron is pointing at a different Python interpreter, and suggest the specific fix — often it’s as simple as selecting the right interpreter in the Command Palette.

8.10 Troubleshooting

8.10.1 macOS: “conda: command not found”

Your shell doesn’t know where conda is. If you used the Miniforge installer and accepted the default location (~/miniforge3), run:

source ~/miniforge3/etc/profile.d/conda.sh

To make this permanent, add that line to your shell configuration file (~/.zshrc for zsh, which is the default on modern macOS, or ~/.bashrc for bash).

8.10.2 Windows: “conda is not recognized”

The Miniforge installer may not have added conda to your PATH. You have two options:

  1. Use the “Miniforge Prompt” from the Start menu — this opens a terminal with conda pre-configured
  2. Add Miniforge to your PATH manually: search “Environment Variables” in Windows Settings and add C:\Users\YourName\miniforge3\Scripts to your PATH

8.10.3 Windows: “conda activate” Does Nothing

This means conda isn’t initialized for PowerShell. Run:

conda init powershell

Then close and reopen your terminal completely. If you’re using Positron’s integrated terminal, close and reopen the entire application.

8.10.4 “Solving environment” Takes Forever

On an older conda release you may still be using the slower classic solver. Run:

conda config --set solver libmamba

If you’ve already set libmamba and it’s still slow, you likely have conflicting package requirements. Try loosening version constraints or starting fresh (see below).

8.10.5 Dependency Conflicts

If conda reports a “conflict” or can’t find a compatible set of packages, try these in order:

  1. Loosen version pins — let conda choose versions instead of specifying exact ones
  2. Start fresh — sometimes it’s easier to recreate the environment than to fix conflicts:
conda deactivate
conda env remove -n my-project
conda create -n my-project python=3.11 numpy pandas matplotlib ipykernel

Then add other packages one at a time to identify which one causes the conflict.

8.10.6 Package Not Found

Check the spelling, make sure conda-forge is in your channels (conda config --show channels), and search for the package: conda search package-name. If it’s not in any conda channel, install it with pip instead.

8.10.7 Environment Seems Corrupted

If an environment is behaving strangely and you can’t figure out why, remove it and recreate from your environment.yml:

conda deactivate
conda env remove -n my-project
conda env create -f environment.yml

This is the conda equivalent of renv::restore() — it rebuilds the environment from your recorded specification. This is also why keeping environment.yml up to date and committed to Git matters: it’s your safety net.

Warning: Claude Code

Claude Code is good at diagnosing environment and interpreter issues because it can check your conda configuration, Positron settings, and installed packages all at once.

I created a conda environment called analysis-env and it shows up in conda env list, but when I open “Python: Select Interpreter” in Positron, it’s not in the list. I’ve restarted Positron already. What else can I try?

Claude will check your conda installation path, Positron’s Python discovery settings, and suggest targeted fixes — often a configuration path issue or a Positron extension that needs reloading.

8.11 Quick Reference

Task | Command
Create environment | conda create -n NAME python=3.11 numpy pandas matplotlib ipykernel
Activate | conda activate NAME
Deactivate | conda deactivate
Install package | conda install PACKAGE
List packages | conda list
List environments | conda env list
Export environment | conda env export --from-history > environment.yml
Create from file | conda env create -f environment.yml
Remove environment | conda env remove -n NAME
Set fast solver | conda config --set solver libmamba
Set channel priority | conda config --set channel_priority strict
Add bioconda channel (below conda-forge) | conda config --append channels bioconda