Lab 06 — The pip-conda trap and how to recover

Goal

Experience — in a deliberately controlled setting — the most common way Python environments silently break on HPC: a pip install that re-installs a package conda already manages, producing an environment that looks fine but actually has incompatible binary versions of NumPy/SciPy linked against the wrong BLAS.

You’ll create a fresh env, wreck it on purpose, diagnose what happened, recover, and then redo it the right way. The discomfort is the lesson.


Reading

Budget ~20 minutes for the reading.


Learning objectives

  1. Recognize when a pip install is about to overwrite a conda-managed package, and why that’s dangerous.
  2. Use mamba list’s channel column to audit which packages were installed by mamba vs. by pip.
  3. Recover from a broken env by deletion-and-rebuild.
  4. Install pip-only packages safely with pip install --no-deps after all mamba installs are done.

Setup / prerequisites

  • Lab 05 complete — mamba is installed and your eslab env works.
  • Don’t do this in your eslab env — we’ll create a separate throwaway env. The risk is contained.

Tasks

1. Create a fresh test env (5 min)

mamba create -n piptrap python=3.11 numpy scipy pandas
mamba activate piptrap

2. Take a snapshot of the “before” state (3 min)

mamba list > ~/lab06_before.txt
mamba list | grep -E "^(numpy|scipy|pandas) "

Note the Channel column for each — they should all say conda-forge. Note the versions.

python -c "import numpy; print(f'numpy file: {numpy.__file__}')"
# Should print something like: .../envs/piptrap/lib/python3.11/site-packages/numpy/__init__.py

3. Deliberately wreck the env (5 min)

Now do the thing the handbook warned you about:

pip install --force-reinstall numpy

This forces pip to download NumPy from PyPI (a different build than the conda-forge one) and overwrite the conda-managed version.

You’ll see pip download a NumPy wheel from PyPI:

Collecting numpy
  Downloading numpy-X.X.X-cp311-cp311-manylinux_*.whl (...)

4. Inspect the damage (5 min)

mamba list > ~/lab06_after_wreck.txt
mamba list | grep -E "^(numpy|scipy|pandas) "

Compare to before. The Channel column for NumPy should now say pypi instead of conda-forge. This means mamba no longer manages it — pip does.
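If eyeballing the two snapshots feels error-prone, the comparison can be scripted. The sketch below is one possible approach, not part of the handbook: it assumes the snapshots are whitespace-separated tables with columns name, version, build, channel, and the helper names parse_mamba_list and channel_changes are made up for illustration. Header handling is a guess at the common mamba/conda layouts and may need adjusting for your version.

```python
from pathlib import Path


def parse_mamba_list(text):
    """Map package name -> (version, channel) from `mamba list` text output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, '#' comment headers, and horizontal table rules.
        if not line or line.startswith("#") or set(line) <= set("─-= "):
            continue
        # Assumption: some mamba versions print this banner line first.
        if line.startswith("List of packages"):
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0].lower() == "name":
            continue
        # Assumed column layout: name, version, build, channel.
        channel = fields[3] if len(fields) >= 4 else ""
        packages[fields[0]] = (fields[1], channel)
    return packages


def channel_changes(before, after):
    """Return {name: (old_channel, new_channel)} for packages whose channel changed."""
    return {
        name: (before[name][1], after[name][1])
        for name in sorted(before.keys() & after.keys())
        if before[name][1] != after[name][1]
    }


if __name__ == "__main__":
    before_path = Path.home() / "lab06_before.txt"
    after_path = Path.home() / "lab06_after_wreck.txt"
    if before_path.exists() and after_path.exists():
        changes = channel_changes(
            parse_mamba_list(before_path.read_text()),
            parse_mamba_list(after_path.read_text()),
        )
        for name, (old, new) in changes.items():
            print(f"{name}: {old} -> {new}")
```

Run after Task 4, this would be expected to report numpy moving from conda-forge to pypi and nothing else.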

Try a quick functional test:

python -c "
import numpy as np
import scipy.linalg as la
print('numpy:', np.__version__, '  file:', np.__file__)
print('scipy svd singular values (expect all 1.0):', la.svd(np.eye(5))[1])
"

It might still work (if the binary ABIs happen to be compatible), or it might fail with a cryptic linker error, or — worst case — silently produce wrong results. The point is: you no longer know. Conda’s promise that “everything in this env is consistent” has been broken.

5. Diagnose in writing (10 min)

In a file ~/lab06/incident.md, write a mini incident report. Try to answer:

  • What command did you run that caused the breakage?
  • What changed in mamba list output before vs. after?
  • Why does pip not “know” that numpy was already installed by conda?
  • Why is this potentially worse than an outright crash? (Hint: silent wrong results.)
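For the third question, it may help to look at the two sets of bookkeeping directly. The following is a rough sketch under stated assumptions: conda keeps one JSON record per package in the env's conda-meta/ directory, while pip relies on the *.dist-info metadata in site-packages (including an INSTALLER file that normally names the tool that wrote it). The function names conda_managed and pip_view are hypothetical.

```python
import json
import sys
from importlib import metadata
from pathlib import Path


def conda_managed(prefix=sys.prefix):
    """Return {name: version} from the env's conda-meta/*.json records."""
    meta_dir = Path(prefix, "conda-meta")
    if not meta_dir.is_dir():
        return {}
    records = {}
    for path in meta_dir.glob("*.json"):
        data = json.loads(path.read_text())
        records[data["name"]] = data["version"]
    return records


def pip_view(dist_name):
    """Return (version, installer) from site-packages metadata, or None if absent."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    installer = (dist.read_text("INSTALLER") or "unknown").strip()
    return dist.version, installer


if __name__ == "__main__":
    print("conda record for numpy:", conda_managed().get("numpy"))
    print("site-packages view of numpy:", pip_view("numpy"))
```

Neither tool updates the other's records, so in a wrecked env the two views would be expected to disagree about NumPy's version or installer. Treat the exact output as something to observe rather than assume.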

6. Recover (5 min)

The safest recovery is nuking and recreating:

mamba deactivate
mamba env remove -n piptrap
mamba create -n piptrap python=3.11 numpy scipy pandas
mamba activate piptrap
mamba list | grep -E "^(numpy|scipy) "

The Channel column should now say conda-forge again. You’re back to a known-good state.

7. Now do it the RIGHT way (10 min)

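Suppose you need a package that genuinely only exists on PyPI, typically some small niche library. Here we use tabulate as a stand-in because it is small and pure Python (it is in fact also packaged on conda-forge, so in real work you would check with mamba search first). The rule from Handbook §6.3: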

  1. Install everything mamba can provide FIRST, in one mamba install.
  2. Only after that, install pip-only packages with pip install --no-deps.
  3. Never pip install --upgrade <something-conda-installed>.
  4. Check mamba list afterwards — pip-installed packages show with a pypi channel marker.

Do it:

# mamba env is already activated. We've already done step 1 (mamba create).
# Step 2 — install ONE pip-only package safely:
pip install --no-deps tabulate

mamba list | grep -E "^(numpy|scipy|tabulate) "
# Expected:
#   numpy      X.X.X   <build>   conda-forge
#   scipy      X.X.X   <build>   conda-forge
#   tabulate   X.X.X   pypi_0    pypi

tabulate is from pip (good), but the conda packages weren’t touched. The --no-deps flag prevented pip from “helpfully” replacing anything that conda already manages.
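Because --no-deps skips dependency resolution entirely, it becomes your job to confirm that whatever the package needs is already present. A minimal sketch of such a check follows, using only the standard library. The helper names are hypothetical, and the check is deliberately shallow: it only detects requirements that are missing altogether and ignores version ranges and environment markers other than extras.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the bare distribution name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1) if match else None


def unmet_requirements(dist_name):
    """List declared requirements of dist_name that are not installed at all.

    Version ranges are NOT evaluated; this only catches missing packages.
    Raises PackageNotFoundError if dist_name itself is not installed.
    """
    missing = []
    for req in metadata.requires(dist_name) or []:
        if re.search(r"extra\s*==", req):  # optional extras are skipped
            continue
        name = requirement_name(req)
        if name is None:
            continue
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(req)
    return missing
```

Inside the piptrap env, unmet_requirements("tabulate") would be expected to return an empty list. For a fuller check that also evaluates version ranges, pip check does the job from the command line; anything it reports should be installed with mamba, not pip.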

8. Clean up (2 min)

mamba deactivate
mamba env remove -n piptrap

You don’t need this throwaway env anymore; the eslab env from Lab 05 is what you’ll use for the rest of the course.


Deliverables

Save to lab06/ in your personal repo:

  1. lab06/before.txt: mamba list output from Task 2 (before the wrecking), i.e. your ~/lab06_before.txt.

  2. lab06/after_wreck.txt: mamba list output from Task 4 (after the bad pip install), i.e. your ~/lab06_after_wreck.txt. Highlight (with a comment line at the top, or a separate diff.txt) the row where the Channel changed from conda-forge to pypi.

  3. lab06/incident.md: your mini incident report from Task 5.

  4. lab06/after_recovery.txt: mamba list output captured at the end of Task 7, showing the recovered env with a safely-pip-installed tabulate (or whatever you chose) alongside conda-managed numpy/scipy.

  5. lab06/reflection.md: 5–7 sentences addressing the following.

    • Articulate the rule for safely mixing pip and conda in one sentence.
    • Why is pip install --no-deps better than plain pip install when working inside a mamba env?
    • What would you do now if you discovered a mamba env you’ve been working in for weeks has a pip-installed numpy lurking inside it?

Self-check


Common issues

❌ pip install --no-deps says package depends on numpy>=1.20

That’s pip informing you of a dependency, not breaking the install. As long as numpy is already installed via mamba at a compatible version, you’re fine. --no-deps is what’s preventing pip from helpfully “fixing” your numpy install.

❌ My pip-only package needs a build (compiles C code)

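This happens with niche packages. Make sure mamba has installed a compiler toolchain first (the Python headers already ship with conda’s python package, so no separate python-dev package is needed):

mamba install -n piptrap c-compiler cxx-compiler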

Then retry the pip install --no-deps.

❌ I pip-installed something with regular pip install (no --no-deps) and it reinstalled numpy. How worried should I be?

If your code still imports cleanly and produces the same answers on a known test, probably fine in practice. But the “consistency guarantee” is broken — you’ve left a foothold for future weirdness. The safest move is to mamba env remove and recreate; the pragmatic move is to note it in your env’s README and move on, with the understanding that a future bug-hunt may lead back here.
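A "known test" can be as small as a handful of computations whose exact answers you can work out by hand. The sketch below is one way to set that up. The function names, the choice of test matrix, and the tolerances are all illustrative assumptions, not a recommendation from the handbook. NumPy is imported lazily, so the comparison helper itself has no dependencies.

```python
import math


def matches_reference(computed, reference, rel_tol=1e-9, abs_tol=1e-12):
    """True if every computed value is close to its stored reference value."""
    computed, reference = list(computed), list(reference)
    if len(computed) != len(reference):
        return False
    return all(
        math.isclose(c, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for c, r in zip(computed, reference)
    )


def numpy_smoke_values():
    """A few linear-algebra results whose exact answers are known analytically."""
    import numpy as np  # lazy import: only needed when actually probing the env

    a = np.array([[2.0, 0.0], [0.0, 3.0]])
    x = np.linalg.solve(a, np.array([2.0, 3.0]))  # exact solution: [1, 1]
    s = np.linalg.svd(a, compute_uv=False)        # exact singular values: [3, 2]
    return [*x, *s, float(np.linalg.det(a))]      # exact determinant: 6


# Hand-derived answers for the diagonal matrix above.
REFERENCE = [1.0, 1.0, 3.0, 2.0, 6.0]
```

Inside the env, matches_reference(numpy_smoke_values(), REFERENCE) should come back True on a healthy install. A test this small only catches gross breakage, so a regression case drawn from your own research code is a better long-term reference.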

❌ I want a specific version of numpy that conda-forge doesn’t have

Two options:

  • Use a different channel (mamba install -c some-channel numpy=X.Y.Z). Pin strict channel priority first.
  • Use mamba install "numpy=X.Y" with a looser pin that conda-forge can satisfy.

pip install numpy==X.Y.Z is the wrong answer.


Time estimate

  • Reading: ~20 min
  • Tasks: ~45 min
  • Deliverables (especially writing up the incident report): ~15 min

Total: ~80 min


Extensions (optional)

Inspect the actual binary differences

After the wrecking step, before recovering, run:

python -c "
import numpy as np
print('numpy version:', np.__version__)
print('numpy library file:', np.__file__)
print('numpy build config (BLAS/LAPACK):')
np.show_config()
"

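np.show_config() shows which BLAS/LAPACK NumPy was built against. The conda-forge build usually links against an OpenBLAS or MKL that lives in the env’s own lib/ directory and is shared with SciPy; the PyPI wheel typically bundles its own private copy of OpenBLAS inside the wheel. After the wreck, NumPy and SciPy may therefore be using two different BLAS builds in the same process. This is the technical heart of why mixing breaks things.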
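np.show_config() reports build-time settings. To see which BLAS shared libraries are actually loaded at run time, one option on Linux is to scan the process's memory map. This is a Linux-only sketch with hypothetical helper names; the filename pattern is a heuristic and may miss or over-match libraries on your system.

```python
import re
from pathlib import Path

# Heuristic: filenames containing these fragments are treated as BLAS/LAPACK.
BLAS_PATTERN = re.compile(r"(blas|mkl|blis|lapack)", re.IGNORECASE)


def blas_libraries(maps_text):
    """Return sorted unique BLAS-like shared-library paths from a /proc/<pid>/maps dump."""
    paths = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode pathname (pathname may be absent).
        fields = line.split(maxsplit=5)
        if len(fields) != 6:
            continue
        path = fields[5]
        if ".so" in path and BLAS_PATTERN.search(Path(path).name):
            paths.add(path)
    return sorted(paths)


def loaded_blas_after_import():
    """Import NumPy and SciPy's linalg, then list the BLAS libraries now mapped."""
    import numpy.linalg  # noqa: F401  (importing maps NumPy's BLAS into the process)
    import scipy.linalg  # noqa: F401  (same for SciPy's)

    return blas_libraries(Path("/proc/self/maps").read_text())
```

Calling loaded_blas_after_import() in the wrecked env and again after recovery makes the difference concrete. Two distinct OpenBLAS paths in the wrecked env (one under the env's lib/, one under site-packages) would indicate that NumPy and SciPy are no longer sharing a single BLAS.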

Try pip install --upgrade numpy instead of --force-reinstall

Both are dangerous in the same way. --upgrade is sneakier because pip will only re-install if the PyPI version is newer than what’s in the env. Sometimes it silently does nothing, sometimes it silently swaps your numpy out.

Try Poetry or pixi as alternative dependency managers

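pixi is a newer project manager that tries to avoid this pitfall by resolving conda and PyPI packages together and recording both in one lock file. Poetry also uses a lock file, but it manages PyPI packages only and does not track conda installs. Both are worth knowing about for greenfield projects, though most existing HPC research code is built around mamba.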


What’s next?

With env-management foundations in place, Lab 07 — Jupyter on the cluster uses your eslab env to run notebooks served from Unity in your laptop’s browser.