BuckAI HPC Handbook

Practical notes from the BuckAI Observatory on using OSU's Unity HPC cluster: SSH, AI coding assistants (Claude Code, Copilot, Gemini), mamba, tmux, Jupyter, and the workflows that make them all play nicely together.
Tip: 👋 Never heard of HPC, Unity, or Slurm?

Start with "Why HPC, and is it for you?", a beginner-friendly explanation of what an HPC cluster actually is, why a grad student might (or might not) need one, and an honest assessment of who this handbook is for. Especially recommended for Windows users and anyone whose only computing experience is on a personal laptop.

What this handbook is

A practical, opinionated reference for using OSU’s Unity HPC cluster, written for BuckAI Observatory students, postdocs, and collaborators. It takes you from “I just got my HPC account and the SSH prompt is rejecting me” to “I’m training models on GPU nodes from VS Code on my laptop, with my AI coding assistant (Claude Code / Copilot / Gemini) helping me debug, while sharing a reproducible mamba environment with my labmates.”

It’s opinionated about workflows that work well in practice, and warns about the ones that look fine but break in subtle ways.

Browse by topic

 SSH

Connect securely from your laptop to Unity through the ASC jumphost, with one Duo prompt per ten-minute window instead of one per connection.

 HPC fundamentals

The shell, environments, and patterns that turn a raw cluster account into a productive setup.

 Slurm

Submit jobs, request the right resources, and avoid blocking the cluster for others.
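
To make "request the right resources" concrete, here is a minimal sketch of what a batch script's resource requests look like. The numbers are arbitrary illustrations rather than recommendations, the partition name `batch` is only the common default mentioned under Conventions, and the job name is hypothetical. The `#SBATCH` lines are plain comments to bash, so the script also runs outside Slurm as a dry run.

```shell
#!/usr/bin/env bash
# Resource requests: Slurm reads these comment lines when the file is
# submitted with sbatch. All values below are illustrative assumptions.
#SBATCH --job-name=example-job      # hypothetical name shown in the queue
#SBATCH --partition=batch           # assumed partition; use your own
#SBATCH --nodes=1                   # run on a single node
#SBATCH --cpus-per-task=4           # CPU cores for the task
#SBATCH --mem=16G                   # memory for the job on that node
#SBATCH --time=02:00:00             # walltime limit (HH:MM:SS)
#SBATCH --output=slurm-%j.out       # log file; %j expands to the job ID

set -euo pipefail

# Slurm sets these variables inside a job. The fallbacks let the script
# run outside Slurm without failing on an unset variable.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
cpus="${SLURM_CPUS_PER_TASK:-1}"

summary="job=${job_id} cpus=${cpus} host=${HOSTNAME:-unknown}"
echo "$summary"

# Your real workload would go here.
```

Saved under a name of your choice (say `job.sh`, a hypothetical filename), it would be submitted with `sbatch job.sh`. Asking for less memory, fewer cores, or a shorter walltime generally makes a job easier for the scheduler to place, which is why the Slurm pages treat right-sizing as the key skill.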

Where to start

If you’ve just been given an OSU HPC account and want a sensible reading order, work through these in sequence:

  1. SSH Setup: write a working ~/.ssh/config with connection multiplexing so you only Duo-tap once.
  2. SSH Keys: the concepts behind why your setup is secure, and how to manage keys long-term.
  3. VS Code Remote-SSH + AI Coding Assistants: set up the editor + AI assistant (Claude Code, Copilot, or Gemini) that you’ll spend most of your time in.
  4. Shell Environment: make .bashrc work for you (aliases, PATH, umask 002 for group collaboration, your Unix groups and Slurm partitions).
  5. Persistent Sessions: keep work alive across disconnects with tmux and the livenode function.
  6. Python Environments: install mamba, create per-project envs, and avoid the nightmare of mixing pip with conda incorrectly.
  7. Jupyter & TensorBoard: run interactive notebooks and live training dashboards in your laptop’s browser while compute happens on Unity.
  8. Slurm Basics: how to submit unattended jobs with sbatch.
  9. Slurm Best Practices: the most important Slurm skill, right-sizing memory, CPU, and walltime requests so your jobs run sooner and don’t block others.
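
As a rough preview of what step 1 builds toward, the sketch below writes an example SSH config with connection multiplexing to a scratch file, so your real ~/.ssh/config is left alone. The host aliases (`asc-jump`, `unity`) and both `example.edu` hostnames are hypothetical placeholders, not the real ASC or Unity addresses, and this is not the handbook's actual config; the SSH Setup page has the real values.

```shell
#!/usr/bin/env bash
# Sketch only: writes an example SSH config to a temporary file.
# Aliases and hostnames are hypothetical placeholders.
set -euo pipefail

cfg="$(mktemp)"

cat > "$cfg" <<'EOF'
# Hypothetical jumphost entry; replace HostName with the real address.
Host asc-jump
    HostName jumphost.example.edu

# Hypothetical cluster entry, reached through the jumphost above.
Host unity
    HostName unity.example.edu
    ProxyJump asc-jump

# Settings shared by both entries.
Host asc-jump unity
    User yourname.##
    IdentityFile ~/.ssh/buckai_key
    # Let later sessions reuse one already-authenticated connection.
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h-%p
    # Keep the shared connection open in the background until it has
    # had no client sessions for 10 minutes.
    ControlPersist 10m
EOF

echo "Example config written to: $cfg"
```

The `ControlMaster`, `ControlPath`, and `ControlPersist` options are what let new terminals and editor sessions ride on an existing connection instead of triggering a fresh login each time.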

Returning to look something up? Use the search box at the top of the sidebar; it indexes every page.

Conventions used in this handbook

  • Copy-paste code: Code blocks are ready to paste, with placeholders you replace:
    • yourname.##: your OSU username (e.g. smith.123)
    • <group>: your Slurm partition name (often batch) or your Unix group name
    • <username>: your OSU username again (same value as yourname.##), in path contexts
    • mynode: a real compute-node hostname (on Unity these follow uXXX: u101, u250, u500, etc.)
    • buckai_key: the name we use for your SSH private key
  • Emoji legend:
    • ✔: a recommended practice or thing to do
    • ❌: a problem symptom or common mistake
    • ✅: a verification step (“you should see this”)
    • ⚠: a caution or non-obvious gotcha

About

This handbook is maintained by the BuckAI Observatory at Ohio State University.