Lab 04 — Persistent sessions: tmux + livenode
Goal
Stop losing your work to network blips, laptop sleeps, and coffee-shop WiFi. Learn tmux well enough to detach from and re-attach to a long-running session on the cluster from any machine, anywhere, and install the livenode() function that combines tmux with sinteractive for persistent compute-node sessions you can reconnect to days later.
This is the workflow that makes long-running Claude Code sessions (or any CLI-based AI assistant), multi-hour exploratory analyses, and overnight notebook runs robust. (Copilot and Gemini live in the VS Code extension and persist differently — through VS Code’s auto-reconnect — so tmux/livenode mostly applies to your other long-running cluster work in their case.)
Reading
- Handbook: Persistent Sessions: tmux, nohup, and livenode — read end-to-end (~20 minutes).
Pay particular attention to Section 3 (the livenode pattern) and Section 6.2 (the tmux vs. nohup Slurm chain-of-death — this is non-obvious and worth understanding now rather than discovering it the hard way).
Learning objectives
- Use the four essential tmux operations confidently: new session, detach (Ctrl+b d), list, attach.
- Install the livenode() function from the handbook into your Unity .bashrc.
- Demonstrate empirically that a process started inside a tmux session survives SSH disconnection and can be reattached from a different machine.
- Explain (in writing) why nohup alone wouldn’t keep a Jupyter session alive on a compute node: the Slurm allocation chain-of-death.
Setup / prerequisites
- Labs 01–03 complete: SSH config + multiplexing working, .bashrc configured.
- tmux available on Unity. Verify with which tmux; it should print a path. If not, try module load tmux or check with cluster admins.
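The availability check can be wrapped in a small function if you want to reuse it. This is a minimal sketch: check_tmux is a hypothetical helper name, and the module load tmux hint just repeats the suggestion above.

```shell
# check_tmux: report whether tmux is available (hypothetical helper name).
check_tmux() {
    if command -v tmux >/dev/null 2>&1; then
        echo "tmux found: $(command -v tmux)"
        tmux -V    # prints the installed tmux version
    else
        echo "tmux not found; try 'module load tmux' or ask the cluster admins" >&2
        return 1
    fi
}
```

Run check_tmux right after logging in to Unity; a non-zero exit status means tmux still needs to be loaded.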
Tasks
1. tmux basics — your first session (10 min)
SSH to Unity (via VS Code’s integrated terminal or a regular Terminal). Then:
tmux new -s test1
You’re now inside a tmux session named test1. The prompt may look slightly different, and you should see a green status bar at the bottom of the terminal.
Inside the session, start something long-running so you can prove it survives:
echo "Started at $(date)"
( while true; do echo "tick $(date)"; sleep 10; done ) | tee tmux_log.txt
(The tee pipes output both to your terminal and to a log file. Useful evidence later.)
2. Detach (5 min)
While inside the session, press Ctrl+b then release, then press d (for “detach”).
You should see:
[detached (from session test1)]
You’re back at the regular Unity prompt. The tick loop is still running inside the detached session.
Verify:
tmux ls
# test1: 1 windows (created Thu ...) [193x42]
cat tmux_log.txt
# Should show ticks accumulating
3. Force a real disconnect (5 min)
Now do something that would have killed a non-tmux process:
exit # quit your SSH session entirely
Your terminal disconnects. Wait 30 seconds (so the tick loop accumulates more entries while you’re disconnected).
Then SSH back in:
ssh unity
tmux ls
# test1 should still be there
cat ~/tmux_log.txt
# Should show ticks that happened DURING your disconnection
✅ Self-check: tmux_log.txt contains ticks from while you were disconnected. The process kept running.
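One way to make this self-check less dependent on eyeballing timestamps is to count the tick lines before you disconnect and again after you reconnect. A minimal sketch, assuming the log sits in your home directory; count_ticks is a hypothetical helper name.

```shell
# count_ticks: print how many lines starting with "tick " a log file contains.
# The default path is an assumption: the log written from your home directory.
count_ticks() {
    local file="${1:-$HOME/tmux_log.txt}"
    grep -c '^tick ' "$file"
}
```

Note the number before you type exit, then run count_ticks again after reconnecting. With one tick every 10 seconds, a 30-second absence should add roughly three lines.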
4. Reattach (5 min)
tmux attach -t test1
# or: tmux a -t test1
You’re back inside the session, watching the ticks roll by. Press Ctrl+c to stop the while loop, then exit to end the session.
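If you would rather not remember whether a session already exists, tmux new-session accepts an -A flag that attaches to the named session when it exists and creates it otherwise. A minimal sketch; the function name tm and the default session name main are hypothetical choices.

```shell
# tm: attach to the named session if it exists, otherwise create it.
# "tm" and the default name "main" are illustrative, not from the handbook.
tm() {
    tmux new-session -A -s "${1:-main}"
}
```

For example, tm test1 gives you the test1 session whether or not it is already running.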
5. Install the livenode() function (10 min)
Edit ~/.bashrc (via VS Code or nano) and append:
# Wrapper for a persistent interactive compute-node session.
# Usage: livenode # uses default session name "jbm_node"
# livenode mywork # named session
# If a session by that name exists, reattach to it.
# Otherwise, create a new tmux session that requests a compute node.
livenode() {
    local session="${1:-jbm_node}"
    if tmux has-session -t "$session" 2>/dev/null; then
        tmux attach -t "$session"
    else
        tmux new-session -s "$session" "sinteractive -p batch --cpus-per-task=2 --mem=8G --time=04:00:00; bash"
    fi
}
Adjust the sinteractive parameters to your real needs and partition. Save the file and reload:
source ~/.bashrc
type livenode # should show the function body
6. Use livenode end-to-end (10 min)
Run it:
livenode
What should happen:
- tmux launches a new session named jbm_node.
- Inside the session, sinteractive requests a compute node.
- After a wait (could be seconds, could be minutes depending on queue), you land on a compute node. Check with hostname.
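Besides hostname, the shell environment can hint at whether you are inside an allocation. The sketch below assumes that Slurm exports SLURM_JOB_ID into the shell that sinteractive starts, which is typical for Slurm jobs but not confirmed here for Unity; show_allocation is a hypothetical helper name.

```shell
# show_allocation: print the current host and any Slurm job ID.
# Assumption: SLURM_JOB_ID is set inside the sinteractive shell.
show_allocation() {
    echo "host: $(hostname)"
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "slurm job: $SLURM_JOB_ID"
    else
        echo "slurm job: none (probably still on a login node)"
    fi
}
```

Run it once on the login node and once inside livenode; the host name and the job line should both change.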
Once on the compute node, start something Python-flavored:
# (your env from Lab 5 isn't installed yet — that's OK; use system python for this test)
python3 -c "
import time, datetime
while True:
    print(f'still alive: {datetime.datetime.now()}', flush=True)
    time.sleep(10)
" | tee livenode_log.txt
Detach: Ctrl+b then d. SSH disconnect: exit.
7. Reconnect to the running livenode session (5 min)
After at least 60 seconds away:
ssh unity
livenode # since session exists, this reattaches
You should land back inside the tmux session, on the same compute node, watching the Python loop still print. The log file should show entries from while you were away.
✅ Self-check: livenode_log.txt has entries spanning your disconnect.
8. Compare to nohup (10 min — conceptual)
Re-read Handbook §6.2. Make sure you can answer: why wouldn’t nohup python script.py & work as a replacement for livenode if I want my script to keep running on a compute node?
Write your answer in lab04/reflection.md (see deliverables below).
9. Clean up (3 min)
When you’re done with the session:
# Inside the tmux session:
exit # ends the inner shell
# tmux auto-kills the session when its last window's process exits
# Or, from outside:
tmux kill-session -t jbm_node
Deliverables
Save to lab04/ in your personal repo:
- lab04/tmux_log.txt: the log file from Tasks 1–4 showing ticks that happened during your disconnect period.
- lab04/livenode_log.txt: the Python loop log from Tasks 6–7, with entries spanning your disconnect.
- lab04/livenode_function.sh: a copy of just the livenode() function you added to .bashrc. Redact partition / lab-specific values.
- lab04/reflection.md: 5–7 sentences answering:
  - Why does tmux preserve your session when SSH disconnects? (What is tmux doing that a normal shell isn’t?)
  - On a compute node, why does nohup python script.py & fail to survive disconnect, even though nohup is supposed to ignore SIGHUP?
  - What’s one type of work where nohup is actually the right tool, even on Unity?
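Gathering the first three deliverables can be scripted. This is a sketch under assumptions: both logs live in one directory (your home directory by default), and the livenode() definition in .bashrc starts at column 0 and ends at the first line beginning with a closing brace. The name collect_lab04 and its arguments are hypothetical.

```shell
# collect_lab04: copy the lab artifacts into <repo>/lab04/.
# Arguments (hypothetical): repo dir, directory holding the logs, bashrc path.
collect_lab04() {
    local repo="${1:-.}"
    local logs="${2:-$HOME}"
    local rc="${3:-$HOME/.bashrc}"
    mkdir -p "$repo/lab04"
    cp "$logs/tmux_log.txt" "$logs/livenode_log.txt" "$repo/lab04/"
    # Extract only the livenode() definition: from its opening line
    # to the first line that starts with a closing brace.
    sed -n '/^livenode() {/,/^}/p' "$rc" > "$repo/lab04/livenode_function.sh"
}
```

The script does not redact anything, so edit livenode_function.sh by hand afterwards, and write reflection.md yourself.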
Self-check
Common issues
❌ tmux ls says “no server running on /tmp/tmux-….”
You haven’t started any tmux session yet, or your last one ended cleanly. That message just means “nothing to list.” Start one with tmux new -s NAME.
❌ livenode says “no such session” immediately after creation
This means the command running inside the tmux session exited, so tmux closed the session before it could be verified. Usual causes:
- sinteractive failed (e.g. a bad partition name) and the ; bash fallback didn’t catch it.
- You typed exit inside the session.
To fix it, re-check the partition name in your livenode function.
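A quick way to rule out a misspelled partition is to compare the name against what sinfo reports. A minimal sketch; partition_exists is a hypothetical helper name, and it assumes sinfo is on your PATH on the login node.

```shell
# partition_exists: succeed if the named Slurm partition is listed by sinfo.
partition_exists() {
    # -h drops the header; %P prints partition names, with "*" on the default.
    # grep -x matches whole lines, -F treats the name as a literal string.
    sinfo -h -o '%P' | tr -d '*' | grep -qxF -- "$1"
}
```

For example, partition_exists batch && echo "batch is a valid partition".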
❌ Mouse scrolling doesn’t work inside tmux
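By default, scrolling in your terminal doesn’t see tmux’s internal scrollback. Two options:
- Enter copy mode: Ctrl+b then [, then use arrow keys / PgUp; press q to exit.
- Permanent fix: add set -g mouse on to ~/.tmux.conf, then run tmux kill-server once and restart your sessions. Be aware that kill-server ends every tmux session you have running, so do it only when nothing important is alive. Running tmux source-file ~/.tmux.conf reloads the config without killing anything.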
❌ I can’t tell whether my session was killed or just detached
tmux ls
If your session appears, it’s still alive (just detached). If not, it died, most likely because the Slurm allocation expired (look at the --time you requested in your livenode function) or you typed exit in the last window.
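The same check can be scripted so that it also tells attached from detached. A minimal sketch, relying on the tmux format variables session_name and session_attached (the number of clients attached); session_state is a hypothetical helper name, and session names containing spaces are not handled.

```shell
# session_state: print "attached", "detached", or "gone" for a session name.
session_state() {
    local name="$1"
    local clients
    # One line per session: "<name> <number of attached clients>".
    # If no tmux server is running, the listing is empty.
    clients=$(tmux list-sessions -F '#{session_name} #{session_attached}' 2>/dev/null |
        awk -v n="$name" '$1 == n { print $2 }')
    case "$clients" in
        "") echo "gone" ;;
        0)  echo "detached" ;;
        *)  echo "attached" ;;
    esac
}
```

For example, session_state jbm_node prints gone once the Slurm allocation behind it has expired and the session has closed.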
❌ My livenode works but the Slurm allocation goes pending forever
The partition or resources you asked for aren’t available right now. Either:
- Wait (squeue -u $USER will show your pending allocation).
- Cancel and adjust the livenode function to ask for less (smaller --mem, fewer --cpus-per-task).
- Check if a different partition is faster to allocate.
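To see why an allocation is waiting, squeue can print the scheduler's reason next to each pending job. A minimal sketch using standard squeue options; pending_jobs is a hypothetical helper name.

```shell
# pending_jobs: list your pending Slurm jobs with the scheduler's reason.
pending_jobs() {
    # %i job ID, %P partition, %T state, %r reason the job is waiting.
    squeue -u "${USER:-$(id -un)}" -t PENDING -o '%i %P %T %r'
}
```

A reason such as Resources or Priority suggests the request is valid and simply queued; other reasons may point at a limit you have hit.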
Time estimate
- Reading: ~25 min
- Tasks: ~60 min (including waiting for sinteractive)
- Deliverables: ~10 min
Total: ~1.5 hours
Extensions (optional)
Add livegpu() and livecpu() variants
Different work needs different resources. Add to ~/.bashrc:
livegpu() {
    local session="${1:-jbm_gpu}"
    if tmux has-session -t "$session" 2>/dev/null; then
        tmux attach -t "$session"
    else
        tmux new-session -s "$session" "sinteractive -p batch --gres=gpu:1 --cpus-per-task=4 --mem=16G --time=12:00:00; bash"
    fi
}
livecpu() {
    local session="${1:-jbm_cpu}"
    if tmux has-session -t "$session" 2>/dev/null; then
        tmux attach -t "$session"
    else
        tmux new-session -s "$session" "sinteractive -p batch --cpus-per-task=16 --mem=64G --time=12:00:00; bash"
    fi
}
Learn tmux windows and panes
Inside a tmux session:
- Ctrl+b c: create a new window
- Ctrl+b n / Ctrl+b p: next/previous window
- Ctrl+b %: split the current pane into left and right panes
- Ctrl+b ": split the current pane into top and bottom panes
These let you run htop in one pane and your code in another, all inside one persistent session.
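The same layout can be built from the command line instead of with key bindings. A minimal sketch, assuming htop is installed on the node; monitor_layout and the default session name monitor are hypothetical.

```shell
# monitor_layout: create a detached session with two side-by-side panes
# and start htop in the newly created (right-hand) pane.
monitor_layout() {
    local session="${1:-monitor}"
    tmux new-session -d -s "$session" &&
    tmux split-window -h -t "$session" &&
    tmux send-keys -t "$session" 'htop' C-m
}
```

After running monitor_layout, attach with tmux attach -t monitor; the left pane is a free shell for your code.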
Configure tmux scrollback length
Default tmux scrollback is 2000 lines. For long-running training runs you’ll want more. In ~/.tmux.conf:
set -g history-limit 50000
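To add settings like this one without duplicating lines each time you re-run your setup, an append-if-missing helper is enough. A minimal sketch; tmux_conf_add is a hypothetical helper name.

```shell
# tmux_conf_add: append a line to a tmux config file unless it is already there.
tmux_conf_add() {
    local line="$1"
    local conf="${2:-$HOME/.tmux.conf}"
    touch "$conf"
    # -x matches the whole line, -F treats the setting as a literal string.
    grep -qxF -- "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}
```

For example, tmux_conf_add 'set -g history-limit 50000' followed by tmux source-file ~/.tmux.conf. As far as I know, a new history-limit applies only to windows created afterwards, so existing panes keep their old scrollback size.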
What’s next?
With persistent sessions in hand, Lab 05 — Mamba and your first project env sets up the Python environment you’ll use for the rest of the course.