Installation¶
TL;DR
# If running remotely over ssh,
# setup port forwarding
ssh -L 8765:localhost:8765 \
-L 8137:localhost:8137 \
-L 6006:localhost:6006 \
-L 8000:localhost:8000 \
user@dev-host
# Install with Docker
git clone https://github.com/jdinalt/forgather.git
cd forgather
docker/build # per-user dev image, bakes your host UID/GID in
docker/run # interactive shell, --gpus all, ports forwarded
# Inside the container:
# Start the webui...
forgather server
# control-click on `http://localhost:8765/?token=4c4febdc07830cdd...` to connect with your browser
# ...or use the CLI
forgather --help
cd examples/tutorials/tiny_llama
forgather -t v2.yaml train
Two paths: install on the host directly (Python venv via pip or
uv), or run inside the bundled Docker development image. Pick
whichever fits your machine.
Want to skip the host setup? Forgather ships a development Dockerfile that provisions Python 3.12, PyTorch with CUDA wheels, all dependencies, and a developer-friendly base toolchain in a reproducible image. Jump to Installing with Docker below.
After installing, head back to Getting Started for the first-training-run walkthrough, CLI reference, and the Forgather server tour.
Prerequisites¶
- A Linux system (tested on Ubuntu 24.04)
- Python 3.12 or newer. Forgather uses Python 3.12 language features. Newer versions will likely work but are untested; older versions will not. Python 3.12 is the default on Ubuntu 24.04. On older Debian-based distributions you can install it from the deadsnakes PPA:
- An NVIDIA GPU with CUDA support is strongly recommended but not required. CPU-only training works: the Tiny Llama tutorial below has been run end-to-end on a Chromebook, taking most of a day for the same workload that finishes in ~2 minutes on an RTX 4090. Budget accordingly. Non-CUDA accelerators (Intel, AMD, Apple Silicon) may work, since Forgather deliberately avoids hard CUDA dependencies where possible, but only CUDA and CPU have been tested, so treat them as experimental.
- A C compiler and Python development headers (required by Triton / flex-attention):
- git (used to clone the repo and to fetch the cut-cross-entropy source install below). On most distributions it's installed by default, but minimal Docker base images (e.g. plain ubuntu:24.04) don't ship it:
- Graphviz (optional). Only used by the CLI's forgather trefs --format svg, which shells out to dot to render template-dependency graphs as SVG. The Forgather server's in-browser graph view bundles a WebAssembly build of Graphviz (@viz-js/viz) and works without the system package.
- Node.js + npm (optional, only for the Forgather server's web UI). The forgather server command serves a Vite/React SPA built from tools/forgather_server/webui/. The build artifact isn't checked in, so you build it once after install via ./build-webui.sh at the repo root; see Running the Forgather server. Any current LTS Node release works (tested on Node 20). None of this is needed if you only use the CLI; the running server itself has no Node dependency once the dist bundle exists. On a checkout shared between hosts of different platforms (e.g. an NFS share spanning x86_64 and aarch64), always invoke ./build-webui.sh: node_modules/ is platform-specific and the script keeps each platform's install in its own sibling directory.
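The prerequisites above can be checked before you install anything. The following is a minimal sketch, not part of Forgather: the helper name have_tool is hypothetical, and the binary names (python3.12, cc, git, dot, node, npm) are assumptions about a typical Linux setup. It only tests whether each command is on PATH; it does not check for the Python development headers, and a newer interpreter installed under a different name will show up as missing.

```shell
#!/bin/sh
# Hypothetical helper (not a Forgather command): succeeds if a command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Required tools: Python 3.12, a C compiler, and git.
for tool in python3.12 cc git; do
    if have_tool "$tool"; then
        echo "ok       $tool"
    else
        echo "MISSING  $tool (required)"
    fi
done

# Optional tools: dot (Graphviz) for SVG output, node/npm for the web UI build.
for tool in dot node npm; do
    if have_tool "$tool"; then
        echo "ok       $tool"
    else
        echo "absent   $tool (optional)"
    fi
done
```

The script only reports; it never exits non-zero, so it is safe to paste into an interactive shell.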
Host installation (pip / uv)¶
Clone the repository, then install in a virtual environment.
Using venv:
git clone https://github.com/jdinalt/forgather.git
cd forgather
# Use python3.12 explicitly if your system default is older.
python3.12 -m venv ~/venvs/forgather
source ~/venvs/forgather/bin/activate
pip install -e .
Using uv:
git clone https://github.com/jdinalt/forgather.git
cd forgather
uv venv --python 3.12 ~/venvs/forgather
source ~/venvs/forgather/bin/activate
uv pip install -e .
The install pulls in PyTorch, transformers, the FastAPI server deps, mkdocs, and a few other large packages — expect ~2–3 GB of downloads on a fresh machine. On a slow network the first install can take several minutes; if pip looks stuck it's almost certainly still downloading.
Recommended: install cut-cross-entropy from source:
The pip-installable version of cut-cross-entropy (25.1.1) is missing features
needed for numerical stability during bf16/fp16 training (accum_e_fp32,
accum_c_fp32). Forgather will fall back gracefully, but training may exhibit
lm_head spectral norm explosion over long runs. Install the latest version from
source:
Heads-up: TensorBoard + setuptools 82 incompatibility. TensorBoard
≤ 2.20.0 (the latest release as of writing) imports pkg_resources
at module load, but setuptools 82 (Feb 2026) removed pkg_resources
entirely. If your environment ends up with setuptools ≥ 82 you'll
hit ModuleNotFoundError: No module named 'pkg_resources' the first
time you run tensorboard or forgather tb. The fix is on
TensorBoard master (PR #7057,
March 2026) but not in any release yet. Two workarounds:
# Option 1 — pin setuptools below 82 (most common):
pip install "setuptools<82"
# Option 2 — backport the upstream fix in-place against your installed
# tensorboard. The Docker image takes this path; the patch script is
# idempotent, fails loudly if the pre-patch text has moved, and is
# safe to remove once tensorboard ships a fixed release. From the
# Forgather repo:
python docker/patches/fix_tensorboard_pkg_resources.py
Drop either workaround once Forgather pins a TensorBoard release that contains the upstream fix.
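To find out whether your environment is affected before TensorBoard fails, you can compare the installed setuptools version against 82. This is a rough sketch under stated assumptions: the function name setuptools_lacks_pkg_resources is hypothetical, it looks only at the major version, and it assumes the python on PATH belongs to the environment you installed Forgather into.

```shell
# Hypothetical helper: succeeds when a setuptools version string is 82 or
# newer, i.e. when pkg_resources is no longer shipped.
setuptools_lacks_pkg_resources() {
    major="${1%%.*}"    # keep everything before the first dot
    [ "$major" -ge 82 ]
}

# Query the active environment, if a python interpreter is available.
if command -v python >/dev/null 2>&1; then
    version="$(python -c 'import setuptools; print(setuptools.__version__)' 2>/dev/null)" || version=""
    if [ -z "$version" ]; then
        echo "setuptools is not importable in this environment"
    elif setuptools_lacks_pkg_resources "$version"; then
        echo "setuptools $version: apply one of the workarounds above"
    else
        echo "setuptools $version: pkg_resources is still available"
    fi
fi
```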
Verify the installation:
This recursively lists all Forgather projects and configurations found under the current directory. You should see output listing the bundled example projects.
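The command for this step is not reproduced above. A plausible sketch, assuming it is the same recursive listing that appears in the container examples later on this page (forgather ls -r), wrapped in a hypothetical verify_forgather function so a missing CLI produces a clear message instead of a shell error:

```shell
# Assumption: `forgather ls -r` is the verification command; it appears
# elsewhere on this page as a one-shot listing command.
verify_forgather() {
    if command -v forgather >/dev/null 2>&1; then
        # Run from the repository root so the bundled examples are found.
        forgather ls -r
    else
        echo "forgather not found on PATH; is the virtual environment activated?"
        return 1
    fi
}

verify_forgather || echo "installation check failed"
```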
Installing with Docker¶
Looking for the full reference? See Docker images for the comprehensive guide: every CLI flag and env var on the build.sh/run.sh helpers, the runtime (distributable) image for clusters, multi-node setup, persistent overrides, and troubleshooting. The section below is the install quick-start; the reference page is where to go to customize things or understand how it works.
The repo ships a Dockerfile (and matching helpers in docker/)
that builds an Ubuntu 24.04 image with the full Forgather environment
pre-provisioned: Python 3.12, PyTorch (CUDA wheels), all
dependencies, cut-cross-entropy from source, and a developer
toolchain (vim, tmux, ripgrep, jq, htop, ssh, sudo, ...). It's
useful in two ways:
- As a development environment — one command and you have a working Forgather install without touching your host Python.
- As a clean sandbox for release testing: build the image with --no-cache and you get a reproducible from-scratch verification that the source tree builds and runs end-to-end.
There's also a separate runtime image (Dockerfile.runtime)
intended for distribution to a multi-node cluster — generic, no
host-clone dependency, builds the SPA inside the image. The
Docker images reference covers both.
Prerequisites¶
- Docker Engine 24+ (or Docker Desktop on macOS/Windows).
- For GPU training: an NVIDIA GPU with current drivers on the host and the NVIDIA Container Toolkit installed (nvidia-ctk runtime configure --runtime=docker and a systemctl restart docker). PyTorch wheels bundle their own CUDA runtime, so you don't need a CUDA SDK on the host, just the driver and the container toolkit.
Build the image¶
docker/build builds a per-user dev image: it reads your
id -u / id -g / id -un and passes them as build args, baking
your host identity into the in-container user. Files created inside
the container on bind-mounted host paths land with correct ownership
without any runtime remap — the in-container user simply IS you.
The default image tag is forgather-dev:<your-host-username> so
multiple operators on a shared host get separate images. (For the
build-once-deploy-everywhere, user-agnostic story, see the
runtime image.)
The first build pulls ~3 GB of dependencies and takes a few minutes;
rebuilds reuse the layer cache. After the docker build, build.sh
runs ./build-webui.sh in a transient container against the host
clone so the Forgather server's SPA dist/ is ready before
docker/run is invoked. Skip the post-step with
SKIP_WEBUI_BUILD=1 docker/build (e.g. if you plan to iterate on the
SPA via npm run dev).
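If you want to see what the build helper will pick up before running it, the identity values it is described as reading can be printed directly. This is only an illustration of the inputs and the default tag format stated above; the function name default_image_tag is hypothetical, and the build-arg names the helper actually passes are not shown here.

```shell
# Hypothetical helper: the default tag format described in the text,
# forgather-dev:<your-host-username>.
default_image_tag() {
    echo "forgather-dev:$(id -un)"
}

# The three values docker/build is described as reading from the host.
echo "UID=$(id -u) GID=$(id -g) user=$(id -un)"
echo "expected image tag: $(default_image_tag)"
```

After a build, docker image ls forgather-dev should list one tag per user who has built on that host.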
Run it¶
This drops you into an interactive bash shell with:
- The Forgather venv (at /opt/forgather/venv) on PATH.
- --gpus all (override with GPUS=none for CPU only or GPUS='"device=0,1"' for a subset).
- Your host home directory bind-mounted at the same path inside the container, so absolute paths in shell history, configs, and notebooks keep resolving correctly.
- The host's network stack (--network host) so services bound to 127.0.0.1 inside the container are reachable on the host's loopback as-is.
The container's entrypoint detects the bind-mounted Forgather checkout and re-links the editable install to it on entry, so your host-side edits are picked up live without a rebuild.
Container lifecycle¶
The container is long-lived: the first docker/run invocation
creates a detached container named forgather-dev-${USER} with
sleep infinity as PID 1; subsequent invocations re-attach via
docker exec. Logging out of an interactive shell does not
stop the container, so a forgather server (or any training job)
you started in one session keeps running, and you can re-attach
from a new terminal to inspect or control it.
docker/run # attach (creating the container if needed)
docker/run forgather ls -r # one-shot command in the same container
docker/run --status # is the container running, stopped, or absent?
docker/run --stop # stop (but keep) — preserves filesystem state
docker/run --rm # stop and remove (next run.sh recreates fresh)
docker/run --recreate # rebuild from scratch (e.g. after image rebuild)
IMAGE, GPUS, NETWORK, port and mount overrides only apply
when the container is created. After docker/build
rebuilds the image, run docker/run --recreate to roll the
running container forward to the new image.
If you'd rather drive docker directly:
NAME=forgather-dev-$USER
docker ps -a --filter name=${NAME} # see the container, running or not
docker logs ${NAME} # entrypoint output (install re-link warnings)
docker stop ${NAME} # stop
docker start ${NAME} # start an existing stopped container
docker rm -f ${NAME} # stop and remove
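For scripting, the running / stopped / absent distinction can also be read straight from Docker. The sketch below is an approximation of that check, not the helper's own implementation: container_state is a hypothetical function name, it reports Docker's raw state string (for example running or exited), and it treats a missing Docker install the same as a missing container.

```shell
# Hypothetical helper: print the Docker state of a container, or "absent"
# if the container (or the docker CLI itself) is not there.
container_state() {
    name="$1"
    state="$(docker container inspect -f '{{.State.Status}}' "$name" 2>/dev/null)" || state="absent"
    printf '%s\n' "${state:-absent}"
}

# Container name format from the text: forgather-dev-${USER}
container_state "forgather-dev-$(id -un)"
```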
After pulling repo changes, most updates are picked up live —
the source tree is bind-mounted from your host clone. If
pyproject.toml changed (new deps, version bumps), refresh the
venv from inside the running container — no rebuild needed:
# Inside the container:
uv pip install -e "$FORGATHER_REPO"
cd "$FORGATHER_REPO" && ./build-webui.sh # only if the SPA changed
Force-rebuilding the image is only needed when the Dockerfile
itself changed (new system packages, Python minor-version bump):
See docker.md → Upgrading Forgather inside the container for the full reference.
Networking¶
docker/run defaults to --network host, so the container
shares the host's network stack. Every service inside the
container is reachable on its bound port without -p mappings,
and tools that default to 127.0.0.1 (Forgather server, MkDocs,
TensorBoard, inference) Just Work — open
http://localhost:8765/ from the host browser as if Forgather
were running on bare metal.
If you'd rather use bridge networking with explicit port-forwards
(slightly more isolated, but every service then has to bind
0.0.0.0 inside the container to be reachable through the
forward), set NETWORK=bridge:
NETWORK=bridge docker/run
# Inside the container:
forgather server -H 0.0.0.0
mkdocs serve --host 0.0.0.0
tensorboard --bind_all
The bridge mode forwards the host side to 127.0.0.1 only by
default (same exposure as the host-networking case). For LAN
access from another machine, set HOST_BIND=0.0.0.0 alongside
NETWORK=bridge.
Binding outside loopback? The server refuses to bind a non-loopback host (0.0.0.0, LAN IP, public IP) without TLS unless you pass --insecure. Provision HTTPS with forgather tls init first; see TLS for the single-host setup and the Docker runtime image's TLS_INIT=1 convenience flag.
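The loopback rule can be pictured with a small classifier. This is an illustration of the rule as described, not the server's actual check: the function name is_loopback_host is hypothetical, and the patterns it treats as loopback (localhost, 127.x.x.x, ::1) are an assumption about what the server accepts.

```shell
# Illustration only: classify a bind address the way the text describes.
# This is not the server's actual implementation.
is_loopback_host() {
    case "$1" in
        localhost|127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

for host in 127.0.0.1 localhost 0.0.0.0 192.168.1.20; do
    if is_loopback_host "$host"; then
        echo "$host: loopback, no TLS required"
    else
        echo "$host: non-loopback, needs TLS or --insecure"
    fi
done
```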
Common overrides¶
# CPU-only:
GPUS=none docker/run
# Specific GPUs:
GPUS='"device=0,1"' docker/run
# Mount additional host paths (e.g. scratch / dataset volumes):
EXTRA_MOUNTS="-v /scratch:/scratch" docker/run
# Forward extra ports (Vite dev server, etc.):
EXTRA_PORTS="-p 5173:5173" docker/run
# Build / run a tagged variant:
docker/build forgather-dev:experiment
IMAGE=forgather-dev:experiment docker/run
For more detail — full CLI / env-var reference, the runtime (distributable) image, multi-node setup, and the release-testing workflow against a freshly cloned tree — see the Docker images reference.
Next: your first training run¶
With Forgather installed, head to Getting Started → Your first training run to train a tiny Llama on TinyStories.