run-on-slurm

Run Megatron-LM on SLURM

Prerequisites

  • A SLURM cluster login with submission rights to a GPU partition.
  • Megatron-LM checked out on a filesystem visible to all nodes in the allocation (NFS, Lustre, or similar). All nodes must reach the same paths for code, data, checkpoints, and output.
  • uv installed. Run uv sync --extra training --extra dev (or --extra lts) in the worktree once before submission, so that the .venv is materialized and visible to every node.
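
The prerequisites above can be checked from the login node before submitting. The following is a minimal sketch, not part of Megatron-LM: the helper names (have_cmd, have_venv, preflight) are hypothetical. It only confirms that sbatch and uv are on PATH and that a .venv directory exists in the worktree. It does not verify that compute nodes see the same paths.

```shell
#!/bin/bash
# Preflight sketch for the prerequisites listed above.
# Helper names are illustrative assumptions, not Megatron-LM or SLURM APIs.

# Succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the given worktree contains a materialized .venv directory.
have_venv() {
    [ -d "$1/.venv" ]
}

# Prints one line per missing prerequisite; returns non-zero if any is missing.
preflight() {
    local worktree="$1" status=0
    have_cmd sbatch || { echo "missing: sbatch (is this a SLURM login node?)"; status=1; }
    have_cmd uv || { echo "missing: uv"; status=1; }
    have_venv "$worktree" || { echo "missing: $worktree/.venv (run uv sync first)"; status=1; }
    return "$status"
}
```

One way to use it, assuming the functions are sourced into the current shell and the working directory is the worktree: preflight "$PWD" && sbatch run_megatron.slurm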

Minimal sbatch script

Save as run_megatron.slurm in the worktree:

#!/bin/bash
#SBATCH --job-name=megatron
#SBATCH --account=<SLURM_ACCOUNT>
#SBATCH --partition=<SLURM_PARTITION>
#SBATCH --nodes=<NODES>
#SBATCH --ntasks-per-node=1