slurm

Slurm Cluster Management

Help developers submit, manage, and troubleshoot GPU-accelerated workloads on SRP's Slurm clusters. Supports training, inference, and data processing jobs using Apptainer containers.

When to Use This Skill

Use this skill when:

  • Submitting GPU training or inference jobs to Slurm clusters
  • Managing running or queued jobs
  • Monitoring cluster resources and job status
  • Debugging job failures or performance issues
  • Writing Slurm job scripts with Apptainer containers
  • Checking GPU availability and utilization
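As a rough illustration of the kind of job script this skill helps write, the sketch below creates a minimal batch script that runs a containerized training command on one GPU, then submits it only if Slurm is present. The file names (`train_job.sh`, `my_image.sif`, `train.py`), the job name, the time limit, and the use of `--gres=gpu:1` are all assumptions made for the example. They are not taken from SRP's cluster configuration, which may require a partition, account, or a different GPU request syntax.

```shell
#!/bin/bash
# Write a minimal Slurm job script. Every name below is hypothetical.
cat > train_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=train-demo      # hypothetical job name
#SBATCH --gres=gpu:1               # request one GPU (assumed syntax for this cluster)
#SBATCH --time=01:00:00            # wall-clock limit, chosen arbitrarily
#SBATCH --output=%x-%j.out         # log file named <job-name>-<job-id>.out

# --nv exposes the host's NVIDIA driver and GPUs inside the container.
apptainer exec --nv my_image.sif python train.py
EOF

# Submit and list your jobs only where Slurm is actually installed.
if command -v sbatch >/dev/null 2>&1; then
  sbatch train_job.sh
  squeue -u "$USER"
fi
```

Keeping the resource requests in `#SBATCH` lines rather than on the `sbatch` command line keeps the job reproducible from the script alone. Adjust the requests to whatever the target cluster expects.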

SRP Slurm Clusters

Oracle OKE Cluster (H100 GPUs)

SSH Access:

First Seen
Jan 22, 2026