Vram-GPU-OOM

Installation
SKILL.md

GPU OOM Retry Pattern

Simple pattern for sharing GPU memory across multiple services without coordination.

Strategy

  1. All services try to load models normally
  2. Catch OOM errors
  3. Wait 30-60 seconds (for other services to auto-unload)
  4. Retry up to 3 times
  5. Configure all services to unload quickly when idle

Python (PyTorch / Transformers)

import torch
import time
Related skills
Installs
11
GitHub Stars
6
First Seen
Jan 26, 2026