perf-torch-cuda-graphs


CUDA Graphs for PyTorch

CUDA Graphs capture a sequence of GPU operations once and replay them with minimal CPU overhead. This skill guides applying CUDA Graphs to PyTorch training and inference workloads using native PyTorch APIs, Transformer Engine, and Megatron-LM.
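As a rough illustration of the capture-once, replay-many pattern described above, the sketch below uses the native PyTorch `torch.cuda.CUDAGraph` and `torch.cuda.graph` APIs. The helper name `capture_and_replay`, the toy `Linear` model, and the tensor shapes are assumptions made for illustration and do not come from this skill. The function returns `None` when no CUDA device (or no PyTorch install) is available.

```python
try:
    import torch
except ImportError:  # lets the sketch import cleanly on machines without PyTorch
    torch = None


def capture_and_replay(num_replays: int = 3):
    """Capture one forward pass into a CUDA graph, then replay it.

    Hypothetical helper for illustration. Returns a list of CPU tensors,
    or None when PyTorch or a CUDA device is unavailable.
    """
    if torch is None or not torch.cuda.is_available():
        return None

    # Toy model and static input buffer (assumed shapes, for illustration only).
    model = torch.nn.Linear(16, 4).cuda().eval()
    static_input = torch.zeros(8, 16, device="cuda")

    # Warm up on a side stream so lazy initialization happens before capture.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(3):
            model(static_input)
    torch.cuda.current_stream().wait_stream(side)

    # Capture: the kernels launched inside this block are recorded once.
    graph = torch.cuda.CUDAGraph()
    with torch.no_grad(), torch.cuda.graph(graph):
        static_output = model(static_input)

    results = []
    for step in range(num_replays):
        # Replay reads from and writes to the same memory that was captured,
        # so new data is copied into the static input buffer in place.
        static_input.copy_(torch.full((8, 16), float(step), device="cuda"))
        graph.replay()
        results.append(static_output.clone().cpu())
    return results
```

The key design point in this sketch is that inputs and outputs live in fixed buffers: a replay reruns the recorded kernels on the same addresses, so fresh data has to be copied into `static_input` rather than passed as a new tensor.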

When to Use

Reach for this skill when you encounter:

First Seen
May 8, 2026
perf-torch-cuda-graphs — nvidia/tensorrt-llm