rocm-kernels


ROCm Triton Kernels for Diffusers & Transformers

This skill provides patterns and guidance for developing optimized Triton kernels for AMD GPUs (MI355X, R9700) on ROCm, for use with the Hugging Face diffusers (LTX-Video, SD3, FLUX) and transformers libraries.

Quick Start

Diffusers (LTX-Video)

Inject optimized kernels into the LTX-Video pipeline:

import os
os.environ['TRITON_HIP_USE_BLOCK_PINGPONG'] = '1'
os.environ['TRITON_HIP_USE_ASYNC_COPY'] = '1'
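
The two assignments above can be wrapped in a small helper so the flags are applied in one place at the top of a script. This is a minimal sketch, not part of the skill itself: the helper name `enable_rocm_triton_flags` and the `overwrite` parameter are hypothetical, and calling it before importing Triton or the pipeline is an assumption made here as a conservative default. Only the two flag names and their values come from the snippet above.

```python
import os

# Flag names and values are copied from the snippet above.
# The dictionary and helper below are hypothetical, for illustration only.
ROCM_TRITON_FLAGS = {
    "TRITON_HIP_USE_BLOCK_PINGPONG": "1",
    "TRITON_HIP_USE_ASYNC_COPY": "1",
}


def enable_rocm_triton_flags(overwrite=False):
    """Set the flags in os.environ and return the values now in effect.

    With overwrite=False, a value already exported in the shell is kept,
    so a user can still turn a flag off from outside the script.
    """
    for name, value in ROCM_TRITON_FLAGS.items():
        if overwrite or name not in os.environ:
            os.environ[name] = value
    return {name: os.environ[name] for name in ROCM_TRITON_FLAGS}


# Assumption: call this before importing Triton or building the pipeline.
applied = enable_rocm_triton_flags()
print(applied)
```

Keeping `overwrite=False` as the default means the script does not silently override a setting chosen in the shell, which helps when comparing runs with a flag on and off.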
rocm-kernels — huggingface/kernels