helion-jagged-and-autotuning


Helion: Jagged Tensors & Autotuning/Config Management

Overview

Two high-value Helion areas that are easy to get wrong or miss entirely:

  1. Ragged/jagged tensors: hl.jagged_tile() iterates variable-length inner dimensions with implicit masking, so you never hand-build masks.
  2. Autotuning & config management: autotuning is slow, so Helion has a layered system (on-disk cache → saved configs → AOT heuristics) for tuning once and reusing results keyed by GPU architecture and input shape.
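To make the first point concrete, here is a plain-Python sketch of the bounds mask a hand-written kernel has to build when it walks variable-length rows in fixed-size tiles. It deliberately uses no Helion API: the function name `jagged_row_sums`, the values-plus-offsets layout, and the tile size are illustrative assumptions, and the `in_bounds` check stands in for the masking that the overview says hl.jagged_tile() applies implicitly.

```python
from typing import List


def jagged_row_sums(values: List[float], offsets: List[int], tile: int = 4) -> List[float]:
    """Sum each variable-length row of a jagged tensor stored as values + offsets.

    Row i occupies values[offsets[i]:offsets[i + 1]]. This is a hypothetical
    helper for illustration only, not part of Helion.
    """
    sums = []
    for row in range(len(offsets) - 1):
        start, end = offsets[row], offsets[row + 1]
        total = 0.0
        # Walk the row in fixed-size tiles; the last tile may run past `end`.
        for tile_start in range(start, end, tile):
            for lane in range(tile):
                idx = tile_start + lane
                in_bounds = idx < end  # the mask a hand-written kernel must build
                total += values[idx] if in_bounds else 0.0
        sums.append(total)
    return sums


# Three rows of lengths 1, 0, and 5 packed into one flat buffer.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
offsets = [0, 1, 1, 6]
row_sums = jagged_row_sums(values, offsets)
```

The last row has five elements, so its second tile covers only one valid lane; the other three lanes are masked out instead of reading the next row or running off the buffer.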

The single most-missed feature: AOT heuristics (helion.experimental.aot_kernel + python -m helion.experimental.aot_runner) give zero-cost per-shape config selection at runtime, with automatic compute-capability fallback. If a request mentions "many GPUs," "many shapes," or "don't want to re-tune at deploy," reach for AOT, not just the cache.
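The layered idea can be sketched in plain Python, assuming the arrows in "on-disk cache → saved configs → AOT heuristics" describe the order in which layers are consulted. Nothing here is Helion's API: `select_config`, `toy_heuristic`, the `(architecture, shape)` key, and the `block_size` field are hypothetical names chosen for illustration, and the heuristic rule is made up.

```python
from typing import Callable, Dict, Tuple

Shape = Tuple[int, ...]
Key = Tuple[str, Shape]   # (GPU architecture, input shape)
Config = Dict[str, int]   # hypothetical config, e.g. {"block_size": 64}


def select_config(
    arch: str,
    shape: Shape,
    disk_cache: Dict[Key, Config],
    saved_configs: Dict[Key, Config],
    heuristic: Callable[[str, Shape], Config],
) -> Tuple[Config, str]:
    """Return a config plus the name of the layer that supplied it."""
    key = (arch, shape)
    if key in disk_cache:        # layer 1: result of an earlier tuning run
        return disk_cache[key], "cache"
    if key in saved_configs:     # layer 2: configs saved alongside the code
        return saved_configs[key], "saved"
    # layer 3: no tuning at call time, just a cheap rule evaluated per shape
    return heuristic(arch, shape), "heuristic"


def toy_heuristic(arch: str, shape: Shape) -> Config:
    # Made-up rule for illustration: wider inner dimensions get larger blocks.
    return {"block_size": 128 if shape[-1] >= 1024 else 32}


cache = {("sm90", (8, 2048)): {"block_size": 256}}
saved = {("sm80", (8, 512)): {"block_size": 64}}
from_cache = select_config("sm90", (8, 2048), cache, saved, toy_heuristic)
from_saved = select_config("sm80", (8, 512), cache, saved, toy_heuristic)
from_rule = select_config("sm80", (8, 4096), cache, saved, toy_heuristic)
```

The point of the final layer is that an unseen shape still gets a config without triggering a tuning run, which is why it suits deployments with many shapes or many GPUs.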

helion-jagged-and-autotuning — d-laub/dlaub-togo