Inference Optimization Skill

Making AI inference faster and cheaper.

Performance Metrics

from dataclasses import dataclass

@dataclass
class InferenceMetrics:
    ttft: float        # Time to first token (seconds)
    tpot: float        # Time per output token (seconds per token)
    throughput: float  # Tokens per second
    latency: float     # Total request time (seconds)
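
As a rough sketch of how these four fields might be filled in, the snippet below times a token stream and derives each metric from the timestamps. The helper name `measure_stream` and its injectable `clock` parameter are hypothetical and not part of the skill. The dataclass is repeated so the block runs on its own, and TPOT is assumed here to mean the average gap between tokens after the first one.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class InferenceMetrics:
    ttft: float        # seconds until the first token arrived
    tpot: float        # average seconds per token after the first
    throughput: float  # tokens per second over the whole request
    latency: float     # total seconds from request start to last token


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> InferenceMetrics:
    """Consume a token stream and compute timing metrics (hypothetical helper)."""
    start = clock()
    first_token_at = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # timestamp of the first token
        count += 1
    end = clock()

    if count == 0:
        raise ValueError("stream produced no tokens")

    latency = end - start
    ttft = first_token_at - start
    # Decode time is everything after the first token, spread over the rest.
    tpot = (latency - ttft) / (count - 1) if count > 1 else 0.0
    throughput = count / latency if latency > 0 else float("inf")
    return InferenceMetrics(ttft=ttft, tpot=tpot, throughput=throughput, latency=latency)
```

Passing any iterator that yields tokens as they arrive (for example, a streaming client's response iterator) should work. The `clock` argument exists only so the timing can be replaced with a deterministic one in tests.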

Model Optimization

Quantization
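
The listing names quantization without further detail. As a generic illustration of the idea, and not necessarily the scheme this skill uses, the sketch below applies symmetric per-tensor int8 quantization to a list of weights. The function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # The scale maps the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]
```

Each weight is stored in one byte instead of four (for float32), at the cost of a rounding error of at most half the scale per value. Production libraries typically use per-channel or per-group scales rather than a single per-tensor scale.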

inference-optimization — doanchienthangdev/omgkit