Inference Optimization Skill
Making AI inference faster and cheaper.
Performance Metrics
from dataclasses import dataclass

@dataclass
class InferenceMetrics:
    ttft: float        # Time to First Token (seconds)
    tpot: float        # Time Per Output Token (seconds)
    throughput: float  # Output tokens per second
    latency: float     # Total request time (seconds)
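
As a rough illustration of how these four fields relate, the sketch below derives them from per-token arrival timestamps. The helper name metrics_from_timestamps and its arguments are hypothetical and not part of this skill. It also assumes that TPOT is averaged over the decode phase only (excluding the first token) and that throughput counts output tokens over the whole request; other definitions are in use.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class InferenceMetrics:
    ttft: float        # Time to First Token (seconds)
    tpot: float        # Time Per Output Token (seconds)
    throughput: float  # Output tokens per second
    latency: float     # Total request time (seconds)


def metrics_from_timestamps(
    request_start: float, token_times: Sequence[float]
) -> InferenceMetrics:
    """Hypothetical helper: derive metrics from each output token's arrival time."""
    if not token_times:
        raise ValueError("need at least one output token")
    n = len(token_times)
    ttft = token_times[0] - request_start
    latency = token_times[-1] - request_start
    # Assumption: TPOT covers the decode phase only, so the first token is excluded.
    tpot = (latency - ttft) / (n - 1) if n > 1 else 0.0
    # Assumption: throughput is output tokens divided by total request time.
    throughput = n / latency if latency > 0 else 0.0
    return InferenceMetrics(ttft=ttft, tpot=tpot, throughput=throughput, latency=latency)


# Example: request sent at t=0.0, four tokens arriving at the times listed.
m = metrics_from_timestamps(0.0, [0.5, 0.75, 1.0, 1.25])
print(m)
```

In practice the timestamps would come from a monotonic clock such as time.perf_counter() sampled as each token is streamed back.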