ai-llm
LLM Development & Engineering — Complete Reference
Build, evaluate, and deploy LLM systems with modern production standards.
This skill covers the full LLM lifecycle:
- Development: Strategy selection, dataset design, instruction tuning, PEFT/LoRA fine-tuning
- Evaluation: Automated testing, LLM-as-judge, metrics, rollout gates
- Deployment: Serving handoff, latency/cost budgeting, reliability patterns (see ai-llm-inference)
- Operations: Quality monitoring, change management, incident response (see ai-mlops)
- Safety: Threat modeling, data governance, layered mitigations (NIST AI RMF: https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf)
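As an illustration of the "rollout gates" item in the evaluation stage, the sketch below compares a candidate model's pass rate on a fixed eval set against the current baseline. The names `EvalResult`, `pass_rate`, and `rollout_gate`, and both threshold defaults, are hypothetical choices for this example and are not defined by this skill; real gates would likely track several metrics.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EvalResult:
    """Outcome of one eval case (hypothetical structure for this sketch)."""
    case_id: str
    passed: bool


def pass_rate(results: Sequence[EvalResult]) -> float:
    """Fraction of eval cases that passed."""
    if not results:
        raise ValueError("no eval results")
    return sum(r.passed for r in results) / len(results)


def rollout_gate(
    baseline: Sequence[EvalResult],
    candidate: Sequence[EvalResult],
    min_pass_rate: float = 0.90,   # assumed absolute floor, tune per product
    max_regression: float = 0.02,  # assumed allowed drop versus baseline
) -> bool:
    """Allow rollout only if the candidate clears the floor and does not
    regress against the baseline by more than max_regression."""
    base = pass_rate(baseline)
    cand = pass_rate(candidate)
    return cand >= min_pass_rate and (base - cand) <= max_regression
```

Running the same eval set against both models keeps the comparison repeatable; a gate that returns `False` blocks the staged rollout rather than merely raising a warning.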
Modern Best Practices (2026):
- Treat the model as a component with contracts, budgets, and rollback plans (not "magic").
- Separate core concepts (tokenization, context, training vs adaptation) from implementation choices (providers, SDKs).
- Gate upgrades with repeatable evals and staged rollout; avoid blind model swaps.
- Cost-aware engineering: Measure cost per successful outcome, not just cost per token; design tiering/caching early.
- Security-by-design: Threat model prompt injection, data leakage, and tool abuse; treat guardrails as production code.
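To make the cost-aware bullet above concrete, the sketch below contrasts cost per token with cost per successful outcome. The function names and all figures in the usage note are made up for illustration; what counts as a "successful outcome" is assumed to be defined by your own evals.

```python
def cost_per_token(total_cost_usd: float, total_tokens: int) -> float:
    """Average spend per token processed."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / total_tokens


def cost_per_success(total_cost_usd: float, successes: int) -> float:
    """Average spend per request that met the success criterion.

    Failed requests still cost money, so their spend is included in
    total_cost_usd while only successes appear in the denominator.
    """
    if successes <= 0:
        raise ValueError("no successful outcomes to amortize cost over")
    return total_cost_usd / successes
```

With hypothetical numbers: if a cheaper model spends $2.00 on 1,000 requests and succeeds on 500, it costs $0.004 per success; a pricier model that spends $3.00 on the same 1,000 requests but succeeds on 900 costs about $0.0033 per success. The model that looks more expensive per token can therefore be cheaper per outcome.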