agent-engineering
Agent Engineering
Coverage
- The discipline's relationship to and distinction from prompt engineering, harness engineering, and traditional distributed systems
- The four pillars: architecture and lifecycle management, task decomposition and context management, multi-agent coordination patterns, production reliability
- The lifecycle state machine: claim → execute → verify → commit → release, with the extended research/plan/review variant for complex workflows
- Context health states (ok / degraded / compact / exhausted) and their budget thresholds, plus the six observable signals of context rot
- Multi-agent coordination patterns: orchestrator/worker, fan-out/merge, evaluator/optimiser, consensus/fusion, sequential chain, hybrid — and the cost/reliability trade-offs of each
- The two-pass pattern (audit then fresh-context implement) for reliability-critical workflows
- The eight named coordination failure modes (task stealing, context contamination, merge conflicts, silent stall, brief rot, result injection, context bloat, double-commit) with detection and mitigation
- The six production reliability requirements: observability, cost budgets, idempotency, failure recovery, safety caps, claim locks — and what breaks when each is missing
- The delegation decision framework: six gates with overhead crossover analysis (≈1000-token minimum subagent overhead), batch crossover at four tasks for cheap-model fan-out
- The most common anti-patterns (God Agent, prompt-as-architecture, memory-persisted state, runaway loop, telephone-game briefs, ghost claim) and corrective actions
- The production readiness audit checklist and the staged-rollout verification workflow (10% → 50% → 100% budget)
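The lifecycle state machine named in the list above (claim → execute → verify → commit → release) can be sketched as a small transition table. This is a minimal illustration, not the skill's own implementation: the `Stage`, `TaskLifecycle`, and `run_task` names are hypothetical, and the rule that a failed stage jumps straight to release (so a claim is never left dangling) is an assumption rather than something the source states.

```python
from enum import Enum


class Stage(Enum):
    CLAIM = "claim"
    EXECUTE = "execute"
    VERIFY = "verify"
    COMMIT = "commit"
    RELEASE = "release"


# Happy-path order, taken from the lifecycle named in the Coverage list.
HAPPY_PATH = [Stage.CLAIM, Stage.EXECUTE, Stage.VERIFY, Stage.COMMIT, Stage.RELEASE]

# Each stage may advance to its successor on the happy path.
ALLOWED = {a: {b} for a, b in zip(HAPPY_PATH, HAPPY_PATH[1:])}
# Assumption: any in-flight stage may also bail out directly to RELEASE.
for stage in (Stage.EXECUTE, Stage.VERIFY, Stage.COMMIT):
    ALLOWED[stage].add(Stage.RELEASE)
ALLOWED[Stage.RELEASE] = set()  # terminal stage


class TaskLifecycle:
    """Tracks one task and rejects transitions outside the table."""

    def __init__(self):
        self.stage = Stage.CLAIM
        self.history = [Stage.CLAIM]

    def advance(self, to):
        if to not in ALLOWED[self.stage]:
            raise ValueError(f"illegal transition {self.stage.value} -> {to.value}")
        self.stage = to
        self.history.append(to)


def run_task(execute, verify, commit):
    """Drive one task through the lifecycle; the callables are hypothetical hooks."""
    task = TaskLifecycle()
    committed = False
    try:
        task.advance(Stage.EXECUTE)
        result = execute()
        task.advance(Stage.VERIFY)
        if verify(result):
            task.advance(Stage.COMMIT)
            commit(result)
            committed = True
    finally:
        # Always release, even if a hook raised, so the claim does not linger.
        task.advance(Stage.RELEASE)
    return task.history, committed
```

Encoding the transitions as data rather than scattered `if` checks makes illegal moves, such as committing before verifying, fail loudly instead of silently corrupting state.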
Philosophy
A single LLM prompt produces an answer. A system of LLMs produces a workflow that survives session boundaries, crashes, model variance, budget exhaustion, and adversarial input. Agent engineering is the discipline of building the second from the first.
More from jacob-balslev/skill-graph-skills
ai-native-development
Use when reasoning about agent autonomy levels, designing auto-improve loops, evaluating AI-generated code quality, or measuring agent productivity in an LLM-assisted codebase. Covers Karpathy's three eras of software (1.0 explicit / 2.0 learned / 3.0 natural-language), the vibe-coding-vs-agentic-engineering distinction, the 0–5 autonomy slider with task-type recommendations, the one-asset / one-metric / one-time-box AutoResearch loop, Software 3.0 productivity metrics, and the documented quality regressions of ungated AI-generated code (the 'vibe hangover'). Do NOT use for choosing a specific autonomy-loop topology (use `agent-engineering`), for the per-prompt authoring discipline (use `prompt-craft`), or for reviewing the AI-generated code that comes out of a Software 3.0 workflow (use `code-review`).
ideation
Use when generating a wide range of solution concepts before converging on a direction, running structured idea-generation sessions, breaking out of solution fixation, or moving from divergent to convergent selection with explicit criteria. Do NOT use for collaborative engineering domain discovery (event-storming), solo deep technical design, or making final go/no-go investment decisions — those require different methods.
frontend-architecture
Use when organizing a frontend codebase — module boundaries, component layering, state ownership, data-flow direction, and the separation between feature code and shared primitives. Do NOT use for visual design decisions, specific framework migration tactics, or backend API contract design.
color-system-design
Use when designing a color system — palette construction, semantic color tokens, WCAG contrast ratios, perceptual uniformity in OKLCH/LCH, and light/dark mode parity. Do NOT use for single brand-color picks, runtime theme-switching mechanics, or non-color design tokens.
form-ux-architecture
Use when designing or auditing form structure and validation UX: field grouping, required vs optional inputs, validation timing, client/server validation split, submission lifecycle, recovery, multi-step forms, and high-risk data entry. Do NOT use for labels and announcements alone (use `a11y`), validation-message wording (use `microcopy`), API schema design (use `api-design`), or stored data modeling (use `data-modeling`).
constraint-awareness
Use when prioritizing work in an AI-assisted codebase, designing agent autonomy levels, deciding what to automate vs keep manual, or evaluating whether a process/tool adds value. Covers Theory of Constraints for AI-era engineering: cheap code production, human review/validation/decision bottlenecks, Five Focusing Steps, constraint-aware process design, attention audits, and constraint-shift modeling. Do NOT use for task-effort estimation, backlog scoring with RICE/WSJF/ICE, or routing a task to a specific model.