skill-infrastructure
Skill Infrastructure
Coverage
- The library-as-database mental model: skill-infrastructure as the linter, integrity checker, and query planner for a SKILL.md library
- Why deterministic, zero-LLM tooling is mandatory for skill libraries (trustworthy enough for CI gates; LLM-based health checks are circular)
- The five categories of skill-health tooling: (1) inventory and frontmatter validation, (2) protocol consistency, (3) conflict detection (overlap, imperative, code duplication, heading overlap), (4) routing health (gap analysis, miss tracking), (5) drift detection (truth-source hashing, mirror parity)
- Eval quality patterns: minimum eval threshold per skill, the contradiction-check eval type, the negative-expectation requirement, valid eval-type taxonomy
- Imperative conflict detection: the same-target-opposite-polarity rule, three-check false-positive suppression, when conflicts indicate scope-tightening vs merging
- Routing gap analysis: how to read a "routing-misses" log, how to distinguish keyword gaps from skill content gaps, signal-hygiene rules to suppress noise
- Maintenance workflows: when to run a full health check, the order in which to run the categories, what to fix before what
- Anti-patterns that cause invisible decay: dirty-tree manifest writes, deletion-as-conflict-resolution, eval-renumbering during cleanup, scope misuse to mask threshold violations
- The verification gate before any batch skill commit: every category clean, every new skill meets eval minimums, every routing change is reflected in the manifest
- Package and workspace-root integrity: the npm CLI entrypoint must dispatch to the same scripts as local development while resolving schemas from the package and skills/manifests from the caller workspace
Philosophy
A skill library is only as useful as its worst skill. When agents load stale, conflicting, poorly-routed, or mirror-drifted skills, they get worse at tasks — not better. A skill library at scale (50+, certainly 200+) decays invisibly: eval counts drift below minimums, keyword maps miss whole product areas, mirror copies fall out of sync, and two skills start giving opposite instructions for the same function.
More from jacob-balslev/skills
layout-composition
Use when deciding responsive page or screen structure: section order, scan pattern, grid/flex composition, breakpoints, viewport hierarchy, responsive media, and density. Do NOT use for user-goal decomposition (use `task-analysis`), navigation taxonomy (use `information-architecture`), visual polish (use `visual-design-foundations`), or component/token contracts (use `design-system-architecture`).
8context-graph
Use when designing or auditing the multi-graph context architecture of an AI-coding workspace: skill graph, document routing graph, memory index, script registry, and the cross-graph edges between them. Covers edge typing, orphan detection, connectivity health, deterministic graph synthesis signals, change-propagation checks, and drift or hub-and-spoke anti-patterns. Do NOT use for authoring one SKILL.md (use `skill-scaffold`), validating one skill (use `graph-audit`), live routing decisions (use `skill-router`), context-window budgeting (use `context-window`), or session load/drop choices (use `context-management`).
8visual-design-foundations
Use when designing or auditing visual craft: color palette, typography, spacing, elevation, rhythm, density, visual hierarchy, brand fit, contrast intent, and motion feel. Do NOT use for sign-system meaning (use `semiotics`), token/component architecture (use `design-system-architecture`), responsive structure (use `layout-composition`), or accessibility compliance (use `a11y`).
7project-knowledge-extraction
Use when extracting durable project knowledge from code, docs, issues, incidents, reports, screenshots, or conversations into reusable context such as skills, ADRs, glossaries, context docs, or memory. Do NOT use for writing a new skill contract (use `skill-scaffold`), maintaining library tooling (use `skill-infrastructure`), or generic documentation polish (use `documentation`).
6problem-framing
Use when a team is converging on solutions before agreeing on the problem, when a brief reads as a feature request, when symptoms and root needs are tangled, or when assumptions need surfacing before design work proceeds. Do NOT use for code-level bug triage, runtime failure diagnosis, or root-cause analysis of system errors — those are engineering investigation tasks, not design problem framing.
6ai-native-development
Use when reasoning about agent autonomy levels, designing auto-improve loops, evaluating AI-generated code quality, or measuring agent productivity in an LLM-assisted codebase. Covers Karpathy's three eras of software (1.0 explicit / 2.0 learned / 3.0 natural-language), the vibe-coding-vs-agentic-engineering distinction, the 0–5 autonomy slider with task-type recommendations, the one-asset / one-metric / one-time-box AutoResearch loop, Software 3.0 productivity metrics, and the documented quality regressions of ungated AI-generated code (the 'vibe hangover'). Do NOT use for choosing a specific autonomy-loop topology (use `agent-engineering`), for the per-prompt authoring discipline (use `prompt-craft`), or for reviewing the AI-generated code that comes out of a Software 3.0 workflow (use `code-review`).
6