color-system-design
Color System Design
Coverage
A color system has three layers: a palette of raw color values (organized as scales — typically 9 to 12 steps from very light to very dark within a single hue), a set of semantic tokens that name colors by purpose rather than appearance (background-surface, text-primary, border-subtle, danger-emphasis), and a mapping that resolves semantic tokens to palette values per theme. The palette is the vocabulary; the semantic tokens are the contract that components consume; the mapping is what changes between themes.
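The three layers can be sketched as plain data plus one lookup function. This is a minimal illustration, not a prescribed implementation: the step numbers, the oklch values, the `Mapping` shape, and the `resolveToken` helper are all assumptions made for the example; only the token names come from the text above.

```typescript
// Layer 1: palette of raw values, organized as scales.
// Step numbers and color values are illustrative assumptions.
type Palette = Record<string, Record<number, string>>;

const palette: Palette = {
  gray: {
    1: "oklch(0.99 0 0)",
    2: "oklch(0.96 0 0)",
    6: "oklch(0.80 0 0)",
    9: "oklch(0.40 0 0)",
    12: "oklch(0.15 0 0)",
  },
  red: {
    9: "oklch(0.58 0.22 27)",
    10: "oklch(0.66 0.20 27)",
  },
};

// Layer 2: semantic tokens, named by purpose. This is the contract components consume.
type SemanticToken =
  | "background-surface"
  | "text-primary"
  | "border-subtle"
  | "danger-emphasis";

// Layer 3: per-theme mapping from semantic token to a palette entry.
type ThemeName = "light" | "dark";
interface PaletteRef {
  scale: string;
  step: number;
}
type Mapping = Record<ThemeName, Record<SemanticToken, PaletteRef>>;

const mapping: Mapping = {
  light: {
    "background-surface": { scale: "gray", step: 1 },
    "text-primary": { scale: "gray", step: 12 },
    "border-subtle": { scale: "gray", step: 6 },
    "danger-emphasis": { scale: "red", step: 9 },
  },
  dark: {
    "background-surface": { scale: "gray", step: 12 },
    "text-primary": { scale: "gray", step: 2 },
    "border-subtle": { scale: "gray", step: 9 },
    "danger-emphasis": { scale: "red", step: 10 },
  },
};

// Resolve a semantic token to a raw palette value for the given theme.
function resolveToken(theme: ThemeName, token: SemanticToken): string {
  const { scale, step } = mapping[theme][token];
  const value = palette[scale]?.[step];
  if (value === undefined) {
    throw new Error(`No palette value for ${scale}-${step}`);
  }
  return value;
}
```

Components would call `resolveToken(theme, "text-primary")` (or consume the equivalent CSS custom property) and never reference `palette` directly, so switching themes only swaps the mapping.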
Color spaces matter for scale construction. sRGB-based models such as HSL are not perceptually uniform: equal numeric steps in lightness produce unequal perceived lightness changes, with notable cliffs around yellow and teal. OKLCH and CIELAB are designed to be approximately perceptually uniform, so equal L (lightness) steps look close to equal regardless of hue, which makes them the appropriate spaces for generating scales. CSS Color Module Level 4 specifies oklch() and lch() as first-class CSS color functions, and color-mix() (CSS Color Module Level 5) interpolating in lch or oklch produces predictable results. The display-p3 gamut, supported by most modern displays, allows more saturated colors than sRGB, most visibly in the greens and reds. Because color-gamut is a media feature rather than a property, wide-gamut values are delivered by gating on @media (color-gamut: p3) or by writing color(display-p3 ...) values directly, with an sRGB declaration placed first as the fallback for sRGB-only displays.
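As a rough sketch of scale generation in a perceptual space, the function below spaces lightness evenly in OKLCH and emits CSS `oklch()` strings. The `generateScale` name, its options, and the default lightness endpoints are assumptions for illustration. Holding chroma constant is a simplification: at very light or very dark steps the requested chroma may fall outside the display gamut, so a production scale would usually taper chroma toward the ends.

```typescript
interface ScaleOptions {
  hue: number; // OKLCH hue angle in degrees
  chroma: number; // OKLCH chroma, held constant here (a simplification)
  steps?: number; // number of steps in the scale
  lightest?: number; // OKLCH L of the first step (0..1)
  darkest?: number; // OKLCH L of the last step (0..1)
}

// Build a light-to-dark scale with equal lightness increments in OKLCH.
function generateScale({
  hue,
  chroma,
  steps = 12,
  lightest = 0.97,
  darkest = 0.2,
}: ScaleOptions): string[] {
  if (steps < 2) {
    throw new Error("A scale needs at least 2 steps");
  }
  const scale: string[] = [];
  for (let i = 0; i < steps; i++) {
    const t = i / (steps - 1); // 0 at the lightest step, 1 at the darkest
    const lightness = lightest + (darkest - lightest) * t;
    scale.push(`oklch(${lightness.toFixed(3)} ${chroma} ${hue})`);
  }
  return scale;
}

// Example: a 12-step blue-ish scale.
const blue = generateScale({ hue: 250, chroma: 0.12 });
```

Because the steps are equal in L rather than in HSL lightness, the same call with a different `hue` should produce a scale whose steps look comparably light, which is the property the paragraph above relies on.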
Contrast is governed by WCAG 2.1, which specifies a contrast ratio computed from relative luminance: 4.5:1 for normal text (AA), 3:1 for large text (AA, defined as 18pt+ or 14pt+ bold), and 7:1 for normal text (AAA). APCA (Accessible Perceptual Contrast Algorithm), a candidate contrast method developed for WCAG 3, uses a different model that aims to better predict perceived text legibility, with thresholds expressed as Lc values that rise as text gets smaller or thinner (roughly Lc 60 for larger content text and Lc 75 for body text). Tooling should compute both, but WCAG 2.1 remains the legally referenced standard in most jurisdictions as of 2026.
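The WCAG 2.1 ratio is simple enough to compute directly. The sketch below follows the relative-luminance and contrast-ratio definitions in WCAG 2.1 for 8-bit sRGB colors; the function names and the `meetsWcag` helper are assumptions for this example, and the 4.5:1 AAA threshold for large text comes from WCAG success criterion 1.4.6 rather than from the text above. APCA is not implemented here.

```typescript
type RGB = [number, number, number]; // 8-bit sRGB channels, 0..255

// Convert one 8-bit sRGB channel to linear light, per the WCAG 2.1 definition.
// (WCAG 2.1 uses 0.03928 as the threshold; for 8-bit input the result is the
// same as with the sRGB standard's 0.04045.)
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: weighted sum of the linearized channels.
function relativeLuminance([r, g, b]: RGB): number {
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(a: RGB, b: RGB): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

// Check a foreground/background pair against a WCAG 2.1 text threshold.
function meetsWcag(
  fg: RGB,
  bg: RGB,
  level: "AA" | "AAA" = "AA",
  largeText = false
): boolean {
  const threshold =
    level === "AA" ? (largeText ? 3 : 4.5) : largeText ? 4.5 : 7;
  return contrastRatio(fg, bg) >= threshold;
}
```

A token-validation step could run `meetsWcag` over every text/background pair in every theme, for example `meetsWcag([118, 118, 118], [255, 255, 255])` for mid-gray text on white, and fail the build when a pair drops below its threshold.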
Semantic tokens decouple the system from a single visual treatment. text-primary on a light theme might be near-black at L=0.15; on a dark theme it might be near-white at L=0.96. Components reference the same semantic token in both themes, and each resolved value meets the same contrast requirement against its own background. The discipline is to define semantic tokens by intent and contrast pair (text-on-surface, text-on-emphasis), never by appearance (text-dark-gray).
Philosophy
Color choices are constraints applied to perception, not free decisions. Perceptual uniformity, contrast minima, color-blindness considerations (roughly 8% of men have some form of color vision deficiency), and gamut limits are real and observable. A palette that ignores them produces accessibility violations and uneven scales that designers have to compensate for with one-off tweaks.
Semantic tokens are worth the indirection because color meanings outlive specific values. "Danger" stays danger when the brand red shifts a few degrees; components keep working. Tying components directly to palette values turns every brand refresh into a code change.