agentic-harness-design
Agentic Harness Design
Patterns for building multi-agent systems that produce high-quality outputs on long, complex tasks—covering generator/evaluator loops, context management, and task decomposition.
Core Architecture: Planner → Generator → Evaluator
Three-agent split addresses the two main failure modes of solo agents:
- Context degradation — models lose coherence as the context window fills; some exhibit "context anxiety" and wrap up prematurely.
- Self-evaluation bias — agents reliably over-praise their own output; separating producer from judge is the key lever.
Planner takes a short user prompt (1–4 sentences) and expands it into a full product spec. Keep it at the level of deliverables and high-level architecture—not granular implementation details, which cascade errors downstream. Ask the planner to identify opportunities to weave AI-native features into the spec.
Generator implements against the spec. Works in sprints (one feature at a time) when the model needs scaffolding. Stronger models can run as a single continuous session with SDK-level compaction handling context growth. Self-evaluates at the end of each sprint before handoff.
Evaluator grades the generator's output against agreed criteria. Uses a live browser tool (e.g. Playwright MCP) to interact with the running app rather than scoring static screenshots. Produces specific, actionable findings.
Sprint Contracts
More from editframe/skills
video-analysis
Analyze video files using ffprobe, mp4dump, and jq. Use when investigating video samples, keyframes, MP4 box structure, codec info, packet timing, or debugging video playback issues.
93visual-thinking
Create visual analogies by mapping relational structure from familiar domains onto unfamiliar concepts using spatial relationships to make abstract patterns concrete. Covers static diagrams AND animated video storytelling (camera choreography, race comparisons, pacing). Use when explaining complex concepts, creating analogies, designing diagrams, creating explainer animations, or revealing system structure.
84css-animations
CSS animation fill-mode requirements for Editframe timeline system. Use when creating CSS animations, debugging flashing/flickering issues, or when user mentions animation problems, fade effects, slide effects, or sequential animations.
82threejs-compositions
Integrate Three.js 3D scenes into Editframe compositions via addFrameTask. Scenes are pure functions of time, fully scrubable, and renderable to MP4. Use when creating 3D animations, WebGL content in compositions, or integrating Three.js with Editframe's timeline system.
81editor-gui
Build video editing interfaces using Editframe's GUI web components. Assemble timeline, scrubber, filmstrip, preview, and playback controls like lego bricks. Use when creating video editors, editing tools, or when user mentions timeline, scrubber, preview, playback controls, trim handles, or wants to build editing UIs.
77elements-new-package
Create a new @editframe/* workspace package in the elements monorepo and publish it to npm.
77