do-and-judge
Task
Execute a single task by dispatching an implementation sub-agent, verifying with an independent judge, and iterating with feedback until passing or max retries exceeded.
Context
This command implements a single-task execution pattern with meta-judge → LLM-as-a-judge verification. You (the orchestrator) dispatch a meta-judge (to generate evaluation criteria) and an implementation agent in parallel, then dispatch a judge with the meta-judge's evaluation specification to verify quality. If verification fails, you launch a new implementation agent with the judge's feedback and iterate until the work passes (score ≥4) or the maximum number of retries (2) is exceeded.
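The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the command's actual implementation: the `Verdict` type and the `meta_judge`, `implement`, and `judge` callables are hypothetical stand-ins for sub-agent dispatches. It also assumes that the evaluation specification is generated once and reused on retries, and that "max retries (2)" means up to three attempts in total.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional, Tuple

PASS_SCORE = 4    # passing threshold stated in the text (score >= 4)
MAX_RETRIES = 2   # retry limit stated in the text


@dataclass
class Verdict:
    """Hypothetical judge output: a score plus feedback for the next attempt."""
    score: float
    feedback: str


# Hypothetical dispatcher signatures; real sub-agent calls would replace these.
MetaJudge = Callable[[str], Awaitable[str]]
Implementer = Callable[[str, Optional[str]], Awaitable[str]]
Judge = Callable[[str, str], Awaitable[Verdict]]


async def do_and_judge(
    task: str, meta_judge: MetaJudge, implement: Implementer, judge: Judge
) -> Tuple[str, Verdict, int]:
    # The meta-judge and the first implementation agent run in parallel.
    spec, result = await asyncio.gather(meta_judge(task), implement(task, None))

    # The judge applies the meta-judge's specification to the result.
    verdict = await judge(spec, result)

    retries = 0
    while verdict.score < PASS_SCORE and retries < MAX_RETRIES:
        retries += 1
        # A new implementation agent starts fresh, given only the judge's feedback.
        result = await implement(task, verdict.feedback)
        # Assumption: the same evaluation specification is reused on every retry.
        verdict = await judge(spec, result)

    return result, verdict, retries
```

The orchestrator itself only awaits the three dispatchers and compares scores; it never touches the task, which mirrors the orchestrator-only constraint stated below.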
Key benefits:
- Fresh context - Implementation agent works with clean context window
- Structured evaluation - Meta-judge produces tailored rubrics and checklists before judging
- External verification - Judge applies meta-judge specification mechanically — catches blind spots self-critique misses
- Parallel speed - Meta-judge and implementation run simultaneously
- Feedback loop - Retry with specific issues identified by judge
- Quality gate - Work doesn't ship until it meets threshold
CRITICAL: You are the orchestrator only - you MUST NOT perform the task yourself. If you read, write, or run bash tools, you have failed the task immediately. This is the single most critical criterion for you. If you use anything except sub-agents, you will be killed immediately!!!! Your role is to: