do-and-judge

Task

Execute a single task by dispatching an implementation sub-agent, verifying with an independent judge, and iterating with feedback until passing or max retries exceeded.

Context

This command implements a single-task execution pattern with meta-judge → LLM-as-a-judge verification. You (the orchestrator) dispatch a meta-judge (to generate evaluation criteria) and an implementation agent in parallel, then dispatch a judge with the meta-judge's evaluation specification to verify quality. If verification fails, you launch a new implementation agent with the judge's feedback and iterate until the work passes (score ≥4) or the maximum of 2 retries is exceeded.
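The control flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the command's actual implementation: the `meta_judge`, `implement`, and `judge` callables, the `Verdict` and `Outcome` types, and the function name `do_and_judge` are all hypothetical stand-ins for sub-agent dispatch. It also assumes that "max retries (2)" means one initial attempt plus up to two retries, and that the meta-judge's specification is reused across retries.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional

PASS_THRESHOLD = 4  # "score ≥4" from the text
MAX_RETRIES = 2     # "max retries (2)" from the text


@dataclass
class Verdict:
    score: float
    feedback: str


@dataclass
class Outcome:
    passed: bool
    attempts: int
    result: str
    verdict: Verdict


# Hypothetical sub-agent signatures; real dispatch is not shown here.
MetaJudge = Callable[[str], Awaitable[str]]                   # task -> evaluation spec
Implementer = Callable[[str, Optional[str]], Awaitable[str]]  # task, feedback -> result
Judge = Callable[[str, str], Awaitable[Verdict]]              # spec, result -> verdict


async def do_and_judge(task: str, meta_judge: MetaJudge,
                       implement: Implementer, judge: Judge) -> Outcome:
    # First round: meta-judge and implementation agent run in parallel.
    spec, result = await asyncio.gather(meta_judge(task), implement(task, None))
    verdict = await judge(spec, result)
    attempts = 1
    # Retry with the judge's feedback until passing or retries run out.
    # Assumption: the same evaluation spec is reused for every retry.
    while verdict.score < PASS_THRESHOLD and attempts <= MAX_RETRIES:
        result = await implement(task, verdict.feedback)  # fresh agent per retry
        verdict = await judge(spec, result)
        attempts += 1
    return Outcome(verdict.score >= PASS_THRESHOLD, attempts, result, verdict)


# Usage with stub agents: the first draft fails, the retry passes.
async def _demo() -> Outcome:
    async def meta_judge(task: str) -> str:
        return f"rubric for: {task}"

    async def implement(task: str, feedback: Optional[str]) -> str:
        return "draft" if feedback is None else "revised"

    async def judge(spec: str, result: str) -> Verdict:
        if result == "revised":
            return Verdict(5, "ok")
        return Verdict(2, "missing tests")

    return await do_and_judge("add a parser", meta_judge, implement, judge)


outcome = asyncio.run(_demo())
print(outcome.passed, outcome.attempts)  # → True 2
```

Passing the agents in as callables keeps the orchestrator free of any task work of its own: it only dispatches, compares the score to the threshold, and forwards feedback.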

Key benefits:

Fresh context - Implementation agent works with clean context window
Structured evaluation - Meta-judge produces tailored rubrics and checklists before judging
External verification - Judge applies the meta-judge specification mechanically and catches blind spots that self-critique misses
Parallel speed - Meta-judge and implementation run simultaneously
Feedback loop - Retry with specific issues identified by judge
Quality gate - Work doesn't ship until it meets threshold

CRITICAL: You are the orchestrator only - you MUST NOT perform the task yourself. If you read, write, or run bash tools, you have failed the task immediately. This is the single most critical criterion for you. If you use anything except sub-agents, you will be killed immediately!!!! Your role is to:

Installs: 262
Repository: neolabhq/contex…ring-kit
GitHub Stars: 993
First Seen: Apr 23, 2026
Security Audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Fail
