simmer-judge-board
Simmer Judge Board
Dispatch a panel of judges, let them score independently, deliberate, and converge on consensus scores + a single ASI. The board's output is identical to a single judge's output — the orchestrator can't tell the difference.
Why a Board
A single judge has blind spots. It anchors on whatever it notices first, and its ASI reflects one perspective. Three judges with different lenses catch different things and challenge each other. The ASI that emerges from deliberation is stronger because blind spots get surfaced.
This matters most at plateaus — when a single judge keeps suggesting the same class of fix because it can't see the real bottleneck.
Context You Receive
The board receives the same context the single judge would receive (passed through from the orchestrator):
- Current candidate: full artifact text or key workspace files
- Criteria rubric: 2-3 criteria with descriptions of what 10/10 looks like
- Iteration number: which round this is
- Seed calibration (iteration 1+): original seed + iteration-0 scores
- Evaluator output (if evaluator mode): stdout/stderr from evaluator command
More from 2389-research/claude-plugins
omakase-off
This skill should be used as the entry gate for build/create/implement requests. Triggers on "build X", "create Y", "implement Z", "add feature", "try both approaches", "not sure which approach". Offers brainstorm-together or omakase (chef's choice parallel exploration) options. Detects indecision during brainstorming to offer parallel exploration.
15binary-re:static-analysis
Use when analyzing binary structure, disassembling code, or decompiling functions. Deep static analysis via radare2 (r2) and Ghidra headless - function enumeration, cross-references (xrefs), decompilation, control flow graphs. Keywords - "disassemble", "decompile", "what does this function do", "find functions", "analyze code", "r2", "ghidra", "pdg", "afl
15firebase-development:add-feature
This skill should be used when adding features to existing Firebase projects. Triggers on "add function", "create endpoint", "new tool", "add api", "new collection", "implement", "build feature". Guides TDD workflow with test-first development, security rules, and emulator verification.
15css-development:refactor
This skill should be used when refactoring existing CSS from inline styles or utility classes to semantic patterns. Triggers on "refactor CSS", "extract styles", "consolidate CSS", "convert inline", "clean up styles", "migrate to semantic". Transforms to semantic classes with dark mode and tests.
15binary-re:dynamic-analysis
Use when you need to run a binary, trace execution, or observe runtime behavior. Runtime analysis via QEMU emulation, GDB debugging, and Frida hooking - syscall tracing (strace), breakpoints, memory inspection, function interception. Keywords - "run binary", "execute", "debug", "trace syscalls", "set breakpoint", "qemu", "gdb", "frida", "strace", "watch memory
14binary-re:tool-setup
Use when reverse engineering tools are missing, not working, or need configuration. Installation guides for radare2 (r2), Ghidra, GDB, QEMU, Frida, binutils, and cross-compilation toolchains. Keywords - "install radare2", "setup ghidra", "r2 not found", "qemu missing", "tool not installed", "configure gdb", "cross-compiler
14