short-form-eval
Short-Form Eval — Orchestrator
Feedback-loop skill — closes the brief → publish → score → pattern-log loop. The gap-gate consumes its outputs to decide what the stack should learn next.
Core Question: "Did the brief survive contact with the platform — and what's the signal-bearing pattern this cycle adds to the log?"
Critical Gates — Read First
Non-negotiable constraints before dispatching any agent:
- Provisional rubric, not locked.
references/rubric.mdships atversion: 0.1, status: provisional. Mandatory revision after cycle 2-3 against real variance. Per-cycle rubric drift is expected — encode the change in the artifact, don't smuggle it into the rubric file silently. - Cycle 1 weighting is 70% observation / 30% scoring. Single calibration pair would overfit a locked rubric. First cycle leans toward describing what you saw; later cycles harden scoring as variance accumulates.
- Both brief and reference catalog must exist. No platform-intel reference → BLOCKED. No brief → BLOCKED. The eval scores a fidelity claim against known patterns; missing either side reduces the run to vibes.
- No fabricated metrics. Every engagement number, completion rate, save/share count, and sample-size claim cites the URL or panel screenshot it came from. Critic rubric #1 fails the artifact otherwise.
- Pattern-log entries are atomic. One cycle = one pattern-log entry block in the report. The block has a fixed shape (claim, evidence, refutability, expiry) so future cycles can diff. Free-form prose patterns are unusable downstream.
More from hungv47/research-skills
market-research
Analyzes market landscapes, competitive dynamics, TAM/SAM/SOM sizing, and whitespace opportunities for a product or category. Produces `research/market-research.md`. Not for building customer personas (use icp-research) or planning marketing campaigns (use campaign-plan). For diagnosing a specific business problem, see diagnose. For prioritizing what to build from market data, see prioritize.
10icp-research
Builds ideal customer profiles and buyer personas — analyzes demographics, pain points, jobs-to-be-done, and segmentation for a target market. Produces `research/icp-research.md`. Not for competitive positioning (use prioritize) or campaign planning (use campaign-plan). For brand identity from audience data, see brand-system. For market sizing and competitor landscape, see market-research.
10problem-analysis
Structured diagnosis of business and strategic problems — builds logic trees, forms testable hypotheses, and identifies root causes with evidence. Produces `.agents/problem-analysis.md`. Not for code bugs (use code-cleanup) or brainstorming solutions to a known problem (use solution-design). For market-level trends and competitive context, see market-research. For testing hypotheses with experiments, see experiment.
9funnel-planner
Models business funnels with numeric targets — works backward from revenue goals to required traffic, conversion rates, and unit economics. Produces `.agents/skill-artifacts/meta/records/targets-*.md`. For campaign planning, see campaign-plan.
9solution-design
Brainstorms and prioritizes strategic solutions when the problem or goal is already clear — generates options, scores trade-offs, and recommends a path forward. Produces `.agents/solution-design.md`. Not for diagnosing what the problem is (use problem-analysis) or engineering task lists (use task-breakdown). For setting numeric targets after prioritizing, see funnel-planner. For technical architecture of chosen initiatives, see system-architecture.
9experiment
Designs validation experiments for business ideas — defines hypotheses, success metrics, sample sizes, and minimum viable tests before full commitment. Produces `.agents/experiment-[name].md`. Not for strategic prioritization (use solution-design) or market sizing (use market-research). For diagnosing root causes, see problem-analysis. For campaign planning, see imc-plan.
7