cekura-metric-improvement
Cekura Metric Improvement (Labs Workflow)
Purpose
Guide the metric improvement cycle: identify misaligned metric results, leave structured feedback, run the labs improvement pipeline, and validate changes. This workflow transforms metric quality from initial draft to production-ready through systematic iteration.
Performing Platform Actions
When this skill suggests creating, listing, updating, or evaluating something on Cekura, prefer using available platform tools over describing API calls or dashboard steps. In Claude Code with the Cekura plugin installed, these tools are auto-configured and handle authentication, parameter validation, and error handling for you. Fall back to direct API endpoints or dashboard guidance only when no tools are available in the current session.
Manual Fix First, Then Labs
When metrics have systemic issues (high false-fail rates), do NOT jump straight to labs feedback. Instead:
- Read failure explanations and categorize root causes (e.g., cross-pollination from other flows, extra_questions flagged, end-of-call protocol violations, should-be-N/A cases)
- Write manual prompt fixes targeting the dominant failure categories — add SCOPE & FOCUS, DO NOT FLAG, narrow FAILURE CONDITIONS
- PATCH the updated descriptions via API
- Re-evaluate a sample of 20-30 calls per metric to validate the fixes
- THEN use labs feedback for remaining edge cases that manual fixes didn't catch
More from cekura-ai/cekura-skills
cekura-metric-design
>
9cekura-coordinator
>
9cekura-eval-design
>
9cekura-fixing-prod-issues
Debugs a failing production call, reproduces the bug with Cekura evaluators, implements a fix, verifies it, runs regression tests, then raises a PR with evidence. Use when the user wants to fix a production call bug, investigate a failing prod call, reproduce and fix a production issue, run regression tests before a PR, or says things like "fix this prod call issue", "debug and fix call ID", "test my fix against prod scenarios", "reproduce this production bug", or "regression test before raising PR".
9cekura-self-improving-agent
>
9cekura-onboarding
>
8