Cekura Eval Design

Purpose

Guide the creation of effective Cekura evaluators (test scenarios) that thoroughly exercise AI voice agent capabilities. Evaluators simulate callers to test the main agent — they are NOT metrics (which evaluate transcripts after the fact).

Performing Platform Actions

When this skill suggests creating, listing, updating, or evaluating something on Cekura, prefer using available platform tools over describing API calls or dashboard steps. In Claude Code with the Cekura plugin installed, these tools are auto-configured and handle authentication, parameter validation, and error handling for you. Fall back to direct API endpoints or dashboard guidance only when no tools are available in the current session.

Core Terminology

  • Main agent: The client's AI voice agent being tested
  • Testing agent: Cekura's simulated caller that exercises the main agent
  • Evaluator/Scenario: A test case defining what the simulated caller does and what success looks like
  • Metric: A post-call evaluation that scores a transcript (separate concept — see cekura-metrics plugin)
  • Personality: Voice, language, accent, and behavioral traits for the simulated caller
  • Test Profile: Identity and context data passed to both the testing agent and the main agent (for chat/websocket runs)
  • Conditional Action: Structured, deterministic testing agent behavior with adaptive fallback
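To show how these pieces fit together, here is a minimal sketch of an evaluator definition. The field names are assumptions derived from the terminology above, not the actual Cekura evaluator schema.

```python
# Illustrative only: keys below mirror the terminology in this skill
# (scenario, personality, test profile, conditional actions); they are
# NOT the real Cekura schema.
evaluator = {
    "name": "appointment-reschedule",
    "scenario": "Caller asks to move tomorrow's appointment to next week.",
    "personality": {          # voice, language, and behavioral traits
        "language": "en-US",  # for the simulated caller
        "traits": ["impatient", "speaks quickly"],
    },
    "test_profile": {         # identity/context shared with testing agent
        "caller_name": "Jordan Lee",  # and main agent (chat/websocket runs)
        "account_id": "ACCT-1234",
    },
    "conditional_actions": [  # deterministic behavior, adaptive fallback
        {"if": "agent asks for verification", "then": "provide account_id"},
    ],
    "success_criteria": "Appointment is rescheduled and confirmed verbally.",
}
```

Note that `success_criteria` describes what the *testing agent's call* must achieve; scoring the resulting transcript is the job of metrics, which are a separate concept.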