evals-create-suite

Create an Eval Suite

Overview

Eval suites live in dedicated kbn-evals-suite-<name> packages. Each suite is a self-contained Playwright project that uses the evaluate fixture from @kbn/evals to run LLM experiments with datasets, tasks, and evaluators.
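To make the dataset, task, and evaluator vocabulary concrete, here is a minimal, self-contained sketch of how the three pieces might relate. Every type and function name below (Example, Dataset, Task, Evaluator, runExperiment) is hypothetical and invented for illustration. This is not the @kbn/evals API and does not use the evaluate fixture; it only models the idea of running a task over a dataset and scoring the outputs.

```typescript
// Hypothetical shapes for illustration only; NOT the @kbn/evals types.
interface Example<I, O> {
  input: I;
  expected: O;
}

interface Dataset<I, O> {
  name: string;
  examples: Array<Example<I, O>>;
}

// A task turns one input into one output (in a real suite, typically an LLM call).
type Task<I, O> = (input: I) => Promise<O>;

// An evaluator scores one output against the expected value.
interface Evaluator<O> {
  name: string;
  score: (output: O, expected: O) => number;
}

// Runs the task over every example, then returns the mean score per evaluator.
async function runExperiment<I, O>(
  dataset: Dataset<I, O>,
  task: Task<I, O>,
  evaluators: Array<Evaluator<O>>
): Promise<Record<string, number>> {
  const results: Array<{ output: O; expected: O }> = [];
  for (const example of dataset.examples) {
    results.push({ output: await task(example.input), expected: example.expected });
  }

  const means: Record<string, number> = {};
  for (const evaluator of evaluators) {
    let total = 0;
    for (const result of results) {
      total += evaluator.score(result.output, result.expected);
    }
    // An empty dataset yields a mean of 0 rather than NaN.
    means[evaluator.name] = results.length === 0 ? 0 : total / results.length;
  }
  return means;
}
```

In a real suite the task would call the model under test, and the evaluators would typically be richer than a single numeric comparison.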

Inputs to Collect

  • Suite name (kebab-case, e.g. my-feature)
  • Parent directory under x-pack/ (e.g. x-pack/platform/packages/shared/ai-infra/ or x-pack/solutions/security/test/)
  • Owner GitHub team handle (e.g. @elastic/appex-ai-infra)
  • Group (platform, security, observability, search)
  • Visibility (shared or private)
  • Whether custom fixtures are needed (chat client, esArchiver, supertest, etc.)
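The inputs above can be captured as a small typed record before any files are created. The sketch below is an assumption-laden illustration: SuiteInputs, validateSuiteInputs, and suitePackageDir are hypothetical names, not part of @kbn/evals or any Kibana tooling, and placing the package directly under the parent directory is assumed rather than stated in this skill.

```typescript
// Hypothetical helper types for collecting suite inputs; not a Kibana API.
type Group = 'platform' | 'security' | 'observability' | 'search';
type Visibility = 'shared' | 'private';

interface SuiteInputs {
  name: string; // kebab-case, e.g. "my-feature"
  parentDir: string; // directory under x-pack/
  owner: string; // GitHub team handle, e.g. "@elastic/appex-ai-infra"
  group: Group;
  visibility: Visibility;
  needsCustomFixtures: boolean;
}

const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;
const TEAM_HANDLE = /^@[\w-]+\/[\w-]+$/;

// Returns a list of human-readable problems; an empty list means the inputs look valid.
function validateSuiteInputs(inputs: SuiteInputs): string[] {
  const problems: string[] = [];
  if (!KEBAB_CASE.test(inputs.name)) {
    problems.push('Suite name must be kebab-case: ' + inputs.name);
  }
  if (inputs.parentDir.indexOf('x-pack/') !== 0) {
    problems.push('Parent directory must be under x-pack/: ' + inputs.parentDir);
  }
  if (!TEAM_HANDLE.test(inputs.owner)) {
    problems.push('Owner must be a GitHub team handle: ' + inputs.owner);
  }
  return problems;
}

// Derives the package directory, following the kbn-evals-suite-<name> naming.
// Assumption: the package sits directly inside the parent directory.
function suitePackageDir(inputs: SuiteInputs): string {
  const parent = inputs.parentDir.replace(/\/+$/, '');
  return parent + '/kbn-evals-suite-' + inputs.name;
}
```

Validating up front keeps naming mistakes (for example, a camelCase suite name) from propagating into package names and paths.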

Do NOT Use node scripts/scout.js generate

Eval suites are not standard Scout test configs. The Scout generator creates test/scout/ directories, which Scout's CI discovery glob picks up. Discovery then breaks for two reasons: eval configs use createPlaywrightEvalsConfig (not createPlaywrightConfig), and eval suites contain non-JS files (such as .text prompt files) that Playwright cannot parse.

Repository
elastic/kibana