genie-benchmark-evaluator
Genie Benchmark Evaluator
Evaluates Genie Space responses with a multi-dimensional, three-layer judge architecture and tracks the results in MLflow. Supports two evaluation modes: as a Databricks Job or inline.
When to Use This Skill
- Scoring Genie Space accuracy against benchmark questions
- Comparing evaluation results across optimization iterations
- Running post-deploy verification after bundle deployment
- Testing repeatability of Genie SQL generation
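As a rough illustration of the repeatability use case, the sketch below scores how often repeated runs of the same benchmark question produce the same SQL. It is not the skill's own implementation: `normalize_sql` and `repeatability_score` are hypothetical helper names, and the normalization (whitespace collapsing, lowercasing, dropping a trailing semicolon) is an assumed, deliberately crude notion of "same query". The input is assumed to be a plain list of SQL strings already collected from Genie.

```python
import re
from collections import Counter


def normalize_sql(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase the text.

    Assumption: this crude normalization is good enough to compare queries.
    Note that lowercasing also changes string literals inside the query.
    """
    collapsed = re.sub(r"\s+", " ", sql.strip())
    return collapsed.rstrip(";").strip().lower()


def repeatability_score(sql_runs: list[str]) -> float:
    """Return the share of runs that match the most common normalized SQL."""
    if not sql_runs:
        raise ValueError("sql_runs must contain at least one query")
    counts = Counter(normalize_sql(sql) for sql in sql_runs)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(sql_runs)


# Hypothetical SQL generated for the same question on three separate runs.
runs = [
    "SELECT a FROM t;",
    "select a\n  from t",
    "SELECT b FROM t",
]
score = repeatability_score(runs)  # two of the three runs agree
```

A score of 1.0 means every run produced the same normalized query. Comparing SQL text is stricter than comparing result sets, so two semantically equivalent queries written differently would count as a mismatch here.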