addon-llm-judge-evals

Add-on: LLM Judge Evals

Use this skill when you need qualitative evaluation (clarity, domain fit, UX coherence, docs quality) in addition to deterministic checks.

Compatibility

  • Works with all stacks.
  • Best paired with addon-deterministic-eval-suite.

Inputs

Collect:

  • JUDGE_BACKEND: auto | langchain | google-adk (default auto).
  • JUDGE_MODEL: model id to run scoring.
  • JUDGE_TIMEOUT_SECONDS: default 60.
  • JUDGE_MAX_RETRIES: default 2.
  • JUDGE_TEMPERATURE: default 0.
  • JUDGE_FAIL_ON_BACKEND_MISMATCH: yes | no (default yes).
  • JUDGE_RUBRIC_MODE: product | security | developer-experience | custom.
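The inputs above can be collected from environment variables and validated before the judge runs. Below is a minimal sketch of such a loader; the `JudgeConfig` class and `from_env` helper are hypothetical names (not part of this skill), and the `product` fallback for `JUDGE_RUBRIC_MODE` is an assumption, since the source lists no default for it.

```python
import os
from dataclasses import dataclass

_BACKENDS = {"auto", "langchain", "google-adk"}
_RUBRICS = {"product", "security", "developer-experience", "custom"}


@dataclass(frozen=True)
class JudgeConfig:
    """Judge settings with the defaults listed in Inputs."""
    backend: str = "auto"
    model: str = ""                     # no default stated; must be provided
    timeout_seconds: int = 60
    max_retries: int = 2
    temperature: float = 0.0
    fail_on_backend_mismatch: bool = True
    rubric_mode: str = "product"        # assumed fallback; source states no default

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        backend = env.get("JUDGE_BACKEND", "auto")
        if backend not in _BACKENDS:
            raise ValueError(f"JUDGE_BACKEND must be one of {sorted(_BACKENDS)}, got {backend!r}")
        rubric = env.get("JUDGE_RUBRIC_MODE", "product")
        if rubric not in _RUBRICS:
            raise ValueError(f"JUDGE_RUBRIC_MODE must be one of {sorted(_RUBRICS)}, got {rubric!r}")
        return cls(
            backend=backend,
            model=env.get("JUDGE_MODEL", ""),
            timeout_seconds=int(env.get("JUDGE_TIMEOUT_SECONDS", "60")),
            max_retries=int(env.get("JUDGE_MAX_RETRIES", "2")),
            temperature=float(env.get("JUDGE_TEMPERATURE", "0")),
            # yes | no flag; anything other than "yes" is treated as no
            fail_on_backend_mismatch=env.get("JUDGE_FAIL_ON_BACKEND_MISMATCH", "yes") == "yes",
            rubric_mode=rubric,
        )
```

Validating `JUDGE_BACKEND` and `JUDGE_RUBRIC_MODE` up front keeps a typo (e.g. `langchian`) from silently falling through to the `auto` path.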
Installs: 7
First Seen: Mar 2, 2026