behavioral-evals
Behavioral Evals
Overview
Behavioral evaluations (evals) are tests that validate the agent's decision-making (e.g., tool choice) rather than pure functionality. They are critical for verifying prompt changes, debugging steerability, and preventing regressions.
[!NOTE] Single Source of Truth: For core concepts, policies, running tests, and general best practices, always refer to evals/README.md.
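To make the distinction in the overview concrete, here is a minimal sketch of a check that looks at which tool the agent chose rather than at its final answer. Every type, function, and tool name in it (`AgentRun`, `expectFirstTool`, `search_files`, `read_file`) is hypothetical and chosen for illustration; none of it is the eval harness described in evals/README.md, which remains the reference for how real evals are written and run.

```typescript
// Hypothetical types for illustration only; not the project's eval API.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface AgentRun {
  toolCalls: ToolCall[]; // tools the agent invoked, in order
  finalText: string;     // the agent's final answer
}

interface BehaviorCheck {
  passed: boolean;
  reason: string;
}

// Checks which tool the agent chose first. The final answer is deliberately
// ignored: a correct answer reached via the wrong tool still fails.
function expectFirstTool(run: AgentRun, expected: string): BehaviorCheck {
  const first = run.toolCalls[0];
  if (first === undefined) {
    return { passed: false, reason: `no tool was called; expected ${expected}` };
  }
  if (first.name !== expected) {
    return { passed: false, reason: `called ${first.name}; expected ${expected}` };
  }
  return { passed: true, reason: `called ${expected} first` };
}

// A recorded run with made-up data: the agent was asked to locate a symbol.
const run: AgentRun = {
  toolCalls: [
    { name: "search_files", args: { pattern: "parseConfig" } },
    { name: "read_file", args: { path: "src/config.ts" } },
  ],
  finalText: "parseConfig is defined in src/config.ts.",
};

console.log(expectFirstTool(run, "search_files"));
console.log(expectFirstTool(run, "read_file"));
```

The second call fails even though `finalText` is plausible, which is the kind of regression a purely functional test would not catch.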
🔄 Workflow Decision Tree
Repository
google-gemini/gemini-cli
GitHub Stars
105.6K
First Seen
Mar 24, 2026