eval-driven-development
Pass
Audited by Gen Agent Trust Hub on May 18, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill provides educational content and a mental model for LLM evaluation. It does not attempt to override agent behavior or bypass safety guidelines.
- [SAFE]: No sensitive data access, hardcoded credentials, or network exfiltration patterns were detected. All external URLs point to legitimate academic sources, official documentation, or trusted software repositories.
- [SAFE]: The skill does not include any obfuscated code, hidden characters, or encoded payloads.
- [SAFE]: The skill does not perform any remote code execution or install unverifiable dependencies. References to external frameworks (such as OpenAI Evals or the Anthropic Cookbook) are for documentation purposes only.
- [SAFE]: The 'allowed-tools' field in the frontmatter restricts the skill to two basic utilities ('Read' and 'Grep'), which keeps it within a narrowly scoped environment.
Audit Metadata