agent-eval

Pass

Audited by Gen Agent Trust Hub on May 19, 2026

Risk Level: SAFE

Full Analysis

- [SAFE]: No security issues were identified in the skill. The instructions and metadata are purely informational, providing templates for agent evaluation tasks.
- [COMMAND_EXECUTION]: The skill documents the use of a tool that executes shell commands (e.g., pytest, npm run build) provided in YAML configuration files to judge the success of an agent's code. This is a primary and expected feature of the benchmarking tool.
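To illustrate the command-execution behavior noted in this analysis, here is a minimal sketch of how a benchmarking tool might run judge commands and decide pass or fail from their exit codes. It is not the audited tool's implementation. The function names `run_judge_commands` and `task_passed`, the `judge` key, and the task layout are hypothetical, and the task is written as a Python dict rather than parsed from YAML, since the report does not show the real schema. The sketch also passes each command as an argument list instead of a shell string, which is a deliberate simplification that avoids shell interpretation.

```python
import subprocess
import sys


def run_judge_commands(commands, timeout=60):
    """Run each judge command and return a list of (command, passed) pairs.

    A command counts as passed only if it exits with status 0.
    Commands that time out or cannot be started count as failed.
    """
    results = []
    for cmd in commands:
        try:
            completed = subprocess.run(cmd, capture_output=True, timeout=timeout)
            passed = completed.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            passed = False
        results.append((cmd, passed))
    return results


def task_passed(results):
    """A task succeeds only if every judge command passed."""
    return all(passed for _, passed in results)


if __name__ == "__main__":
    # Hypothetical task definition; the real YAML schema is not shown in the audit.
    task = {
        "name": "example-task",
        "judge": [
            [sys.executable, "-c", "print('stand-in for pytest')"],
            [sys.executable, "-c", "print('stand-in for npm run build')"],
        ],
    }
    outcome = run_judge_commands(task["judge"])
    print(task["name"], "PASS" if task_passed(outcome) else "FAIL")
```

Because the commands come from configuration files, anyone who can edit those files controls what runs on the host, which is why the audit lists the behavior explicitly even though it is expected for a benchmarking tool.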

Audit Metadata

Risk Level: SAFE

Analyzed: May 19, 2026, 06:50 AM
