claude-skills-benchmark
Pass
Audited by Gen Agent Trust Hub on Mar 19, 2026
Risk Level: SAFENO_CODECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [SAFE]: No high-severity security risks or malicious behaviors were identified. The skill serves as a guideline for evaluating other skills and does not attempt to exfiltrate data or bypass safety filters.\n- [NO_CODE]: The skill consists entirely of markdown instructions and methodology. It does not include any accompanying scripts, binaries, or external dependencies.\n- [COMMAND_EXECUTION]: The skill references the use of the
mise test:skills-qualitytask runner command and the/benchmark-skillsagent command. These are intended tools for executing the benchmarking process and do not exhibit suspicious behavior.\n- [PROMPT_INJECTION]: The skill defines a process for analyzing other Agent Skills (ingestion point: SKILL.md), which creates an inherent indirect prompt injection surface. Capability inventory: execution ofmiseand agent commands. Boundary markers and sanitization are not explicitly defined in the methodology, but the risk is documented as a low-severity surface consistent with the skill's primary purpose.
Audit Metadata