skill-benchmark

Warn

Audited by Gen Agent Trust Hub on May 11, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill unsets CLAUDECODE and CLAUDE_CODE_ENTRYPOINT environment variables and invokes claude -p with the --dangerously-skip-permissions flag. This configuration allows nested agent sessions to execute tools (Bash, Read, Write, etc.) autonomously without human approval, bypassing standard platform safety controls for the duration of the benchmark.
  • [REMOTE_CODE_EXECUTION]: The skill executes commands extracted from benchmark task files using subprocess.run in scripts/run_checks.py. Although the script implements an executable allowlist (e.g., python3, node) and filters for shell metacharacters, it remains a vector for executing arbitrary code logic defined in external task files.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection (Category 8) because it ingests untrusted data from multiple sources and processes it using a subagent with tool access.
  • Ingestion points: Target skill's SKILL.md (Step 2), benchmark task definitions in the tasks/ directory, and session outputs in response.json.
  • Boundary markers: Absent. The instructions for the grader subagent in agents/grader.md do not include delimiters or instructions to ignore embedded commands or behavioral overrides within the ingested data.
  • Capability inventory: The parent agent and grader subagent possess Bash, Write, Edit, and Agent tool capabilities.
  • Sanitization: While scripts/run_checks.py validates verification commands, no sanitization or instruction filtering is applied to the natural language content processed by the subagents.
Audit Metadata
Risk Level
MEDIUM
Analyzed
May 11, 2026, 05:05 PM