skill-system-debug
Fail
Audited by Gen Agent Trust Hub on Apr 11, 2026
Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The
bisectoperation inscripts/debug_tool.pyaccepts atestcommand string which is split into arguments and passed directly togit bisect run. This utility executes the provided command repeatedly on different commits, providing a vector for arbitrary command execution within the agent's environment. - [DATA_EXFILTRATION]: The
traceandcompareoperations inscripts/debug_tool.pyprovide unrestricted filesystem access. Thetracefunction parses imports and reads the contents of referenced files, while thecomparefunction recursively reads and diffs directory contents. Neither function implements path validation or sandboxing, allowing an attacker to read sensitive files like SSH keys, AWS credentials, or environment variables by providing absolute or relative paths outside the project scope. - [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests untrusted data from the codebase and presents it to the agent without isolation.
- Ingestion points:
_run_grep_search,_run_behavior_search, and_parse_importsinscripts/debug_tool.pyread content from files across the project directory. - Boundary markers: No delimiters or 'ignore' instructions are used to wrap the gathered content when returning it to the agent.
- Capability inventory: The skill possesses high-privilege capabilities including command execution and broad filesystem reading.
- Sanitization: Content read from the filesystem is directly incorporated into the JSON output (e.g., in the 'hypotheses' or 'key_diffs' fields) without any escaping or filtering of potentially malicious instructions embedded in the code or data files.
Recommendations
- AI detected serious security threats
Audit Metadata