debug-loop
Warn
Audited by Gen Agent Trust Hub on Mar 20, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
- [COMMAND_EXECUTION]: The skill's primary workflow involves executing arbitrary shell commands provided in the symptom description or generated during the debugging process. Phase 1 and Phase 4 explicitly instruct the agent to run reproduction commands, while Phase 2 encourages running real system and database commands (e.g., `sqlite3`) to test hypotheses. This lack of command validation or restriction poses a risk if malicious commands are provided as input.
- [REMOTE_CODE_EXECUTION]: Phase 5 executes `npm test`, which runs scripts defined in the local `package.json` file. This can lead to the execution of untrusted or malicious code if the project's dependencies or test configurations are compromised.
- [DATA_EXFILTRATION]: The capability to execute arbitrary commands like `sqlite3` and `grep` provides broad access to local databases and the entire codebase. This access could be leveraged to read sensitive information if an attacker influences the command generation process.
- [PROMPT_INJECTION]: The skill processes a user-provided `<symptom>` argument without boundary markers or sanitization. This input directly influences the reproduction commands executed in Phase 1, creating a surface for indirect prompt injection where a malicious symptom description could trigger unintended system actions.
  - Ingestion points: The `<symptom>` argument (SKILL.md).
  - Boundary markers: None present; the agent is instructed to run the command directly.
  - Capability inventory: Subprocess execution for shell commands and test suites; file-system access via `grep` and code modification (SKILL.md).
  - Sanitization: No validation or escaping of the symptom text or generated commands is performed.
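To make the missing boundary markers and command validation concrete, the following is a minimal sketch of what such guards might look like. It is not taken from the skill or from the audit: the function names `wrap_symptom` and `is_command_allowed`, the `UNTRUSTED_SYMPTOM` marker strings, and the contents of the allowlist are all hypothetical and chosen only for illustration.

```python
import html
import shlex

# Hypothetical allowlist for illustration; a real deployment would choose its own.
ALLOWED_PROGRAMS = {"npm", "grep", "sqlite3"}

# Characters that let one command line chain, substitute, or redirect into another.
SHELL_METACHARACTERS = set(";&|`$><\n")


def wrap_symptom(symptom: str) -> str:
    """Fence untrusted symptom text inside explicit boundary markers."""
    # Escape &, < and > so the text cannot imitate or close the markers.
    escaped = html.escape(symptom, quote=False)
    return f"<<<UNTRUSTED_SYMPTOM\n{escaped}\nUNTRUSTED_SYMPTOM>>>"


def is_command_allowed(command: str) -> bool:
    """Return True only for one allowlisted program with no shell operators."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: refuse rather than guess at the intent.
        return False
    return bool(argv) and argv[0] in ALLOWED_PROGRAMS
```

Under these assumptions, `is_command_allowed("npm test")` passes while `is_command_allowed("npm test; rm -rf /")` is rejected. A check like this only narrows the attack surface: an allowlisted `npm test` still runs whatever scripts `package.json` defines, and `sqlite3` can still read local data, so the REMOTE_CODE_EXECUTION and DATA_EXFILTRATION findings would need separate controls.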
Audit Metadata