do-execute-bugfix

Warn

Audited by Gen Agent Trust Hub on Apr 6, 2026

Risk Level: MEDIUMPROMPT_INJECTIONCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
  • [PROMPT_INJECTION]: The skill contains explicit instructions to override safety and control mechanisms by mandating autonomous execution.
  • Evidence: 'CRITICAL: NEVER pause, stop, or wait for user input during execution. Proceed through ALL steps autonomously without asking the user to "continue"...'. This instruction is reinforced in Step 2: 'DO NOT stop here. DO NOT present the plan and wait for approval. Proceed IMMEDIATELY to Step 3.'
  • [COMMAND_EXECUTION]: The skill identifies and executes scripts defined in the project's configuration files at runtime.
  • Evidence: Step 3 and Step 6 describe detecting the package manager (bun, pnpm, npm) and running scripts like typecheck and test (e.g., npm test) found in package.json.
  • [DATA_EXFILTRATION]: The skill has broad access to the local filesystem and integrates with external MCP tools (browser testing, API testing), which could be chained to exfiltrate data if manipulated.
  • Evidence: It reads sensitive configuration files like .mcp.json and project documentation, and uses browser/API testing MCPs that can reach external networks.
  • [INDIRECT_PROMPT_INJECTION]: The skill is highly vulnerable to indirect prompt injection as it processes untrusted data files to drive its high-privilege execution flow.
  • Ingestion points: Reads bug reports from ./pbis/pbi-[feature-slug]/bugs.md and requirements from pbi.md and techspec.md (Step 1).
  • Boundary markers: Absent. There are no instructions to treat the content of these files as data rather than instructions.
  • Capability inventory: File system read/write (Edit/Write tools), shell command execution (npm test), and access to various MCP servers (database, cache, browser).
  • Sanitization: Absent. The agent does not validate or sanitize the 'root cause' or 'fix strategy' derived from the input files before implementing them.
  • [DYNAMIC_EXECUTION]: The skill dynamically modifies the codebase and then executes that code via the test runner.
  • Evidence: Step 3 (Implement Fixes) and Step 4 (Create Regression Tests) involve writing new code/tests and immediately executing them in Step 6.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 6, 2026, 12:42 AM
Security Audit — agent-trust-hub — do-execute-bugfix