stress-test

Pass

Audited by Gen Agent Trust Hub on Mar 20, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill uses 'npm' to install dependencies required for Proof of Concept (POC) execution in Phase 5. These downloads are performed from the official npm registry.
  • [COMMAND_EXECUTION]: The skill executes shell commands to create, run, and eventually delete POC tests within a temporary directory ('.poc-stress-test/'). Key operations include directory creation, running Node.js scripts, and cleanup. All code execution is gated by a mandatory user approval step via the 'AskUserQuestion' tool in Phase 4.
  • [PROMPT_INJECTION]: The skill processes technical plans and external documentation, which introduces a surface for indirect prompt injection. \n
  • Ingestion points: Technical plans retrieved from the conversation context and external content fetched via the 'WebFetch' and 'WebSearch' tools. \n
  • Boundary markers: The skill does not explicitly use delimiters or specialized instructions to distinguish between its core logic and the data being analyzed. \n
  • Capability inventory: The skill has the ability to execute shell commands and run JavaScript code via the 'Bash' tool. \n
  • Sanitization: No explicit sanitization or filtering of external data is performed before it is used to generate POC specifications. \n
  • Mitigation: The risk is mitigated by the design of Phase 4, which requires the agent to present the POC specification to the user and obtain explicit consent before any commands are executed.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 20, 2026, 12:50 PM