executing-test-plans
Pass
Audited by Gen Agent Trust Hub on Mar 21, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill executes system commands such as
docker psduring the pre-flight phase to verify environment health and service status. - [PROMPT_INJECTION]: The skill exhibits a significant surface for indirect prompt injection because its operational logic is derived from an external file (
plan-file) provided at runtime. A malicious file could contain instructions that lead the agent to perform unauthorized actions. - Ingestion points: the
plan-fileprovided as a primary argument to the skill. - Boundary markers: the instructions lack delimiters or specific warnings to distinguish between test plan data and agent directives.
- Capability inventory: the agent is configured with access to powerful tools including
docker,curl,mysql, andaws sqs. - Sanitization: there is no evidence of validation or sanitization of the test plan content before the agent executes the steps.
Audit Metadata