agent-evals
Pass
Audited by Gen Agent Trust Hub on Jun 13, 2026
Risk Level: SAFE (flagged categories: PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
- [PROMPT_INJECTION]: The skill processes untrusted workspace data (code, rule files, prompts), which creates a surface for indirect prompt injection.
  - Ingestion points: The Step 1 Workspace Audit (SKILL.md) and the autonomous improvement loop (autonomous-improve-loop.mjs) read local workspace file contents into the agent's context.
  - Boundary markers: SKILL.md requires human review checkpoints and held-out evaluation sets, which reduce the risk that malicious instructions influence the optimization loop.
  - Capability inventory: Scaffolds such as autonomous-improve-loop.mjs execute shell commands via spawnSync and apply code patches with git apply. The level-3-sandbox-harness.py template executes arbitrary shell commands inside Docker containers via subprocess.run.
  - Sanitization: The autonomous-improve-loop.mjs template includes a redact function that filters credentials and secrets out of data sent to external optimizers.
- [COMMAND_EXECUTION]: The testing infrastructure and the provided templates run shell commands for environment setup and isolated execution.
  - evals/phase2-grader.py uses subprocess.check_call and os.execv to bootstrap a Python virtual environment and install dependencies during testing.
  - references/templates/level-3-sandbox-harness.py uses subprocess.run to coordinate agent tasks inside Docker containers, which sandboxes high-risk operations.
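
Two of the mitigations named in the findings above, secret redaction before data leaves the machine and running commands inside a container through subprocess.run, can be sketched roughly as follows. This is a minimal illustration and not the code of the audited templates: the function names redact, build_sandbox_argv and run_in_sandbox, the regex patterns, and the mount layout are all assumptions made for the example.

```python
from __future__ import annotations

import re
import subprocess

# Hypothetical patterns for illustration; a real redactor needs a broader, reviewed list.
KEY_VALUE_SECRET = re.compile(r"(?i)(api[_-]?key|token|secret|password)(\s*[:=]\s*)\S+")
AWS_KEY_ID_SHAPE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def redact(text: str) -> str:
    """Replace secret-looking values with a placeholder, keeping the key name."""
    text = KEY_VALUE_SECRET.sub(r"\1\2[REDACTED]", text)
    return AWS_KEY_ID_SHAPE.sub("[REDACTED]", text)


def build_sandbox_argv(image: str, command: list[str], workdir: str) -> list[str]:
    """Build a `docker run` argument list that limits what the command can reach."""
    return [
        "docker", "run", "--rm",
        "--network", "none",          # no network access inside the container
        "--read-only",                # container root filesystem is read-only
        "-v", f"{workdir}:/work:ro",  # workspace mounted read-only (assumed layout)
        "-w", "/work",
        image, *command,
    ]


def run_in_sandbox(image: str, command: list[str], workdir: str,
                   timeout: int = 60) -> subprocess.CompletedProcess:
    """Run the command in a container; requires a local Docker installation."""
    argv = build_sandbox_argv(image, command, workdir)
    # Passing a list with the default shell=False means no host shell parses the command.
    return subprocess.run(argv, capture_output=True, text=True,
                          timeout=timeout, check=False)
```

Keeping the argument-list construction separate from the call to subprocess.run makes the isolation flags easy to review and to unit-test without Docker installed. Pattern-based redaction is best-effort and can miss secrets that do not match a known shape, so it complements the human review checkpoints rather than replacing them.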
Audit Metadata