test-first-bugs

Pass

Audited by Gen Agent Trust Hub on May 11, 2026

Risk Level: SAFECOMMAND_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill utilizes common shell utilities (find, grep, ls) to perform project reconnaissance and identify existing test suites.
  • [COMMAND_EXECUTION]: The skill executes language-specific test runners such as pytest, npm test, and go test to confirm bug reproduction and verify fixes. This is the intended and appropriate use of tools for the skill's stated purpose.
  • [PROMPT_INJECTION]: The skill identifies an indirect prompt injection surface where user-supplied bug descriptions are processed into test files and subagent task prompts.
  • Ingestion points: User bug reports are ingested in Phase 1 (SKILL.md).
  • Boundary markers: Absent; user content is interpolated into test files and subagent instructions without explicit delimiters.
  • Capability inventory: The agent has capabilities for file system writes (creating test files), shell command execution (running tests), and task delegation (Task tool).
  • Sanitization: Absent; the instructions do not explicitly direct the agent to sanitize user-provided bug descriptions before including them in commands or subagent prompts.
Audit Metadata
Risk Level
SAFE
Analyzed
May 11, 2026, 01:24 AM