pr-test

Fail

Audited by Gen Agent Trust Hub on May 6, 2026

Risk Level: HIGHCOMMAND_EXECUTIONCREDENTIALS_UNSAFEDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses docker, docker-compose, git, and gh to manage environments and execute tests. This includes running commands inside containers via docker exec, providing significant control over the local system and infrastructure.
  • [CREDENTIALS_UNSAFE]: The skill automatically extracts Claude OAuth tokens from the host's secure storage (macOS Keychain, ~/.claude/.credentials.json, or Windows AppData) and injects them into environment files. It also instructs the agent to copy .env files containing sensitive secrets between worktrees and contains hardcoded test user credentials (test@test.com / testtest123).
  • [DATA_EXFILTRATION]: Credential harvesting occurs by moving host-level tokens into a Docker environment. Additionally, the skill captures and uploads UI screenshots to a GitHub branch, which could inadvertently expose sensitive data displayed in the interface during execution.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection by consuming untrusted PR descriptions and logs to generate test scenarios. An attacker could craft a PR description to manipulate the agent into executing unauthorized commands given its high Docker and GitHub permissions.
  • Ingestion points: gh pr view (PR body) and git logs
  • Boundary markers: None
  • Capability inventory: docker exec, gh api, and git
  • Sanitization: None
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
May 6, 2026, 03:29 PM