pr-test
Fail
Audited by Gen Agent Trust Hub on May 6, 2026
Risk Level: HIGHCOMMAND_EXECUTIONCREDENTIALS_UNSAFEDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill uses docker, docker-compose, git, and gh to manage environments and execute tests. This includes running commands inside containers via docker exec, providing significant control over the local system and infrastructure.
- [CREDENTIALS_UNSAFE]: The skill automatically extracts Claude OAuth tokens from the host's secure storage (macOS Keychain, ~/.claude/.credentials.json, or Windows AppData) and injects them into environment files. It also instructs the agent to copy .env files containing sensitive secrets between worktrees and contains hardcoded test user credentials (test@test.com / testtest123).
- [DATA_EXFILTRATION]: Credential harvesting occurs by moving host-level tokens into a Docker environment. Additionally, the skill captures and uploads UI screenshots to a GitHub branch, which could inadvertently expose sensitive data displayed in the interface during execution.
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection by consuming untrusted PR descriptions and logs to generate test scenarios. An attacker could craft a PR description to manipulate the agent into executing unauthorized commands given its high Docker and GitHub permissions.
- Ingestion points: gh pr view (PR body) and git logs
- Boundary markers: None
- Capability inventory: docker exec, gh api, and git
- Sanitization: None
Recommendations
- AI detected serious security threats
Audit Metadata