ship-and-babysit

Pass

Audited by Gen Agent Trust Hub on May 17, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill makes extensive use of local shell commands to perform its workflow. It utilizes git for version control, gh for GitHub interaction, and package managers like pnpm and cargo for code validation and testing.
  • [PROMPT_INJECTION]: The skill implements an indirect prompt injection surface (Category 8) by design. It fetches and acts upon external text data which could contain malicious instructions designed to manipulate the agent's behavior.
  • Ingestion points: Untrusted data enters the agent context through gh pr checks, gh api (fetching PR and issue comments), and GraphQL queries for review threads as defined in the 'Babysit Loop' section of SKILL.md.
  • Boundary markers: The skill instructions do not specify any delimiters or safety constraints to distinguish between descriptive feedback and executable instructions within the fetched comments.
  • Capability inventory: The skill possesses significant capabilities, including filesystem modification, git commit/push operations, and execution of build/test scripts via pnpm and cargo.
  • Sanitization: No sanitization or verification logic is present to filter malicious content from the pull request comments before the agent attempts to apply fixes based on them.
Audit Metadata
Risk Level
SAFE
Analyzed
May 17, 2026, 11:18 AM
Security Audit — agent-trust-hub — ship-and-babysit