office-hours
Fail
Audited by Gen Agent Trust Hub on May 2, 2026
Risk Level: HIGH
COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill constructs shell commands using unsanitized user input. In Phase 1, it uses user-provided 'key terms' directly in a `grep` command (`grep -rl "[key terms from user's idea]"`). In Phase 7.5, it assembles session context into a file and executes it via `codex exec`. If a user provides input containing shell metacharacters (e.g., backticks, semicolons, or command substitutions), it could result in arbitrary command execution on the host system.
- [DATA_EXFILTRATION]: The 'Cross-Model Second Opinion' feature (Phase 7.5) extracts session context, including the problem statement, Q&A summaries, and diagnostic findings, and sends it to an external utility (`codex`). This represents a transfer of potentially sensitive user data to an external service/model.
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection. During Phase 1 ('Related Design Discovery'), the agent searches for and reads existing design documents (`docs/plans/*/design.md`). If these files contain malicious instructions planted by an attacker, those instructions would be loaded into the agent's active context and could influence its subsequent behavior.
- [PROMPT_INJECTION]: The skill uses extremely forceful, 'non-negotiable' instructions to override the agent's default behavior and persona (e.g., 'Non-negotiable throughout the entire session', 'Be direct to the point of discomfort', 'Never say...'). While functional for a product diagnostic tool, these patterns are characteristic of prompt injection techniques designed to bypass standard agent constraints.
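
The COMMAND_EXECUTION finding can be illustrated with a minimal sketch of one common mitigation: passing user input to `grep` as a single argument-vector element instead of interpolating it into a shell string. This is not the skill's actual code, which the audit does not show. The function names `build_grep_argv` and `find_related_docs` are hypothetical, and the `docs/plans` search root is assumed from the path mentioned in the findings.

```python
import subprocess


def build_grep_argv(key_terms: str, search_root: str) -> list[str]:
    """Build an argv list for grep (hypothetical helper, not from the skill).

    No shell parses this list, so backticks, semicolons, and $(...) in
    key_terms are treated as literal text to search for.
    """
    if not key_terms.strip():
        raise ValueError("key_terms must not be empty")
    # -r: recurse, -l: print matching file names, -F: fixed string (no regex)
    # -e: the next element is the pattern, even if it starts with "-"
    # --: end of options, so search_root cannot be read as a flag
    return ["grep", "-rlF", "-e", key_terms, "--", search_root]


def find_related_docs(key_terms: str, search_root: str = "docs/plans") -> list[str]:
    """Run grep without a shell and return the matching file paths."""
    result = subprocess.run(
        build_grep_argv(key_terms, search_root),
        capture_output=True,
        text=True,
        check=False,  # grep exits with 1 when nothing matches; that is not an error
    )
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

With this shape, an input such as `foo; rm -rf /` reaches `grep` as one literal pattern rather than as two shell commands. The same idea would apply to the Phase 7.5 `codex exec` call, although how that command is assembled is not described in enough detail here to sketch it.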
Recommendations
- AI detected serious security threats
Audit Metadata