zr-inbox
Fail
Audited by Gen Agent Trust Hub on May 10, 2026
Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- [COMMAND_EXECUTION]: Step 1 of the skill instructions in
SKILL.mdexplicitly directs the agent to "Run the provided context command first." As this command is supplied as part of theprobe nextinput context, it allows for arbitrary command execution if the input data is untrusted or can be influenced by a malicious actor. - [PROMPT_INJECTION]: The skill processes and acts upon instructions extracted from inbox messages and directives (Category 8: Indirect Prompt Injection). It lacks any boundary markers or instructions to treat message content as untrusted. Step 4 ("Execute only the requested action that fits this wake") directly facilitates the execution of attacker-controlled instructions embedded in these external sources.
- Ingestion points:
probe message listandprobe message directivesinSKILL.md. - Boundary markers: Absent. The skill provides no delimiters for message content.
- Capability inventory:
probe message sendand the general instruction to execute extracted actions. - Sanitization: Absent. No evidence of validation, escaping, or filtering for message content or the context command.
- [DATA_EXFILTRATION]: The skill's primary workflow involves reading direct messages and sending replies to channels. This pattern can be exploited via indirect prompt injection to exfiltrate sensitive data from the agent's environment or private inbox to a channel controlled by an attacker.
Recommendations
- AI detected serious security threats
Audit Metadata