breach
Pass
Audited by Gen Agent Trust Hub on Apr 25, 2026
Risk Level: SAFE
PROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill includes reference material in 'references/ai-red-teaming.md' containing explicit prompt injection and jailbreak payloads (e.g., 'Ignore all previous instructions', 'You are DAN'). These strings are provided as inert test templates for the agent to use in security assessments and do not constitute instructions for the agent to bypass its own safety protocols.
- [DATA_EXFILTRATION]: Security playbooks in 'references/attack-playbooks.md' describe various exfiltration techniques such as C2 channels and DNS tunneling. These are included as reference points for the agent to identify and test for exfiltration risks in target systems, not as functional capabilities for the skill itself to perform exfiltration.
- [PROMPT_INJECTION]: The skill exposes an indirect prompt injection surface because it is designed to ingest and process security findings from partner agents (Sentinel, Probe, Canon) to generate reports and threat models.
  - Ingestion points: Security findings, compliance gaps, and architecture notes received via structured YAML handoffs (defined in 'references/handoffs.md').
  - Boundary markers: The agent communicates through structured YAML schemas, which maintain a logical separation between untrusted data and system instructions.
  - Capability inventory: The skill's scope is strictly limited to designing scenarios, planning tests, and generating documentation; it is explicitly forbidden from executing exploits, generating destructive payloads, or writing implementation code ('design tests, do not run code').
  - Sanitization: No specific input filtering is described. However, the skill's output is intended for advisory reporting, and the lack of autonomous execution capabilities limits the potential impact of poisoned inputs.
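
Because the audit notes that no input filtering is described, the following is a minimal sketch of what treating a partner handoff as untrusted data could look like. It is not part of the audited skill. It assumes the YAML handoff has already been parsed into a plain dictionary, and the field names (`source`, `kind`, `title`, `detail`), the length limit, and the boundary marker strings are all hypothetical; the real schema is defined in 'references/handoffs.md'.

```python
from dataclasses import dataclass

# Hypothetical field names and limits; the real schema lives in
# 'references/handoffs.md' and may differ.
ALLOWED_FIELDS = {"source", "kind", "title", "detail"}
KNOWN_SOURCES = {"Sentinel", "Probe", "Canon"}  # partner agents named in the audit
MAX_TEXT_LEN = 2000


@dataclass(frozen=True)
class Finding:
    source: str
    kind: str
    title: str
    detail: str


def parse_finding(raw: dict) -> Finding:
    """Validate an already-parsed handoff record before it is used anywhere."""
    unknown = set(raw) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    missing = ALLOWED_FIELDS - set(raw)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for key, value in raw.items():
        if not isinstance(value, str):
            raise ValueError(f"field {key!r} must be a string")
        if len(value) > MAX_TEXT_LEN:
            raise ValueError(f"field {key!r} exceeds {MAX_TEXT_LEN} characters")
    if raw["source"] not in KNOWN_SOURCES:
        raise ValueError(f"unknown source agent: {raw['source']!r}")
    return Finding(**raw)


def _escape(text: str) -> str:
    # Neutralize angle brackets so the text cannot forge a boundary marker.
    return text.replace("<", "&lt;").replace(">", "&gt;")


def render_as_data(finding: Finding) -> str:
    """Wrap the finding in explicit markers so it reads as data, not instructions."""
    return (
        f"<<<UNTRUSTED_FINDING source={finding.source}>>>\n"
        f"title: {_escape(finding.title)}\n"
        f"detail: {_escape(finding.detail)}\n"
        "<<<END_UNTRUSTED_FINDING>>>"
    )
```

Rejecting unknown fields and escaping marker characters means a finding whose text contains an instruction-like string or a forged end marker is rendered as quoted data only. Whether this is sufficient depends on how the consuming agent handles the wrapped text.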
Audit Metadata