ai-safety-researcher
Warn
Audited by Snyk on Apr 21, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.70). The skill's red-team workflow (see §9.2 "Red-Team Evaluation — Jailbreak Attack Suite Design" in SKILL.md / references/9-2...) explicitly instructs sourcing base prompts from HarmBench and augmenting with domain-specific/public prompts, meaning the agent ingests untrusted third-party prompt content that will be read and used to drive evaluations and defenses, which can enable indirect prompt injection.
Issues (1)
W011
MEDIUMThird-party content exposure detected (indirect prompt injection risk).
Audit Metadata