access-review

Pass

Audited by Gen Agent Trust Hub on May 8, 2026

Risk Level: SAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill contains explicit defensive instructions to identify and report prompt injection attempts. The static detector flag for 'ignore previous instructions' is a false positive, as the phrase is used within a hardening block to guide the agent's response to malicious input. The skill also addresses indirect injection risks:
  • Ingestion points: Untrusted data is ingested via identity metadata (role names, group descriptions, policy tags) during the audit process (SKILL.md).
  • Boundary markers: Clear boundaries are established in 'Injection Hardening' and 'Prompt Injection Safety Notice' sections.
  • Capability inventory: The skill is limited to read-only file system operations using 'Read', 'Grep', and 'Glob' tools.
  • Sanitization: Instructions mandate that the agent must flag and report adversarial content rather than executing embedded directives.
  • [DATA_EXFILTRATION]: A strict security boundary is defined that explicitly prohibits the exfiltration of credentials, user lists, or entitlement data discovered during the review process.
Audit Metadata
Risk Level
SAFE
Analyzed
May 8, 2026, 12:28 AM
Security Audit — agent-trust-hub — access-review