The Agent Skills Directory

[PROMPT_INJECTION]: The skill contains explicit defensive instructions to identify and report prompt injection attempts. The static detector flag for 'ignore previous instructions' is a false positive, as the phrase is used within a hardening block to guide the agent's response to malicious input. The skill also addresses indirect injection risks:
Ingestion points: Untrusted data is ingested via identity metadata (role names, group descriptions, policy tags) during the audit process (SKILL.md).
Boundary markers: Clear boundaries are established in 'Injection Hardening' and 'Prompt Injection Safety Notice' sections.
Capability inventory: The skill is limited to read-only file system operations using 'Read', 'Grep', and 'Glob' tools.
Sanitization: Instructions mandate that the agent must flag and report adversarial content rather than executing embedded directives.
[DATA_EXFILTRATION]: A strict security boundary is defined that explicitly prohibits the exfiltration of credentials, user lists, or entitlement data discovered during the review process.

access-review