anth-policy-guardrails

Pass

Audited by Gen Agent Trust Hub on May 19, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill implements best-practice security patterns for AI integrations: regex-based input validation for PII (Social Security numbers and credit card numbers) and output filtering for sensitive credentials such as API keys and private keys.
  • [SAFE]: The system prompt template explicitly includes instructions to resist system prompt extraction and ignore override attempts.
  • [SAFE]: All external references are to official and well-known documentation from Anthropic.
  • [SAFE]: No suspicious command execution, network exfiltration, or obfuscation patterns were detected.
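The input validation and output filtering named in the first finding could look roughly like the sketch below. The report does not reproduce the skill's code, so everything here is assumed for illustration: the regexes, the `sk-` key prefix, and the names `find_pii` and `redact_secrets` are hypothetical. The patterns are deliberately simple and would need tightening (for example, a Luhn check for card numbers) before real use.

```python
import re

# Hypothetical patterns: the audited skill's actual regexes are not shown in the report.
PII_PATTERNS = {
    # US Social Security number in the common 3-2-4 hyphenated form.
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}

SECRET_PATTERNS = [
    # Assumed API key shape: an "sk-" prefix followed by 20+ key characters.
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),
    # PEM-encoded private key block, spanning multiple lines.
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
]


def find_pii(text: str) -> list[str]:
    """Return the names of the PII patterns that match the user input."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]


def redact_secrets(text: str) -> str:
    """Replace anything that looks like a credential in model output."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

In a setup like this, a caller would typically reject or mask a request when `find_pii` returns a non-empty list, and pass every model response through `redact_secrets` before returning it.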
Audit Metadata
Risk Level: SAFE
Analyzed: May 19, 2026, 11:05 PM