adversarial-resilience

Pass

Audited by Gen Agent Trust Hub on Apr 12, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill is purely defensive and educational in nature, offering code templates to protect AI agents from malicious manipulation.
  • [SAFE]: Provides an Input Sanitizer implementation that uses regular expressions to detect and filter common prompt injection patterns, such as system prompt overrides and role-play instructions.
  • [SAFE]: Includes an Instruction Anchoring pattern that demonstrates how to use XML delimiters and immutable rules to maintain agent identity and prevent data from being misinterpreted as instructions.
  • [SAFE]: Features a Secret Scanner designed to identify and redact sensitive information like API keys, tokens, and PII from agent outputs.
  • [SAFE]: Implements a Permission Boundary check to restrict file system access and block dangerous shell commands like 'rm -rf' or piped remote script executions.
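
The findings above describe four defensive templates. The following is a minimal Python sketch of what such templates might look like; it is not the audited skill's actual code. Every name, regular expression, and tag here (`INJECTION_PATTERNS`, `SECRET_PATTERNS`, `DANGEROUS_COMMANDS`, the `sk-` key format, the `<rules>` and `<untrusted_data>` delimiters) is an assumption chosen for illustration, and the patterns are far from exhaustive.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration; not the audited skill's rule set.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+(your\s+)?system\s+prompt", re.IGNORECASE),
]
SECRET_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),  # assumed key format
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}
DANGEROUS_COMMANDS = [
    # rm with both -r and -f flags in either order
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"),
    # remote script piped straight into a shell
    re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
]


def sanitize_input(text: str) -> str:
    """Input Sanitizer: replace known injection phrases with a marker."""
    for pattern in INJECTION_PATTERNS:
        text = pattern.sub("[FILTERED]", text)
    return text


def anchor_prompt(rules: str, untrusted: str) -> str:
    """Instruction Anchoring: keep rules and data in separate XML sections.

    Angle brackets in the data are escaped so it cannot close its own tag.
    """
    safe = untrusted.replace("<", "&lt;").replace(">", "&gt;")
    return f"<rules>\n{rules}\n</rules>\n<untrusted_data>\n{safe}\n</untrusted_data>"


def redact_secrets(text: str) -> str:
    """Secret Scanner: redact anything matching a known secret pattern."""
    for name, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


def is_path_allowed(path: str, root: str) -> bool:
    """Permission Boundary: allow only paths that resolve inside root."""
    root_dir = Path(root).resolve()
    return (root_dir / path).resolve().is_relative_to(root_dir)  # Python 3.9+


def is_command_allowed(command: str) -> bool:
    """Permission Boundary: reject commands matching a dangerous pattern."""
    return not any(p.search(command) for p in DANGEROUS_COMMANDS)
```

Regex deny-lists like these are easy to bypass with rephrasing or encoding, so a sketch of this kind is best treated as one layer among several rather than a complete defense.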
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 12, 2026, 07:03 PM