ai-redacting-data

Pass

Audited by Gen Agent Trust Hub on May 13, 2026

Risk Level: SAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill defines a process for handling untrusted user input that is then analyzed by an LLM, creating a potential surface for indirect prompt injection.
  • Ingestion points: Untrusted text is ingested through the forward methods of the PIIRedactor in SKILL.md and the example modules in examples.md.
  • Boundary markers: The prompts for contextual PII detection do not include explicit delimiters or instructions for the LLM to ignore embedded commands.
  • Capability inventory: The skill lacks hazardous capabilities such as file system writing, subprocess execution, or arbitrary network access.
  • Sanitization: A 'regex-first' strategy is utilized to mask structured PII like emails and SSNs prior to the LLM pass, which acts as a significant mitigation against data exposure.
  • [EXTERNAL_DOWNLOADS]: The documentation provides instructions for installing supplementary skills via the vendor's command-line interface.
  • [SAFE]: The skill architecture follows privacy-by-design principles, including explicit documentation on the trade-offs of using external LLMs and de-identification strategies for HIPAA and GDPR compliance.
Audit Metadata
Risk Level
SAFE
Analyzed
May 13, 2026, 06:46 PM
Security Audit — agent-trust-hub — ai-redacting-data