anti-distill

Pass

Audited by Gen Agent Trust Hub on May 15, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: Indirect Prompt Injection vulnerability surface detected.
  • Ingestion points: The skill reads untrusted external content from user-provided files, PDF documents, and images using the Read tool in SKILL.md (Step 1).
  • Boundary markers: The instructions do not specify the use of delimiters (such as XML tags or triple backticks) or specific 'ignore' instructions for the data being processed, which increases the risk that malicious instructions inside a document could hijack the agent's behavior.
  • Capability inventory: The skill has access to powerful tools including Write, Edit, and Bash (Step 5), which are used to generate and modify files on the local filesystem.
  • Sanitization: No sanitization or validation of the input content is performed before it is processed by the LLM for classification and rewriting.
  • [COMMAND_EXECUTION]: The skill instructs the agent to execute shell commands (mkdir -p {output_dir}_cleaned) using variables derived from user-provided paths. This pattern of constructing command lines from external input can lead to command injection if paths are maliciously crafted.
Audit Metadata
Risk Level
SAFE
Analyzed
May 15, 2026, 10:45 AM
Security Audit — agent-trust-hub — anti-distill