devils-advocate

Pass

Audited by Gen Agent Trust Hub on May 1, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The 'Council Mode' feature provides a template command (uv run python -m cli_council) to be executed within the local packages/cli-council directory. This command uses temporary files in /tmp/ to manage input and output, which is a standard pattern for local tool orchestration and poses no external security risk.
  • [PROMPT_INJECTION]: The skill is designed to ingest and analyze untrusted research papers and arguments, creating a surface for indirect prompt injection.
  • Ingestion points: The skill reads external paper or argument content in the 'Understand the claim' step of SKILL.md.
  • Boundary markers: No specific delimiters or XML tags are instructed for use when isolating the paper content from the agent's instructions.
  • Capability inventory: The skill possesses command execution capabilities through the uv tool in its Council Mode.
  • Sanitization: No explicit sanitization or filtering logic is defined for the input text.
  • Note: The multi-turn debate protocol (Critic -> Defense -> Adjudication) naturally mitigates unintentional instruction following by requiring all points to be cross-examined and ruled upon.
Audit Metadata
Risk Level
SAFE
Analyzed
May 1, 2026, 06:18 PM
Security Audit — agent-trust-hub — devils-advocate