deep-research

Fail

Audited by Gen Agent Trust Hub on Apr 25, 2026

Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/run-evaluation-loop.mjs utilizes spawnSync to execute local Node.js scripts. The execution path is configurable via the --companion-script command-line argument, allowing an attacker or a compromised agent to execute arbitrary local files as Node.js code.
  • [DATA_EXFILTRATION]: The skill performs file write operations in scripts/run-evaluation-loop.mjs using paths constructed from user-controlled inputs artifact_dir and slug. The absence of path validation or sanitization enables potential Path Traversal attacks, allowing for the creation or overwriting of files in sensitive system locations outside the intended directory.
  • [PROMPT_INJECTION]: The skill's core function is to fetch and synthesize data from the open web using tools like WebSearch and WebFetch (referenced in references/claude-companion.md) without implementing boundary markers or content sanitization. This creates a significant surface for indirect prompt injection, where malicious instructions embedded in research sources could manipulate agent behavior, especially given the skill's file-system and command-execution capabilities.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 25, 2026, 07:10 AM
Security Audit — agent-trust-hub — deep-research