deep-research

Pass

Audited by Gen Agent Trust Hub on May 13, 2026

Risk Level: SAFE
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes a bundled Python script (scripts/check_saturation.py) to evaluate the progress and depth of research tasks. Technical review of the script confirms it only performs benign text-based heuristic checks, such as counting unique domains and matching keywords against a specification, without any dangerous system calls.
  • [EXTERNAL_DOWNLOADS]: The skill is designed to perform extensive web exploration via the agent-browser tool. This involves downloading and processing content from external websites, which is the primary intended functionality for a deep research tool.
  • [PROMPT_INJECTION]: The skill exposes an inherent surface for indirect prompt injection because it processes untrusted web data and stores it in knowledge_fragments.md files for later synthesis.
    • Ingestion points: External data enters the agent context through web search results and site extractions saved in knowledge_fragments.md files within task-specific directories.
    • Boundary markers: The instructions define no delimiters and no "ignore previous instructions" warnings for the agent to apply when it reads or synthesizes these fragments.
    • Capability inventory: The primary risk is limited to the synthesis phase, where the agent compiles research; the saturation script used for task management does not execute fragment content.
    • Sanitization: There is no evidence that data retrieved from the web is filtered or sanitized before it is written to the workspace.
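To make the boundary-marker and sanitization findings concrete, the following is a minimal sketch of how web content could be delimited before it is written to a fragments file. It is not part of the audited skill: the marker strings and the `wrap_fragment` helper are hypothetical names chosen for illustration, and delimiters of this kind reduce, but do not eliminate, the risk of indirect prompt injection.

```python
import re

# Hypothetical delimiters; the audited skill defines none.
FRAGMENT_OPEN = "<<<UNTRUSTED_WEB_CONTENT source={source}>>>"
FRAGMENT_CLOSE = "<<<END_UNTRUSTED_WEB_CONTENT>>>"

# Matches either marker so look-alikes inside the content can be neutralized.
_MARKER_RE = re.compile(
    r"<<<\s*(?:END_)?UNTRUSTED_WEB_CONTENT[^>]*>>>", re.IGNORECASE
)


def wrap_fragment(text: str, source: str) -> str:
    """Wrap untrusted web text in boundary markers (illustrative helper)."""
    # Replace embedded markers so the content cannot close the block early.
    cleaned = _MARKER_RE.sub("[marker removed]", text)
    # Collapse whitespace and angle brackets so the source stays on one line.
    safe_source = re.sub(r"[\s<>]+", " ", source).strip()
    header = FRAGMENT_OPEN.format(source=safe_source)
    return f"{header}\n{cleaned}\n{FRAGMENT_CLOSE}\n"
```

In use, the synthesis instructions would also need to tell the agent to treat anything between the two markers as data rather than as instructions; the wrapper alone only makes that boundary visible.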
Audit Metadata
Risk Level: SAFE
Analyzed: May 13, 2026, 02:21 PM