deep-research
Pass
Audited by Gen Agent Trust Hub on May 13, 2026
Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill executes a bundled Python script (
scripts/check_saturation.py) to evaluate the progress and depth of research tasks. Technical review of the script confirms it only performs benign text-based heuristic checks, such as counting unique domains and matching keywords against a specification, without any dangerous system calls. - [EXTERNAL_DOWNLOADS]: The skill is designed to perform extensive web exploration via the
agent-browsertool. This involves downloading and processing content from external websites, which is the primary intended functionality for a deep research tool. - [PROMPT_INJECTION]: The skill exhibits an inherent surface for indirect prompt injection as it processes untrusted data from the web and stores it in
knowledge_fragments.mdfiles for later synthesis. - Ingestion points: External data enters the agent context via web search results and site extractions saved in
knowledge_fragments.mdfiles within task-specific directories. - Boundary markers: The instructions do not define specific delimiters or "ignore previous instructions" warnings when the agent reads or synthesizes these fragments.
- Capability inventory: The primary risk is limited to the synthesis phase where the agent compiles research; the saturation script used for task management does not execute fragment content.
- Sanitization: There is no evidence of content filtering or sanitization of the data retrieved from the web before it is written to the workspace.
Audit Metadata