ultraresearch

Fail

Audited by Gen Agent Trust Hub on Jun 23, 2026

Risk Level: HIGHPROMPT_INJECTIONREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
  • [PROMPT_INJECTION]: The skill contains explicit instructions to override and supersede system-level constraints and safety guidelines. The 'Authority while active' section directs the agent to ignore exploration-bounding instructions in surrounding prompts or rules.
  • [UNVERIFIABLE_DEPENDENCIES_AND_REMOTE_CODE_EXECUTION]: The skill implements a 'Phase 3' verification process that involves generating and executing code scripts based on research findings. This allows for the execution of arbitrary code synthesized from untrusted external data (web content). It also utilizes 'git clone' to download external repositories for analysis.
  • [DATA_EXPOSURE_AND_EXFILTRATION]: The skill exhibits a high-risk pattern of reading local codebase information while simultaneously having the capability to perform network operations (web search, page fetching). This combination creates a significant surface for accidental or malicious data exposure.
  • [INDIRECT_PROMPT_INJECTION]: The skill has a large attack surface for indirect prompt injection. Ingestion points include content fetched from arbitrary web pages and GitHub repositories. While it uses '## EXPAND' and '## CLAIMS' markers, it lacks robust isolation for the untrusted content itself. The skill possesses powerful capabilities including sub-agent spawning and 'uv run' for code execution, but lacks any mention of sanitizing external data.
  • [DYNAMIC_EXECUTION]: The skill generates and executes scripts at runtime to settle contested claims or create reports (Phase 3 and Phase 5), utilizing runtimes like python, bun, and compilers.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Jun 23, 2026, 11:38 AM
Security Audit — agent-trust-hub — ultraresearch