research-refine

Pass

Audited by Gen Agent Trust Hub on Apr 19, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is susceptible to Indirect Prompt Injection (Category 8) due to its core workflow of ingesting untrusted content from external sources.
  • Ingestion points: In Phase 1.1, the skill uses Read on local files in papers/ and literature/, and WebFetch to retrieve online research papers.
  • Boundary markers: While boundary markers are used when sending the proposal to the reviewer model in Phase 2, they are absent during the initial grounding phase where the agent reads external paper content.
  • Capability inventory: The skill has access to powerful tools including Bash(*), Write, Edit, and Agent, which could be exploited if malicious instructions are embedded in scanned literature.
  • Sanitization: There is no evidence of sanitization or explicit 'ignore embedded instructions' warnings provided to the agent before it processes external documents.
  • [DATA_EXFILTRATION]: The skill transmits the research proposal and problem description to an external model (GPT-5.4) via the mcp__codex__codex tool. This is the intended behavior for the skill's 'Reviewer' functionality, but it involves sending potentially sensitive intellectual property to a third-party service provider.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 19, 2026, 03:14 AM
Security Audit — agent-trust-hub — research-refine