experiment-audit

Pass

Audited by Gen Agent Trust Hub on May 18, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTIONDATA_EXFILTRATION
Full Analysis
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection (Category 8) because it ingests untrusted project data for analysis.
  • Ingestion points: SKILL.md Step 1 (Scans for evaluation scripts, result logs, configuration files, and narrative reports across the project directory).
  • Boundary markers: Absent. The instructions do not define delimiters or provide specific 'ignore' directives to the reviewer model regarding the content of the files being read.
  • Capability inventory: Bash(*), Read, Write, Edit, Grep, Glob, Agent, mcp__codex__codex, mcp__codex__codex-reply.
  • Sanitization: Absent. File contents are processed directly by the external model without filtering.
  • [COMMAND_EXECUTION]: The skill requests broad shell access via Bash(*). While intended for directory scanning and file management, this permission allows for arbitrary command execution on the host environment.
  • [DATA_EXFILTRATION]: The workflow involves sending file paths and prompting an external reviewer backend (mcp__codex__codex) to read file contents. This results in the transfer of project metadata and implementation logic to an external service provider.
  • [METADATA_POISONING]: The skill description and constants reference non-existent model versions (GPT-5.4, GPT-5.5), which is deceptive regarding the actual underlying capabilities of the reviewer tool.
Audit Metadata
Risk Level
SAFE
Analyzed
May 18, 2026, 06:06 PM
Security Audit — agent-trust-hub — experiment-audit