experiment-audit

Pass

Audited by Gen Agent Trust Hub on May 13, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests untrusted data from the local project environment and passes it to an LLM for evaluation without sufficient isolation.
  • Ingestion points: The skill (in SKILL.md Step 1) scans the project directory for a wide variety of files including evaluation scripts (eval.py), result files (*.json, .csv), and narrative reports (.tex, *.md).
  • Boundary markers: The prompt provided to the reviewer LLM (via mcp__codex__codex) does not use strong delimiters or instructions to ignore embedded commands within the content of the files being read.
  • Capability inventory: The skill has powerful tools enabled, including Bash(*), Write, and Edit, which could be abused if the reviewer LLM is manipulated by instructions hidden inside the project files.
  • Sanitization: There is no evidence of sanitization or filtering of the file content before it is processed by the LLM.
  • [COMMAND_EXECUTION]: The skill instructions in the "Review Tracing" section recommend the execution of an external script tools/save_trace.sh. This pattern involves executing code that is not contained within the skill itself and whose provenance and safety cannot be verified by the skill's static definition.
Audit Metadata
Risk Level
SAFE
Analyzed
May 13, 2026, 02:00 PM