content-refinement-agent

Pass

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it incorporates untrusted data from LaTeX manuscripts and reviewer feedback into its LLM prompts.
  • Ingestion points: The skill reads workspace/drafts/paper.tex and workspace/refinement/iter<N>/review.json (documented in SKILL.md and references/prompt.md).
  • Boundary markers: The refinement prompt uses simple textual labels like 'paper.tex:' and 'reviewer_feedback:' without robust delimiters such as XML tags or unique sequence markers to isolate data from instructions.
  • Capability inventory: The skill executes shell commands via latexmk and runs multiple Python scripts for workflow management.
  • Sanitization: No programmatic sanitization or validation of the LaTeX content is performed; the skill relies on the LLM's adherence to safety instructions and basic post-hoc checks like grepping for the word 'limitation'.
  • [COMMAND_EXECUTION]: The skill performs shell command execution based on the workflow state and processed data.
  • Evidence: The refinement loop in SKILL.md executes latexmk -pdf -interaction=nonstopmode paper.tex. LaTeX compilation can serve as a vector for arbitrary command execution if the environment allows macros like \write18 to be processed from LLM-modified source code.
  • Evidence: The skill uses python -c one-liners in SKILL.md to parse JSON results and store them in shell variables (e.g., CONSECUTIVE_SMALL), which are then used in subsequent loop iterations.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 14, 2026, 02:00 PM