content-refinement-agent
Pass
Audited by Gen Agent Trust Hub on Apr 14, 2026
Risk Level: SAFE
Findings: PROMPT_INJECTION, COMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection because it incorporates untrusted data from LaTeX manuscripts and reviewer feedback into its LLM prompts.
- Ingestion points: The skill reads `workspace/drafts/paper.tex` and `workspace/refinement/iter<N>/review.json` (documented in SKILL.md and references/prompt.md).
- Boundary markers: The refinement prompt uses simple textual labels like 'paper.tex:' and 'reviewer_feedback:' without robust delimiters such as XML tags or unique sequence markers to isolate data from instructions.
- Capability inventory: The skill executes shell commands via `latexmk` and runs multiple Python scripts for workflow management.
- Sanitization: No programmatic sanitization or validation of the LaTeX content is performed; the skill relies on the LLM's adherence to safety instructions and basic post-hoc checks such as grepping for the word 'limitation'.
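The boundary-marker weakness noted above is typically mitigated by wrapping untrusted file contents in unguessable delimiters before interpolating them into the prompt, so injected text cannot forge the label. A minimal sketch, assuming nothing about the skill's actual prompt-building code (the function name and labels are illustrative):

```python
import secrets

def wrap_untrusted(label: str, data: str) -> str:
    """Wrap untrusted data in a per-call, unguessable boundary tag so
    content that mimics a plain label like 'reviewer_feedback:' cannot
    escape the data region."""
    tag = f"{label}-{secrets.token_hex(8)}"  # unpredictable per call
    return f"<{tag}>\n{data}\n</{tag}>"

fragment = wrap_untrusted("reviewer_feedback", '{"verdict": "minor revisions"}')
```

The tag is regenerated on every call, so an attacker who sees one prompt cannot pre-compute a closing delimiter for the next.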
- [COMMAND_EXECUTION]: The skill performs shell command execution based on the workflow state and processed data.
- Evidence: The refinement loop in `SKILL.md` executes `latexmk -pdf -interaction=nonstopmode paper.tex`. LaTeX compilation can serve as a vector for arbitrary command execution if the environment allows macros like `\write18` to be processed from LLM-modified source code.
- Evidence: The skill uses `python -c` one-liners in `SKILL.md` to parse JSON results and store them in shell variables (e.g., `CONSECUTIVE_SMALL`), which are then used in subsequent loop iterations.
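Since `\write18` is the concrete escalation path named above, one programmatic check the skill currently lacks is a pre-compilation scan of the LLM-modified source. A hedged sketch; the macro blocklist is illustrative and not exhaustive, and a blocklist alone is bypassable, so it should be paired with disabling shell escape at compile time (e.g. the `-no-shell-escape` option, where the installed latexmk/engine supports it):

```python
import re

# Macros commonly abused for command execution or file I/O.
# Illustrative blocklist only; extend or replace with an allowlist.
DANGEROUS_MACROS = re.compile(r"\\(write18|ShellEscape|openout|openin)\b")

def latex_looks_safe(source: str) -> bool:
    """Return False if the source references a blocklisted macro."""
    return DANGEROUS_MACROS.search(source) is None
```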
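The `python -c` one-liners that feed shell variables are another place where malformed or adversarial JSON can propagate into loop control. A fail-closed parser is one alternative; the `delta` field name and the file layout here are assumptions for illustration, not the skill's documented schema:

```python
import json

def read_delta(path: str) -> float:
    """Read an iter<N>/review.json-style result and return a numeric delta,
    failing closed to 0.0 on missing files or malformed content instead of
    letting arbitrary text reach shell variables like CONSECUTIVE_SMALL."""
    try:
        with open(path) as f:
            return float(json.load(f).get("delta", 0.0))
    except (OSError, ValueError, TypeError, AttributeError):
        return 0.0
```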
Audit Metadata