paper-autoraters
Pass
Audited by Gen Agent Trust Hub on Apr 14, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection as it processes untrusted research paper drafts and interpolates them directly into LLM scoring prompts.
- Ingestion points: Untrusted content is ingested via the paper_text placeholder in citation-f1-prompt.md and full paper content in litreview-quality-prompt.md, sxs-litreview-prompt.md, and sxs-paper-quality-prompt.md.
- Boundary markers: The prompts use headers like 'Paper Text:' and 'References List:' but do not employ robust delimiters or specific instructions for the agent to disregard instructions found within the input data.
- Capability inventory: The skill performs file system reads and writes and executes a local Python script (scripts/compute_f1.py) to process data.
- Sanitization: There is no evidence of escaping or validating the paper content before it is processed by the model.
Audit Metadata