paper-autoraters

Pass

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection as it processes untrusted research paper drafts and interpolates them directly into LLM scoring prompts.
  • Ingestion points: Untrusted content is ingested via the paper_text placeholder in citation-f1-prompt.md and full paper content in litreview-quality-prompt.md, sxs-litreview-prompt.md, and sxs-paper-quality-prompt.md.
  • Boundary markers: The prompts use headers like 'Paper Text:' and 'References List:' but do not employ robust delimiters or specific instructions for the agent to disregard instructions found within the input data.
  • Capability inventory: The skill performs file system reads and writes and executes a local Python script (scripts/compute_f1.py) to process data.
  • Sanitization: There is no evidence of escaping or validating the paper content before it is processed by the model.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 14, 2026, 02:00 PM