paper-autoraters


Paper Autoraters (App. F.3)

Faithful implementation of the four LLM-as-judge autoraters used in PaperOrchestra (Song et al., 2026, arXiv:2604.05018, §5 and App. F.3).

These are the metrics the paper uses to show that PaperOrchestra outperforms the single-agent and AI-Scientist-v2 baselines. Use them to:

  1. Score a generated paper against a ground-truth paper.
  2. Compare two paper-writing pipelines side-by-side.
  3. Validate your own host-agent execution of the paper-orchestra pipeline.
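
The side-by-side comparison in item 2 can be sketched as a paired win/loss tally over a shared set of ground-truth papers. This is a minimal illustration, not the skill's actual interface: the `Judge` signature, `compare_pipelines`, `Comparison`, and `toy_judge` are all hypothetical names, and `toy_judge` is a word-overlap placeholder standing in for an LLM call, not one of the four autoraters from App. F.3.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical judge signature (an assumption, not the skill's API):
# (generated_paper_text, ground_truth_paper_text) -> numeric score, higher is better.
Judge = Callable[[str, str], float]


@dataclass
class Comparison:
    """Paired outcome counts for pipeline A versus pipeline B."""
    wins_a: int
    wins_b: int
    ties: int

    @property
    def win_rate_a(self) -> float:
        """Share of papers won by A, counting each tie as half a win."""
        total = self.wins_a + self.wins_b + self.ties
        return (self.wins_a + 0.5 * self.ties) / total if total else 0.0


def compare_pipelines(
    judge: Judge,
    papers_a: Sequence[str],
    papers_b: Sequence[str],
    references: Sequence[str],
) -> Comparison:
    """Score both pipelines' outputs against the same references and tally wins."""
    if not (len(papers_a) == len(papers_b) == len(references)):
        raise ValueError("papers_a, papers_b and references must have equal length")
    wins_a = wins_b = ties = 0
    for paper_a, paper_b, reference in zip(papers_a, papers_b, references):
        score_a = judge(paper_a, reference)
        score_b = judge(paper_b, reference)
        if score_a > score_b:
            wins_a += 1
        elif score_b > score_a:
            wins_b += 1
        else:
            ties += 1
    return Comparison(wins_a, wins_b, ties)


def toy_judge(generated: str, reference: str) -> float:
    """Placeholder judge: Jaccard word overlap. NOT one of the paper's autoraters."""
    gen_words = set(generated.lower().split())
    ref_words = set(reference.lower().split())
    union = gen_words | ref_words
    return len(gen_words & ref_words) / len(union) if union else 0.0
```

Pairing the scores per ground-truth paper, rather than comparing two averages, keeps the comparison robust to papers that are uniformly hard or easy for both pipelines. In real use, `toy_judge` would be swapped for a callable that wraps whichever autorater is being run.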

The four autoraters

First seen: Apr 14, 2026