Self-Eval: Honest Work Evaluation

Tier: STANDARD
Category: Engineering / Quality
Dependencies: None (prompt-only, no external tools required)

Description

Self-eval is a Claude Code skill that produces honest, calibrated work evaluations. It replaces the default AI tendency to rate everything 4/5 with a structured two-axis scoring system, mandatory devil's advocate reasoning, and cross-session anti-inflation detection.

The core insight: AI self-assessment converges to "everything is a 4" because a single-axis score conflates task difficulty with execution quality. Self-eval separates these two axes, then combines them via a fixed lookup matrix that the model cannot override.
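A fixed two-axis lookup could be sketched as follows. This is a minimal illustration, not the skill's actual matrix: the axis labels come from the feature list below, but the specific combined values are assumptions.

```python
# Hypothetical combined-score matrix: (ambition, execution) -> 1-5 score.
# The values here are illustrative; the skill defines its own fixed mapping.
SCORE_MATRIX = {
    ("Low", "Poor"): 1,    ("Low", "Adequate"): 2,    ("Low", "Strong"): 3,
    ("Medium", "Poor"): 2, ("Medium", "Adequate"): 3, ("Medium", "Strong"): 4,
    ("High", "Poor"): 2,   ("High", "Adequate"): 4,   ("High", "Strong"): 5,
}

def combined_score(ambition: str, execution: str) -> int:
    """Look up the combined score from the fixed matrix.

    Because the matrix is a static table rather than a judgment call,
    the model cannot nudge every result toward a 4.
    """
    return SCORE_MATRIX[(ambition, execution)]
```

Note how ambitious-but-poorly-executed work and modest-but-adequate work can land on the same combined score, which a single axis cannot express.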

Features

  • Two-axis scoring — Independently rates task ambition (Low/Medium/High) and execution quality (Poor/Adequate/Strong), then combines via a lookup matrix
  • Mandatory devil's advocate — Before finalizing, must argue for both higher AND lower scores, then resolve the tension
  • Score persistence — Appends scores to .self-eval-scores.jsonl in the working directory, building history across sessions
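The persistence step described above amounts to appending one JSON object per line to `.self-eval-scores.jsonl`. A minimal sketch, assuming hypothetical field names (`ts`, `task`, `ambition`, `execution`, `score`) that the skill may define differently:

```python
import datetime
import json

def append_score(record: dict, path: str = ".self-eval-scores.jsonl") -> None:
    """Append one evaluation record as a single JSON line.

    Appending (mode "a") rather than rewriting the file lets history
    accumulate across sessions, which is what enables cross-session
    anti-inflation checks against earlier scores.
    """
    entry = {"ts": datetime.datetime.now(datetime.timezone.utc).isoformat(), **record}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```

Reading the history back is symmetric: parse each line with `json.loads` and compare new scores against the stored distribution.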