Self-Eval: Honest Work Evaluation

ultrathink

Tier: STANDARD
Category: Engineering / Quality
Dependencies: None (prompt-only, no external tools required)

Description

Self-eval is a Claude Code skill that produces honest, calibrated work evaluations. It replaces the default AI tendency to rate everything 4/5 with a structured two-axis scoring system, mandatory devil's advocate reasoning, and cross-session anti-inflation detection.

The core insight: AI self-assessment converges to "everything is a 4" because a single-axis score conflates task difficulty with execution quality. Self-eval separates these axes, then combines them via a fixed matrix that the model cannot override.
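As a rough illustration of the fixed-matrix idea, the sketch below rates the two axes separately and reads the final score from a lookup table. The axis labels come from the description; the cell values, the 1-5 scale mapping, and the names `SCORE_MATRIX` and `final_score` are assumptions made for this example, not the skill's actual matrix.

```python
from enum import Enum


class Ambition(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


class Execution(Enum):
    POOR = "Poor"
    ADEQUATE = "Adequate"
    STRONG = "Strong"


# Hypothetical cell values for illustration only; the skill's real matrix
# is not given in this description.
SCORE_MATRIX = {
    (Ambition.LOW, Execution.POOR): 1,
    (Ambition.LOW, Execution.ADEQUATE): 2,
    (Ambition.LOW, Execution.STRONG): 3,
    (Ambition.MEDIUM, Execution.POOR): 2,
    (Ambition.MEDIUM, Execution.ADEQUATE): 3,
    (Ambition.MEDIUM, Execution.STRONG): 4,
    (Ambition.HIGH, Execution.POOR): 2,
    (Ambition.HIGH, Execution.ADEQUATE): 4,
    (Ambition.HIGH, Execution.STRONG): 5,
}


def final_score(ambition: Ambition, execution: Execution) -> int:
    """Look up the combined score; there is no code path that adjusts it."""
    return SCORE_MATRIX[(ambition, execution)]


print(final_score(Ambition.LOW, Execution.STRONG))
```

Because the score is only ever read from the table, strong execution on a low-ambition task cannot drift up to the same score as strong execution on a high-ambition one.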

Features

Two-axis scoring: Independently rates task ambition (Low/Medium/High) and execution quality (Poor/Adequate/Strong), then combines the two ratings via a lookup matrix
Mandatory devil's advocate: Before finalizing a score, the model must argue for both a higher AND a lower score, then resolve the tension
Score persistence: Appends each score to .self-eval-scores.jsonl in the working directory, building a history across sessions
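Score persistence could look roughly like the sketch below, which appends one JSON object per line to .self-eval-scores.jsonl and reads the history back. The record fields (`timestamp`, `ambition`, `execution`, `score`) and the helper names are hypothetical, and `looks_inflated` is only one naive heuristic for spotting a run of identical scores; the description does not specify how the skill's anti-inflation detection actually works.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# File name taken from the feature description; everything else is assumed.
SCORES_FILE = Path(".self-eval-scores.jsonl")


def append_score(score, ambition, execution, path=SCORES_FILE):
    """Append one evaluation as a single JSON line (hypothetical schema)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "ambition": ambition,
        "execution": execution,
        "score": score,
    }
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_scores(path=SCORES_FILE):
    """Return all stored records, or an empty list if no history exists yet."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def looks_inflated(history, window=5):
    """Naive check (an assumption): the last `window` scores are all identical."""
    recent = [r["score"] for r in history[-window:]]
    return len(recent) == window and len(set(recent)) == 1
```

An append-only JSONL file keeps each session's write independent of earlier ones, so history accumulates without rewriting the file.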
