AGENTS.md / CLAUDE.md Evaluator

Evaluate whether the rules in an instruction file actually change model behavior, identify non-discriminating rules (ones the model already follows by default), and optimize the file for maximum impact per token.

Core Concept

Most AGENTS.md/CLAUDE.md files contain rules the model already follows without being told. These waste context tokens in every conversation. This skill runs controlled A/B tests (the same task with the instruction file and without it) to identify which rules earn their place and which can be cut.
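The A/B comparison can be sketched as a small classifier over paired results. This is a minimal illustration, not the skill's actual implementation: the `AssertionResult` type, the `classify` function, and the label names other than "non-discriminating" are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class AssertionResult:
    """Outcome of one assertion in both arms of the A/B test (hypothetical type)."""
    name: str
    passed_with_file: bool     # run with the instruction file in context
    passed_without_file: bool  # baseline run with the file removed


def classify(result: AssertionResult) -> str:
    """Label an assertion by comparing the two runs."""
    if result.passed_with_file and not result.passed_without_file:
        return "discriminating"      # the rule changed behavior for the better
    if result.passed_with_file and result.passed_without_file:
        return "non-discriminating"  # the model does this by default
    if not result.passed_with_file and result.passed_without_file:
        return "harmful"             # the file made the outcome worse
    return "ineffective"             # fails either way; the rule is not working


# Illustrative data only, not real eval output.
results = [
    AssertionResult("validates input with Zod", True, True),
    AssertionResult("updates both parallel components", True, False),
]
for r in results:
    print(f"{r.name}: {classify(r)}")
```

Only assertions labeled "discriminating" justify the tokens their rules cost. A single run per arm is noisy, so a real evaluation would likely repeat each arm several times before trusting a label.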

The Codebase-Teaches-Patterns Effect

Well-structured codebases make most instruction rules redundant. The model explores existing code — reads package.json, scans existing components, follows import patterns — and matches conventions automatically. In empirical testing on a 755-line CLAUDE.md across a monorepo with 3 frontend stacks, 25 of 26 assertions passed identically with or without the instruction file. The model discovered React patterns, shadcn/ui conventions, Recharts usage, Hebrew labels, Firestore helpers, Zod validation, and serverTimestamp — all from existing code.

The only assertion that discriminated was pure domain knowledge the codebase couldn't teach: parallel components in separate directories that must always be changed together.

This suggests that your CLAUDE.md is probably 80-95% redundant. The eval process reveals exactly which rules survive.
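To put a number on redundancy for a specific file, one option is to weight each rule by its approximate token cost. The sketch below is an assumption-laden illustration: `redundant_share` is a hypothetical helper, whitespace-separated words stand in for a real tokenizer, and the two rule texts are invented examples.

```python
def approx_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (assumption, not a real tokenizer)."""
    return len(text.split())


def redundant_share(rules: list[tuple[str, bool]]) -> float:
    """Fraction of the file's tokens spent on non-discriminating rules.

    Each entry is (rule_text, is_discriminating), where the flag comes
    from an A/B comparison of runs with and without the file.
    """
    total = sum(approx_tokens(text) for text, _ in rules)
    if total == 0:
        return 0.0
    wasted = sum(approx_tokens(text) for text, keep in rules if not keep)
    return wasted / total


# Hypothetical rules; the flags would come from the eval results.
rules = [
    ("Use Zod for validation", False),                        # model already does this
    ("Always change the parallel components together", True),  # domain knowledge
]
print(f"{redundant_share(rules):.0%} of tokens are redundant")

# Share of non-discriminating assertions in the test described above.
print(f"{25 / 26:.1%}")  # prints 96.2%
```

Weighting by tokens rather than by rule count matters because a single long redundant section can cost more context than many short rules combined.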

The Evaluation Loop

Read the instruction file and categorize each rule
