prompt-repetition

Summary

A prompt repetition technique reported to improve lightweight model accuracy by 67% across benchmarks.

Auto-applies to claude-haiku, gemini-flash, and gpt-4o-mini; uses 2× repetition for general tasks and 3× for position-based queries
Mitigates causal attention limitations by reprocessing the entire prompt, strengthening attention weights on key concepts without architectural changes
Skips automatically when Chain-of-Thought patterns are detected; prevents duplicate application via markers
Doubles input tokens with minimal latency impact (prefill is parallelized); the stated improvement in cost-per-correct-answer is 5%
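
The gating rules listed above (model check, Chain-of-Thought skip, duplicate marker, 2× vs. 3×) could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the marker string, the regular expressions, and the function names `repetition_count` and `apply_repetition` are all assumptions made for this example.

```python
import re

# Hypothetical marker; the skill's real marker string is not shown in this summary.
REPETITION_MARKER = "<!-- prompt-repetition-applied -->"

# Model-name fragments taken from the summary above.
LIGHTWEIGHT_MODELS = ("claude-haiku", "gemini-flash", "gpt-4o-mini")

# Assumed heuristic for spotting Chain-of-Thought prompts.
COT_PATTERNS = re.compile(
    r"step[- ]by[- ]step|chain[- ]of[- ]thought|think through", re.IGNORECASE
)

# Assumed heuristic for position-based queries ("25th item", "last entry", ...).
POSITION_PATTERNS = re.compile(
    r"\b(\d+(st|nd|rd|th)|first|last|index|position)\b", re.IGNORECASE
)


def repetition_count(model: str, prompt: str) -> int:
    """Return how many copies of the prompt to send (1 means leave it unchanged)."""
    if not any(name in model.lower() for name in LIGHTWEIGHT_MODELS):
        return 1  # only lightweight models are targeted
    if REPETITION_MARKER in prompt:
        return 1  # already repeated once; do not repeat again
    if COT_PATTERNS.search(prompt):
        return 1  # skip when the prompt asks for step-by-step reasoning
    return 3 if POSITION_PATTERNS.search(prompt) else 2


def apply_repetition(model: str, prompt: str) -> str:
    """Prefix the marker and repeat the prompt when the rules above allow it."""
    count = repetition_count(model, prompt)
    if count == 1:
        return prompt
    return REPETITION_MARKER + "\n" + "\n\n".join([prompt] * count)
```

Calling `apply_repetition` on its own output returns the text unchanged, because the marker is found on the second call.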

SKILL.md

Prompt Repetition

LLMs are trained as Causal Language Models, where each token attends only to previous tokens. This leads to:

Context-Question Problem: The question is not yet known while the model processes the context
Options-First MCQ Problem: Answer choices that appear before the question are processed without knowing what is being asked
Position/Index Problem: Attention weights weaken for specific position information in long lists

Prompt repetition lets the second copy of the prompt attend to the entire first copy, so every token in the second copy can see the full context and question. This mimics some benefits of bidirectional attention.
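
As a rough illustration of the idea, the sketch below repeats a context-then-question prompt and shows that the second copy of the context comes after the first copy of the question, which is what lets a causal model read the context with the question already in view. The helper name `repeat_prompt`, the separator, and the sample text are assumptions for this example only.

```python
def repeat_prompt(prompt: str, times: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt concatenated with itself `times` times."""
    if times < 1:
        raise ValueError("times must be at least 1")
    return separator.join([prompt.strip()] * times)


# Sample prompt in the problematic order: context first, question last.
context = "Alice has a red key. Bob has a blue key."
question = "Who has the blue key?"
prompt = f"{context}\n{question}"

repeated = repeat_prompt(prompt)

# Character offsets: where the first question ends and the second context begins.
first_question_end = repeated.index(question) + len(question)
second_context_start = repeated.index(context, first_question_end)
```

Because `second_context_start` falls after `first_question_end`, the tokens of the second context copy sit later in the sequence than the question and can attend to it under causal masking.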

When using lightweight models: claude-haiku, gemini-flash, gpt-4o-mini, etc.
Options-First MCQ: Multiple choice where answer choices appear before the question

Installs

10.5K

First Seen

Jan 24, 2026
