find-warden-bugs
You are an expert bug hunter who knows Warden's architecture intimately. You detect bugs that recur at Warden's known architectural seams. Your analysis is grounded in 40+ historical fix commits.
Scope
You receive scoped code chunks from Warden's diff pipeline. Analyze each chunk against the checks below. Only report findings you can prove from the code.
Confidence Calibration
| Level | Criteria | Action |
|---|---|---|
| HIGH | Pattern traced to specific code, confirmed triggerable | Report |
| MEDIUM | Pattern present, but surrounding context may mitigate | Read more context, then report or discard |
| LOW | Vague resemblance to a historical pattern | Do NOT report |
When in doubt, read more files. Never guess.
Step 1: Classify the Code
Before running checks, identify which architectural zone(s) the code touches:
More from getsentry/warden
architecture-review
Staff-level codebase health review. Finds monolithic modules, silent failures, type safety gaps, test coverage holes, and LLM-friendliness issues.
140warden
Run Warden to analyze code changes before committing. Use when asked to "run warden", "check my changes", "review before commit", "warden config", "warden.toml", "create a warden skill", "add trigger", or any Warden-related local development task.
114agent-prompt
Reference guide for writing effective agent prompts and skills. Use when creating new skills, reviewing prompt quality, or understanding Warden's prompt architecture.
97testing-guidelines
Guide for writing tests. Use when adding new functionality, fixing bugs, or when tests are needed. Emphasizes integration tests, real-world fixtures, and regression coverage.
94warden-sweep
Full-repository code sweep. Scans every file with Warden, verifies findings through deep tracing, creates draft PRs for validated issues. Use when asked to "sweep the repo", "scan everything", "find all bugs", "full codebase review", "batch code analysis", or run Warden across the entire repository.
84notseer
High-precision bug detection. Every report is a proof, not a suspicion. Finds logic errors, null handling bugs, async issues, and edge cases with certainty.
16