Deep Research with Codebase

A relentless, resumable research skill that answers "how does this codebase do X" by walking an explicit tree of sub-questions and grounding every finding in tagged file:line evidence. Read-only by invariant. Output is a structured REPORT/ directory, not a single file.

Core rules

Tree-first. Every research sub-question lives at a specific position in a branch tree maintained in STATE.md.
Breadth-first traversal. Resolve all sibling branches at the current depth before descending. Re-read the BFS queue before each new investigation.
Codebase-first toolbelt. Investigate with Grep/Read/Glob/Bash. Resolve symbols LSP-first and run structural searches with ast-grep, falling back to grep in both cases. Treat tests as ground truth for behavioural claims: a behavioural finding without a covering test is tagged inferred and a coverage gap is reported. Spawn Agent with subagent_type: Explore for breadth-heavy lookups (preset-bounded). External docs (Context7/Exa) are opt-in per package and never auto-fetched. See TOOLBELT.md.
Mandatory 5-item checklist per finding. Every finding must resolve entry points, call sites, covering tests, failure / error paths, and configuration touch points. Each item is marked resolved, gap, or waived. Missing items spawn sub-branches. Per-repo extras may be appended via .research-checklist.md at the repo root.
Tiered confidence + source-typed citations. Each finding carries a confidence tier: definitive / likely / inferred. Each citation carries one of 8 tags: definition / call-site / test / type / config / doc / comment / git. definitive requires at least one definition citation plus at least one corroborating test or call-site citation. likely means a definition alone or a test alone. inferred means comment / git / grep evidence only.
In-loop contradiction check. Before writing a new finding, compare it against findings in the same branch lineage. Flag and resolve contradictions in place; a final pairwise pass runs in the audit suite.
Saturation skip. If a target (file region or symbol) has already been cited in a sibling or ancestor finding, don't re-explore it unless a new dimension is being investigated. Mark the branch [~] (saturated).
Preset-driven control. Every new topic asks the user for a preset (quick / standard / exhaustive); there is no implicit default. Overrides live in config.yaml and are re-read at every checkpoint. A mid-run dial change prompts the user: "retroactive or forward-only?".
Ask only when the codebase is silent. Use AskUserQuestion only when (a) the code can't answer, (b) two plausible interpretations split the research, or (c) scope is genuinely ambiguous.
Persist before asking. Update STATE.md and FINDINGS.md before any AskUserQuestion call.
Audit before finalize. The full audit suite (citation re-verify + coverage sweep + contradiction scan + checklist completeness) runs automatically before finalization. Failures spawn fix branches, which the user must approve. See AUDIT-SUITE.md.
Cite, always. Every finding carries path/to/file.ext:lineno references with a source-type tag. Conclusions without tagged citations are invalid.
Never modify code. This is a pure read-only audit; the only writes go inside .plans/research/<topic>/.
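
The tiered-confidence rule above can be read as a small decision function over a finding's citation tags. The following is only a sketch under stated assumptions: the function name confidence is hypothetical, a definition combined with weaker tags (for example doc) is treated as likely, and any tag mix the rules do not spell out falls back to inferred. The separate rule that a behavioural finding without a covering test is tagged inferred is not modelled here.

```python
from typing import Iterable

# The eight source-type tags named in the rules.
TAGS = {"definition", "call-site", "test", "type", "config", "doc", "comment", "git"}


def confidence(tags: Iterable[str]) -> str:
    """Map the citation tags of one finding to a confidence tier (hypothetical helper)."""
    present = set(tags)
    unknown = present - TAGS
    if unknown:
        raise ValueError(f"unknown citation tags: {sorted(unknown)}")
    if not present:
        # "Conclusions without tagged citations are invalid."
        raise ValueError("a finding needs at least one tagged citation")

    has_definition = "definition" in present
    has_test = "test" in present
    has_call_site = "call-site" in present

    # definitive: at least one definition plus a corroborating test or call-site.
    if has_definition and (has_test or has_call_site):
        return "definitive"
    # likely: a definition without corroboration, or a test without a definition.
    if has_definition or has_test:
        return "likely"
    # Assumption: everything else (comment, git, and unspecified mixes) is inferred.
    return "inferred"
```

For example, confidence(["definition", "call-site"]) yields "definitive", while confidence(["comment", "git"]) yields "inferred".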
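
The 5-item checklist rule can be pictured as a per-finding status map in which every unresolved item becomes a follow-up question. This is a minimal sketch, not the skill's actual data model: the Finding class, the short item keys, and the sub-branch naming scheme are all hypothetical, and the sketch assumes that only items with no status yet spawn sub-branches.

```python
from dataclasses import dataclass, field

# Hypothetical short keys for the five mandatory checklist items.
CHECKLIST_ITEMS = ("entry_points", "call_sites", "tests", "error_paths", "config")
STATUSES = {"resolved", "gap", "waived"}


@dataclass
class Finding:
    question: str
    checklist: dict[str, str] = field(default_factory=dict)  # item -> status

    def mark(self, item: str, status: str) -> None:
        """Record a status for one checklist item, rejecting unknown statuses."""
        if status not in STATUSES:
            raise ValueError(f"status must be one of {sorted(STATUSES)}")
        self.checklist[item] = status


def missing_items(finding: Finding, extras: tuple[str, ...] = ()) -> list[str]:
    """Items (mandatory plus per-repo extras) that have no status yet."""
    required = CHECKLIST_ITEMS + tuple(extras)
    return [item for item in required if item not in finding.checklist]


def spawn_sub_branches(finding: Finding, extras: tuple[str, ...] = ()) -> list[str]:
    """Turn each missing item into a sub-question (naming scheme is an assumption)."""
    return [f"{finding.question} :: {item}" for item in missing_items(finding, extras)]
```

The extras parameter stands in for whatever .research-checklist.md appends; how that file is parsed is left out of the sketch.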
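
Breadth-first traversal and the saturation skip interact, so a combined sketch may help. Everything here is an assumption made for illustration: the Branch class, the "[ ]" and "[x]" markers, and the simplification that any previously cited target counts as saturated (the rule itself only names sibling and ancestor findings). The sketch also assumes the children of a saturated branch are not explored.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Branch:
    question: str
    targets: set[str] = field(default_factory=set)    # file regions or symbols to explore
    children: list["Branch"] = field(default_factory=list)
    new_dimension: bool = False                        # True if it asks something new about a cited target
    status: str = "[ ]"                                # hypothetical "open" marker


def traverse(root: Branch) -> list[str]:
    """Visit branches level by level, skipping saturated ones; return the visit order."""
    queue = deque([root])
    cited: set[str] = set()
    order: list[str] = []
    while queue:
        branch = queue.popleft()  # FIFO order resolves all siblings before descending
        saturated = bool(branch.targets) and branch.targets <= cited
        if saturated and not branch.new_dimension:
            branch.status = "[~]"  # saturated: every target was already cited
            continue
        order.append(branch.question)
        cited |= branch.targets
        branch.status = "[x]"      # hypothetical "done" marker
        queue.extend(branch.children)
    return order
```

A FIFO queue is what makes the walk breadth-first: children are appended behind any remaining siblings, so a deeper level is reached only after the current one is exhausted.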
