Full Empirical Analysis — Classical R Workflow

This skill is the canonical 8-step pipeline that an applied economist runs on every empirical paper, written in the modern tidyverse + econometrics R ecosystem: dplyr/tidyr/haven for data; fixest as the panel/IV/DID workhorse; did/bacondecomp/HonestDiD for modern DID; rdrobust/rddensity for RD; Synth/gsynth/synthdid for synthetic control; MatchIt/WeightIt/cobalt/ebal for matching; grf/DoubleML for ML causal; mediation for causal mediation; marginaleffects for post-estimation; modelsummary/kableExtra/gt for publication tables; and ggplot2, fixest's iplot(), and binsreg for figures.
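Before running the pipeline, it can help to check which of these packages are available locally. The snippet below is a minimal base-R sketch, not part of the skill itself: the vector name `pkgs` and the reporting message are illustrative assumptions, and the package names are taken from the list above.

```r
# Package names copied from the ecosystem list above (iplot() ships with fixest).
pkgs <- c(
  "dplyr", "tidyr", "haven", "fixest",
  "did", "bacondecomp", "HonestDiD",
  "rdrobust", "rddensity",
  "Synth", "gsynth", "synthdid",
  "MatchIt", "WeightIt", "cobalt", "ebal",
  "grf", "DoubleML", "mediation", "marginaleffects",
  "modelsummary", "kableExtra", "gt",
  "ggplot2", "binsreg"
)

# requireNamespace() returns TRUE/FALSE without attaching the package.
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!available]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are installed.")
}
```

The check only reports; it does not install anything, so the choice of installation source for each package stays with the user.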

Companion skills: this is the R sibling of 00-StatsPAI_skill (Python DSL), 00.1-Full-empirical-analysis-skill (explicit Python), and 00.2-Full-empirical-analysis-skill_Stata (Stata .do). All four implement the same 8 steps, in their respective ecosystems.

Philosophy

Tidyverse + fixest, the modern R idioms. Write feols(... | unit + year, cluster = ~unit), not the Frankenstein-style lm(y ~ x + factor(unit) + factor(year)).
Reproducible scripts / Quarto. Every example below is paste-runnable. renv for package locking; Quarto (.qmd) for combined narrative + code + tables/figures.
8 steps, first-class. R users historically over-invest in Step 5; this skill treats Steps 1–4 and 6–8 as core.
Rich outputs. Every step yields at least one table or figure, exported as tex, docx, png, or pdf.
Progressive disclosure. SKILL.md gives the canonical call per step; references/ holds variant-specific depth.
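To illustrate the first principle, the sketch below fits the same two-way fixed-effects model both ways on a simulated panel. The data, the variable names (`panel`, `unit`, `year`, `x`, `y`), and the true coefficient of 2 are all assumptions made up for this example; only the `feols()` and `lm()` call shapes come from the text above.

```r
library(fixest)

# Simulated balanced panel (hypothetical data, for illustration only).
set.seed(42)
n_units <- 50
n_years <- 10
panel <- expand.grid(unit = 1:n_units, year = 1:n_years)

unit_fe <- rnorm(n_units)   # unit-specific intercepts
year_fe <- rnorm(n_years)   # year-specific shocks

# x is correlated with the unit effect, so omitting unit FEs would bias the slope.
panel$x <- rnorm(nrow(panel)) + unit_fe[panel$unit]
panel$y <- 2 * panel$x + unit_fe[panel$unit] + year_fe[panel$year] +
  rnorm(nrow(panel))

# fixest idiom: fixed effects after the bar, standard errors clustered by unit.
fit_feols <- feols(y ~ x | unit + year, data = panel, cluster = ~unit)

# Dummy-variable equivalent: same point estimate, one coefficient per unit and year.
fit_lm <- lm(y ~ x + factor(unit) + factor(year), data = panel)

coef(fit_feols)["x"]
coef(fit_lm)["x"]
```

The two slope estimates should agree up to numerical tolerance. The practical differences are that `feols()` absorbs the fixed effects rather than estimating a coefficient for each dummy, and that it reports cluster-robust standard errors directly, whereas the `lm()` fit would need a separate variance correction.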

Three domain modes (default = AER econ; alternates = epi & ML-causal)

The default playbook above is AER-style applied econometrics, following the AEA convention: a written-out estimating equation, an explicit identifying assumption, a design horse-race, and a full robustness gauntlet. The skill also ships two parallel sub-pipelines for the other two major causal-inference traditions, epidemiology and ML-causal. Each reuses the same Steps 1–4 (cleaning, construction, Table 1, diagnostics) and Step 8 (tables and figures); only Step 5 (the estimator) and Steps 6–7 swap packages:
