Full Empirical Analysis — Classical Python Workflow

This skill is the canonical 8-step pipeline an applied economist runs on every empirical paper, written in the traditional Python ecosystem — no opinionated one-stop wrapper. Every step calls libraries directly (pandas, numpy, scipy, statsmodels, linearmodels, pyfixest, rdrobust, econml, causalml, matplotlib, seaborn), so the agent — or the user reading the agent's code — has full visibility and can swap any component.

Companion skill: if the user prefers a single-import agent-native DSL (import statspai as sp), route to 00-StatsPAI_skill instead. This skill is the opposite philosophy: everything explicit, everything inspectable, every diagnostic run by hand, every plot shaped by the user.

Philosophy

Traditional stack, no magic. Agents should be able to read every line and know exactly which library / estimator / standard error family is at work.
Full pipeline, not just estimation. On a real paper, roughly 80% of the time goes to Steps 1–4 and 6–8 rather than to the estimator in Step 5. This skill treats those steps as first-class, not as afterthoughts.
Rich outputs. Every step produces at least one table or figure — never a single point estimate in isolation.
Progressive disclosure. SKILL.md gives the canonical call at each step; references/ holds variant-specific depth (dozens of tests, estimator-specific diagnostics, plot recipes).
Reproducible. Every code block runs once the dependencies are installed (pip install -r requirements.txt) and the data are loaded (df = pd.read_csv(...)).
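As an illustration of the "no magic" and "rich outputs" principles above, here is a minimal sketch, not the skill's canonical code. It fits one OLS specification with statsmodels under three explicitly named standard-error families and collects the results in a small table. The data are synthetic, and the names firm_id, treat, x, and y are hypothetical placeholders for whatever df = pd.read_csv(...) would supply.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel for illustration only (hypothetical columns: firm_id, treat, x, y).
rng = np.random.default_rng(0)
n_firms, n_periods = 50, 8
df = pd.DataFrame({
    "firm_id": np.repeat(np.arange(n_firms), n_periods),
    "x": rng.normal(size=n_firms * n_periods),
})
df["treat"] = (df["firm_id"] % 2 == 0).astype(int)           # treatment varies by firm
firm_shock = rng.normal(size=n_firms)[df["firm_id"].to_numpy()]  # within-firm correlation
df["y"] = 1.0 + 0.5 * df["treat"] + 0.3 * df["x"] + firm_shock + rng.normal(size=len(df))

# One specification, written out explicitly.
spec = smf.ols("y ~ treat + x", data=df)

# Same point estimates, three explicitly named standard-error families.
fits = {
    "classical": spec.fit(),
    "HC1": spec.fit(cov_type="HC1"),
    "cluster(firm_id)": spec.fit(
        cov_type="cluster", cov_kwds={"groups": df["firm_id"].to_numpy()}
    ),
}

# A table rather than a single number: one row per standard-error family.
table = pd.DataFrame({
    name: {
        "coef_treat": res.params["treat"],
        "se_treat": res.bse["treat"],
        "nobs": res.nobs,
    }
    for name, res in fits.items()
}).T
print(table.round(3))
```

The coefficient on treat is identical across rows by construction; only the standard error changes. Because treat varies at the firm level and the errors share a firm-level shock, one would expect the clustered standard error to be the largest here, which is the kind of comparison a reader should be able to see directly rather than take on trust from a wrapper.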

Three domain modes (default = AER econ; alternates = epi & ML-causal)

The default playbook above is AER-style applied econometrics, following the AEA convention: written-out estimating equation, identifying assumption, design horse-race, full robustness gauntlet. The skill also ships two parallel sub-pipelines for the other two big causal-inference traditions, epidemiology and ML-causal. Each reuses the same Steps 1–4 (cleaning / construction / descriptives / diagnostics) and Step 8 (tables & figures). Only Step 5 (estimator) and Steps 6 and 7 (robustness / mechanism) swap libraries.
