svy
svy Skill
svy: design-based analysis of complex survey data in Python. Covers survey design specification (strata, PSU, weights, FPC), variance estimation (Taylor linearization, BRR, jackknife, bootstrap), descriptive estimation (means, totals, proportions, ratios, medians), survey-weighted GLM regression (gaussian, binomial, Poisson), domain/subpopulation analysis, calibration, and survey data I/O (SAS, SPSS, Stata). Uses Polars DataFrames natively. Use when analyzing data from complex sample surveys (NHANES, CPS, ACS PUMS, MEPS, ECLS-K, BRFSS, DHS). For non-survey regression, use statsmodels; for fixed effects, use pyfixest; for panel/IV models, use linearmodels.
Comprehensive skill for complex survey data analysis with svy. Use the decision trees below to find the right guidance, then load the detailed references.
What is svy?
svy is a Python package for design-based analysis of complex survey data:
- Survey-aware estimation: Means, totals, proportions, ratios, medians with proper design-based standard errors
- GLM regression: Survey-weighted linear, logistic, and Poisson regression with design-adjusted inference
- Flexible variance estimation: Taylor linearization (default), bootstrap, BRR (including Fay's modification), and jackknife (JK1, JKn) replicate methods
- Domain estimation: Correct subpopulation analysis without pre-filtering (preserves design structure)
- Native Polars: Built on Polars DataFrames, not pandas
- Survey data I/O: Read SAS (.sas7bdat), SPSS (.sav), Stata (.dta), and CSV with metadata
- Calibration: Post-stratification, raking, and GREG calibration for weight adjustment
- Validated: Results numerically equivalent to R's survey package across all methods
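To make "design-based standard errors" and "domain estimation without pre-filtering" concrete, here is a minimal sketch in plain Python of the standard Taylor-linearized variance for a weighted mean. It does not use svy's API: the function name `linearized_mean` and the toy data are hypothetical, and it assumes PSUs sampled with replacement within strata and no FPC.

```python
from collections import defaultdict
from math import sqrt


def linearized_mean(y, w, strata, psu, domain=None):
    """Weighted mean and Taylor-linearized SE.

    Illustrative only (not svy's API). Assumes with-replacement
    sampling of PSUs within strata and ignores any FPC.
    """
    n = len(y)
    # Domain indicator: 1 inside the subpopulation, 0 outside.
    d = [1.0] * n if domain is None else [1.0 if flag else 0.0 for flag in domain]
    w_sum = sum(wi * di for wi, di in zip(w, d))
    if w_sum == 0:
        raise ValueError("domain has zero total weight")
    mean = sum(wi * di * yi for wi, di, yi in zip(w, d, y)) / w_sum

    # Linearized score of the ratio estimator; zero outside the domain,
    # so every row (and every PSU) stays in the variance calculation.
    z = [wi * di * (yi - mean) / w_sum for wi, di, yi in zip(w, d, y)]

    # Sum the scores within each PSU, nested inside its stratum.
    totals = defaultdict(lambda: defaultdict(float))
    for zi, h, j in zip(z, strata, psu):
        totals[h][j] += zi

    # Between-PSU variance within each stratum, summed over strata.
    var = 0.0
    for psu_totals in totals.values():
        n_h = len(psu_totals)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        zbar = sum(psu_totals.values()) / n_h
        var += n_h / (n_h - 1) * sum((t - zbar) ** 2 for t in psu_totals.values())
    return mean, sqrt(var)


# Made-up toy data: 2 strata, 2 PSUs per stratum, 2 respondents per PSU.
y = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
w = [10, 10, 20, 20, 5, 5, 15, 15]
strata = ["A", "A", "A", "A", "B", "B", "B", "B"]
psu = [1, 1, 2, 2, 1, 1, 2, 2]

mean, se = linearized_mean(y, w, strata, psu)

# Domain estimate: pass an indicator instead of dropping rows.
in_domain = [True, False, True, False, True, False, True, False]
dmean, dse = linearized_mean(y, w, strata, psu, domain=in_domain)

print(f"full sample: mean={mean:.3f}, se={se:.3f}")
print(f"domain:      mean={dmean:.3f}, se={dse:.3f}")
```

Passing an indicator rather than filtering matters because a filtered dataset can lose entire PSUs, which changes the PSU counts per stratum and therefore the variance estimate. With the indicator, rows outside the domain contribute a score of zero but their PSUs still count.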