PyDESeq2 Differential Expression Analysis

Overview

PyDESeq2 is a Python reimplementation of the R DESeq2 package for differential gene expression analysis from bulk RNA-seq count data. It fits negative binomial generalized linear models per gene, estimates dispersion with empirical Bayes shrinkage, and performs Wald tests with Benjamini-Hochberg FDR correction. This skill covers the full pipeline from raw counts to publication-ready result tables and visualizations.
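
To make the testing step named above concrete, here is a minimal standard-library sketch of a two-sided Wald p-value followed by Benjamini-Hochberg adjustment. It does not call PyDESeq2 and is not its internal implementation. The function names (wald_pvalue, benjamini_hochberg) and the per-gene estimates are hypothetical and chosen only for illustration.

```python
import math

def wald_pvalue(lfc, se):
    """Two-sided p-value for the Wald statistic z = lfc / se (standard normal)."""
    z = lfc / se
    return math.erfc(abs(z) / math.sqrt(2))

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input list."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene estimates: (log2 fold change, standard error).
estimates = {
    "geneA": (2.0, 0.4),
    "geneB": (0.3, 0.5),
    "geneC": (-1.5, 0.5),
    "geneD": (0.1, 0.2),
}
pvals = {gene: wald_pvalue(lfc, se) for gene, (lfc, se) in estimates.items()}
padj = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))

for gene in estimates:
    print(f"{gene}\tp={pvals[gene]:.3g}\tpadj={padj[gene]:.3g}")
```

In this toy input, geneA and geneC have large fold changes relative to their standard errors, so they keep small adjusted p-values, while geneB and geneD do not.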

When to Use

Identifying differentially expressed genes between two or more experimental conditions from bulk RNA-seq
Performing two-group comparisons (e.g., treated vs control) with proper statistical testing
Running multi-factor designs that account for batch effects or covariates (e.g., ~batch + condition)
Applying log2 fold change shrinkage (apeGLM) for ranking and visualization
Generating volcano plots, MA plots, and heatmaps from differential expression results
Converting R-based DESeq2 workflows to a pure Python environment
Integrating DE analysis into larger Python bioinformatics pipelines (e.g., with scanpy, pandas)
Use DESeq2 (R/Bioconductor) or edgeR instead when you need the reference R implementations, which have the broadest method support and community validation
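
For the volcano-plot use case in the list above, the sketch below shows one way to turn a gene's log2 fold change and adjusted p-value into plot coordinates and a category label. The helper name volcano_point and the cutoffs (|log2FC| >= 1, padj < 0.05) are assumptions for illustration, not values prescribed by PyDESeq2. Genes with a missing adjusted p-value (NaN) are treated as not significant.

```python
import math

def volcano_point(log2fc, padj, lfc_cutoff=1.0, alpha=0.05):
    """Return (x, y, label) for one gene: x = log2FC, y = -log10(padj)."""
    if math.isnan(padj):
        # No adjusted p-value available: keep the gene but mark it "ns".
        return log2fc, float("nan"), "ns"
    y = -math.log10(padj) if padj > 0 else float("inf")
    if padj < alpha and log2fc >= lfc_cutoff:
        label = "up"
    elif padj < alpha and log2fc <= -lfc_cutoff:
        label = "down"
    else:
        label = "ns"
    return log2fc, y, label

# Hypothetical results rows: (gene, log2FoldChange, padj).
rows = [
    ("geneA", 2.0, 0.001),
    ("geneB", -1.8, 0.01),
    ("geneC", 0.4, 0.0001),
    ("geneD", 3.0, float("nan")),
]
points = {gene: volcano_point(lfc, p) for gene, lfc, p in rows}
```

The resulting (x, y, label) tuples can be passed to any plotting library, with the label used to color the points.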
