GSEApy — Gene Set Enrichment Analysis in Python

Overview

GSEApy provides Python implementations of GSEA and over-representation analysis (ORA) for interpreting gene expression changes at the pathway level. The enrichr module queries the Enrichr API to test a gene list against 200+ databases (GO, KEGG, MSigDB Hallmarks, Reactome, WikiPathways). The prerank and gsea modules run the GSEA algorithm on a pre-ranked gene list or an expression matrix, respectively, computing a normalized enrichment score (NES) and an FDR value for each gene set. GSEApy accepts pandas DataFrames, so differential expression tables exported from DESeq2 or produced by scanpy can be passed in directly, which makes it a widely used Python option for pathway analysis in RNA-seq workflows.
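To make the ORA side of the overview concrete: ORA is commonly computed as a one-sided hypergeometric (Fisher-type) test on the overlap between a gene list and a gene set. The sketch below uses only the standard library and is not GSEApy's or Enrichr's code. The function name ora_pvalue and its parameters are illustrative assumptions.

```python
from math import comb


def ora_pvalue(hits: int, list_size: int, set_size: int, background: int) -> float:
    """One-sided hypergeometric tail P(X >= hits).

    hits       -- genes shared by the query list and the gene set
    list_size  -- number of genes in the query list
    set_size   -- number of genes in the gene set
    background -- number of genes in the background universe
    (All names are illustrative, not part of any library API.)
    """
    max_hits = min(list_size, set_size)
    # Number of ways to draw the query list from the background.
    total = comb(background, list_size)
    # Sum the probabilities of seeing `hits` or more overlapping genes.
    tail = sum(
        comb(set_size, k) * comb(background - set_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Example: 10 of 200 query genes fall in a 100-gene set, 20,000-gene background.
p = ora_pvalue(hits=10, list_size=200, set_size=100, background=20000)
print(f"ORA p-value: {p:.3g}")
```

With these numbers only about one overlapping gene is expected by chance, so the tail probability is small. In practice the p-values across all tested gene sets would still need multiple-testing correction.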

When to Use

Interpreting DESeq2 or edgeR differential expression results at pathway/GO-term level
Running fast ORA (over-representation analysis) against Enrichr's 200+ databases including GO, KEGG, and MSigDB Hallmarks
Performing GSEA prerank analysis on a log2-fold-change-ranked gene list without an expression matrix
Identifying enriched pathways in scRNA-seq cluster marker genes
Generating publication-ready enrichment dot plots and GSEA running-score plots
For the official GUI-based analysis with the full GSEA desktop interface, use the GSEA Java application instead
For R-based pipelines, use fgsea as an alternative with fast permutation-based p-values; GSEApy is preferred for Python-native pipelines
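The prerank use case above can be sketched as follows, under several assumptions. The helper names rank_by_log2fc and run_prerank are hypothetical. The gseapy.prerank call uses parameter names (rnk, gene_sets, permutation_num, min_size, max_size, outdir, seed) and the res2d result attribute as they appear in the GSEApy documentation as I understand it, so check them against the installed version. Passing gene_sets as a dict avoids a network call, and pandas and gseapy are imported lazily so the ranking helper runs without them.

```python
def rank_by_log2fc(log2fc):
    """Sort a {gene: log2 fold change} mapping into a descending list of
    (gene, score) pairs, dropping NaN scores. Hypothetical helper."""
    clean = {gene: score for gene, score in log2fc.items() if score == score}
    return sorted(clean.items(), key=lambda pair: pair[1], reverse=True)


def run_prerank(log2fc, gene_sets, permutations=1000, seed=0):
    """Run GSEA prerank on a log2FC-ranked list. Hypothetical wrapper.

    gene_sets is assumed to be a dict {set_name: [gene, ...]}.
    """
    # Imported lazily so the ranking helper works without these packages.
    import pandas as pd
    import gseapy as gp

    ranked = rank_by_log2fc(log2fc)
    rnk = pd.Series(dict(ranked), name="log2fc")
    result = gp.prerank(
        rnk=rnk,
        gene_sets=gene_sets,
        permutation_num=permutations,
        min_size=5,    # smallest gene set to test (assumed value)
        max_size=500,  # largest gene set to test (assumed value)
        outdir=None,   # assumed to skip writing result files to disk
        seed=seed,
    )
    return result.res2d  # per-gene-set table with NES and FDR columns


# Toy input with made-up values, only to show the ranking step.
toy = {"GENE_A": 1.0, "GENE_B": -2.0, "GENE_C": 3.5, "GENE_D": float("nan")}
ranked_toy = rank_by_log2fc(toy)
print(ranked_toy)
```

Ranking by log2 fold change alone follows the use case as stated. Some workflows rank by a signed significance statistic instead, which is a separate design choice not covered here.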
