MOFA+ Multi-Omics Factor Analysis

Overview

MOFA+ (Multi-Omics Factor Analysis v2) is an unsupervised statistical framework that jointly decomposes multiple omics datasets into a small set of latent factors. Each factor captures an independent source of variation (e.g., cell cycle, a disease phenotype, a technical batch) and is associated with feature weights (loadings) that reveal which genes, peaks, or proteins drive it. The Python package mofapy2 produces an HDF5 model file compatible with downstream analysis in both Python and R. MOFA+ extends the original MOFA to support multi-group settings where samples belong to distinct cohorts or conditions.
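
To make "shared" versus "view-specific" factors concrete, here is a minimal NumPy sketch of the linear factor model behind this description: each view is approximated as factor scores times view-specific weights plus noise. It is not the mofapy2 API and does not perform MOFA+ inference; the factors are simulated and assumed known, and the view names, dimensions, noise level, and the `variance_explained` helper are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 200 samples, 2 latent factors, two views (RNA, protein)
n_samples, n_factors = 200, 2
d_rna, d_prot = 50, 30

# Simulated factor scores Z (samples x factors), centered per factor
Z = rng.normal(size=(n_samples, n_factors))
Z -= Z.mean(axis=0)

# View-specific weights (features x factors)
W_rna = rng.normal(size=(d_rna, n_factors))
W_prot = rng.normal(size=(d_prot, n_factors))
W_prot[:, 1] = 0.0  # factor 2 has no weights in the protein view (RNA-specific)

# Each view = Z @ W^T + Gaussian noise
Y_rna = Z @ W_rna.T + 0.1 * rng.normal(size=(n_samples, d_rna))
Y_prot = Z @ W_prot.T + 0.1 * rng.normal(size=(n_samples, d_prot))


def variance_explained(Y, Z, W):
    """Fraction of variance in one view explained by each factor alone."""
    Yc = Y - Y.mean(axis=0)           # center features
    total = np.sum(Yc ** 2)
    r2 = []
    for k in range(Z.shape[1]):
        recon = np.outer(Z[:, k], W[:, k])  # rank-1 reconstruction from factor k
        r2.append(1.0 - np.sum((Yc - recon) ** 2) / total)
    return np.array(r2)


r2_rna = variance_explained(Y_rna, Z, W_rna)
r2_prot = variance_explained(Y_prot, Z, W_prot)
print("RNA     R2 per factor:", np.round(r2_rna, 3))
print("Protein R2 per factor:", np.round(r2_prot, 3))
```

In this toy setup, factor 1 explains variance in both views (a shared factor), while factor 2 explains variance only in the RNA view (a view-specific factor). A real analysis would estimate the factors and weights from the data rather than simulate them.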

When to Use

Integrating two or more omics layers from the same set of cells or samples (e.g., scRNA-seq + scATAC-seq, RNA + proteomics, methylation + RNA)
Identifying shared and view-specific sources of variation across omics modalities without supervised labels
Comparing how latent factors differ between patient groups, treatment conditions, or time points in a multi-group analysis
Reducing multi-omics dimensionality before clustering, trajectory inference, or survival modeling
Discovering which genomic features (genes, peaks, proteins) drive each factor via sparse loadings
Annotating latent factors by correlating factor scores with sample metadata (age, stage, treatment response)
Use scVI / MultiVI (scverse) instead when you need deep generative, VAE-based batch correction across modalities with explicit latent space inference
Use LIGER instead when your primary goal is integrating datasets across technologies (e.g., snRNA-seq + snATAC-seq) with shared and dataset-specific factors via iNMF
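
One of the use cases listed above, annotating factors by correlating factor scores with sample metadata, can be sketched with plain Python. This is only an illustration under stated assumptions: the factor scores and metadata values are made-up toy numbers, the names `factors`, `metadata`, and `pearson` are hypothetical, and a real workflow would read the scores from a trained model rather than hard-code them.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical factor scores for six samples (toy values, not real output)
factors = {
    "Factor1": [-1.2, -0.8, -0.1, 0.3, 0.9, 1.4],
    "Factor2": [0.5, -0.7, 0.2, -0.4, 0.6, -0.1],
}

# Hypothetical numeric sample metadata for the same six samples
metadata = {
    "age": [34, 41, 47, 52, 60, 68],
    "stage": [1, 1, 2, 2, 3, 3],
}

# Correlate every factor with every metadata column
correlations = {
    (factor, covariate): pearson(scores, values)
    for factor, scores in factors.items()
    for covariate, values in metadata.items()
}

for (factor, covariate), r in correlations.items():
    print(f"{factor} vs {covariate}: r = {r:+.2f}")
```

A factor that correlates strongly with a covariate is a candidate for that annotation; with many factors and covariates, the resulting correlations would normally be corrected for multiple testing before interpretation.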
