UMAP-Learn

Overview

UMAP (Uniform Manifold Approximation and Projection) is a non-linear dimensionality reduction algorithm, used both for visualization and as a general-purpose reduction step. It is typically faster than t-SNE and scales to larger datasets, tends to preserve more global structure alongside local structure, and supports supervised learning and embedding of new data points.

When to Use

Reducing high-dimensional data to 2D/3D for visualization
Preprocessing for density-based clustering (HDBSCAN, DBSCAN)
Feature engineering in ML pipelines (transform new data into learned embedding)
Supervised/semi-supervised embedding with partial labels
Tracking embeddings across time points or batches (AlignedUMAP)
Density-preserving embeddings (DensMAP)
Neural network-based embedding with custom architectures (Parametric UMAP)
For linear dimensionality reduction, use PCA (scikit-learn) instead
For neighborhood-graph construction without an embedding, use scikit-learn NearestNeighbors instead
