anndata-data-structure

AnnData — Annotated Data Matrices for Single-Cell Genomics

Overview

AnnData is the standard data structure for single-cell genomics in the scverse ecosystem. It stores an observations-by-variables matrix (X) alongside cell metadata (obs), gene metadata (var), layers, multi-dimensional annotations such as embeddings (obsm/varm), pairwise graphs (obsp/varp), and unstructured metadata (uns). It supports sparse matrices, H5AD/Zarr storage, backed mode for large files, and integration with Scanpy, scvi-tools, and Muon.

When to Use

  • Constructing annotated matrices from raw count data with cell/gene metadata
  • Reading/writing .h5ad or .zarr files for single-cell experiments
  • Subsetting cells by quality metrics, gene sets, or metadata conditions
  • Concatenating multiple experimental batches with consistent metadata
  • Storing multiple data layers (raw counts, normalized, scaled) in one object
  • Working with large datasets exceeding RAM (backed mode, lazy concatenation)
  • Preparing data for Scanpy or scvi-tools pipelines
  • Not for downstream single-cell analysis (clustering, differential expression, visualization); use scanpy for that
  • Not for probabilistic modeling; use scvi-tools for that

Prerequisites

Related skills

More from jaechang-hits/sciagent-skills

First Seen
Mar 16, 2026