zarr-python

Installation
SKILL.md

Zarr Python — Chunked N-D Arrays

Overview

Zarr is a Python library for storing large N-dimensional arrays with chunking, compression, and parallel I/O. It provides NumPy-compatible indexing with pluggable storage backends (local, cloud, in-memory), making it the standard format for cloud-native scientific data pipelines.

When to Use

  • Storing arrays too large for memory with chunked access (out-of-core computing)
  • Cloud-native data workflows with S3 or GCS storage backends
  • Parallel read/write with Dask for large-scale computation
  • Hierarchical data organization (groups of named arrays with metadata)
  • Converting between formats (HDF5 → Zarr, NetCDF → Zarr)
  • Appending time-series data incrementally without rewriting
  • For labeled, coordinate-aware arrays (time, lat, lon), use xarray with Zarr backend instead
  • For data management, lineage, and ontology validation, use lamindb (which uses Zarr as a storage format)

Prerequisites

Related skills

More from jaechang-hits/sciagent-skills

Installs
9
GitHub Stars
152
First Seen
Mar 16, 2026