pathml

Overview

PathML is a Python toolkit designed for computational pathology workflows on whole-slide images (WSIs). It provides a unified pipeline from raw slide files (SVS, NDPI, MRXS, TIFF) through tile extraction, preprocessing (stain normalization, nuclear segmentation, tissue detection), feature extraction, and machine learning. PathML integrates with popular Python ML and image processing libraries while abstracting the complexity of WSI handling through its SlideData and Pipeline abstractions.
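To make the tile-extraction step concrete, here is a minimal sketch of how a tile grid over a slide can be computed. It deliberately uses only the Python standard library and does not call PathML; the function name `tile_origins` and its parameters are hypothetical, chosen for illustration, and PathML's own tiling interface may behave differently (for example in how it handles partial tiles at the slide edges).

```python
from typing import Iterator, Optional, Tuple


def tile_origins(
    slide_width: int,
    slide_height: int,
    tile_size: int,
    stride: Optional[int] = None,
) -> Iterator[Tuple[int, int]]:
    """Yield (x, y) top-left pixel coordinates of square tiles.

    Hypothetical helper, not part of PathML. Only tiles that fit entirely
    inside the slide are produced; partial edge tiles are skipped.
    """
    if stride is None:
        stride = tile_size  # non-overlapping tiles by default
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")

    # Walk the slide row by row, stepping by `stride` in each direction.
    for y in range(0, slide_height - tile_size + 1, stride):
        for x in range(0, slide_width - tile_size + 1, stride):
            yield (x, y)


# Example: a small 1000 x 600 pixel region split into 256-pixel tiles.
origins = list(tile_origins(1000, 600, 256))
print(len(origins), origins[0], origins[-1])
```

A stride smaller than the tile size produces overlapping tiles, which is one common way to enlarge a training set; real WSIs are usually tens of thousands of pixels per side, so the same loop yields far more tiles than in this toy example.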

When to Use

Processing whole-slide H&E images: Tiling a large WSI and normalizing staining variability across slides from different scanners or batches.
Nuclear segmentation on pathology slides: Detecting and segmenting nuclei in H&E or DAPI-stained WSIs using built-in segmentation pipelines.
Building ML training datasets from WSIs: Extracting tiles with associated labels for training tissue classifiers, tumor detectors, or survival prediction models.
Multiplex immunofluorescence (mIF) image analysis: Processing multi-channel IF slides with channel-specific preprocessing and feature extraction.
Stain normalization across cohorts: Applying Macenko or Vahadane stain normalization to harmonize H&E slides from multiple institutions.
Feature extraction for downstream ML: Extracting handcrafted or deep learning features from tiles for patient-level prediction tasks.
For standard 2D microscopy images (non-WSI), use scikit-image or cellpose directly without PathML overhead.
