imaging-data-commons


NCI Imaging Data Commons

Overview

NCI Imaging Data Commons (IDC) is the National Cancer Institute's cloud-based repository for cancer imaging data. It hosts 50+ TB of publicly accessible DICOM images spanning radiology (CT, MRI, PET) and pathology (whole-slide images) across 100+ collections. The image files are stored in Google Cloud Storage and their DICOM metadata is indexed in BigQuery, so the metadata can be queried with SQL without downloading any images. IDC also works with Google Colab, making large-scale imaging research possible without local storage.

When to Use

  • Searching for publicly available cancer imaging datasets by modality, cancer type, or anatomical site
  • Downloading DICOM image series for model training (segmentation, classification, detection)
  • Querying DICOM metadata at scale using SQL (BigQuery) without downloading the full dataset
  • Exploring available imaging collections before committing to a full download
  • Accessing pathology whole-slide images (WSI) and radiology scans from TCIA collections
  • Building reproducible imaging ML pipelines with versioned public datasets
  • Not for local DICOM file processing (use pydicom-medical-imaging instead) or WSI preprocessing (use histolab instead)
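
The metadata-query use case in the list above can be sketched as follows. This is a minimal illustration, not the skill's own code: it assumes the public BigQuery view `bigquery-public-data.idc_current.dicom_all` and its `Modality`, `collection_id`, `PatientID`, and `SeriesInstanceUID` columns, and the helper names `build_series_query` and `run_query` are hypothetical. Running the query additionally assumes the `google-cloud-bigquery` package is installed and a Google Cloud project with credentials is configured.

```python
from typing import Dict, List, Optional, Tuple

# Assumed location of the IDC metadata view for the current release.
IDC_TABLE = "bigquery-public-data.idc_current.dicom_all"


def build_series_query(
    modality: str,
    collection_id: Optional[str] = None,
    limit: int = 10,
) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized SQL query listing distinct series (hypothetical helper)."""
    conditions = ["Modality = @modality"]
    params = {"modality": modality}
    if collection_id is not None:
        conditions.append("collection_id = @collection_id")
        params["collection_id"] = collection_id
    sql = (
        "SELECT DISTINCT collection_id, PatientID, SeriesInstanceUID\n"
        f"FROM `{IDC_TABLE}`\n"
        f"WHERE {' AND '.join(conditions)}\n"
        f"LIMIT {int(limit)}"  # int() guards against non-numeric input
    )
    return sql, params


def run_query(sql: str, params: Dict[str, str]) -> List[dict]:
    """Execute the query in BigQuery; needs google-cloud-bigquery and credentials."""
    from google.cloud import bigquery  # imported lazily so the builder works offline

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter(name, "STRING", value)
            for name, value in params.items()
        ]
    )
    rows = client.query(sql, job_config=job_config).result()
    return [dict(row) for row in rows]


if __name__ == "__main__":
    # Only builds and prints the SQL; call run_query(sql, params) to execute it.
    sql, params = build_series_query("CT", limit=5)
    print(sql)
```

Keeping query construction separate from execution lets the SQL be inspected before any BigQuery usage is incurred. Passing values as named parameters rather than formatting them into the string avoids quoting problems.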

Prerequisites

More from jaechang-hits/sciagent-skills

First Seen
Mar 16, 2026