PyHealth
Overview
PyHealth provides an end-to-end pipeline for healthcare ML on EHR data: data loading → medical code processing → patient-level dataset construction → model training → evaluation. It natively supports the MIMIC-III, MIMIC-IV, eICU-CRD, and OMOP-CDM structured databases and handles each one's idiosyncratic data format. Medical codes (ICD-9, ICD-10, ATC, NDC, SNOMED) are organized in a hierarchical code system that supports code-level embedding and cross-ontology mapping. Pre-built tasks (mortality prediction, drug recommendation, readmission, length-of-stay, and diagnosis code prediction) can be instantiated in a few lines, and custom tasks follow a standardized interface.
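To make the "medical code processing" and "patient-level dataset construction" stages more concrete, here is a minimal, library-independent sketch in plain Python. It does not use PyHealth's API: the event rows, field order, and the `atc_prefix` and `build_patient_visits` helpers are all hypothetical. It only illustrates grouping coded events into time-ordered visits per patient and rolling ATC codes up to a coarser level of the hierarchy (ATC levels 1 to 5 correspond to code prefixes of length 1, 3, 4, 5, and 7).

```python
from collections import defaultdict

# ATC codes are hierarchical by prefix: levels 1-5 use 1, 3, 4, 5, and 7 characters.
ATC_LEVEL_LENGTH = {1: 1, 2: 3, 3: 4, 4: 5, 5: 7}


def atc_prefix(code, level):
    """Roll an ATC code up to the given hierarchy level (hypothetical helper)."""
    return code[:ATC_LEVEL_LENGTH[level]]


def build_patient_visits(events, atc_level=3):
    """Group (patient_id, visit_id, admit_time, atc_code) rows into
    per-patient lists of visits, ordered by admission time."""
    visits = defaultdict(dict)  # patient_id -> visit_id -> (admit_time, [codes])
    for patient_id, visit_id, admit_time, atc_code in events:
        _, codes = visits[patient_id].setdefault(visit_id, (admit_time, []))
        rolled = atc_prefix(atc_code, atc_level)
        if rolled not in codes:  # drop duplicates created by the roll-up
            codes.append(rolled)
    return {
        pid: [codes for _, codes in sorted(by_visit.values(), key=lambda v: v[0])]
        for pid, by_visit in visits.items()
    }


# Made-up illustrative rows, not real patient data.
events = [
    ("p1", "v2", "2020-03-01", "C09AA02"),  # enalapril
    ("p1", "v1", "2020-01-15", "A10BA02"),  # metformin
    ("p1", "v1", "2020-01-15", "A10BB01"),  # glibenclamide
    ("p2", "v3", "2021-06-30", "N02BE01"),  # paracetamol
]
patient_visits = build_patient_visits(events)
print(patient_visits)
```

Each patient ends up as a list of visits, and each visit as a list of level-3 ATC codes. This nested patient → visit → code shape is the kind of input that sequence models over visits typically consume.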
When to Use
- Training clinical outcome prediction models (mortality, readmission, LOS) from MIMIC-III or MIMIC-IV
- Building drug recommendation or drug interaction prediction models using the ATC code hierarchy
- Processing OMOP-CDM formatted data from institutional EHR systems for ML
- Using pretrained clinical models (RETAIN, GRASP, MedBERT) as baselines on healthcare benchmarks
- Constructing patient visit sequences with temporal structure for RNN/Transformer models
- Evaluating clinical prediction models with appropriate metrics (AUROC, AUPRC, F1, Jaccard)
- For pure EHR preprocessing without ML, use FIDDLE instead; for clinical-note NLP, use clinical-longformer instead
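As a rough illustration of the evaluation metrics named in the list above (AUROC, AUPRC, F1, Jaccard), the following plain-Python sketch computes each from its textbook definition. These are not PyHealth's own implementations; the function names and toy data are hypothetical. AUPRC is approximated here by average precision, and that function assumes no tied scores.

```python
def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Mean of precision-at-rank over the positive examples (AUPRC estimate)."""
    ranked = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def jaccard(true_set, pred_set):
    """Intersection over union of two code sets (0.0 if both are empty, by convention here)."""
    union = true_set | pred_set
    return len(true_set & pred_set) / len(union) if union else 0.0


# Toy binary task (e.g. mortality): labels and predicted probabilities.
y_true = [1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.1]
y_pred = [int(s >= 0.5) for s in y_score]  # threshold at 0.5

print("AUROC:", auroc(y_true, y_score))
print("AUPRC (average precision):", average_precision(y_true, y_score))
print("F1:", f1(y_true, y_pred))
# Toy set-valued task (e.g. drug recommendation): true vs. predicted ATC codes.
print("Jaccard:", jaccard({"A10B", "C09A"}, {"A10B", "N02B"}))
```

AUROC and AUPRC score the ranking of predicted probabilities, so they suit binary outcomes such as mortality or readmission. F1 needs a decision threshold, and Jaccard compares predicted and true code sets, which makes it a natural fit for set-valued outputs such as drug recommendation.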