ray-data


Ray Data - Scalable ML Data Processing

Distributed data processing library for ML and AI workloads.

When to use Ray Data

Use Ray Data when:

  • Processing large datasets (>100GB) for ML training
  • Running distributed data preprocessing across a cluster
  • Building batch inference pipelines
  • Loading multi-modal data (images, audio, video)
  • Scaling data processing from a laptop to a cluster

Key features:

  • Streaming execution: Process data larger than memory
  • GPU support: Accelerate transforms with GPUs
  • Framework integration: PyTorch, TensorFlow, HuggingFace
  • Multi-modal and multi-format: images, audio, video, Parquet, CSV, JSON
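
As a rough illustration of the preprocessing use case above, the following minimal sketch builds a small pipeline with `ray.data.range`, `Dataset.map`, and `Dataset.take`. The helper names `add_squared` and `run_pipeline` are hypothetical and chosen only for this example, and the synthetic `id` column stands in for real data. It assumes a recent Ray release in which rows are passed to `map` as dictionaries.

```python
from typing import Any, Dict, List


def add_squared(row: Dict[str, Any]) -> Dict[str, Any]:
    """Per-row transform: keep the existing columns and add id squared."""
    return {**row, "squared": row["id"] ** 2}


def run_pipeline(num_rows: int = 1000, preview: int = 5) -> List[Dict[str, Any]]:
    """Build a small pipeline and return the first few transformed rows."""
    # Imported here so add_squared can be tested without Ray installed.
    import ray

    ds = ray.data.range(num_rows)  # synthetic dataset with a single "id" column
    ds = ds.map(add_squared)       # recorded as a step; execution is deferred
    return ds.take(preview)        # triggers execution and fetches the rows
```

Calling `run_pipeline()` on a machine with Ray installed should start a local Ray instance if none is running. Keeping the transform a plain function means it can be unit-tested on its own, and the same function can then be applied unchanged when the pipeline runs on a cluster.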
First Seen
Apr 21, 2026