daft-distributed-scaling

Daft Distributed Scaling

Scale single-node workflows to distributed execution.

Core Strategies

| Strategy | API | Use Case | Pros/Cons |
| --- | --- | --- | --- |
| Shuffle | repartition(N) | Light data (e.g. file paths), joins | Global balance. High memory usage (materializes data). |
| Streaming | into_batches(N) | Heavy data (images, tensors) | Low memory (streaming). High scheduling overhead if batches are too small. |
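The batch-size trade-off in the Streaming row can be made concrete with a back-of-envelope estimate. The sketch below is plain Python and does not call Daft. The helper name `batch_tradeoff` and the per-row size are hypothetical, and the numbers only approximate how many units of work a given N would create, not how Daft's scheduler actually behaves.

```python
import math


def batch_tradeoff(num_rows: int, batch_size: int, bytes_per_row: int) -> tuple[int, int]:
    """Estimate (number of batches, bytes held per batch) for a given batch size.

    Hypothetical helper for reasoning only; it models each batch as one
    schedulable unit and ignores any per-task overhead or compression.
    """
    num_batches = math.ceil(num_rows / batch_size)
    # A batch never holds more rows than the dataset contains.
    bytes_per_batch = min(batch_size, num_rows) * bytes_per_row
    return num_batches, bytes_per_batch


if __name__ == "__main__":
    # Assumed workload: 100,000 images of roughly 3 MiB each once decoded.
    rows, row_bytes = 100_000, 3 * 1024 * 1024
    for size in (1, 32, 1024):
        batches, per_batch = batch_tradeoff(rows, size, row_bytes)
        print(f"N={size}: {batches} batches, ~{per_batch / 2**20:.0f} MiB per batch")
```

Very small values of N multiply the number of batches to schedule, while very large values raise the memory held per batch. A value chosen this way would then be passed to `into_batches(N)`.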

Quick Recipes

1. Light Data: Repartitioning

Best for distributing file paths before heavy reads.
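A minimal sketch of this recipe follows, assuming Daft is installed and a distributed runner is already configured. The helpers `pick_partition_count` and `distribute_paths`, the `files_per_task` default, and the column name `path` are all hypothetical, chosen only for illustration. The only Daft calls used are `daft.from_pydict` and `DataFrame.repartition`.

```python
import math


def pick_partition_count(num_files: int, num_workers: int, files_per_task: int = 64) -> int:
    """Hypothetical heuristic for N: at least one partition per worker,
    roughly files_per_task paths per partition, never more partitions than files."""
    by_size = math.ceil(num_files / files_per_task)
    return max(1, min(num_files, max(num_workers, by_size)))


def distribute_paths(paths: list[str], num_workers: int):
    """Build a DataFrame of file paths and spread it across partitions.

    Only lightweight strings are shuffled here; the expensive read is meant
    to run afterwards, once the paths are evenly distributed.
    """
    import daft  # imported lazily so the heuristic above works without Daft

    df = daft.from_pydict({"path": paths})
    n = pick_partition_count(len(paths), num_workers)
    # Shuffle the cheap path column into n partitions.
    df = df.repartition(n)
    # The heavy read (downloading or decoding each path) would be added here.
    return df
```

Because only path strings are materialized during the shuffle, the high memory cost of `repartition(N)` stays small. The partition-count heuristic is a starting point to tune, not a rule from Daft.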

First Seen
Feb 27, 2026