daft-distributed-scaling
Daft Distributed Scaling
Scale single-node workflows to distributed execution.
Core Strategies
| Strategy | API | Use Case | Pros/Cons |
|---|---|---|---|
| Shuffle | `repartition(N)` | Light data (e.g. file paths), joins | Global balance. High memory usage (materializes data). |
| Streaming | `into_batches(N)` | Heavy data (images, tensors) | Low memory (streaming). High scheduling overhead if batches too small. |
Quick Recipes
1. Light Data: Repartitioning
Best for distributing file paths before heavy reads.
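A minimal sketch of this recipe, assuming the paths are already available as a Python list. The helper `partition_count`, its `per_worker` default, and the function name `distribute_paths` are hypothetical and only illustrate one way to pick `N`; `daft.from_pydict` and `DataFrame.repartition` are the only Daft calls used.

```python
def partition_count(num_paths: int, num_workers: int, per_worker: int = 4) -> int:
    """Hypothetical heuristic: a few partitions per worker, never more than one per path."""
    return max(1, min(num_paths, num_workers * per_worker))


def distribute_paths(paths: list[str], num_workers: int):
    """Build a DataFrame of file paths and spread it evenly before any heavy read."""
    # Imported lazily so the helper above can be used without Daft installed.
    import daft

    df = daft.from_pydict({"path": paths})

    # Shuffle the lightweight path column so every worker gets a similar share.
    # The rows are just strings here, so materializing them is cheap.
    df = df.repartition(partition_count(len(paths), num_workers))

    # The heavy read (download, decode, etc.) would be chained onto df after this point.
    return df
```

Repartitioning at this stage moves only short strings across the cluster; the expensive reads then run on partitions that are already balanced.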