paddle-distributed


Paddle distributed training, SOT dynamic-to-static conversion, and Python-C++ interop

Distributed paradigm quick reference

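| Paradigm | Core idea | Communication primitives |
| --- | --- | --- |
| Data Parallel | Replicate the model, split the data, AllReduce the gradients | AllReduce |
| Group Sharded (ZeRO) | Stage 1 shards optimizer state; Stage 2 also shards gradients; Stage 3 also shards weights | Broadcast, ReduceScatter, AllGather |
| Model Parallel (Tensor) | Column Parallel splits weights by column; Row Parallel splits weights by row | AllReduce / AllGather |
| Pipeline Parallel | F-then-B, or 1F1B interleaving forward and backward passes | Send / Recv (P2P) |
| Sequence Parallel | Shard LayerNorm/Dropout along the sequence dimension | AllGather / ReduceScatter |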
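
To make the communication primitives in the table concrete, here is a minimal pure-Python sketch that simulates them on plain lists, with each list entry standing in for one rank. It does not call Paddle; the function names `all_reduce`, `reduce_scatter`, and `all_gather` are illustrative helpers written for this example, and the vector length is assumed to be divisible by the number of ranks.

```python
# Simulated collectives: per_rank[r] is the vector held by rank r.
# These are illustrative helpers, not Paddle APIs.

def all_reduce(per_rank):
    """Every rank ends up with the element-wise sum over all ranks."""
    total = [sum(vals) for vals in zip(*per_rank)]
    return [list(total) for _ in per_rank]

def reduce_scatter(per_rank):
    """Rank r ends up with only the r-th chunk of the element-wise sum."""
    world = len(per_rank)
    total = [sum(vals) for vals in zip(*per_rank)]
    chunk = len(total) // world  # assumes len(total) % world == 0
    return [total[r * chunk:(r + 1) * chunk] for r in range(world)]

def all_gather(per_rank):
    """Every rank ends up with the concatenation of all ranks' chunks."""
    full = [x for chunk in per_rank for x in chunk]
    return [list(full) for _ in per_rank]

# Two ranks, each holding a local gradient vector.
grads = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]

# Data Parallel style: one AllReduce gives every rank the summed gradient.
dp_result = all_reduce(grads)

# Sharded style: ReduceScatter followed by AllGather reaches the same state.
sharded_result = all_gather(reduce_scatter(grads))
```

Both paths leave every rank with the same summed vector. The difference is that after `reduce_scatter` each rank holds only its own slice, which is the intermediate state a sharded scheme can work on before gathering.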

Three programming paradigms: manual (fleet.meta_parallel), semi-automatic dynamic graph (ProcessMesh + shard_tensor), and semi-automatic static graph (auto_parallel.Engine).
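
As a rough illustration of the idea behind the semi-automatic style, where a tensor is annotated with how it is laid out over a mesh of processes, the sketch below mimics two placements on a 1-D mesh in plain Python: sharding along dimension 0 and full replication. It does not use Paddle, and `shard_along_dim0` and `replicate` are hypothetical helpers invented for this example, not the signatures of `ProcessMesh` or `shard_tensor`.

```python
# Conceptual model of tensor placement on a 1-D mesh of `mesh_size` ranks.
# Hypothetical helpers for illustration only; no Paddle calls are made.

def shard_along_dim0(rows, mesh_size):
    """Split a 2-D list row-wise into mesh_size equal shards, one per rank."""
    if len(rows) % mesh_size != 0:
        raise ValueError("row count must be divisible by mesh size")
    step = len(rows) // mesh_size
    return [rows[r * step:(r + 1) * step] for r in range(mesh_size)]

def replicate(rows, mesh_size):
    """Give every rank its own full copy of the 2-D list."""
    return [[list(row) for row in rows] for _ in range(mesh_size)]

weight = [[1, 2], [3, 4], [5, 6], [7, 8]]

shards = shard_along_dim0(weight, 2)  # rank 0 gets rows 0-1, rank 1 gets rows 2-3
copies = replicate(weight, 2)         # both ranks get all four rows
```

In the manual style the user writes this kind of splitting and the matching communication by hand; in the semi-automatic styles the layout is declared and the framework is expected to derive the rest.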

SOT architecture quick reference

Python Frame