paddle-debug
Debugging the Paddle repository
Debugging workflow
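1. Describe the problem and build a minimal reproduction
- In concise natural language, state:
  - The trigger steps (commands, scripts, key configuration).
  - Expected behavior vs. actual behavior.
  - Whether the problem appears only in a specific environment, machine, device, or data subset.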
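- Before any debugging starts, confirm that the bug reproduces reliably. If the given command or script does not reproduce it:
  - Check whether the command was copied incorrectly or is missing arguments;
  - Compare and align the environments (Paddle / Python / CUDA / cuDNN / driver / GPU model, etc.);
  - Continue only after confirming that the environment matches the one where the problem first appeared.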
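- Where possible, extract the problem into a standalone Python script (or unit test):
  - Fix the random seeds (numpy, random, paddle.seed, etc.);
  - Use small, fixed, serializable data (fixed random numbers or offline samples);
  - Remove logic unrelated to the problem (complex data augmentation, redundant logging, fancy tricks in the training loop, etc.).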
- The goal is to achieve:
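As a rough illustration of the reproduction advice above, the sketch below shows one possible skeleton for a standalone script: it seeds the available random number generators, builds a small fixed input, and prints version information to compare against the environment where the bug first appeared. The names seed_everything, environment_report, and make_fixed_input, as well as the seed value, are hypothetical and chosen only for this example. The Paddle and NumPy calls are guarded so the script still runs when either package is missing.

```python
import platform
import random
import sys

SEED = 2024  # arbitrary value chosen for this sketch


def seed_everything(seed=SEED):
    """Seed every RNG that can be imported; return the names that were seeded."""
    seeded = ["random"]
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass
    try:
        import paddle
        paddle.seed(seed)
        seeded.append("paddle")
    except ImportError:
        pass
    return seeded


def environment_report():
    """Collect version info for comparison with the original failing environment."""
    report = {"python": sys.version.split()[0], "platform": platform.platform()}
    try:
        import paddle
        report["paddle"] = paddle.__version__
        report["cuda"] = paddle.version.cuda()
        report["cudnn"] = paddle.version.cudnn()
    except ImportError:
        report["paddle"] = "not installed"
    return report


def make_fixed_input(n=8, seed=SEED):
    """Small deterministic input from a private RNG, independent of global state."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    print("seeded:", seed_everything())
    for key, value in environment_report().items():
        print(f"{key}: {value}")
    print("input:", make_fixed_input())
    # The suspect operator or model call would go here, fed with the fixed input.
```

Driver version and GPU model are not covered by this sketch; they would need to be recorded separately, for example from the output of nvidia-smi.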