msverl-daily-regression-triage

MSVerl Daily Regression Triage

Use this skill when a fixed daily verl + MindSpeed training job has run and Codex needs to decide whether the result is healthy, whether there is a training failure or an accuracy regression, and which recent commit is the most likely cause.

Defaults

  • Baseline comparison log: /home/st_daily_verl/msverl.log
  • Training log pattern: /home/st_daily_verl/logs/msverl_YYYYMMDD.log
  • verl repo: https://github.com/verl-project/verl.git on main
  • MindSpeed repo: https://gitcode.com/Ascend/MindSpeed.git on master
  • Cache root for temporary clones: /tmp/msverl-skill-cache
  • Time window: from 00:00:00 local time on the previous day to the task execution time
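As a rough illustration of how the time window and the dated log path might be derived, here is a minimal Python sketch. The function names `triage_window`, `training_log_path`, and `git_log_window_args` are hypothetical and not part of the skill. The skill does not state which day's log to open, nor which command lists recent commits, so the caller chooses the day, and the use of `git log --since/--until` is an assumption.

```python
from datetime import datetime, timedelta

# Path pattern taken from the defaults above; YYYYMMDD is filled in per day.
LOG_DIR = "/home/st_daily_verl/logs"


def triage_window(now: datetime) -> tuple[datetime, datetime]:
    """Return (start, end): previous day 00:00:00 local time up to `now`."""
    start = (now - timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return start, now


def training_log_path(day: datetime) -> str:
    """Build the dated training log path for the given day (hypothetical helper)."""
    return f"{LOG_DIR}/msverl_{day:%Y%m%d}.log"


def git_log_window_args(start: datetime, end: datetime) -> list[str]:
    """Assumption: commits in the window are listed with `git log`."""
    return [
        "git", "log",
        f"--since={start.isoformat(sep=' ', timespec='seconds')}",
        f"--until={end.isoformat(sep=' ', timespec='seconds')}",
    ]


if __name__ == "__main__":
    start, end = triage_window(datetime.now())  # naive local time
    print(start, end)
    print(training_log_path(end))
    print(" ".join(git_log_window_args(start, end)))
```

The sketch uses naive local datetimes because the defaults speak of local time. A job that runs across a daylight-saving change would need timezone-aware handling.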

Hard Stop Rules

  • Read the comparison log first.
  • If it contains "mean abs diff:" and the parsed value is exactly 0, stop and report success.
  • If it contains "mean abs diff:" and the parsed value is non-zero, classify the run as accuracy_regression.
  • If it contains "error, please check log", classify the run as train_error.
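The rules above can be sketched as a small classifier. This is a minimal sketch under stated assumptions, not the skill's actual implementation. It assumes that the number follows "mean abs diff:" on the same line, that the first occurrence is the one that counts, and that the diff rules take precedence over the error marker when both appear. The `Verdict` enum, the function name, and the `SUCCESS` and `UNKNOWN` labels are hypothetical; only accuracy_regression and train_error come from the rules.

```python
import re
from enum import Enum


class Verdict(Enum):
    SUCCESS = "success"                          # hypothetical label
    ACCURACY_REGRESSION = "accuracy_regression"  # from the rules above
    TRAIN_ERROR = "train_error"                  # from the rules above
    UNKNOWN = "unknown"                          # hypothetical fallback


# Assumed log shape: "mean abs diff: <decimal or scientific number>".
_DIFF_RE = re.compile(
    r"mean abs diff:\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)"
)
_ERROR_MARKER = "error, please check log"


def classify_comparison_log(text: str) -> Verdict:
    """Apply the hard stop rules to the comparison log contents."""
    match = _DIFF_RE.search(text)
    if match:
        value = float(match.group(1))
        # "Exactly 0" is read literally: no tolerance is applied.
        if value == 0.0:
            return Verdict.SUCCESS
        return Verdict.ACCURACY_REGRESSION
    if _ERROR_MARKER in text:
        return Verdict.TRAIN_ERROR
    # Neither marker found: the rules above do not cover this case.
    return Verdict.UNKNOWN


if __name__ == "__main__":
    # Default comparison log path from the Defaults section.
    with open("/home/st_daily_verl/msverl.log", encoding="utf-8") as fh:
        print(classify_comparison_log(fh.read()).value)
```

A value such as nan does not match the numeric pattern and falls through to the later checks. Whether that is the desired behaviour depends on how the comparison script formats its output.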
First Seen
Apr 3, 2026