verl-rl-training


verl: Volcano Engine Reinforcement Learning for LLMs

verl is a flexible, efficient, and production-ready RL training library for large language models, developed by ByteDance's Seed team. It implements the HybridFlow framework (EuroSys 2025) and powers models such as Doubao-1.5-pro, which achieves O1-level performance on math benchmarks.

When to Use verl

Choose verl when you need:

  • Production-ready RL training at scale (tested up to 671B parameters)
  • Flexibility to swap backends (FSDP or Megatron-LM for training, vLLM or SGLang for rollout generation)
  • Support for multiple RL algorithms (PPO, GRPO, RLOO, REINFORCE++, DAPO)
  • Multi-turn rollout with tool calling for agentic workflows
  • Vision-language model RL training
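
To make one of the listed algorithms concrete, here is a minimal sketch of the group-relative advantage idea behind GRPO: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group, so no learned value model is needed. This is an illustration only, not verl's implementation; the function name `group_relative_advantages`, the `eps` constant, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the statistics of its own group.

    `rewards` holds the scalar rewards of all responses sampled for ONE prompt.
    Hypothetical helper for illustration; not part of verl's API.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    # eps avoids division by zero when every response gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one math prompt: two correct (1.0), two wrong (0.0)
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Correct answers end up with a positive advantage and wrong ones with a negative advantage. If every response in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal.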

Consider alternatives when:

  • You need Megatron-native training → use slime or miles
  • You want PyTorch-native abstractions with Monarch → use torchforge
  • You only need simple SFT/DPO → use TRL or Axolotl

Key Features

Installs: 83
GitHub Stars: 27.2K
First Seen: Jan 29, 2026