verl-rl-training

Originally from davila7/claude-code-templates

verl: Volcano Engine Reinforcement Learning for LLMs

verl is a flexible, efficient, and production-ready RL training library for large language models from ByteDance's Seed team. It implements the HybridFlow framework (EuroSys 2025) and powers models such as Doubao-1.5-pro, which reaches O1-level performance on math benchmarks.

When to Use verl

Choose verl when you need:

Production-ready RL training at scale (tested up to 671B parameters)
Flexibility to swap training backends (FSDP ↔ Megatron-LM) and rollout engines (vLLM ↔ SGLang)
Support for multiple RL algorithms (PPO, GRPO, RLOO, REINFORCE++, DAPO)
Multi-turn rollout with tool calling for agentic workflows
Vision-language model RL training

Consider alternatives when:

You need Megatron-native training → use slime or miles
You want PyTorch-native abstractions with Monarch → use torchforge
You only need simple SFT/DPO → use TRL or Axolotl
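As a rough illustration of what swapping algorithms and rollout engines can look like in practice, the sketch below builds a Hydra-style command line. It assumes the `verl.trainer.main_ppo` entry point and the `algorithm.adv_estimator` and `actor_rollout_ref.rollout.name` config keys; these names and the estimator values may differ between verl releases, so check them against the installed version. The helper functions themselves are hypothetical and not part of verl. DAPO is left out because it is typically run from its own recipe rather than selected by a single override.

```python
# Hypothetical helpers, not part of verl: they only assemble a command line.
# The config keys and values below are assumptions to verify against the
# verl version you have installed.

ADV_ESTIMATORS = {
    "PPO": "gae",                          # PPO with a critic and GAE advantages
    "GRPO": "grpo",
    "RLOO": "rloo",
    "REINFORCE++": "reinforce_plus_plus",
}

ROLLOUT_ENGINES = {"vLLM": "vllm", "SGLang": "sglang"}


def build_overrides(algorithm: str, rollout_engine: str) -> list[str]:
    """Translate the human-readable choices into Hydra-style overrides."""
    if algorithm not in ADV_ESTIMATORS:
        raise ValueError(f"unsupported algorithm in this sketch: {algorithm}")
    if rollout_engine not in ROLLOUT_ENGINES:
        raise ValueError(f"unsupported rollout engine in this sketch: {rollout_engine}")
    return [
        f"algorithm.adv_estimator={ADV_ESTIMATORS[algorithm]}",
        f"actor_rollout_ref.rollout.name={ROLLOUT_ENGINES[rollout_engine]}",
    ]


def build_command(algorithm: str, rollout_engine: str) -> list[str]:
    """Return an argv list; pass it to subprocess.run to launch training."""
    return ["python3", "-m", "verl.trainer.main_ppo",
            *build_overrides(algorithm, rollout_engine)]


if __name__ == "__main__":
    print(" ".join(build_command("GRPO", "SGLang")))
```

A real run needs further overrides (model path, data files, GPU counts) that are not shown here; the point is only that the algorithm and the rollout engine are chosen by configuration rather than by code changes.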

Key Features

Repository

kiterlin/intell…n-system

First Seen

Apr 21, 2026

Security Audits

Gen Agent Trust Hub: Pass

Socket: Pass

Snyk: Warn
