openclaw-rl-training


OpenClaw-RL Training Skill

Skill by ara.so — Hermes Skills collection.

Overview

OpenClaw-RL is a fully asynchronous reinforcement learning framework that trains personalized AI agents from natural conversation feedback. It wraps self-hosted models in an OpenClaw-compatible API, intercepts live multi-turn conversations, and continuously optimizes the policy in the background without interrupting usage.
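The decoupling described above, where replies go out immediately while optimization happens in the background, can be illustrated with a small `asyncio` sketch. This is not OpenClaw-RL's code: `serve`, `train`, `run_demo`, the echo "model", and the batch size are all hypothetical stand-ins chosen for illustration.

```python
import asyncio


async def serve(prompts, queue):
    """Answer each prompt right away and log the turn for later training."""
    replies = []
    for prompt in prompts:
        reply = f"echo: {prompt}"          # hypothetical stand-in for a model call
        replies.append(reply)
        queue.put_nowait((prompt, reply))  # non-blocking hand-off to the trainer
        await asyncio.sleep(0)             # yield control so the trainer can run
    queue.put_nowait(None)                 # sentinel: no more turns
    return replies


async def train(queue, batch_size=2):
    """Consume logged turns in the background and count full batches."""
    batch, updates = [], 0
    while True:
        item = await queue.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            updates += 1                   # placeholder for a real policy update
            batch.clear()
    return updates


async def run_demo(prompts):
    """Run serving and training concurrently over a shared queue."""
    queue = asyncio.Queue()
    replies, updates = await asyncio.gather(serve(prompts, queue), train(queue))
    return replies, updates
```

Calling `asyncio.run(run_demo(["hi", "how are you?"]))` returns the replies together with the number of batches the background consumer processed. The serving loop never waits on training; it only enqueues, which is the property the paragraph above describes.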

Key capabilities:

  • Fully async 4-component architecture (serving, rollout, evaluation, training)
  • Three learning paradigms: Binary RL (GRPO), On-Policy Distillation (OPD), Hybrid Combine
  • Self-hosted and private — runs entirely on your infrastructure
  • Supports personal agent optimization and general agentic RL (terminal, GUI, SWE, tool-call)
  • Zero manual labeling — automatic trajectory creation from conversations
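For the Binary RL (GRPO) paradigm listed above, a minimal sketch of the group-relative advantage that GRPO is generally described as using: rewards for several responses to the same prompt are normalized by the group's mean and standard deviation. Whether OpenClaw-RL computes advantages exactly this way is an assumption; the function name and the `eps` value are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its group's mean and standard deviation.

    `rewards` holds one scalar per sampled response to the same prompt,
    for example 1 for positive feedback and 0 for negative feedback.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when every response got the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]
```

With binary rewards, responses that received positive feedback get a positive advantage and the rest a negative one. A group in which every response got the same reward yields all-zero advantages, so it contributes no learning signal.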

Installation

Prerequisites

First Seen
May 16, 2026