slime-rl-training

slime: LLM Post-Training Framework for RL Scaling

slime is an LLM post-training framework from Tsinghua's THUDM team, powering GLM-4.5, GLM-4.6, and GLM-4.7. It connects Megatron-LM for training with SGLang for high-throughput rollout generation.
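The split between rollout generation and training described above can be pictured as a loop: an inference engine produces samples, a data buffer holds them, and a trainer consumes batches and updates the policy. The sketch below is a toy illustration of that pattern only. It does not use slime, Megatron-LM, or SGLang, and every name in it (Sample, DataBuffer, generate_rollouts, train_step, run_loop) is hypothetical, not part of any of those libraries.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float


class DataBuffer:
    """Hypothetical FIFO buffer sitting between rollout and training."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def add(self, samples):
        self._items.extend(samples)

    def pop_batch(self, n):
        # Remove and return up to n of the oldest samples.
        batch = []
        while self._items and len(batch) < n:
            batch.append(self._items.popleft())
        return batch

    def __len__(self):
        return len(self._items)


def generate_rollouts(prompts, policy_version, rng):
    # Stand-in for the inference engine: one fake response per prompt,
    # scored with a random reward.
    return [Sample(p, f"answer-v{policy_version}", rng.random()) for p in prompts]


def train_step(batch, policy_version):
    # Stand-in for the trainer: a real step would update model weights.
    # Here it only bumps a version counter.
    return policy_version + 1


def run_loop(prompts, num_iterations, batch_size, seed=0):
    rng = random.Random(seed)
    buffer = DataBuffer(capacity=1024)
    version = 0
    history = []
    for _ in range(num_iterations):
        buffer.add(generate_rollouts(prompts, version, rng))
        batch = buffer.pop_batch(batch_size)
        mean_reward = sum(s.reward for s in batch) / len(batch)
        version = train_step(batch, version)
        history.append((version, len(batch), mean_reward))
    return history


if __name__ == "__main__":
    for version, size, reward in run_loop(["a", "b", "c", "d"], 3, 2):
        print(f"policy v{version}: trained on {size} samples, mean reward {reward:.3f}")
```

Keeping the buffer as its own object is the design point of the sketch: generation and training only meet through add and pop_batch, so either side could be swapped out without touching the other.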

When to Use slime

Choose slime when you need:

Megatron-LM native training with SGLang inference
Custom data generation workflows with flexible data buffers
Training GLM, Qwen3, DeepSeek V3, or Llama 3 models
Research-grade framework with production backing (Z.ai)

Consider alternatives when:

You need enterprise-grade stability features → use miles
You want flexible backend swapping → use verl
You need PyTorch-native abstractions → use torchforge

Key Features

Repository: firecrawl/ai-re…h-skills
First Seen: Mar 28, 2026
Security Audits: Gen Agent Trust Hub (Warn), Socket (Pass), Snyk (Pass)
