slime-user
SLIME User Guide
SLIME is an LLM post-training framework for RL scaling, developed by THUDM. It supports several RL algorithms (GRPO, GSPO, PPO, Reinforce++), multiple training backends (Megatron, FSDP), and advanced features such as multi-turn interactions, tool calling, and dynamic sampling.
Quick Start Workflow
For First-Time Users
- Environment Setup
  - Use Docker: docker pull slimerl/slime:latest
  - Or build from source: See docs/en/get_started/quick_start.md
  - Hardware: Supports H100/H200, B200 series
- Download Model and Data
  hf download Qwen/Qwen3-4B --local-dir /root/Qwen3-4B
  hf download --repo-type dataset zhuzilin/dapo-math-17k --local-dir /root/dapo-math-17k
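The two setup steps above can be collected into one script. The following is a minimal sketch, not part of SLIME itself: the `DRY_RUN` variable and the `run` and `setup_slime` helpers are hypothetical names introduced for illustration. By default it only prints the commands from this guide, so the sequence can be reviewed before anything is pulled or downloaded.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run, and setup_slime are illustrative helpers,
# not SLIME commands. The wrapped commands are the ones listed in this guide.
set -eu

# Default to printing commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    # Show the command that would be executed.
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

setup_slime() {
  # Step 1: environment setup via the Docker image.
  run docker pull slimerl/slime:latest
  # Step 2: download the model and the dataset.
  run hf download Qwen/Qwen3-4B --local-dir /root/Qwen3-4B
  run hf download --repo-type dataset zhuzilin/dapo-math-17k --local-dir /root/dapo-math-17k
}

setup_slime
```

Running the script with `DRY_RUN=0` would execute the commands for real, which assumes `docker` and `hf` are installed and that `/root` is writable.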