slime-rl-training

Pass

Audited by Gen Agent Trust Hub on Apr 4, 2026

Risk Level: SAFE
EXTERNAL_DOWNLOADS · COMMAND_EXECUTION · REMOTE_CODE_EXECUTION · PROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill fetches the framework source code from a public GitHub repository (github.com/THUDM/slime.git) and a Docker image (slimerl/slime:latest). These sources appear consistent with the project's identity and the author's description.
  • [COMMAND_EXECUTION]: The skill instructs the agent to execute shell commands for setup and training: 'docker pull', 'git clone', running 'python train.py', and sourcing 'scripts/models/qwen3-4B.sh'. Sourcing external shell scripts to set environment variables is a common pattern in Megatron-LM-based frameworks.
  • [REMOTE_CODE_EXECUTION]: The framework supports loading user-defined Python scripts at runtime via arguments such as '--custom-generate-function-path' and '--custom-rm-path'. This executes whatever code the user supplies, but it is presented as the mechanism for providing custom reward modeling and data generation logic.
  • [PROMPT_INJECTION]: The skill is designed to process external training data provided via the '--prompt-data' argument. This creates a potential surface for indirect prompt injection: malicious instructions embedded in the JSONL data could attempt to influence the training rollout logic or the behavior of the resulting model.
      • Ingestion points: Training data is loaded from local paths specified by the user (e.g., SKILL.md mentions '--prompt-data /path/to/data.jsonl').
      • Boundary markers: Not explicitly defined in the examples provided; standard machine learning data loading procedures are utilized.
      • Capability inventory: The training scripts execute Python code, perform high-performance networking (NCCL/Ray), and perform file I/O for saving checkpoints and logs.
      • Sanitization: No specific sanitization or escaping mechanisms for training data are described in the provided files.
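Because the audited files describe no sanitization step, a user could run a pre-flight check on the JSONL file before passing it to '--prompt-data'. The sketch below is illustrative only and is not part of slime or the skill: the function names and the regular expression are hypothetical, and it assumes nothing about the record schema beyond one JSON value per line. It flags lines that fail to parse and lines whose string values contain instruction-like phrases.

```python
import json
import re

# Hypothetical patterns for illustration; tune these to your own data.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|system prompt",
    re.IGNORECASE,
)


def iter_strings(value):
    """Yield every string value nested inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def audit_jsonl_lines(lines):
    """Return (line_number, reason) pairs for records that need manual review."""
    findings = []
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # blank lines are skipped, not reported
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            findings.append((number, "invalid JSON"))
            continue
        # Report each line at most once, on the first matching string.
        for text in iter_strings(record):
            if SUSPICIOUS.search(text):
                findings.append((number, "instruction-like text"))
                break
    return findings
```

An open file object can be passed directly, for example `with open(path, encoding="utf-8") as f: audit_jsonl_lines(f)`. A keyword scan like this only surfaces candidates for human review; it does not establish that a dataset is free of injected instructions.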
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 4, 2026, 05:50 PM