simpo-training

Originally from ovachiever/droid-tings

SimPO - Simple Preference Optimization

Quick start

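SimPO is a reference-free preference optimization method. It uses the policy's length-normalized (average per-token) log-probability as the implicit reward and adds a target reward margin between chosen and rejected responses, so no reference model has to be loaded during training. Its authors report that it outperforms DPO.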
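To make the reference-free objective concrete, here is a minimal sketch of the SimPO loss for a single preference pair in plain Python. The function name `simpo_loss` and the default values of `beta` and `gamma` are illustrative assumptions, not taken from this skill or from alignment-handbook. A real training run would compute this over batches of tensors inside the trainer.

```python
import math

def simpo_loss(chosen_token_logps, rejected_token_logps, beta=2.0, gamma=1.0):
    """SimPO loss for one preference pair (hypothetical helper).

    Inputs are per-token log-probabilities of each response under the policy.
    beta scales the reward; gamma is the target reward margin. Both defaults
    are placeholders chosen for illustration.
    """
    # Implicit reward: beta times the average per-token log-probability.
    # No reference-model term appears anywhere in the objective.
    r_chosen = beta * sum(chosen_token_logps) / len(chosen_token_logps)
    r_rejected = beta * sum(rejected_token_logps) / len(rejected_token_logps)

    # The chosen reward should exceed the rejected reward by at least gamma.
    margin = r_chosen - r_rejected - gamma

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Usage: the loss is small when the chosen response is clearly more likely.
print(simpo_loss([-0.5, -0.4], [-2.0, -1.8, -2.2]))
```

Because the reward is an average rather than a sum, a longer response does not gain or lose reward purely from its length.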

Installation:

# Create environment
conda create -n simpo python=3.10 && conda activate simpo

# Install PyTorch 2.2.2
# Pick the install command for your OS and CUDA version at:
# https://pytorch.org/get-started/locally/

# Install alignment-handbook
git clone https://github.com/huggingface/alignment-handbook.git
cd alignment-handbook
python -m pip install .