# Tinker API - LLM Fine-Tuning

## Overview
Tinker is a training API for large language models from Thinking Machines Lab. It provides:
- Supervised Fine-Tuning (SFT): Train models on instruction/completion pairs
- Reinforcement Learning (RL): PPO and policy-gradient losses; cookbook patterns include GRPO-style group rollouts with advantage centering
- Vision-Language Models: VLM support via Qwen3-VL
- LoRA Training: Efficient parameter-efficient fine-tuning
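The GRPO-style pattern mentioned above scores a group of rollouts sampled from the same prompt, then centers each reward against the group mean (and optionally scales by the group's standard deviation) to form advantages. A minimal sketch of that advantage computation, as an illustration of the idea rather than Tinker's actual implementation:

```python
from statistics import mean, pstdev

def group_advantages(rewards, eps=1e-8):
    """Center and scale rewards within one group of rollouts
    sampled from the same prompt, GRPO-style."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts for one prompt, scored 1.0 (correct) or 0.0 (incorrect).
advs = group_advantages([1.0, 0.0, 1.0, 0.0])
```

Because advantages are centered per group, they sum to (approximately) zero: correct rollouts get positive weight and incorrect ones negative weight in the policy-gradient update, with no separate value network needed.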
Two abstraction levels:
- Tinker Cookbook: High-level patterns with automatic training loops
- Low-Level API: Manual control for custom training logic
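At the low level, SFT on instruction/completion pairs reduces to next-token cross-entropy computed only over completion tokens, with prompt tokens masked out of the loss. A toy illustration of that masking logic in pure Python (assumed behavior of such a loop, not Tinker's API):

```python
import math

def masked_nll(token_logprobs, loss_mask):
    """Mean negative log-likelihood over tokens where loss_mask is 1.
    In instruction tuning, prompt tokens get mask 0 and completion
    tokens get mask 1, so the model trains only on completions."""
    kept = [-lp for lp, m in zip(token_logprobs, loss_mask) if m]
    return sum(kept) / len(kept)

# Per-token log-probs for a 4-token sequence:
# 2 prompt tokens (masked out), 2 completion tokens (kept).
logprobs = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.125)]
mask = [0, 0, 1, 1]
loss = masked_nll(logprobs, mask)  # averages -log(0.5) and -log(0.125)
```

The cookbook's training loops handle this bookkeeping automatically; the low-level path exposes it when you need a custom objective.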
## Quick Reference

| Topic | Reference |
|---|---|