unsloth-fft
Overview
Full Fine-Tuning (FFT) in Unsloth updates every model weight exactly, rather than relying on the low-rank approximations used by LoRA. By using Unsloth's optimized gradient checkpointing, FFT can fit significantly larger batch sizes in memory while still modifying the entire model.
When to Use
- When performing base model pre-training or continued pre-training on large datasets.
- When model-wide behaviors need modification that adapters (LoRA) cannot fully capture.
- When sufficient VRAM is available to handle full model gradients.
Decision Tree
- Do you need to modify 100% of the model weights?
  - Yes: Proceed with FFT.
  - No: Use [[unsloth-lora]].
- Is VRAM limited (e.g., < 24GB for a 7B model)?
  - Yes: Enable `use_gradient_checkpointing = 'unsloth'` and `adamw_8bit` (see the sketch after this list).
  - No: Use standard BF16 and larger batch sizes.
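For the VRAM-limited branch, a minimal sketch of the trainer-side flags is shown below, assuming TRL's `SFTConfig`; the output directory and batch-size values are placeholder assumptions, and the full loading workflow appears in the Workflows section.

```python
from trl import SFTConfig

# VRAM-limited branch: 8-bit AdamW reduces optimizer-state memory, and
# gradient checkpointing recomputes activations instead of storing them.
# (use_gradient_checkpointing = 'unsloth' in the decision tree refers to
# Unsloth's optimized variant of this mechanism.)
low_vram_args = SFTConfig(
    output_dir="outputs",               # placeholder
    per_device_train_batch_size=1,      # keep the micro-batch small
    gradient_accumulation_steps=8,      # recover the effective batch size
    optim="adamw_8bit",                 # requires bitsandbytes
    gradient_checkpointing=True,
    bf16=True,
)
```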
Workflows
Initializing Full Fine-Tuning
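A minimal sketch of the initialization step, assuming a recent Unsloth release that exposes `full_finetuning` in `FastLanguageModel.from_pretrained` and TRL's `SFTTrainer`; the model name, dataset file, and hyperparameters are placeholders, not recommendations.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# Load the model with every weight trainable instead of attaching LoRA adapters.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b",  # placeholder checkpoint
    max_seq_length=2048,
    load_in_4bit=False,               # FFT trains the full-precision weights
    full_finetuning=True,
)

# Placeholder dataset with a "text" column.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,              # newer TRL versions use processing_class= instead
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        output_dir="outputs",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        optim="adamw_8bit",           # as in the decision tree's low-VRAM branch
        bf16=True,
        learning_rate=2e-5,
        num_train_epochs=1,
    ),
)
trainer.train()
```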