# QLoRA: Quantized Low-Rank Adaptation
QLoRA enables fine-tuning of large language models on consumer GPUs by combining 4-bit quantization with LoRA adapters. A 65B model can be fine-tuned on a single 48GB GPU while matching 16-bit fine-tuning performance.
Prerequisites: This skill assumes familiarity with LoRA. See the lora skill for LoRA fundamentals (`LoraConfig`, `target_modules`, training patterns).
## Table of Contents
- Core Innovations
- BitsAndBytesConfig Deep Dive
- Memory Requirements
- Complete Training Example
- Inference and Merging
- Troubleshooting
- Best Practices
- References
## Core Innovations
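As a minimal sketch of how the pieces described above fit together, the following loads a frozen base model in 4-bit NF4 precision and attaches trainable LoRA adapters. It assumes the standard Hugging Face `transformers`, `peft`, and `bitsandbytes` APIs; the model name and hyperparameters are illustrative placeholders, not recommendations:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization config: NF4 data type with double quantization,
# computing forward/backward passes in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder base model; substitute your own
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model for training: casts layer norms to fp32
# and enables gradient checkpointing for stability.
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters; only these small matrices receive gradients.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The quantized base weights stay frozen throughout training; only the adapters are updated in higher precision, which is what keeps the memory footprint within a single consumer GPU.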