quantizing-models-bitsandbytes

Originally from ovachiever/droid-tings

bitsandbytes - LLM Quantization

Quick start

bitsandbytes reduces LLM weight memory by roughly 50% (8-bit) or 75% (4-bit) relative to fp16, typically with under 1% accuracy loss.
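The percentages above follow directly from bits per parameter. A minimal sketch of the arithmetic, using a hypothetical 7B-parameter model as illustration (weights only; runtime activations and framework overhead are ignored):

```python
def quantized_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB: params * bits / 8 bits-per-byte / 1e9."""
    return n_params * bits_per_param / 8 / 1e9

# Hypothetical 7B-parameter model, e.g. a Llama-2-7B-sized network:
fp16 = quantized_memory_gb(7e9, 16)  # 14.0 GB baseline
int8 = quantized_memory_gb(7e9, 8)   # 7.0 GB  -> 50% of fp16
nf4  = quantized_memory_gb(7e9, 4)   # 3.5 GB  -> 25% of fp16
print(fp16, int8, nf4)
```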

Installation:

pip install bitsandbytes transformers accelerate

8-bit quantization (50% memory reduction):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=config,
    device_map="auto",  # requires accelerate; places layers on available devices
)
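4-bit quantization (75% memory reduction):

The quick start mentions 4-bit but only shows the 8-bit config. A minimal 4-bit (NF4) sketch using the same `BitsAndBytesConfig` API; the `bnb_4bit_*` settings shown are common choices, not requirements:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4, suited to normally distributed weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls at runtime
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=config,
    device_map="auto",
)
```

Note that loading this model requires a GPU, the accelerate package, and access to the gated Llama-2 weights on the Hugging Face Hub.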
Installs: 315 · GitHub Stars: 27.2K · First Seen: Jan 21, 2026