model-pruning

Originally from ovachiever/droid-tings

Installation

SKILL.md

Model Pruning: Compressing LLMs

When to Use This Skill

Use Model Pruning when you need to:

Reduce model size by 40-60% with <1% accuracy loss
Accelerate inference using hardware-friendly sparsity (2-4× speedup)
Deploy on constrained hardware (mobile, edge devices)
Compress without retraining using one-shot methods
Enable efficient serving with reduced memory footprint
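
One common form of the hardware-friendly sparsity mentioned above is an N:M pattern such as 2:4, where only two of every four consecutive weights stay nonzero. The sketch below is a minimal pure-Python illustration, not code from this skill: the helper name `nm_prune` is hypothetical, and plain weight magnitude is assumed as the importance score (another score could be substituted).

```python
def nm_prune(row, n=2, m=4):
    """Keep the n largest-magnitude weights in each group of m consecutive weights.

    Hypothetical helper for illustration; magnitude is an assumed importance score.
    """
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    out = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Positions (within the group) of the n largest absolute values.
        keep = set(sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n])
        # Zero every weight whose position is not kept.
        out.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return out


row = [0.9, -0.1, 0.5, 0.2, 0.05, -0.7, 0.3, 0.6]
sparse_row = nm_prune(row)  # 2:4 pattern: exactly two nonzeros per group of four
```

Because every group of four has the same number of nonzeros, the layout is regular, and that regularity is what lets hardware skip the zeros efficiently.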

Key Techniques: Wanda (weights × activations), SparseGPT (second-order), structured pruning, N:M sparsity
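
To make "weights × activations" concrete: Wanda scores each weight by its magnitude times the L2 norm of the matching input feature over a small calibration set, then drops the lowest-scoring weights within each output row. The following is a minimal pure-Python sketch under those assumptions, not the reference implementation. The function name `wanda_prune` and the toy matrices are made up for illustration.

```python
import math


def wanda_prune(weights, activations, sparsity=0.5):
    """Zero the lowest-scoring weights in each output row.

    weights:     rows = output neurons, columns = input features
    activations: calibration inputs, rows = tokens, columns = input features
    Score for weight (i, j) = |W[i][j]| * L2 norm of input feature j.
    Hypothetical helper for illustration only.
    """
    n_in = len(weights[0])
    # L2 norm of each input feature across all calibration tokens.
    norms = [math.sqrt(sum(tok[j] ** 2 for tok in activations)) for j in range(n_in)]
    n_prune = int(n_in * sparsity)
    pruned = []
    for w_row in weights:
        scores = [abs(w) * norms[j] for j, w in enumerate(w_row)]
        # Indices of the n_prune lowest scores within this row.
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(w_row)])
    return pruned


# Toy data: input feature 1 has much larger activations than the others.
W = [[0.9, -0.1, 0.5, 0.2],
     [0.1, 0.8, -0.3, 0.05]]
X = [[0.1, 10.0, 1.0, 1.0],
     [0.1, 10.0, 1.0, 1.0]]
pruned = wanda_prune(W, X, sparsity=0.5)
```

In the first row, magnitude-only pruning would remove the small weight -0.1. The activation-aware score keeps it, because its input feature is large, and removes 0.9 instead, whose input is nearly silent.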

Papers: Wanda ICLR 2024 (arXiv 2306.11695), SparseGPT (arXiv 2301.00774)

Installs: 356

Repository: orchestra-resea…h-skills

GitHub Stars: 10.5K

First Seen: Feb 7, 2026

Security Audits: Gen Agent Trust Hub: Pass

model-pruning — orchestra-research/ai-research-skills