gptq

GPTQ (Generative Pre-trained Transformer Quantization)

GPTQ is a post-training quantization method that compresses LLM weights to 4 bits with minimal accuracy loss, using group-wise quantization.
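To make "group-wise quantization" concrete, here is a minimal sketch in plain Python. It is an illustration only: it uses simple round-to-nearest quantization with one scale and offset per group, not GPTQ's actual error-compensating weight updates. The function names and the tiny group size of 4 are assumptions chosen for readability; real configurations typically use much larger groups.

```python
def quantize_groupwise(weights, group_size=4, bits=4):
    """Round-to-nearest quantization with one (scale, offset) pair per group.

    Illustrative only: GPTQ itself additionally adjusts the remaining
    weights to compensate for the rounding error of each quantized weight.
    """
    qmax = (1 << bits) - 1  # largest integer code, 15 for 4-bit
    codes, params = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Fall back to 1.0 when the group is constant to avoid dividing by zero.
        scale = (hi - lo) / qmax or 1.0
        params.append((scale, lo))
        codes.extend(round((w - lo) / scale) for w in group)
    return codes, params


def dequantize_groupwise(codes, params, group_size=4):
    """Map integer codes back to approximate floating-point weights."""
    restored = []
    for i, code in enumerate(codes):
        scale, lo = params[i // group_size]
        restored.append(code * scale + lo)
    return restored


# Two groups with very different value ranges (hypothetical weights).
weights = [0.10, -0.20, 0.05, 0.30, 2.0, -1.5, 0.75, 1.0]
codes, params = quantize_groupwise(weights)
restored = dequantize_groupwise(codes, params)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, max_error)
```

Because each group stores its own scale, the small-valued first group is not forced onto the coarse grid needed by the large-valued second group, which is the main reason per-group parameters help preserve accuracy.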

When to use GPTQ

Use GPTQ when:

Need to fit large models (70B+) on limited GPU memory
Want 4× memory reduction with <2% accuracy loss
Deploying on consumer GPUs (RTX 4090, 3090)
Need faster inference (3-4× speedup vs FP16)
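The memory figures above can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below estimates weight storage only (no KV cache, activations, or runtime overhead). The helper name, the group size of 128, and the 16-bit scale plus 4-bit zero point per group are assumptions for illustration, not values taken from this document.

```python
def weight_memory_gb(n_params, bits, group_size=None, scale_bits=16, zero_bits=4):
    """Estimate weight storage in GB (1 GB = 1e9 bytes).

    When group_size is given, each group of weights also stores one scale
    and one zero point, which adds a small per-weight overhead.
    """
    bits_per_weight = bits
    if group_size:
        bits_per_weight += (scale_bits + zero_bits) / group_size
    return n_params * bits_per_weight / 8 / 1e9


n_params = 70e9  # a 70B-parameter model
fp16_gb = weight_memory_gb(n_params, bits=16)
gptq_gb = weight_memory_gb(n_params, bits=4, group_size=128)
ratio = fp16_gb / gptq_gb

print(f"FP16 weights:  {fp16_gb:.1f} GB")
print(f"4-bit weights: {gptq_gb:.1f} GB")
print(f"Reduction:     {ratio:.2f}x")
```

Under these assumptions the 70B model drops from about 140 GB to about 36 GB of weights, slightly less than a full 4× reduction because of the per-group metadata.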

Use AWQ instead when:

Need slightly better accuracy (<1% loss)
Have newer GPUs (Ampere, Ada)
Want Marlin kernel support (2× faster on some GPUs)

Use bitsandbytes instead when:

Need simple integration with transformers

Repository

ovachiever/droid-tings

First Seen

Jan 20, 2026

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Warn
