claude-code-local-mlx
Installation
SKILL.md
Claude Code Local — MLX On-Device AI
Skill by ara.so — Claude Code Skills collection.
Run Claude Code 100% on-device with local AI on Apple Silicon. This project provides an MLX-native Anthropic-API-compatible server that lets you use powerful local models (Qwen 3.5 122B at 65 tok/s, Llama 3.3 70B, Gemma 4 31B, DeepSeek V4 Flash with 1M context) as drop-in replacements for Claude. Built for privacy-critical workflows (NDA, legal, healthcare) where data cannot leave the device.
What It Does
- MLX-native inference optimized for Apple Silicon (M1/M2/M3/M4)
- Anthropic API compatibility — point Claude Code clients at
localhost:8000 - Four model options: Gemma 4 31B (fast), Qwen 3.5 122B (balanced), Llama 3.3 70B (dense), DeepSeek V4 Flash (1M context)
- 100% offline — works in airplane mode, airgap-ready
- Voice mode — hands-free coding with on-device STT/TTS
- Browser agent — remote access from any device on your LAN