
ds4
DeepSeek 4 Flash local inference engine for Metal
The Lens
DS4 is a single-model inference engine: it runs DeepSeek V4 Flash locally on Apple Silicon (Metal) or Linux (CUDA), and only that model. Antirez, the creator of Redis, built it as a focused experiment. The point is to run a 284B-parameter frontier-class model on a Mac Studio or a high-end Linux box without going through llama.cpp's generic GGUF loader. With 2-bit quantization the q2 build will fit on a 128GB Mac, q4 needs 256GB plus.
Setup is a download script and a make. You get a CLI with `/think` and `/nothink` modes, and an OpenAI- and Anthropic-compatible HTTP server that drops into any client that already speaks those APIs. On a Mac Studio M3 Ultra Antirez reports 84 tokens/sec prefill and 37 tokens/sec generation at 2-bit. Context window goes up to 1 million tokens.
Use this if you specifically want DeepSeek V4 Flash running locally on serious hardware. The appeal is sovereignty, not portability. Solo developers with a Mac Studio: this is a fun way to burn GPU hours. Anyone else: stick with llama.cpp or vLLM until DS4 ships a stable release.
The catch: this is alpha code, by Antirez's own admission. He notes the implementation leans heavily on GPT 5.5 assistance and acknowledges debt to llama.cpp. One model, one workload, no production claims. Treat it accordingly.
Free vs Self-Hosted vs Paid
fully freeFree: MIT licensed source. Model weights are DeepSeek's, also free under their license. No API keys, no usage limits beyond your local hardware.
Self-hosted: Runs entirely on your own machine. Hardware is the real cost: a 128GB Mac Studio starts around $4,800; a Linux box with 256GB+ RAM and capable GPUs runs higher. Electricity for long generations is real.
Paid: None. DS4 is a personal project; there is no hosted version.
Free if you have $5,000 of Apple Silicon already on your desk.
Get tools like this every Wednesday
One featured tool, three on the radar. No fluff.
License: MIT License
Use freely, including commercial. Just keep the license.
Commercial use: ✓ Yes
About
- Owner
- Salvatore Sanfilippo (User)
- Stars
- 11,582
- Forks
- 976
Explore Further
More tools in the directory
openclaw
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
373.3k ★claw-code
The repo is finally unlocked. enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
192.0k ★n8n
Fair-code workflow automation with native AI capabilities
189.4k ★