
omlx
LLM inference server with continuous batching and SSD caching for Apple Silicon, managed from the macOS menu bar.
The Lens
Omlx puts an LLM inference server in your macOS menu bar. Click the icon, pick a model, and you have a local AI API running. It uses continuous batching (multiple requests served concurrently on one model) and SSD caching (models load faster after the first run), both optimized specifically for Apple Silicon.
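To see what continuous batching buys you, fire several requests at the server concurrently instead of one at a time; a batching server interleaves them on the same model rather than queueing them serially. A rough sketch, assuming the OpenAI-style chat completions endpoint described below; the port and model name are placeholders:

```python
# Send several prompts concurrently to a local omlx server and time them.
# With continuous batching, total wall-clock time should be far less than
# the sum of the individual request times. URL and model are assumptions.
import asyncio
import time

import httpx

URL = "http://localhost:8080/v1/chat/completions"  # hypothetical local endpoint

async def ask(client: httpx.AsyncClient, prompt: str) -> str:
    resp = await client.post(URL, json={
        "model": "local-model",  # placeholder model identifier
        "messages": [{"role": "user", "content": prompt}],
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

async def main() -> None:
    prompts = [f"Give me one fact about Apple Silicon, number {i}." for i in range(8)]
    async with httpx.AsyncClient(timeout=120) as client:
        start = time.perf_counter()
        answers = await asyncio.gather(*(ask(client, p) for p in prompts))
        print(f"{len(answers)} responses in {time.perf_counter() - start:.1f}s")

asyncio.run(main())
```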
This is the easiest way to run local LLMs on a Mac right now. No Docker, no Python environments, no config files. Menu bar app, one click, done. The API is OpenAI-compatible, so any tool that talks to OpenAI can point at your local omlx instead. Apache 2.0 licensed, written in Python.
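In practice, OpenAI compatibility means you can reuse the official OpenAI Python client and just swap the base URL. A minimal sketch; the port, path, and model name are assumptions, so use whatever the menu bar app actually reports:

```python
# Point the official OpenAI Python client at a local omlx server.
# base_url, api_key handling, and the model name are illustrative
# assumptions, not omlx documentation.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="local-model",  # placeholder model identifier
    messages=[{"role": "user", "content": "Summarize continuous batching in one sentence."}],
)
print(response.choices[0].message.content)
```

The same trick works for any tool with a configurable OpenAI base URL, which is most of them.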
The catch: Mac only. Apple Silicon specifically; Intel Macs are either unsupported or severely limited. Performance depends on your Mac's unified memory: 8GB will run small models, but you need 32GB+ for anything serious. And "menu bar simplicity" means less control over advanced settings like quantization, context length, and memory allocation.
Free vs Self-Hosted vs Paid
Fully free: fully open source under Apache 2.0. No paid tier, no cloud version. Everything runs on your Mac; the only cost is the electricity and the Mac you already own.
About
- Stars: 8,630
- Forks: 728