
# text-generation-webui

Local LLM interface with text, vision, and training

## The Lens
Text-generation-webui gives you a browser-based interface for running large language models on your own hardware. Load a model, chat with it, fine-tune it, generate images. It's the Swiss Army knife of local AI.
The entire project is free under AGPL-3.0. Every feature (chat, notebook mode, model loading, LoRA training, multimodal/vision support, extensions) ships at $0. The developer sells some extension packs on Gumroad, but those are optional add-ons, not core features.
Self-hosting is the only option, and setup complexity depends on your GPU. With an NVIDIA card that has 8GB+ VRAM, the one-click installers work well; AMD and Apple Silicon support exists but can be finicky. Expect 30-60 minutes for first-time setup, including downloading a model.

Solo developers: this is your playground. Run models locally, experiment with fine-tuning, keep your data private. Small teams: share a beefy GPU server running API mode. Beyond that, look at dedicated inference servers like vLLM.
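For the shared-server setup, text-generation-webui can expose an OpenAI-compatible HTTP API when started with the `--api` flag (by default on port 5000). A minimal stdlib-only client sketch, assuming the default host and port:

```python
# Minimal client for text-generation-webui's OpenAI-compatible API.
# Assumes the server was started with --api and is using the default
# port 5000; adjust API_URL if your setup differs.
import json
from urllib import request

API_URL = "http://127.0.0.1:5000/v1/chat/completions"

def build_payload(user_message, max_tokens=200):
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def chat(user_message):
    """Send one chat turn to the local server and return the reply text."""
    req = request.Request(
        API_URL,
        data=json.dumps(build_payload(user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# chat("Summarize what a LoRA adapter is.")  # requires a running server
```

Because the API speaks the OpenAI wire format, existing OpenAI client libraries can usually be pointed at the local server by overriding the base URL.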
The catch: GPU hardware requirements are real. You need a decent GPU to run anything useful. A 7B parameter model needs ~6GB VRAM. Anything bigger needs proportionally more. No GPU, no party.
## Free vs Self-Hosted vs Paid
### Free
Everything core is free: chat interface, notebook mode, model management, LoRA/QLoRA fine-tuning, multimodal support, API server mode, and the extensions framework, all under AGPL-3.0.
### Optional Paid Extensions
The developer sells extension packs on Gumroad (e.g., "Deep Reason" for enhanced reasoning). These are optional convenience add-ons, not gated features. The core project doesn't depend on them.
### Self-Hosted (Only Option)
This runs on your hardware. Minimum useful setup: an NVIDIA GPU with 8GB VRAM (~$200-400 used for an RTX 3060 12GB). Ideal setup: a 24GB VRAM card ($400-900 used for an RTX 3090/4090). Running costs are electricity: expect $5-15/month for regular use.
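The electricity estimate is easy to sanity-check yourself. A rough sketch of the arithmetic, assuming a 300 W card and a $0.16/kWh rate (both placeholder numbers; use your own GPU's draw and local tariff):

```python
def monthly_power_cost(gpu_watts, hours_per_day, usd_per_kwh, days=30):
    """Electricity cost of running the GPU: watts -> kWh -> dollars."""
    kwh = gpu_watts * hours_per_day * days / 1000
    return kwh * usd_per_kwh

# Example (assumed figures): a 300 W card used 4 hours/day at $0.16/kWh
cost = monthly_power_cost(300, 4, 0.16)  # about $5.76/month
```

That lands at the low end of the $5-15/month range above; heavier daily use or a 350-450 W card pushes it toward the top.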
### Hardware Cost Reality Check
- 7B models (Mistral, Llama 3): 6-8GB VRAM. Runs on an RTX 3060.
- 13B models: 10-16GB VRAM. Needs an RTX 3090 or better.
- 70B models: 40GB+ VRAM. Multi-GPU territory or heavy quantization.
- CPU-only: possible with the llama.cpp backend, but painfully slow for anything useful.
### Verdict
The software is free. The hardware isn't. Budget $300-900 for a capable GPU if you don't have one already.
Software is completely free. The real cost is GPU hardware: $300-900 for a capable NVIDIA card.
## About
- Stars: 46,412
- Forks: 5,905





