If you’re evaluating LLM gateways, you’ve probably looked at LiteLLM. It’s the most popular option - 100+ providers, big community, used by Stripe and Netflix. We built VoidLLM with different priorities. Here’s an honest look at both.
Privacy by architecture. VoidLLM never stores, logs, or persists any prompt or response content. Not as a config option - there’s no content logging code to disable. If GDPR compliance or data sovereignty is a hard requirement, this is the difference between “we turned off logging” and “logging doesn’t exist.” Read more in Zero-Knowledge by Architecture.
Single binary, no runtime. One Go binary (~25MB) with the admin UI embedded. No Python, no pip, no virtualenv, no dependency conflicts. Download, configure, run.
Performance. Under 500 microseconds of proxy overhead at 2000 RPS. Go + Fiber (fasthttp) keeps memory usage low and startup instant.
Built-in UI. A full admin dashboard ships inside the binary - key management, usage tracking, model configuration, playground, team management. Not a separate service to deploy.
MCP Gateway. VoidLLM doubles as an MCP gateway with scoped access control (per-org, per-team) and Code Mode for multi-tool orchestration in a WASM sandbox.
RBAC from the start. Org/team/user/key hierarchy with four roles. Rate limits, token budgets, and model access control at every level. Most-restrictive-wins inheritance.
Load balancing. Multi-deployment models with round-robin, least-latency, weighted, and priority routing. Automatic failover with per-deployment circuit breakers.
Provider coverage. 100+ providers out of the box - Bedrock, VertexAI, Cohere, and dozens more. VoidLLM supports 6 (OpenAI, Anthropic, Azure, Ollama, vLLM, custom). If you need native integration with a niche provider, LiteLLM has more ground covered.
Community size. Thousands of users, extensive documentation, large contributor base. VoidLLM is new - our docs are solid but our community is just getting started.
Python SDK. If your stack is Python-native and you want a library you can import directly, LiteLLM’s SDK is a natural fit. VoidLLM is a standalone proxy - you point your SDK at it.
Observability integrations. LiteLLM connects to Langfuse, Lunary, MLflow, and others for request-level observability. VoidLLM tracks usage metadata but deliberately avoids content-level logging.
| VoidLLM | LiteLLM | |
|---|---|---|
| Language | Go | Python |
| Proxy overhead | < 500us P50 | ~8ms P95 |
| Providers | 6 | 100+ |
| Content logging | Never (by design) | Optional (multiple backends) |
| Deployment | Single binary | Python runtime + deps |
| Admin UI | Embedded in binary | Separate service |
| MCP Gateway | Built-in + Code Mode | Recent addition |
| RBAC | Org/team/user/key | Virtual keys |
| Load balancing | 4 strategies + failover | Retry/fallback |
| Pro | $49/mo | On request |
| Enterprise | $149/mo | On request |
| License | BSL 1.1 | MIT |
If you need 30+ LLM providers and want a Python-native SDK with a large community, LiteLLM covers more ground.
If you care about privacy by design, want minimal operational overhead, need sub-millisecond proxy performance, or want an integrated MCP gateway - VoidLLM was built for that.
💡Switching is easy
Both proxies are OpenAI-compatible. Switching from LiteLLM to VoidLLM (or back) is a base URL change - your application code stays the same.
They solve overlapping problems with different priorities. Pick the one that matches yours.
An honest comparison between VoidLLM (self-hosted LLM proxy) and OpenRouter (hosted LLM aggregator) - when each makes sense.
VoidLLM's Code Mode lets AI agents orchestrate multiple MCP tool calls in a single WASM-sandboxed JavaScript execution. No round-trips, no latency penalty.
MCP tools advertise inputs but not outputs. We taught Code Mode to learn return types from the first successful call and surface them as TypeScript on the next discovery.
Most LLM proxies log your prompts. The EU AI Act makes that a compliance problem. Here's how VoidLLM's architecture simplifies things.