This is a how-to, not a pitch. If you’re still deciding between the two, the VoidLLM vs LiteLLM comparison covers that ground. This post assumes you’ve already decided to try VoidLLM and need a concrete checklist to get there without breaking your clients.
LiteLLM is a solid project. Most people we talk to who migrate do it for one of three reasons: a hard privacy or compliance requirement, a performance goal they can’t hit with a Python stack, or a preference for running one small Go binary instead of a Python service with its dependency tree. If none of those apply to you, LiteLLM is probably fine and you can close this tab.
Migrate if any of these are true:
Do NOT migrate if:
| LiteLLM concept | VoidLLM equivalent |
|---|---|
| Proxy config (config.yaml) | voidllm.yaml with ${ENV_VAR} interpolation |
| model_list entry | models: entry with provider + upstream config |
| Model groups (same model_name, multiple deployments) | Multi-deployment models with load balancing strategies |
| Virtual keys | VoidLLM user, team, and service account keys (vl_uk_, vl_tk_, vl_sa_) |
| Teams | Teams inside an org (org/team/user/key hierarchy) |
| Rate limits (RPM/TPM) | Per-key, per-team, per-org rate limits, most-restrictive-wins |
| Budgets (max_budget) | Per-key, per-team, per-org token and cost budgets |
| Router strategies | Round-robin, weighted, priority, least-latency |
| Fallbacks | Fallback chains (Enterprise) |
| Python SDK (litellm library) | No equivalent - use any OpenAI-compatible client pointed at VoidLLM |
| Langfuse/Lunary content logging | Not supported by design - only metadata is tracked |
The important mismatch is the SDK. VoidLLM is a standalone proxy, full stop. If your code calls litellm.completion(...) directly, you’ll swap it for the openai library pointed at your VoidLLM base URL. See Drop-in Replacement for the OpenAI SDK.
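One row of the table that benefits from a concrete reading is the rate-limit rule, "most-restrictive-wins". A minimal sketch of that idea follows, assuming each scope (key, team, org) either has a numeric limit or none at all. The function name and the use of `None` for "not configured" are illustrative choices, not VoidLLM internals.

```python
from typing import Optional


def effective_limit(*limits: Optional[int]) -> Optional[int]:
    """Return the tightest limit among the scopes that define one.

    None means "no limit configured at this scope" (an assumption of this sketch).
    """
    configured = [limit for limit in limits if limit is not None]
    return min(configured) if configured else None


# Hypothetical requests-per-minute values for one request's key, team, and org.
key_rpm, team_rpm, org_rpm = 600, 300, None
print(effective_limit(key_rpm, team_rpm, org_rpm))  # 300
```

When carrying limits over from LiteLLM, the practical consequence is that a generous per-key limit has no effect if the team or org limit above it is lower.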
The fastest path is Docker:
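    # Single-quote the admin key: its trailing exclamation marks would
    # otherwise trigger history expansion in an interactive bash session.
    docker run -p 8080:8080 \
      -v voidllm_data:/data \
      -e VOIDLLM_ENCRYPTION_KEY="$(openssl rand -base64 32)" \
      -e VOIDLLM_ADMIN_KEY='my-admin-key-at-least-32-chars!!' \
      ghcr.io/voidmind-io/voidllm:latest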
That’s the whole install. No runtime to prep, no pip, no virtualenv. The full walkthrough (bootstrap credentials, first request, UI tour) is in Getting Started: Docker to First Request.
If you prefer a binary, grab a release from GitHub and drop it on the host. Config lives in voidllm.yaml next to the binary, or wherever VOIDLLM_CONFIG points. Environment variables all use the VOIDLLM_ prefix, and ${ENV_VAR} interpolation works inside the YAML for secrets.
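Because secrets arrive through ${ENV_VAR} references, it can help to check that every referenced variable is actually set before starting the proxy. The sketch below is a hypothetical pre-flight helper, not part of VoidLLM. It assumes references look like `${NAME}` with identifier-style names, and it says nothing about how VoidLLM itself treats an unset variable.

```python
import os
import re
from typing import List, Mapping, Optional

# Assumed reference syntax: ${NAME}, where NAME is letters, digits, underscores.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the ${NAME} references in config_text that are not set in env."""
    env = os.environ if env is None else env
    referenced = set(_ENV_REF.findall(config_text))
    return sorted(name for name in referenced if name not in env)


sample = "api_key: ${OPENAI_API_KEY}\napi_key: ${AZURE_API_KEY}\n"
print(missing_env_vars(sample, {"OPENAI_API_KEY": "sk-example"}))  # ['AZURE_API_KEY']
```

In practice you would read the text of voidllm.yaml from disk and call `missing_env_vars(text)` with no second argument, so it checks the real environment.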
Run VoidLLM on a new port and a new host (or a sidecar container) so both proxies can coexist during the migration. Don’t uninstall LiteLLM yet.
The mental model is the same on both sides: a list of models, each pointing at an upstream with credentials. Here’s the shape-by-shape translation.
LiteLLM:
    model_list:
      - model_name: gpt-4o
        litellm_params:
          model: openai/gpt-4o
          api_key: os.environ/OPENAI_API_KEY
      - model_name: claude-sonnet
        litellm_params:
          model: anthropic/claude-3-5-sonnet-20241022
          api_key: os.environ/ANTHROPIC_API_KEY
      - model_name: gpt-4o-azure
        litellm_params:
          model: azure/my-gpt4o-deployment
          api_base: https://my-resource.openai.azure.com
          api_key: os.environ/AZURE_API_KEY
          api_version: "2024-02-15-preview"
      - model_name: llama3-local
        litellm_params:
          model: ollama/llama3
          api_base: http://ollama:11434
VoidLLM:
    models:
      - name: gpt-4o
        provider: openai
        upstream_model: gpt-4o
        api_key: ${OPENAI_API_KEY}
      - name: claude-sonnet
        provider: anthropic
        upstream_model: claude-3-5-sonnet-20241022
        api_key: ${ANTHROPIC_API_KEY}
      - name: gpt-4o-azure
        provider: azure
        upstream_model: my-gpt4o-deployment
        api_base: https://my-resource.openai.azure.com
        api_key: ${AZURE_API_KEY}
        api_version: "2024-02-15-preview"
      - name: llama3-local
        provider: ollama
        upstream_model: llama3
        api_base: http://ollama:11434
Notes on each provider:
Azure: upstream_model holds the Azure deployment name, not the base model name. api_base and api_version map one-for-one.
Other OpenAI-compatible servers: use provider: openai (or custom) and point api_base at the server.
If you had the same model_name repeated across multiple LiteLLM entries to form a model group, in VoidLLM that becomes a single model with multiple deployments and a routing strategy. See Load Balancing and Failover for the deployment-list shape and strategy options.
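For configs with more than a handful of models, the field-by-field translation shown above is mechanical enough to script. The following is a rough sketch, assuming every LiteLLM `model` value has the `provider/model` form used in the example and that entries are already loaded into Python dictionaries. It prints JSON for review so it needs no YAML library. The function names are hypothetical, and the sketch only reports model groups; it does not attempt to write the multi-deployment shape.

```python
import json
from collections import Counter

ENV_PREFIX = "os.environ/"


def to_env_ref(value):
    """Rewrite LiteLLM's os.environ/NAME as ${NAME}; leave other values alone."""
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        return "${" + value[len(ENV_PREFIX):] + "}"
    return value


def translate_entry(entry):
    """Map one LiteLLM model_list entry to the VoidLLM models: shape shown above."""
    params = entry["litellm_params"]
    provider, sep, upstream_model = params["model"].partition("/")
    if not sep or not upstream_model:
        # This sketch only handles the explicit provider/model form.
        raise ValueError(f"expected provider/model, got {params['model']!r}")
    translated = {
        "name": entry["model_name"],
        "provider": provider,
        "upstream_model": upstream_model,
    }
    for field in ("api_base", "api_key", "api_version"):
        if field in params:
            translated[field] = to_env_ref(params[field])
    return translated


def repeated_model_names(model_list):
    """Names used by more than one entry, i.e. LiteLLM model groups."""
    counts = Counter(entry["model_name"] for entry in model_list)
    return sorted(name for name, count in counts.items() if count > 1)


litellm_model_list = [
    {"model_name": "gpt-4o",
     "litellm_params": {"model": "openai/gpt-4o",
                        "api_key": "os.environ/OPENAI_API_KEY"}},
    {"model_name": "gpt-4o-azure",
     "litellm_params": {"model": "azure/my-gpt4o-deployment",
                        "api_base": "https://my-resource.openai.azure.com",
                        "api_key": "os.environ/AZURE_API_KEY",
                        "api_version": "2024-02-15-preview"}},
]

models = [translate_entry(entry) for entry in litellm_model_list]
print(json.dumps({"models": models}, indent=2))
print("model groups to merge by hand:", repeated_model_names(litellm_model_list))
```

Treat the output as a first draft to review against the model providers docs, especially for any provider not covered in the example.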
You cannot copy LiteLLM’s virtual keys into VoidLLM. Key formats and hashing differ (VoidLLM uses HMAC-SHA256 with a global secret, and prefixes like vl_uk_, vl_tk_, vl_sa_). You have to issue new keys and roll them out to consumers.
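To see why a keyed hash rules out copying keys across, here is a generic HMAC-SHA256 illustration using Python's standard library. It is not VoidLLM's code: the exact input encoding, secret handling, and storage format are assumptions, and the key and secrets below are made up.

```python
import hashlib
import hmac


def key_digest(api_key: str, secret: bytes) -> str:
    """HMAC-SHA256 of an API key under a server-side secret, hex encoded."""
    return hmac.new(secret, api_key.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(api_key: str, secret: bytes, stored_digest: str) -> bool:
    """Constant-time comparison of a presented key against a stored digest."""
    return hmac.compare_digest(key_digest(api_key, secret), stored_digest)


example_key = "vl_uk_example"          # made-up key for illustration
secret_a = b"secret-of-deployment-a"   # hypothetical global secrets
secret_b = b"secret-of-deployment-b"

stored = key_digest(example_key, secret_a)
print(matches(example_key, secret_a, stored))  # True
print(matches(example_key, secret_b, stored))  # False
```

A digest is only meaningful together with the secret that produced it, so hashes stored by another system cannot be validated by VoidLLM. Issuing fresh keys is the only route.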
The mapping exercise is usually:
Roles: system_admin > org_admin > team_admin > member. Members can create their own service accounts and delete only their own. Team and org admins manage everything in their scope.
You can do all of this in the embedded admin UI (the fastest way for most teams) or via the Admin API if you want to script it.
Good news: if your clients already talk to LiteLLM via the OpenAI-compatible endpoint, the code change is one line - the base URL. VoidLLM speaks the same OpenAI shape, so SDK calls don’t change.
    from openai import OpenAI
    client = OpenAI(
        base_url="http://voidllm:8080/v1",  # was: http://litellm:4000
        api_key="vl_uk_...",  # new VoidLLM key
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hello"}],
    )
For a gradual rollout, run both proxies side by side, issue VoidLLM keys to one service or one environment first, and shift traffic as you gain confidence. Watch the VoidLLM usage dashboard to confirm requests are landing where you expect.
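One way to shift traffic service by service is to keep an explicit list of migrated services and choose the proxy from it. The sketch below is one possible arrangement, not a VoidLLM feature. The environment variable names (`MIGRATED_SERVICES`, `CLIENT_LITELLM_KEY`, `CLIENT_VOIDLLM_KEY`) and the service names are hypothetical, and the URLs are the ones from the example above.

```python
import os
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class ProxyTarget:
    base_url: str
    api_key_env: str  # name of the env var holding the key for this proxy


LITELLM = ProxyTarget("http://litellm:4000", "CLIENT_LITELLM_KEY")
VOIDLLM = ProxyTarget("http://voidllm:8080/v1", "CLIENT_VOIDLLM_KEY")


def parse_migrated(raw: str) -> Set[str]:
    """Parse a comma-separated list such as "billing, search" into a set."""
    return {name.strip() for name in raw.split(",") if name.strip()}


def target_for(service: str, migrated: Set[str]) -> ProxyTarget:
    """Send migrated services to VoidLLM and everything else to LiteLLM."""
    return VOIDLLM if service in migrated else LITELLM


migrated = parse_migrated(os.environ.get("MIGRATED_SERVICES", "billing"))
target = target_for("billing", migrated)
print(target.base_url)
```

The chosen `base_url` and the key read from `api_key_env` then go into the OpenAI client constructor. Rolling back a service means removing its name from the list.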
If your app calls litellm.completion(...) directly as a library (not as a proxy), swap that for the openai package. The call shape is the same.
Once VoidLLM is carrying traffic:
Provider coverage. VoidLLM currently supports six providers: OpenAI, Anthropic, Azure, Ollama, vLLM, and a generic OpenAI-compatible “custom” adapter. If you rely on Bedrock, Vertex AI, Cohere, Gemini native, or one of the more niche providers LiteLLM ships, check the model providers docs before committing to the switch. Some are on the roadmap, some are not.
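A quick way to find out whether provider coverage affects you is to scan your LiteLLM model_list for provider prefixes outside the six listed above. This is a sketch under assumptions: the identifiers `vllm` and `custom` are guesses at how those two providers are spelled, and entries are assumed to be loaded into dictionaries already. The Bedrock entry is a made-up example.

```python
# Provider names from the list above; "vllm" and "custom" spellings are assumed.
SUPPORTED = {"openai", "anthropic", "azure", "ollama", "vllm", "custom"}


def unsupported_models(model_list):
    """Return (model_name, prefix) pairs whose provider prefix is not in SUPPORTED."""
    flagged = []
    for entry in model_list:
        prefix = entry["litellm_params"]["model"].partition("/")[0]
        if prefix not in SUPPORTED:
            flagged.append((entry["model_name"], prefix))
    return flagged


litellm_model_list = [
    {"model_name": "gpt-4o",
     "litellm_params": {"model": "openai/gpt-4o"}},
    {"model_name": "claude-bedrock",
     "litellm_params": {"model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0"}},
]
print(unsupported_models(litellm_model_list))  # [('claude-bedrock', 'bedrock')]
```

Entries written without a provider prefix are flagged too, which is intentional: they need a manual look either way.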
Historical usage data is not migrated. VoidLLM starts clean. If you care about continuity for cost reporting or audit, export LiteLLM’s usage tables before you shut it down.
No Python SDK. VoidLLM is proxy-only. If you had code importing litellm as a library, replace it with the openai package pointed at VoidLLM’s /v1 endpoint.
Observability tools that ingest content will see nothing. Langfuse, Lunary, and similar tools receive prompt and response content from LiteLLM. VoidLLM never has that content in the first place - the architecture is designed to support GDPR compliance, so content never touches disk or downstream integrations. You can still track tokens, cost, latency (TTFT + TPS per request), model usage, and per-key metadata. If request-level content review is a hard requirement for your team, that’s a signal VoidLLM is not the right fit.
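For readers unfamiliar with the two latency figures, here is a small sketch of how TTFT (time to first token) and TPS (tokens per second) can be derived from timestamps alone, with no access to content. The definitions are assumptions, since this post does not specify how VoidLLM computes them: TTFT is measured from request start to the first token, and TPS counts the tokens after the first over the time since the first.

```python
from typing import Optional, Sequence, Tuple


def ttft_and_tps(request_start: float,
                 token_times: Sequence[float]) -> Tuple[float, Optional[float]]:
    """Compute (TTFT seconds, tokens per second) from arrival timestamps.

    TPS is None when there is a single token or no measurable generation time.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    ttft = token_times[0] - request_start
    generation_time = token_times[-1] - token_times[0]
    if generation_time <= 0:
        return ttft, None
    return ttft, (len(token_times) - 1) / generation_time


# Made-up timestamps in seconds: request sent at 0.0, five tokens arrive.
print(ttft_and_tps(0.0, [0.5, 1.0, 1.5, 2.0, 2.5]))  # (0.5, 2.0)
```

Both numbers come purely from timing and counts, which is why this kind of metadata remains available even though prompts and responses are never stored.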
Teams do not auto-sync. If you had a sizable team structure in LiteLLM, rebuild the equivalent org/team layout manually in VoidLLM. For larger installs, script it against the Admin API.
Fallback chains. If you use LiteLLM’s fallback feature heavily, note that VoidLLM’s fallback chains are an Enterprise feature. Load balancing across deployments of the same model is available on all tiers.
Be realistic: this is not a two-minute migration. For a small team with a handful of models and a dozen keys, plan an afternoon. For a larger install with many teams and services, plan a day or two of gradual rollout. The good news is both proxies can run in parallel the entire time - there’s no forced cutover.
If you’re still on the fence about whether the move is worth it, the VoidLLM vs LiteLLM comparison lays out the honest tradeoffs. Pricing is flat and public on the pricing page - no per-seat, no per-request, no surprises.