
Migrating from LiteLLM to VoidLLM: A Practical Guide


This is a how-to, not a pitch. If you’re still deciding between the two, the VoidLLM vs LiteLLM comparison covers that ground. This post assumes you’ve already decided to try VoidLLM and need a concrete checklist to get there without breaking your clients.

LiteLLM is a solid project. Most people we talk to who migrate do it for one of three reasons: a hard privacy or compliance requirement, a performance goal they can’t hit with a Python stack, or a preference for running one small Go binary instead of a Python service with its dependency tree. If none of those apply to you, LiteLLM is probably fine and you can close this tab.

When migration makes sense

Migrate if any of these are true:

Do NOT migrate if:

Before you start: the feature map

| LiteLLM concept | VoidLLM equivalent |
| --- | --- |
| Proxy config (config.yaml) | voidllm.yaml with ${ENV_VAR} interpolation |
| model_list entry | models: entry with provider + upstream config |
| Model groups (same model_name, multiple deployments) | Multi-deployment models with load balancing strategies |
| Virtual keys | VoidLLM user, team, and service account keys (vl_uk_, vl_tk_, vl_sa_) |
| Teams | Teams inside an org (org/team/user/key hierarchy) |
| Rate limits (RPM/TPM) | Per-key, per-team, per-org rate limits, most-restrictive-wins |
| Budgets (max_budget) | Per-key, per-team, per-org token and cost budgets |
| Router strategies | Round-robin, weighted, priority, least-latency |
| Fallbacks | Fallback chains (Enterprise) |
| Python SDK (litellm library) | No equivalent - use any OpenAI-compatible client pointed at VoidLLM |
| Langfuse/Lunary content logging | Not supported by design - only metadata is tracked |

The important mismatch is the SDK. VoidLLM is a standalone proxy, full stop. If your code calls litellm.completion(...) directly, you’ll swap it for the openai library pointed at your VoidLLM base URL. See Drop-in Replacement for the OpenAI SDK.

Step 1: Install VoidLLM

The fastest path is Docker:

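# Single quotes stop an interactive shell from history-expanding the "!!".
# The encryption key is regenerated on every run of this command; store it yourself if you need the same key again.
docker run -p 8080:8080 \
  -v voidllm_data:/data \
  -e VOIDLLM_ENCRYPTION_KEY="$(openssl rand -base64 32)" \
  -e VOIDLLM_ADMIN_KEY='my-admin-key-at-least-32-chars!!' \
  ghcr.io/voidmind-io/voidllm:latest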

That’s the whole install. No runtime to prep, no pip, no virtualenv. The full walkthrough (bootstrap credentials, first request, UI tour) is in Getting Started: Docker to First Request.

If you prefer a binary, grab a release from GitHub and drop it on the host. Config lives in voidllm.yaml next to the binary, or wherever VOIDLLM_CONFIG points. Environment variables all use the VOIDLLM_ prefix, and ${ENV_VAR} interpolation works inside the YAML for secrets.
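To make the ${ENV_VAR} interpolation concrete, here is a minimal Python sketch of the general technique. It is an illustration only, not VoidLLM's actual parser: the function name interpolate and the decision to raise on a missing variable are assumptions made for the example.

```python
import os
import re

# Matches ${NAME}, where NAME looks like a conventional environment variable.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(text, env=None):
    """Replace every ${NAME} in text with the value of NAME taken from env."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            # Assumption: fail loudly instead of inserting an empty string.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, text)
```

Calling interpolate("api_key: ${OPENAI_API_KEY}") with the variable exported returns the line with the secret filled in, which is why the YAML file itself can be committed without credentials.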

Run VoidLLM on a separate port or a separate host (or as a sidecar container) so both proxies can coexist during the migration. Don’t uninstall LiteLLM yet.

Step 2: Translate your config

The mental model is the same on both sides: a list of models, each pointing at an upstream with credentials. Here’s the shape-by-shape translation.

LiteLLM:

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY

  - model_name: gpt-4o-azure
    litellm_params:
      model: azure/my-gpt4o-deployment
      api_base: https://my-resource.openai.azure.com
      api_key: os.environ/AZURE_API_KEY
      api_version: "2024-02-15-preview"

  - model_name: llama3-local
    litellm_params:
      model: ollama/llama3
      api_base: http://ollama:11434

VoidLLM:

models:
  - name: gpt-4o
    provider: openai
    upstream_model: gpt-4o
    api_key: ${OPENAI_API_KEY}

  - name: claude-sonnet
    provider: anthropic
    upstream_model: claude-3-5-sonnet-20241022
    api_key: ${ANTHROPIC_API_KEY}

  - name: gpt-4o-azure
    provider: azure
    upstream_model: my-gpt4o-deployment
    api_base: https://my-resource.openai.azure.com
    api_key: ${AZURE_API_KEY}
    api_version: "2024-02-15-preview"

  - name: llama3-local
    provider: ollama
    upstream_model: llama3
    api_base: http://ollama:11434
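If you have more than a handful of entries, the translation above is mechanical enough to script. The following is a rough sketch that only covers the fields shown in the two examples (model, api_key, api_base, api_version); the function name translate_entry is hypothetical, and anything else in your litellm_params needs a manual look.

```python
def translate_entry(entry):
    """Convert one LiteLLM model_list entry (a dict) into a VoidLLM models entry."""
    params = entry["litellm_params"]

    # LiteLLM encodes the provider as a prefix, e.g. "openai/gpt-4o".
    provider, sep, upstream_model = params["model"].partition("/")
    if not sep:
        # Assumption: entries without a provider prefix are left for a human.
        raise ValueError(f"no provider prefix in {params['model']!r}")

    result = {
        "name": entry["model_name"],
        "provider": provider,
        "upstream_model": upstream_model,
    }

    api_key = params.get("api_key")
    if api_key is not None:
        prefix = "os.environ/"
        if api_key.startswith(prefix):
            # os.environ/OPENAI_API_KEY -> ${OPENAI_API_KEY}
            api_key = "${" + api_key[len(prefix):] + "}"
        result["api_key"] = api_key

    # These two fields keep the same name on both sides in the examples above.
    for field in ("api_base", "api_version"):
        if field in params:
            result[field] = params[field]

    return result
```

Load your LiteLLM config with whatever YAML library you already use, map translate_entry over model_list, and review the output before writing it to voidllm.yaml.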

Notes on each provider:

If you had the same model_name repeated across multiple LiteLLM entries to form a model group, in VoidLLM that becomes a single model with multiple deployments and a routing strategy. See Load Balancing and Failover for the deployment-list shape and strategy options.
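Before restructuring anything, it helps to know which of your LiteLLM names actually form model groups. This small sketch lists every model_name that appears more than once; it deliberately stops short of emitting VoidLLM's deployment-list shape, which is documented elsewhere. The function name find_model_groups is hypothetical.

```python
from collections import defaultdict


def find_model_groups(model_list):
    """Return {model_name: [upstream models]} for names used by 2+ entries."""
    deployments = defaultdict(list)
    for entry in model_list:
        deployments[entry["model_name"]].append(entry["litellm_params"]["model"])
    # A name with a single entry is a plain model, not a group.
    return {name: models for name, models in deployments.items() if len(models) > 1}
```

Each key in the result is a candidate for a single multi-deployment model in VoidLLM.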

Step 3: Bootstrap users, teams, and keys

You cannot copy LiteLLM’s virtual keys into VoidLLM. Key formats and hashing differ (VoidLLM uses HMAC-SHA256 with a global secret, and prefixes like vl_uk_, vl_tk_, vl_sa_). You have to issue new keys and roll them out to consumers.
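To see why stored key hashes cannot simply be carried across, consider how keyed hashing works in general. The sketch below uses Python's standard hmac module; it illustrates HMAC-SHA256 as a technique and is not VoidLLM's storage code. The function names and the way the secret is passed in are assumptions.

```python
import hashlib
import hmac


def hash_key(api_key: str, secret: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of api_key under the given secret."""
    return hmac.new(secret, api_key.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_key(presented: str, stored_digest: str, secret: bytes) -> bool:
    """Check a presented key against a stored digest in constant time."""
    return hmac.compare_digest(hash_key(presented, secret), stored_digest)
```

The digest depends on both the key and the secret, so a hash produced by one system is meaningless to another system with a different secret or a different hashing scheme. Reissuing keys is the only clean path.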

The mapping exercise is usually:

  1. Create one or more orgs (most teams start with one).
  2. Under each org, create teams that match how you already group consumers in LiteLLM.
  3. For each LiteLLM virtual key, create a user or service account inside the right team and issue a VoidLLM key.
  4. Apply rate limits and budgets at the appropriate level. VoidLLM enforces most-restrictive-wins across org, team, and key, so put broad policies on the org/team and exceptions on the key.
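The most-restrictive-wins rule in step 4 can be sketched as taking the minimum of whatever limits are set. This is an illustration of the rule as described here, not VoidLLM's implementation; effective_limit is a hypothetical name, and None stands for "no limit configured at this level".

```python
def effective_limit(org=None, team=None, key=None):
    """Return the tightest configured limit, or None if no level sets one."""
    configured = [limit for limit in (org, team, key) if limit is not None]
    return min(configured) if configured else None
```

Under this reading, a key-level limit can only tighten what the team and org already allow. A key set to 2000 RPM inside a team capped at 600 RPM still gets 600.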

Roles: system_admin > org_admin > team_admin > member. Members can create their own service accounts and delete only their own. Team and org admins manage everything in their scope.
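When scripting the setup, it can help to model the role ordering explicitly. The sketch below encodes the hierarchy as ranks and applies the service-account deletion rule from the paragraph above. It is a simplification: it assumes the actor and the service account sit in the same org and team, and the function names are hypothetical.

```python
# Higher rank means broader permissions, mirroring
# system_admin > org_admin > team_admin > member.
ROLE_RANK = {"member": 0, "team_admin": 1, "org_admin": 2, "system_admin": 3}


def has_at_least(role, required):
    """True if role is the required role or anything above it."""
    return ROLE_RANK[role] >= ROLE_RANK[required]


def can_delete_service_account(role, actor_id, owner_id):
    """Admins may delete any account in scope; members only their own."""
    return has_at_least(role, "team_admin") or actor_id == owner_id
```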

You can do all of this in the embedded admin UI (the fastest way for most teams) or via the Admin API if you want to script it.

Step 4: Point your clients at VoidLLM

Good news: if your clients already talk to LiteLLM via the OpenAI-compatible endpoint, the code change is confined to the client constructor: the base URL and the new key. VoidLLM speaks the same OpenAI shape, so SDK calls don’t change.

from openai import OpenAI

client = OpenAI(
    base_url="http://voidllm:8080/v1",  # was: http://litellm:4000
    api_key="vl_uk_...",                # new VoidLLM key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)

For a gradual rollout, run both proxies side by side, issue VoidLLM keys to one service or one environment first, and shift traffic as you gain confidence. Watch the VoidLLM usage dashboard to confirm requests are landing where you expect.
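One way to shift traffic gradually is to pick the proxy per service with a stable hash, so a given service always lands on the same side until you raise the percentage. This is a sketch of that idea, not something VoidLLM provides; the function name and the percentage knob are assumptions, and the two URLs are the ones used in the example above.

```python
import hashlib

VOIDLLM_URL = "http://voidllm:8080/v1"
LITELLM_URL = "http://litellm:4000"


def base_url_for(service: str, voidllm_percent: int) -> str:
    """Deterministically route a service to VoidLLM or LiteLLM."""
    digest = hashlib.sha256(service.encode("utf-8")).digest()
    # Map the service name to a stable bucket in the range 0..99.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return VOIDLLM_URL if bucket < voidllm_percent else LITELLM_URL
```

Because keys are not interchangeable between the two proxies, whatever selects the base URL also has to select the matching API key.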

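If your app calls litellm.completion(...) directly as a library (not as a proxy), swap that for the openai package. The model and messages arguments carry over unchanged; the call just moves onto a client object, as in client.chat.completions.create(...) above.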
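If litellm.completion(...) is called from many places, a thin wrapper can keep those call sites nearly unchanged while you swap the backend. The helper below is a hypothetical shim, not part of any library. It takes the client as an argument so it works with any object exposing chat.completions.create, such as an openai client built as in the example above.

```python
def completion(client, model, messages, **kwargs):
    """Forward a litellm-style call to an OpenAI-compatible client."""
    return client.chat.completions.create(model=model, messages=messages, **kwargs)


# Usage, assuming the openai package is installed:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://voidllm:8080/v1", api_key="vl_uk_...")
#   response = completion(client, model="gpt-4o",
#                         messages=[{"role": "user", "content": "hello"}])
```

The returned object is whatever the client returns, so check any code that relied on LiteLLM-specific response fields.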

Step 5: Decommission LiteLLM

Once VoidLLM is carrying traffic:

  1. Verify VoidLLM’s usage numbers match what you expect from each service.
  2. Revoke the old LiteLLM virtual keys so nothing can fall back silently.
  3. Export any historical usage data you want to keep from LiteLLM - it is not migrated into VoidLLM.
  4. Shut down the LiteLLM instance.
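The first check in the list above can be made less subjective with a small comparison script. This sketch assumes you have per-service request counts in two plain dicts, one with what you expect and one read off VoidLLM's usage data. How you obtain those numbers is up to you, and the function name and 5% default tolerance are arbitrary choices.

```python
def usage_mismatches(expected, observed, tolerance=0.05):
    """Return {service: (expected, observed)} where the counts diverge."""
    mismatches = {}
    for service, want in expected.items():
        got = observed.get(service, 0)
        if want == 0:
            # Any traffic at all is unexpected for a service that should be idle.
            off = got != 0
        else:
            off = abs(got - want) / want > tolerance
        if off:
            mismatches[service] = (want, got)
    return mismatches
```

An empty result is a reasonable signal that it is safe to move on to revoking the old keys.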

Gotchas to know

Provider coverage. VoidLLM currently supports six providers: OpenAI, Anthropic, Azure, Ollama, vLLM, and a generic OpenAI-compatible “custom” adapter. If you rely on Bedrock, Vertex AI, Cohere, Gemini native, or one of the more niche providers LiteLLM ships, check the model providers docs before committing to the switch. Some are on the roadmap, some are not.
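A quick way to find out whether provider coverage affects you is to scan your existing LiteLLM model_list for provider prefixes outside the supported set. In this sketch the identifiers openai, anthropic, azure, and ollama come from the config examples earlier in this post; vllm and custom are assumed spellings, so confirm them against the model providers docs.

```python
# Assumption: "vllm" and "custom" are guesses at the provider identifiers.
SUPPORTED_PROVIDERS = {"openai", "anthropic", "azure", "ollama", "vllm", "custom"}


def unsupported_entries(model_list):
    """Return (model_name, provider) pairs whose provider is not in the set."""
    flagged = []
    for entry in model_list:
        # LiteLLM's model string starts with the provider, e.g. "bedrock/...".
        provider = entry["litellm_params"]["model"].partition("/")[0]
        if provider not in SUPPORTED_PROVIDERS:
            flagged.append((entry["model_name"], provider))
    return flagged
```

A flagged entry is a prompt to check the docs, not a verdict: some upstreams may still be reachable through the generic OpenAI-compatible adapter.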

Historical usage data is not migrated. VoidLLM starts clean. If you care about continuity for cost reporting or audit, export LiteLLM’s usage tables before you shut it down.

No Python SDK. VoidLLM is proxy-only. If you had code importing litellm as a library, replace it with the openai package pointed at VoidLLM’s /v1 endpoint.

Observability tools that ingest content will see nothing. Langfuse, Lunary, and similar tools receive prompt and response content from LiteLLM. VoidLLM never has that content in the first place - the architecture is designed to support GDPR compliance, so content never touches disk or downstream integrations. You can still track tokens, cost, latency (TTFT + TPS per request), model usage, and per-key metadata. If request-level content review is a hard requirement for your team, that’s a signal VoidLLM is not the right fit.

Teams do not auto-sync. If you had a sizable team structure in LiteLLM, rebuild the equivalent org/team layout manually in VoidLLM. For larger installs, script it against the Admin API.

Fallback chains. If you use LiteLLM’s fallback feature heavily, note that VoidLLM’s fallback chains are an Enterprise feature. Load balancing across deployments of the same model is available on all tiers.

Closing

Be realistic: this is not a two-minute migration. For a small team with a handful of models and a dozen keys, plan an afternoon. For a larger install with many teams and services, plan a day or two of gradual rollout. The good news is both proxies can run in parallel the entire time - there’s no forced cutover.

If you’re still on the fence about whether the move is worth it, the VoidLLM vs LiteLLM comparison lays out the honest tradeoffs. Pricing is flat and public on the pricing page - no per-seat, no per-request, no surprises.
