Drop-In Replacement for the OpenAI SDK

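If you’re calling OpenAI directly, switching to VoidLLM means changing two client arguments: the base URL and the API key. The proxy speaks the same API - your SDK doesn’t know the difference. Anthropic models are reached through that same OpenAI-compatible API, as shown further down.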

Python (OpenAI SDK)

Before:

from openai import OpenAI
client = OpenAI(api_key="sk-...")

After:

from openai import OpenAI
client = OpenAI(
    base_url="https://your-voidllm/v1",
    api_key="vl_uk_your_key_here"
)

Everything else stays the same - client.chat.completions.create(), streaming, function calling, all of it.
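To make the streaming claim concrete, here is a minimal sketch of a streamed call through the proxy. The helper names (iter_text, stream_reply, make_client) are illustrative and not part of any SDK, the host and key are the placeholders used above, and the sketch assumes the proxy returns chunks in the standard OpenAI streaming shape, as this post describes.

```python
from typing import Iterable, Iterator


def make_client():
    """Build an OpenAI SDK client pointed at the proxy (placeholder host and key)."""
    from openai import OpenAI  # requires `pip install openai`

    return OpenAI(
        base_url="https://your-voidllm/v1",
        api_key="vl_uk_your_key_here",
    )


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield only the text deltas from a chat-completions stream."""
    for chunk in chunks:
        # Some chunks carry no choices at all (for example a usage-only chunk).
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:  # role-only and finish chunks have no content
            yield text


def stream_reply(client, model: str, prompt: str) -> str:
    """Stream one reply, printing it as it arrives, and return the full text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # same flag you would pass when calling OpenAI directly
    )
    parts = []
    for text in iter_text(stream):
        print(text, end="", flush=True)
        parts.append(text)
    print()
    return "".join(parts)
```

Calling stream_reply(make_client(), "default", "hello") would then print the reply incrementally, with no proxy-specific code in the loop.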

Python (Anthropic models via OpenAI SDK)

If you’re using Anthropic models through VoidLLM, you still use the OpenAI SDK - VoidLLM translates the request format to Anthropic’s Messages API on the upstream side:

from openai import OpenAI
client = OpenAI(
    base_url="https://your-voidllm/v1",
    api_key="vl_uk_your_key_here"
)

# This hits Anthropic through VoidLLM - same OpenAI SDK call
response = client.chat.completions.create(
    model="claude-sonnet",  # alias configured in VoidLLM
    messages=[{"role": "user", "content": "hello"}]
)

Your code uses one SDK format. VoidLLM handles the translation per provider.
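VoidLLM's actual translation code is not shown in this post, so the following is only a sketch of the kind of mapping involved, assuming plain-text message content. It leans on two documented properties of Anthropic's Messages API: the system prompt is a top-level system field, not a message, and max_tokens is required. The function name, the fallback of 1024 tokens, and the upstream model placeholder are hypothetical.

```python
def to_anthropic_request(
    openai_body: dict,
    upstream_model: str,
    default_max_tokens: int = 1024,  # hypothetical fallback, not a VoidLLM setting
) -> dict:
    """Map an OpenAI-style chat request onto an Anthropic Messages request body."""
    system_parts = []
    messages = []
    for msg in openai_body["messages"]:
        if msg["role"] == "system":
            # Anthropic takes the system prompt as a top-level field.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    body = {
        "model": upstream_model,
        "messages": messages,
        # Optional for OpenAI, required by Anthropic.
        "max_tokens": openai_body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        body["system"] = "\n\n".join(system_parts)
    if "temperature" in openai_body:
        body["temperature"] = openai_body["temperature"]
    return body


request = to_anthropic_request(
    {
        "model": "claude-sonnet",  # the alias the client sent
        "messages": [
            {"role": "system", "content": "Be brief."},
            {"role": "user", "content": "hello"},
        ],
    },
    upstream_model="<anthropic-model-id>",  # placeholder, resolved by the proxy
)
print(request)
```

A real proxy also has to translate the response, streaming events, and tool calls back into the OpenAI shape; this sketch covers the request direction only.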

TypeScript / Node.js

import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://your-voidllm/v1',
  apiKey: 'vl_uk_your_key_here',
})

curl

curl https://your-voidllm/v1/chat/completions \
  -H "Authorization: Bearer vl_uk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "messages": [{"role": "user", "content": "hello"}]}'

What happens behind the scenes

graph LR
  A[Your App] -->|same SDK call| B[VoidLLM]
  B -->|model: default| C{Resolve alias}
  C -->|claude-sonnet| D[Anthropic]
  C -->|gpt-4o| E[OpenAI]
  C -->|llama-70b| F[vLLM]

  style B fill:#8b5cf6,stroke:#6366f1,color:#fff
  style A fill:#1a1a24,stroke:#333,color:#e2e8f0
  style D fill:#1a1a24,stroke:#22c55e,color:#e2e8f0
  style E fill:#1a1a24,stroke:#22c55e,color:#e2e8f0
  style F fill:#1a1a24,stroke:#22c55e,color:#e2e8f0

Your app sends the same SDK call. VoidLLM resolves the model alias and routes to the right provider.

When you send model: "default", VoidLLM resolves the alias to the actual model, builds the correct upstream request (including provider-specific translation for Anthropic and Azure), and streams the response back in the format your SDK expects.
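The resolution step can be pictured as two table lookups. This is a sketch under stated assumptions, not VoidLLM's configuration format: the ALIASES and ROUTES tables, the Route type, and the upstream model placeholders are all hypothetical, and the names mirror the diagram above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str        # which upstream to call
    upstream_model: str  # the model name that upstream expects


# Hypothetical tables: "default" points at another alias, which maps to a route.
ALIASES = {"default": "claude-sonnet"}
ROUTES = {
    "claude-sonnet": Route("anthropic", "<anthropic-model-id>"),
    "gpt-4o": Route("openai", "gpt-4o"),
    "llama-70b": Route("vllm", "<served-model-name>"),
}


def resolve(model: str) -> Route:
    """Follow alias links until a routable model name is reached."""
    seen = set()
    while model in ALIASES:
        if model in seen:
            raise ValueError(f"alias cycle at {model!r}")
        seen.add(model)
        model = ALIASES[model]
    try:
        return ROUTES[model]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None


print(resolve("default").provider)  # anthropic
```

Repointing "default" at "gpt-4o" in such a table changes the upstream for every client without touching client code, which is the point of the alias layer.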

What you get for free

By routing through VoidLLM instead of calling providers directly:

Usage tracking per team, user, and API key
Rate limiting and token budgets
Model aliases - swap providers without changing client code
Load balancing across multiple deployments
API key isolation - your apps never see upstream provider keys
Zero-knowledge privacy - no prompt content stored

Under 500 microseconds of overhead

The proxy adds less than 0.1% latency to a typical LLM response. Your users won’t notice it. See our benchmark numbers.
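The two figures are linked by simple arithmetic: 500 microseconds is 0.1% of half a second, so the percentage claim holds for any response that takes at least that long. The snippet below works this out for a few response times; those durations are illustrative assumptions, not measurements, and the function name is made up for this example.

```python
PROXY_OVERHEAD_S = 500e-6  # upper bound quoted above: 500 microseconds


def overhead_percent(
    response_seconds: float,
    overhead_seconds: float = PROXY_OVERHEAD_S,
) -> float:
    """Proxy overhead as a percentage of the upstream response time."""
    return 100.0 * overhead_seconds / response_seconds


for seconds in (0.5, 2.0, 10.0):  # illustrative response times
    print(f"{seconds:>5.1f} s response: {overhead_percent(seconds):.3f}% overhead")
```

Longer generations push the share well below 0.1%, since the overhead is a fixed cost per request and not proportional to output length.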

Environment variables

The official OpenAI SDKs for Python and Node.js read OPENAI_BASE_URL and OPENAI_API_KEY from the environment, so you don’t even need code changes:

export OPENAI_BASE_URL=https://your-voidllm/v1
export OPENAI_API_KEY=vl_uk_your_key_here

Set these in your deployment config and every service in your cluster goes through VoidLLM automatically - regardless of whether the upstream is OpenAI, Anthropic, Azure, or self-hosted.
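One caveat worth guarding against: if OPENAI_BASE_URL is missing, the OpenAI Python SDK quietly falls back to its default endpoint and bypasses the proxy. A small startup check along these lines can catch that early. The function names are illustrative and not part of the SDK or of VoidLLM.

```python
import os
from typing import Mapping

REQUIRED = ("OPENAI_BASE_URL", "OPENAI_API_KEY")


def proxy_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the settings the SDK will pick up, failing fast if either is unset."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"base_url": env["OPENAI_BASE_URL"], "api_key": env["OPENAI_API_KEY"]}


def make_client():
    """Create a client with no arguments so the SDK reads both variables itself."""
    from openai import OpenAI  # requires `pip install openai`

    proxy_settings()  # raises with a clear message before any request is made
    return OpenAI()
```

With both variables exported as above, make_client() returns a client that talks to VoidLLM, and application code contains no proxy-specific values.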
