
Drop-In Replacement for the OpenAI SDK


If you’re calling OpenAI or Anthropic directly, switching to VoidLLM means changing one constructor call: point the client at the proxy and swap in a VoidLLM key. The proxy speaks the same API, so your SDK doesn’t know the difference.

Python (OpenAI SDK)

Before:

from openai import OpenAI
client = OpenAI(api_key="sk-...")

After:

from openai import OpenAI
client = OpenAI(
    base_url="https://your-voidllm/v1",
    api_key="vl_uk_your_key_here"
)

Everything else stays the same - client.chat.completions.create(), streaming, function calling, all of it.
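As one illustration of "everything else stays the same", here is a minimal streaming sketch. The helper names collect_stream_text and stream_chat are made up for this example, and client stands for the client built in the After snippet. The chunk shape (choices[0].delta.content) is the OpenAI SDK's streaming format, which the proxy is described as preserving.

```python
from typing import Any, Iterable


def collect_stream_text(chunks: Iterable[Any]) -> str:
    """Concatenate the text deltas from a streamed chat completion."""
    parts = []
    for chunk in chunks:
        # Some chunks carry no choices (for example, a trailing usage chunk).
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # role-only and final chunks have content=None
            parts.append(delta)
    return "".join(parts)


def stream_chat(client: Any, model: str, prompt: str) -> str:
    """Send one user message with stream=True and return the full reply text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream_text(stream)
```

With the client from the After snippet, stream_chat(client, "default", "hello") would return the assembled reply, assuming a "default" alias is configured on your VoidLLM instance.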

Python (Anthropic models via OpenAI SDK)

If you’re using Anthropic models through VoidLLM, you still use the OpenAI SDK. VoidLLM translates the request format to Anthropic’s Messages API on the upstream side:

from openai import OpenAI
client = OpenAI(
    base_url="https://your-voidllm/v1",
    api_key="vl_uk_your_key_here"
)

# This hits Anthropic through VoidLLM - same OpenAI SDK call
response = client.chat.completions.create(
    model="claude-sonnet",  # alias configured in VoidLLM
    messages=[{"role": "user", "content": "hello"}]
)

Your code uses one SDK format. VoidLLM handles the translation per provider.
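To make the translation concrete, here is a rough sketch of the kind of mapping involved. It is not VoidLLM's implementation, and it covers only plain-text messages. It illustrates two known differences between the formats: Anthropic's Messages API takes the system prompt as a top-level system field rather than as a message with role "system", and it requires max_tokens. The function name, the default token limit, and the upstream model ID are placeholders chosen for this example.

```python
from typing import Any

# Assumption: an illustrative default, used only when the caller sets no limit.
DEFAULT_MAX_TOKENS = 1024


def openai_to_anthropic(body: dict[str, Any], upstream_model: str) -> dict[str, Any]:
    """Map an OpenAI-style chat request body to an Anthropic Messages-style body."""
    system_parts = []
    messages = []
    for msg in body["messages"]:
        if msg["role"] == "system":
            # Anthropic expects system text outside the messages list.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})

    out: dict[str, Any] = {
        "model": upstream_model,
        "messages": messages,
        "max_tokens": body.get("max_tokens", DEFAULT_MAX_TOKENS),
    }
    if system_parts:
        out["system"] = "\n\n".join(system_parts)
    if "temperature" in body:
        out["temperature"] = body["temperature"]
    return out
```

A real proxy also has to translate the response back, along with streaming events and tool calls; this sketch leaves all of that out.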

TypeScript / Node.js

import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://your-voidllm/v1',
  apiKey: 'vl_uk_your_key_here',
})

curl

curl https://your-voidllm/v1/chat/completions \
  -H "Authorization: Bearer vl_uk_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "messages": [{"role": "user", "content": "hello"}]}'

What happens behind the scenes

graph LR
  A[Your App] -->|same SDK call| B[VoidLLM]
  B -->|model: default| C{Resolve alias}
  C -->|claude-sonnet| D[Anthropic]
  C -->|gpt-4o| E[OpenAI]
  C -->|llama-70b| F[vLLM]

  style B fill:#8b5cf6,stroke:#6366f1,color:#fff
  style A fill:#1a1a24,stroke:#333,color:#e2e8f0
  style D fill:#1a1a24,stroke:#22c55e,color:#e2e8f0
  style E fill:#1a1a24,stroke:#22c55e,color:#e2e8f0
  style F fill:#1a1a24,stroke:#22c55e,color:#e2e8f0

Your app sends the same SDK call. VoidLLM resolves the model alias and routes to the right provider.

When you send model: "default", VoidLLM resolves the alias to the actual model, builds the correct upstream request (including provider-specific translation for Anthropic and Azure), and streams the response back in the format your SDK expects.
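The resolve-then-route step can be pictured as two table lookups. The sketch below is hypothetical: the table names, the choice of "claude-sonnet" as the target of "default", and the provider labels are assumptions taken from the diagram, not VoidLLM's actual configuration format.

```python
# Assumption: "default" points at claude-sonnet; your deployment may differ.
ALIASES = {"default": "claude-sonnet"}

# Model names and providers as they appear in the diagram.
PROVIDERS = {
    "claude-sonnet": "anthropic",
    "gpt-4o": "openai",
    "llama-70b": "vllm",
}


def resolve(model: str) -> tuple[str, str]:
    """Follow aliases to a routable model name and return (model, provider)."""
    seen = set()
    while model in ALIASES:
        if model in seen:
            raise ValueError(f"alias cycle at {model!r}")
        seen.add(model)
        model = ALIASES[model]
    try:
        return model, PROVIDERS[model]
    except KeyError:
        raise ValueError(f"unknown model {model!r}") from None
```

Because callers only ever send the alias, repointing "default" at a different model becomes a proxy-side change that needs no redeploy of the calling services.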

What you get for free

By routing through VoidLLM instead of calling providers directly:

Under 500 microseconds of overhead

The proxy adds less than 0.1% latency to a typical LLM response: 500 microseconds is under 0.1% of any response that takes longer than half a second. Your users won’t notice it. See our benchmark numbers.

Environment variables

The official OpenAI SDKs for Python and Node read OPENAI_BASE_URL and OPENAI_API_KEY from the environment, so you may not need code changes at all:

export OPENAI_BASE_URL=https://your-voidllm/v1
export OPENAI_API_KEY=vl_uk_your_key_here

Set these in your deployment config and every service that builds its client from the environment goes through VoidLLM automatically, regardless of whether the upstream is OpenAI, Anthropic, Azure, or self-hosted.
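If the base URL variable is missing, the OpenAI SDK falls back to api.openai.com, so a misconfigured service can bypass the proxy without any error. A startup check along these lines can catch that. The function name check_proxy_env and the exact rules are assumptions for this sketch, not part of VoidLLM.

```python
import os
from typing import Mapping
from urllib.parse import urlparse


def check_proxy_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of problems with the proxy-related environment variables."""
    problems = []

    base_url = env.get("OPENAI_BASE_URL", "")
    if not base_url:
        problems.append("OPENAI_BASE_URL is not set; the SDK would call api.openai.com")
    else:
        parsed = urlparse(base_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"OPENAI_BASE_URL is not a valid URL: {base_url!r}")
        elif parsed.hostname == "api.openai.com":
            problems.append("OPENAI_BASE_URL points at OpenAI, not at the proxy")

    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    return problems
```

A service could call check_proxy_env() at startup and refuse to boot if the returned list is non-empty.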
