The Model Context Protocol (MCP) is becoming the standard way AI agents interact with tools. But managing MCP servers across an organization gets messy fast: different auth tokens, scattered configs, and no visibility into which tools are being called.
Without a central gateway, every AI agent connects directly to every MCP server. Each connection needs its own auth, its own config, and there’s no way to track or control what’s happening.
graph LR
    A1[Claude Code] --> S1[AWS Knowledge]
    A1 --> S2[Exa Search]
    A1 --> S3[Internal Docs]
    A2[Cursor] --> S1
    A2 --> S2
    A3[Internal App] --> S2
    A3 --> S3
    style A1 fill:#1a1a24,stroke:#333,color:#e2e8f0
    style A2 fill:#1a1a24,stroke:#333,color:#e2e8f0
    style A3 fill:#1a1a24,stroke:#333,color:#e2e8f0
    style S1 fill:#1a1a24,stroke:#ef4444,color:#e2e8f0
    style S2 fill:#1a1a24,stroke:#ef4444,color:#e2e8f0
    style S3 fill:#1a1a24,stroke:#ef4444,color:#e2e8f0
With VoidLLM as the gateway, every client connects to a single endpoint. VoidLLM handles auth and access control, then proxies each request to the right MCP server. Each client needs only one config, and admins get full visibility into tool calls.
graph LR
    A1[Claude Code] --> V[VoidLLM]
    A2[Cursor] --> V
    A3[Internal App] --> V
    V --> S1[AWS Knowledge]
    V --> S2[Exa Search]
    V --> S3[Internal Docs]
    V --> S4[VoidLLM Tools]
    V -.->|async, non-blocking| DB[(Tool Call Log)]
    style V fill:#8b5cf6,stroke:#6366f1,color:#fff
    style A1 fill:#1a1a24,stroke:#333,color:#e2e8f0
    style A2 fill:#1a1a24,stroke:#333,color:#e2e8f0
    style A3 fill:#1a1a24,stroke:#333,color:#e2e8f0
    style S1 fill:#1a1a24,stroke:#22c55e,color:#e2e8f0
    style S2 fill:#1a1a24,stroke:#22c55e,color:#e2e8f0
    style S3 fill:#1a1a24,stroke:#22c55e,color:#e2e8f0
    style S4 fill:#1a1a24,stroke:#8b5cf6,color:#e2e8f0
    style DB fill:#12121a,stroke:#8b5cf6,color:#e2e8f0
Register your external MCP servers once (via YAML config or the Admin API), and every team accesses them through VoidLLM’s unified endpoint.
Each server gets scoped access control: global, per-org, or per-team. An org admin grants access to specific servers, and teams can only use the servers they have been granted.
Access is closed by default
Global MCP servers are not accessible to any organization until an admin explicitly grants access. This prevents accidental exposure of powerful tools to teams that shouldn’t have them.
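The closed-by-default rule can be illustrated with a small access check. This is only a sketch: the `McpServer` type, the `can_use` function, the grant sets, and the way org-level and team-level grants combine are all assumptions made for illustration, not VoidLLM's actual data model.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class McpServer:
    """Hypothetical record for a registered MCP server."""
    name: str
    scope: str                      # "global", "org", or "team"
    owner_id: Optional[str] = None  # owning org or team for scoped servers


def can_use(
    server: McpServer,
    org_id: str,
    team_id: str,
    org_grants: Set[Tuple[str, str]],   # (org_id, server_name) pairs
    team_grants: Set[Tuple[str, str]],  # (team_id, server_name) pairs
) -> bool:
    """Return True only when access has been granted explicitly (deny by default)."""
    if server.scope == "team":
        # Assumption: a team-scoped server is usable only by its owning team.
        return server.owner_id == team_id
    if server.scope == "org":
        # Assumption: an org-scoped server belongs to exactly one org.
        org_ok = server.owner_id == org_id
    else:
        # Global servers stay closed until an admin grants them to the org.
        org_ok = (org_id, server.name) in org_grants
    # Teams can only use what the org admin has allowed for them.
    return org_ok and (team_id, server.name) in team_grants


aws = McpServer(name="aws-knowledge", scope="global")
print(can_use(aws, "org1", "team-a", set(), set()))  # no grants yet, so access is denied
```

The point of the sketch is that the absence of a grant always resolves to "deny", so registering a powerful global server never exposes it by accident.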
VoidLLM ships with 6 built-in MCP tools: list_models, get_model_health, get_usage, list_keys, create_key, and list_deployments. Point Claude Code or Cursor at /api/v1/mcp/voidllm and your AI agent can check model health, look up usage stats, or create API keys without leaving the conversation.
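Under the hood, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request body for the built-in `get_model_health` tool. The helper name `build_tool_call` is hypothetical, and passing an empty `arguments` object is an assumption, since the tool's input schema is not described here. A real client such as Claude Code also performs the MCP initialization handshake before calling tools.

```python
import json
from typing import Any, Dict


def build_tool_call(tool_name: str, arguments: Dict[str, Any], request_id: int = 1) -> Dict[str, Any]:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumption: get_model_health accepts an empty arguments object.
request = build_tool_call("get_model_health", {})
print(json.dumps(request, indent=2))
```

In practice the MCP client generates these messages for you; the sketch only shows what travels to `/api/v1/mcp/voidllm` when the agent decides to check model health.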
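Need to chain multiple tool calls? Code Mode lets AI agents write JavaScript that orchestrates MCP tools in a single WASM-sandboxed execution. The chained calls run inside the sandbox, so the agent avoids a model round-trip, and the latency that comes with it, between each call.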
Three dedicated tools - list_servers, search_tools, execute_code - expose the full MCP ecosystem to your AI agent in one conversation.
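To see where the savings come from, it helps to count model turns. The sketch below is back-of-the-envelope arithmetic, not a measurement: the function names and the 2-second per-turn latency are assumptions, and it presumes tool discovery has already happened. Tool execution time itself is the same in both cases; only the number of model turns changes.

```python
def model_turns_direct(tool_calls: int) -> int:
    """Turns when every tool call is a separate round-trip, plus one final answer turn."""
    return tool_calls + 1


def model_turns_code_mode() -> int:
    """One turn to emit the execute_code script, one turn to read the result and answer."""
    return 2


def estimated_seconds(turns: int, seconds_per_turn: float) -> float:
    """Rough model-side latency, ignoring the time spent inside the tools themselves."""
    return turns * seconds_per_turn


calls = 5
per_turn = 2.0  # assumed average model latency per turn, purely illustrative

print("direct:", model_turns_direct(calls), "turns,",
      estimated_seconds(model_turns_direct(calls), per_turn), "s")
print("code mode:", model_turns_code_mode(), "turns,",
      estimated_seconds(model_turns_code_mode(), per_turn), "s")
```

Under these assumptions, the direct approach grows linearly with the number of chained calls while Code Mode stays constant, which is why the benefit is largest for long tool chains.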
Add this to your Claude Code config:
{
  "mcpServers": {
    "voidllm": {
      "url": "https://your-voidllm/api/v1/mcp/voidllm",
      "headers": {
        "Authorization": "Bearer <your-api-key>"
      }
    }
  }
}
That’s it. Your AI agent now has access to all your configured tools through a single, access-controlled gateway.