Put one OpenAI-compatible API in front of every model — and route, govern, and observe every LLM and AI-agent call at sub-millisecond overhead. Open-source core, managed cloud.
Apache-2.0 core · 100+ providers
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-aisix-gateway/v1",  # point to AISIX
    api_key="AISIX_API_KEY",
)

# your existing OpenAI code, unchanged
resp = client.chat.completions.create(
    model="gpt-4o",  # or claude, gemini, deepseek… 100+
    messages=[{"role": "user", "content": "Hello"}],
)
```
The first prompt is easy. The 50th service, 10th model, and first surprise invoice are not. A gateway is where that complexity goes.
Point your existing OpenAI SDK at AISIX and reach every major provider — no rewrites, no per-vendor SDKs, no lock-in.
/v1/messages.
Requests, latency, errors, spend, and model health: live across every environment, with no dashboards to build yourself.
Routing, limits, cost, and safety — configured once, enforced on every request, visible everywhere.
Alias any model, route to any provider. One stable name maps to any upstream — OpenAI, Anthropic, Bedrock, Vertex, Groq — with weighted load balancing, automatic failover, health checks, and semantic & cost-optimal routing.
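To make the idea concrete, here is a minimal sketch of weighted selection with failover behind a single alias. It is not AISIX's implementation or configuration format: the `UPSTREAMS` table, the weights, and the `pick_order` and `call_with_failover` helpers are all hypothetical names chosen for illustration, and health is assumed to be supplied by the caller as a set of provider names.

```python
import random

# Hypothetical alias table: one stable model name mapped to weighted upstreams.
# Provider names, model names, and weights are illustrative only.
UPSTREAMS = {
    "chat-default": [
        {"provider": "openai", "model": "gpt-4o", "weight": 80},
        {"provider": "anthropic", "model": "claude", "weight": 20},
    ],
}


def pick_order(alias, healthy, rng=random):
    """Return healthy upstreams for `alias` in the order they should be tried.

    The first entry is drawn by weight; the rest follow as failover candidates.
    """
    candidates = [u for u in UPSTREAMS[alias] if u["provider"] in healthy]
    order = []
    while candidates:
        weights = [u["weight"] for u in candidates]
        choice = rng.choices(candidates, weights=weights, k=1)[0]
        order.append(choice)
        candidates.remove(choice)
    return order


def call_with_failover(alias, healthy, send):
    """Try each upstream in turn; `send` is caller-supplied and raises on failure."""
    last_error = None
    for upstream in pick_order(alias, healthy):
        try:
            return send(upstream)
        except Exception as exc:  # a real gateway would retry only retryable errors
            last_error = exc
    raise RuntimeError(f"no upstream available for {alias!r}") from last_error
```

Callers only ever see the alias `chat-default`; which provider answers is decided per request, so an unhealthy or failing upstream is skipped without any client change.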
Rate limits that actually hold. Sliding-window request (RPM / RPD) and token (TPM / TPD) limits plus concurrency caps — scoped per key, team, or provider account, and synchronized across replicas through Redis.
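As a rough illustration of what a sliding-window limit means, the sketch below counts units (requests or tokens) per scope key inside a moving time window. It is a single-process, in-memory stand-in written for this page, not AISIX's code: the `SlidingWindowLimiter` class and its parameters are hypothetical, and the Redis synchronization across replicas mentioned above is deliberately left out.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` units per `window_seconds`, tracked per scope key."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.events = {}  # scope key -> deque of (timestamp, cost)
        self.totals = {}  # scope key -> sum of costs still inside the window

    def allow(self, key, cost=1):
        now = self.clock()
        q = self.events.setdefault(key, deque())
        # Drop entries that have slid out of the window.
        while q and q[0][0] <= now - self.window:
            _, old_cost = q.popleft()
            self.totals[key] -= old_cost
        used = self.totals.get(key, 0)
        if used + cost > self.limit:
            return False  # over the limit: reject without recording
        q.append((now, cost))
        self.totals[key] = used + cost
        return True
```

With `cost=1` the same class models a request limit (RPM); passing the token count of a request as `cost` models a token limit (TPM). The scope key can be an API key, a team, or a provider account.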
Guardrails on every prompt. Pre-input and post-output checks — keyword / regex blocklists, PII redaction (Presidio), prompt-injection and moderation (Lakera, OpenAI Moderation, Llama-Guard), and per-key model access control.
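For a sense of what a pre-input check does, here is a small sketch that applies a regex blocklist and masks email addresses before a prompt is forwarded. It uses only the Python standard library as a stand-in; it does not call Presidio, Lakera, or any moderation service, and the `BLOCKLIST` rule, the `EMAIL` pattern, and the `check_input` function are hypothetical examples rather than AISIX configuration.

```python
import re

# Hypothetical rules for illustration; a real deployment would configure its own.
BLOCKLIST = [re.compile(r"\bignore (all )?previous instructions\b", re.IGNORECASE)]
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def check_input(prompt):
    """Pre-input check: return (allowed, prompt with email addresses masked)."""
    for pattern in BLOCKLIST:
        if pattern.search(prompt):
            return False, prompt  # blocked: the prompt is never forwarded
    return True, EMAIL.sub("<EMAIL>", prompt)
```

A post-output check would follow the same shape, run against the model's response instead of the user's prompt.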
No surprise invoices. Month-to-date spend across every environment, key, and member — with per-key and org budgets, alerts at 75 / 90 / 100%, per-model cost tracking, and hard-stop caps.
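The threshold logic can be pictured with a few lines of arithmetic. This is a sketch under stated assumptions, not the product's billing code: `budget_status` is a hypothetical helper, spend and budget are assumed to be in the same currency unit, and the hard stop is assumed to trigger at 100% of budget.

```python
ALERT_THRESHOLDS = (75, 90, 100)  # percent of budget, as listed above


def budget_status(spend, budget):
    """Return (alert thresholds crossed so far, whether a hard-stop cap would block)."""
    percent_used = 100 * spend / budget
    crossed = [t for t in ALERT_THRESHOLDS if percent_used >= t]
    return crossed, percent_used >= 100
```

For example, a key that has spent 80 of a 100 budget has crossed only the 75% alert and is still allowed to send requests; once spend reaches the budget, all three alerts have fired and a hard-stop cap would reject further calls.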
Most AI gateways are a scripting-language shim in your hot path. AISIX is gateway infrastructure, rebuilt for AI.
Sub-millisecond overhead, no garbage-collection pauses, and a stateless data plane that scales horizontally with your traffic.
Self-host the full gateway as a single binary — free, forever, no lock-in. Or run it fully managed on AISIX Cloud.
Five-plus years of production gateway engineering from API7.ai, the creators of Apache APISIX — now rebuilt for LLMs and agents.
Start on the managed cloud in minutes, or deploy the control plane and data planes entirely inside your own infrastructure.
OpenAI-compatible — point your SDK at AISIX and your existing code just works. Start free, no credit card.