Questions, answered
Do you charge for tokens or LLM usage?+
No. You bring your own provider keys and pay your LLM bill directly to OpenAI, Anthropic, and others. AISIX Cloud only charges the plan subscription — there is no markup on your tokens.
How is AISIX different from an LLM API relay or token reseller?+
A relay resells access to models through its own accounts and servers — your prompts, responses, and API keys pass through a third party you don't control, and you inherit its markup, shared rate limits, and the risk of the upstream account being throttled or banned. AISIX is your own gateway, not a reseller. You connect your own provider keys and run it in your cloud or VPC (or a managed plane with envelope-encrypted keys), so traffic and data stay under your control. You get routing, failover, rate limits, guardrails, budgets, and full observability — and you pay providers directly, with no token markup. It's open-source (Apache-2.0, built in Rust), production-grade, and SOC 2 / ISO 27001 / GDPR / HIPAA-ready when you scale.
What counts toward my monthly request quota?+
Recorded requests — any call routed and logged through the gateway, including error responses. One streaming response counts as a single request.
What happens when I hit the limit?+
On Developer, your traffic keeps flowing past 100K — those extra requests simply aren't recorded, so they don't show up in your logs or analytics, and nothing is ever blocked. On Team, we automatically add $100 per additional 1M requests — traffic is never interrupted. Sustained usage over 5M/mo is a good point to move to Enterprise.
Can I self-host AISIX instead of using the cloud?+
Yes. AISIX is open source under Apache-2.0 — run the full gateway as a single Rust binary, free, with community support. The managed cloud adds the hosted control plane, dashboard, budgets, RBAC, and SLAs on top. Enterprise can also run the managed stack inside your own cloud / VPC.
Which LLM providers are supported?+
More than 100 providers through one OpenAI-compatible API, including 20+ popular integrations (OpenAI, Anthropic, Gemini, DeepSeek, Groq, Mistral, Cohere, Qwen, Together, Fireworks, and more). Cloud-hosted providers — AWS Bedrock, Azure OpenAI, and GCP Vertex AI — are available on Enterprise.
How do SSO, audit logs, and compliance work?+
Organization management, SSO (SAML / OIDC), audit logs, and SOC 2 Type II / ISO 27001 / GDPR / HIPAA are part of Enterprise. Talk to sales to scope your requirements and deployment model.
Where is my data stored, and how are provider keys protected?+
On AISIX Cloud, provider keys are envelope-encrypted at rest and decrypted only at request time; each data plane is scoped to its own environment keyspace. On self-host, all data and keys stay entirely within your own infrastructure.
Does the gateway add latency?+
The data plane is a native Rust proxy with a published performance baseline: ~28,300 req/s at saturation on 4 vCPUs, sub-millisecond p50 gateway overhead at low-to-moderate load, and ~0.65 ms added time-to-first-token for streaming — negligible next to LLM inference time.