New

Announcing AISIX: The AI-Native AI Gateway for LLMs and AI AgentsLearn More

Learn More

From the original creators of Apache APISIX

Start free. Scale when AI is moving your numbers.

Ship your first AI features free, then grow on the managed cloud as your traffic scales — all at sub-millisecond overhead. Open-source core, enterprise-ready when you get there.

Sub-ms
proxy overhead
100+
LLM providers
100%
OpenAI-compatible
Apache-2.0
open-source core

Pricing that grows with your AI

Start free, then grow on the managed cloud as your traffic scales — or self-host the open-source core anytime.

Developer

Ship your first AI features — free, fully managed.

$0/ month
100K recorded requests / month

Past 100K, traffic keeps flowing — extra requests just aren't recorded.

Start free
  • Direct LLM providers, OpenAI-compatible
  • 1 environment · 1 data plane · 1 member
  • Request logs (3-day retention)
  • API key rotation, social login
  • Community support
MOST POPULAR
Team

Take winning AI features to production scale.

$149/ month
1M recorded requests / month

$100 per additional 1M, never blocked. Sustained over 5M/mo moves to Enterprise.

Get started
  • Everything in Developer, plus:
  • Model routing, ensembles, load balancing & failover
  • Budgets, rate limits & guardrails
  • RBAC, teams & response caching
  • 25 members · 3 environments · 3 data planes
  • 30-day retention · Standard SLA
Enterprise

Roll AI across the org — governed, compliant, supported.

Custom
10M+ recorded requests / month

Custom quota, overage, deployment and SLA.

Talk to sales
  • Everything in Team, plus:
  • AWS Bedrock · Azure OpenAI · GCP Vertex AI
  • Org management · SSO · audit logs
  • Custom guardrail hooks · semantic cache
  • SOC 2 Type II / ISO 27001 / GDPR / HIPAA
  • Private / VPC deploy · dedicated support
Open-source project or a funded startup? You may qualify for a free Team plan — full Team features, free, subject to usage limits and review.
Apply for free
Prefer to self-host? AISIX is open source (Apache-2.0) — run the full gateway yourself, free, forever.
View on GitHub

What's a recorded request?

Any call routed and logged through the gateway, including errors. One streaming response counts as a single request.

Free tier never blocks

On Developer, traffic keeps flowing past 100K — those extra requests simply aren't recorded or counted. No surprise cut-offs.

Overage is predictable

Past 1M, we auto-add $100 per additional 1M on your monthly invoice. Traffic is never interrupted.

Everything you need to build, ship & scale AI

Swipe to compare plans →

Feature Developer$0 Team$149/mo EnterpriseCustom
Usage & limits
Recorded requests / month100K1M10M+
Members125Unlimited
Environments13Unlimited
Data planes13Unlimited
Log retention3 days30 daysCustom
Metrics retention30 days90 daysCustom
Monthly overageNot recorded$100 / 1MCustom
AI gateway core
Universal OpenAI-compatible API
Chat completions + streaming (SSE)
Anthropic-compatible /v1/messages
Embeddings · rerank
Audio (speech-to-text, TTS)
Image generation
Provider passthrough
Playground
Single-model direct
Virtual / routing models
Model ensembles (panel + judge)
MCP gateway
Model providers
OpenAI · Anthropic · Gemini · DeepSeek
20+ popular providers
100+ via OpenAI-compatibility
AWS Bedrock · Azure OpenAI · GCP Vertex AI
Routing & reliability
Weighted load balancing
Sticky canary / A-B routing
Tag-based conditional routing · wildcard aliases
Automatic retry
Fallback on errors / 429
Upstream health checks
Semantic routing
Cost / latency / load-aware routing
Rate limiting
Request rate limits (RPM / RPD)
Token rate limits (TPM / TPD)
Concurrency limits
Per-team / per-member limits
Cost & budgets
Per-key budgets (hard-stop / warn)
Org / env / provider budgets
Budget alerts (75 / 90 / 100%)
Per-model cost tracking
Caching
Response cache (exact-match)
Memory + Redis backends
Cost-saved telemetry
Semantic cache
Security & guardrails
Keyword / regex guardrails
Cloud safety guardrails
PII redaction · content moderation
Custom guardrail hooks
Observability
Request access logs
Usage & cost analytics dashboardBasic
Prometheus metrics
OpenTelemetry trace export
Alerts
Data Lake / bucket export
Access control & org
API key management + rotation
Virtual keys & model allowlist
Personal access tokens (CLI / CI)
Social login (GitHub / Google)
RBAC
Teams
Organization management
SSO (SAML / OIDC)
Audit logs
Deployment & compliance
Managed SaaS hosting
Provider-key encryption at rest
Self-host / on-prem option
Private / VPC deployment
SOC 2 Type II · ISO 27001 · GDPR · HIPAA
BAA · data isolation
Support & SLA
SupportCommunityStandardDedicated
SLAStandardCustom

Enterprise-grade security, ready when you scale

Run AISIX in your own cloud or VPC, enforce org-wide policy, and meet the compliance bar — backed by a dedicated team and an SLA.

SOC 2 Type II ISO 27001 GDPR HIPAA
SSOSAML / OIDC
Private / VPCRun in your own cloud
Org managementPolicy across all teams
Audit logsEvery change, attributable
Data isolationEnvelope-encrypted keys, BAA
Dedicated supportOnboarding + SLA

Questions, answered

Do you charge for tokens or LLM usage?+

No. You bring your own provider keys and pay your LLM bill directly to OpenAI, Anthropic, and others. AISIX Cloud only charges the plan subscription — there is no markup on your tokens.

How is AISIX different from an LLM API relay or token reseller?+

A relay resells access to models through its own accounts and servers — your prompts, responses, and API keys pass through a third party you don't control, and you inherit its markup, shared rate limits, and the risk of the upstream account being throttled or banned. AISIX is your own gateway, not a reseller. You connect your own provider keys and run it in your cloud or VPC (or a managed plane with envelope-encrypted keys), so traffic and data stay under your control. You get routing, failover, rate limits, guardrails, budgets, and full observability — and you pay providers directly, with no token markup. It's open-source (Apache-2.0, built in Rust), production-grade, and SOC 2 / ISO 27001 / GDPR / HIPAA-ready when you scale.

What counts toward my monthly request quota?+

Recorded requests — any call routed and logged through the gateway, including error responses. One streaming response counts as a single request.

What happens when I hit the limit?+

On Developer, your traffic keeps flowing past 100K — those extra requests simply aren't recorded, so they don't show up in your logs or analytics, and nothing is ever blocked. On Team, we automatically add $100 per additional 1M requests — traffic is never interrupted. Sustained usage over 5M/mo is a good point to move to Enterprise.

Can I self-host AISIX instead of using the cloud?+

Yes. AISIX is open source under Apache-2.0 — run the full gateway as a single Rust binary, free, with community support. The managed cloud adds the hosted control plane, dashboard, budgets, RBAC, and SLAs on top. Enterprise can also run the managed stack inside your own cloud / VPC.

Which LLM providers are supported?+

More than 100 providers through one OpenAI-compatible API, including 20+ popular integrations (OpenAI, Anthropic, Gemini, DeepSeek, Groq, Mistral, Cohere, Qwen, Together, Fireworks, and more). Cloud-hosted providers — AWS Bedrock, Azure OpenAI, and GCP Vertex AI — are available on Enterprise.

How do SSO, audit logs, and compliance work?+

Organization management, SSO (SAML / OIDC), audit logs, and SOC 2 Type II / ISO 27001 / GDPR / HIPAA are part of Enterprise. Talk to sales to scope your requirements and deployment model.

Where is my data stored, and how are provider keys protected?+

On AISIX Cloud, provider keys are envelope-encrypted at rest and decrypted only at request time; each data plane is scoped to its own environment keyspace. On self-host, all data and keys stay entirely within your own infrastructure.

Does the gateway add latency?+

The data plane is a native Rust proxy with a published performance baseline: ~28,300 req/s at saturation on 4 vCPUs, sub-millisecond p50 gateway overhead at low-to-moderate load, and ~0.65 ms added time-to-first-token for streaming — negligible next to LLM inference time.

Ship AI that grows your business

OpenAI-compatible — point your SDK at AISIX and start shipping. Free to start, no credit card.