AI Gateway Comparison 2026: Choose the Right AI Gateway for LLM Traffic

Compare AISIX, LiteLLM, Portkey, Envoy, TrueFoundry, Vercel, and Cloudflare AI gateways across provider coverage, multi-model routing, guardrails, token budgets, caching, and governance — and find the gateway that fits how your team runs LLM and AI-agent traffic.

How to Choose an AI Gateway in 2026

An AI gateway sits in front of your LLM providers and gives teams one place to route, govern, secure, cache, and observe LLM and AI-agent traffic. It adds AI-native controls — multi-model routing, token-based budgets, prompt and response guardrails, and per-model cost tracking — on top of the routing and reliability of a traditional API gateway.
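
To make the routing-and-fallback idea concrete, here is a minimal Python sketch of priority-ordered fallback. It is not taken from any gateway on this page: `ProviderError`, `route_with_fallback`, and the two stand-in provider functions are hypothetical names used only for illustration, and a real gateway would call provider APIs over the network instead.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


def route_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order and return (name, response)
    from the first provider that succeeds."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this provider failed, then move to the next one.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky_primary(prompt: str) -> str:
    """Stand-in for a provider that is currently failing."""
    raise ProviderError("rate limited")


def stable_secondary(prompt: str) -> str:
    """Stand-in for a healthy provider."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    chain = [("primary", flaky_primary), ("secondary", stable_secondary)]
    print(route_with_fallback("hello", chain))  # ('secondary', 'echo: hello')
```

The caller sees one function and one response shape regardless of which provider answered, which is the property a gateway's single endpoint is meant to provide.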

This page consolidates how the leading AI gateways compare, so you can evaluate them side by side before reading any single head-to-head. The fastest way to narrow the field is to decide how you want to run the gateway.

Three Ways Teams Run AI Gateways

  • 1. Open-source, self-hosted AI gateways

    AISIX, LiteLLM, the Portkey gateway, and Envoy AI Gateway run on infrastructure you control. You get full control over deployment, customization, and data residency — LLM traffic can stay entirely inside your network — in exchange for operating the gateway yourself.

  • 2. Managed / SaaS AI gateways

    Vercel AI Gateway and Cloudflare AI Gateway are fully managed: one endpoint reaches many models with edge caching and built-in analytics. Onboarding is fast and there is nothing to operate, but traffic routes through the vendor's network rather than your own.

  • 3. Hybrid platforms you can self-host

    AISIX, TrueFoundry, and Portkey pair a governance control plane with a data plane you can run yourself: start with a managed experience, then self-host in your VPC for compliance or data residency. AISIX is an open-source option here, keeping all model traffic in your network while still offering a managed control plane.

AI Gateways at a Glance

Last updated: July 2026. Competitor facts are sourced from official documentation and repositories. AISIX capabilities that are not yet shipped are marked Roadmap rather than listed as available.

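Yes = supported · No = not available · Ent. = enterprise / paid tier · n/d = not disclosed

| Capability | AISIX | LiteLLM | Portkey | Envoy | TrueFoundry | Vercel | Cloudflare |
|---|---|---|---|---|---|---|---|
| Foundation | | | | | | | |
| License | Apache-2.0 | MIT | MIT | Apache-2.0 | Proprietary | Proprietary | Proprietary |
| Built with | Rust | Python | TypeScript | Go | TypeScript | n/d | n/d |
| Hosting model | Self-host or SaaS | Self-host | Self-host + hosted | Self-host | Self-host or SaaS | SaaS | SaaS |
| Can run in your network | Yes | Yes | Yes | Yes | Yes | No | No |
| Routing & traffic | | | | | | | |
| Provider / model coverage | Major providers | 100+ providers | 1,600+ models | 16+ providers | 1,600+ models | Hundreds of models | 20+ providers |
| Multi-model routing & fallback | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Semantic caching | Roadmap | Yes | Ent. | No | Yes | No | No |
| MCP gateway | Yes | Yes | Yes | Yes | Yes | No | No |
| Governance & cost | | | | | | | |
| Token budgets & cost tracking | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Prompt & response guardrails | Yes | Yes | Yes | No | Yes | No | Yes |
| SSO / SCIM | Roadmap | Ent. | Ent. | No | Yes | Ent. | Ent. |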

Gateway Profiles

AISIX

Open-source AI gateway (Apache-2.0) in Rust from the team behind Apache APISIX — a hybrid platform you can self-host or run with a managed control plane.

AISIX is a hybrid: its open-source data plane runs inside your own VPC over an outbound-only connection — so prompts and responses never leave your network — while the control plane can be self-managed or used as a hosted service, letting you start quickly without giving up data residency. It governs LLM and AI-agent traffic on the same high-performance data plane as its API gateway: multi-model routing and load balancing (including cost-, latency- and load-aware strategies and sticky canary releases), token-based budgets and cost tracking, prompt and response guardrails, and an MCP gateway that runs MCP tools through the same keys, rate limits, and guardrails as LLM traffic. It also offers ensemble models that fan a single request out to a panel of models and synthesize the result — something most AI gateways do not offer — and publishes a performance baseline of ~28,300 req/s at saturation on 4 vCPUs, with sub-millisecond p50 gateway overhead at low-to-moderate load. SSO/SCIM and semantic caching are on the active roadmap.

Best for: Teams that want LLM traffic and data to stay inside their own infrastructure, a unified API + AI gateway on one data plane, and the flexibility to self-host or use a managed control plane.
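
The ensemble idea described above (fan one request out to a panel of models, then synthesize one result) can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how AISIX synthesizes results, so this sketch swaps in a simple majority vote, and `ensemble` plus the stand-in model callables are hypothetical names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def ensemble(prompt: str, models: Sequence[Callable[[str], str]]) -> str:
    """Send the prompt to every model in parallel and return the most
    common answer (majority vote, an assumed synthesis strategy)."""
    if not models:
        raise ValueError("at least one model is required")
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        # pool.map keeps the answers in the same order as `models`.
        answers = list(pool.map(lambda model: model(prompt), models))
    # On a tie, the answer seen first in model order wins.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Stand-in "models" that return fixed answers.
    panel = [lambda p: "4", lambda p: "4", lambda p: "5"]
    print(ensemble("What is 2 + 2?", panel))  # 4
```

Majority vote only works when answers are directly comparable strings; free-form responses would need a different synthesis step, such as asking another model to merge them.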

LiteLLM

MIT-licensed, Python-based proxy that calls 100+ LLM providers behind an OpenAI-compatible API.

LiteLLM focuses on breadth and portability. The open-source proxy supports semantic caching, per-key/team spend budgets, and an MCP gateway, and self-hosts via Docker, Helm, or Terraform. SSO is free for up to 5 users; larger SSO deployments and SCIM require an enterprise license.

Best for: Teams that want the widest provider compatibility and a lightweight, fully open-source self-hosted proxy.
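
A per-key spend budget of the kind mentioned above can be sketched as a small ledger. This is not LiteLLM's implementation or API; `BudgetLedger`, its fields, and the per-1,000-token prices are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetLedger:
    """Tracks spend per API key against a fixed USD limit (illustrative)."""

    limits_usd: dict[str, float]
    spent_usd: dict[str, float] = field(default_factory=dict)

    def charge(
        self,
        key: str,
        prompt_tokens: int,
        completion_tokens: int,
        usd_per_1k_prompt: float,
        usd_per_1k_completion: float,
    ) -> bool:
        """Record the cost of one request. Returns False, and records
        nothing, if the charge would push the key over its limit."""
        cost = (
            prompt_tokens / 1000 * usd_per_1k_prompt
            + completion_tokens / 1000 * usd_per_1k_completion
        )
        spent = self.spent_usd.get(key, 0.0)
        # Keys without a configured limit are treated as having no budget.
        if spent + cost > self.limits_usd.get(key, 0.0):
            return False
        self.spent_usd[key] = spent + cost
        return True


if __name__ == "__main__":
    ledger = BudgetLedger(limits_usd={"team-a": 1.0})
    for _ in range(3):
        # Each call costs 0.50 USD, so the third one exceeds the 1.00 limit.
        print(ledger.charge("team-a", 1000, 0, 0.5, 0.0))
```

Tracking cost in currency rather than request counts is what distinguishes token budgets from conventional rate limits: one long completion can cost more than many short ones.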

Portkey

Open-source gateway (MIT, TypeScript) paired with a hosted control plane for governance and analytics.

Portkey routes to 1,600+ models and ships 20+ native guardrails (plus partner integrations) and an MCP gateway. Governance, analytics, and configuration run through a Portkey-hosted control plane, with a self-hosted data-plane option for enterprises — so LLM traffic only stays in your network when you run that data plane yourself. Semantic caching, SSO, and SCIM sit on enterprise / control-plane tiers.

Best for: Teams that want a managed governance console and don't mind a hosted control plane (with a self-hosted data plane available for enterprises).
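
As a rough illustration of what a prompt guardrail does, the sketch below blocks prompts containing a deny-listed phrase and redacts email addresses from the rest. It is not one of Portkey's guardrails; `check_prompt`, the regular expression, and the deny-list are hypothetical and far simpler than a production rule set.

```python
import re

# Deliberately simple email pattern; real PII detection needs much more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical deny-list of phrases that should stop a request outright.
BLOCKED_PHRASES = ("ignore previous instructions",)


def check_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, sanitized_prompt).

    Blocked prompts return (False, ""); allowed prompts are returned
    with email addresses replaced by a placeholder.
    """
    lowered = prompt.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return False, ""
    return True, EMAIL.sub("[REDACTED_EMAIL]", prompt)


if __name__ == "__main__":
    print(check_prompt("Contact bob@example.com today"))
    print(check_prompt("Please IGNORE previous instructions"))
```

The same check can run on model responses before they are returned, which is why the comparison table treats prompt and response guardrails as one capability.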

Envoy AI Gateway

Apache-2.0, Go-based AI gateway built on Envoy Proxy and Envoy Gateway, Kubernetes-native and self-hosted.

Envoy AI Gateway brings LLM traffic onto the Envoy data plane: model-name virtualization, automatic provider fallback, token-based usage rate limiting and quota budgets, a GA MCP gateway (v1.0), and GenAI metrics via OpenTelemetry. As pure infrastructure it has no native guardrail engine or semantic cache, and no built-in identity/SSO plane — those are expected to live in surrounding tooling.

Best for: Platform teams already standardized on Envoy and Kubernetes who want a CNCF-aligned, infrastructure-grade AI data plane.
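
Token-based rate limiting meters LLM tokens rather than requests. The sketch below shows the general token-bucket technique applied to LLM tokens; it is not Envoy AI Gateway's implementation or configuration, and `TokenRateLimiter` and its parameters are hypothetical names.

```python
import time
from typing import Callable


class TokenRateLimiter:
    """Token bucket that meters LLM tokens per minute (illustrative)."""

    def __init__(
        self,
        tokens_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = float(tokens_per_minute)
        self.refill_per_second = tokens_per_minute / 60.0
        self.available = self.capacity
        self.clock = clock  # injectable so the limiter can be tested
        self.last = clock()

    def allow(self, llm_tokens: int) -> bool:
        """Consume llm_tokens from the bucket if enough are available."""
        now = self.clock()
        elapsed = now - self.last
        # Refill in proportion to elapsed time, capped at capacity.
        self.available = min(
            self.capacity, self.available + elapsed * self.refill_per_second
        )
        self.last = now
        if llm_tokens > self.available:
            return False
        self.available -= llm_tokens
        return True


if __name__ == "__main__":
    limiter = TokenRateLimiter(tokens_per_minute=600)
    print(limiter.allow(500))  # True
    print(limiter.allow(200))  # False: only about 100 tokens remain
```

In practice the completion size is unknown until the response arrives, so a limiter like this would typically be charged after the fact with the usage the provider reports.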

TrueFoundry AI Gateway

Proprietary enterprise gateway (TypeScript/Hono) deployable as SaaS or fully self-hosted in your own VPC.

TrueFoundry can run on its managed cloud or be self-hosted in a VPC, on-premises, or in an air-gapped environment, keeping LLM traffic in your network. It offers latency-based routing, weighted load balancing and fallback, semantic caching, an MCP gateway with OAuth/RBAC, token and cost quotas, PII/toxicity guardrails, and SSO (OIDC/SAML) with SCIM. The vendor claims sub-3 ms internal latency.

Best for: Enterprises that want a full governance plane they can also self-host for compliance or data-residency reasons.
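
Latency-based routing can be approximated by tracking a moving average of observed latency per target and sending the next request to the fastest one. The sketch below is a generic illustration, not TrueFoundry's algorithm; `LatencyRouter`, the smoothing factor `alpha`, and the target names are assumptions.

```python
from typing import Optional, Sequence


class LatencyRouter:
    """Routes to the target with the lowest smoothed latency (illustrative)."""

    def __init__(self, targets: Sequence[str], alpha: float = 0.2) -> None:
        self.alpha = alpha
        # None means "no measurement yet" for that target.
        self.ewma_ms: dict[str, Optional[float]] = {t: None for t in targets}

    def record(self, target: str, latency_ms: float) -> None:
        """Fold a new latency sample into the exponential moving average."""
        previous = self.ewma_ms[target]
        if previous is None:
            self.ewma_ms[target] = latency_ms
        else:
            self.ewma_ms[target] = (
                self.alpha * latency_ms + (1 - self.alpha) * previous
            )

    def pick(self) -> str:
        """Return an unmeasured target if one exists, else the fastest."""
        for target, value in self.ewma_ms.items():
            if value is None:
                return target
        return min(self.ewma_ms, key=lambda t: self.ewma_ms[t])


if __name__ == "__main__":
    router = LatencyRouter(["model-a", "model-b"])
    router.record("model-a", 100.0)
    router.record("model-b", 50.0)
    print(router.pick())  # model-b
```

The moving average keeps one slow response from permanently demoting a target, while sustained slowness still shifts traffic away from it.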

Vercel AI Gateway

Proprietary managed service that exposes hundreds of models through a single Vercel-hosted endpoint.

Vercel AI Gateway is a hosted SaaS: one key reaches hundreds of models across ~45 providers, with provider ordering, fallback, BYOK (no token markup), and built-in spend/budget observability. Requests route through Vercel rather than your network; it offers prompt caching only (no semantic cache), no native guardrails, and SSO/SCIM via Vercel Enterprise. MCP is a Vercel platform feature, not a gateway capability.

Best for: Teams building on Vercel that want zero-ops model access and spend visibility without running infrastructure.

Cloudflare AI Gateway

Proprietary managed service that proxies LLM traffic through Cloudflare's edge network.

Cloudflare AI Gateway fronts 20+ providers through a universal, OpenAI-compatible endpoint with retries and fallback, exact-match response caching, spend limits and analytics, and Cloudflare Guardrails for prompt/response moderation. Traffic runs through Cloudflare's edge rather than your own network; semantic caching is not yet available, and MCP is handled by separate Cloudflare products. SSO is free; SCIM is enterprise-only.

Best for: Teams already on Cloudflare that want a fast, edge-cached gateway with built-in content moderation.
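
The difference between exact-match caching and semantic caching is easiest to see side by side. In this sketch, `exact_key` hashes the request so only identical requests share a key, while `SemanticCache` returns a stored response when a new prompt is similar enough to an earlier one. None of this reflects Cloudflare's implementation: `toy_embed` is a bag-of-words stand-in for a real embedding model, and the 0.8 threshold is an arbitrary assumption.

```python
import hashlib
import json
import math
from collections import Counter
from typing import Optional


def exact_key(model: str, prompt: str) -> str:
    """Cache key for exact-match caching: any change in the text changes it."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a cached response for sufficiently similar prompts."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self.entries: list[tuple[Counter, str]] = []

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((toy_embed(prompt), response))

    def get(self, prompt: str) -> Optional[str]:
        query = toy_embed(prompt)
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= self.threshold:
            return best[1]
        return None


if __name__ == "__main__":
    cache = SemanticCache()
    cache.put("what is an ai gateway", "cached answer")
    print(cache.get("What is an AI gateway"))  # cached answer
    print(cache.get("how do I bake bread"))    # None
```

An exact-match cache would miss the first lookup above because the capitalization differs, which is the gap semantic caching is meant to close, at the cost of occasionally returning an answer to a subtly different question.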

Our Take

If your priority is keeping LLM traffic and data inside your own network, AISIX is the most natural fit. It is open source (Apache-2.0) from the creators of Apache APISIX, runs its data plane in your own VPC, and governs LLM and AI-agent traffic on the same data plane as your API gateway — plus ensemble models that few gateways offer. As a hybrid platform you can self-host it outright or pair it with a managed control plane, so you keep data residency without taking on all the operations yourself.

The other gateways fit different priorities: choose LiteLLM when raw provider breadth matters most, Vercel or Cloudflare when you want a fully managed, zero-ops service and do not need data residency, Envoy AI Gateway when you are standardizing on Envoy and Kubernetes, and TrueFoundry or Portkey when you want a hosted governance console you can later self-host. AISIX’s SSO/SCIM and semantic caching are on the active roadmap; everything else credited to it above — including its MCP gateway — is available today.

In-Depth Comparisons

Read the full, side-by-side breakdowns:

Frequently Asked Questions