Announcing AISIX: The AI-Native AI Gateway
March 31, 2026
When your first AI application transitions from a demo to a production-ready service, you'll quickly realize that the most challenging part is the infrastructure connecting it to the world. Infrastructure built for the network traffic of the past decade cannot meet the unique demands of large language models (LLMs).
This post explains why API7, the team behind the Apache APISIX project, decided to build a brand-new AI gateway from scratch. This is the story of AISIX.
New Traffic Requires a New Gateway
The rise of generative AI has triggered a seismic shift in API traffic. The predictable, stateless request-response patterns that defined the previous era are being replaced by an entirely new, more demanding paradigm. This is not merely a trend, but a market-wide transformation. According to a Menlo Ventures post published in late 2025, enterprise spending on generative AI reached $37 billion in 2025, more than triple the $11.5 billion spent in 2024. Alongside this explosive growth, we are facing a series of new infrastructure challenges that cannot be resolved by simply patching up old systems.
| Challenges | Description |
|---|---|
| Unpredictable Costs | Token-based pricing makes LLM costs a "black box." A minor code change could increase costs tenfold, and without token-level tracking, budgets become meaningless. |
| Observability Blind Spots | Traditional gateways can only measure request counts and total latency, but cannot track LLM-specific metrics such as token usage, TTFT (Time to First Token), and cost per request. Without this data, cost optimization and troubleshooting become guesswork. |
| Security and Governance Challenges | How do you manage credentials for dozens of different models? How do you prevent prompt injection attacks? How do you ensure that sensitive personally identifiable information (PII) isn't leaked to third-party models? Legacy security paradigms fall short on all three. |
| The Complexity of Model Routing | Enterprises use multiple LLM providers simultaneously, each with different API formats, rate limits, and pricing. Traditional gateways lack the ability to intelligently route traffic based on model capabilities, latency, or quotas, or to automatically fall back to alternatives. |
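To make the cost problem concrete, here is a minimal sketch of token-level cost accounting. The model names and per-million-token prices below are illustrative placeholders, not actual provider pricing:

```python
# Hypothetical per-million-token prices (USD); real prices vary by provider and model.
PRICES = {
    "gpt-4o":        {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single LLM call from its token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A "minor code change" that enlarges the prompt and asks for longer answers:
before = request_cost("gpt-4o", 1_000, 500)    # short prompt, short answer
after  = request_cost("gpt-4o", 8_000, 4_000)  # bigger prompt, longer answer
print(f"${before:.4f} -> ${after:.4f}")  # same endpoint, roughly 8x the cost
```

Without per-request token metering at the gateway, this kind of cost jump is invisible until the invoice arrives.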
We've observed vendors attempting to address these issues by adding an ever-increasing number of plugins to their existing API gateways, but this is merely a stopgap measure. An API gateway designed for stateless microservice traffic cannot be gracefully adapted to manage the streaming and compute-intensive characteristics of AI workloads.
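As an illustration of the routing challenge, the sketch below shows the priority-with-fallback logic an AI gateway needs to perform natively. The provider names and failure behavior are simulated; a real gateway would add health checks, quota awareness, and format translation on top:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # raises on failure (rate limit, outage, ...)

def route_with_fallback(providers: list[Provider], prompt: str) -> tuple[str, str]:
    """Try providers in priority order; fall back when one fails.

    Returns (provider_name, response)."""
    errors = []
    for p in providers:
        try:
            return p.name, p.call(prompt)
        except Exception as e:
            errors.append(f"{p.name}: {e}")  # a real gateway would also log and alert
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Simulated providers: the primary is rate-limited, the fallback answers.
def primary(prompt: str) -> str:
    raise RuntimeError("429 rate limit exceeded")

def fallback(prompt: str) -> str:
    return f"echo: {prompt}"

name, reply = route_with_fallback(
    [Provider("openai", primary), Provider("anthropic", fallback)], "hello")
print(name, reply)  # anthropic echo: hello
```

Chaining generic gateway plugins to approximate this logic is exactly the kind of fragile configuration described above.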

LLMs and AI Agents Should Be First-Class Citizens, Not Plugins
As the initiators of the Apache APISIX project, we have spent the past seven years building a high-performance, highly reliable open-source API gateway. We have grown it into a top-level Apache project, trusted by enterprises across industries worldwide to handle their most critical traffic. This journey has placed us at the forefront of API infrastructure evolution.
When the AI wave hit, we were among the first to respond, building AI plugins for APISIX, such as ai-proxy and ai-prompt-guard, to help our users adapt. But the more we built, the more we realized we were pushing the limits of the underlying architecture. We saw users chaining multiple plugins together to achieve basic AI-specific functionality, resulting in complex, fragile configurations and a poor user experience.
This experience prompted a deeper reflection: we had always started from data plane technology, but for an emerging category like the AI gateway, product positioning, conceptual abstraction, and user experience should come first. Users don't care which technology we use; they care whether the final product is easy to use and whether it truly solves their problems.
We realized that what the world needs is not another set of AI plugins, but a native AI Gateway, one designed to address the unique challenges of AI. This observation was validated by market forecasts: Gartner predicts that by 2028, 70% of software engineering teams building multimodal applications will use AI gateways to improve reliability and optimize costs.
We made a natural decision. Instead of continuing to build on top of an API gateway, we decided to create an AI-native product. We drew on all the experience gained from building APISIX and launched a project specifically designed for the AI era, which we call AISIX.
We believe the ultimate value of an AI Gateway lies in serving the next generation of AI Agents. When Agents begin autonomously invoking tools and collaborating with other Agents, they require a reliable, secure, and high-performance infrastructure layer. AISIX's native architecture is designed precisely for this future. In our first release, we address the most pressing LLM management challenges; in future releases, we will gradually introduce support for Agent orchestration and protocols such as MCP.
Native AI Gateway
AISIX is our response to the challenges of the AI era. It is a streamlined, powerful, and developer-friendly AI gateway that offers a robust set of features, all natively designed to support AI workloads.

Features implemented in the first version of AISIX:
| Features | Value Proposition |
|---|---|
| Unified LLM Access | Access models from OpenAI, Azure, Anthropic, Google Gemini, and others through a single, consistent API. Includes intelligent routing and load balancing. |
| Enterprise-Grade Observability | Gain token-level cost analysis, latency monitoring, and detailed logs to understand and control your AI spending. |
| Granular Traffic Control | Supports token-based and request-based rate limiting (tokens or requests per minute and per day: TPM/TPD, RPM/RPD), configurable down to an individual virtual key or model. |
| Developer-Friendly Tools | Built-in Admin UI and interactive Playground enable developers to quickly test and debug their AI applications. |
| Future-Proof Architecture | Native support for streaming responses, with first-class support planned for AI Agent workloads and emerging protocols such as MCP (Model Context Protocol). |
Because AISIX is a native AI gateway, these features are not "add-ons"; they are integrated into the core of the product, resulting in a system that is simpler, more performant, and more secure than any retrofitted solution.
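To illustrate what token-based traffic control means in practice, here is a minimal sliding-window TPM limiter. This is an illustrative sketch, not AISIX's actual implementation; the clock is injected as a parameter to keep the example deterministic:

```python
from collections import defaultdict, deque

class TokenRateLimiter:
    """Sliding-window tokens-per-minute (TPM) limiter, keyed per virtual key."""

    def __init__(self, tpm_limit: int):
        self.tpm_limit = tpm_limit
        self.windows = defaultdict(deque)  # key -> deque of (timestamp, tokens)

    def allow(self, key: str, tokens: int, now: float) -> bool:
        window = self.windows[key]
        while window and window[0][0] <= now - 60:  # drop entries older than 60s
            window.popleft()
        used = sum(t for _, t in window)
        if used + tokens > self.tpm_limit:
            return False  # a gateway would answer 429, ideally with a Retry-After hint
        window.append((now, tokens))
        return True

limiter = TokenRateLimiter(tpm_limit=10_000)
print(limiter.allow("team-a", 8_000, now=0.0))   # True
print(limiter.allow("team-a", 5_000, now=1.0))   # False: would exceed 10k TPM
print(limiter.allow("team-a", 5_000, now=61.0))  # True: the window has rolled over
```

Because the budget is counted in tokens rather than requests, one oversized prompt consumes the same quota as many small ones, which is what makes LLM cost control meaningful.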

Building Better AI Infrastructure Through Open Source
We believe that the greatest infrastructure software is always built through the collective efforts of passionate communities, in an open environment. The AISIX story has only just begun, and we hope you'll be a part of it. Whether you're a developer building your first AI application or an architect designing an enterprise-grade AI strategy, there's a place for you in the AISIX community.
- Give us a Star on GitHub: Your support is our greatest motivation. Please visit github.com/api7/aisix and give us a Star.
- Join the conversation: Have any questions or ideas? Join our community Discord server to chat directly with core maintainers.
- Become a contributor: We will prepare a list of Good First Issues for you, and we look forward to seeing your first PR!
- Get started in five minutes: Learn more about the product at api7.ai/ai-gateway, and follow the official documentation to launch your first AISIX instance.
Let's build the foundation for the next decade of AI applications, together.