What Is an API Gateway? How It Works, Architecture, and Use Cases
API7.ai
February 6, 2025
An API gateway is a server that sits between clients and backend services, acting as a single entry point for all API traffic. It receives every API request, applies cross-cutting policies (authentication, rate limiting, logging), routes the request to the correct backend service, and returns the response to the client. In a microservices architecture, the API gateway eliminates the need for clients to know about or communicate directly with individual services.
| Quick Facts | Details |
|---|---|
| What It Is | A reverse proxy specialized for API traffic management |
| Primary Function | Single entry point that routes, secures, and manages API requests |
| Core Features | Routing, authentication, rate limiting, load balancing, monitoring |
| Where It Sits | Between API clients and backend services |
| Used In | Microservices, cloud-native apps, mobile backends, AI/ML platforms |
| Open-Source Options | Apache APISIX, Kong, Traefik, Envoy, KrakenD |
| Cloud Options | AWS API Gateway, Azure API Management, Google Apigee |
What Is an API Gateway?
An API gateway is infrastructure software that manages the flow of API requests between clients (web apps, mobile apps, partner systems, IoT devices) and backend services. Instead of each client calling each microservice directly, all requests pass through the gateway.
The gateway handles concerns that every API needs but no individual service should implement on its own:
- Request routing — Direct each request to the correct backend service based on URL path, headers, or request content
- Authentication and authorization — Verify API keys, JWT tokens, or OAuth credentials before forwarding requests
- Rate limiting — Cap the number of requests per client to prevent abuse and protect backend resources
- Load balancing — Distribute requests across multiple instances of a service
- Request/response transformation — Modify headers, rewrite URLs, convert between protocols (REST to gRPC)
- Caching — Store and serve frequently requested responses without hitting the backend
- Monitoring and logging — Capture metrics, traces, and access logs for every request
graph LR
C1[Web App] --> GW[API Gateway]
C2[Mobile App] --> GW
C3[Partner API] --> GW
C4[IoT Device] --> GW
GW --> S1[User Service]
GW --> S2[Order Service]
GW --> S3[Payment Service]
GW --> S4[Inventory Service]
GW --> S5[AI/ML Service]
Without a gateway, each client must discover and connect to each backend service individually, handle authentication differently per service, and implement retry logic, circuit breaking, and load balancing on its own. The API gateway centralizes all of this into one manageable layer.
How Does an API Gateway Work?
When a client makes an API request, the gateway processes it through a pipeline of steps before forwarding it to the backend:
sequenceDiagram
participant C as Client
participant G as API Gateway
participant P as Plugin Pipeline
participant U as Upstream Service
C->>G: HTTP Request (GET /api/orders/123)
G->>P: 1. Route matching
P->>P: 2. Authentication (verify JWT)
P->>P: 3. Rate limit check
P->>P: 4. Request transformation
G->>U: Forward to Order Service
U-->>G: Response (200 OK + JSON)
G->>P: 5. Response transformation
P->>P: 6. Logging & metrics
G-->>C: Return response to client
Step-by-Step Request Flow
- Route matching — The gateway matches the incoming URL path and HTTP method against configured routes. GET /api/orders/123 maps to the Order Service, while POST /api/payments maps to the Payment Service.
- Plugin execution (request phase) — Configured plugins run in order. An authentication plugin verifies the JWT token. A rate-limiting plugin checks whether the client has exceeded its quota. A request transformation plugin may add headers or modify the body.
- Upstream forwarding — The gateway selects a healthy backend instance using a load balancing algorithm (round-robin, least connections, consistent hashing) and forwards the request.
- Response processing — The backend response passes back through the plugin pipeline. Logging plugins record metrics. Response transformation plugins may strip internal headers or modify the response format.
- Client response — The gateway returns the final response to the client, including any added headers (rate limit counters, correlation IDs).
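The request flow above can be sketched in a few lines of Python. This is a toy model, not how APISIX or any other gateway is implemented: the `Route`, `Request`, and `Response` types, the `require_api_key` plugin, and the `X-Gateway` header are all hypothetical names chosen for illustration, and a plain function call stands in for the network hop to the upstream.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Request:
    method: str
    path: str
    headers: dict = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str = ""
    headers: dict = field(default_factory=dict)


@dataclass
class Route:
    method: str
    prefix: str
    upstream: Callable[[Request], Response]
    # Request-phase plugins: return None to continue, or a Response to reject.
    plugins: list = field(default_factory=list)


def require_api_key(req: Request) -> Optional[Response]:
    """Hypothetical auth plugin: reject requests without an 'apikey' header."""
    if "apikey" not in req.headers:
        return Response(401, "missing API key")
    return None


def handle(routes: list, req: Request) -> Response:
    # 1. Route matching: first route whose method and path prefix match.
    route = next(
        (r for r in routes if r.method == req.method and req.path.startswith(r.prefix)),
        None,
    )
    if route is None:
        return Response(404, "no route matched")
    # 2. Request-phase plugins run in order; any of them may short-circuit.
    for plugin in route.plugins:
        rejection = plugin(req)
        if rejection is not None:
            return rejection
    # 3. Upstream forwarding (a function call stands in for the network hop).
    resp = route.upstream(req)
    # 4. Response processing: add a header before returning to the client.
    resp.headers["X-Gateway"] = "sketch"
    return resp


def order_service(req: Request) -> Response:
    """Stand-in backend that always returns the same order."""
    return Response(200, '{"order": "123"}')


routes = [Route("GET", "/api/orders/", order_service, [require_api_key])]
```

Calling `handle(routes, Request("GET", "/api/orders/123", {"apikey": "k"}))` walks all four stages, while the same request without the header is rejected at the plugin stage and never reaches `order_service`.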
The Plugin Architecture
Modern API gateways use a plugin architecture that makes functionality modular and extensible. Each cross-cutting concern — authentication, rate limiting, caching, logging — is implemented as a separate plugin that can be enabled or disabled per route, per service, or globally.
Apache APISIX, for example, ships with 80+ built-in plugins and supports custom plugins written in Lua, Go, Java, Python, or WebAssembly. This means you can add new capabilities without modifying gateway core code.
# Example: Apache APISIX route with plugins
routes:
  - uri: /api/v1/orders/*
    upstream:
      nodes:
        "order-service:8080": 1
      type: roundrobin
    plugins:
      jwt-auth: {}
      limit-req:
        rate: 100
        burst: 50
        # limit-req needs a key to count requests against; client IP is used here
        key_type: var
        key: remote_addr
        rejected_code: 429
      proxy-rewrite:
        regex_uri: ["^/api/v1/orders/(.*)", "/orders/$1"]
      prometheus: {}
This configuration creates a route that:
- Matches requests to /api/v1/orders/*
- Requires JWT authentication
- Limits to 100 requests/second with a burst of 50
- Rewrites the URI before forwarding
- Exposes Prometheus metrics
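Because plugins can be enabled globally, per service, or per route, a gateway has to decide which configuration wins when scopes overlap. The sketch below shows one plausible rule, where the more specific scope overrides the broader one. The function name `effective_plugins` and the convention that `None` disables an inherited plugin are assumptions made for illustration; real gateways define their own precedence rules, so check the documentation of the one you use.

```python
def effective_plugins(global_cfg: dict, service_cfg: dict, route_cfg: dict) -> dict:
    """Merge plugin configs so that more specific scopes override broader ones.

    Assumption for this sketch: a value of None disables a plugin that was
    enabled at a broader scope.
    """
    merged = {}
    for scope in (global_cfg, service_cfg, route_cfg):
        merged.update(scope)  # later (more specific) scopes win
    return {name: cfg for name, cfg in merged.items() if cfg is not None}


plugins = effective_plugins(
    {"prometheus": {}, "limit-req": {"rate": 100}},   # global
    {"jwt-auth": {}},                                 # service
    {"limit-req": {"rate": 10}, "prometheus": None},  # route
)
```

Here the route tightens the rate limit to 10, inherits `jwt-auth` from the service, and opts out of the global `prometheus` plugin.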
API Gateway Architecture Patterns
API gateways can be deployed in different patterns depending on your architecture and scale.
Single Gateway (Edge Gateway)
The simplest pattern: one gateway instance (or cluster) handles all external API traffic.
graph TD
Internet[Internet] --> GW[API Gateway]
GW --> S1[Service A]
GW --> S2[Service B]
GW --> S3[Service C]
Best for: Small to medium deployments, monolith-to-microservice transitions, simple architectures.
Backend for Frontend (BFF)
Each client type (web, mobile, partner) gets its own gateway optimized for that client's needs. The web BFF might aggregate multiple API calls into a single response, while the mobile BFF returns smaller payloads optimized for bandwidth.
graph TD
Web[Web App] --> GW1[Web BFF Gateway]
Mobile[Mobile App] --> GW2[Mobile BFF Gateway]
Partner[Partner] --> GW3[Partner Gateway]
GW1 --> S1[Service A]
GW1 --> S2[Service B]
GW2 --> S1
GW2 --> S3[Service C]
GW3 --> S2
Best for: Organizations with diverse client types that have different API consumption patterns.
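To make the difference between a web BFF and a mobile BFF concrete, here is a minimal sketch using `asyncio`. The two `fetch_*` coroutines are hypothetical stand-ins for calls to backend services, and the response shapes are invented for the example.

```python
import asyncio


async def fetch_user(user_id: int) -> dict:
    await asyncio.sleep(0)  # stand-in for a call to a user service
    return {"id": user_id, "name": "Ada", "email": "ada@example.com"}


async def fetch_orders(user_id: int) -> list:
    await asyncio.sleep(0)  # stand-in for a call to an order service
    return [
        {"id": 1, "total": 30, "items": ["a", "b"]},
        {"id": 2, "total": 12, "items": ["c"]},
    ]


async def web_profile(user_id: int) -> dict:
    # Web BFF: fetch both resources concurrently and return them in full.
    user, orders = await asyncio.gather(fetch_user(user_id), fetch_orders(user_id))
    return {"user": user, "orders": orders}


async def mobile_profile(user_id: int) -> dict:
    # Mobile BFF: same backend calls, but a much smaller summary payload.
    user, orders = await asyncio.gather(fetch_user(user_id), fetch_orders(user_id))
    return {
        "name": user["name"],
        "order_count": len(orders),
        "order_total": sum(order["total"] for order in orders),
    }
```

Both handlers turn two backend calls into one client round trip; they differ only in how much of the result they send back.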
Two-Tier Gateway
An external gateway handles north-south traffic (client-to-service), while an internal gateway or service mesh handles east-west traffic (service-to-service).
graph TD
Internet[Internet] --> EGW[External API Gateway]
EGW --> IGW[Internal Gateway / Service Mesh]
IGW --> S1[Service A]
IGW --> S2[Service B]
IGW --> S3[Service C]
S1 <--> S2
S2 <--> S3
Best for: Large enterprises with both external APIs and complex internal service communication.
Core API Gateway Features
Authentication and Authorization
An API gateway centralizes authentication so individual services don't implement it themselves. Supported methods typically include:
| Method | How It Works | Best For |
|---|---|---|
| API Keys | Client sends a key in header or query param | Simple public APIs |
| JWT (JSON Web Tokens) | Self-contained signed tokens with claims | Stateless auth in microservices |
| OAuth 2.0 | Token-based delegated authorization with scopes | User-facing APIs with third-party access |
| mTLS | Mutual TLS with client certificates | Service-to-service, zero-trust |
| Basic Auth | Username/password in Authorization header | Internal/development APIs |
| LDAP/OIDC | Enterprise directory integration | Corporate environments |
The gateway verifies credentials, extracts identity information (user ID, roles, scopes), and passes it to backend services via headers — so services receive pre-authenticated requests.
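As an illustration of what "verify, extract, and pass on" involves, the sketch below checks an HS256-signed JWT with only the Python standard library and then builds identity headers for the upstream. It is a teaching aid rather than a production verifier: it handles a single algorithm, checks only the `exp` claim, and the header names `X-User-Id` and `X-User-Roles` are illustrative assumptions. In practice you would rely on the gateway's own JWT plugin or a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build an HS256 token (included only so the example is self-contained)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # accept only the algorithm this gateway is configured for
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        return None
    return claims


def identity_headers(claims: dict) -> dict:
    """Headers a gateway might add before forwarding (names are illustrative)."""
    return {
        "X-User-Id": str(claims.get("sub", "")),
        "X-User-Roles": ",".join(claims.get("roles", [])),
    }
```

The backend then trusts `X-User-Id` and `X-User-Roles` only because the gateway is the sole path in, which is why such headers should be stripped from incoming client requests before they are set.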
Rate Limiting and Traffic Control
API gateways enforce rate limiting to prevent abuse and protect backend infrastructure:
- Per-consumer limits — Each API key or user gets a separate quota
- Per-route limits — Expensive endpoints (e.g., AI inference) get lower limits
- Global limits — Cap total system throughput
- Dynamic limits — Adjust based on current system load
When a client exceeds the limit, the gateway returns 429 Too Many Requests, typically with a Retry-After header, and the request never reaches your backend.
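One common way to implement per-consumer limits is a token bucket, sketched below. This is an illustration of the general technique, not the algorithm of any particular gateway (APISIX's limit-req plugin, for instance, is documented as a leaky bucket). The class name and the choice to pass `now` explicitly, which keeps the example deterministic, are assumptions of this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float
    updated: float


class TokenBucketLimiter:
    def __init__(self, rate: float, burst: int):
        self.rate = rate    # tokens refilled per second
        self.burst = burst  # bucket capacity, i.e. the largest allowed burst
        self.buckets = {}   # one bucket per consumer

    def check(self, consumer: str, now: float):
        """Return (status, headers) for one request from `consumer` at time `now`."""
        bucket = self.buckets.setdefault(consumer, Bucket(self.burst, now))
        # Refill according to elapsed time, never beyond the capacity.
        elapsed = now - bucket.updated
        bucket.tokens = min(self.burst, bucket.tokens + elapsed * self.rate)
        bucket.updated = now
        if bucket.tokens >= 1:
            bucket.tokens -= 1
            return 200, {}
        # Not enough tokens: tell the client how long until one is available.
        wait_seconds = (1 - bucket.tokens) / self.rate
        return 429, {"Retry-After": str(math.ceil(wait_seconds))}
```

With `rate=1` and `burst=2`, a consumer can send two requests back to back, is rejected on the third, and is admitted again one second later. Other consumers are unaffected because each has its own bucket.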
Load Balancing
The gateway distributes traffic across multiple backend instances using algorithms like:
- Round-robin — Requests cycle through backends sequentially
- Weighted round-robin — More traffic goes to more powerful instances
- Least connections — New requests go to the instance with fewest active connections
- Consistent hashing — Requests with the same hash key (for example, client IP) go to the same instance as long as the set of backends is unchanged (useful for caching)
- EWMA (Exponentially Weighted Moving Average) — Routes to the instance with the lowest moving-average latency
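The sketch below gives a minimal version of four of these strategies (weighted round-robin is omitted for brevity). The class names, the 100 virtual nodes per backend, and the smoothing factor `alpha=0.3` are arbitrary choices for illustration; production balancers also account for health checks, weights, and concurrency.

```python
import bisect
import hashlib
import itertools


class RoundRobin:
    """Cycle through backends in order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new request to the backend with the fewest in-flight requests."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def acquire(self):
        backend = min(self.active, key=self.active.get)  # ties go to the first listed
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1


class ConsistentHash:
    """Map a client key to a backend via a hash ring with virtual nodes."""

    def __init__(self, backends, replicas=100):
        # Each backend appears on the ring many times to even out the load.
        self._ring = sorted(
            (self._hash(f"{b}#{i}"), b) for b in backends for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def pick(self, client_key):
        # First ring point at or after the key's hash, wrapping to the start.
        idx = bisect.bisect_left(self._points, self._hash(client_key)) % len(self._points)
        return self._ring[idx][1]


class EwmaLatency:
    """Prefer the backend with the lowest exponentially weighted average latency."""

    def __init__(self, backends, alpha=0.3):
        self.alpha = alpha
        self.score = {b: 0.0 for b in backends}

    def observe(self, backend, latency_ms):
        # Recent samples count more; older ones decay by (1 - alpha) each time.
        self.score[backend] = self.alpha * latency_ms + (1 - self.alpha) * self.score[backend]

    def pick(self):
        return min(self.score, key=self.score.get)
```

The consistent-hash ring is what makes the mapping stable: adding or removing one backend only moves the keys that fell on that backend's ring points, rather than reshuffling every client.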
Request and Response Transformation
API gateways can modify requests before forwarding them upstream and transform responses before returning them to clients. Common examples include:
- URL rewriting — Map external paths (/api/v2/users) to internal paths (/users)
- Header injection — Add correlation IDs, authentication context, or routing hints
- Body transformation — Convert between JSON and XML, add/remove fields
- Protocol translation — Accept REST requests from clients, forward as gRPC to backends
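The first two transformations are simple enough to sketch directly. The helper names and the `X-Request-Id` and `X-Consumer-Username` header names below are illustrative assumptions. Note that Python's `re` module writes capture-group references as `\1`, where the APISIX configuration earlier in this article uses `$1`.

```python
import re
import uuid


def rewrite_uri(path: str, pattern: str, replacement: str) -> str:
    """Apply a regex rewrite; paths that do not match are returned unchanged."""
    return re.sub(pattern, replacement, path, count=1)


def inject_headers(headers: dict, consumer: str) -> dict:
    """Return a copy of the headers with gateway-added context."""
    out = dict(headers)
    # Keep an existing correlation ID so a trace survives multiple hops.
    out.setdefault("X-Request-Id", str(uuid.uuid4()))
    out["X-Consumer-Username"] = consumer
    return out


internal_path = rewrite_uri("/api/v2/users", r"^/api/v2/(.*)", r"/\1")
```

After this runs, `internal_path` holds `/users`, the internal path from the URL rewriting example above.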
Observability
Production API gateways integrate with monitoring and logging systems to give centralized insight into API usage, errors, and performance:
- Metrics — Request count, latency histograms, error rates per route (Prometheus, Datadog, StatsD)
- Distributed tracing — Inject trace headers for end-to-end request tracing (Jaeger, Zipkin, OpenTelemetry)
- Access logging — Structured logs for every request (Elasticsearch, Splunk, CloudWatch)
- Alerting — Trigger alerts when error rates spike or latency exceeds thresholds
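To show what per-route metrics look like underneath, here is a small in-memory collector with request counts, error counts, and a cumulative latency histogram in the style Prometheus uses. The bucket boundaries and the class name `RouteMetrics` are assumptions for this sketch; a real deployment would export these through the gateway's metrics plugin rather than hand-rolled code.

```python
from collections import defaultdict

# Upper bounds of the latency buckets in milliseconds (arbitrary for this sketch).
LATENCY_BUCKETS_MS = (5, 25, 100, 500, float("inf"))


class RouteMetrics:
    def __init__(self):
        self.requests = defaultdict(int)
        self.errors = defaultdict(int)
        self.latency = defaultdict(lambda: [0] * len(LATENCY_BUCKETS_MS))

    def observe(self, route: str, status: int, latency_ms: float) -> None:
        self.requests[route] += 1
        if status >= 500:
            self.errors[route] += 1
        # Cumulative histogram: bump every bucket whose bound covers the sample.
        for i, bound in enumerate(LATENCY_BUCKETS_MS):
            if latency_ms <= bound:
                self.latency[route][i] += 1

    def error_rate(self, route: str) -> float:
        total = self.requests[route]
        return self.errors[route] / total if total else 0.0
```

An alerting rule is then just a threshold on these numbers, for example firing when `error_rate("/orders")` stays above 0.05 for several minutes.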
API Gateway vs Related Technologies
API Gateway vs Reverse Proxy
A reverse proxy (Nginx, HAProxy) forwards requests to backend servers and handles SSL termination, caching, and load balancing. An API gateway does all of this plus API-specific features: authentication, rate limiting per consumer, request/response transformation, API versioning, and developer portal integration.
| Capability | Reverse Proxy | API Gateway |
|---|---|---|
| Request routing | Yes | Yes |
| SSL termination | Yes | Yes |
| Load balancing | Yes | Yes |
| Authentication (JWT, OAuth) | Limited | Yes |
| Per-consumer rate limiting | No | Yes |
| API versioning | No | Yes |
| Request transformation | Limited | Yes |
| Developer portal | No | Yes |
| Plugin ecosystem | Limited | Extensive |
In practice: Many API gateways (APISIX, Kong) are built on top of reverse proxy technology (Nginx/OpenResty) and extend it with API-specific capabilities.
API Gateway vs API Management
An API gateway is the runtime component that processes API traffic. API management is the broader discipline that includes the gateway plus:
- Developer portal — Documentation, API discovery, and self-service key management for developers
- API lifecycle management — Versioning, deprecation, and retirement workflows
- Analytics & reporting — Business-level insights into API adoption and usage trends
- Monetization — Billing integration, subscription tiers, and usage-based pricing
- Governance & policy management — Organization-wide API standards, compliance, and approval workflows
API Gateway vs Service Mesh
An API gateway manages north-south traffic — requests from external clients entering your system. A service mesh (Istio, Linkerd) manages east-west traffic — communication between internal services.
| Aspect | API Gateway | Service Mesh |
|---|---|---|
| Traffic direction | North-south (external → internal) | East-west (internal ↔ internal) |
| Deployment | Centralized proxy | Sidecar per service |
| Primary concern | External API management | Internal service communication |
| Features | Auth, rate limiting, transformation | mTLS, retry, circuit breaking |
| Complexity | Lower | Higher |
Many organizations use both: an API gateway at the edge and a service mesh internally. Some gateways like APISIX can also function in service mesh mode for east-west traffic.
API Gateway Use Cases
Microservices Architecture
The most common use case. When a monolith is decomposed into microservices, clients need a single entry point rather than discovering and communicating with dozens of individual services. The API gateway provides this entry point, handling routing, authentication, and aggregation.
Mobile and IoT Backends
Mobile apps and IoT devices operate on unreliable networks with limited bandwidth. An API gateway can aggregate multiple backend calls into a single response (reducing round trips), compress responses, and cache frequently requested data — improving the mobile client experience.
Multi-Cloud and Hybrid Deployments
Organizations running services across AWS, Azure, GCP, and on-premises data centers use API gateways to provide a unified API surface regardless of where backend services are deployed. The gateway abstracts away the infrastructure topology from API consumers.
AI and LLM Applications
Modern AI applications call multiple LLM providers (OpenAI, Anthropic, Google Gemini) with different API formats, pricing, and rate limits. An AI Gateway extends the traditional API gateway with LLM-specific features: model routing, token-based rate limiting, cost tracking, prompt caching, and automatic failover between providers.
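Two of these features, automatic failover and token-based rate limiting, can be sketched without any provider SDK. Everything below is hypothetical: `flaky` and `stable` are stand-ins for provider clients, `ProviderError` is an invented exception type, and the fixed one-minute window is the simplest possible budgeting scheme rather than what any AI gateway actually ships.

```python
class ProviderError(Exception):
    """Raised by a provider callable when a request fails."""


def call_with_failover(providers, prompt):
    """Try each (name, callable) pair in order and return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


class TokenBudget:
    """Fixed-window tokens-per-minute limit, tracked per consumer."""

    def __init__(self, tokens_per_minute: int):
        self.limit = tokens_per_minute
        self.used = {}

    def try_spend(self, consumer: str, tokens: int, minute: int) -> bool:
        key = (consumer, minute)
        used = self.used.get(key, 0)
        if used + tokens > self.limit:
            return False
        self.used[key] = used + tokens
        return True


def flaky(prompt):
    """Hypothetical provider that is currently rejecting requests."""
    raise ProviderError("rate limited")


def stable(prompt):
    """Hypothetical provider that echoes the prompt."""
    return f"echo: {prompt}"
```

Limiting by tokens rather than by request count matters for LLM traffic because one request with a long prompt can cost far more than many short ones.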
API Monetization
SaaS companies that sell API access use gateways to enforce subscription tiers, track usage, and integrate with billing systems. Different API keys get different rate limits, feature access, and SLAs — all enforced at the gateway layer.
Legacy Modernization
When modernizing legacy systems, an API gateway can sit in front of legacy services and expose them as modern REST or GraphQL APIs. The gateway handles protocol translation, request/response mapping, and security — without modifying the legacy system.
Choosing an API Gateway
Open-Source Options
| Gateway | Language | Performance | Plugin System | Kubernetes | Key Strength |
|---|---|---|---|---|---|
| Apache APISIX | Lua/Nginx | Very high (sub-ms latency) | 80+ plugins, multi-language | Ingress Controller | Performance, extensibility |
| Kong | Lua/Nginx | High | Plugin Hub | Ingress Controller | Ecosystem, enterprise features |
| Envoy | C++ | Very high | Filter chain | Istio integration | Service mesh, gRPC native |
| Traefik | Go | High | Middleware | Native CRDs | Auto-discovery, simplicity |
| KrakenD | Go | Very high | Declarative config | Helm charts | API aggregation, stateless |
Cloud-Managed Options
| Service | Provider | Best For |
|---|---|---|
| AWS API Gateway | Amazon | AWS-native applications |
| Azure API Management | Microsoft | Azure ecosystem, enterprise governance |
| Google Apigee | Google | API monetization, multi-cloud |
| Cloudflare API Gateway | Cloudflare | Edge computing, DDoS protection |
How to Choose
- High performance + extensibility → Apache APISIX or Envoy
- Managed service, minimal ops → AWS API Gateway or Azure APIM
- Enterprise features + support → API7 Enterprise (built on APISIX) or Kong Enterprise
- Kubernetes-native auto-discovery → Traefik
- API aggregation (BFF pattern) → KrakenD
- AI/LLM workloads → AISIX AI Gateway or custom Envoy filters
For a detailed side-by-side comparison, see our API Gateway Comparison page.
Getting Started with an API Gateway
Here is a minimal example using Apache APISIX to set up an API gateway that routes, authenticates, and rate-limits an API:
1. Install APISIX
# Using Docker
curl https://raw.githubusercontent.com/apache/apisix-docker/master/quickstart/docker-compose.yml -o docker-compose.yml
docker compose up -d
2. Create an Upstream (Backend Service)
curl -i http://127.0.0.1:9180/apisix/admin/upstreams/1 -X PUT \
  -H "X-API-KEY: $APISIX_ADMIN_KEY" \
  -d '{
    "type": "roundrobin",
    "nodes": { "httpbin.org:80": 1 }
  }'
3. Create a Route with Plugins
curl -i http://127.0.0.1:9180/apisix/admin/routes/1 -X PUT \
  -H "X-API-KEY: $APISIX_ADMIN_KEY" \
  -d '{
    "uri": "/api/*",
    "upstream_id": 1,
    "plugins": {
      "key-auth": {},
      "limit-req": {
        "rate": 10,
        "burst": 5,
        "key_type": "var",
        "key": "remote_addr",
        "rejected_code": 429
      },
      "proxy-rewrite": {
        "regex_uri": ["^/api/(.*)", "/$1"]
      }
    }
  }'
4. Create a Consumer with an API Key
curl -i http://127.0.0.1:9180/apisix/admin/consumers -X PUT \
  -H "X-API-KEY: $APISIX_ADMIN_KEY" \
  -d '{
    "username": "demo-user",
    "plugins": {
      "key-auth": { "key": "my-secret-key" }
    }
  }'
Note: Replace $APISIX_ADMIN_KEY with the Admin API key configured in your config.yaml. The default key in the Docker quickstart is edd1c9f034335f136f87ad84b625c8f1; always change it in production.
5. Test
# Without API key: rejected
curl -i http://127.0.0.1:9080/api/get
# HTTP/1.1 401 Unauthorized

# With API key: success
curl -i http://127.0.0.1:9080/api/get \
  -H "apikey: my-secret-key"
# HTTP/1.1 200 OK
In under 5 minutes, you have an API gateway that authenticates requests and enforces rate limits — the two most critical production concerns.
Next Steps
Explore more topics in our API Gateway Guide series:
- AI Gateway vs MCP Gateway vs API Gateway — Understand how AI gateways and MCP gateways differ from traditional API gateways
- How API Gateways Enhance MCP Servers — Why MCP servers need API gateway security and traffic management
- RESTful API Best Practices — Design patterns for building secure, scalable REST APIs behind an API gateway
- HTTP Methods in APIs — Deep dive into GET, POST, PUT, DELETE and how gateways handle each method
FAQ
What is an API gateway in simple terms?
An API gateway is a single front door for all your APIs. Every request from clients (web apps, mobile apps, partners) goes through this gateway first. It checks who is making the request, whether they are allowed, whether they are making too many requests, and then forwards the request to the right backend service. Think of it like a hotel reception desk — all guests check in through one desk, which handles room assignments, key cards, and requests.
Do I need an API gateway?
If you have more than one backend service or expose APIs to external consumers, yes. An API gateway centralizes authentication, rate limiting, and monitoring so each service doesn't have to implement these independently. For a single monolithic application with no external API consumers, a simple reverse proxy may be sufficient.
What is the difference between an API gateway and a load balancer?
A load balancer distributes traffic across multiple server instances for availability and performance. An API gateway does this plus API-specific functions: authentication, rate limiting per consumer, request transformation, API versioning, and observability. Classic load balancers work at the network/transport layer (L4), although many also offer application-layer (L7) modes; API gateways always work at L7 with API-specific intelligence.
Is an API gateway the same as a reverse proxy?
No. A reverse proxy forwards requests to backend servers and handles basic concerns like SSL termination and caching. An API gateway is a reverse proxy with additional API-specific capabilities: consumer-level authentication, per-API rate limiting, request/response transformation, and integration with API management platforms. Most API gateways are built on reverse proxy foundations (e.g., APISIX and Kong use Nginx/OpenResty).
What is the best open-source API gateway?
The best choice depends on your requirements. Apache APISIX emphasizes performance and extensibility, with 80+ plugins and multi-language plugin support. Kong has a large ecosystem and a broad enterprise feature set. Envoy is well suited to service mesh integration and gRPC-native workloads. Traefik offers simple Kubernetes auto-discovery. For enterprise support on APISIX, API7 Enterprise adds a management console, RBAC, and multi-cluster management.
Can an API gateway replace a service mesh?
Not entirely. An API gateway handles north-south traffic (external clients accessing your APIs), while a service mesh handles east-west traffic (internal service-to-service communication). They solve different problems. However, some gateways like APISIX can handle both north-south and east-west traffic, reducing the need for a separate service mesh in simpler architectures.
How does an API gateway handle security?
An API gateway centralizes security enforcement. It authenticates requests (API keys, JWT, OAuth 2.0, mTLS), authorizes access based on roles or scopes, enforces rate limits to prevent abuse, blocks malicious requests (IP blacklisting, WAF integration), and encrypts all traffic with TLS. Backend services receive pre-authenticated requests, so they can focus on business logic rather than security concerns.
What is an AI gateway?
An AI gateway is a specialized API gateway designed for AI and LLM workloads. It adds capabilities specific to AI: model routing across multiple LLM providers, token-based rate limiting (TPM/RPM), prompt caching, cost tracking per model and consumer, semantic caching, and automatic failover between AI providers. AISIX is an AI-native gateway built on Apache APISIX.