What Is an API Gateway, and Why Is It Essential in a Cloud-Native Era?
Intro to APIs
What is an API (Application Programming Interface)? An API is a standard way of exchanging data between different applications and systems. Many development teams adopt an API-first approach, in which iteration centers on the API through designing, implementing, testing, securing, deploying, troubleshooting, and analyzing it. Together, these stages make up full lifecycle API management (APIM).
Before the advent of APIs, there was no standard way of exchanging data. Computer programs communicated with each other using a variety of protocols, such as FTP, FTPS, SFTP, and HTTPS. The lack of standards created high development costs and hidden security risks in many dimensions: permission control, data management, rate limiting, auditing, and so on. This was the "Tower of Babel" of the computer world. To build a sufficiently complex product, teams had to solve the problems caused by systems written in different languages and built on different data storage schemes.
The emergence of APIs solved the "Tower of Babel" problem. Developers only need to focus on the APIs exposed by other systems, without having to understand the underlying implementation details.
The connection and data transmission between client devices and servers for mobile apps, online games, live video streaming, remote conferences, and IoT devices all depend on APIs.
Why Use an API Gateway?
An API gateway is an essential component of full lifecycle API management. It is responsible for API configuration, release, version rollback, security, and load balancing in the production environment. In addition, the API gateway is the entry point for all client traffic: it routes each client API request to the correct upstream service for processing and then returns the response to the original requester, while keeping the entire process secure, reliable, and low-latency.
When there are only a few APIs at the beginning, the API gateway is usually a virtual component formed jointly by the web server and the upstream services. Simple functionalities such as routing, forwarding, reverse proxying, and load balancing are handled by Apache, NGINX, and similar components; other functionalities such as authentication and rate limiting are left to the upstream services.
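The routing and load balancing described above can be illustrated with a minimal sketch. It assumes a longest-prefix match on the request path and a round-robin choice among upstreams; the `Router` class, its method names, and the addresses are hypothetical and do not represent the behavior of Apache, NGINX, or any specific gateway.

```python
from itertools import cycle


class Router:
    """Maps URL path prefixes to pools of upstream addresses (illustrative only)."""

    def __init__(self):
        self._routes = {}  # prefix -> round-robin iterator over upstream addresses

    def add_route(self, prefix, upstreams):
        self._routes[prefix] = cycle(upstreams)

    def pick_upstream(self, path):
        # Longest matching prefix wins, so a more specific route can override a general one.
        matches = [p for p in self._routes if path.startswith(p)]
        if not matches:
            return None  # a real gateway would answer 404 here
        return next(self._routes[max(matches, key=len)])


router = Router()
router.add_route("/orders", ["10.0.0.1:8080", "10.0.0.2:8080"])
router.add_route("/payments", ["10.0.1.1:8080"])
```

Successive calls to `router.pick_upstream("/orders/42")` alternate between the two order upstreams, while an unknown path returns `None`.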
Why do you need an API Gateway?
However, as the number of APIs increased, developers ran into a severe problem: the same authentication, rate limiting, logging, and other functions were being coded repeatedly in different upstream services. This is not only a waste of resources but also a code management nightmare, because whenever one copy of the code is modified or upgraded, every other copy must be updated as well. The solution was to abstract the common functions into a single component: code unrelated to business logic was pulled out of the upstream services and moved into enhanced versions of components such as Apache and NGINX. This is how the first-generation API gateway evolved.
The direction of the API gateway's evolution is to absorb as many non-business functionalities as possible. To speed up product iteration, front-end and back-end developers demand more and more from API gateways: not only traditional functionalities such as routing, forwarding, reverse proxying, and load balancing, but also support for protocols such as gRPC and GraphQL, as well as observability.
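The idea of pulling shared concerns out of the upstream services can be sketched as a chain of steps that runs in front of the business logic. This is a simplified illustration under stated assumptions: requests and responses are plain dictionaries, and `key_auth`, `access_log`, `gateway`, and `order_service` are hypothetical names rather than any product's API.

```python
def key_auth(valid_keys):
    """Shared authentication step: rejects requests without a known API key."""
    def check(request):
        if request.get("headers", {}).get("apikey") not in valid_keys:
            return {"status": 401, "body": "invalid API key"}
        return None  # None means "continue to the next step"
    return check


def access_log(entries):
    """Shared logging step: records the path of every request it sees."""
    def record(request):
        entries.append(request["path"])
        return None
    return record


def gateway(shared_steps, upstream):
    """Runs the shared steps once, in order, before calling the upstream service."""
    def handle(request):
        for step in shared_steps:
            response = step(request)
            if response is not None:
                return response  # a step short-circuited the request
        return upstream(request)
    return handle


def order_service(request):
    # The upstream now contains business logic only.
    return {"status": 200, "body": "order created"}
```

With `handle = gateway([access_log(entries), key_auth({"secret"})], order_service)`, every upstream wrapped this way gets logging and authentication without reimplementing either.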
The role of API gateways
To make the API gateway more flexible and efficient, API gateway developers made a lot of innovations on the underlying structure, such as:
- Plugin mechanism. As more and more functionalities are embedded in the API gateway, how do we keep each one separate so that development stays easy? A Lego-like plugin mechanism is a natural solution, and the mainstream API gateways all use one. Apache APISIX calls them “plugins”; Envoy calls them “filters”. Plugins free gateway developers from the implementation details of the gateway core, so fewer development resources are needed to implement new functionalities.
- Separation of data plane and control plane. In first-generation API gateways, the data plane (DP) and control plane (CP) are implemented in the same process. This is easier for users to deploy and use, but it creates significant security risks. The data plane serves the outside world directly. If attackers break into the data plane from outside, they could obtain control permissions and the control plane’s data (such as SSL certificates), potentially causing far greater damage. Therefore, most open-source API gateways now deploy the DP and CP separately and use relational databases or etcd to manage and synchronize configuration.
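The separation of data plane and control plane can be sketched in a few lines. In this simplified, in-process illustration, the `ControlPlane` class stands in for the configuration store (the role a relational database or etcd plays in a real deployment), and each `DataPlaneNode` only receives read-only snapshots. All class and method names are hypothetical.

```python
class ControlPlane:
    """Holds the authoritative configuration and notifies subscribers of changes."""

    def __init__(self):
        self._config = {}
        self._revision = 0
        self._watchers = []

    def watch(self, callback):
        self._watchers.append(callback)
        callback(self._revision, dict(self._config))  # initial sync

    def put_route(self, prefix, upstreams):
        self._config[prefix] = list(upstreams)
        self._revision += 1
        for notify in self._watchers:
            notify(self._revision, dict(self._config))


class DataPlaneNode:
    """Serves traffic from a configuration snapshot; it never writes configuration."""

    def __init__(self, control_plane):
        self.revision = -1
        self.routes = {}
        control_plane.watch(self._apply)

    def _apply(self, revision, snapshot):
        if revision > self.revision:  # ignore stale or duplicate updates
            self.revision, self.routes = revision, snapshot
```

A single `put_route` call on the control plane reaches every subscribed node, which is also the basic shape of managing a whole gateway cluster from one place.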
Taking Apache APISIX as an example, the following architecture diagram illustrates the above innovations:
Challenges From Cloud-Native
The most significant technological change in IT over the past decade has been cloud-native. Docker, born in 2013, raised the curtain on the cloud-native era. Since then, containers have increasingly replaced bare metal and virtual machines, and microservices have increasingly replaced monolithic architectures. However, cloud-native is not merely a technological revolution. The driving force behind it is the rapid development and fierce competition of Internet products. Cloud-native technologies arrived at the right time, quickly became popular, and displaced many earlier technical architectures and solutions. Specifically, the challenges that cloud-native poses for API gateways come mainly from the following two aspects:
Monolithic to Microservices
Since microservice architecture became popular among developers, it has paid huge technical dividends. Microservices can be upgraded and released at their own pace without worrying about coupling with other services. Product iteration thus becomes agile, with dozens or even hundreds of releases per day.
However, the growth of microservices also brings side effects and open questions, such as:
- The number of APIs and microservices has grown from dozens to thousands or even tens of thousands.
- How do we quickly locate which API caused the error?
- How do we ensure the security of the API?
- How do we achieve service circuit break and service downgrade?
An API gateway cannot solve the problems of security, observability, and canary release by itself. It needs to work with many other open-source projects and SaaS services, such as Prometheus, Zipkin, SkyWalking, Datadog, and Okta, to provide better solutions for enterprises.
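To make the circuit breaking and service downgrade mentioned in the list above more concrete, here is a minimal sketch of a circuit breaker. It assumes a simple policy: open the circuit after a fixed number of consecutive failures, serve a fallback while it is open, and allow one trial request after a cool-down. The class name, parameters, and thresholds are illustrative assumptions, not the behavior of any particular gateway.

```python
import time


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial request
        self.clock = clock              # injectable so the sketch can be tested
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, upstream, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: skip the upstream entirely (downgrade)
            # Half-open: let one trial request through; one more failure re-opens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = upstream()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

The fallback plays the role of the downgraded service, for example returning cached data instead of calling a failing upstream.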
Dynamic and Cluster Management
The first challenge comes from the ecosystem, while the second comes from technology.
The popularity of containers and Kubernetes has made dynamic behavior a standard requirement for all fundamental cloud-native components. In a Kubernetes environment, containers are constantly being created and destroyed, and elastic scaling has become a necessity rather than an option.
Imagine a scenario: an e-commerce company runs a promotion, and millions of users pour in within an hour and leave after the promotion is over. In this scenario, companies with traditional architecture must purchase a batch of physical servers to deal with the API traffic at peak times. In contrast, companies with cloud-native architecture can use the elastic resources on the public cloud at any time. They may adjust the scale of the network, computing power, databases, and other resources automatically based on the number of API requests.
There are also technical challenges associated with elastic scaling of containers:
- Upstream services frequently changing IP addresses and ports.
- Frequent updates of IP allowlist and denylist.
- Exception detection and handling of service health.
- Regular releases of APIs.
- Timeliness of service registration and discovery.
- Hot renewal and automatic rotation of SSL certificates.
At the heart of addressing these technical challenges is dynamic capability: applying changes at runtime without restarts.
The first-generation API gateways, represented by NGINX, have weak dynamic capabilities. Because NGINX is driven by local static configuration files, any configuration change requires reloading the NGINX service to take effect, which is unacceptable to enterprises in the cloud-native era. This is the first technical pain point of the first-generation API gateway.
The second technical pain point is cluster management.
Consider WPS, a SaaS office software company in China that provides software similar to Microsoft Office 365. It has hundreds of physical machines running Apache APISIX, with nearly 10,000 CPU cores processing tens of billions of API requests from clients every day.
In an ultra-large-scale API gateway environment like this, it is impossible for developers to modify the configuration of each API gateway one by one and then reload them. Instead, they want an integrated console to operate the entire cluster. Unfortunately, when the first generation of API gateways was born, deployments of this scale did not exist, so their developers did not consider the needs of cluster management.
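The contrast with a reload-based model can be sketched as follows. Instead of rereading a static file and restarting workers, the hypothetical `DynamicUpstreams` class below replaces its in-memory upstream table at runtime: writers build a new table and swap a single reference, so requests in flight keep using the table they already hold. This is an illustrative pattern under those assumptions, not a description of how any specific gateway implements hot updates.

```python
import threading


class DynamicUpstreams:
    """Upstream table that can be replaced while requests are being served."""

    def __init__(self, initial):
        self._lock = threading.Lock()  # serializes writers only
        self._table = dict(initial)

    def update(self, service, addresses):
        # Copy, modify, then swap the reference; readers never see a half-written table.
        with self._lock:
            new_table = dict(self._table)
            new_table[service] = tuple(addresses)
            self._table = new_table

    def lookup(self, service):
        # Readers take no lock: they simply read whichever table is current.
        return self._table.get(service, ())
```

When a container is rescheduled and its address changes, a call such as `update("orders", ["10.0.0.7:8080"])` takes effect for the next request without any process reload.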
Next-Generation API Gateway
The above challenges and pain points have gradually spawned a new generation of API gateways.
Features of Next-Generation API Gateway
Unlike the first generation of API gateways, the next generation is driven mainly by the open-source community. With the power of the community and numerous open-source contributors, these API gateways have the opportunity to form a positive cycle of iteration and evolution:
- The community collects the needs and pain points of developers and users much more quickly
- Contributors try to solve these problems in the open-source projects
- The open-source projects become easier to use, attracting more developers
In the process, the next-generation API gateway goes beyond the load-balancing and reverse-proxy role of traditional gateways and takes on more responsibilities, such as traffic connection, scheduling, filtering, analysis, protocol conversion, governance, and integration (see below).
API Gateway supports lower cost secondary development
At the same time, allowing developers to carry out custom development at a lower cost has become a highlight of the next-generation API gateway. Integration is one of the essential functions of an API gateway. On the downstream side, this means protocol resolution and protocol conversion, including GraphQL, gRPC, Dubbo, and others; on the upstream side, it means integrating with Okta, Keycloak, Datadog, and Prometheus for authentication and observability, as well as with the company's internal authentication, logging, auditing, and other services.
An API gateway cannot cover every component involved in integration out of the box. Therefore, developers will inevitably need to carry out custom development through plugins to meet their own business needs.
Different API gateways provide different programming languages and development methods for custom development. For example, both Apache APISIX and Kong can use Lua to write native plugins, while Envoy uses C++ to write native plugins. At the same time, Apache APISIX can also use Go, Python, Node.js, Java, and Wasm to develop plugins. Almost all developers use one of these mainstream programming languages.
Open source and easy custom development are the most important features of the next-generation API gateway. They give developers more choices. Developers can also confidently use an API gateway in a multi-cloud or hybrid cloud environment without worrying about being locked in by a cloud service provider.
Example: API Gateway for Black Friday Traffic
Next, let us explain what an API Gateway does with a concrete example.
On Black Friday, e-commerce companies run many promotions, and the volume of API requests during this period is dozens of times higher than usual. First, let's take a look at what the technical architecture would be if there were no API gateway:
As you can see from the image above, the authentication and logging functions are duplicated in the Order and Payment services. An e-commerce platform generally consists of thousands of different services, so a great deal of code and logic ends up being repeated.
The following figure is the architecture diagram after adding the API gateway:
As you can see from the above, we have integrated the common services at the API gateway layer. As a result, back-end services only need to focus on their own business, providing more possibilities for elastic scaling.
When the promotion starts, millions of API requests from clients pour into the API gateway, and the back-end services need to scale out quickly. To protect critical business functions from sudden traffic, we need to identify malicious crawlers at the API gateway and apply throttling, service downgrading, and circuit breaking.
We can also temporarily shut down some services, such as product reviews and delivery tracking. However, core business functions such as inventory information, purchase, and payment must not fail. Therefore, we need to manage the containerized services through Kubernetes and create more service replicas to keep them running properly. At this point, the API gateway needs to route client API requests to the newly created replicas and automatically remove faulty instances, as shown in the following figure:
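Throttling at the gateway is often explained with a token bucket, and the sketch below shows that technique under stated assumptions: one bucket per client or route, a fixed refill rate, and a burst capacity. The `TokenBucket` class and its parameters are hypothetical and not taken from any gateway's rate-limiting plugin.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable so the sketch can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a gateway would typically answer HTTP 429 here
```

A client that exceeds its burst is rejected until enough time has passed for the bucket to refill, which keeps sudden spikes from reaching the core services.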
In summary, the API gateway is not a new kind of middleware. However, it has become more and more important as technical architectures evolve and products iterate at an ever faster pace. The next-generation cloud-native API gateway addresses the pain points of enterprise users in cluster management, dynamic configuration, ecosystem, observability, and security.
An API gateway can handle not only API traffic but also traffic from Kubernetes Ingress and service mesh, further shortening developers' learning curve and helping enterprises manage traffic in an integrated manner.