Technical Explorations of Open-Source API Gateway Apache APISIX
Background
Apache APISIX is a dynamic, real-time, high-performance open-source API gateway that provides rich traffic management functions such as load balancing, dynamic routing, canary release, circuit breaking, identity authentication, and observability.
As an API gateway, Apache APISIX helps enterprises process API and microservice traffic quickly and safely in scenarios such as gateways, Kubernetes Ingress, and service meshes. It can handle both north-south traffic between clients and servers and east-west traffic between enterprise microservices.
APISIX was open-sourced on June 6, 2019, donated to the Apache Software Foundation in October 2019, and became a top-level Apache project within several months.
Why did APISIX grow so quickly? What technical explorations did it make? Why are more and more developers and enterprises willing to use it? Let's find out.
Main Features of Apache APISIX
Free from Dependencies on Database
Many commercial and open-source API gateways existed before APISIX. A potential problem with most of them is that they store API data, configuration, certificates, and route information in a relational database.
The apparent advantage of a relational database is that users can run flexible SQL queries, and backup and follow-up maintenance are convenient.
However, every coin has two sides. As basic middleware, the API gateway handles all client traffic, which raises the bar for availability. If the gateway relies on a relational database, any database failure, such as a crash or data loss, affects the gateway too. The database then caps the availability of the whole system.
Therefore, the designers of APISIX set out to avoid this problem at the level of the underlying architecture.
The architecture of APISIX is divided into two main parts. The first part, the data plane, is the component that handles client requests and traffic. It supports authentication, certificate offloading, log analysis, and observability. The data plane itself stores no data, so it is stateless.
The second part is the control plane. Instead of a traditional configuration store such as MySQL, the APISIX control plane uses etcd.
The advantages of doing so:
- Better aligned with cloud-native architecture
- Better suited to the types of data an API gateway stores
- Better high availability
- Millisecond notifications of changes
With etcd, the data plane only needs to watch etcd for changes. If the data plane polled a database instead, it might take 5-10 seconds to pick up the latest configuration. By watching etcd, it receives configuration changes within milliseconds, which makes updates effectively real-time.
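The difference between polling and watching can be sketched with a toy in-memory store. This is only an illustration of the idea, not APISIX's or etcd's actual code: `ConfigStore`, `PollingGateway`, `WatchingGateway`, and the `/routes/1` key are hypothetical names invented for the example.

```python
class ConfigStore:
    """Toy in-memory stand-in for a config store with watch support (not etcd)."""

    def __init__(self):
        self._data = {}
        self._watchers = []

    def get(self, key):
        return self._data.get(key)

    def watch(self, callback):
        # Register a callback that is invoked with (key, value) on every change.
        self._watchers.append(callback)

    def put(self, key, value):
        self._data[key] = value
        # Push the change to every watcher as soon as it is written.
        for notify in self._watchers:
            notify(key, value)


class PollingGateway:
    """Sees a change only the next time poll() runs."""

    def __init__(self, store, key):
        self.store, self.key, self.route = store, key, None

    def poll(self):
        self.route = self.store.get(self.key)


class WatchingGateway:
    """Sees a change as soon as the store pushes it."""

    def __init__(self, store, key):
        self.key, self.route = key, None
        store.watch(self._on_change)

    def _on_change(self, key, value):
        if key == self.key:
            self.route = value


store = ConfigStore()
poller = PollingGateway(store, "/routes/1")
watcher = WatchingGateway(store, "/routes/1")

store.put("/routes/1", {"uri": "/orders"})
print("watcher:", watcher.route)  # updated by the push
print("poller: ", poller.route)   # stale until the next poll
poller.poll()
print("poller: ", poller.route)   # updated after polling
```

In the polling model the staleness window is as long as the polling interval, while in the watch model it is only the time needed to deliver the notification.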
Therefore, using etcd instead of a relational database makes APISIX a better fit for cloud-native environments at the bottom layer and strengthens its high availability.
Plugins for Multiple Programming Languages
API gateways differ somewhat from databases and other middleware: gateways are much more often the target of customized development and system integration.
Although APISIX ships many official plugins, they cannot cover every usage scenario. In practice, most businesses need to develop at least a few customized plugins. APISIX is also used to integrate more protocols and systems, so that they can be managed in a unified way at the gateway layer.
In the beginning, APISIX only supported plugins written in Lua. The advantage is that such plugins run natively inside the gateway and benefit from its underlying optimizations, so they perform well. The obvious disadvantage is that Lua is a new language for many developers, and learning it takes time and effort.
APISIX addresses this problem in two ways.
The first way is to support more mainstream programming languages, such as Java, Python, and Go, through the Plugin Runner. If you are a backend engineer, you probably know at least one of these languages. The runner communicates with APISIX over RPC, so you can develop an APISIX plugin in the language you are familiar with.
This reduces development costs and improves development efficiency, but the extra RPC hop adds some latency. That raises a question: is there a solution that achieves the native performance of Lua while still letting developers use a high-level language?
This is where the second way comes in, shown in the left part of the above picture. WebAssembly was first used on the front end, in the browser, and is gradually showing its advantages on the server side.
With WebAssembly embedded in APISIX, users can compile code written in a high-level language to WebAssembly bytecode that runs inside APISIX. The result is an APISIX plugin that is both high-performance and written in a high-level programming language.
So in the current APISIX version, users can use Lua, Go, Python, Wasm, and other methods to customize code based on APISIX. The design lowers the threshold for developers and provides more possibilities for the functions of APISIX.
Hot Reloading of Plugin
APISIX has two significant advantages over NGINX: cluster management and dynamic reloading.
If you have used NGINX, you know that all of its configuration is written in the configuration file nginx.conf. To manage a cluster, you need to modify the nginx.conf file on every single server. There is no centralized control plane anywhere in the process: each NGINX instance is a combination of data plane and control plane. The management cost becomes exceptionally high once you have dozens or hundreds of NGINX servers.
In this scenario, every NGINX server must be restarted or reloaded after its nginx.conf file is modified. For example, for certificate updates or upstream changes, you need to modify the configuration file first and then restart or reload the server for the change to take effect. This is barely acceptable if your request volume is not particularly large. However, as the number of APIs and microservices grows, the impact on clients will be huge if the server has to be restarted for each modification.
- Hot Updates And Hot Plugins: Continuously updates its configurations and plugins without restarts!
- Proxy Rewrite: Supports rewriting the host, uri, scheme, method, and headers of the request before sending it to upstream.
- Response Rewrite: Set customized response status code, body, and header to the client.
- Dynamic Load Balancing: Round-robin load balancing with weight.
- Hash-based Load Balancing: Load balance with consistent hashing sessions.
- Health Checks: Enables health checks on upstream nodes and automatically filters out unhealthy nodes during load balancing to ensure system stability.
- Circuit-Breaker: Intelligent tracking of unhealthy upstream services.
- Proxy Mirror: Provides the ability to mirror client requests.
- Traffic Split: Allows users to incrementally direct percentages of traffic between various upstreams.
The above picture lists the components for which APISIX currently implements hot reloading. Changes take effect in real time for everything from upstreams to certificates and even plugins. Community members readily understand why upstreams and certificates are dynamic, since that data changes frequently. However, some ask why plugin changes need to be dynamic, holding the view that plugins are modified infrequently and so hot reloading them seems unnecessary.
As the designers of APISIX, we want it to be fully dynamic, which opens up more possibilities for enterprises. For example, users can troubleshoot by adjusting a debug plugin at runtime without changing any other plugin code. They can then reproduce and record a problem at any time without restarting. A debugging plugin combined with the hot reloading mechanism is very flexible and saves developers time and effort in troubleshooting.
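As a rough illustration of what a restart-free change looks like, the sketch below builds an HTTP request that updates a route's upstream through the APISIX Admin API. The Admin API address and port, the route id, the admin key placeholder, and the `httpbin.org` upstream node are all assumptions for this example; they depend on your APISIX version and deployment. The helper `build_route_update` is a hypothetical name, and the request is only built here, not sent.

```python
import json
import urllib.request

# Assumptions: adjust the address and key to match your own deployment.
ADMIN_URL = "http://127.0.0.1:9180/apisix/admin"
ADMIN_KEY = "replace-with-your-admin-key"


def build_route_update(route_id, uri, nodes):
    """Build a PUT request that creates or replaces a route and its upstream."""
    body = {
        "uri": uri,
        "upstream": {
            "type": "roundrobin",  # weighted round-robin load balancing
            "nodes": nodes,        # mapping of "host:port" to weight
        },
    }
    return urllib.request.Request(
        url=f"{ADMIN_URL}/routes/{route_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"X-API-KEY": ADMIN_KEY, "Content-Type": "application/json"},
    )


request = build_route_update("1", "/get", {"httpbin.org:80": 1})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To apply the change against a running APISIX instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The configuration written this way is stored in etcd and picked up by the data plane, so no configuration file is edited and no process is restarted.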
Dynamic Orchestration of Plugin
In addition to hot reloading, APISIX supports real-time dynamic orchestration among plugins. Dynamic orchestration also brings infinite possibilities for the operation of plugins.
What is plugin orchestration? When various requirements come up, we hope to turn each requirement into a plugin and then combine the plugins like LEGO bricks. One of the joys of LEGO is that a unified standard for how bricks fit and connect lets you build an almost unlimited variety of shapes. Each APISIX plugin fulfills one independent requirement, so we keep asking whether users can assemble plugins to meet their own needs as if they were playing with LEGO.
For example, if there are 100 plugins in APISIX, users see only the functions of these 100 plugins rather than the flexibility underneath them. Therefore, when developing middleware, we need to consider what the product can become and how to give people more possibilities when they use it.
APISIX currently has nearly 100 plugins, but it offers far more than 100 possibilities. Once plugins can be arranged freely, the number of possible ordered combinations becomes 100 * 99 * 98 * 97 * 96 * ..., which is practically unlimited.
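To make this arithmetic concrete: the product 100 * 99 * 98 * ... counts ordered chains of distinct plugins. The short sketch below, using the article's round figure of 100 plugins, computes how many chains of a given length exist. The function name `ordered_chains` is made up for this illustration.

```python
import math

PLUGIN_COUNT = 100  # the article's round figure, not an exact plugin count


def ordered_chains(length, total=PLUGIN_COUNT):
    """Number of ways to pick `length` distinct plugins out of `total`, in order."""
    return math.perm(total, length)


for length in (1, 2, 3, 5):
    print(f"chains of {length} plugins: {ordered_chains(length):,}")
```

Chains of just three plugins already give 100 * 99 * 98 = 970,200 arrangements, and the count keeps growing rapidly with chain length.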
For instance, rate limiting a user usually produces an error code. You can chain a logging plugin or an error reporting plugin after the rate limiter to record what happened. The image below shows the model of APISIX’s plugin orchestration.
This feature has a hidden benefit: a complete test suite already covers each plugin's code, and users who arrange plugins do not need to write any code. The design is extremely friendly to product managers, security engineers, and operation and maintenance engineers, who do not need to spend long hours on training and learning. Instead, they can create a new APISIX plugin by dragging and dropping existing plugins and adjusting some settings. Furthermore, the quality of the new plugin is as high as that of the official open-source APISIX code.
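The rate-limit-then-log example can be sketched as a tiny orchestration engine in which the flow between plugins is plain data rather than code. This is a simplified model under stated assumptions, not APISIX's actual orchestration implementation: the plugin names (`limit-rate`, `error-log`, `proxy`), the `FLOW` table, and the `run` function are all hypothetical.

```python
def limit_rate(ctx):
    """Reject the request once the caller exceeds its quota."""
    ctx["count"] = ctx.get("count", 0) + 1
    if ctx["count"] > ctx["quota"]:
        ctx["status"] = 429
        return "rejected"
    return "passed"


def log_error(ctx):
    """Record the failed request for later inspection."""
    ctx.setdefault("log", []).append(f"status={ctx['status']} client={ctx['client']}")
    return "done"


def proxy(ctx):
    """Stand-in for forwarding the request to the upstream."""
    ctx["status"] = 200
    return "done"


PLUGINS = {"limit-rate": limit_rate, "error-log": log_error, "proxy": proxy}

# The orchestration itself: (plugin, outcome) -> next plugin to run.
FLOW = {
    ("limit-rate", "passed"): "proxy",
    ("limit-rate", "rejected"): "error-log",
}


def run(flow, start, ctx):
    """Walk the flow from `start`, returning the plugins that ran, in order."""
    visited = []
    step = start
    while step is not None:
        visited.append(step)
        outcome = PLUGINS[step](ctx)
        step = flow.get((step, outcome))  # None ends the chain
    return visited


ctx = {"client": "alice", "quota": 1}
print(run(FLOW, "limit-rate", ctx), ctx["status"])  # first request is within quota
print(run(FLOW, "limit-rate", ctx), ctx["status"])  # second request is rejected and logged
print(ctx["log"])
```

Because the wiring lives in the `FLOW` table, rearranging plugins means editing data, while the already-tested plugin functions stay untouched.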
Gateway for All-Direction Traffic
Server-side engineers who do some gateway-related development might be familiar with two basic concepts: North-South Traffic, which refers to traffic from clients, browsers, or IoT (Internet of Things) devices to the server, and East-West Traffic, referring to the mutual calls between systems and microservices within enterprises.
Different components handle north-south and east-west traffic. For example, north-south traffic may pass through a load balancer, then an API gateway, and then a service gateway, which is why components such as NGINX, APISIX, and Spring Cloud Gateway exist. For east-west traffic in a service mesh, you may use components such as Envoy. That seems like a lot of components, but their functions are similar: most provide routing, dynamic upstreams, and secure identity authentication. So can we unify the components that handle north-south and east-west traffic? Ideally, once a client's request enters the server side, APISIX handles all of it, and all traffic and data, whether north-south or east-west, are managed through one control plane. This is entirely achievable with APISIX's current technology.
Some of our users have already confirmed that this model significantly reduces operation and maintenance costs. It also reduces complexity, which improves the response speed of the entire system. Such positive feedback gives subsequent iterations of APISIX a clearer direction and lets APISIX take on more functions and roles as a full-traffic gateway.
Support for Multi-Service Discovery Components
The gateway is a fundamental but vital component. It processes all client requests and, more importantly, integrates with various systems and open-source projects.
Service discovery and registration is another essential component that the gateway must integrate with. Users register different services in different registries, such as Eureka and Nacos, so it is common for multiple service discovery components to coexist in large-scale, long-lived IT systems.
In this situation, the gateway is the ingress and egress for all traffic, yet almost all gateways support only one service discovery component. If route A needs Nacos and route B needs Consul, you must deploy multiple gateways, each matched to a different service discovery component.
Currently, APISIX supports service discovery on the data plane and is gradually adding support for service registration and discovery components on the control plane. For large-scale, long-lived enterprise systems, this is a highly efficient solution: you can connect to various service discovery and registration components by deploying only one API gateway.
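A minimal sketch of what "one gateway, several registries" can look like at the route level: each route's upstream names a service and the discovery component that resolves it, using the `service_name` and `discovery_type` fields of an APISIX upstream. The route URIs and service names below are hypothetical, and which `discovery_type` values are accepted depends on your APISIX version and on the discovery modules enabled in its configuration.

```python
import json


def route_with_discovery(uri, service_name, discovery_type):
    """Build a route definition whose upstream nodes come from service discovery."""
    return {
        "uri": uri,
        "upstream": {
            "type": "roundrobin",
            "service_name": service_name,      # name registered in the registry
            "discovery_type": discovery_type,  # which registry resolves the name
        },
    }


# Two routes on the same gateway, each resolved by a different registry.
routes = [
    route_with_discovery("/orders/*", "order-service", "nacos"),
    route_with_discovery("/users/*", "user-service", "consul"),
]

print(json.dumps(routes, indent=2))
```

Each definition could be submitted as the body of a route in the Admin API; the point is that the choice of registry is a per-route setting rather than a per-gateway one.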
Exploration of Multi-Cloud and Hybrid Cloud Scenarios
If users deploy the gateway in a production environment with a cloud-native architecture, multi-cloud and hybrid cloud will be long-term technical scenarios. Once APISIX has the features, performance, plugins, and multi-service discovery in place, the next question is how to help users run it well in production. Multi-cloud and hybrid cloud scenarios bring more challenges to APISIX, so the following details need to be considered.
1. Both Upstream and Downstream Support mTLS
Previously, we did not treat upstream mTLS support as a high priority. However, in a cross-cloud scenario, the upstream may be a service on another cloud or even a third-party SaaS service, so supporting upstream mTLS becomes necessary to improve data security.
2. Complete Separation of Control Plane and Data Plane Architecture
Several security vulnerabilities in APISIX were disclosed in the past year, most of which stem from deploying the control plane and the data plane together. In other words, after APISIX starts, the control plane and the data plane run in the same service. Once an attacker breaks into a data plane through a vulnerability, they can also reach the control plane and take over all of the data, with disastrous consequences.
3. Strengthen Security Management
The gateway generally stores some sensitive data. For example, some users keep SSL certificates or the credentials for connecting to etcd on the gateway. If etcd or the data plane is breached, severe data leakage may follow. Hence, when storing such essential information, it is necessary to consider a dedicated secrets manager such as Vault to protect sensitive data.
4. Integrate More Cloud Standards
We intend to ensure that users can run smoothly on various cloud platforms in multi-cloud scenarios without extra configuration. Instead of requiring users to configure customized plugins, APISIX will directly integrate the standards, APIs, and other services of various clouds. Adapting in advance this way makes subsequent use more convenient.
Supporters of Apache APISIX
Throughout the development history of APISIX, many technical innovations have been made. A growing community of contributors has worked hand in hand since the earliest code to build APISIX into an integrated API gateway. Today, APISIX keeps iterating rapidly thanks to these supporters' efforts.
The iteration and improvement of an open-source product owe everything to its contributors and users.
When APISIX was donated to the Apache Software Foundation three years ago, it was an immature project with only 20 contributors. Thanks to its rapid development, APISIX has since attracted more users, contributors, and enterprises worldwide. For example, our users include companies such as Tencent, WPS, Sina Weibo, and iQiyi, which carry tens of billions of API calls daily. Many international users from various industries, such as NASA, European Factory Platform, and Swisscom, are also among them.
Take WPS as an example. It is a SaaS office software company in China that provides software similar to Microsoft Office 365. Whether people work together on mobile phones, browsers, or terminal devices, dozens or hundreds of users can edit the same document and see each other's modifications simultaneously. This function is realized through a large number of API calls.
Many of these large companies handle tens of billions of API calls, with peak QPS exceeding one million. Usage at this scale gives APISIX feedback from real users under large-scale conditions. Thanks to the support of these enterprise users, APISIX has rapidly developed into a mature open-source project.
Besides, many users share their experience with APISIX or contribute code back to the community, which benefits both sides. This shows that users see APISIX not only as a good product but also as a worthy open-source project. Only after we win developers' trust do we have the opportunity to become a valuable open-source project.
Contributors turn experience and feedback into product features, and users turn those features into value for their enterprises. This virtuous circle is the essence of open source. It is what we are looking for, rather than blindly pursuing a bubble of follower counts.
APISIX 3.0 Preview
So far, we've talked a lot about what APISIX has done in the past and is doing now. The development of APISIX can be divided into several stages.
APISIX 1.0 built the framework, strengthened the underlying architecture, and delivered the essential functions of an API gateway. In version 2.0, we explored further, making the bottom layer more flexible and the architecture more mature.
The sign of a mature open-source project is not how powerful its functions are but how good an experience it brings to users and developers.
At the current stage, APISIX has done plenty of technical exploration and innovation but has not fully considered the user experience. The problem shows up as uneven documentation quality and a lack of tutorial videos. As a result, many users are at a loss when they first come into contact with APISIX. They don't know how to use it or how to apply certain functions to different use cases, and they are unsure what unique value APISIX can bring to their enterprises.
Therefore, in the upcoming APISIX 3.0, we will try to solve these problems and rework many flaws that hurt the developer experience. For example, we will redesign the API to remove its dependence on particular return values of etcd, and we will rewrite the official documentation to be more user-friendly, with straightforward descriptions. At the functional level, the control plane and the data plane will be separated to enhance security, and APISIX will support more layer-4 protocols and RPC protocols so that it can handle traffic in all directions.
After implementing the above functions, APISIX 3.0 will become more secure, reliable, and easier to use. We always pay attention to excellent operations and user experience. We hope APISIX can handle requests from all APIs and microservices worldwide and help enterprises manage API and microservice traffic more efficiently.
It is expected that APISIX will release a new version, 3.0, at the end of 2022. We hope you will continue to follow the trends of the latest version of APISIX and actively participate in the contribution of the Apache APISIX project.
The Future of APISIX
Server-side technology develops and gets replaced very rapidly: many technologies and frameworks that were popular five years ago are fading away. The reason is not that engineers simply prefer new things but that these technologies no longer meet the real needs of engineers and enterprises, so they are doomed to be phased out. The hard reality is that technology must serve businesses and products, and it cannot survive without matching current market needs.
"Don't build a castle on a swamp." This saying always reminds the developers of Apache APISIX that they should rely on actual needs and scenarios to guide its development and evolution. Otherwise, the product will be abandoned by engineers gradually.
How can we keep the technology of Apache APISIX at the leading edge of the industry? The answer determines whether Apache APISIX can continue to attract developers and enterprise users in the future, and it is simple: work with fast-growing companies and engineers, grow together, and support each other. That is what keeps Apache APISIX at the forefront of technology. Only by doing so can APISIX become a world-class, evergreen open-source project.
In the future, APISIX aims to better support serverless scenarios, improve its service mesh support, build a full API lifecycle platform, and improve the user experience on public clouds. These goals are set not by a few in-house developers but by thousands of developers across the industry. So join us to experience the charm of open source!