How Does vivo Integrate with APISIX

November 25, 2022

Case Study

Overview

vivo introduced Apache APISIX as its API gateway in May 2021. After more than a year in production at vivo, APISIX has solved many technical and business pain points and is now used on a large scale.

Pain Points before Using APISIX

  • Complex Management of Business Scenarios and System Maintenance

    As the business grew rapidly, so did the number of scenarios and the systems serving them, and vivo needed a unified way to manage them.

  • Interaction between the Data Plane and the Control Plane

    For a medium or large company like vivo, minor trouble in the data plane should not affect the control plane.

  • No Support for Multi-Dimension Resources

    Diverse projects lead to a wide variety of domain names and URLs, and business departments need to search by different resource dimensions.

  • Uncontrollable Impact of Problems

    vivo’s projects are complex, and the effects of the problems they encounter are hard to contain. The use of some complicated plugins makes this worse.

By replacing NGINX with APISIX, vivo achieved the results below.

Achievements after Using APISIX

  • High Availability

    No major failure has occurred since APISIX was launched at vivo, and system availability exceeds 99.99%.

  • High Performance

    APISIX carries significant online traffic and serves a large number of services; online forwarding traffic is currently close to one million QPS (queries per second).

  • Rich Features

    Thanks to its rich features, APISIX covers almost all common NGINX proxy scenarios. About 50% of vivo's projects have been migrated from NGINX to APISIX clusters.

  • Supporting Cloud-Native Construction and Development

    The bare-metal K8s environment that supports containerization has reached a scale of 10,000. About 40% of the projects have been migrated from bare metal and virtual machines to the K8s container platform, which supports and advances vivo’s containerization.

vivo's System Design Based on APISIX

Next, let’s look at vivo’s system design after adopting APISIX.

Customized Architecture Display on APISIX

vivo's API gateway architecture with APISIX

From the diagram above, we can analyze that vivo has:

  • Completed the construction of Layer 4 and Layer 7 traffic gateways, which are supported by APISIX
  • Realized the traffic access and mixed deployment of bare metal, virtual machines, and containers
  • Implemented APISIX cluster management
  • Connected its internal DevOps platform and business deployment services so that traffic can be onboarded quickly and automatically
  • Improved its monitoring

Configuration Management and Launching Improvements

In order to better meet the actual needs of the business department, vivo has carried out a series of adaptations on APISIX. Below are some of those adjustments, including control plane alteration, cluster separation management, and data forwarding.

Control Plane Alteration

The overall process works as follows:

After the data is configured in the A6 changes platform, the information is delivered through an RPC notification to ManagerAPI, which vivo built on the open-source APISIX Dashboard.

The change is then passed on to apisix-agent. APISIX polls apisix-agent at regular intervals through its privileged process to fetch change tasks in batches. The privileged process then notifies the workers through shared queues so that the change takes effect in memory.

Meanwhile, APISIX reports the task results to apisix-agent, which delivers them to ManagerAPI. The A6 changes platform can also poll ManagerAPI to get task results.
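The polling and acknowledgement loop described above can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions: the class and function names (ChangeTask, Agent, Worker, poll_once) are hypothetical, and the real components are separate processes that communicate over the network and through shared memory rather than Python objects.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ChangeTask:
    task_id: int
    resource: str  # e.g. "route" or "upstream"
    key: str
    value: dict


@dataclass
class Agent:
    """Hypothetical stand-in for apisix-agent: queues tasks, collects results."""
    pending: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)

    def submit(self, task):
        self.pending.append(task)

    def fetch_batch(self, limit):
        # Hand out at most `limit` pending tasks per poll.
        batch = []
        while self.pending and len(batch) < limit:
            batch.append(self.pending.popleft())
        return batch

    def report(self, task_id, ok):
        self.results[task_id] = ok


class Worker:
    """Hypothetical stand-in for a worker that keeps configuration in memory."""

    def __init__(self):
        self.memory = {}

    def apply(self, task):
        self.memory[(task.resource, task.key)] = task.value
        return True  # acknowledge that the change was applied


def poll_once(agent, workers, batch_size=10):
    """One polling round of the simulated privileged process."""
    shared_queue = deque(agent.fetch_batch(batch_size))
    while shared_queue:
        task = shared_queue.popleft()
        acks = [worker.apply(task) for worker in workers]
        # A task counts as successful only if every worker acknowledged it.
        agent.report(task.task_id, all(acks))
```

In this sketch, a task submitted to the agent stays pending until the next poll_once call; afterwards agent.results holds one entry per task, which is the value that a ManagerAPI-like component would read back.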

vivo's management and publishing mode

etcd is a highlight of APISIX: it lets the control plane and the data plane operate independently. Given the particular nature of its own architecture, however, vivo dropped etcd from the process above, for the following reasons.

Because vivo’s projects are diverse, there are many different domain names and URLs, and business departments need to query them along different dimensions. Because APISIX can be adapted to work not only with etcd but also with other kinds of databases, vivo could readily use a database such as MongoDB together with APISIX.

In addition, vivo made the following adaptations to stay compatible with Apache APISIX.

  • Developing the Agent Component

    When vivo introduced Apache APISIX in May 2021, it was not confident that it could adopt APISIX, because it had no experience with OpenResty and Lua. In addition, many non-forwarding tasks, such as log collection and monitoring, could increase the management complexity of the data plane. Consequently, vivo developed the agent component to reduce development complexity.

  • Writing Data to Disk

    To make the system adjustable and let the data plane run independently, reducing its reliance on the control plane, vivo writes the configuration to disk. When APISIX starts, it can either do a complete pull from the configuration center or read configuration resources directly from a directory on the local disk. This greatly improves the independence of the data and the robustness of the system. Moreover, the route and upstream information stored on disk is easy to read, which helps with troubleshooting.

  • Callback Change Task Result

    As a large company, vivo needs to make sure that changes to resources such as routes and upstreams actually take effect, and that the system reports an error when a change fails. This ACK (acknowledgement) logic ensures that the NGINX workers on a machine call back with the result. When the callbacks succeed, all APISIX workers have updated the relevant resources in memory.
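The startup behaviour described under "Writing Data to Disk" can be sketched as a pull-then-fall-back loader. The example is illustrative only: the function names, the one-JSON-file-per-resource-type layout, and the use of ConnectionError to signal an unreachable configuration center are all assumptions, not vivo's actual file format or code.

```python
import json
import tempfile
from pathlib import Path


def save_snapshot(directory, resources):
    """Write each resource type (e.g. routes, upstreams) to its own JSON file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for name, items in resources.items():
        text = json.dumps(items, indent=2, sort_keys=True)
        (directory / f"{name}.json").write_text(text)


def load_snapshot(directory):
    """Read every JSON file in the directory back into a dict keyed by file stem."""
    files = sorted(Path(directory).glob("*.json"))
    return {path.stem: json.loads(path.read_text()) for path in files}


def load_config(fetch_remote, directory):
    """Prefer a complete pull from the configuration center; else use local disk."""
    try:
        resources = fetch_remote()
    except ConnectionError:
        return load_snapshot(directory), "disk"
    save_snapshot(directory, resources)  # refresh the on-disk copy
    return resources, "remote"


def _center_up():
    return {"routes": [{"uri": "/a"}], "upstreams": [{"nodes": {"10.0.0.1:80": 1}}]}


def _center_down():
    raise ConnectionError("configuration center unreachable")


with tempfile.TemporaryDirectory() as tmp:
    first, first_source = load_config(_center_up, tmp)
    second, second_source = load_config(_center_down, tmp)
```

Because the snapshot is plain, pretty-printed JSON, the files can also be inspected by hand during troubleshooting, which matches the readability benefit mentioned above.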

Cluster Separation Management

Cluster Split Management

The open-source version of APISIX has all users share one etcd. However, vivo’s projects are complex, and the problems they encounter are hard to contain. In addition, complicated plugins are unavoidable, and they affect the system’s performance.

Therefore, vivo manages APISIX as separate clusters to isolate each cluster’s configuration, which can:

  • Control the fault domain and effectively support project complexity without affecting other projects
  • Effectively reduce the load caused by the APISIX non-forwarding layer when the container nodes change frequently
  • Reduce the load impact caused by the health check
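The isolation idea can be shown with a toy registry in which every cluster owns its own configuration store, so a change published for one project never touches another cluster. The class name, method names, and cluster labels below are hypothetical and only illustrate the fault-domain argument.

```python
from collections import defaultdict


class ClusterRegistry:
    """Toy model: one isolated configuration store per APISIX cluster."""

    def __init__(self):
        self.project_cluster = {}
        self.cluster_config = defaultdict(dict)

    def assign(self, project, cluster):
        self.project_cluster[project] = cluster

    def publish(self, project, key, value):
        # A change is written only to the store of the project's own cluster.
        cluster = self.project_cluster[project]
        self.cluster_config[cluster][key] = value
        return cluster


registry = ClusterRegistry()
registry.assign("shop", "cluster-a")
registry.assign("video", "cluster-b")
touched = registry.publish("shop", "route/checkout", {"uri": "/checkout"})
```

After the publish call, only the store for "cluster-a" contains the new route; "cluster-b" is untouched, so a faulty change or a heavy plugin in one project stays inside its own cluster.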

Increasing the QPS Carried by HTTPS

According to the relevant requirements of the Ministry of Industry and Information Technology in China, external network traffic must go through the HTTPS protocol. As a TLS-based HTTP encryption protocol, HTTPS heavily burdens the CPU in the encryption and decryption process.

When the routing and other configurations are the same, the traffic that HTTPS can carry is about 1/8 to 1/10 of that of HTTP.

After applying the patches for the Intel® QAT (QuickAssist Technology) accelerator card, vivo offloads decryption to the card, which frees up the CPU and increases the QPS that a single machine can carry over HTTPS. As can be seen from the figure below, the HTTPS load capacity of a single machine is about doubled.
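To make the ratios concrete, here is a small worked calculation. The HTTP baseline of 80,000 QPS per machine is a made-up figure chosen only for easy arithmetic; the article does not give absolute per-machine numbers. The divisors 8 and 10 and the factor of about 2 come from the text above.

```python
def https_capacity(http_qps, divisor, qat_factor=2):
    """Return (HTTPS QPS without QAT, HTTPS QPS with QAT) for one machine."""
    without_qat = http_qps / divisor
    with_qat = without_qat * qat_factor
    return without_qat, with_qat


HYPOTHETICAL_HTTP_QPS = 80_000  # assumption, not a figure from vivo

best_case = https_capacity(HYPOTHETICAL_HTTP_QPS, 8)
worst_case = https_capacity(HYPOTHETICAL_HTTP_QPS, 10)
```

Under this assumed baseline, HTTPS would carry 8,000 to 10,000 QPS without QAT and roughly 16,000 to 20,000 QPS with it, that is, about 1/4 to 1/5 of the HTTP capacity instead of 1/8 to 1/10.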

data showing vivo's improvement in carrying traffic

How Does vivo Combine Businesses with APISIX

Supporting Containerization Development

To support containerization development, vivo developed its own K8s ingress controller. Some of its functions are:

  1. Adapting to vivo’s modified asynchronous push mechanism for configuration changes

  2. Sending event-processing notifications from multiple K8s clusters to APISIX

  3. Coping with complex project scenarios such as:

     • One server with multiple ports

     • When servers of other RPC frameworks, such as Dubbo and gRPC, are connected to K8s, a unified set of processing logic is required to notify APISIX or other frameworks of port information according to the configuration characteristics of the projects

  4. Adapting to the particular needs of the company’s internal DevOps and other automation scenarios, to facilitate rapid deployment and traffic onboarding
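As an illustration of the "one server with multiple ports" case, the sketch below turns a service's pod IPs and named ports into one upstream per port, using the upstream shape that APISIX accepts (a load-balancing type plus a map of "host:port" to weight). The function name and the naming scheme for upstreams are assumptions; vivo's controller is not public.

```python
def build_upstreams(service, pod_ips, ports):
    """Build one APISIX-style upstream per named port of a service.

    `ports` maps a port name (e.g. "http", "grpc") to a port number.
    """
    upstreams = {}
    for port_name, port in ports.items():
        upstreams[f"{service}-{port_name}"] = {
            "type": "roundrobin",
            # Every pod becomes a node with equal weight.
            "nodes": {f"{ip}:{port}": 1 for ip in pod_ips},
        }
    return upstreams


example = build_upstreams(
    "orders",
    ["10.1.0.5", "10.1.0.6"],
    {"http": 8080, "grpc": 9090},
)
```

A controller built along these lines would recompute the upstreams whenever a pod is added or removed and then push only the changed entries to the gateway.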

Helping Projects Migrate from NGINX to APISIX

vivo’s projects are deployed on the existing NGINX cluster and have been running stably for a long time. Migrating them brings non-business workload and instability to the projects, which makes migration hard to push. So how can project migration to APISIX be promoted?

  • First, find a project of a cooperative department, serve the business department well to set a benchmark, and provide technical guidance and training

  • Build an easy-to-use control plane system to facilitate business access and multi-dimensional management of business departments

  • Provide the ability to automatically convert basic NGINX configuration into APISIX configuration
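The automatic conversion in the last point can be sketched for the simplest case: a server block with a server_name and prefix locations that proxy_pass to a single host:port. This is a minimal, regex-based illustration written for this article, not vivo's converter; it ignores regex locations, named upstream blocks, rewrites, and every other directive. The output follows the basic APISIX route shape (uri, host, and an upstream with roundrobin nodes).

```python
import re

# Matches: location /prefix/ { ... proxy_pass http://host:port...; }
LOCATION_RE = re.compile(
    r"location\s+(?P<prefix>/\S*)\s*\{[^}]*?"
    r"proxy_pass\s+http://(?P<node>[\w.\-]+:\d+)[^;]*;",
    re.S,
)
SERVER_NAME_RE = re.compile(r"server_name\s+(?P<host>[\w.\-]+)\s*;")


def nginx_to_apisix_routes(conf):
    """Convert simple prefix locations into APISIX-style route dicts."""
    host_match = SERVER_NAME_RE.search(conf)
    host = host_match.group("host") if host_match else None
    routes = []
    for match in LOCATION_RE.finditer(conf):
        route = {
            # An NGINX prefix location becomes an APISIX wildcard URI.
            "uri": match.group("prefix").rstrip("/") + "/*",
            "upstream": {
                "type": "roundrobin",
                "nodes": {match.group("node"): 1},
            },
        }
        if host:
            route["host"] = host
        routes.append(route)
    return routes


SAMPLE_CONF = """
server {
    server_name shop.example.com;
    location /api/ {
        proxy_pass http://10.0.0.1:8080;
    }
}
"""

converted = nginx_to_apisix_routes(SAMPLE_CONF)
```

For SAMPLE_CONF, the result is a single route for host shop.example.com with URI /api/* and one upstream node, 10.0.0.1:8080. A production converter would need a real NGINX configuration parser and a review step for directives that have no direct APISIX equivalent.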

Upgrading APISIX and Supporting Its Open-Source Version

vivo built its customized release on APISIX 2.4 and upgraded it to a newer version in Q2 this year.

On the one hand, thanks to APISIX’s modular architecture, it is relatively easy to integrate vivo’s modified Lua code into a branch of a higher APISIX version. On the other hand, vivo also keeps upgrading the OpenResty layer, at about one version per year. Because vivo relies on many patches and on useful functions such as QAT, upgrading this component is difficult and laborious.

The free community version of NGINX is slow to gain features and is not very active. vivo is considering whether to build jointly with APISIX. To reduce the manpower needed for system testing, vivo adopted Robot Framework, a generic test automation framework, for system integration testing. It is also promoting unit test coverage for the relevant components and a TDD (test-driven development) model.

vivo's Future Planning

Next year, vivo plans to extend APISIX from a traffic gateway into an API gateway, making use of its strengths in rate limiting, authentication, circuit breaking, and so on. vivo is also considering combining APISIX with DPDK-NGINX, and it will train technical personnel and take part in building the community. Furthermore, it will consolidate the fundamentals to lay a good foundation for traffic and service governance.

Welcome to learn more about Apache APISIX.

You can contact us at https://api7.ai/contact.

Topics:
vivo, APISIX, API Gateway