APISIX Elevates System Performance for the Risk Management Enterprise

Jing Yan

January 23, 2024

Case Study

Overview

About DataVisor

DataVisor specializes in fraud and risk management platforms, empowering organizations to counter evolving fraud attacks and proactively manage risks in real time. Its suite of solutions integrates patented machine learning technology, native device intelligence, and a robust decision engine to ensure protection throughout the entire customer lifecycle, spanning various industries and use cases. DataVisor holds a prominent position in the industry and is widely adopted by Fortune 500 companies worldwide.

Challenges

  • DataVisor faces the critical task of ensuring the efficient execution of risk control calculations to prevent potential risks.

  • Balancing security and preserving an excellent user experience presents a challenge for DataVisor.

  • The choice of a stable, low-latency API gateway tool is crucial for DataVisor to maintain smooth system operation.

Results

  • By leveraging APISIX, DataVisor has streamlined its operations, achieving enhanced flexibility through the packaging of specific plugins.

  • APISIX stands out for its capability to substantially minimize latency for DataVisor, leading to shorter processing times for user requests.

  • With APISIX, DataVisor has successfully processed large-scale requests while keeping the system stable and reliable under high load, providing a robust foundation for managing spikes in user traffic.

Background

The realm of risk management deals with substantial volumes of sensitive data and transaction information, necessitating reinforced security measures to ward off potential cyber-attacks and data breaches. Accordingly, just like most risk management enterprises, DataVisor needs a solution for gateway management to ensure security and efficiency.

In product development, DataVisor has adopted a comprehensive solution for gateway and authentication. APISIX is not just a standalone component within the product ecosystem; it collaborates with other products like AWS API Gateway, Application Load Balancer (ALB), Imperva, and an integrated OAuth authentication mechanism. All these tools, each equipped with gateway functionalities, work together to facilitate traffic access within DataVisor's system.

APISIX_Datavisor

Pain Points Before Using APISIX

In the pre-APISIX landscape, DataVisor encountered significant challenges in the risk control industry, highlighting the imperative need for a transformative solution to enhance efficiency and resilience.

  • Performance Pressure: In the risk control industry, where performance metrics are crucial, DataVisor faces the challenge of ensuring the timely and efficient execution of risk control calculations. Delays in these calculations could lead to a loss of control over potential risks in this competitive and risk-prone environment.

  • Balancing Security and User Experience: The primary goal of risk control tasks is to intercept potentially harmful user actions while preserving a seamless user experience. DataVisor's system must ensure user safety without compromising the natural flow of user interactions, posing a challenging balance.

  • Difficulty with Gateway Tools Meeting Requirements: Many API gateway tools on the market face issues like high latency or erratic performance. Such challenges can impact the stability and availability of DataVisor's system, especially when efficiently managing business traffic is crucial. Therefore, choosing a stable, low-latency API gateway tool is crucial for maintaining the system's smooth operation.

Why APISIX

When deciding on a solution for the company's production setup, DataVisor carefully compared options and settled on APISIX for several reasons:

  1. Cost-Effectiveness: Compared to basic application gateways from cloud providers (like ALB), APISIX offers significant cost savings for DataVisor's operations.

  2. High Performance, Low Latency: APISIX stands out for its performance. Unlike other API gateway tools, APISIX not only avoids noticeable delays but is also less prone to tail-latency spikes at high percentiles such as P99 or P9999, which ensures a smoother experience without significant latency issues.

  3. Industry-Specific Focus: In the field of risk control, DataVisor's business system requires each risk control computation to complete within 50 milliseconds. If the computation is not completed in time, the risk control result is dismissed immediately. The main goal of risk control is to intercept potentially harmful user actions without disrupting normal user activity.
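The two latency ideas in the list above, tail percentiles and a hard 50 ms budget, can be illustrated with a minimal Python sketch. This is not DataVisor's code: the function names, the nearest-rank percentile method, and the sample values are assumptions made for illustration.

```python
import math
from typing import Optional, Sequence

BUDGET_MS = 50.0  # the 50 ms budget described above


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def accept_result(decision: str, elapsed_ms: float,
                  budget_ms: float = BUDGET_MS) -> Optional[str]:
    """Return the decision if it arrived within the budget.
    A late result is dismissed (None), as the text describes."""
    return decision if elapsed_ms <= budget_ms else None


# Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [10.0] * 98 + [45.0, 120.0]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

A median like `p50` hides the slow tail entirely, which is why a gateway serving a fixed deadline is usually judged by its high percentiles rather than its average.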

Implementation of APISIX

Currently, DataVisor is expanding the usage of APISIX across a growing range of business scenarios.

Because DataVisor does not engage in direct business activities and primarily serves vendors that call its services, APISIX acts as the gateway for traffic on the public network. This deployment approach is somewhat unconventional: APISIX is typically situated on an intranet or one layer below the public network. However, deploying it directly on the public network allows APISIX to efficiently manage traffic originating from diverse business channels.

APISIX_Datavisor_process

To give a more concrete picture of how DataVisor uses APISIX in its production environment, a typical use case is described below.

Customer A accesses DataVisor's system via the red route to acquire an authorization access token. Through APISIX, the request reaches either DataVisor's internal authorization server or another authorization server, such as the widely used Okta. The company's primary authentication mechanism routes all traffic to Okta for the initial authentication step.

Once customers have obtained their tokens, APISIX's routing capabilities direct the authenticated traffic to the Kubernetes clusters. DataVisor currently runs two Kubernetes clusters in what it describes as an active-active deployment, with traffic routed to either Cluster A or Cluster B. In practice, traffic typically goes to one cluster while the other serves as a reserve, handling traffic only during extensive upgrades or cluster updates.
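The routing behavior described above can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not APISIX configuration or DataVisor's implementation: the `Cluster` type, the `route` function, and the cluster names are hypothetical, and the token check only tests for the presence of a bearer token rather than validating it against an authorization server.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Cluster:
    name: str
    available: bool = True  # False while the cluster is being upgraded


def pick_cluster(clusters: Sequence[Cluster]) -> Cluster:
    """Return the first available cluster in preference order."""
    for cluster in clusters:
        if cluster.available:
            return cluster
    raise RuntimeError("no cluster available")


def route(headers: Dict[str, str],
          clusters: Sequence[Cluster]) -> Tuple[int, Optional[str]]:
    """Reject requests that carry no bearer token; otherwise pick a cluster.
    Real token validation is out of scope for this sketch."""
    auth = headers.get("Authorization", "")
    prefix = "Bearer "
    if not auth.startswith(prefix) or not auth[len(prefix):].strip():
        return (401, None)
    return (200, pick_cluster(clusters).name)


# Normal operation: the preferred cluster takes the traffic.
normal = [Cluster("cluster-a"), Cluster("cluster-b")]
# During an upgrade of the preferred cluster, the reserve takes over.
upgrading = [Cluster("cluster-a", available=False), Cluster("cluster-b")]
```

Listing the clusters in preference order keeps the failover rule in one place: switching traffic during an upgrade is a matter of flipping one cluster's availability flag.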

APISIX_Datavisor_server

For the gateway itself, DataVisor has opted for a relatively straightforward and standard deployment approach. One interesting detail is that DataVisor places APISIX outside its Kubernetes cluster. This is possible because APISIX performs well while requiring minimal CPU resources, so small instances outside the cluster have proven effective at handling significant network traffic.

In its production environment, DataVisor has deployed three APISIX nodes, each potentially configured with only two cores, on small machines with 2 GB or 4 GB of memory to manage the traffic load. According to a DataVisor developer, APISIX performs on a par with NGINX and OpenResty, and perhaps even better than he initially expected.

Customization of APISIX

Enhancing the Privileged Process

NGINX has no concept of a privileged process, but OpenResty does, and there it stands at the same level as the worker processes. This process is special because it does not handle incoming network traffic and does not listen on any ports. However, it can perform various computations and data collection tasks. DataVisor has therefore extended this privileged process to meet its specific needs.

APISIX_Datavisor_backend

The diagram above provides a clear overview of the relationship between APISIX and DataVisor's backend services. The company's primary utilization of APISIX is for receiving and distributing traffic.

At the gateway level, APISIX pre-processes traffic before it enters the system. What sets DataVisor's configuration apart is a small additional process at the APISIX layer. Functioning like a sidecar, it runs alongside the APISIX worker processes and carries out its own designated tasks, such as data collection. It then passes the gathered data to APISIX, which conveys it to the upper layer of the system, where specific business rules are applied. This usage pattern is uncommon in typical business scenarios, but it can be useful in risk control.

APISIX_Datavisor_worker

How is the privileged process implemented? DataVisor's deployment follows the usual master-worker structure: worker processes handle business traffic, and the master process forks a single privileged process. The number of privileged processes is restricted to one, so DataVisor devised its own strategy: within the privileged process, it forks another process to handle additional tasks, so that this work does not interfere with the privileged process's own demanding responsibilities.

For data collection, the privileged process and the worker processes communicate through a shared dict, OpenResty's shared-memory dictionary. Its performance is robust enough to meet the demands of most scenarios.
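The collection pattern described above, in which traffic-handling workers write to a shared dictionary and a separate collector reads from it, can be sketched as follows. This is an in-process analogy only: it uses Python threads and a lock where OpenResty uses separate processes and shared memory, and the `SharedDict` class, its `incr` and `drain` methods, and the counter name are hypothetical rather than OpenResty's API.

```python
import threading
from typing import Dict


class SharedDict:
    """Lock-protected counter store standing in for a shared-memory dictionary."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, int] = {}

    def incr(self, key: str, amount: int = 1) -> int:
        """Atomically add `amount` to `key`, creating it at 0 if missing."""
        with self._lock:
            self._data[key] = self._data.get(key, 0) + amount
            return self._data[key]

    def drain(self) -> Dict[str, int]:
        """Atomically return all counters and reset the store."""
        with self._lock:
            snapshot, self._data = self._data, {}
            return snapshot


def worker(shared: SharedDict, requests: int) -> None:
    """Plays the role of a worker: records one count per handled request."""
    for _ in range(requests):
        shared.incr("requests_total")


def collect(shared: SharedDict, workers: int = 4,
            requests_per_worker: int = 1000) -> Dict[str, int]:
    """Plays the role of the collector: waits for the workers to finish,
    then drains whatever they wrote."""
    threads = [
        threading.Thread(target=worker, args=(shared, requests_per_worker))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.drain()
```

The point of the design is that workers never wait on the collector: they only perform a cheap increment, and the collector picks the data up on its own schedule.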

Developing Plugins

APISIX_Datavisor_plugin

As a result of DataVisor's modifications to APISIX, numerous functionalities in the packaged product are embedded deeply within the project, making dynamic adjustments challenging. Accordingly, DataVisor opted to package specific plugins, integrate them into the APISIX project, and then make modifications using the Dashboard.

Developing plugins with APISIX is convenient and makes it easy to create high-performance plugins. In addition to Lua, APISIX currently supports plugin development in other programming languages such as Java, Go, and Python. This versatility lets users implement a wide range of functionality.
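To make the idea of packaged, configurable plugins concrete, here is a minimal conceptual sketch in Python. It is not the APISIX plugin API or any plugin runner's interface: the registry, the `register` decorator, the two example plugins, and the configuration shape are all hypothetical. The only behavior borrowed from APISIX is that plugins carry a priority and that higher-priority plugins run first.

```python
from typing import Callable, Dict, Optional, Tuple

Request = Dict[str, str]
# A handler returns None to continue, or (status, body) to stop the chain.
Handler = Callable[[Request, dict], Optional[Tuple[int, str]]]

REGISTRY: Dict[str, Tuple[int, Handler]] = {}


def register(name: str, priority: int) -> Callable[[Handler], Handler]:
    """Add a handler to the registry under `name` with the given priority."""
    def decorator(fn: Handler) -> Handler:
        REGISTRY[name] = (priority, fn)
        return fn
    return decorator


@register("ip-blocklist", priority=200)
def ip_blocklist(request: Request, conf: dict) -> Optional[Tuple[int, str]]:
    if request.get("client_ip") in conf.get("blocked", []):
        return (403, "blocked")
    return None


@register("tag-request", priority=100)
def tag_request(request: Request, conf: dict) -> Optional[Tuple[int, str]]:
    request["x-tag"] = conf.get("tag", "default")
    return None


def run_plugins(request: Request, route_conf: Dict[str, dict]) -> Tuple[int, str]:
    """Run only the plugins enabled in route_conf, highest priority first."""
    enabled = sorted(
        ((REGISTRY[name][0], name) for name in route_conf if name in REGISTRY),
        reverse=True,
    )
    for _, name in enabled:
        result = REGISTRY[name][1](request, route_conf[name])
        if result is not None:
            return result
    return (200, "ok")
```

Because the set of enabled plugins and their settings live in `route_conf` rather than in the code, behavior can be changed by editing configuration alone, which is the kind of dynamic adjustment the packaging approach above is meant to allow.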

Achievements After Using APISIX

The deployment of APISIX has resulted in an overall enhancement of the DataVisor system's performance, yielding exceptional production outcomes.

  • Latency Reduction: One of the standout features of APISIX is its ability to substantially reduce latency. Compared with alternative solutions, DataVisor has observed shorter processing times for user requests, a critical factor in delivering a better user experience.

  • Throughput Boost: The introduction of APISIX has led to a significant increase in throughput, allowing the system to handle concurrent requests more efficiently. In contrast to its experience with other API gateway products, DataVisor has processed large-scale requests successfully with APISIX while keeping the system stable and reliable under high load. This provides a dependable foundation for managing spikes in user traffic.

APISIX_Datavisor_effect

  • Effortless, Multilingual Plugin Development: Developing plugins with APISIX is user-friendly and makes it easy to create high-performance plugins. In addition to Lua, APISIX supports plugin development in other programming languages, including Java, Go, and Python. This versatility has improved DataVisor's development experience and lets developers draw on their existing expertise.

Summary

To sum up, DataVisor's experience with APISIX is noteworthy. By using APISIX, DataVisor has streamlined operations, kept its system stable, and achieved faster processing times for large-scale requests. APISIX has also given DataVisor a developer-friendly environment for creating, customizing, and optimizing high-performance plugins. Together, these results provide a robust technical foundation and contribute to DataVisor's resilience in the risk control industry.

Tags:
Risk Control