Give Up Spring Cloud Gateway! How Huanbei, a Fintech APP, Utilizes Apache APISIX

September 21, 2022

Case Study

Love & Hate for Java

Why Do Financial Systems Prefer Java

Java has been popular with developers ever since its release, thanks to the strengths of the language and its enormous ecosystem.

Over the past 15 to 20 years, many financial systems have chosen Java as their primary tech stack. After some investigation, we identified the following advantages of Java:

[Figure: Advantages of Java]

For these reasons, Java has won the favor of financial software systems.

The Status Quo of Java in the Cloud-Native Era

With the rapid development of technology, the industry will soon abandon monolithic architecture, and microservices and cloud-native will become the mainstream. In this environment, however, Java has started to lose its dominant role in some business scenarios.

Java's Shortcomings

First, Java’s performance is comparatively weak, which becomes clear when you compare it with C-based tech stacks.

Second, Java runs on a virtual machine, which takes care of Java’s memory management. As a result, Java becomes less competitive when high performance or dynamic changes are required.

Apart from that, Java needs far more resources. It is easy to build a framework when cost is no concern. In the cloud-native era, however, computing resources are allocated at a much finer granularity, so they are more precious than ever. Java is heavyweight and needs restarting from time to time, so it consumes a lot of resources to run. As a result, Java struggles when there are high demands on QPS and business continuity.

Last but not least, pointers are also worth discussing. Pointers are a valuable tool for developers who are familiar with C/C++. Java, however, runs on a virtual machine, so memory is managed by the GC (Garbage Collection) rather than by developers. Java’s performance might therefore fall short where there are strict demands for high concurrency, heavy traffic, and high performance.

Why Did HuanBei Choose APISIX

Shuhe Group is a financial technology platform that provides efficient, intelligent services for companies and individuals; its products include HuanBei, EnjoyPay, and others. HuanBei is a platform that offers an installment service covering multiple consumption scenarios. By working with licensed financial institutions, HuanBei also provides personal loan services and offers startup loan funds. HuanBei has always developed its products on a Java tech stack.

Spring Cloud Gateway is a gateway that aims to better manage microservices in the Spring Cloud ecosystem. It is a decent API gateway for companies that use Java as their primary development language. However, in its recent API gateway upgrade, HuanBei abandoned the long-used Spring Cloud Gateway and started using Apache APISIX.

Architecture Differences Between Spring Cloud Gateway & APISIX

HuanBei used three different API gateway systems before adopting APISIX. It used Spring Cloud Gateway as the gateway for its operation and exit systems and OpenResty as the gateway for its business system.

HuanBei initially chose Spring Cloud Gateway for its operation and exit systems because Spring Cloud has an enormous ecosystem and a distributed development system that is easy to deploy and maintain. To build up its business models rapidly, HuanBei used all of the services provided by Spring Cloud when the business foundation was laid.

As the business developed, the gateway started to show stability issues in the original architecture, such as memory overflow and high CPU usage. Therefore, HuanBei adopted Apache APISIX as the only gateway in the architecture, both to improve gateway performance and to manage the previously separate gateways in a uniform way.

In the new gateway architecture, the gateway forwards request traffic directly to the business system via service discovery. However, if the backend application doesn’t support service discovery or has no healthy Pod registered in Consul, the gateway redirects the traffic to the previous intranet K8s Ingress instead.
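
The fallback rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Instance` type, the `pick_upstream` function, the in-memory `registry` dictionary that stands in for Consul, and all addresses are hypothetical and are not taken from HuanBei's gateway.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Instance:
    """One registered backend Pod (hypothetical shape)."""
    address: str
    healthy: bool


def pick_upstream(service: str,
                  registry: Dict[str, List[Instance]],
                  ingress_address: str) -> str:
    """Return where the gateway should send traffic for `service`."""
    # A service that does not use service discovery has no entry at all.
    instances = registry.get(service, [])
    healthy = [inst for inst in instances if inst.healthy]
    if healthy:
        # Simplification: take the first healthy Pod; a real gateway
        # would load-balance across all healthy Pods.
        return healthy[0].address
    # No healthy Pod found: fall back to the intranet K8s Ingress.
    return ingress_address


# Hypothetical registry contents, standing in for Consul.
registry = {
    "orders": [Instance("10.0.0.1:8080", False), Instance("10.0.0.2:8080", True)],
    "reports": [Instance("10.0.1.1:8080", False)],
}
INGRESS = "ingress.internal:80"

for name in ("orders", "reports", "legacy"):
    print(name, "->", pick_upstream(name, registry, INGRESS))
```

In this sketch only `orders` is routed through service discovery; `reports` (no healthy Pod) and `legacy` (not registered) both fall back to the Ingress address.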

The Practical Application of APISIX

HuanBei could not use Apache APISIX out of the box because of its multiple internal gateway architectures, so it had to modify the APISIX configuration for its real business scenarios.

APISIX Build & Deployment

During internal development, HuanBei kept the APISIX gateway source code and its custom code in different paths so that they could work together while iterating independently. HuanBei deploys the gateway as a Docker image: it first builds a default image based on one specific APISIX version, and then wraps the custom code into a new image on top of it.

The custom code is not wired in by using lua_package_path to specify a code directory; instead, it is copied directly over the apisix source directory in the default image, overwriting any file with the same name. The Dockerfile is shown below:

# Start from the default image built on one specific APISIX version
FROM registry.xxx.net:5001/apisix-shuhe:v1.5
ENV APP_NAME={{APP_NAME}}
COPY {{PRODUCT_FILE}} /tmp/deploy2/artifact.tar.gz

# Unpack the custom code, overlay it on the APISIX sources, install the
# per-application config files, and rewrite the first line of the apisix launcher
RUN mkdir /tmp/deploy/ && tar -xf /tmp/deploy2/artifact.tar.gz -C /tmp/deploy/ && \
cp -R /tmp/deploy/apisix/ /usr/local/apisix/ && \
cp /tmp/deploy/bin/apisix /usr/bin/apisix && \
cp /tmp/deploy/conf/apisix-$APP_NAME.yaml /usr/local/apisix/conf/apisix.yaml && \
cp /tmp/deploy/conf/config-$APP_NAME.yaml /usr/local/apisix/conf/config.yaml && \
set -x && \
bin='#! /usr/local/openresty/luajit/bin/luajit\npackage.path = "/usr/local/apisix/?.lua;" .. package.path' && \
sed -i "1s@.*@$bin@" /usr/bin/apisix && \
rm -rf /tmp/*

By default, APISIX stores its logs locally (they can also be collected by Syslog or other plugins). By modifying the NGINX config template to check the active Profile, we can decide whether to store the logs locally or send them to FLUENTD through Syslog. The FLUENTD_HOST variable also needs to be replaced when building the image. The template is shown below:

{% if gw_profile and string.find(gw_profile, 'local') then %}
access_log logs/access.log main;
error_log logs/error.log warn;
{%else%}
access_log syslog:server=${FLUENTD_HOST}:5141 json_format;
error_log syslog:server=${FLUENTD_HOST}:5142 warn;
{%end%}
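
The branch in the template above can be read as the following Python sketch. It only mirrors the decision logic for illustration; the function name `log_directives` and the sample host name are hypothetical, and the real rendering is done by the gateway's Lua template engine, not by Python.

```python
from typing import List, Optional


def log_directives(gw_profile: Optional[str], fluentd_host: str) -> List[str]:
    """Return the NGINX log directives the template would emit."""
    # Same condition as the template: a profile is set and contains "local".
    if gw_profile and "local" in gw_profile:
        return [
            "access_log logs/access.log main;",
            "error_log logs/error.log warn;",
        ]
    # Any other profile (or no profile) ships logs to FLUENTD over Syslog.
    return [
        f"access_log syslog:server={fluentd_host}:5141 json_format;",
        f"error_log syslog:server={fluentd_host}:5142 warn;",
    ]


# "fluentd.example.internal" is a placeholder host name.
for profile in ("local", "prod", None):
    print(profile, log_directives(profile, "fluentd.example.internal"))
```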

In the NGINX config template, HuanBei not only modified the log storage but also added ENV environment variables and lua_shared_dicts config in loops and fixed some NGINX optimization parameters.

HuanBei splits traffic according to different business needs and runs multiple gateways with similar functionality. It therefore follows a “single source code, multiple gateway applications” approach internally. First, HuanBei configures each gateway’s config-xxx.yaml file by using the Profile function; then, when images are built through the DevOps platform, a separate Docker image is built for each gateway based on its application name.
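
As a rough illustration of how one code base can yield several gateway images, the sketch below maps an application name to the config files that the Dockerfile shown earlier copies into place. The file paths follow that Dockerfile; the function name and the sample application names (`gw-operation`, `gw-business`) are hypothetical.

```python
from typing import Dict


def config_files_for(app_name: str) -> Dict[str, str]:
    """Map per-application config files to their fixed locations in the image."""
    if not app_name:
        raise ValueError("app_name must not be empty")
    return {
        f"conf/apisix-{app_name}.yaml": "/usr/local/apisix/conf/apisix.yaml",
        f"conf/config-{app_name}.yaml": "/usr/local/apisix/conf/config.yaml",
    }


# Each application name selects a different pair of source files,
# but every image ends up with the same two target paths.
for app in ("gw-operation", "gw-business"):
    for source, target in config_files_for(app).items():
        print(f"{app}: {source} -> {target}")
```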

Enterprise-level Customized Plugins

When a user visits the operation system internally, its pages call many backend APIs to fetch data, and these APIs need to be included in the whitelist of the API gateway configuration. The permission system also has to maintain a corresponding API list, since each page exposes a different range of APIs depending on the logged-in user’s role in the operation system. Whenever a page calls a new API, developers need to add the configuration twice, once in the gateway and once in the permission system, which is redundant and repetitive.

To remove this duplication, HuanBei removed the barrier between the gateway config and the permission system’s config and kept only the permission system as the entry point. The gateway configuration management system fetches the API list from the permission system periodically and converts it into the gateway’s API whitelist config. This saves users one unnecessary configuration step and helps the permission system enforce permission control. In addition, it guarantees that every backend API called from an operation page exists in the permission system config.

In business scenarios, there are always needs that native plugins cannot satisfy, which calls for custom development. APISIX provides many tools that make it easy to build custom plugins. The following table lists some of the custom plugins that HuanBei developed on top of APISIX:
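
A minimal sketch of one round of this periodic sync is shown below. The shape of the permission records (`method` and `path` keys), the function names, and the sample paths are all assumptions made for illustration; the source does not describe the actual data format or API of either system.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Route = Tuple[str, str]  # (HTTP method, path)


def build_whitelist(permission_apis: Iterable[Dict[str, str]]) -> List[Route]:
    """Convert permission-system API records into a sorted, de-duplicated whitelist."""
    routes = {(api["method"].upper(), api["path"]) for api in permission_apis}
    return sorted(routes)


def sync_once(fetch_permission_apis: Callable[[], Iterable[Dict[str, str]]],
              current: List[Route]) -> Tuple[List[Route], bool]:
    """Fetch the latest API list and report whether the whitelist changed."""
    latest = build_whitelist(fetch_permission_apis())
    return latest, latest != current


def fake_fetch() -> List[Dict[str, str]]:
    """Stand-in for a call to the permission system (hypothetical data)."""
    return [
        {"method": "get", "path": "/ops/users"},
        {"method": "POST", "path": "/ops/users"},
        {"method": "GET", "path": "/ops/users"},  # duplicate after normalization
    ]


whitelist, changed = sync_once(fake_fetch, [])
print(whitelist, changed)

# A second round with unchanged data reports no change, so the
# gateway config would not need to be rewritten.
whitelist, changed = sync_once(fake_fetch, whitelist)
print(whitelist, changed)
```

A scheduler would call `sync_once` at a fixed interval and push the whitelist to the gateway only when it has changed.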

| Plugin | Stage | Description |
| --- | --- | --- |
| gw-token-check | access_by_lua | Verify the token, the token has special verification rules |
| gw-limit-rate | access_by_lua | Rate limit API priority requests |
| gw-request-decrypt | access_by_lua | Decrypt requests |
| gw-sign-check | access_by_lua | Verify requests |
| gw-mock-plugin | access_by_lua | Connect to company’s mock platform, and transfer the mock API to the mock platform, only open to development and testing environment |
| gw-micro-env | rewrite_by_lua header_filter_by_lua | Support company’s microservice environment, only open to development and testing environment |
| registry-plus | access_by_lua | Receive outside notifications |
| serv-maintenance-check | access_by_lua | Close website maintenance mode |
| ingress-metric-rpt | log | Customized metrics reports |
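
To make the Stage column easier to read, the sketch below groups the plugins from the table by the request phase in which they run, using the usual NGINX/OpenResty order (rewrite, then access, then header filter, then log). Treating gw-micro-env as running in two phases is our reading of its table entry. Within a phase, APISIX orders plugins by priority, which the table does not give, so the sketch simply keeps the table order.

```python
from typing import Dict, List, Tuple

# Phases in the order a request passes through them.
PHASE_ORDER = ["rewrite_by_lua", "access_by_lua", "header_filter_by_lua", "log"]

# (plugin name, phases) as listed in the table above.
PLUGINS: List[Tuple[str, List[str]]] = [
    ("gw-token-check", ["access_by_lua"]),
    ("gw-limit-rate", ["access_by_lua"]),
    ("gw-request-decrypt", ["access_by_lua"]),
    ("gw-sign-check", ["access_by_lua"]),
    ("gw-mock-plugin", ["access_by_lua"]),
    ("gw-micro-env", ["rewrite_by_lua", "header_filter_by_lua"]),
    ("registry-plus", ["access_by_lua"]),
    ("serv-maintenance-check", ["access_by_lua"]),
    ("ingress-metric-rpt", ["log"]),
]


def plugins_by_phase(plugins: List[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Group plugin names under each phase, keeping phases in request order."""
    grouped: Dict[str, List[str]] = {phase: [] for phase in PHASE_ORDER}
    for name, phases in plugins:
        for phase in phases:
            grouped[phase].append(name)
    return grouped


for phase, names in plugins_by_phase(PLUGINS).items():
    print(f"{phase}: {', '.join(names)}")
```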

Gateway Canary Release

Previously, HuanBei used OpenShift as its K8s container platform (it has since upgraded to an ACK cluster), and the Ingress was built with HAProxy.

Because the HAProxy of the public network K8s Ingress couldn’t split the traffic of a single domain between routes in two different Namespaces, we needed to deploy the new gateway in the same Namespace as the old gateway. In other words, each domain’s route would have multiple services behind it, and we could control the traffic going to the new and old gateways by assigning each a different proportion of the total traffic.

The actual implementation flow is as follows: groups c and d are added to deploy the new gateway under the old gateway’s Namespace, which makes it possible to control the proportion of traffic between the new and old gateways.
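
The effect of such a proportional split can be simulated with a short Python sketch. Groups c and d come from the description above; the names a and b for the old gateway's groups, the weights, and the function names are assumptions for illustration. The real split is performed by the Ingress, not by application code.

```python
import random
from collections import Counter
from typing import Dict

# Which gateway each service group belongs to (a and b are assumed names).
GROUP_TO_GATEWAY = {"a": "old", "b": "old", "c": "new", "d": "new"}


def pick_group(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick one group at random, in proportion to its weight."""
    groups = list(weights)
    return rng.choices(groups, weights=[weights[g] for g in groups], k=1)[0]


def simulate(weights: Dict[str, int], requests: int, seed: int = 0) -> Counter:
    """Count how many simulated requests reach the old and new gateways."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(requests):
        counts[GROUP_TO_GATEWAY[pick_group(weights, rng)]] += 1
    return counts


# Send roughly 10% of the traffic to the new gateway (groups c and d).
counts = simulate({"a": 45, "b": 45, "c": 5, "d": 5}, requests=10_000)
print(counts)
```

Raising the weights of c and d step by step while lowering those of a and b shifts traffic gradually to the new gateway; setting the new groups' weights to zero routes everything back to the old one.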

Many Java developers choose Spring Cloud for their microservice architecture because it supports Java seamlessly and embeds class libraries in the source code. In practice, however, upgrades become difficult. For example, if a team supports 10 different languages, each with 10 different versions, it needs to maintain 100 different class libraries.

Today, a proxy (API gateway) can easily resolve these multi-version and multi-language issues. So what are the benefits for companies that use a Java tech stack and choose APISIX as their API gateway? Based on HuanBei’s practical experience, we summarize them from two perspectives.

For the Company

1. Great Functionality & Performance

In HuanBei’s stress tests on 4-core virtual machines with no plugins enabled, APISIX reached a QPS of 80k. Furthermore, APISIX resolved the performance issues that Spring Cloud Gateway had when receiving consumer traffic, and performance in the production environment improved by 30% compared to the previous gateways.

Apart from that, APISIX met all of the company’s requirements in actual testing, such as authentication and identification, observability, service discovery, rate limiting, and Layer 4 and Layer 7 traffic forwarding. Regarding feature extension, APISIX supports more than 70 plugins, and most businesses can rely on its native plugins, which saves a tremendous amount of development work.

2. Reduce Business Cost

Before using APISIX, the company had to add servers to resolve performance issues, which significantly increased hardware costs.

HuanBei calculated that its server costs fell by about 60% after adopting APISIX. Furthermore, with a unified tech stack, the business can quickly extend features based on APISIX’s native architecture, which reduces development expenses and shortens product release time.

For Developers

1. Meet Business Requirements

All software and technology used in the business should serve its needs. Practical tests and research showed that APISIX has better stability, observability, and extensibility.

Software exists to serve the business. Whatever tech stack a company uses, if a component meets a business requirement and helps the company save resources, the company should use that component.

2. Reduce Maintenance Costs

Compared to the previously used OpenResty, APISIX has a lower learning cost and is easier to maintain. APISIX’s rich set of plugins also simplifies the development and deployment of some standard features, which shortens product release time.

In addition, APISIX’s powerful logging and dynamic debugging features help pinpoint points of failure in the business, so we can locate errors rapidly and save time.

During the stage of rapid, unchecked growth, the only thing that matters is efficiency. To launch business models more quickly, each technical director builds systems in the language they are most familiar with and makes their own choices when selecting low-level frameworks. Different directors choose different tech stacks, which causes many troubles later. As a result, most active financial enterprises and financial services companies face the same technical issue: multiple tech stacks. When this happens, they need to consolidate their tech stacks into one.

Once the company’s business is on the right track, it is time to split the system vertically. The company needs to turn its siloed architecture into a three-layer architecture consisting of a front end, a middle end, and a back end. Then, once the system operates stably, it is time for the company to implement lower-level components itself.

The final goal of building a system is sharing. The more reusable a system is, the lower its maintenance expenses. Therefore, when business operations stabilize, many companies start splitting vertically or implementing lower-level fundamental components to control maintenance expenses.

For companies, expense is always the most important factor. During the rapid growth stage, companies only need to launch and get the business operating as soon as possible. In the current broader environment, however, expense has the highest priority within the budget, so we have to choose between efficiency and cost. Companies therefore do not adopt cutting-edge technology when the budget is limited. Likewise, when choosing a tech stack, technical staff would not ask questions such as: How would this new technology affect the team? How many benefits could this new technology bring to the current infrastructure?

Technical staff would only consider the expense of this new technology.
