Multi-layer Caching in API Gateway Tackles High Traffic Challenges

January 26, 2024


As API usage continues to grow in modern development, so does the demand for an efficient and reliable API gateway. The API gateway serves as the single entry point for all incoming API requests, managing them and distributing them across microservices. Despite these benefits, an API gateway can face challenges in high-traffic scenarios.

Caching Mechanism of APISIX

The following flowchart illustrates the efficient caching mechanism used by APISIX to minimize latency and improve performance. By caching responses at multiple levels, APISIX can effectively reduce the load on upstream servers and provide a more responsive experience for clients.

Client <-- HTTP Request --> APISIX Worker
    (Check LRU Cache in process level)
    (No cache hit)
    (Check Shared DICT Cache in data plane level)
        (Lock not acquired)
            (Acquire lock, check cache)
                (Cache hit)
                (Return cached response, release locks)
                (Cache miss)
                    (Query Redis)
                        (Acquire Mutex)
                            (Query Redis)
                                (Cache miss)
                                    (Retrieve response from upstream)
                                    (Cache response in shared DICT cache)
                                    (Return response to client)
                                (Cache hit)
                                    (Copy response to shared DICT cache)
                                    (Return cached response to client)
                                (Release Redis Mutex)
                        (Release lock)
    (Cache hit)
        (Return cached response)
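
The lookup order in the flowchart can be sketched in Python. This is a minimal single-process illustration under stated assumptions, not APISIX's implementation (APISIX itself is written in Lua): plain dicts stand in for the worker LRU cache, the shared dict, and Redis, and `fetch_from_upstream` is a hypothetical placeholder for the upstream call. Copying values into the worker cache on the way back is also an assumption, since the flowchart does not show that step.

```python
import threading

# Hypothetical stand-ins for the three layers in the flowchart.
l1_worker_cache = {}   # per-worker LRU cache (eviction omitted in this sketch)
l2_shared_dict = {}    # shared dict visible to every worker on the node
l3_redis = {}          # external Redis, simulated with a dict
shared_dict_lock = threading.Lock()
redis_mutex = threading.Lock()


def fetch_from_upstream(key):
    """Placeholder for the real upstream request."""
    return f"response-for-{key}"


def lookup(key):
    """Return (value, layer_that_served_it), following the flowchart order."""
    # 1. Process-level cache: private to the worker, so no lock is needed.
    if key in l1_worker_cache:
        return l1_worker_cache[key], "L1"

    # 2. Node-level shared dict: acquire the lock, then check the cache.
    with shared_dict_lock:
        if key in l2_shared_dict:
            value, layer = l2_shared_dict[key], "L2"
        else:
            # 3. Redis: a second mutex so only one caller queries it at a time.
            with redis_mutex:
                if key in l3_redis:
                    value, layer = l3_redis[key], "L3"
                else:
                    value, layer = fetch_from_upstream(key), "upstream"
            l2_shared_dict[key] = value  # cache or copy the response in the shared dict

    l1_worker_cache[key] = value  # assumption: also fill the worker cache
    return value, layer


l3_redis["route:1"] = "cached-in-redis"
print(lookup("route:1"))  # served from L3, then copied into L2 and L1
print(lookup("route:1"))  # now served from L1
```

Each layer is consulted only when the one before it misses, so the locks are taken only on the slower paths.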

LRU: First-Layer Caching at the APISIX Single-Worker Level

The LRU (Least Recently Used) cache at the worker level of APISIX caches frequently accessed data within each worker process. When the cache is full, the LRU eviction algorithm discards the least recently used entries first, so the most recently accessed data stays in memory. By keeping data such as routing rules or authentication tokens in memory, APISIX reduces the latency and cost of querying external data sources and improves response speed.

Because most lookups are served from the worker's own memory, APISIX uses system resources efficiently when handling a large volume of requests, which improves overall performance and stability. This gives developers a reliable and efficient API gateway for communicating with external services.
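
To make the eviction rule concrete, here is a minimal LRU cache sketch in Python. It is an illustration of the general technique, not APISIX's code (APISIX is written in Lua); the `LRUCache` class, its method names, and the route keys are all hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """A small least-recently-used cache with a fixed capacity."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.set("route:/users", "upstream-a")
cache.set("route:/orders", "upstream-b")
cache.get("route:/users")                   # touch /users, so /orders is now the oldest
cache.set("route:/payments", "upstream-c")  # cache is full: /orders is evicted
print(cache.get("route:/orders"))  # None
print(cache.get("route:/users"))   # upstream-a
```

Reading an entry counts as a use, which is why `/users` survives even though it was inserted first.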

Shared Dict: Second-Layer Caching at the APISIX Node Level

The shared memory dictionary (shared dict) cache is shared among all worker processes in one APISIX node. It serves as a centralized cache for commonly accessed data, such as API response data or response headers. Multiple worker processes can access and update this cache concurrently, which keeps data consistent across workers and avoids storing a duplicate copy in each one.

The shared dict achieves high performance through memory locking and efficient data structures, which together minimize contention and maximize throughput. Memory locking controls concurrent access, ensuring consistency when multiple worker processes read and write at the same time. The efficient data structures keep retrieval and update operations fast.

The shared dict improves the performance and scalability of the APISIX data plane, giving developers a reliable tool for handling large volumes of data and requests.
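
The role of locking can be illustrated with a small Python sketch. It is an analogy under stated assumptions rather than APISIX's implementation: threads stand in for worker processes, an ordinary dict stands in for shared memory, and the `SharedDict` class and its method names are hypothetical.

```python
import threading


class SharedDict:
    """A dictionary shared by several workers and guarded by one lock."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def incr(self, key, amount=1):
        # The read-modify-write happens under the lock, so no update is lost.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + amount
            return self._data[key]


shared = SharedDict()


def worker(requests):
    """Simulated worker: records one hit per request in the shared dict."""
    for _ in range(requests):
        shared.incr("hits")


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared.get("hits"))  # 4000
```

All four workers update a single copy of the data, and the lock keeps concurrent updates from overwriting each other.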

APISIX Multi-Layer Caching Mechanism

The diagram below illustrates the multi-layer caching mechanism of APISIX, which works like a funnel. The L1 cache is an LRU cache within each worker, the L2 cache is a shared dict among multiple workers, and the L3 cache is a Redis database external to the API gateway.

For example, suppose 10,000 user requests query data through APISIX. If the L1 cache hit rate is 90%, 9,000 requests are returned directly and the remaining 1,000 query the L2 cache. If the L2 hit rate is also 90%, 900 of those are served from L2 and only 100 proceed to the L3 cache, Redis. Before these 100 requests query Redis, each must first acquire a mutex (mutual exclusion lock), which ensures that only one request queries Redis at a time and prevents the dogpile effect.
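
The funnel arithmetic and the mutex step can both be sketched in Python. This is a minimal illustration under stated assumptions, not APISIX's implementation: `funnel`, `query_redis`, and `get_with_mutex` are hypothetical names, Redis is simulated by a function, and threads stand in for concurrent requests. The sketch also re-checks the cache after acquiring the mutex, so that later requests reuse the first result instead of querying again; that re-check is an assumption about how the mutex is used.

```python
import threading


def funnel(total_requests, hit_rates_percent):
    """Return the requests served by each layer and the number that fall through."""
    remaining = total_requests
    served = []
    for rate in hit_rates_percent:
        hits = remaining * rate // 100
        served.append(hits)
        remaining -= hits
    return served, remaining


served, reach_l3 = funnel(10_000, [90, 90])
print(served, reach_l3)  # [9000, 900] 100

l2_cache = {}
redis_mutex = threading.Lock()
redis_queries = 0


def query_redis(key):
    """Hypothetical stand-in for a real Redis read; counts how often it runs."""
    global redis_queries
    redis_queries += 1  # only ever called while redis_mutex is held
    return f"value-for-{key}"


def get_with_mutex(key):
    if key in l2_cache:  # fast path, no lock
        return l2_cache[key]
    with redis_mutex:
        if key not in l2_cache:  # re-check: another request may have filled it
            l2_cache[key] = query_redis(key)
        return l2_cache[key]


# The 100 requests that miss L1 and L2 all ask for the same key at once.
threads = [threading.Thread(target=get_with_mutex, args=("token",))
           for _ in range(reach_l3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(redis_queries)  # 1
```

Without the mutex, all 100 requests could reach Redis at the same moment for the same key, which is the dogpile effect the paragraph describes.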

Figure: APISIX Multi-Level Cache Mechanism


With multi-layer caching, APISIX serves the majority of client requests without querying external data stores such as Redis or Postgres. This significantly reduces overall latency and increases the throughput of the API gateway, and it simplifies communication with external services. The design also keeps the system robust under high concurrency, giving engineers a more reliable and higher-performance environment.
