Dynamic Rate-Limiting in OpenResty
API7.ai
January 6, 2023
In the previous article, I introduced you to the leaky bucket and token bucket algorithms, which are commonly used to deal with burst traffic, and we learned how to rate-limit requests through NGINX configuration. However, NGINX configuration alone is only barely usable and still a long way from being genuinely useful.
The first problem is that the rate-limiting key is restricted to NGINX variables and cannot be set flexibly. For example, there is no way to set different rate-limiting thresholds for different provinces and different client channels, which is a common requirement that NGINX cannot satisfy.
Another, bigger problem is that the rate cannot be adjusted dynamically: every change requires reloading the NGINX service. As a result, rate limiting based on different time periods can only be implemented clumsily through external scripts.
It is important to understand that technology serves the business, and at the same time, the business drives the technology. At the time of NGINX's birth, there was little need to adjust the configuration dynamically; it was more about reverse proxying, load balancing, low memory usage, and other similar needs that drove NGINX's growth. In terms of technology architecture and implementation, no one could have predicted the massive explosion of demand for dynamic and fine-grained control in scenarios such as the mobile Internet, IoT, and microservices.
OpenResty's use of Lua scripting makes up for NGINX's shortcomings in this area, making it an effective complement. This is why OpenResty is so widely used as a replacement for NGINX. In the next few articles, I'll continue introducing you to more dynamic scenarios and examples in OpenResty. Let's start by looking at how to use OpenResty to implement dynamic rate limiting.
In OpenResty, we recommend using lua-resty-limit-traffic to limit traffic. It includes limit-req (limit request rate), limit-count (limit request count), and limit-conn (limit concurrent connections), and provides limit.traffic to aggregate these three methods.
Limit request rate
Let's start by looking at limit-req, which uses the leaky bucket algorithm to limit the rate of requests.
In the previous section, we briefly introduced the key implementation code of the leaky bucket algorithm in this resty library; now we will learn how to use it. First, let's look at the following sample code.
resty --shdict='my_limit_req_store 100m' -e 'local limit_req = require "resty.limit.req"
local lim, err = limit_req.new("my_limit_req_store", 200, 100)
local delay, err = lim:incoming("key", true)
if not delay then
    if err == "rejected" then
        return ngx.exit(503)
    end
    return ngx.exit(500)
end

if delay >= 0.001 then
    ngx.sleep(delay)
end'
We know that lua-resty-limit-traffic uses a shared dict to store and count keys, so we need to declare the 100m of space for my_limit_req_store before we can use limit-req. The same goes for limit-conn and limit-count, each of which needs its own separate shared dict space.
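The resty CLI above declares the dict with the --shdict option. Inside a real NGINX configuration, the equivalent declarations might look like the following; the dict names for limit-conn and limit-count here are just illustrative, not names required by the library.
lua_shared_dict my_limit_req_store   100m;
lua_shared_dict my_limit_conn_store  100m;
lua_shared_dict my_limit_count_store 100m;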
limit_req.new("my_limit_req_store", 200, 100)
The above line is one of the most critical in the sample. It means that a shared dict named my_limit_req_store is used to store the statistics, the rate is set to 200 requests per second, and the burst is set to 100: traffic above 200 but below 300 requests per second (300 being 200 + 100) will be queued, and traffic above 300 will be rejected. For example, if 250 requests arrive within one second, 200 are processed immediately and the remaining 50 are delayed; if 350 arrive, 50 of them are rejected outright.
After the setup is done, we have to process the client request, which is what lim:incoming("key", true) does. incoming has two parameters, which we need to look at in detail.
The first parameter is the user-specified key for rate-limiting. In the above example it is a string constant, which means the limit applies uniformly to all clients. If you want to limit the rate according to province and channel, that is simple: use both pieces of information in the key, as the following pseudo-code shows.
-- pseudo-code: get_province is a stand-in for your own lookup logic
local province = get_province(ngx.var.binary_remote_addr)
local channel = ngx.req.get_headers()["channel"]
local key = province .. channel
lim:incoming(key, true)
Of course, you can also customize the meaning of the key and the conditions under which incoming is called, achieving very flexible rate-limiting behavior.
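For instance, to revisit the "different time periods" problem from the beginning of this article: limit-req exposes a set_rate method, so the threshold can be changed at runtime without reloading NGINX. The following is a rough sketch; the peak hours and thresholds are made-up values for illustration.
local limit_req = require "resty.limit.req"

local lim, err = limit_req.new("my_limit_req_store", 200, 100)
if not lim then
    return ngx.exit(500)
end

-- made-up policy: tighten the limit during assumed peak hours (9:00-18:00)
local hour = tonumber(os.date("%H"))
if hour >= 9 and hour < 18 then
    lim:set_rate(100)
else
    lim:set_rate(400)
end

local delay, err = lim:incoming(ngx.var.binary_remote_addr, true)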
Let's look at the second parameter of the incoming function, a boolean value. It defaults to false, meaning the request will not be recorded in the shared dict statistics; the call is just a dry run. If it is set to true, the call takes real effect. Therefore, in most cases, you will need to set it to true explicitly.
You may wonder why this parameter exists. Consider a scenario where you set up two different limit-req instances with different keys, one keyed by hostname and the other keyed by the client's IP address. When a client request is processed, the incoming methods of these two instances are called in order, as in the following pseudo-code.
local limiter_one, err = limit_req.new("my_limit_req_store", 200, 100)
local limiter_two, err = limit_req.new("my_limit_req_store", 20, 10)

limiter_one:incoming(ngx.var.host, true)
limiter_two:incoming(ngx.var.binary_remote_addr, true)
If the user's request passes limiter_one's threshold check but is rejected by limiter_two's, then the limiter_one:incoming call should have been treated as a dry run, and we should not count it.
In this case, the above code logic is not rigorous enough. We need to dry-run all the limiters first, so that if any limiter's threshold would be triggered and the request rejected, we can return directly:
for i = 1, n do
    local lim = limiters[i]
    -- only the last limiter commits for real; all earlier calls are dry runs
    local delay, err = lim:incoming(keys[i], i == n)
    if not delay then
        return nil, err
    end
end
This is what the second argument of the incoming function is all about. The code above is the core of the limit.traffic module, which is used to combine multiple rate limiters.
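To make the rollback idea concrete, here is a rough sketch of combining the two limiters above by hand, using limit-req's uncommit API to undo a commit that should not count. This is essentially the bookkeeping that limit.traffic does for you; the error handling is simplified.
local host_key = ngx.var.host
local delay1, err1 = limiter_one:incoming(host_key, true)
if not delay1 then
    return ngx.exit(err1 == "rejected" and 503 or 500)
end

local delay2, err2 = limiter_two:incoming(ngx.var.binary_remote_addr, true)
if not delay2 then
    -- the second limiter rejected the request: roll back the first commit
    limiter_one:uncommit(host_key)
    return ngx.exit(err2 == "rejected" and 503 or 500)
end

local delay = math.max(delay1, delay2)
if delay >= 0.001 then
    ngx.sleep(delay)
end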
Limit the number of requests
Let's take a look at limit.count, a library that limits the number of requests. It works like the GitHub API's rate limiting, restricting the number of requests a user can make within a fixed time window. As usual, let's start with a sample code.
local limit_count = require "resty.limit.count"
local lim, err = limit_count.new("my_limit_count_store", 5000, 3600)
local key = ngx.req.get_headers()["Authorization"]
local delay, remaining = lim:incoming(key, true)
You can see that limit.count and limit.req are used similarly. We start by defining a shared dict in nginx.conf:
lua_shared_dict my_limit_count_store 100m;
Then we new a limiter object, and finally use the incoming function to make the decision and handle the request.
The difference, however, is that the second return value of the incoming function in limit-count represents the number of remaining calls in the current window (when the call succeeds), so we can add fields to the response header accordingly to give the client a clearer indication:
ngx.header["X-RateLimit-Limit"] = "5000"
ngx.header["X-RateLimit-Remaining"] = remaining
Limit the number of concurrent connections
limit.conn is a library for limiting the number of concurrent connections. Unlike the two libraries mentioned above, it has a special leaving API, which I'll briefly describe here.
Limiting the request rate and the number of requests, as shown above, can be done directly in the access phase. Limiting the number of concurrent connections is different: besides checking whether the threshold is exceeded in the access phase, it also requires calling the leaving API in the log phase:
log_by_lua_block {
    -- lim, key, and delay were stored in ngx.ctx during the access phase
    local ctx = ngx.ctx
    local lim = ctx.limit_conn
    if lim then
        local latency = tonumber(ngx.var.request_time) - ctx.limit_conn_delay
        local key = ctx.limit_conn_key
        local conn, err = lim:leaving(key, latency)
    end
}
The core code of this API is quite simple, however: it's the following single line, which decrements the connection count for the key by one. If you don't do this cleanup in the log phase, the number of connections will keep climbing and will soon hit the concurrency threshold.
local conn, err = dict:incr(key, -1)
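For completeness, here is a sketch of the matching access phase, which stores the limiter, key, and delay in ngx.ctx for the log phase shown above. It follows the usage pattern from the library's documentation; the my_limit_conn_store dict name and the thresholds are illustrative.
access_by_lua_block {
    local limit_conn = require "resty.limit.conn"

    -- allow 1000 concurrent connections plus a burst of 200,
    -- with 0.5s as the default delay per excess connection
    local lim, err = limit_conn.new("my_limit_conn_store", 1000, 200, 0.5)
    if not lim then
        return ngx.exit(500)
    end

    local key = ngx.var.binary_remote_addr
    local delay, err = lim:incoming(key, true)
    if not delay then
        return ngx.exit(err == "rejected" and 503 or 500)
    end

    if lim:is_committed() then
        -- hand the state over to the log phase, which calls leaving()
        local ctx = ngx.ctx
        ctx.limit_conn = lim
        ctx.limit_conn_key = key
        ctx.limit_conn_delay = delay
    end

    if delay >= 0.001 then
        ngx.sleep(delay)
    end
}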
Combination of rate limiters
This concludes the introduction of the three methods. Finally, let's see how to combine limit.req, limit.conn, and limit.count. Here we need to use the combine function in limit.traffic:
local limit_conn = require "resty.limit.conn"
local limit_req = require "resty.limit.req"
local limit_traffic = require "resty.limit.traffic"

local lim1, err = limit_req.new("my_req_store", 300, 200)
local lim2, err = limit_req.new("my_req_store", 200, 100)
local lim3, err = limit_conn.new("my_conn_store", 1000, 1000, 0.5)

local limiters = {lim1, lim2, lim3}
local host = ngx.var.host
local client = ngx.var.binary_remote_addr
local keys = {host, client, client}

local states = {}
local delay, err = limit_traffic.combine(limiters, keys, states)
This code should be easy to understand with the knowledge you just gained. The core logic of the combine function, which we already touched on in the analysis of limit.req above, is implemented mainly with the help of the dry-run mechanism of incoming and the uncommit function. This combination allows you to set different thresholds and keys for multiple limiters to achieve more complex business requirements.
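As with a single limiter, the return values of combine still need to be handled. A sketch of the follow-up logic, in the same style as the earlier limit-req example:
if not delay then
    if err == "rejected" then
        return ngx.exit(503)
    end
    ngx.log(ngx.ERR, "failed to limit traffic: ", err)
    return ngx.exit(500)
end

if delay >= 0.001 then
    ngx.sleep(delay)
end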
Summary
Not only does limit.traffic support the three rate limiters mentioned today; any rate limiter that has the incoming and uncommit APIs can be managed by limit.traffic's combine function.
Finally, I'll leave you with a homework question: can you write an example that combines the token bucket and leaky bucket rate limiters we introduced before? Feel free to write down your answer in the comments section to discuss with me, and you are also welcome to share this article with your colleagues and friends so we can learn and communicate together.