Introduction of Common APIs in OpenResty
In the previous articles, you became familiar with many important Lua APIs in OpenResty. Today, we will learn about some other general-purpose APIs, mainly related to regular expressions, time, and processes.
Regular Expressions-related APIs
Let's start with regular expressions, the most commonly used and most important of these. In OpenResty, we should use the ngx.re.* APIs to handle regular-expression logic instead of Lua's own pattern matching. This is not only for performance reasons but also because Lua patterns are a dialect of their own and do not follow the PCRE specification, which is annoying for most developers.
In the previous articles, you have already come across some of the ngx.re.* APIs, and their documentation is very detailed, so I won't list them all again. Here, I will single out two items.
The first one is ngx.re.split. Splitting strings is a very common operation, and OpenResty provides an API for it, but many developers cannot find it and end up implementing it themselves.
The reason is that the ngx.re.split API is not in lua-nginx-module but in lua-resty-core. Moreover, it is not documented on the lua-resty-core home page but in lua-resty-core/lib/ngx/re.md, three directory levels down. As a result, many developers are completely unaware that this API exists.
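As a minimal sketch of how ngx.re.split is called (assuming lua-resty-core is available, as it is in a standard OpenResty install; the sample string and separator are only illustrative), the code can be saved to a file and run with the resty command:

```lua
-- ngx.re.split lives in the ngx.re module of lua-resty-core,
-- so it has to be required explicitly.
local ngx_re = require "ngx.re"

-- Split the subject on a regex; on failure the function returns nil and an error.
local parts, err = ngx_re.split("a,b,c,d", ",")
if not parts then
    ngx.log(ngx.ERR, "failed to split: ", err)
    return
end

-- parts is an array-like table holding the pieces in order.
for i, part in ipairs(parts) do
    ngx.say(i, ": ", part)
end
```

Because the separator is a regular expression rather than a plain string, characters such as "." or "|" need escaping when they are meant literally.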
Other APIs that are similarly hard to discover include enable_privileged_agent, which we mentioned earlier. So how do we solve this problem quickly? In addition to reading the lua-resty-core home page documentation, you need to read through the *.md files in the lua-resty-core/lib/ngx/ directory as well.
Second, I want to introduce lua_regex_match_limit. We haven't talked about the NGINX directives provided by OpenResty before because, in most cases, the default values are sufficient and there is no need to change them. The exception is the lua_regex_match_limit directive, which is related to regular expressions.
We know that a regular expression engine based on a backtracking NFA carries the risk of catastrophic backtracking, where the engine backtracks so much during matching that the CPU hits 100% and the service is blocked.
Once catastrophic backtracking occurs, we need to use gdb to analyze a core dump, or systemtap to analyze the live environment, in order to locate it. Unfortunately, it is hard to detect beforehand because only specially crafted requests trigger it. Attackers can exploit this, and ReDoS (Regular expression Denial of Service) refers to this type of attack.
Here, I mainly want to show how a single directive in OpenResty avoids the above problems simply and effectively. lua_regex_match_limit limits the number of backtracking steps the PCRE engine may take. This way, even if catastrophic backtracking occurs, the damage stays bounded and your CPU will not be saturated.
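To make this concrete, here is a hedged sketch. The directive goes in the http block of nginx.conf; the value 100000 shown in the comment is only an illustrative assumption, not a recommendation, and the pattern and subject are invented for the demonstration. The Lua part shows how a caller can react when the engine gives up instead of treating the result as an ordinary non-match:

```lua
-- nginx.conf, inside http { ... } (the value is an assumption for illustration):
--     lua_regex_match_limit 100000;

-- A pattern with nested quantifiers, a classic trigger for catastrophic backtracking.
local risky_regex = [[^(a+)+$]]
-- Thirty "a" characters followed by a character that makes the match fail.
local subject = string.rep("a", 30) .. "!"

local m, err = ngx.re.match(subject, risky_regex)
if err then
    -- The engine hit its match limit and aborted; handle this as an error,
    -- for example by rejecting the request.
    ngx.say("regex aborted: ", err)
elseif not m then
    ngx.say("no match")
else
    ngx.say("matched: ", m[0])
end
```

Checking err separately matters here: a regex that was aborted is not the same as an input that legitimately failed to match.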
Time API
The most commonly used time API is ngx.now, which returns the current timestamp, as in the following line of code:
$ resty -e 'ngx.say(ngx.now())'
As you can see from the printed result, ngx.now includes the fractional part, so it is more precise. The related ngx.time API returns only the integer part. Others, such as ngx.http_time, are mainly used to return and process time in different formats. If you need them, check the documentation; they are not difficult to understand, so I won't cover them here.
However, it is worth mentioning that the APIs that return the current time always return a cached value, rather than the real current time we might expect, unless a non-blocking network I/O operation has triggered an update in between. Take a look at the following sample code:
$ resty -e 'ngx.say(ngx.now()) os.execute("sleep 1") ngx.say(ngx.now())'
Between the two calls to ngx.now, we used a blocking call from Lua to sleep for 1 second, yet the printed results show that the same timestamp is returned both times.
So, what if we replace it with a non-blocking sleep function? For example, the following new code:
$ resty -e 'ngx.say(ngx.now()) ngx.sleep(1) ngx.say(ngx.now())'
This time it prints two different timestamps. This brings us to ngx.sleep, a non-blocking sleep function. Besides sleeping for a specified amount of time, this function has another special use.
For example, if a piece of code performs intensive computation that takes a long time, the request running it keeps occupying the worker and the CPU during that time, so other requests queue up and do not get a timely response. In this case, we can intersperse ngx.sleep(0) calls so that the code gives up control and other requests can be processed.
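The idea can be sketched as follows. The function name sum_of_squares and the batch size of 100 are hypothetical choices for illustration; the only OpenResty API involved is ngx.sleep, and the code assumes it runs in a context where yielding is allowed (for example under the resty command or in a content handler):

```lua
-- A CPU-bound loop that periodically yields so other requests can run.
-- `batch` controls how many iterations run between two yields.
local function sum_of_squares(n, batch)
    local total = 0
    for i = 1, n do
        total = total + i * i
        if i % batch == 0 then
            -- Sleeping for 0 seconds does not delay the work noticeably;
            -- it only hands control back to the NGINX event loop.
            ngx.sleep(0)
        end
    end
    return total
end

ngx.say(sum_of_squares(1000, 100))
```

Yielding too often adds scheduling overhead, and yielding too rarely defeats the purpose, so the batch size is something to tune for the workload.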
Worker and process API
OpenResty provides the ngx.worker.* and ngx.process.* APIs to obtain information about workers and processes. The former relates to Nginx worker processes, while the latter covers Nginx processes in general: not only worker processes, but also the master process, the privileged agent process, and so on.
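A short sketch of both families of APIs, assuming lua-resty-core is available for the ngx.process module. The values printed depend on how Nginx was started, so no output is shown here:

```lua
-- ngx.process comes from lua-resty-core and must be required explicitly;
-- ngx.worker is part of the global ngx table.
local process = require "ngx.process"

-- Type of the current process, reported as a string.
ngx.say("process type: ", process.type())

-- Information about the current worker process.
ngx.say("worker pid: ", ngx.worker.pid())
ngx.say("worker count: ", ngx.worker.count())
ngx.say("worker id: ", ngx.worker.id())
```

A common use of these values is to run a background job in exactly one worker, for instance by checking the worker id before starting a timer.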
The problem of true and null values
Finally, let's look at the issue of null values. In OpenResty, telling true values and null values apart has always been a troublesome and confusing point.
Let's look at the definition of a true value in Lua: except for nil and false, all values are true values. True values therefore also include 0, the empty string, and the empty table.
Next, let's look at nil in Lua, which means undefined. For example, if you declare a variable but haven't initialized it, its value is nil:
$ resty -e 'local a ngx.say(type(a))'
nil is also a data type in Lua. Having understood these two points, let's now look at the other issues derived from these two definitions.
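These two definitions can be checked with a few lines of plain Lua. The helper name is_truthy is hypothetical and exists only for this illustration:

```lua
-- Returns true when Lua treats the value as true in a condition.
local function is_truthy(v)
    if v then
        return true
    end
    return false
end

print(is_truthy(0))      -- true: 0 is a true value in Lua
print(is_truthy(""))     -- true: so is the empty string
print(is_truthy({}))     -- true: and the empty table
print(is_truthy(false))  -- false
print(is_truthy(nil))    -- false

-- A declared but uninitialized variable holds nil, whose type is "nil".
local a
print(type(a))           -- nil
```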
The first issue is ngx.null. Because Lua's nil cannot be stored as a value in a table, OpenResty introduces ngx.null to represent the null value in tables.
$ resty -e 'print(ngx.null)'
null
$ resty -e 'print(type(ngx.null))'
userdata
As you can see from the two pieces of code above, ngx.null is printed as null, and its type is userdata. So can it be treated as a false value? Of course not. The boolean value of ngx.null is true:
$ resty -e 'if ngx.null then ngx.say("true") end'
So, keep in mind that only nil and false are false values. If you miss this point, it is easy to fall into traps, for example when you use lua-resty-redis and write the following check:
local res, err = red:get("dog")
if res then
    res = res .. "test"
end
If the return value res is nil, the function call has failed; if res is ngx.null, the key dog does not exist in Redis. So this code crashes when the key dog does not exist.
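One way to avoid this pitfall is to handle the three possible outcomes explicitly. The following is a minimal sketch: the helper name append_suffix and its error messages are hypothetical, and the calls at the bottom simulate Redis results instead of talking to a real server so that the snippet runs on its own under the resty command:

```lua
-- Takes the two return values of red:get() and handles each case explicitly.
local function append_suffix(res, err)
    if res == nil then
        -- The call itself failed (network error, timeout, and so on).
        return nil, "redis call failed: " .. tostring(err)
    end
    if res == ngx.null then
        -- The call succeeded, but the key does not exist.
        return nil, "key not found"
    end
    -- Only now is res known to be a real value.
    return res .. "test"
end

-- Simulated results instead of a live Redis connection:
ngx.say(append_suffix("wang"))              -- existing key
ngx.say(select(2, append_suffix(ngx.null))) -- missing key
ngx.say(select(2, append_suffix(nil, "timeout")))
```

With a real connection, the call would look like append_suffix(red:get("dog")), since red:get returns the value and the error in that order.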
The second issue is cdata:NULL. When you call a C function through the LuaJIT FFI interface and the function returns a NULL pointer, you will encounter another kind of null value, cdata:NULL:
$ resty -e 'local ffi = require "ffi" local cdata_null = ffi.new("void*", nil) if cdata_null then ngx.say("true") end'
Like ngx.null, cdata:NULL is also true. What is more puzzling is that the following code prints true, which means that cdata:NULL is equal to nil:
$ resty -e 'local ffi = require "ffi" local cdata_null = ffi.new("void*", nil) ngx.say(cdata_null == nil)'
So how should we handle cdata:NULL? Making the application layer deal with these details is not a good solution. It is better to add a second-level wrapper so that the caller never needs to know about them.
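Such a wrapper might look like the sketch below. It uses the standard C function getenv purely as a convenient example of a function that can return a NULL pointer; the wrapper name and the environment variable name are hypothetical:

```lua
local ffi = require "ffi"

-- Declare the C function we want to call through FFI.
ffi.cdef[[
char *getenv(const char *name);
]]

-- Second-level wrapper: callers get either a Lua string or nil,
-- and never see a cdata pointer.
local function getenv(name)
    local ptr = ffi.C.getenv(name)
    -- Comparing with nil is the reliable check for a NULL pointer;
    -- "if not ptr" would not work because cdata:NULL is a true value.
    if ptr == nil then
        return nil
    end
    return ffi.string(ptr)
end

local value = getenv("SOME_VARIABLE_THAT_IS_PROBABLY_UNSET")
if not value then
    ngx.say("not set")
end
```

Once the conversion happens inside the wrapper, the usual "if not value then" check in the calling code behaves as expected.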
Finally, let's look at the null values that appear in cjson. The cjson library takes null in JSON, decodes it into a Lua lightuserdata, and uses cjson.null to represent it.
$ resty -e 'local cjson = require "cjson" local data = cjson.encode(nil) local decode_null = cjson.decode(data) ngx.say(decode_null == cjson.null)'
Lua's nil becomes cjson.null after being encoded and decoded as JSON. As you can imagine, cjson.null is introduced for the same reason as ngx.null: nil cannot be used as a value in a table.
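In practice this means a decoded field can be absent (nil) or present but null (cjson.null), and a plain "not" check only catches the first case. A hedged sketch, with a hypothetical helper name and sample JSON document:

```lua
local cjson = require "cjson"

-- Returns the field's value, or `default` when the field is
-- either missing (nil) or explicitly null (cjson.null).
local function field_or_default(obj, key, default)
    local v = obj[key]
    if v == nil or v == cjson.null then
        return default
    end
    return v
end

local data = cjson.decode('{"name":"dog","owner":null}')

ngx.say(field_or_default(data, "name", "unknown"))   -- present
ngx.say(field_or_default(data, "owner", "nobody"))   -- JSON null
ngx.say(field_or_default(data, "age", 0))            -- missing key
```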
By now, have you been confused by so many kinds of null values in OpenResty? Don't worry. Read this part a few more times and sort it out for yourself, and it will become clear. Of course, in the future we need to think more carefully about whether a check like if not foo then really works.
Today's article introduces you to the Lua APIs commonly used in OpenResty.
Finally, I'll leave you with a question: in the ngx.now example, why is the value of ngx.now not updated when there is no yield operation? You are welcome to share your opinion in the comments, and to share this article so that we can communicate and improve together.