Three Commonly-Used Lua Resty Libraries in OpenResty

API7.ai

January 13, 2023

OpenResty (NGINX + Lua)

Learning a programming language or platform is often more about understanding its standard and third-party libraries than about the syntax itself. Having covered OpenResty's API and performance optimization techniques, we now need to learn how to use the various lua-resty libraries to extend OpenResty to more scenarios.

Where to find lua-resty libraries?

Compared to PHP, Python, and JavaScript, OpenResty's standard and third-party library ecosystem is still relatively sparse, and finding the right lua-resty library is not easy. Still, two sources can help you find one faster.

The first recommendation is the awesome-resty repository maintained by Aapo. It organizes OpenResty-related libraries by category and covers NGINX C modules, lua-resty libraries, web frameworks, routing libraries, templates, testing frameworks, and more. It is a good first stop for OpenResty resources.

If you don't find the right library in Aapo's repository, you can also look at LuaRocks, opm, or GitHub. You may find libraries there that were open-sourced only recently and have not yet attracted much attention.

In the previous articles, we learned about quite a few useful libraries, such as lua-resty-mlcache, lua-resty-limit-traffic, and lua-resty-shell. Today, in the last article of the OpenResty performance optimization section, we get to know three more distinctive peripheral libraries, all contributed by developers in the community.

Performance improvement of ngx.var

First, let's look at a C module: lua-var-nginx-module. As I mentioned earlier, reading ngx.var is a relatively expensive operation, so in practice we use ngx.ctx as a caching layer.
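
The ngx.ctx caching idea can be sketched in a few lines of Lua. This is only an illustration under stated assumptions: cached_var and the var_cache key are hypothetical names, and ngx is passed in as a parameter purely so the function can also be exercised with a mock table outside OpenResty.

```lua
-- Sketch: memoize ngx.var lookups in ngx.ctx for the current request.
-- `cached_var` and the `var_cache` key are hypothetical names.
local function cached_var(ngx, name)
    local cache = ngx.ctx.var_cache
    if not cache then
        cache = {}
        ngx.ctx.var_cache = cache
    end

    local value = cache[name]
    if value == nil then
        value = ngx.var[name]   -- the expensive lookup happens at most once
        cache[name] = value     -- note: missing (nil) variables are not cached
    end
    return value
end

-- Inside a request handler this might be called as:
--   local addr = cached_var(ngx, "remote_addr")
```

Because ngx.ctx lives only as long as the request, the cache never needs explicit invalidation, which is what makes it a convenient place for this kind of memoization.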

So is there any way to completely solve the performance problem of ngx.var?

This C module experiments in this direction, and the results are remarkable, with a 5x performance improvement over ngx.var. Its functions are called through FFI, so you first need to compile the module into OpenResty with the following option.

./configure --prefix=/opt/openresty \
         --add-module=/path/to/lua-var-nginx-module

Then use luarocks to install the lua library in the following way:

luarocks install lua-resty-ngxvar

Calling it is also simple: a single call to the fetch function. It is equivalent to the original ngx.var.remote_addr and returns the IP address of the client.

content_by_lua_block {
    local var = require("resty.ngxvar")
    ngx.say(var.fetch("remote_addr"))
}

After understanding these basic operations, you may be curious about how this module achieves such a significant performance improvement. As we always say, "there are no secrets in front of the source code". So let's find out how it fetches the remote_addr variable.

ngx_int_t
ngx_http_lua_var_ffi_remote_addr(ngx_http_request_t *r, ngx_str_t *remote_addr)
{
    remote_addr->len = r->connection->addr_text.len;
    remote_addr->data = r->connection->addr_text.data;

    return NGX_OK;
}

After reading this code, you will see that this Lua FFI approach is the same as the one used by lua-resty-core. Its obvious advantage is that FFI fetches the variable directly, bypassing the original lookup logic of ngx.var. Its disadvantage is equally obvious: every variable you want to read needs its own C function and FFI call, which takes time and effort.
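
The trade-off can be pictured as a table of hand-written fast paths. The sketch below is plain Lua, not the module's real FFI code: fast_getters, fetch, generic_lookup, and the shape of the req table are all hypothetical stand-ins chosen for illustration.

```lua
-- Hypothetical fast paths: one hand-written getter per supported variable,
-- mirroring the one-C-function-per-variable approach described above.
local fast_getters = {
    remote_addr = function(req) return req.connection.addr_text end,
}

-- Use the dedicated getter when one exists; otherwise fall back to a
-- generic (slower) lookup, represented here by `generic_lookup`.
local function fetch(name, req, generic_lookup)
    local getter = fast_getters[name]
    if getter then
        return getter(req)
    end
    return generic_lookup(name)
end
```

Every variable that should benefit from the fast path needs a new entry in the table, which is exactly the maintenance cost this approach carries.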

Some people may ask, "Why do I say this takes time and effort? Doesn't the C code above look pretty simple?" Let's look at where these lines of code come from: src/http/ngx_http_variables.c in the NGINX source code.

static ngx_int_t
ngx_http_variable_remote_addr(ngx_http_request_t *r,
    ngx_http_variable_value_t *v, uintptr_t data)
{
    v->len = r->connection->addr_text.len;
    v->valid = 1;
    v->no_cacheable = 0;
    v->not_found = 0;
    v->data = r->connection->addr_text.data;

    return NGX_OK;
}

After seeing the source code, the mystery is revealed! lua-var-nginx-module ports NGINX's variable code and wraps it with FFI in the outer layer, and that is how it achieves its performance gain. This is a good idea and a good direction for optimization.

When learning a library or a tool, we must not stop at how to use it; we should also ask why it works and read the source code. Of course, I also strongly encourage you to contribute code to support more NGINX variables.

JSON Schema

Here I introduce a lua-resty library: lua-rapidjson. It is a wrapper around RapidJSON, Tencent's open-source JSON library, which is known for its performance. Here, we focus on what sets it apart from cjson: JSON Schema support.

JSON Schema is a common standard that allows us to precisely describe the format of parameters in an interface and how they are to be validated. Here is a simple example:

"stringArray": {
    "type": "array",
    "items": { "type": "string" },
    "minItems": 1,
    "uniqueItems": true
}

This schema specifies that the stringArray parameter is an array of strings, that the array cannot be empty, and that its elements cannot be duplicated.
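
To make these three constraints concrete, here is what checking them by hand might look like in Lua. This is a hand-written illustration, not lua-rapidjson's API; the function name is_valid_string_array is hypothetical, and the check assumes the value is a plain array-like table.

```lua
-- Hand-written equivalent of the stringArray schema (illustration only).
-- Returns true, or false plus a reason.
local function is_valid_string_array(value)
    if type(value) ~= "table" then        -- type: array
        return false, "not an array"
    end

    local n = #value
    if n < 1 then                         -- minItems: 1
        return false, "fewer than 1 item"
    end

    local seen = {}
    for i = 1, n do
        local item = value[i]
        if type(item) ~= "string" then    -- items.type: string
            return false, "item " .. i .. " is not a string"
        end
        if seen[item] then                -- uniqueItems: true
            return false, "duplicate item: " .. item
        end
        seen[item] = true
    end
    return true
end
```

A schema validator derives checks like these from the schema itself, so the rules live in one declarative description rather than being scattered through handler code.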

lua-rapidjson allows us to use JSON Schema in OpenResty, which makes interface validation much more convenient. For example, the limit count interface described earlier can be described with the following schema:

local schema = {
    type = "object",
    properties = {
        count = {type = "integer", minimum = 0},
        time_window = {type = "integer",  minimum = 0},
        key = {type = "string", enum = {"remote_addr", "server_addr"}},
        rejected_code = {type = "integer", minimum = 200, maximum = 600},
    },
    additionalProperties = false,
    required = {"count", "time_window", "key", "rejected_code"},
}

You will find that this brings two obvious benefits:

  1. The front end can directly reuse this schema description for page development and parameter validation, without having to concern itself with the back end.
  2. The back end can use lua-rapidjson's schema validation function, SchemaValidator, to check the validity of the interface parameters, with no need to write extra validation code.

Worker communication

Finally, I'd like to talk about a lua-resty library that enables communication between workers in OpenResty. OpenResty has no mechanism for direct communication between workers, which causes a lot of problems. Let's imagine a scenario:

An OpenResty service has 24 worker processes. When the administrator updates a system configuration through the REST HTTP API, only one worker receives the update, writes the result to the database, and updates the shared dict and the LRU cache within that worker. So, how can the other 23 workers be notified to update this configuration?

Completing this task requires a notification mechanism among the workers. Since OpenResty does not provide one, we have to work around it by sharing data across workers through a shared dict.

lua-resty-worker-events is a concrete implementation of this idea. It maintains a version number in a shared dict, and when a new message is published, it adds one to the version number and puts the contents of the message into the dictionary with the version number as the key.

event_id, err = _dict:incr(KEY_LAST_ID, 1)
success, err = _dict:add(KEY_DATA .. tostring(event_id), json)

Also, a polling loop with a default interval of 1 second is created in the background using ngx.timer to constantly check for changes in the version number:

local event_id, err = get_event_id()
if event_id == _last_event then
    return "done"
end

This way, as soon as a new event is detected, the message content is retrieved from the shared dict based on the version number:

while _last_event < event_id do
    count = count + 1
    _last_event = _last_event + 1
    data, err = _dict:get(KEY_DATA..tostring(_last_event))
end

Overall, although lua-resty-worker-events has a delay of up to one second with the default interval, it still implements a worker-to-worker event notification mechanism.
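
The version-number mechanism can be simulated in plain Lua to see how the pieces fit together. This is a sketch under stated assumptions, not the library's implementation: new_mock_dict is a hypothetical in-memory stand-in for a shared dict (it treats a missing counter as 0), and post, new_worker, and the two key names are invented for the illustration.

```lua
-- Hypothetical in-memory stand-in for a shared dict, supporting only the
-- three calls used in the snippets above.
local function new_mock_dict()
    local store = {}
    local dict = {}
    function dict:incr(key, value)
        store[key] = (store[key] or 0) + value   -- simplification: missing key counts as 0
        return store[key]
    end
    function dict:add(key, value)
        if store[key] ~= nil then
            return false, "exists"
        end
        store[key] = value
        return true
    end
    function dict:get(key)
        return store[key]
    end
    return dict
end

local KEY_LAST_ID = "events-last"   -- hypothetical key names
local KEY_DATA = "events-data:"

-- Publisher side: bump the version number, then store the message under it.
local function post(dict, message)
    local event_id = dict:incr(KEY_LAST_ID, 1)
    dict:add(KEY_DATA .. tostring(event_id), message)
    return event_id
end

-- Subscriber side: each worker remembers the last version it handled and,
-- on every poll, processes all messages published since then.
local function new_worker(dict, handler)
    local last_event = dict:get(KEY_LAST_ID) or 0
    return function()   -- one polling step; returns the number of events handled
        local event_id = dict:get(KEY_LAST_ID) or 0
        local count = 0
        while last_event < event_id do
            last_event = last_event + 1
            count = count + 1
            handler(dict:get(KEY_DATA .. tostring(last_event)))
        end
        return count
    end
end
```

In OpenResty the polling step would be driven by a timer in each worker rather than called by hand, and the polling interval is what bounds the notification delay.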

However, in some real-time scenarios, such as message pushing, OpenResty's lack of direct communication between worker processes may cause you some problems. There is no better solution for this yet, but if you have good ideas, please feel free to discuss them on GitHub. Many features of OpenResty are driven by the community, which builds a virtuous cycle in the ecosystem.

Summary

The three libraries we introduced today are distinctive and bring more possibilities to OpenResty applications. Finally, a question for discussion: have you found any interesting libraries around OpenResty? Or is there anything you have discovered or wondered about in the libraries mentioned today? You are welcome to share this article with the OpenResty users around you so that we can learn and progress together.