Communication magic between NGINX Workers: `shared dict`, one of the most important data structures

API7.ai

October 27, 2022

OpenResty (NGINX + Lua)

As we said in the previous article, the table is the only data structure in Lua. Correspondingly, the shared dict is the most important data structure you can use in OpenResty programming. It supports data storage and retrieval, atomic counting, and queue operations.

Based on the shared dict, you can implement caching, communication between multiple Workers, rate limiting, traffic statistics, and other features. You can use the shared dict as a simple Redis, except that its data is not persistent, so you must account for the possibility of losing stored data.

Several ways of data sharing

When writing OpenResty Lua code, you will inevitably need to share data across the different phases of a request and between different Workers. You may also need to share data between Lua and C code.

So, before we formally introduce the shared dict APIs, let's first understand the common data-sharing methods in OpenResty and learn how to choose a more appropriate data-sharing method according to the current situation.

The first is NGINX variables, which can share data between NGINX C modules. Naturally, they can also share data between C modules and the lua-nginx-module provided by OpenResty, as in the following code.

location /foo {
     set $my_var ''; # this line is required to create $my_var at config time
     content_by_lua_block {
         ngx.var.my_var = 123;
         ...
     }
 }

However, using NGINX variables to share data is slow because it involves hash lookups and memory allocation. Also, this approach has the limitation that it can only be used to store strings and cannot support complex Lua types.
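You can see the string-only behavior with a small variation of the example above. This is just a sketch; the output comment is what I would expect, since NGINX stores variable values as strings.

location /foo {
     set $my_var ''; # this line is required to create $my_var at config time
     content_by_lua_block {
         ngx.var.my_var = 123
         -- NGINX variables only hold strings, so the number is converted on write
         ngx.say(type(ngx.var.my_var))   -- prints "string"
     }
 }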

The second one is ngx.ctx, which can share data between different phases of the same request. It is a normal Lua table, so it is fast and can store various Lua objects. Its lifecycle is request-level; when the request ends, ngx.ctx is destroyed.

The following is a typical usage scenario, where we use ngx.ctx to cache an expensive call (such as reading an NGINX variable) and reuse the result at various phases.

location /test {
     rewrite_by_lua_block {
         ngx.ctx.host = ngx.var.host
     }
     access_by_lua_block {
        if (ngx.ctx.host == 'api7.ai') then
            ngx.ctx.host = 'test.com'
        end
     }
     content_by_lua_block {
         ngx.say(ngx.ctx.host)
     }
 }

In this case, if you use curl to access it:

curl -i 127.0.0.1:8080/test -H 'host:api7.ai'

It will then print out test.com, showing that ngx.ctx is sharing data at different stages. Of course, you can also modify the above example by saving more complex objects like tables instead of simple strings to see if it meets your expectations.
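For instance, here is a sketch of such a variation (the /test_table location and the user field are made-up names for illustration) that stores a Lua table in ngx.ctx during the rewrite phase and reads it back in the content phase:

location /test_table {
     rewrite_by_lua_block {
         -- ngx.ctx can hold a full Lua table, not just a string
         ngx.ctx.user = { host = ngx.var.host, visits = 1 }
     }
     content_by_lua_block {
         local user = ngx.ctx.user
         ngx.say(user.host, " ", user.visits)
     }
 }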

However, a special note here: because the lifecycle of ngx.ctx is request-level, it must not be cached at the module level. For example, I once made the mistake of doing this in a foo.lua file:

-- wrong: ngx.ctx is fetched and cached when the module is first loaded,
-- so later requests end up writing to a table that belongs to an earlier request
local ngx_ctx = ngx.ctx

local function bar()
    ngx_ctx.host = 'test.com'
end

Instead, we should reference and cache it at the function level:

local ngx = ngx

local function bar()
    -- fetch ngx.ctx at call time, inside the function
    ngx.ctx.host = 'test.com'
end

There are many more details to ngx.ctx, which we'll continue to explore later in the performance optimization section.

The third approach uses module-level variables to share data across all requests within the same Worker. Unlike NGINX variables and ngx.ctx, this approach is a little harder to understand. But don't worry: the concept sounds abstract, so let's put the code first and look at an example of a module-level variable.

-- mydata.lua
local _M = {}

local data = {
    dog = 3,
    cat = 4,
    pig = 5,
}

function _M.get_age(name)
    return data[name]
end

return _M

The configuration in nginx.conf is as follows.

location /lua {
     content_by_lua_block {
         local mydata = require "mydata"
         ngx.say(mydata.get_age("dog"))
     }
 }

In this example, mydata is a module that is loaded only once by the Worker process, and all requests processed by the Worker after that share the code and data of the mydata module.

Naturally, the data variable in the mydata module is a module-level variable located at the module's top-level, i.e., at the beginning of the module, and is accessible to all functions.

So, you can put data that needs to be shared between requests in the top-level variable of the module. However, it is essential to note that we generally only use this way to store read-only data. If write operations are involved, you must be very careful because there may be a race condition, which is a tricky bug to locate.

We can see this with the following minimal example.

-- mydata.lua
local _M = {}

local data = {
    dog = 3,
    cat = 4,
    pig = 5,
}

function _M.incr_age(name)
    data[name]  = data[name] + 1
    return data[name]
end

return _M

In the module, we add the incr_age function, which modifies the data in the data table.

Then, in the calling code, we add the most critical line ngx.sleep(5), where sleep is a yield operation.

location /lua {
     content_by_lua_block {
         local mydata = require "mydata"
         ngx.say(mydata.incr_age("dog"))
         ngx.sleep(5) -- yield API
         ngx.say(mydata.incr_age("dog"))
     }
 }

Without this line of sleep code (or other non-blocking IO operations, such as accessing Redis, etc.), there would be no yield operation, no contention, and the final output would be sequential.

But when we add this line of code, even within that 5-second sleep window, another request will likely call the mydata.incr_age function and modify the variable's value, causing the final output numbers to be discontinuous. In real code, the logic is rarely this simple, and the resulting bug is much harder to locate.

So, unless you are sure there is no yield operation in between that will give control to the NGINX event loop, I recommend keeping your module-level variables read-only.
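If you do need a writable, cross-request counter, one option is to move the write into a shared dict, whose incr method is atomic. The following is only a sketch, assuming a lua_shared_dict zone named dogs has been declared in nginx.conf:

-- mydata.lua (sketch): delegate writes to a shared dict instead of a module-level table
local _M = {}

local dict = ngx.shared.dogs   -- assumes: lua_shared_dict dogs 10m;

function _M.incr_age(name)
    -- incr is atomic across Workers; the third argument initializes a missing key to 0
    return dict:incr(name, 1, 0)
end

return _M

This way, even if another request runs during the ngx.sleep yield, every increment is still applied exactly once, although each call now pays the cost of accessing shared memory.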

The fourth and final approach is the shared dict, which can share data among multiple Workers.

The shared dict is backed by a red-black tree implementation and performs well, but it has its limitations: you must declare the size of the shared memory in the NGINX configuration file beforehand, and this size cannot be changed at runtime:

lua_shared_dict dogs 10m;

The shared dict can also only store simple value types such as strings, numbers, and booleans; it does not support complex Lua types. This means that when I need to store complex data such as tables, I have to use JSON or another format to serialize and deserialize them, which naturally causes a lot of performance loss.
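For example, here is a minimal sketch of storing a table via JSON serialization, assuming the dogs zone declared above and the cjson library bundled with OpenResty (the key name ages is made up):

local cjson = require "cjson.safe"

local dict = ngx.shared.dogs

-- serialize before writing: shared dict cannot hold Lua tables directly
dict:set("ages", cjson.encode({ dog = 3, cat = 4 }))

-- deserialize after reading to get a Lua table back
local ages = cjson.decode(dict:get("ages"))
ngx.say(ages.dog)   -- 3

Every read and write pays the encode/decode cost, which is exactly the performance loss mentioned above.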

Anyway, there is no silver bullet here, and there is no perfect way to share data. You must combine multiple methods according to your needs and scenarios.

Shared dict

We have spent a lot of time on data sharing above, and some of you may wonder: none of this seems directly related to shared dict. Isn't that off-topic?

Actually, no. Please think about it: why is there a shared dict in OpenResty? Recall that the first three methods of data sharing are all at the request level or the individual Worker level. Therefore, in the current implementation of OpenResty, only shared dict can accomplish data sharing between Workers, enabling communication between Workers, which is the value of its existence.

In my opinion, understanding why technology exists and figuring out its differences and advantages compared to other similar technologies is far more important than just being proficient at calling the APIs it provides. This technical vision gives you a degree of foresight and insight and is arguably an important difference between engineers and architects.

Back to the shared dict, which provides more than 20 Lua APIs to the public, all of which are atomic, so you don't have to worry about race conditions under multiple Workers and high concurrency.

These APIs all have detailed official documentation, so I won't go through them one by one. I want to emphasize again that no technical course can replace a careful reading of the official documentation; no one can skip these time-consuming but essential steps.

Next, let's continue to look at the shared dict APIs, which can be divided into three categories: dict read/write, queue operation, and management.

Dict read/write

Let's first look at the dict read and write APIs. In the original version, only dict read and write APIs existed; they are the most commonly used features of shared dictionaries. Here is the simplest example.

$ resty --shdict='dogs 1m' -e 'local dict = ngx.shared.dogs
                               dict:set("Tom", 56)
                               print(dict:get("Tom"))'

In addition to set, OpenResty also provides four other writing methods: safe_set, add, safe_add, and replace. The safe prefix here means that when the memory is full, instead of evicting old data according to LRU, the write fails and returns a `no memory` error.
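Here is a quick sketch of the difference between these write methods, reusing the throw-away dogs zone from the resty example above:

$ resty --shdict='dogs 1m' -e 'local dict = ngx.shared.dogs
                               dict:set("Tom", 56)
                               local ok, err = dict:add("Tom", 57)    -- fails: the key already exists
                               print(ok, " ", err)
                               ok, err = dict:replace("Tom", 57)      -- succeeds: replace needs an existing key
                               print(ok, " ", err)
                               print(dict:get("Tom"))'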

In addition to get, OpenResty also provides the get_stale method for reading data, which, compared to get, has an additional return value indicating whether the data has expired:

value, flags, stale = ngx.shared.DICT:get_stale(key)

You can also call the delete method to delete the specified key, which is equivalent to set(key, nil).
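The following sketch puts get_stale and delete together (again with a throw-away dogs zone; the comments describe what I would expect, since an expired value remains readable via get_stale until it is actually evicted):

$ resty --shdict='dogs 1m' -e 'local dict = ngx.shared.dogs
                               dict:set("Tom", 56, 0.01)        -- expire after 0.01 seconds
                               ngx.sleep(0.02)
                               print((dict:get("Tom")))         -- nil: the key has expired
                               print((dict:get_stale("Tom")))   -- 56: the stale value is still there
                               dict:delete("Tom")               -- equivalent to dict:set("Tom", nil)
                               print((dict:get_stale("Tom")))'  -- nil: the key is gone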

Queue Operation

Turning to queue operations: they are a later addition to OpenResty and provide an interface similar to Redis lists. Each element in a queue is described by ngx_http_lua_shdict_list_node_t:

typedef struct {
    ngx_queue_t queue;      /* links the node into the list */
    uint32_t value_len;     /* length of the element's value */
    uint8_t value_type;     /* Lua type of the value */
    u_char data[1];         /* the value itself, stored inline */
} ngx_http_lua_shdict_list_node_t;

I have posted the PR of these queueing APIs in the article. If you are interested in this, you can follow the documentation, test cases, and source code to analyze the specific implementation.

However, there are no corresponding code examples for the following five queue APIs in the documentation, so I will briefly introduce them here.

  • lpush/rpush, which add an element at the head/tail of the queue.
  • lpop/rpop, which pop an element from the head/tail of the queue.
  • llen, which returns the number of elements in the queue.
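As shown below, a minimal FIFO sketch pushes at the tail with rpush and pops from the head with lpop (the dogs zone and the jobs key are just for illustration):

$ resty --shdict='dogs 1m' -e 'local dogs = ngx.shared.dogs
                               dogs:rpush("jobs", "job-1")    -- enqueue at the tail
                               dogs:rpush("jobs", "job-2")
                               print((dogs:llen("jobs")))     -- 2
                               print((dogs:lpop("jobs")))     -- job-1: FIFO order
                               print((dogs:lpop("jobs")))'    -- job-2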

Let's not forget another useful tool we discussed in the last article: test cases. We can usually find the corresponding code in a test case if it is not in the documentation. The queue-related tests are precisely in file 145-shdict-list.t.

=== TEST 1: lpush & lpop
--- http_config
    lua_shared_dict dogs 1m;
--- config
    location = /test {
        content_by_lua_block {
            local dogs = ngx.shared.dogs

            local len, err = dogs:lpush("foo", "bar")
            if len then
                ngx.say("push success")
            else
                ngx.say("push err: ", err)
            end

            local val, err = dogs:llen("foo")
            ngx.say(val, " ", err)

            local val, err = dogs:lpop("foo")
            ngx.say(val, " ", err)

            local val, err = dogs:llen("foo")
            ngx.say(val, " ", err)

            local val, err = dogs:lpop("foo")
            ngx.say(val, " ", err)
        }
    }
--- request
GET /test
--- response_body
push success
1 nil
bar nil
0 nil
nil nil
--- no_error_log
[error]

Management

The final category, the management APIs, is also a later addition and answers a popular requirement in the community. One of the most typical examples is shared memory usage: if a user requests 100M of space as a shared dict, is this 100M enough? How many keys are stored in it, and which keys are they? These are all real-world questions.

For this kind of problem, the OpenResty maintainers would prefer that users rely on flame graphs, i.e., a non-invasive approach that keeps the code base efficient and tidy, rather than providing invasive APIs that return the results directly.

But from a user-friendliness perspective, these management APIs are still essential. After all, open-source projects exist to solve product requirements, not to showcase the technology itself. So, let's look at the management APIs that were added later.

First is get_keys(max_count?), which by default returns only the first 1024 keys; if you set max_count to 0, it returns all keys. Next come capacity and free_space, both of which live in the lua-resty-core repository, so you need to require it before using them:

require "resty.core.shdict"

local cats = ngx.shared.cats
local capacity_bytes = cats:capacity()
local free_page_bytes = cats:free_space()

They return the size of the shared memory (the size configured in lua_shared_dict) and the number of bytes in the free pages. Since the shared dict is allocated page by page, even if free_space returns 0, there may still be space inside pages that have already been allocated, so its return value does not reflect how much shared memory is actually occupied.
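To round this out, here is a small sketch of get_keys, reusing the cats zone from the snippet above and meant to run inside a content handler. Note that get_keys(0) returns every key and may lock the dictionary for a while on large zones, so use it carefully:

local cats = ngx.shared.cats
cats:set("Tom", 56)
cats:set("Jerry", 6)

-- at most 1024 keys are returned by default; pass 0 to return them all
local keys = cats:get_keys(0)
for _, key in ipairs(keys) do
    ngx.say(key, " => ", cats:get(key))
end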

Summary

In practice, we often use multi-level caching, and the official OpenResty project also has caching packages. Can you find out which projects they are? Or do you know some other lua-resty libraries that encapsulate caching?

You are welcome to share this article with your colleagues and friends so that we can communicate and improve together.