OpenResty FAQ | Dynamic Load, NYI, and Caching of Shared Dict

API7.ai

January 19, 2023

OpenResty (NGINX + Lua)

We have now reached the end of the performance optimization part of this OpenResty article series. Congratulations on keeping up, continuing to learn and practice actively, and enthusiastically sharing your thoughts.

We've collected some of the more typical and interesting questions, and here is a look at five of them.

Question 1: How do I accomplish dynamic loading of Lua modules?

Description: I have a question about dynamic loading in OpenResty. After a file has been replaced, how can I use the loadstring function to load the new version? I understand that loadstring can only load strings, so if I want to reload a Lua file/module, how can I do it in OpenResty?

As we know, loadstring loads a string, while loadfile loads a specified file, for example loadfile("foo.lua"). The two functions achieve the same result. As for how to load a Lua module, here is an example:

resty -e 'local s = [[
local ngx = ngx
local _M = {}
function _M.f()
    ngx.say("hello world")
end
return _M
]]
local lua = loadstring(s)        -- compile the string into a Lua chunk
local ret, func = pcall(lua)     -- run the chunk; func is the returned module table
func.f()'

The content of the string s is a complete Lua module. So, when you detect a change in this module's code, you can reload it with loadstring or loadfile, and the functions and variables in it will be updated accordingly.

To take it a step further, you can wrap the change detection and reloading in a function, say code_loader:

local func = code_loader(name)

This makes code updates much more concise. code_loader also generally uses an LRU cache to cache s, so that loadstring is not called on every request.
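Here is a minimal sketch of such a code_loader, assuming a hypothetical fetch_code(name) function that returns the module's latest source string together with a version identifier; lua-resty-lrucache is used so that loadstring only runs when the version changes:

local lrucache = require "resty.lrucache"
local cache = lrucache.new(100)    -- cache up to 100 loaded modules

local function code_loader(name)
    -- fetch_code() is a hypothetical function that returns the latest
    -- source string of the module plus a version (e.g. an etcd revision)
    local src, version = fetch_code(name)

    local cached = cache:get(name)
    if cached and cached.version == version then
        return cached.mod          -- source unchanged: reuse the loaded module
    end

    local chunk, err = loadstring(src)
    if not chunk then
        return nil, "failed to compile " .. name .. ": " .. err
    end

    local ok, mod = pcall(chunk)
    if not ok then
        return nil, "failed to run " .. name .. ": " .. tostring(mod)
    end

    cache:set(name, { version = version, mod = mod })
    return mod
end

Only the cache-miss branch pays the loadstring cost, so hot requests do no more than one lru cache lookup.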

Question 2: Why doesn't OpenResty forbid blocking operations?

Description: Over the years, I've always wondered, since these blocking calls are officially discouraged, why not just disable them? Or add a flag to let the user choose to disable it?

Here's my personal opinion. First, because the ecosystem around OpenResty is not yet complete, we sometimes have to call blocking libraries to implement certain features. For example, before version 1.15.8 there was no lua-resty-shell, so you had to call external commands with the blocking os.execute. Likewise, reading and writing files in OpenResty is still only possible with Lua's blocking I/O library; there is no non-blocking alternative.

Secondly, OpenResty is very cautious about such changes. For example, lua-resty-core had been developed for a long time, but it was never enabled by default; you had to call require 'resty.core' manually. It was not enabled by default until the 1.15.8 release.

Finally, the OpenResty maintainers would rather standardize away blocking calls by automatically generating highly optimized Lua code through a compiler and DSL, so there has been no effort to add something like a flag option to the OpenResty platform itself. Of course, I am not sure whether this direction can solve the problem.

From an external developer's point of view, the more practical problem is how to avoid such blocking. We can extend Lua static analysis tools such as luacheck to find and warn about common blocking operations, or we can directly disable or rewrite certain functions by modifying _G, e.g.:

resty -e '_G.ngx.print = function()
    ngx.say("hello")
end
ngx.print()'

# hello

With this sample code, you can rewrite the ngx.print function directly.
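In the same spirit, here is a minimal sketch of disabling a blocking call outright by overriding it in _G, so that any accidental use fails loudly (the function chosen and the error message are just examples):

resty -e '_G.os.execute = function()
    error("os.execute is blocking and has been disabled; use lua-resty-shell instead")
end
local ok, err = pcall(os.execute, "ls")
ngx.say(tostring(ok), ", ", tostring(err))'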

Question 3: Does the operation of LuaJIT's NYI have a significant impact on performance?

Description: loadstring is marked as "never" in LuaJIT's NYI list. Will it have a big impact on performance?

Regarding LuaJIT's NYI list, we don't need to be too strict. For operations that can be JIT-compiled, JIT compilation is naturally best; but for operations that cannot be JIT-compiled yet, we can still use them through the interpreter.
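If you want to confirm whether an NYI primitive is actually aborting traces in your hot path, a minimal sketch using LuaJIT's bundled jit.v module looks like this (the loops are just placeholders; jit.v prints a line for every trace it creates or aborts, and the exact output depends on your LuaJIT build):

resty -e 'require("jit.v").on()
-- a hot numeric loop: LuaJIT should compile this into a trace
local sum = 0
for i = 1, 1e5 do
    sum = sum + i
end
-- loadstring is an NYI primitive, so traces that reach it will abort
for i = 1, 1e3 do
    loadstring("return 1 + 1")()
end
ngx.say(sum)'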

For performance optimization, we need to take a statistics-based, scientific approach, which is exactly what flame graph sampling is about. Premature optimization is the root of all evil. We only need to optimize hot code that is called frequently and consumes a lot of CPU.

Back to loadstring: we only call it to reload code when it changes, not on every request, so it is not a frequent operation, and we don't need to worry about its impact on the overall performance of the system.

This ties back to the blocking issue in question 2: in OpenResty, we sometimes also perform blocking file I/O during the init and init_worker phases. Such operations hurt performance more than NYI primitives do, but since they run only once when the service starts, they are acceptable.

As always, performance optimization should be viewed from a macro perspective; this is a point you need to pay particular attention to. Otherwise, by obsessing over a single detail, you may spend a long time optimizing without much to show for it.

Question 4: Can I implement dynamic upstream by myself?

Description: For dynamic upstream, my approach is to set up two upstreams for a service, select one of them according to routing conditions, and modify the IPs in the upstream directly when a machine's IP changes. Is there any disadvantage or pitfall in this approach compared with using balancer_by_lua directly?

The advantage of balancer_by_lua is that it lets the user choose the load-balancing algorithm, for example round robin or chash, or any other algorithm the user implements; it is both flexible and high-performance.

Doing it with routing rules gives the same result, but you then have to implement upstream health checking yourself, which adds a lot of extra work.

We can also expand on this question: how should we implement an A/B test scenario, which requires different upstreams?

You can decide which upstream to use in the balancer_by_lua phase based on the URI, host, parameters, and so on. You can also use an API gateway to turn these judgments into routing rules, decide which route matches in the early access phase, and then find the target upstream through the binding between the route and the upstream. This is a common approach in API gateways, and we'll talk about it more specifically later in the hands-on section.
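As an illustration, here is a minimal nginx.conf sketch of the balancer_by_lua approach; the peer addresses and the X-Canary header are made-up examples, and real code would add the balancing policy, retries, and health checks:

upstream dynamic_backend {
    server 0.0.0.1;   # placeholder; the real peer is set in Lua below

    balancer_by_lua_block {
        local balancer = require "ngx.balancer"
        -- the peer was chosen and stashed in ngx.ctx during the access phase
        local peer = ngx.ctx.peer or { host = "127.0.0.1", port = 8080 }
        local ok, err = balancer.set_current_peer(peer.host, peer.port)
        if not ok then
            ngx.log(ngx.ERR, "failed to set the current peer: ", err)
            return ngx.exit(500)
        end
    }
}

server {
    listen 9080;

    location / {
        access_by_lua_block {
            -- a made-up A/B test rule: requests with "X-Canary: on" go to another peer
            if ngx.var.http_x_canary == "on" then
                ngx.ctx.peer = { host = "127.0.0.1", port = 8081 }
            end
        }
        proxy_pass http://dynamic_backend;
    }
}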

Question 5: Is caching of shared dict mandatory?

Description:

In real production applications, I think the shared dict layer of the cache is a must. It seems everyone only remembers the benefits of the lru cache: no restrictions on data format, no need to deserialize, no need to calculate memory usage based on k/v size, no contention between workers, no read/write locks, and high performance.

However, don't ignore its most fatal weakness: the lifecycle of the lru cache is tied to the worker process. Whenever NGINX reloads, this part of the cache is completely lost, and if there is no shared dict at that point, the L3 data source will be overwhelmed within minutes.

Of course, this assumes fairly high concurrency, but since a cache is being used, the traffic is certainly not small, so the analysis above still applies. Is my view correct?

In some cases it is indeed as you said: the shared dict is not lost during a reload, so it is necessary. But there is one particular case where an lru cache alone is acceptable: when all the data can be actively fetched from L3, the data source, in the init or init_worker phase.

For example, the open-source API gateway APISIX has its data source in etcd, and it only fetches data from etcd. It caches the data in the lru cache during the init_worker phase, and later cache updates are actively obtained through etcd's watch mechanism. This way, even if NGINX reloads, there is no cache stampede.
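For illustration, here is a minimal sketch of this pattern, not APISIX's actual code: each worker warms a lua-resty-lrucache instance at startup and then refreshes it from a timer; fetch_all() stands in for whatever call pulls the full data set from etcd or another L3 data source:

init_worker_by_lua_block {
    local lrucache = require "resty.lrucache"

    -- in real code the cache usually lives in a shared module so that
    -- other phases (access, balancer, ...) can read it as well
    local cache = assert(lrucache.new(1000))

    local function refresh(premature)
        if premature then
            return
        end
        -- fetch_all() is a stand-in for pulling the full data set from
        -- the L3 data source (etcd, a database, ...)
        -- for key, value in pairs(fetch_all()) do
        --     cache:set(key, value)
        -- end
    end

    refresh(false)                         -- warm the cache at worker start
    assert(ngx.timer.every(30, refresh))   -- then refresh it periodically
}

With this pattern, a reload simply re-runs init_worker_by_lua and the cache is rebuilt from the data source, so losing the worker-local data is not a problem.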

So, we can have preferences when choosing technology, but don't generalize absolutely, because there is no silver bullet that fits every caching scenario. Building a minimal viable solution according to the needs of the actual scenario and then iterating on it gradually is an excellent approach.