The JIT Compiler's Drawback: Why Avoid NYI?

API7.ai

September 30, 2022

OpenResty (NGINX + Lua)

In the previous article, we looked at FFI in LuaJIT. If your project only uses the API provided by OpenResty and you don't need to call C functions, then FFI is not that important for you. You just need to make sure that lua-resty-core is enabled.

But NYI in LuaJIT, which we'll talk about today, is a crucial problem that no engineer using OpenResty can escape, and it significantly impacts performance.

You can quickly write logically correct code using OpenResty, but without understanding NYI, you can't write efficient code and can't leverage the power of OpenResty. The performance difference between code that is JIT-compiled and code that falls back to the interpreter can be an order of magnitude or more.

What is NYI?

Let's start by recalling a point we've made before.

LuaJIT's runtime, in addition to an assembly implementation of the Lua interpreter, has a JIT compiler that can generate machine code directly.

The implementation of the JIT compiler in LuaJIT is not yet complete. It cannot compile some primitives, partly because they are challenging to implement and partly because the LuaJIT author is currently semi-retired. These primitives are called NYI, short for Not Yet Implemented. They include the common pairs() function, the unpack() function, Lua C modules implemented as Lua CFunctions, and so on. When the JIT compiler encounters an operation it does not support on the current code path, it falls back to interpreter mode.

LuaJIT's official website has a complete list of these NYIs, and I suggest you go through it. The goal of the article is not for you to memorize this list, but for you to consciously remind yourself of it when writing code.

Below, I've taken a few functions from the NYI list for the string library.

string library

The compile status of string.byte is yes, which means that it can be optimized with JIT, and you can use it in your code without fear.

The compile status of string.char is 2.1, which means it has been supported since LuaJIT 2.1. As we know, LuaJIT in OpenResty is based on LuaJIT 2.1, so you can use it safely.

The compile status of string.dump is never, i.e., it will not be optimized with JIT and will fall back to interpreter mode. As of now, there are no plans to support it.

The compile status of string.find is 2.1 partial, meaning that it has been partially supported since LuaJIT 2.1. The note after it says that only searching for fixed strings is supported, not pattern matching. So for finding fixed strings, string.find can be optimized with JIT.

Naturally, we should avoid using NYI so that more of our code can be JIT-compiled and performance can be guaranteed. However, in a real-world environment, we sometimes inevitably need to use some NYI functions, so what should we do?

Alternatives to NYI

Don't worry. We can steer clear of most NYI functions and implement their functionality in other ways. Next, I've selected a few typical NYIs and will walk you through the different kinds of alternatives, so that you can apply the same reasoning to other NYIs.

string.gsub()

Let's first look at string.gsub(), Lua's built-in function for global string substitution, as in the following example.

$ resty -e 'local new = string.gsub("banana", "a", "A"); print(new)'
bAnAnA

This function is an NYI function and cannot be compiled by JIT.

We could try to find a replacement function in OpenResty's API, but for most people, it's not practical to remember all the APIs and their usage. That's why I always open the GitHub documentation page for the lua-nginx-module in my development work.

For example, we can search the documentation page with gsub as the keyword, and ngx.re.gsub will come up.

We can also use the restydoc tool, recommended earlier, to search the OpenResty API. You can try using it to search for gsub.

$ restydoc -s gsub

As you can see, instead of returning the ngx.re.gsub we were expecting, it shows Lua's own functions. In fact, at this stage, restydoc only returns an exact, unique match, so it is best suited to cases where you already know the API name. For fuzzy searches, you still have to search the documentation manually.

Going back to the search results, we see that the function definition of ngx.re.gsub is as follows:

newstr, n, err = ngx.re.gsub(subject, regex, replace, options?)

Here, the function parameters and return values are named with specific meanings. In fact, in OpenResty, I don't recommend writing a lot of comments. Most of the time, a good name is better than several lines of comments.

Engineers unfamiliar with OpenResty's regular expression system may be confused by the options parameter at the end. Its explanation is not in the documentation for this function but in the documentation for the ngx.re.match function.

If you look at the documentation for options, you'll see that setting it to jo turns on PCRE JIT (the j flag) and caches the compiled regex (the o flag). That way, code using ngx.re.gsub can be JIT-compiled by LuaJIT, while the regex itself is JIT-compiled by PCRE.

I won't go into the details of the documentation. The OpenResty documentation is excellent, so read it carefully and you can solve most of your problems.
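As a minimal sketch of the jo option in use, the snippet below repeats the earlier banana substitution with ngx.re.gsub. The helper name replace_all is hypothetical and only for illustration. The code assumes it runs inside OpenResty (for example through the resty CLI), because the ngx table does not exist in a plain Lua interpreter.

```lua
-- replace_all is a hypothetical helper name, not part of OpenResty.
-- It must be called inside OpenResty, where the ngx table exists.
local function replace_all(subject, regex, replacement)
    -- "j" enables PCRE JIT, "o" compiles the regex once and caches it.
    local newstr, n, err = ngx.re.gsub(subject, regex, replacement, "jo")
    if not newstr then
        return nil, err
    end
    return newstr, n
end

-- Only run the demo when the OpenResty API is available.
if ngx then
    print(replace_all("banana", "a", "A"))
end
```

Wrapping the call like this keeps the option string in one place, so callers cannot forget it.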

string.find()

Unlike string.gsub, string.find is JIT-able in plain mode (i.e., fixed-string lookup). It is not JIT-able for pattern matching, for which you should use OpenResty's ngx.re.find API instead.

So, when you do a string find in OpenResty, you must first clearly distinguish whether you are looking for a fixed string or a regular expression. If it is the former, use string.find and remember to set plain to true at the end.

string.find("foo bar", "foo", 1, true)
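To see why the plain flag matters, here is a small sketch in standard Lua. The sample string is my own choice for illustration. Without plain, the second argument is treated as a Lua pattern, so a character such as . acts as a wildcard instead of a literal dot.

```lua
local s = "example.com/path"

-- Pattern mode: "." is a wildcard, so it matches the very first character.
local from, to = string.find(s, ".")
print(from, to)    -- 1  1

-- Plain mode: "." is a literal dot, found at position 8.
local pfrom, pto = string.find(s, ".", 1, true)
print(pfrom, pto)  -- 8  8
```

So the plain flag affects correctness for strings containing pattern characters, in addition to deciding whether the call can be JIT-compiled.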

If you are looking for a regular expression, you should use OpenResty's API and turn on the JIT option for PCRE.

ngx.re.find("foo bar", "^foo", "jo")

It would be more appropriate to add a layer of wrapping here and turn the optimization options on by default, so that the end user doesn't need to know so many details. That way, the outside world sees a single, uniform string lookup function. As you can see, too many options and too much flexibility are sometimes not a good thing.

unpack()

The third function we'll look at is unpack(). It should also be avoided, especially inside a loop body. Instead, you can access the elements directly by array index, as the second of the following two snippets does.

$ resty -e '
 local a = {100, 200, 300, 400}
 for i = 1, 2 do
    print(unpack(a))
 end'

$ resty -e '
 local a = {100, 200, 300, 400}
 for i = 1, 2 do
    print(a[1], a[2], a[3], a[4])
 end'

Let's dig a little deeper into unpack. This time we can use restydoc to search for it.

$ restydoc -s unpack

As you can see from the unpack documentation, unpack(list [, i [, j]]) is equivalent to return list[i], list[i+1], ..., list[j], so you can think of unpack as syntactic sugar. This means you can replace it with plain array indexing without breaking LuaJIT's JIT compilation.
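The equivalence can be checked with a short sketch. The sum4 function is a hypothetical consumer made up for this illustration; it receives the same four values either way.

```lua
-- unpack is a global in Lua 5.1 and LuaJIT; newer Lua versions call it table.unpack.
local unpack = unpack or table.unpack

local a = {100, 200, 300, 400}

-- Hypothetical consumer that takes four separate arguments.
local function sum4(w, x, y, z)
    return w + x + y + z
end

-- Both calls pass exactly the same values.
local via_unpack = sum4(unpack(a, 1, 4))
local via_index  = sum4(a[1], a[2], a[3], a[4])

print(via_unpack == via_index)  -- true
```

Explicit indexing only works when the number of elements is known in advance, which is usually the case on a hot path.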

pairs()

Finally, let's look at the pairs() function that traverses the hash table, which also cannot be compiled by JIT.

Unfortunately, however, there is no equivalent alternative to this. You can only try to avoid it or use arrays accessed by numeric index instead, and in particular, do not traverse the hash table on the hot code path. A hot code path is code that is executed many times, for example, inside a giant loop.
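One way to follow this advice is to keep an array of keys next to the hash table, so that the hot path iterates by numeric index. The sketch below assumes the set of keys rarely changes; the table contents and the names weights, keys, and total_weight are hypothetical.

```lua
-- Hypothetical example data: a hash table of weights.
local weights = { alpha = 3, beta = 5, gamma = 2 }

-- Built once, outside the hot path, where pairs() is acceptable.
local keys = {}
for k in pairs(weights) do
    keys[#keys + 1] = k
end
table.sort(keys)  -- deterministic order for this demo

-- Hot path: a numeric for loop over the key array, with no pairs() call.
local function total_weight()
    local sum = 0
    for i = 1, #keys do
        sum = sum + weights[keys[i]]
    end
    return sum
end

print(total_weight())  -- 10
```

The trade-off is that the key array must be updated whenever a key is added to or removed from the hash table.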

With these four examples covered, let's summarize. To avoid NYI functions, you need to pay attention to these two points.

  • Use the API provided by OpenResty in preference to Lua's standard library functions. Remember that Lua is an embedded language, and we are programming in OpenResty, not Lua.
  • If you have to use an NYI function as a last resort, please ensure it is not on the hot code path.

How to detect NYI?

All this talk about NYI circumvention is to teach you what to do. However, it would be inconsistent with one of the philosophies OpenResty espouses if it ended abruptly here.

What the machine can do automatically should not involve humans.

People are not machines, and there will always be oversights. Automating the detection of NYI used in code is an essential reflection of the value of an engineer.

Here I recommend the jit.dump and jit.v modules that come with LuaJIT. Both print out how the JIT compiler is working. The former outputs detailed information that can be used to debug LuaJIT itself, and you can refer to its source code for a deeper understanding. The latter's output is simpler, with each line corresponding to a trace, and is usually used to check whether code can be JIT-compiled.

How should we do this? We can start by adding the following two lines of code to init_by_lua.

local v = require "jit.v"
v.on("/tmp/jit.log")

Then, run your stress test tool or a few hundred unit test sets to get LuaJIT hot enough to trigger JIT compilation. Once that's done, check the results of /tmp/jit.log.

Of course, this approach is relatively tedious. If you want to keep things simple, the resty CLI is sufficient, since it comes with the relevant options.

$ resty -j v -e 'for i=1, 1000 do
      local newstr, n, err = ngx.re.gsub("hello, world", "([a-z])[a-z]+", "[$0,$1]", "i")
 end'
 [TRACE   1 (command line -e):1 stitch C:107bc91fd]
 [TRACE   2 (1/stitch) (command line -e):2 -> 1]

Here, -j is resty's LuaJIT-related option. It takes the value dump or v, which turns on the jit.dump or jit.v module respectively.

In the output of the jit.v module, each numbered line is a successfully compiled trace. The example above is JIT-capable. If NYI functions are encountered, the output says so explicitly, as in the following example with pairs.

$ resty -j v -e 'local t = {}
 for i=1,100 do
     t[i] = i
 end

 for i=1, 1000 do
     for j=1,1000 do
         for k,v in pairs(t) do
             --
         end
     end
 end'

This code cannot be JIT-compiled, so the output reports an NYI at line 8, the line that calls pairs().

 [TRACE   1 (command line -e):2 loop]
 [TRACE --- (command line -e):7 -- NYI: bytecode 72 at (command line -e):8]
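Checking such output by eye does not scale, so the scan can be automated. The following is a minimal sketch, not an official tool: find_nyi is a hypothetical helper, and its pattern assumes that aborted traces look like the sample line above, which may differ between LuaJIT versions.

```lua
-- Collect every NYI report from jit.v output.
-- Assumed line shape (taken from the sample output above):
--   [TRACE --- <start> -- NYI: <what> at <location>]
local function find_nyi(log_text)
    local found = {}
    for line in log_text:gmatch("[^\r\n]+") do
        local what, where = line:match("%-%- NYI: (.-) at (.-)%]%s*$")
        if what then
            found[#found + 1] = { what = what, where = where }
        end
    end
    return found
end

-- Sample input: the two output lines shown above.
local sample = [==[
[TRACE   1 (command line -e):2 loop]
[TRACE --- (command line -e):7 -- NYI: bytecode 72 at (command line -e):8]
]==]

for _, nyi in ipairs(find_nyi(sample)) do
    print(nyi.what .. " at " .. nyi.where)
end
```

In practice you could feed it the contents of /tmp/jit.log after a stress test and fail the build when the returned list is not empty.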

Summary

This is the first time we've talked about OpenResty performance issues at greater length. After reading about these NYI optimizations, what do you think? You can leave a comment with your opinion.

Finally, I'll leave you with an exercise. When discussing alternatives to the string.find() function, I mentioned that it would be better to add a layer of wrapping and turn on the optimization options by default. I'll leave that task to you as a little test drive.

Feel free to write your answers in the comments section, and you are welcome to share this article with your colleagues and friends to communicate and progress together.