Advantages and Disadvantages of `string` in OpenResty
API7.ai
December 8, 2022
In the last article, we got familiar with the common blocking functions in OpenResty, which are often misused by beginners. Starting with this article, we get into the core of performance optimization: a set of techniques that can quickly improve the performance of OpenResty code, so don't take them lightly.
In this process, we need to write more test code to experience how to use these optimization techniques and verify their effectiveness so we can make good use of them.
Behind the Scenes of Performance Optimization Tips
Optimization techniques belong to the "practice" side, so before we get to them, let's talk about the "theory" of optimization.
Performance optimization methods change with each iteration of LuaJIT and OpenResty. Some methods may be optimized directly by the underlying technology and no longer need to be mastered; at the same time, new optimization techniques will appear. Therefore, what matters most is to master the ideas that stay constant behind these techniques.
Let's take a look at some of the critical ideas about performance in OpenResty programming.
Theory 1: Processing requests should be short, simple, and fast
OpenResty is a web server, so it often handles 1,000+, 10,000+, or even 100,000+ client requests simultaneously. Therefore, to achieve the highest overall performance, we must ensure that individual requests are processed quickly and that the resources they use, such as memory, are released promptly.
- The "short" mentioned here means that the request life cycle should be short so as not to take up resources for a long time without releasing them; even for long connections, a threshold of time or number of requests should be set to release resources regularly.
- The second "simple" refers to doing only one thing in an API. Break up complex business logic into multiple APIs and keep the code simple.
- Finally, "fast" means don't block the main thread and don't run long CPU-intensive operations. Even if you have to do so, don't forget to combine them with the other methods we introduced in the last article.
This architectural consideration applies not only to OpenResty but also to other development languages and platforms, so I hope you can understand and think about it carefully.
Theory 2: Avoid generating intermediate data
Avoiding useless intermediate data is arguably the most important optimization idea in OpenResty programming. Let's look at a small example of what useless intermediate data means.
$ resty -e 'local s= "hello"
s = s .. " world"
s = s .. "!"
print(s)
'
In this code snippet, we did several splicing operations on the `s` variable to get the result `hello world!`. But only the final `hello world!` state of `s` is useful. The initial value of `s` and the intermediate assignments are all intermediate data that should be generated as little as possible.
The reason is that temporary data costs performance twice: once when it is allocated and initialized, and again when the GC reclaims it. Do not underestimate these costs; if this happens in hot code such as loops, performance degrades noticeably. I will also illustrate this later with a string example.
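As a small sketch of the idea in plain Lua (it should also run under `resty`), the same result can be produced step by step or in a single expression. The variable names here are only illustrative, and the comments assume the behavior of the standard Lua 5.1 and LuaJIT implementations, where a chained `..` expression is evaluated as one concatenation.

```lua
-- Illustrative fragments; in real code these would come from request data.
local greeting, target, mark = "hello", " world", "!"

-- Step by step: "hello world" is created as an intermediate string
-- that is thrown away as soon as the last step runs.
local stepwise = greeting
stepwise = stepwise .. target
stepwise = stepwise .. mark

-- One expression: the pieces are joined in a single concatenation,
-- so only the final result is created.
local joined = greeting .. target .. mark

print(stepwise, joined)
```

Both variables end up with the same value; the difference is only in how many strings were created along the way.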
`string`s are immutable
Now, back to the subject of this article, `string`. Here, I'm highlighting the fact that `string`s are immutable in Lua.
Of course, this doesn't mean that `string`s can't be spliced, modified, etc. But when we modify a `string`, we don't change the original `string`; we create a new `string` object and point the reference at it. So naturally, if the original `string` has no other references, it will be reclaimed by Lua's GC (garbage collection).
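A minimal plain-Lua sketch of this behavior follows; the variable names are made up for illustration. Appending to one variable creates a new string and leaves any other reference to the old value untouched.

```lua
local original = "hello"
local alias = original            -- both names refer to the same string

original = original .. " world"   -- builds a new string, rebinds `original`

print(original)  --> hello world
print(alias)     --> hello
```

Once `alias` also goes out of scope, nothing refers to `"hello"` any more and the GC is free to reclaim it.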
The obvious benefit of immutable `string`s is that they save memory. There is only one copy of any given `string` in memory, and different variables holding it point to the same memory address.
The disadvantage of this design shows up when `string`s are created and reclaimed: every time a `string` is created, LuaJIT has to call `lj_str_new` to check whether that `string` already exists and, if not, create a new one. If this happens very often, it has a significant impact on performance.
Let's look at a concrete example: a `string` splicing loop of the kind found in many OpenResty open-source projects.
$ resty -e 'local begin = ngx.now()
local s = ""
-- `for` loop, using `..` to perform string splicing
for i = 1, 100000 do
s = s .. "a"
end
ngx.update_time()
print(ngx.now() - begin)
'
This sample code performs 100,000 `string` splices on the `s` variable and prints the elapsed time. Although the example is a bit extreme, it gives a good idea of the difference before and after performance optimization. Without optimization, this code takes about 0.4 seconds on my laptop, which is relatively slow. So how should we optimize it?
The answer was given in previous articles: use a `table` as a layer of encapsulation, removing all the temporary intermediate `string`s and keeping only the original data and the final result. Let's look at the concrete code implementation.
$ resty -e 'local begin = ngx.now()
local t = {}
-- for loop that uses an array to hold the string, counting the length of the array each time
for i = 1, 100000 do
t[#t + 1] = "a"
end
-- Stitching strings using the concat method of arrays
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'
We can see that this code stores each string in turn in a `table`, with the index determined by `#t + 1`, that is, the current length of the `table` plus `1`. Finally, the `table.concat` function concatenates the array elements. This skips all the temporary strings and avoids 100,000 calls to `lj_str_new` and the associated GC work.
That was our code analysis, but how well does the optimization work? The optimized code takes only 0.007 seconds, an improvement of more than 50 times. In an actual project, the improvement might be even more pronounced because, in this example, we only added one character, `a`, at a time.
What would the performance difference be if each new `string` were ten `a` characters long?
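One way to explore that question is a small timing harness like the sketch below. It is written in plain Lua and uses `os.clock()` (CPU time) instead of `ngx.now()` so that it also runs outside OpenResty; `time_it` and the two variant functions are hypothetical helpers, not part of any OpenResty API. The iteration count is reduced to 20,000 here to keep the run short, and can be raised to 100,000 to match the examples in this article. No timings are quoted, because they depend on the machine.

```lua
-- Hypothetical helper: time one run of `fn` using CPU time from os.clock().
local function time_it(fn)
    local start = os.clock()
    local result = fn()
    return os.clock() - start, result
end

local piece = string.rep("a", 10)   -- the fragment appended on each iteration
local count = 20000                 -- raise to 100000 to match the article

-- Variant 1: repeated `..`, creating a new string on every iteration.
local function with_concat_operator()
    local s = ""
    for _ = 1, count do
        s = s .. piece
    end
    return s
end

-- Variant 2: collect the fragments, then join them once with table.concat.
local function with_table_concat()
    local t = {}
    for i = 1, count do
        t[i] = piece
    end
    return table.concat(t)
end

local t1, s1 = time_it(with_concat_operator)
local t2, s2 = time_it(with_table_concat)
assert(s1 == s2)  -- both variants must build the same string
print(string.format("..: %.3fs  table.concat: %.3fs", t1, t2))
```

Changing the second argument of `string.rep` makes it easy to see how the gap between the two variants moves as the fragment grows.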
Is 0.007 seconds good enough for our optimization work? No, it can still be improved. Let's modify one more line of code and see the result.
$ resty -e 'local begin = ngx.now()
local t = {}
-- for loop, using an array to hold the string, maintaining the length of the array itself
for i = 1, 100000 do
t[i] = "a"
end
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'
This time, we changed `t[#t + 1] = "a"` to `t[i] = "a"`, and with just one changed line, we avoid 100,000 operations to get the length of the array. Remember the operation to get the length of an array that we mentioned in the `table` section earlier? It is not a constant-time lookup: LuaJIT may have to search for the end of the array part, which costs up to O(log n) per call, so it is a relatively expensive operation inside a hot loop. So, here we simply maintain the array index ourselves and bypass the length operation. As the saying goes, if you can't afford to deal with it, avoid it.
Of course, this is a simpler way to write it. The following code illustrates more clearly how to maintain the index of an array by ourselves.
$ resty -e 'local begin = ngx.now()
local t = {}
local index = 1
for i = 1, 100000 do
t[index] = "a"
index = index + 1
end
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'
Reduce other temporary `string`s
The mistake we just discussed, temporary `string`s caused by `string` splicing, is easy to spot. With the sample code above as a reminder, I believe we will not make it again. However, OpenResty code also generates some more hidden temporary `string`s that are much harder to detect. For example, the `string` handling function discussed below is used often. Would you have guessed that it also generates temporary `string`s?
As we know, the `string.sub` function extracts a specified part of a `string`. As we mentioned earlier, `string`s in Lua are immutable, so extracting a substring produces a new `string`, which involves `lj_str_new` and subsequent GC operations.
$ resty -e 'print(string.sub("abcd", 1, 1))'
The above code fetches the first character of the `string` and prints it. It inevitably generates a temporary `string`. Is there a better way to achieve the same effect?
$ resty -e 'print(string.char(string.byte("abcd")))'
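There is. Looking at this code, we first use `string.byte` to get the numeric code of the first character and then use `string.char` to convert the number back to the corresponding character. `string.byte` returns a number rather than a new `string` object, so the scanning step itself creates no temporary `string`; the `string.char` conversion is only needed when a character is actually required as a `string` result. Therefore, it is most efficient to use `string.byte` for `string`-related scanning and analysis.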
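As an illustration of scanning with `string.byte`, the sketch below counts how often one character occurs in a string by comparing numeric byte codes, so no substring is created per position. `count_char` is a hypothetical helper written for this example, not a library function.

```lua
-- Hypothetical helper: count how many times the single character `ch`
-- occurs in `s`, comparing numeric byte codes instead of substrings.
local function count_char(s, ch)
    local target = string.byte(ch)   -- numeric code of the character to find
    local n = 0
    for i = 1, #s do
        if string.byte(s, i) == target then
            n = n + 1
        end
    end
    return n
end

print(count_char("a=1&b=2&c=3", "&"))  --> 2
```

The same loop written with `string.sub(s, i, i) == ch` would ask for a new one-character string at every position.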
Leverage SDK support for the `table` type
After learning how to reduce temporary `string`s, are you eager to try it? Then, we can take the result of the sample code above and output it to the client as the response body's content. At this point, you can pause and try to write this code yourself first.
$ resty -e 'local begin = ngx.now()
local t = {}
local index = 1
for i = 1, 100000 do
t[index] = "a"
index = index + 1
end
local response = table.concat(t, "")
ngx.say(response)
'
If you can write this code, you're already ahead of most OpenResty developers. But OpenResty's Lua API already takes the use of `table`s for `string` splicing into account: `ngx.say`, `ngx.print`, `cosocket:send`, and other APIs that may take many `string`s accept not only a `string` but also an array `table` of `string` fragments as a parameter. `ngx.log` serves the same purpose in a slightly different way: it takes a variable number of arguments and concatenates them itself.
$ resty -e 'local begin = ngx.now()
local t = {}
local index = 1
for i = 1, 100000 do
t[index] = "a"
index = index + 1
end
ngx.say(t)
'
In this last code snippet, we omit `local response = table.concat(t, "")`, the `string` splicing step, and pass the `table` directly to `ngx.say`. This shifts the `string` splicing task from the Lua level to the C level, avoiding another round of `string` lookup, generation, and GC. For long `string`s, this is another significant performance gain.
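The same approach carries over to responses made of several different fragments. The sketch below collects the pieces of a small text body in one flat array, maintaining the index by hand, and passes the array to `ngx.print` without joining it in Lua. `build_body` and the field names `name` and `value` are assumptions made for this example; the `if ngx` guard is only there so the snippet also loads in a plain Lua interpreter.

```lua
-- Hypothetical helper: turn a list of { name = ..., value = ... } items
-- into a flat array of string fragments, without joining them in Lua.
local function build_body(items)
    local body, n = {}, 0
    for _, item in ipairs(items) do
        body[n + 1] = item.name
        body[n + 2] = ": "
        body[n + 3] = item.value
        body[n + 4] = "\n"
        n = n + 4          -- maintain the index ourselves instead of using #
    end
    return body
end

local body = build_body({
    { name = "host", value = "example.com" },
    { name = "path", value = "/index" },
})

-- Inside OpenResty, hand the fragments straight to the output API.
if ngx then
    ngx.print(body)
end
```

`ngx.print` is used instead of `ngx.say` here because the fragments already carry their own newlines.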
Summary
After reading this article, we can see that a lot of OpenResty's performance optimization deals with various details. Therefore, we need to know LuaJIT and OpenResty's Lua API well to achieve optimal performance. This also reminds us that if we have forgotten the previous content, we must review and consolidate it in time.
Finally, a question to think about: write the strings `hello`, `world`, and `!` to the error log. Can we write sample code that does this without `string` splicing?
Also, don't forget the other question in the text. What would be the performance difference in the following code if each new `string` were ten `a` characters long?
$ resty -e 'local begin = ngx.now()
local t = {}
for i = 1, 100000 do
t[#t + 1] = "a"
end
local s = table.concat(t, "")
ngx.update_time()
print(ngx.now() - begin)
'
You are also welcome to share this article with your friends to learn and communicate.