OpenResty FAQ | Privileged Process Permission, Execution Phase, and more

API7.ai

November 11, 2022

OpenResty (NGINX + Lua)

This article contains six frequently asked questions:

1. Privileged Process Permissions

Q: What is the privileged process? How can a non-privileged user get root permissions? Could you introduce some usage scenarios of the privileged process?

A: The privileged process has the same permissions as the master process. If you start OpenResty as a non-privileged user, the master process inherits that user's privileges, which means the "privileged process" has no special rights at all.

This is easy to understand: when a normal user starts the process, there are no root privileges to inherit.

As for the usage scenarios of the privileged process, we generally use it for tasks that require high privileges, such as cleaning logs and restarting OpenResty. However, be careful not to run worker-level tasks in the privileged process, because of the security risks.

One developer runs all of his timer tasks in the privileged process. Why does he do this? Because there is only one privileged process, the timer is guaranteed not to be started repeatedly.

The developer is being "smart", as he reached the goal without using worker.id. However, don't forget that it is very dangerous to do this if the timer task relies on the client's input.
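For illustration, here is a minimal sketch (assuming lua-resty-core's ngx.process module is available) of how the privileged agent is usually enabled and how a timer task is confined to it; the 60-second interval and the log-cleanup comment are placeholders:

init_by_lua_block {
    local process = require "ngx.process"
    -- enable the single privileged agent process
    local ok, err = process.enable_privileged_agent()
    if not ok then
        ngx.log(ngx.ERR, "failed to enable privileged agent: ", err)
    end
}

init_worker_by_lua_block {
    local process = require "ngx.process"
    -- only the privileged agent starts the timer, so the task
    -- is never scheduled twice across worker processes
    if process.type() == "privileged agent" then
        ngx.timer.every(60, function()
            -- e.g. clean up logs here with the master's permissions
        end)
    end
}

Because there is exactly one privileged agent, this achieves the same "run once" effect as checking worker.id, but with the elevated permissions of the master process.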

2. Phasing and Debugging

Q: After running ngx.say('hello'), will OpenResty execute the rest of the logic in the current phase and then respond to the client directly? That is, will it skip the later phases?

A: No, it will not. We can look at the execution phases:

(figure: OpenResty's execution phase diagram)

You can test ngx.say in the content phase first, then use ngx.log in the log or body filter phase to print the log.
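Such a test might look like the following sketch, where the log phase still runs after ngx.say has produced the response body:

location /t {
    content_by_lua_block {
        ngx.say("hello")
        ngx.log(ngx.ERR, "content phase done")
    }
    log_by_lua_block {
        ngx.log(ngx.ERR, "log phase still runs after ngx.say")
    }
}

Requesting /t returns "hello" to the client, and both log lines appear in the error log, confirming that the later phases are not skipped.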

In previous articles, I didn't specifically address how to debug code in OpenResty, which may confuse some developers.

There are no advanced debugging features in OpenResty, such as breakpoints (there are some paid plugins, but I haven't used them); you can only use ngx.say and ngx.log to inspect the output. This is how all the developers I know do their debugging, including the OpenResty authors and contributors. Therefore, you need robust test cases and debug logs as a guarantee.

3. The Practice of ngx.exit

Q: In a previous article, there is this description: OpenResty's HTTP status codes include a special constant, ngx.OK. After running ngx.exit(ngx.OK), the request exits the current phase and moves on to the next phase instead of returning to the client directly.

I remember that ngx.OK should not be considered an HTTP status code; its value is 0. My understanding is:

  • After running ngx.exit(ngx.OK), ngx.exit(ngx.ERROR) or ngx.exit(ngx.DECLINED), the request exits the current phase and moves on to the next phase.
  • When ngx.exit(ngx.HTTP_*) takes the various HTTP status codes of ngx.HTTP_* as a parameter, it will respond directly to the client.

I don't know if my understanding is right.

A: Regarding your first question, you are right: ngx.OK is not an HTTP status code but an OpenResty constant with a value of 0.

As for the second question, the official documentation for ngx.exit gives the exact answer:

  1. When status >= 200 (i.e., ngx.HTTP_OK and above), it will interrupt the execution of the current request and return status code to nginx.

  2. When status == 0 (i.e., ngx.OK), it will only quit the current phase handler (or the content handler if the content_by_lua* directive is used) and continue to run later phases (if any) for the current request.

However, the documentation does not mention how OpenResty handles ngx.exit(ngx.ERROR) and ngx.exit(ngx.DECLINED). We can do a test as follows:

location /lua {
    rewrite_by_lua "ngx.exit(ngx.ERROR)";
    echo hello;
}

Visiting this location, you can see that the HTTP response code is empty, the response body is empty too, and the request does not go on to the next phase of execution.
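For contrast, here is a sketch of the same location using an HTTP status code (>= 200) as the argument, which responds to the client directly and never reaches the echo directive:

location /lua2 {
    rewrite_by_lua_block {
        -- interrupts the request and returns 403 to the client
        ngx.exit(ngx.HTTP_FORBIDDEN)
    }
    echo hello;
}

Visiting /lua2 returns a 403 response, matching the first rule quoted from the documentation above.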

As you get deeper into the OpenResty learning process, you are bound to find at some point that neither the documentation nor the test cases can answer your questions. At this point, you need to build your own test cases to verify your ideas. You can do this manually, or you can add the tests to a test suite built with Test::Nginx.

4. Variables and Race Condition

Q: As mentioned earlier, the scope of the ngx.var variable spans the NGINX C modules and the lua-nginx-module.

  1. I don't quite understand this. From the request perspective, does it mean a single request within a worker process?

  2. My understanding is that when we manipulate variables within a module, and there is a blocking operation between two operations, a race condition may exist. But if there is no blocking operation between the two operations, and the current process happens to enter the ready queue because its CPU time slice is up, could a race condition still occur?

A: Let's look at these questions.

First, regarding the ngx.var variable, your understanding is correct. The lifecycle of ngx.var is the same as the request's, and it disappears when the request ends. Its advantage is that the data can be passed between C modules and Lua code, which the other methods cannot do.
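As a minimal sketch, a value written to ngx.var in the rewrite phase is visible both to Lua code in a later phase and to NGINX's C modules; the variable name foo here is just an example, and it must be declared with set first:

location /var {
    set $foo "";
    rewrite_by_lua_block {
        ngx.var.foo = "bar"
    }
    content_by_lua_block {
        -- the value set in the rewrite phase survives into the content phase
        ngx.say(ngx.var.foo)
    }
}

Requesting /var prints "bar", and the same $foo value could also be used by C modules such as the log module.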

Second, a race condition can occur as long as there is a yield operation between two operations, not a blocking one. When an operation blocks, there is no race condition. In other words, as long as you don't hand control back to NGINX's event loop, there will be no race condition.
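To make the yield point concrete, here is a sketch (the counter module is hypothetical) where ngx.sleep yields to the event loop between the read and the write, opening a race window that a purely blocking section would not have:

content_by_lua_block {
    local counter = require "counter"  -- hypothetical module holding a number
    local v = counter.get()
    ngx.sleep(0.1)      -- yields to the event loop: another request can interleave here
    counter.set(v + 1)  -- may overwrite an update made by a concurrent request
}

If the ngx.sleep line is removed, the read and write run back to back without yielding, and no other request in the same worker can interleave between them.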

5. The shared dict Does Not Need Locking

Q: If multiple workers store data concurrently, is it necessary to add locks?

For example:

resty --shdict 'dogs 10m' -e 'local dogs = ngx.shared.dogs
    local lock = ngx.xxxx.lock  -- pseudocode for some lock library
    lock.lock()
    dogs:set("Jim", 8)
    lock.unlock()
    local v = dogs:get("Jim")
    ngx.say(v)
    '

A: You don't need a lock here, because every single get or set operation on a shared dict is atomic. OpenResty has already taken care of this lock-like processing for you.
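For read-modify-write sequences, the shared dict also offers single atomic calls such as incr, so the pseudo-lock above is unnecessary; a sketch with the resty CLI:

resty --shdict 'dogs 10m' -e 'local dogs = ngx.shared.dogs
    dogs:set("Jim", 8)            -- atomic on its own
    local v = dogs:incr("Jim", 1) -- atomic read-modify-write across workers
    ngx.say(v)                    -- prints 9
    '

Because incr performs the read, the addition, and the write as one atomic operation, concurrent workers cannot interleave between them.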

6. Time Operation in OpenResty

Q: When using ngx.now() to get the time, does the time only get updated when the function is resumed?

A: NGINX is designed with performance first in mind and caches time. We can verify this by the source code of ngx.now:

static int
ngx_http_lua_ngx_now(lua_State *L)
{
    ngx_time_t              *tp;

    tp = ngx_timeofday();

    lua_pushnumber(L, (lua_Number) (tp->sec + tp->msec / 1000.0L));

    return 1;
}

As you can see, behind the ngx.now() function that gets the current time is NGINX's ngx_timeofday function. The ngx_timeofday function is just a macro definition:

#define ngx_timeofday()      (ngx_time_t *) ngx_cached_time

Here, the value of ngx_cached_time is updated only in the ngx_time_update function.

So the question becomes: when is ngx_time_update called? If you trace it through the NGINX source code, you will see that the calls to ngx_time_update occur within the event loop, and the problem is solved.
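A sketch to observe this caching behavior (ngx.update_time is the official API to force a refresh of the cached time):

location /time {
    content_by_lua_block {
        local t1 = ngx.now()
        for _ = 1, 1e7 do end    -- busy loop, no yield: cached time unchanged
        ngx.say(ngx.now() - t1)  -- 0: still the same cached value
        ngx.update_time()        -- force NGINX to refresh the cached time
        ngx.say(ngx.now() - t1)  -- may now show the elapsed time
    }
}

Because the busy loop never yields to the event loop, ngx_time_update is not called, and ngx.now() keeps returning the cached value until ngx.update_time() forces a refresh.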

Summary

As you should be able to see from these questions, the benefit of open-source projects is that you can follow the clues and look for answers in the source code, which gives you the feeling of solving a detective case.

Finally, I hope that through communication and Q&A, I can help you turn what you learn into what you gain. You are also welcome to forward this article, so we can communicate and improve together.