Beyond the Web Server: Privileged Process and Timer Tasks

API7.ai

November 3, 2022

OpenResty (NGINX + Lua)

In the previous article, we introduced the OpenResty APIs, shared dict, and cosocket, all of which stay within the realm of NGINX and web servers, giving us a programmable web server that is cheaper to build and easier to maintain.

However, OpenResty can do more than that. Today, let's look at a few OpenResty features that go beyond the web server: timer tasks, the privileged process, and the non-blocking ngx.pipe.

Timer Tasks

In OpenResty, we sometimes need to perform specific tasks in the background at regular intervals, such as synchronizing data or cleaning up logs. If you were to design this, how would you do it? The easiest approach that comes to mind is to expose an API interface that performs these tasks, and then have the system's crontab call curl at regular intervals to hit this interface, implementing the requirement in a roundabout way.

However, this not only fragments the logic but also adds operational and maintenance complexity. So OpenResty provides ngx.timer to meet this kind of requirement. You can think of ngx.timer as a client request simulated by OpenResty that triggers the corresponding callback function.

OpenResty's timer tasks can be divided into the following two types.

  • ngx.timer.at is used to execute one-time timer tasks.
  • ngx.timer.every is used to execute fixed-period timer tasks.

Remember the thought-provoking question I left at the end of the last article? The question was how to break the restriction that cosocket cannot be used in init_worker_by_lua, and the answer is ngx.timer.

The following code starts a timer task with a delay of 0. When the timer fires, it runs the callback handler function, which uses cosocket to access a website.

init_worker_by_lua_block {
    local function handler()
        -- cosocket is available here because this code runs inside a timer
        local sock = ngx.socket.tcp()
        local ok, err = sock:connect("api7.ai", 80)
    end

    -- a delay of 0 means the callback fires as soon as possible
    local ok, err = ngx.timer.at(0, handler)
}

This way, we get around the restriction that cosocket cannot be used at this stage.

Going back to the requirement mentioned at the beginning of this section, ngx.timer.at does not address the need to run periodically; the code example above is a one-time task.

So, how do we do this periodically? You seem to have two options based on the ngx.timer.at API.

  • You can implement the periodic task yourself by using a while true infinite loop in the callback function that sleeps for a while after executing the task.
  • You can also create another new timer at the end of the callback function.

However, before making a choice, there is one thing we need to clarify: a timer is essentially a request, even though that request is not initiated by a client. Like any request, it has to exit after completing its task and cannot stay resident forever; otherwise, it can easily cause all kinds of resource leaks.

Therefore, the first solution, using while true to implement periodic tasks, is unreliable. The second solution is feasible but recursively creates timers, which is not easy to follow; a minimal sketch of it is shown below.
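Here is a rough sketch of the second option. The 60-second interval and the empty task body are placeholders for illustration, not part of the original example.

init_worker_by_lua_block {
    local delay = 60  -- arbitrary interval, in seconds

    local handler
    handler = function(premature)
        if premature then
            return  -- the worker is exiting, stop scheduling new runs
        end

        -- do the actual periodic work here

        -- create a new timer at the end of the callback for the next run
        local ok, err = ngx.timer.at(delay, handler)
        if not ok then
            ngx.log(ngx.ERR, "failed to create timer: ", err)
        end
    end

    local ok, err = ngx.timer.at(delay, handler)
    if not ok then
        ngx.log(ngx.ERR, "failed to create timer: ", err)
    end
}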

So, is there a better solution? The ngx.timer.every API, added in later versions of OpenResty, is specifically designed to solve this problem, and it is a solution much closer to crontab.
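With ngx.timer.every, the same periodic task becomes a single call. The sketch below again uses an arbitrary 60-second interval and a placeholder callback:

init_worker_by_lua_block {
    local function handler(premature)
        if premature then
            return
        end
        -- put the periodic work here, e.g. synchronizing data or cleaning up logs
    end

    -- run the handler every 60 seconds until the worker exits
    local ok, err = ngx.timer.every(60, handler)
    if not ok then
        ngx.log(ngx.ERR, "failed to create timer: ", err)
    end
}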

The downside is that you never have a chance to cancel a timer task after starting it. After all, ngx.timer.cancel is still a to-do function.

At this point, you will face a problem: the timer is running in the background and cannot be canceled; if there are many timers, it is easy to run out of system resources.

Therefore, OpenResty provides two directives, lua_max_pending_timers and lua_max_running_timers, to limit them. The former represents the maximum number of timers waiting to be executed, and the latter represents the maximum number of currently running timers.
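For example, a snippet like the following (the numbers are just illustrative values) caps both kinds of timers in the http block of nginx.conf:

http {
    lua_max_pending_timers 1024;    # at most 1024 timers waiting to fire
    lua_max_running_timers 256;     # at most 256 timer callbacks running at once
    ...
}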

You can also use the Lua API to get the number of currently waiting and running timer tasks, as shown in the following two examples.

content_by_lua_block {
    ngx.timer.at(3, function() end)
    ngx.say(ngx.timer.pending_count())
}

This code will print a 1, indicating that there is one scheduled task waiting to be executed.

content_by_lua_block {
    ngx.timer.at(0.1, function() ngx.sleep(0.3) end)
    ngx.sleep(0.2)
    ngx.say(ngx.timer.running_count())
}

This code will print a 1, indicating that there is one scheduled task running.

Privileged Process

Next, let's look at the privileged process. As we all know, NGINX is divided into a Master process and Worker processes, where the Worker processes handle user requests. We can get the type of the current process through the process.type API provided in lua-resty-core. For example, you can use resty to run the following code.

$ resty -e 'local process = require "ngx.process"
ngx.say("process type:", process.type())'

You will see that it returns single rather than worker, which means resty starts NGINX as a single process without a Master process. Indeed, in the resty implementation you can see that the Master process is turned off with a line like this.

master_process off;

OpenResty extends NGINX by adding a privileged agent. This privileged process has the following special characteristics.

  • It does not listen on any ports, which means it does not provide services to the outside world.

  • It has the same privileges as the Master process, generally those of the root user, allowing it to do many things that are impossible for a Worker process.

  • The privileged process can only be enabled in the init_by_lua context.

  • Also, code in the privileged process only makes sense in the init_worker_by_lua context, because no requests ever reach it, so contexts such as content and access are never executed.

Let's look at an example of enabling the privileged process.

init_by_lua_block {
    local process = require "ngx.process"

    local ok, err = process.enable_privileged_agent()
    if not ok then
        ngx.log(ngx.ERR, "enable privileged agent failed: ", err)
    end
}

After enabling the privileged process with this code and starting the OpenResty service, we can see that the privileged agent now appears among the NGINX processes.

nginx: master process
nginx: worker process
nginx: privileged agent process

However, if the privileged process's code only runs once, during the init_worker_by_lua phase, it is not of much use. So how should we trigger work in the privileged process?

The answer is hidden in what we just learned. Since the privileged process doesn't listen on any ports, it can't be triggered by client requests, so the only way to trigger it periodically is with the ngx.timer we just introduced:

init_worker_by_lua_block {
    local process = require "ngx.process"

    local function reload(premature)
        if premature then
            return
        end
        local f, err = io.open(ngx.config.prefix() .. "/logs/nginx.pid", "r")
        if not f then
            return
        end
        local pid = f:read()
        f:close()
        os.execute("kill -HUP " .. pid)
    end

    if process.type() == "privileged agent" then
        local ok, err = ngx.timer.every(5, reload)
        if not ok then
            ngx.log(ngx.ERR, err)
        end
    end
}

The code above sends a HUP signal to the Master process every 5 seconds. Naturally, you can build on this to do more interesting things, such as polling a database for tasks meant for the privileged process and executing them. Since the privileged process has root privileges, this is obviously a bit of a "backdoor" program.
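As a rough sketch of that idea (not part of the original example), suppose a shared dict named tasks is declared in nginx.conf with lua_shared_dict tasks 1m, and some other part of the system pushes shell commands into it as a list; both are assumptions for illustration. The privileged process could then poll and execute them:

init_worker_by_lua_block {
    local process = require "ngx.process"

    local function poll(premature)
        if premature then
            return
        end
        -- pop one pending command, if any, and run it with the privileged process's (root) privileges
        local cmd = ngx.shared.tasks:lpop("pending")
        if cmd then
            -- os.execute blocks; see the next section for a non-blocking alternative
            os.execute(cmd)
        end
    end

    if process.type() == "privileged agent" then
        local ok, err = ngx.timer.every(5, poll)
        if not ok then
            ngx.log(ngx.ERR, err)
        end
    end
}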

Non-blocking ngx.pipe

Finally, let's look at the non-blocking ngx.pipe. In the code example we just described, we used Lua's standard library to execute an external command that sends a signal to the Master process:

os.execute("kill -HUP " .. pid)

Naturally, this operation blocks. So, is there a non-blocking way to call external programs in OpenResty? If you are using OpenResty as a full development platform rather than just a web server, this is something you need. For this purpose, the lua-resty-shell library was created; invoking the command line with it is non-blocking:

$ resty -e 'local shell = require "resty.shell"
local ok, stdout, stderr, reason, status =
    shell.run([[echo "hello, world"]])
ngx.say(stdout)'

This code is yet another way of writing hello world, calling the system's echo command to produce the output. In the same way, you can use resty.shell as a replacement for Lua's os.execute, as sketched below.
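For instance, here is a sketch of the earlier reload function with os.execute swapped out for resty.shell, everything else unchanged; it is only an illustration, not the original example.

local shell = require "resty.shell"

local function reload(premature)
    if premature then
        return
    end
    local f, err = io.open(ngx.config.prefix() .. "/logs/nginx.pid", "r")
    if not f then
        return
    end
    local pid = f:read()
    f:close()

    -- non-blocking replacement for os.execute("kill -HUP " .. pid)
    local ok, stdout, stderr, reason, status = shell.run("kill -HUP " .. pid)
    if not ok then
        ngx.log(ngx.ERR, "sending HUP failed: ", stderr)
    end
end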

The underlying implementation of lua-resty-shell relies on the ngx.pipe API in lua-resty-core, so the hello world example above, written with ngx.pipe directly instead of lua-resty-shell, would look like this.

$ resty -e 'local ngx_pipe = require "ngx.pipe"
local proc = ngx_pipe.spawn({"echo", "hello world"})
local data, err = proc:stdout_read_line()
ngx.say(data)'

This is essentially what lua-resty-shell does under the hood. You can check the ngx.pipe documentation and test cases for more details on how to use it, so I won't go into them here.

Summary

That's it for the main content today. From the features above, we can see that while making NGINX better, OpenResty is also moving toward becoming a general-purpose platform, hoping that developers can unify their technology stack and use OpenResty to meet their development needs. This is quite friendly to operations and maintenance, since only OpenResty needs to be deployed, which keeps maintenance costs lower.

Finally, I'll leave you with a thought-provoking question. Since there may be multiple NGINX Workers, the timer will run once for each Worker, which is unacceptable in most scenarios. How can we ensure that the timer runs only once?

Feel free to leave a comment with your solution, and feel free to share this article with your colleagues and friends so that we can communicate and improve together.