"inspect": Apache APISIX Lua Dynamic Debugging Plugin

JinHua Luo

January 29, 2023

Technology

Why do we need Lua Dynamic Debugging Plugin?

Apache APISIX contains a significant amount of Lua code. Is there a way to inspect the values of variables in the code at runtime without modifying the source code?

Modifying Lua source code for debugging has several drawbacks:

  • Production environments should not and often do not allow modifying source code
  • Modifying source code necessitates reloading, which disrupts the functioning of business processes
  • It can be challenging to modify source code in containerized environments
  • Temporarily generated codes are prone to be omitted to rollback, leading to maintenance issues

We usually need to inspect the values of variables, not only when functions begin or end but also when specific conditions are met, such as when a loop reaches a certain number of iterations or when a specific condition is true. Additionally, it is not always enough to simply print out variable values; it may also be necessary to send relevant information to external systems. Furthermore, how to make this process dynamic, and can it be done without negatively impacting the program's performance?

The Lua dynamic debugging plugin, inspect, helps you to fulfill the above requirements.

  • Customizable breakpoint handling
  • Dynamic breakpoint setting
  • Multiple breakpoints can be set
  • Breakpoints can be set to trigger only once
  • The performance impact can be controlled and kept within a specific scope

Working principle of plugin

It fully utilizes Lua's Debug API function to implement its features. During the interpreter mode, each executed bytecode could be mapped to a specific file and a line number. In order to set breakpoints, we only need to check if the line number is equal to the expected value and execute the breakpoint function we have already defined. This allows us to process the context information of the corresponding line, including upvalue, local variables, and some metadata, such as the stack.

APISIX uses Lua's JIT implementation: LuaJIT, where many hot code paths are compiled into machine code to execute. However, these are not affected by Debug API, so we need to clear the JIT cache before turning on breakpoints. The key is that we can choose only to clear the JIT cache of a specific Lua function, reducing the impact on global performance. When a program runs, many JIT-compiled code blocks will be named trace in LuaJIT. These traces are associated with Lua functions, and a single Lua function may include multiple traces, referring to different hot paths within the function.

We can specify their function objects and clear their JIT cache for global and module-level functions. However, if the line number corresponds to other function types, such as anonymous functions, we cannot obtain the function object globally. In such cases, we can only clear all JIT cache. New traces cannot be generated during debugging, but existing un-cleared traces continue to run. As long as we have enough control, the program performance will not be affected since an online system that has been running for a long time usually will not generate new traces. Once debugging ends and all breakpoints have been revoked, the system will return to normal JIT mode, and the cleared JIT cache will be regenerated once it re-enters the hot spot.

Installation and configuration

This plugin is enabled by default.

Configure conf/confg.yaml correctly to enable this plugin:

plugins:
...
  - inspect
plugin_attr:
  inspect:
    delay: 3
    hooks_file: "/usr/local/apisix/plugin_inspect_hooks.lua"

The plugin reads breakpoint definitions from the file '/usr/local/apisix/plugin_inspect_hooks.lua' every 3 seconds by default. To debug, you just need to edit this file.

We recommend creating a soft link to this path so that it is easier to archive different historical versions of the breakpoint file.

Please note that when the file is modified, the plugin will clear all previous breakpoints and enable all new breakpoints defined in the breakpoint file. These breakpoints will be effective across all worker processes.

Generally, it is not necessary to delete the file, as when defining breakpoints, you can specify when to revoke them.

Deleting this file will cancel all breakpoints for all worker processes.

Logs of the start and stop of breakpoints will be recorded at the 'WARN' log level.

Define breakpoints

require("apisix.inspect.dbg").set_hook(file, line, func, filter_func)
  • file, the file name, which could be any unambiguous file name or path.
  • line, the line number in the file, please note that breakpoints are closely linked to line numbers, so if the code changes, the line number must also change.
  • func, the name of the function whose trace should be cleared. If it is nil, then all traces in the luajit vm will be cleared.
  • filter_func, a custom Lua function that handles the breakpoint
    • The input parameter is called table that includes the following:
      • finfo: the return value of debug.getinfo(level, "nSlf")
      • uv: upvalues hash table
      • vals: local variables hash table
    • If the return value of the function is true, then the breakpoint will be automatically revoked. Otherwise, the breakpoint will continue to be effective.

For example:

local dbg = require "apisix.inspect.dbg"
dbg.set_hook("limit-req.lua", 88, require("apisix.plugins.limit-req").access,
function(info)
    ngx.log(ngx.INFO, debug.traceback("foo traceback", 3))
    ngx.log(ngx.INFO, dbg.getname(info.finfo))
    ngx.log(ngx.INFO, "conf_key=", info.vals.conf_key)
    return true
end)
dbg.set_hook("t/lib/demo.lua", 31, require("t.lib.demo").hot2, function(info)
    if info.vals.i == 222 then
        ngx.timer.at(0, function(_, body)
            local httpc = require("resty.http").new()
            httpc:request_uri("http://127.0.0.1:9080/upstream1", {
                method = "POST",
                body = body,
            })
        end, ngx.var.request_uri .. "," .. info.vals.i)
        return true
    end
    return false
end)
--- more breakpoints ...

Please note that the demo breakpoint organizes some information and sends it to an external server. Also, the used resty.http library is an asynchronous library based on cosocket.

Whenever calling OpenResty's asynchronous API, it must be sent with a delay by using timer, because executing functions on breakpoints is synchronous and blocking, it will not return to the main program of nginx for asynchronous processing, so it needs to be delayed.

Use cases

Determine routes based on the content of the request body

Let's say we have a requirement: how can we set up a route that only accepts POST requests with the string APISIX: 666 in the request body?

In the route configuration, there is a vars field, which can be used to check the value of nginx variables in order to determine if the route should be matched. The $request_body variable provided by nginx contains the request body value, so we can use this variable to implement our requirement.

Let's try out to configure the routes:

curl http://127.0.0.1:9180/apisix/admin/routes/var_route \
-H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -i -d '
{
    "uri": "/anything",
    "methods": ["POST"],
    "vars": [["request_body", "~~", "APISIX: 666"]],
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "httpbin.org": 1
        }
    }
}'

Then we could try this:

curl http://127.0.0.1:9080/anything
{"error_msg":"404 Route Not Found"}
curl -i http://127.0.0.1:9080/anything -X POST -d 'hello, APISIX: 666.'
HTTP/1.1 404 Not Found
Date: Thu, 05 Jan 2023 03:53:35 GMT
Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Server: APISIX/3.0.0
{"error_msg":"404 Route Not Found"}

Weird, why can't we match this route?

We will then take a look at the documentation for this variable in nginx:

The variable’s value is made available in locations processed by the proxy_pass, fastcgi_pass, uwsgi_pass, and scgi_pass directives when the request body was read to a memory buffer.

In other words, we need to first read request body before we use this variable.

When matching the route, will this variable be empty? We can use the inspect plugin to verify it.

We found the line of code that matches the route:

apisix/init.lua

...
api_ctx.var.request_uri = api_ctx.var.uri .. api_ctx.var.is_args .. (api_ctx.var.args or "")
router.router_http.match(api_ctx)
local route = api_ctx.matched_route
if not route then
...

Let's verify the varibale request_body in line 515, which is router.router_http.match(api_ctx).

Set up breakpoints

Edit file /usr/local/apisix/example_hooks.lua:

local dbg = require("apisix.inspect.dbg")
dbg.set_hook("apisix/init.lua", 515, require("apisix").http_access_phase, function(info)
    core.log.warn("request_body=", info.vals.api_ctx.var.request_body)
    return true
end)

Create a soft link to the breakpoint file path:

ln -sf /usr/local/apisix/example_hooks.lua /usr/local/apisix/plugin_inspect_hooks.lua

Check the logs to confirm that the breakpoint is effective.

2023/01/05 12:02:43 [warn] 1890559#1890559: *15736 [lua] init.lua:68: setup_hooks():
set hooks: err: true, hooks: ["apisix\/init.lua#515"], context: ngx.timer

Retrigger routes matching:

curl -i http://127.0.0.1:9080/anything -X POST -d 'hello, APISIX: 666.'

Check the logs:

2023/01/05 12:02:59 [warn] 1890559#1890559: *16152
[lua] [string "local dbg = require("apisix.inspect.dbg")..."]:39:
request_body=nil, client: 127.0.0.1, server: _,
request: "POST /anything HTTP/1.1", host: "127.0.0.1:9080"

Of course, request_body is empty!

Solution

Since we know that we need to read the request body to use the request_body variable, so we can't use vars to do this. Instead, we can use the filter_func field in the route to achieve our requirement.

curl http://127.0.0.1:9180/apisix/admin/routes/var_route \
-H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -i -d '
{
    "uri": "/anything",
    "methods": ["POST"],
    "filter_func": "function(_) return require(\"apisix.core\").request.get_body():find(\"APISIX: 666\") end",
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "httpbin.org": 1
        }
    }
}'

Let's verify:

curl http://127.0.0.1:9080/anything -X POST -d 'hello, APISIX: 666.'
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "hello, APISIX: 666.": ""
  },
  "headers": {
    "Accept": "*/*",
    "Content-Length": "19",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "127.0.0.1",
    "User-Agent": "curl/7.68.0",
    "X-Amzn-Trace-Id": "Root=1-63b64dbd-0354b6ed19d7e3b67013592e",
    "X-Forwarded-Host": "127.0.0.1"
  },
  "json": null,
  "method": "POST",
  "origin": "127.0.0.1, xxx",
  "url": "http://127.0.0.1/anything"
}

Problem resolved!

In general, we don't enable INFO level logs in a production environment, but sometimes we need to check some detailed information. How can we do that?

We usually won't directly set the INFO level and then reload, as this has two drawbacks:

  • Too many logs, which affects performance and increases the difficulty of checking
  • Reloading causes persistent connections to be disconnected, affecting online traffic

Usually, we only need to check the logs of a specific point; for example, we all know that APISIX uses etcd as a configuration distribution database, so can we see when the route configuration is updated incrementally to the data plane? What specific data have been updated?

apisix/core/config_etcd.lua

local function sync_data(self)
...
    log.info("waitdir key: ", self.key, " prev_index: ", self.prev_index + 1)
    log.info("res: ", json.delay_encode(dir_res, true), ", err: ", err)
...
end

The lua function for incremental synchronization is sync_data(), but it prints out the incremental data from etcd watch in the INFO level.

So let's try using the inspect plugin to display. We will only show the changes of routes resources.

Edit /usr/local/apisix/example_hooks.lua:

local dbg = require("apisix.inspect.dbg")
local core = require("apisix.core")
dbg.set_hook("apisix/core/config_etcd.lua", 393, nil, function(info)
    local filter_res = "/routes"
    if info.vals.self.key:sub(-#filter_res) == filter_res and not info.vals.err then
        core.log.warn("etcd watch /routes response: ", core.json.encode(info.vals.dir_res, true))
        return true
    end
    return false
end)

The logic of this breakpoint processing function clearly indicates the filtering capability. If the key of watch is /routes, and err is empty, it will print out the data returned by etcd only once, then cancel the breakpoint.

Note that sync_data() is a local function, so it cannot be referenced directly. In that case, we can only set the third parameter of set_hook to be nil, which has the side effect of clearing all traces.

In the above example, we have created a soft link, so we only need to save files after editing. Breakpoints will be enabled after a few seconds; you can check the log to confirm.

By checking the log, we can obtain the information we need, which is printed at the WARN log level. It also shows the time when we get the etcd incremental data on the data plane.

2023/01/05 14:33:10 [warn] 1890562#1890562: *231311
[lua] [string "local dbg = require("apisix.inspect.dbg")..."]:41:
etcd watch /routes response: {"headers":{"X-Etcd-Index":"24433"},
"body":{"node":[{"value":{"uri":"\/anything",
"plugins":{"request-id":{"header_name":"X-Request-Id","include_in_response":true,"algorithm":"uuid"}},
"create_time":1672898912,"status":1,"priority":0,"update_time":1672900390,
"upstream":{"nodes":{"httpbin.org":1},"hash_on":"vars","type":"roundrobin","pass_host":"pass","scheme":"http"},
"id":"reqid"},"key":"\/apisix\/routes\/reqid","modifiedIndex":24433,"createdIndex":24429}]}}, context: ngx.timer

Conclusion

Lua dynamic debugging is an important auxiliary function. With the APISIX's inspect plugin, we can do so many things such as:

  • Troubleshoot and identify the cause of issues
  • Print some blocked logs and retrieve various information as needed
  • Learn Lua code by debugging

Please read these related documentations for more details.

Tags:
APISIX BasicsAPI Gateway ConceptPlugin