"inspect": Apache APISIX Lua Dynamic Debugging Plugin
JinHua Luo
January 29, 2023
Why do we need Lua Dynamic Debugging Plugin?
Apache APISIX contains a significant amount of Lua code. Is there a way to inspect the values of variables in the code at runtime without modifying the source code?
Modifying Lua source code for debugging has several drawbacks:
- Production environments should not and often do not allow modifying source code
- Modifying source code necessitates reloading, which disrupts the functioning of business processes
- It can be challenging to modify source code in containerized environments
- Temporarily generated codes are prone to be omitted to rollback, leading to maintenance issues
We usually need to inspect the values of variables, not only when functions begin or end but also when specific conditions are met, such as when a loop reaches a certain number of iterations or when a specific condition is true. Additionally, it is not always enough to simply print out variable values; it may also be necessary to send relevant information to external systems. Furthermore, how to make this process dynamic, and can it be done without negatively impacting the program's performance?
The Lua dynamic debugging plugin, inspect
, helps you to fulfill the above requirements.
- Customizable breakpoint handling
- Dynamic breakpoint setting
- Multiple breakpoints can be set
- Breakpoints can be set to trigger only once
- The performance impact can be controlled and kept within a specific scope
Working principle of plugin
It fully utilizes Lua's Debug API
function to implement its features. During the interpreter mode, each executed bytecode could be mapped to a specific file and a line number. In order to set breakpoints, we only need to check if the line number is equal to the expected value and execute the breakpoint function we have already defined. This allows us to process the context information of the corresponding line, including upvalue, local variables, and some metadata, such as the stack.
APISIX uses Lua's JIT implementation: LuaJIT, where many hot code paths are compiled into machine code to execute. However, these are not affected by Debug API,
so we need to clear the JIT cache before turning on breakpoints. The key is that we can choose only to clear the JIT cache of a specific Lua function, reducing the impact on global performance. When a program runs, many JIT-compiled code blocks will be named trace
in LuaJIT. These traces are associated with Lua functions, and a single Lua function may include multiple traces, referring to different hot paths within the function.
We can specify their function objects and clear their JIT cache for global and module-level functions. However, if the line number corresponds to other function types, such as anonymous functions, we cannot obtain the function object globally. In such cases, we can only clear all JIT cache. New traces cannot be generated during debugging, but existing un-cleared traces continue to run. As long as we have enough control, the program performance will not be affected since an online system that has been running for a long time usually will not generate new traces. Once debugging ends and all breakpoints have been revoked, the system will return to normal JIT mode, and the cleared JIT cache will be regenerated once it re-enters the hot spot.
Installation and configuration
This plugin is enabled by default.
Configure conf/confg.yaml
correctly to enable this plugin:
plugins:
...
- inspect
plugin_attr:
inspect:
delay: 3
hooks_file: "/usr/local/apisix/plugin_inspect_hooks.lua"
The plugin reads breakpoint definitions from the file '/usr/local/apisix/plugin_inspect_hooks.lua' every 3 seconds by default. To debug, you just need to edit this file.
We recommend creating a soft link to this path so that it is easier to archive different historical versions of the breakpoint file.
Please note that when the file is modified, the plugin will clear all previous breakpoints and enable all new breakpoints defined in the breakpoint file. These breakpoints will be effective across all worker processes.
Generally, it is not necessary to delete the file, as when defining breakpoints, you can specify when to revoke them.
Deleting this file will cancel all breakpoints for all worker processes.
Logs of the start and stop of breakpoints will be recorded at the 'WARN' log level.
Define breakpoints
require("apisix.inspect.dbg").set_hook(file, line, func, filter_func)
file
, the file name, which could be any unambiguous file name or path.line
, the line number in the file, please note that breakpoints are closely linked to line numbers, so if the code changes, the line number must also change.func
, the name of the function whose trace should be cleared. If it isnil
, then all traces in theluajit vm
will be cleared.filter_func
, a custom Lua function that handles the breakpoint- The input parameter is called
table
that includes the following:finfo
: the return value ofdebug.getinfo(level, "nSlf")
uv
: upvalues hash tablevals
: local variables hash table
- If the return value of the function is
true
, then the breakpoint will be automatically revoked. Otherwise, the breakpoint will continue to be effective.
- The input parameter is called
For example:
local dbg = require "apisix.inspect.dbg"
dbg.set_hook("limit-req.lua", 88, require("apisix.plugins.limit-req").access,
function(info)
ngx.log(ngx.INFO, debug.traceback("foo traceback", 3))
ngx.log(ngx.INFO, dbg.getname(info.finfo))
ngx.log(ngx.INFO, "conf_key=", info.vals.conf_key)
return true
end)
dbg.set_hook("t/lib/demo.lua", 31, require("t.lib.demo").hot2, function(info)
if info.vals.i == 222 then
ngx.timer.at(0, function(_, body)
local httpc = require("resty.http").new()
httpc:request_uri("http://127.0.0.1:9080/upstream1", {
method = "POST",
body = body,
})
end, ngx.var.request_uri .. "," .. info.vals.i)
return true
end
return false
end)
--- more breakpoints ...
Please note that the demo
breakpoint organizes some information and sends it to an external server. Also, the used resty.http
library is an asynchronous library based on cosocket
.
Whenever calling OpenResty's asynchronous API, it must be sent with a delay by using timer
, because executing functions on breakpoints is synchronous and blocking, it will not return to the main program of nginx
for asynchronous processing, so it needs to be delayed.
Use cases
Determine routes based on the content of the request body
Let's say we have a requirement: how can we set up a route that only accepts POST requests with the string APISIX: 666
in the request body?
In the route configuration, there is a vars
field, which can be used to check the value of nginx
variables in order to determine if the route should be matched. The $request_body
variable provided by nginx
contains the request body value, so we can use this variable to implement our requirement.
Let's try out to configure the routes:
curl http://127.0.0.1:9180/apisix/admin/routes/var_route \
-H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -i -d '
{
"uri": "/anything",
"methods": ["POST"],
"vars": [["request_body", "~~", "APISIX: 666"]],
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'
Then we could try this:
curl http://127.0.0.1:9080/anything
{"error_msg":"404 Route Not Found"}
curl -i http://127.0.0.1:9080/anything -X POST -d 'hello, APISIX: 666.'
HTTP/1.1 404 Not Found
Date: Thu, 05 Jan 2023 03:53:35 GMT
Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Server: APISIX/3.0.0
{"error_msg":"404 Route Not Found"}
Weird, why can't we match this route?
We will then take a look at the documentation for this variable in nginx:
The variable’s value is made available in locations processed by the proxy_pass, fastcgi_pass, uwsgi_pass, and scgi_pass directives when the request body was read to a memory buffer.
In other words, we need to first read request body before we use this variable.
When matching the route, will this variable be empty? We can use the inspect
plugin to verify it.
We found the line of code that matches the route:
apisix/init.lua
...
api_ctx.var.request_uri = api_ctx.var.uri .. api_ctx.var.is_args .. (api_ctx.var.args or "")
router.router_http.match(api_ctx)
local route = api_ctx.matched_route
if not route then
...
Let's verify the varibale request_body
in line 515, which is router.router_http.match(api_ctx)
.
Set up breakpoints
Edit file /usr/local/apisix/example_hooks.lua
:
local dbg = require("apisix.inspect.dbg")
dbg.set_hook("apisix/init.lua", 515, require("apisix").http_access_phase, function(info)
core.log.warn("request_body=", info.vals.api_ctx.var.request_body)
return true
end)
Create a soft link to the breakpoint file path:
ln -sf /usr/local/apisix/example_hooks.lua /usr/local/apisix/plugin_inspect_hooks.lua
Check the logs to confirm that the breakpoint is effective.
2023/01/05 12:02:43 [warn] 1890559#1890559: *15736 [lua] init.lua:68: setup_hooks():
set hooks: err: true, hooks: ["apisix\/init.lua#515"], context: ngx.timer
Retrigger routes matching:
curl -i http://127.0.0.1:9080/anything -X POST -d 'hello, APISIX: 666.'
Check the logs:
2023/01/05 12:02:59 [warn] 1890559#1890559: *16152
[lua] [string "local dbg = require("apisix.inspect.dbg")..."]:39:
request_body=nil, client: 127.0.0.1, server: _,
request: "POST /anything HTTP/1.1", host: "127.0.0.1:9080"
Of course, request_body
is empty!
Solution
Since we know that we need to read the request body to use the request_body
variable, so we can't use vars
to do this. Instead, we can use the filter_func
field in the route to achieve our requirement.
curl http://127.0.0.1:9180/apisix/admin/routes/var_route \
-H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -i -d '
{
"uri": "/anything",
"methods": ["POST"],
"filter_func": "function(_) return require(\"apisix.core\").request.get_body():find(\"APISIX: 666\") end",
"upstream": {
"type": "roundrobin",
"nodes": {
"httpbin.org": 1
}
}
}'
Let's verify:
curl http://127.0.0.1:9080/anything -X POST -d 'hello, APISIX: 666.'
{
"args": {},
"data": "",
"files": {},
"form": {
"hello, APISIX: 666.": ""
},
"headers": {
"Accept": "*/*",
"Content-Length": "19",
"Content-Type": "application/x-www-form-urlencoded",
"Host": "127.0.0.1",
"User-Agent": "curl/7.68.0",
"X-Amzn-Trace-Id": "Root=1-63b64dbd-0354b6ed19d7e3b67013592e",
"X-Forwarded-Host": "127.0.0.1"
},
"json": null,
"method": "POST",
"origin": "127.0.0.1, xxx",
"url": "http://127.0.0.1/anything"
}
Problem resolved!
Print some logs that have been blocked by the log level.
In general, we don't enable INFO
level logs in a production environment, but sometimes we need to check some detailed information. How can we do that?
We usually won't directly set the INFO
level and then reload, as this has two drawbacks:
- Too many logs, which affects performance and increases the difficulty of checking
- Reloading causes persistent connections to be disconnected, affecting online traffic
Usually, we only need to check the logs of a specific point; for example, we all know that APISIX uses etcd
as a configuration distribution database, so can we see when the route configuration is updated incrementally to the data plane? What specific data have been updated?
apisix/core/config_etcd.lua
local function sync_data(self)
...
log.info("waitdir key: ", self.key, " prev_index: ", self.prev_index + 1)
log.info("res: ", json.delay_encode(dir_res, true), ", err: ", err)
...
end
The lua function for incremental synchronization is sync_data()
, but it prints out the incremental data from etcd watch in the INFO
level.
So let's try using the inspect
plugin to display. We will only show the changes of routes resources.
Edit /usr/local/apisix/example_hooks.lua
:
local dbg = require("apisix.inspect.dbg")
local core = require("apisix.core")
dbg.set_hook("apisix/core/config_etcd.lua", 393, nil, function(info)
local filter_res = "/routes"
if info.vals.self.key:sub(-#filter_res) == filter_res and not info.vals.err then
core.log.warn("etcd watch /routes response: ", core.json.encode(info.vals.dir_res, true))
return true
end
return false
end)
The logic of this breakpoint processing function clearly indicates the filtering capability. If the key
of watch
is /routes
, and err
is empty, it will print out the data returned by etcd only once, then cancel the breakpoint.
Note that sync_data()
is a local function, so it cannot be referenced directly.
In that case, we can only set the third parameter of set_hook
to be nil
, which has the side effect of clearing all traces.
In the above example, we have created a soft link, so we only need to save files after editing. Breakpoints will be enabled after a few seconds; you can check the log to confirm.
By checking the log, we can obtain the information we need, which is printed at the WARN
log level.
It also shows the time when we get the etcd incremental data on the data plane.
2023/01/05 14:33:10 [warn] 1890562#1890562: *231311
[lua] [string "local dbg = require("apisix.inspect.dbg")..."]:41:
etcd watch /routes response: {"headers":{"X-Etcd-Index":"24433"},
"body":{"node":[{"value":{"uri":"\/anything",
"plugins":{"request-id":{"header_name":"X-Request-Id","include_in_response":true,"algorithm":"uuid"}},
"create_time":1672898912,"status":1,"priority":0,"update_time":1672900390,
"upstream":{"nodes":{"httpbin.org":1},"hash_on":"vars","type":"roundrobin","pass_host":"pass","scheme":"http"},
"id":"reqid"},"key":"\/apisix\/routes\/reqid","modifiedIndex":24433,"createdIndex":24429}]}}, context: ngx.timer
Conclusion
Lua dynamic debugging is an important auxiliary function. With the APISIX's inspect
plugin, we can do so many things such as:
- Troubleshoot and identify the cause of issues
- Print some blocked logs and retrieve various information as needed
- Learn Lua code by debugging
Please read these related documentations for more details.