Handling Layer 4 Traffic and Implementing a Memcached Server with OpenResty
API7.ai
November 10, 2022
In a few previous articles, we introduced some Lua APIs for handling requests, all of which are related to Layer 7. In addition, OpenResty provides the `stream-lua-nginx-module` to handle Layer 4 traffic. It provides directives and APIs that are basically the same as those of `lua-nginx-module`.
Today, we'll talk about implementing a Memcached server with OpenResty, which takes only about 100 lines of code. This small hands-on project uses much of what we've learned earlier, and it also brings in some content from the later chapters on testing and performance optimization.
To be clear, the point of this article is not to understand what every line of code does, but to see the overall picture of how OpenResty develops a project from scratch, from the perspective of requirements, testing, development, and so on.
Original requirements and technical solutions
We know that HTTPS traffic is becoming mainstream, but some older browsers do not support session tickets, so we need to store the session ID on the server side. If local storage space is insufficient, we need a cluster for storage, and since this data can safely be discarded, Memcached is a good fit.
At this point, introducing Memcached directly should be the most straightforward solution. However, in this article, we will choose to build our own with OpenResty, for the following reasons.
- First, introducing Memcached directly would add an extra process, increasing deployment and maintenance costs.
- Second, the requirement is simple enough: only `get` and `set` operations, plus expiration support.
- Third, OpenResty has a `stream` module, which can implement this requirement quickly.
Since we want to implement a Memcached server, we need to understand its protocol first. The Memcached protocol supports both TCP and UDP; here we use TCP. Below is the specific protocol for the `get` and `set` commands.
Get

Get the value associated with a key.

Telnet command: get <key>*\r\n

Example:

get key
VALUE key 0 4
data
END
Set

Save a key-value pair to Memcached.

Telnet command: set <key> <flags> <exptime> <bytes> [noreply]\r\n<value>\r\n

Example:

set key 0 900 4
data
STORED
Besides `get` and `set`, we also need to know how "error handling" works in the Memcached protocol. Error handling is very important for server-side programs: we need to write programs that handle not only normal requests but also exceptions. For example, consider scenarios like the following:
- How do we handle it when the client sends a command other than `get` or `set`?
- What feedback do we give the Memcached client when there is an error on the server side?
Also, we want our implementation to be compatible with existing Memcached client applications. This way, users don't have to distinguish between the official Memcached version and the OpenResty implementation.
The Memcached documentation describes what should be returned in case of an error, and in exactly what format, which you can use as a reference.
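For instance, the protocol defines three generic error replies, two of which (`ERROR` and `SERVER_ERROR`) appear in the implementation below:

- ERROR\r\n: the client sent a nonexistent command name.
- CLIENT_ERROR <error>\r\n: the client's input does not conform to the protocol.
- SERVER_ERROR <error>\r\n: a server-side error prevented the command from being executed.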
Now, let's define the technical solution. We know that OpenResty's `shared dict` can be used across workers and that putting data in a `shared dict` is very similar to putting it in Memcached: both support `get` and `set` operations, and both lose the data when the process is restarted. Therefore, a `shared dict` is an appropriate way to emulate Memcached, as their principles and behavior are the same.
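To make the parallel concrete, here is a minimal sketch of the two `shared dict` calls we will build on. It can be run directly with the `resty` tool, whose `--shdict` option creates a shared dict for the one-off script (the dict name `memcached` matches the one we declare in the NGINX configuration later):

$ resty --shdict='memcached 100m' -e '
    local shdict = ngx.shared.memcached

    -- set(key, value, exptime?, flags?): exptime is in seconds, 0 means never expire
    local ok, err = shdict:set("dog", "32", 900, 0)

    -- get(key) returns the stored value and its flags (nil when the flags are 0)
    local value, flags = shdict:get("dog")
    ngx.say("dog: ", value, ", flags: ", flags or 0)'
dog: 32, flags: 0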
Test-Driven Development
The next step is to get started. However, following the idea of test-driven development, let's construct the simplest test case before writing any concrete code. Instead of using the `test::nginx` framework, which is notoriously hard to get started with, let's begin with a manual test using the `resty` command-line tool.
$ resty -e 'local memcached = require "resty.memcached"
    local memc, err = memcached:new()

    memc:set_timeout(1000) -- 1 sec

    local ok, err = memc:connect("127.0.0.1", 11212)
    local ok, err = memc:set("dog", 32)
    if not ok then
        ngx.say("failed to set dog: ", err)
        return
    end

    local res, flags, err = memc:get("dog")
    ngx.say("dog: ", res)'
This test code uses the `lua-resty-memcached` client library to initiate `connect` and `set` operations, and it assumes that the Memcached server is listening on port `11212` on the local machine.
It looks like it should work fine. You can run this code on your machine, and, not surprisingly, it will return an error like `failed to set dog: closed`, since the service is not started at this point.
At this point, the technical solution is clear: use the `stream` module to receive and send data, and use the `shared dict` to store it.

The metric for measuring completion of the requirement is also clear: run the above code and print `dog`'s actual value.
Building the framework
So what are you waiting for? Start writing code!
My habit is to build a minimal runnable code framework first and then gradually fill in the code. The advantage is that you can set many small goals during the coding process, and the test cases give you positive feedback whenever you accomplish one.
Let's start by setting up the NGINX configuration file, since `stream` and `shared dict` must be declared in it. Here is the configuration file I set up.
stream {
    lua_shared_dict memcached 100m;
    lua_package_path 'lib/?.lua;;';

    server {
        listen 11212;
        content_by_lua_block {
            local m = require("resty.memcached.server")
            m.run()
        }
    }
}
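Note that the `stream` block sits at the top level of `nginx.conf`, alongside `events`. If you want to run it standalone, a minimal surrounding skeleton might look like the following sketch (the process and logging settings here are assumptions you can adjust):

# minimal nginx.conf skeleton; an assumption, adjust paths as needed
worker_processes 1;
error_log logs/error.log;

events {
    worker_connections 1024;
}

# ... the stream block shown above goes here, at the same level as events ...

You can then start it with something like `openresty -p $PWD -c nginx.conf`.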
As you can see, several key pieces of information are in this configuration file.
- First, the code runs in the `stream` context of NGINX, not the `HTTP` context, and listens on port `11212`.
- Second, the name of the `shared dict` is `memcached`, and its size is `100m`, which cannot be changed at runtime.
- In addition, the code is located in the directory `lib/resty/memcached`, the file name is `server.lua`, and the entry function is `run()`, all of which you can deduce from `lua_package_path` and `content_by_lua_block`.
Next, it's time to build the code framework. You can try it yourself, and then let's look at my framework code together.
local new_tab = require "table.new"
local str_sub = string.sub
local re_find = ngx.re.find
local mc_shdict = ngx.shared.memcached

local _M = { _VERSION = '0.01' }

local function parse_args(s, start)
end

function _M.get(tcpsock, keys)
end

function _M.set(tcpsock, res)
end

function _M.run()
    local tcpsock = assert(ngx.req.socket(true))

    while true do
        tcpsock:settimeout(60000) -- 60 seconds
        local data, err = tcpsock:receive("*l")

        local command, args
        if data then
            local from, to, err = re_find(data, [[(\S+)]], "jo")
            if from then
                command = str_sub(data, from, to)
                args = parse_args(data, to + 1)
            end
        end

        if args then
            local args_len = #args
            if command == 'get' and args_len > 0 then
                _M.get(tcpsock, args)
            elseif command == "set" and args_len == 4 then
                _M.set(tcpsock, args)
            end
        end
    end
end

return _M
This code snippet implements the main logic of the entry function `run()`. Although I haven't done any exception handling yet, and the dependencies `parse_args`, `get`, and `set` are all empty functions, this framework already fully expresses the Memcached server's logic.
Filling in the code
Next, let's implement these empty functions in the order in which the code is executed.
First, we can parse the parameters of the Memcached command according to the Memcached protocol documentation.
local function parse_args(s, start)
    local arr = {}

    while true do
        local from, to = re_find(s, [[\S+]], "jo", {pos = start})
        if not from then
            break
        end

        table.insert(arr, str_sub(s, from, to))

        start = to + 1
    end

    return arr
end
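To make the behavior concrete, here is a quick check of the same parsing logic, runnable with `resty` (the function body is copied into the script, since `parse_args` is local to the module). Given the request line `set dog 0 900 2`, `run()` consumes the command `set` and calls `parse_args` with a start position of 4:

$ resty -e 'local str_sub = string.sub
    local re_find = ngx.re.find

    -- same logic as parse_args in the module
    local function parse_args(s, start)
        local arr = {}
        while true do
            local from, to = re_find(s, [[\S+]], "jo", {pos = start})
            if not from then
                break
            end
            table.insert(arr, str_sub(s, from, to))
            start = to + 1
        end
        return arr
    end

    local args = parse_args("set dog 0 900 2", 4)
    ngx.say(table.concat(args, ", "))'
dog, 0, 900, 2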
My advice is to first implement the most intuitive version, without any performance optimization in mind. After all, completion is always more important than perfection, and incremental optimization on top of something complete is the only way to approach perfection.
Next, let's implement the `get` function. It can query multiple keys at once, so I use a `for` loop in the following code.
function _M.get(tcpsock, keys)
    local reply = ""

    for i = 1, #keys do
        local key = keys[i]
        local value, flags = mc_shdict:get(key)
        if value then
            local flags = flags or 0
            -- note the trailing space after VALUE, required by the protocol
            reply = reply .. "VALUE " .. key .. " " .. flags .. " " .. #value .. "\r\n" .. value .. "\r\n"
        end
    end
    reply = reply .. "END\r\n"

    tcpsock:settimeout(1000) -- one second timeout
    local bytes, err = tcpsock:send(reply)
end
There is only one line of core code here: `local value, flags = mc_shdict:get(key)`, which queries the data from the `shared dict`. The rest of the code stitches the reply string together according to the Memcached protocol and finally sends it to the client.
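As an illustration, after the test case stores `32` under the key `dog` (flags default to 0, and the value is two bytes long), a `get dog` request would make this function send back the following reply, with `\r\n` written out for clarity:

VALUE dog 0 2\r\n
32\r\n
END\r\n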
Finally, let's look at the `set` function. It converts the received parameters into the `shared dict` API format, stores the data, and, in case of errors, responds according to the Memcached protocol. Note that it reads `bytes + 2` bytes from the socket: the value itself plus the trailing `\r\n` required by the protocol.
function _M.set(tcpsock, res)
    local reply = ""

    local key = res[1]
    local flags = res[2]
    local exptime = res[3]
    local bytes = res[4]

    local value, err = tcpsock:receive(tonumber(bytes) + 2)

    if str_sub(value, -2, -1) == "\r\n" then
        local succ, err, forcible = mc_shdict:set(key, str_sub(value, 1, bytes), exptime, flags)
        if succ then
            reply = reply .. "STORED\r\n"
        else
            reply = reply .. "SERVER_ERROR " .. err .. "\r\n"
        end
    else
        reply = reply .. "ERROR\r\n"
    end

    tcpsock:settimeout(1000) -- one second timeout
    local bytes, err = tcpsock:send(reply)
end
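With these three functions filled in, the test case from the beginning of this article should pass: start OpenResty with the configuration above and run the `resty` snippet again, and instead of `failed to set dog: closed` it should print:

dog: 32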
In addition, while filling in the above functions, you can use the test case to check your progress and debug with `ngx.log`. Unfortunately, `ngx.say` and `ngx.log` are all we have, as there is no breakpoint debugger in OpenResty; debugging there is still in a slash-and-burn era, waiting for further exploration.
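For example, a hypothetical debug line like the following, inserted into `run()` right after the command is parsed, writes every parsed command and its arguments to the error log:

-- hypothetical debugging aid, not part of the final code:
-- log each parsed command and its arguments to error.log
if command then
    ngx.log(ngx.ERR, "command: ", command, ", args: ", table.concat(args or {}, " "))
end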
Summary
This hands-on project is coming to an end, and finally, I'd like to leave you a question: could you take the Memcached server implementation above, get it running end to end, and pass the test case?

Today's question will probably take considerable effort, but remember that this is still a primitive version: there is no error handling, performance optimization, or automated testing yet; those will be improved later.

If you have any doubts about today's explanation or your own practice, you are welcome to leave a comment and discuss it with us. You are also welcome to share this article with your colleagues and friends so that we can practice and progress together.