Handling Layer 4 Traffic and Implementing a Memcached Server with OpenResty


November 10, 2022

OpenResty (NGINX + Lua)

In a few previous articles, we introduced some Lua APIs for handling requests, all of which operate at Layer 7. In addition, OpenResty provides the stream-lua-nginx-module module to handle Layer 4 traffic. Its directives and APIs are essentially the same as those of lua-nginx-module.

Today, we'll talk about implementing a Memcached server with OpenResty, which only needs about 100 lines of code. In this small hands-on project, we'll use much of what we've learned earlier, and we'll also touch on some content from the later chapters on testing and performance optimization.

To be clear, the point of this article is not to understand what every line of code does, but to get a full view of how to develop an OpenResty project from scratch, from the perspectives of requirements, testing, and development.

Original requirements and technical solutions

We know that HTTPS traffic is becoming mainstream, but some older browsers do not support session tickets, so we need to store the session ID on the server side. If local storage space is insufficient, we need a cluster for storage; and since the data can safely be discarded, Memcached is a good fit.

At this point, deploying the official Memcached would be the most straightforward solution. However, in this article, we will choose to reinvent the wheel with OpenResty, for the following reasons.

  • First, introducing Memcached directly would introduce an additional process, increasing deployment and maintenance costs.
  • Second, the requirement is simple enough, requiring only get and set operations, and supporting expiration.
  • Third, OpenResty has a stream module, which can quickly implement this requirement.

Since we want to implement a Memcached server, we need to understand its protocol first. The Memcached protocol supports both TCP and UDP; here we use TCP. Below are the specifics of the get and set commands.

Get a value by key:

Telnet command: get <key>*\r\n

get key
VALUE key 0 4
data
END

Save a key-value pair to Memcached:

Telnet command: set <key> <flags> <exptime> <bytes> [noreply]\r\n<value>\r\n

set key 0 900 4
data

Besides get and set, we also need to know how "error handling" works in the Memcached protocol. Error handling is very important for server-side programs: we need to write programs that handle not only normal requests but also exceptions. For example, consider scenarios like the following:

  • What do we do when the client sends a command other than get or set?
  • What kind of feedback do we give the Memcached client when there is an error on the server side?

Also, we want our server to be compatible with existing Memcached client applications. This way, users don't have to distinguish between the official Memcached and the OpenResty implementation.

The following figure from the Memcached documentation describes what should be returned in case of an error and the exact format, which you can use as a reference.

[Figure: Memcached error response formats]
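To summarize the relevant part of the protocol documentation, the error replies a server can send take three forms (the text after each format below is my annotation, not part of the wire protocol):

ERROR\r\n                    the client sent a nonexistent command
CLIENT_ERROR <error>\r\n     the input does not conform to the protocol
SERVER_ERROR <error>\r\n     the server failed to process the request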

Now, let's define the technical solution. We know that OpenResty's shared dict can be used across workers and that putting data in a shared dict is very similar to putting it in Memcached. They both support get and set operations, and the data is lost when the process is restarted. Therefore, it is appropriate to use a shared dict to emulate Memcached, as their principles and behavior are the same.
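As a quick sanity check of this idea, the resty CLI can create a shared dict on the fly with its --shdict option. The snippet below is only a sketch showing that get, set, and per-key expiration are already built into the shared dict API:

$ resty --shdict 'memcached 10m' -e 'local dict = ngx.shared.memcached
    -- store value "32" under key "dog", expiring after 900 seconds
    local ok, err = dict:set("dog", "32", 900)
    ngx.say(dict:get("dog"))   -- prints 32
'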

Test-Driven Development

The next step is to start working on it. However, following the idea of test-driven development, let's construct the simplest test case before we write the actual code. Instead of using the `test::nginx` framework, which is notoriously hard to get started with, let's begin with a manual test using the `resty` command-line tool.

$ resty -e 'local memcached = require "resty.memcached"
    local memc, err = memcached:new()

    memc:set_timeout(1000) -- 1 sec
    local ok, err = memc:connect("127.0.0.1", 11212)
    local ok, err = memc:set("dog", 32)
    if not ok then
        ngx.say("failed to set dog: ", err)
        return
    end

    local res, flags, err = memc:get("dog")
    ngx.say("dog: ", res)'

This test code uses the lua-resty-memcached client library to initiate connect and set operations, and assumes that the Memcached server listens on port 11212 on the local machine.

It looks like it should work fine. You can run this code on your machine, and, not surprisingly, it will return an error like failed to set dog: closed, since the service is not started at this point.

At this point, your technical solution is clear: use the stream module to receive and send data and use the shared dict to store it.

The metric to measure the completion of the requirement is clear: run the above code and print the dog's actual value.

Building the framework

So what are you waiting for? Start writing code!

My habit is to build a minimal runnable code framework first and then gradually populate the code. The advantage of this is that you can set many small goals during the coding process, and the test cases will give you positive feedback when you accomplish a small goal.

Let's start by setting up the NGINX configuration file since stream and shared dict should be preset in it. Here is the configuration file I set up.

stream {
    lua_shared_dict memcached 100m;
    lua_package_path 'lib/?.lua;;';
    server {
        listen 11212;
        content_by_lua_block {
            local m = require("resty.memcached.server")
            m.run()
        }
    }
}

As you can see, several key pieces of information are in this configuration file.

  • First, the code runs in the stream context of NGINX, not the HTTP context, and is listening on port 11212.
  • Second, the name of the shared dict is memcached , and the size is 100M, which cannot be changed at runtime.
  • In addition, the code is located in the directory lib/resty/memcached, the file name is server.lua, and the entry function is run(), which you can find from lua_package_path and content_by_lua_block.
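Putting these pieces together, the working directory is assumed to look like this (inferred from lua_package_path and the module name above; nginx.conf stands for wherever your configuration lives):

.
├── nginx.conf
└── lib
    └── resty
        └── memcached
            └── server.lua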

Next, it's time to build the code framework. You can try it yourself, and then let's look at my framework code together.

local new_tab = require "table.new"
local str_sub = string.sub
local re_find = ngx.re.find
local mc_shdict = ngx.shared.memcached

local _M = { _VERSION = '0.01' }

local function parse_args(s, start)
end

function _M.get(tcpsock, keys)
end

function _M.set(tcpsock, res)
end

function _M.run()
    local tcpsock = assert(ngx.req.socket(true))

    while true do
        tcpsock:settimeout(60000) -- 60 seconds
        local data, err = tcpsock:receive("*l")

        local command, args
        if data then
            local from, to, err = re_find(data, [[(\S+)]], "jo")
            if from then
                command = str_sub(data, from, to)
                args = parse_args(data, to + 1)
            end
        end

        if args then
            local args_len = #args
            if command == 'get' and args_len > 0 then
                _M.get(tcpsock, args)
            elseif command == "set" and args_len == 4 then
                _M.set(tcpsock, args)
            end
        end
    end
end

return _M

This code snippet implements the main logic of the entry function run(). Although I haven't done any exception handling and the dependencies parse_args, get, and set are all empty functions, this framework already entirely expresses the Memcached server's logic.

Filling code

Next, let's implement these empty functions in the order in which the code is executed.

First, we can parse the parameters of the Memcached command according to the Memcached protocol documentation.

local function parse_args(s, start)
    local arr = {}

    while true do
        local from, to = re_find(s, [[\S+]], "jo", {pos = start})
        if not from then
            break
        end

        table.insert(arr, str_sub(s, from, to))
        start = to + 1
    end

    return arr
end

My advice is to implement a version most intuitively first, without any performance optimization in mind. After all, completion is always more important than perfection, and incremental optimization based on completion is the only way to get closer to perfection.
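To verify the parsing behavior, the function can be exercised standalone with the resty CLI. The body is inlined here as a sketch, since the module file is not on resty's default package path:

$ resty -e 'local str_sub = string.sub
    local re_find = ngx.re.find

    local function parse_args(s, start)
        local arr = {}
        while true do
            -- find the next run of non-whitespace, starting at position `start`
            local from, to = re_find(s, [[\S+]], "jo", {pos = start})
            if not from then
                break
            end
            table.insert(arr, str_sub(s, from, to))
            start = to + 1
        end
        return arr
    end

    -- parse everything after the 3-character command "set"
    local args = parse_args("set dog 0 900 4", 4)
    ngx.say(table.concat(args, " "))   -- prints: dog 0 900 4'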

Next, let's implement the get function. It can query multiple keys at once, so I use a for loop in the following code.

function _M.get(tcpsock, keys)
    local reply = ""

    for i = 1, #keys do
        local key = keys[i]
        local value, flags = mc_shdict:get(key)
        if value then
            local flags = flags or 0
            reply = reply .. "VALUE " .. key .. " " .. flags .. " "
                    .. #value .. "\r\n" .. value .. "\r\n"
        end
    end
    reply = reply .. "END\r\n"

    tcpsock:settimeout(1000)  -- one second timeout
    local bytes, err = tcpsock:send(reply)
end

There is only one line of core code here: `local value, flags = mc_shdict:get(key)`, which queries the data from the shared dict. The rest of the code just follows the Memcached protocol to stitch the strings together and send the reply to the client.
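For instance, once the test case above has stored dog = 32, the stitched reply for `get dog` would look like this on the wire (flags default to 0, and #value is 2, since the value is the two-character string "32"):

VALUE dog 0 2
32
END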

Finally, let's look at the set function. It converts the received parameters into the shared dict API format, stores the data, and in case of errors, handles them according to Memcached's protocol.

function _M.set(tcpsock, res)
    local reply = ""

    local key = res[1]
    local flags = res[2]
    local exptime = res[3]
    local bytes = res[4]

    local value, err = tcpsock:receive(tonumber(bytes) + 2)

    if str_sub(value, -2, -1) == "\r\n" then
        local succ, err, forcible = mc_shdict:set(key, str_sub(value, 1, bytes), exptime, flags)
        if succ then
            reply = reply .. "STORED\r\n"
        else
            reply = reply .. "SERVER_ERROR " .. err .. "\r\n"
        end
    else
        reply = reply .. "ERROR\r\n"
    end

    tcpsock:settimeout(1000)  -- one second timeout
    local bytes, err = tcpsock:send(reply)
end

In addition, you can use the test case to check your progress, and debug with ngx.log, while filling in the functions above. Unfortunately, since OpenResty has no breakpoint debugger, we have to debug with ngx.say and ngx.log, a rather primitive state of affairs that still awaits better tooling.
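For example, a temporary log line inside set() can confirm what was parsed. This is only a debugging sketch, assumed to sit right after the four res[] assignments; the output goes to NGINX's error.log, and the line should be removed once things work:

-- temporarily dump the parsed set arguments to error.log
ngx.log(ngx.ERR, "set key: ", key, ", flags: ", flags,
        ", exptime: ", exptime, ", bytes: ", bytes)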


This hands-on project comes to an end here. Finally, I'd like to leave you with a question: could you take the Memcached server implementation above, get it running end to end, and pass the test case?

Today's question will probably take considerable effort, and what we have is still a primitive version: there is no error handling, performance optimization, or automated testing. These will be improved later.

If you have any doubts about today's explanation or your practice, you are welcome to leave a comment and discuss it with us. You are also welcome to share this article with your colleagues and friends so that we can practice and progress together.