What Is gRPC? How Does It Work With APISIX?

September 28, 2022

Ecosystem

What is gRPC

gRPC is an RPC framework open-sourced by Google that aims to unify the way services communicate. It uses HTTP/2 as its transport protocol and Protocol Buffers as its interface description language, and it can automatically generate the code for calls between services.
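For illustration, here is the canonical "Greeter" service definition from the gRPC documentation (not APISIX code). The service is described once in a .proto file, and protoc then generates client and server stubs for it in each supported language:

```proto
syntax = "proto3";

package helloworld;

// The interface is described once, here; protoc generates the
// client and server code for every supported language from it.
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
```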

Dominance of gRPC

gRPC has become the de facto standard among RPC frameworks, thanks to Google's exceptional influence on developers and on the cloud-native ecosystem.

Want to invoke etcd functions? gRPC!

Want to send OpenCensus data? gRPC!

Want to use RPC in a microservice implemented in Go? gRPC!

The dominance of gRPC is so strong that if you don't choose gRPC as your RPC framework, you have to give a solid reason why; otherwise, someone will always ask why you didn't pick the mainstream gRPC. Even Alibaba, which has vigorously promoted its own RPC framework Dubbo, dramatically revised the protocol design in the latest version, Dubbo 3, turning it into a gRPC variant compatible with both gRPC and Dubbo 2. In fact, rather than an upgrade of Dubbo 2, Dubbo 3 reads more like an acknowledgment of gRPC's supremacy.

Many services that provide gRPC interfaces also provide corresponding HTTP interfaces, but such interfaces often exist only for compatibility. The gRPC version offers a much better user experience: if you can access a service through gRPC, you can directly import the corresponding SDK, while if you can only use the plain HTTP API, you are usually pointed at a documentation page and left to implement the HTTP calls yourself. Although an SDK can be generated for HTTP access from an OpenAPI spec, only a few projects take their HTTP users as seriously as their gRPC users, because HTTP support is a low priority for them.

Should I use gRPC

APISIX uses etcd as its configuration center. Since v3, etcd has migrated its interface to gRPC. However, no library in the OpenResty ecosystem supports gRPC, so APISIX can only call etcd's HTTP APIs. The HTTP APIs of etcd are provided through gRPC-gateway: in essence, etcd runs an HTTP-to-gRPC proxy on its server side, and external HTTP requests are converted into gRPC requests by gRPC-gateway. After running this method of communication for a few years, we have found some problems in the interaction between the HTTP API and the gRPC API. Having a gRPC-gateway does not mean that HTTP access is perfectly supported; there are still subtle differences.

Here's a list of related issues we've encountered with etcd over the past few years:

  1. gRPC-gateway is disabled by default. Owing to an oversight by the maintainers, some etcd distributions do not enable gRPC-gateway in their default configuration. So we had to add instructions to our documentation for checking whether the current etcd has gRPC-gateway enabled. See https://github.com/apache/apisix/pull/2940.
  2. By default, gRPC limits responses to 4 MB. etcd removes this restriction in the SDK it provides but forgot to remove it in gRPC-gateway. As a result, the official etcdctl (built on that SDK) works fine, but APISIX doesn't. See https://github.com/etcd-io/etcd/issues/12576.
  3. The same kind of problem, this time with the maximum number of concurrent requests on a single connection. Go's HTTP/2 implementation has a MaxConcurrentStreams configuration that controls the number of simultaneous requests a single client can send, defaulting to 250. What client would normally send more than 250 requests at the same time? So etcd has always used this default. However, gRPC-gateway, the "client" that proxies all HTTP requests to the local gRPC interface, may exceed this limit. See https://github.com/etcd-io/etcd/issues/14185.
  4. When mTLS is enabled, etcd uses the same certificate as both the server certificate and the client certificate: it serves as the server certificate for gRPC-gateway and as the client certificate when gRPC-gateway accesses the gRPC interface. If the server auth extension is enabled on the certificate but the client auth extension is not, certificate verification fails. Once again, accessing etcd directly with etcdctl works fine (as the certificate is not used as a client certificate in that case), but APISIX doesn't. See https://github.com/etcd-io/etcd/issues/9785.
  5. Also with mTLS enabled, etcd allows security policies to be configured based on the user information in the client certificate. As mentioned above, gRPC-gateway uses a fixed client certificate when accessing the gRPC interface rather than the certificate originally used to access the HTTP interface. This feature therefore cannot work, since the client certificate is fixed and never changes. See https://github.com/apache/apisix/issues/5608.
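Issue 4 can be reproduced with nothing but Go's standard library. The sketch below (an illustration, not etcd's actual code) builds a self-signed certificate that carries only the "server auth" extended key usage, then verifies it twice: once as a server certificate and once as a client certificate, the way gRPC-gateway reuses it:

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// checkUsages builds a self-signed certificate with only the "server auth"
// extended key usage (like the etcd certificate in issue 4) and verifies it
// once as a server certificate and once as a client certificate.
func checkUsages() (serverErr, clientErr error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "etcd"},
		NotBefore:             time.Now(),
		NotAfter:              time.Now().Add(time.Hour),
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		panic(err)
	}
	roots := x509.NewCertPool()
	roots.AddCert(cert)

	// Verifying against the usage the certificate actually declares works...
	_, serverErr = cert.Verify(x509.VerifyOptions{
		Roots:     roots,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	})
	// ...but treating the same certificate as a client certificate fails,
	// which is what happens when gRPC-gateway reuses it to call the local
	// gRPC interface.
	_, clientErr = cert.Verify(x509.VerifyOptions{
		Roots:     roots,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return serverErr, clientErr
}

func main() {
	s, c := checkUsages()
	fmt.Println("as server cert:", s)
	fmt.Println("as client cert:", c)
}
```

Running this prints a nil error for the server-auth check and an "incompatible key usage" error for the client-auth check, matching the behavior described above.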

We can summarize the problems in two points:

  1. gRPC-gateway (and perhaps other attempts to convert HTTP to gRPC) is not a silver bullet that fixes all problems.
  2. The developers of etcd don't put enough emphasis on the HTTP API. And their biggest user, Kubernetes, doesn't use it at all.

We are not singling out a specific piece of software here; etcd is just a typical example of a gRPC user. All services that use gRPC as their primary RPC framework have similar limitations in their HTTP support.

How does APISIX 3.0 solve this problem

As the saying goes, "if the mountain won't come to Muhammad, then Muhammad must go to the mountain." If we implement a gRPC client for OpenResty, we can communicate directly with gRPC services.

Considering workload and stability, we decided to build on a commonly used gRPC library instead of reinventing the wheel. We examined the following gRPC implementations:

  1. NGINX's gRPC support. NGINX does not expose its gRPC code to external users, not even through a high-level API. To use it, you can only copy a few low-level functions and then wrap them in a high-level interface yourself, which means additional work.
  2. The official C++ gRPC library. Since our system is based on NGINX, integrating C++ libraries can be a bit complicated. In addition, this library's dependencies are close to 2 GB, which would be a big challenge for building APISIX.
  3. The official Go implementation of gRPC. Go has a powerful toolchain that lets us build projects quickly. Unfortunately, this implementation's performance is far from the C++ version's. So we looked at another Go implementation, https://github.com/bufbuild/connect-go/, but its performance is not better than the official version's either.
  4. A Rust gRPC implementation. Such a library would be a natural choice when balancing dependency management and performance. Unfortunately, we are unfamiliar with Rust and weren't willing to bet on it.

Considering that a gRPC client's operations are basically all I/O-bound, raw performance is not the primary concern. After careful consideration, we implemented our client on top of Go's gRPC library.

To coordinate with Lua's coroutine scheduler, we wrote an NGINX C module: https://github.com/api7/grpc-client-nginx-module. At first, we wanted to integrate the Go code into this C module by compiling it into a statically linked library through cgo. However, we found that because the Go runtime is multi-threaded, a child process does not inherit all of the parent process's threads after forking, so there is no way to fit this into NGINX's master-worker multi-process architecture. Instead, we compiled the Go code into a shared library and load it into the worker process at runtime.

We implemented a task queue mechanism to coordinate Go's goroutines with Lua's coroutines. When Lua code initiates a gRPC I/O operation, it submits a task to the Go side and suspends itself. A goroutine executes the task and writes the result to a queue. A background thread on the NGINX side consumes the result, reschedules the corresponding Lua coroutine, and continues executing the Lua code. In this way, gRPC I/O operations look no different from ordinary socket operations to the Lua code.
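The mechanism above can be sketched in plain Go. This is a simplified illustration, not the actual grpc-client-nginx-module code: channels stand in for the real task/result queues, and a map entry stands in for resuming the suspended Lua coroutine:

```go
package main

import "fmt"

// result of one finished task, as it would appear on the queue that the
// NGINX-side background thread consumes.
type result struct {
	id  int
	out string
}

// runTasks models the mechanism: ops are the gRPC I/O operations the Lua
// side submits, keyed by task id. Each is executed by its own goroutine,
// and every result is funneled through a single queue, the way the NGINX
// background thread would consume them to reschedule the matching Lua
// coroutine.
func runTasks(ops map[int]func() string) map[int]string {
	queue := make(chan result)
	for id, op := range ops {
		go func(id int, op func() string) {
			queue <- result{id: id, out: op()} // a goroutine executes the task
		}(id, op)
	}
	// Background consumer: collect each result and "resume" the waiting
	// coroutine (here: just record the value keyed by task id).
	resumed := make(map[int]string, len(ops))
	for range ops {
		r := <-queue
		resumed[r.id] = r.out
	}
	return resumed
}

func main() {
	out := runTasks(map[int]func() string{
		1: func() string { return "etcd watch event" },
		2: func() string { return "etcd range response" },
	})
	fmt.Println(out[1], "/", out[2])
}
```

The single result queue is the key design point: however many I/O operations are in flight, the NGINX side only has one place to poll for completions.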

Now most of the work in the NGINX C module is done. All we need to do is take etcd's .proto file (which defines its gRPC interface), modify it, and load it in Lua to get the following etcd client:

local cjson = require("cjson")
local gcli = require("resty.grpc")
assert(gcli.load("t/testdata/rpc.proto"))
local conn = assert(gcli.connect("127.0.0.1:2379"))
local st, err = conn:new_server_stream("etcdserverpb.Watch", "Watch",
                                        {create_request =
                                            {key = ngx.var.arg_key}},
                                        {timeout = 30000})
if not st then
    ngx.status = 503
    ngx.say(err)
    return
end
for i = 1, (ngx.var.arg_count or 10) do
    local res, err = st:recv()
    if not res then
        ngx.status = 503
        ngx.say(err)
        break
    end
    ngx.log(ngx.WARN, "received ", cjson.encode(res))
end

This gRPC-based implementation is much simpler than lua-resty-etcd, an etcd HTTP client that takes about 1,600 lines of Lua code alone.

Of course, we are still a long way from replacing lua-resty-etcd. To fully connect with etcd, grpc-client-nginx-module also needs to complete the following functions:

  1. mTLS support
  2. Support gRPC metadata configuration
  3. Support connection parameter configuration (e.g. MaxConcurrentStreams and MaxRecvMsgSize)
  4. Support for requests from L4

Fortunately, we have laid the foundation, and supporting these features follows naturally from it.

grpc-client-nginx-module will be integrated into APISIX 3.0, and APISIX users will then be able to call this module's methods in their APISIX plugins to communicate directly with gRPC services.

With native support for gRPC, APISIX will get a better etcd experience and open the door to features such as gRPC health checks and gRPC-based OpenTelemetry data reporting.

We are excited to see more gRPC-based features of APISIX in the future!
