APISIX: Migrate etcd Operation From HTTP to gRPC
Limitations of Apache APISIX’s HTTP-Based etcd Operations
In version 2.x, the API etcd exposed was HTTP/1.1 (we will refer to it as HTTP from now on). When etcd moved to version 3.x, it switched the protocol from HTTP to gRPC. For clients that don't support gRPC, etcd provides gRPC-gateway, which proxies HTTP requests to the new gRPC API.
When APISIX first adopted etcd, it used the etcd v2 API. In APISIX 2.0 (2020), we upgraded the etcd requirement from version 2.x to 3.x. etcd's HTTP compatibility saved us effort during that upgrade: we only needed to modify the code that called the API and processed the responses. Over the years, however, we have run into a number of problems with etcd's HTTP API, as there are subtle differences between the two paths. We realized that having a gRPC-gateway does not mean etcd can perfectly support HTTP access.
Here's a list of related issues we've encountered with etcd over the past few years:
- gRPC-gateway is disabled by default. Due to maintainer oversight, the default configuration of etcd does not enable gRPC-gateway in some distributions. So we had to add instructions to our documentation on checking whether the current etcd has gRPC-gateway enabled. See https://github.com/apache/apisix/pull/2940.
- By default, gRPC limits responses to 4 MB. etcd removes this restriction in the SDK it provides, but not in gRPC-gateway. As a result, the official etcdctl (built on that SDK) works fine, but APISIX doesn't. See https://github.com/etcd-io/etcd/issues/12576.
- The same kind of problem, this time with the maximum number of concurrent requests per connection. Go's HTTP/2 implementation has a MaxConcurrentStreams configuration that controls the number of requests a single client can send simultaneously, defaulting to 250. What client would normally send more than 250 requests at the same time? So etcd had always kept this default. However, gRPC-gateway, the "client" that proxies all HTTP requests onto a single local gRPC connection, can easily exceed this limit. See https://github.com/etcd-io/etcd/issues/14185.
- With mTLS enabled, etcd uses the same certificate as both server certificate and client certificate: it is the server certificate for gRPC-gateway, and the client certificate when gRPC-gateway accesses the gRPC interface. If the certificate enables the server auth extension but not the client auth extension, certificate verification fails. Once again, accessing etcd directly with etcdctl works fine (the certificate is not used as a client certificate in that case), but APISIX doesn't. See https://github.com/etcd-io/etcd/issues/9785.
- Also with mTLS enabled, etcd allows security policies to be configured based on the user information in the client certificate. As mentioned above, gRPC-gateway uses a fixed client certificate when accessing the gRPC interface, not the certificate originally used for the HTTP request. This feature therefore naturally cannot work, since the client certificate is fixed and never changes. See https://github.com/apache/apisix/issues/5608.
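The MaxConcurrentStreams issue above is easy to picture with a toy model (illustrative Python, not etcd or gRPC-gateway code): an HTTP/2 server caps the number of concurrent streams per connection, and a proxy that funnels every HTTP client onto one local connection quickly hits that cap.

```python
# Toy model of a single HTTP/2 connection with a per-connection stream cap.
# 250 is Go http2's default mentioned above; the rest is purely illustrative.
MAX_CONCURRENT_STREAMS = 250

class Connection:
    def __init__(self, max_streams):
        self.max_streams = max_streams
        self.open_streams = 0

    def open_stream(self):
        # The server refuses streams beyond the configured limit.
        if self.open_streams >= self.max_streams:
            raise RuntimeError("stream limit exceeded")
        self.open_streams += 1

# 300 HTTP clients each hold a request open, all proxied over ONE connection
# (as gRPC-gateway does), so the requests past the cap fail.
conn = Connection(MAX_CONCURRENT_STREAMS)
failed = 0
for _ in range(300):
    try:
        conn.open_stream()
    except RuntimeError:
        failed += 1
print(failed)  # 50
```

With independent clients each on its own connection, no single connection would come near the cap; the proxy's fan-in is what makes the default limit a problem.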
We can summarize the problems in two points:
- gRPC-gateway (and perhaps other attempts to convert HTTP to gRPC) is not a silver bullet that fixes all problems.
- The etcd developers don't put much emphasis on the HTTP-to-gRPC gateway path, and their biggest user, Kubernetes, doesn't use it at all.
To solve these problems, we need to talk to etcd directly over gRPC, bypassing the HTTP compatibility path that goes through gRPC-gateway.
Overcoming the Challenges of Migrating to gRPC
Bug in lua-protobuf
After integrating etcd's proto file, we found occasional crashes in the Lua code with a "table overflow" error. Because the crash could not be reliably reproduced, our first instinct was to look for a minimal reproducible example. Interestingly, using etcd's proto file alone could not reproduce the problem at all; the crash seemed to occur only while APISIX was running.
After some debugging, I traced the problem to lua-protobuf's parsing of the proto file's oneof fields. lua-protobuf pre-allocates the table size when parsing, and the allocated size is calculated from a particular value. There was a chance this value would be negative, in which case LuaJIT would convert it to a huge positive number during allocation, producing the "table overflow" error. I reported the issue to the author, and we maintained an internal fork with a workaround in the meantime.
The lua-protobuf author was very responsive, providing a fix the next day and releasing a new version a few days later. It turned out that when lua-protobuf cleaned up proto files that were no longer used, it missed cleaning up some fields, which produced the unreasonable negative number when oneof was processed later. That is why the problem only occurred occasionally, and why it could not be reproduced with the etcd proto file alone: that path never ran the cleanup steps that left the fields stale.
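The failure mode itself can be sketched in a few lines (hypothetical numbers, not lua-protobuf or LuaJIT internals): a pre-allocation size that goes negative, once reinterpreted as an unsigned 32-bit value, becomes an enormous allocation request that the runtime rejects.

```python
import ctypes

def preallocate_size(computed):
    # Sketch of the bug's mechanics: the computed size passes through an
    # unsigned 32-bit type, so a negative value wraps around to a huge one.
    return ctypes.c_uint32(computed).value

print(preallocate_size(8))    # 8: a sane pre-allocation
print(preallocate_size(-3))   # 4294967293: far beyond any sane table size
```

A request of that magnitude is what surfaces to the Lua side as the "table overflow" error.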
Align with the HTTP Behavior
During the migration, I found that the existing API does not return the bare execution result, but an HTTP response with a status and body, which the caller then has to process itself.
Responses obtained via gRPC therefore need to be wrapped in an HTTP response shell to match the existing processing logic. Otherwise, callers would have to change code in many places to adapt to the new gRPC response format, especially since the old HTTP-based etcd operations still need to be supported at the same time.
Although adding an extra layer just to stay compatible with HTTP responses is not ideal, we had to live with it. Beyond that, the gRPC responses themselves need some massaging. For example, when there is no matching data, HTTP returns no data at all, but gRPC returns an empty table; this too has to be adjusted to match the HTTP behaviour.
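The compatibility layer described above can be sketched as follows (illustrative Python; APISIX's actual implementation is in Lua, and the function names here are assumptions):

```python
def normalize_grpc_body(body):
    # gRPC returns an empty table where the HTTP API returned no data at all,
    # so normalize {} to None to match the HTTP behaviour.
    if body == {}:
        return None
    return body

def wrap_as_http_response(status, grpc_body):
    # Wrap a gRPC result in the HTTP-style shell existing callers expect,
    # so call sites don't need to change for the new transport.
    return {"status": status, "body": normalize_grpc_body(grpc_body)}

resp = wrap_as_http_response(200, {"kvs": [{"key": "/apisix/routes/1"}]})
empty = wrap_as_http_response(200, {})
print(resp["body"] is not None, empty["body"])  # True None
```

The point of the shell is that both transports produce the same shape, so the HTTP and gRPC code paths can coexist behind one interface.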
From Short Connection to Long Connection
In HTTP-based etcd operations, APISIX used short-lived connections, so there was no need for connection management: we simply opened a new connection whenever we needed one and closed it when we were done.
But gRPC cannot work this way. One of the primary goals of migrating to gRPC is multiplexing, which is impossible if every operation creates a new gRPC connection. Here we have to thank gRPC-go for its built-in connection management, which automatically reconnects when a connection is interrupted. With gRPC-go reusing the connection for us, only the business requirements need to be handled at the APISIX level.
APISIX's etcd operations fall into two categories: CRUD (create, read, update, delete) operations on etcd data, and configuration synchronization from the control plane. While these two could in theory share one gRPC connection, we split them into two connections for separation of responsibilities. For the CRUD connection, APISIX has to treat startup and post-startup differently, so we added a check when obtaining a connection: if there is a mismatch (the current connection was created at startup but we now need a post-startup one), we close the current connection and create a new one. For configuration synchronization, I developed a new method in which each resource uses its own stream on the existing connection to watch etcd.
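The CRUD-connection logic can be sketched like this (illustrative Python with invented names, not APISIX's Lua code): keep one long-lived connection, but recreate it when the phase it was created in no longer matches the phase the caller is in.

```python
class Conn:
    # Stand-in for a long-lived gRPC connection.
    def __init__(self, created_at_startup):
        self.created_at_startup = created_at_startup
        self.closed = False

    def close(self):
        self.closed = True

class CrudConnManager:
    # Reuses one connection; recreates it on a startup/post-startup mismatch.
    def __init__(self):
        self._conn = None

    def get(self, at_startup):
        if self._conn is not None and self._conn.created_at_startup != at_startup:
            # Phase mismatch: drop the old connection and dial a new one.
            self._conn.close()
            self._conn = None
        if self._conn is None:
            self._conn = Conn(created_at_startup=at_startup)
        return self._conn

mgr = CrudConnManager()
boot = mgr.get(at_startup=True)    # connection created during startup
later = mgr.get(at_startup=False)  # mismatch: boot is closed, new conn dialed
print(boot.closed, boot is later)  # True False
```

Within a single phase, repeated `get` calls return the same connection, which is what enables multiplexing instead of per-operation dialing.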
Benefits of Migrating to gRPC
One obvious benefit of migrating to gRPC is that the number of connections needed to operate etcd is greatly reduced. Over HTTP, APISIX could only use short-lived connections, and during configuration synchronization each resource held a separate connection.
After switching to gRPC, we can use gRPC's multiplexing, so each resource only needs a single stream instead of a whole connection. The number of connections no longer grows with the number of resources. Considering that APISIX will keep introducing new resource types (for example, version 3.1 added secrets), the reduction in connections from gRPC will only become more significant.
When using gRPC for synchronization, each process has only one connection for configuration synchronization (two, if the stream subsystem is enabled). In the figure below, we can see that the two processes have four connections: two for configuration synchronization, one used by the Admin API, and one used by the privileged agent to report server info.
For comparison, the figure below shows the 22 connections required by the original configuration synchronization method with all other parameters unchanged. Moreover, these are short-lived connections.
The only difference between these two configurations is whether gRPC is enabled for etcd operations:
```yaml
etcd:
  use_grpc: true
  host:
    - "http://127.0.0.1:2379"
  prefix: "/apisix"
  ...
```
In addition to reducing the number of connections, using gRPC to access etcd directly instead of gRPC-gateway can solve a series of architecturally limited issues such as mTLS authentication mentioned at the beginning of the article. There will also be fewer problems after using gRPC, because Kubernetes uses gRPC to operate etcd. If there is a problem, it will be discovered by the Kubernetes community.
Of course, since the gRPC method is still relatively new, APISIX will inevitably hit some new problems when operating etcd through gRPC. For now, the original HTTP-based method remains the default. Users can opt in by setting use_grpc under etcd to true in config.yaml and see whether the gRPC method works better for them. We will keep gathering feedback from various sources to improve the gRPC-based etcd operations, and once we consider the gRPC approach mature enough, we will make it the default.
To Maximize APISIX, You Need API7
You love the performance of Apache APISIX, not the overhead of managing it. You can focus on your core business without worrying about configuration, maintenance, and updates.
Our team comprises Apache APISIX creators and contributors, OpenResty and NGINX core maintainers, Kubernetes members, and industry experts in cloud infrastructure. You get the best people behind the scenes.
Do you want to accelerate your development with confidence? To maximize APISIX, you need API7. We provide in-depth support for APISIX and API management solutions tailored to your needs!
Contact us now: https://api7.ai/contact.