What Are API Gateway Policies?

With the development of cloud-native and microservices architectures, the API gateway's role as the traffic entry point is increasingly crucial. An API gateway is mainly responsible for receiving request traffic and forwarding it to the appropriate upstream services. The gateway's policies define the logic and rules for managing that traffic, and so directly determine how business traffic behaves.

What Are API Gateway Policies?

The API gateway is generally located in front of all upstream services. When a user sends a request to a service, the request will first go to the API gateway, and the API gateway will generally check several things:

Check whether the request is legitimate. For example, does it come from a user on a denylist?
Check whether the request is authenticated and the accessed content is authorized.
Check whether the request triggers specific restriction rules, such as rate limits.
Determine which upstream service the request should be forwarded to.

After this series of checks, the request is either rejected or forwarded to the correct upstream service. We call these processing rules the policies of the API gateway. The gateway administrator keeps adding them while the gateway is running, and the gateway applies them to the traffic it handles.

Taking the API gateway Apache APISIX as an example, there are two types of APISIX configuration. The first is the configuration file used at gateway startup, such as config.yaml, which holds the settings the gateway needs to start normally. The second is the set of rules and configurations, such as Route, Consumer, and Plugin, that administrators create dynamically through the Admin API at runtime. The policies of the API gateway are these rules and configurations created dynamically through the Admin API.

This article elaborates on four API gateway scenarios (authentication and authorization, security, traffic processing, and observability) and on how API gateway policies work in each of them.

Authentication and Authorization Policy

Authentication confirms the identity of the API caller, and authorization restricts the caller to the resources it is permitted to access.

[Figure: Authentication and Authorization]

For example, a passenger traveling by train uses an ID card for "authentication" to confirm their identity before entering the station. After entering the station, the passenger shows a ticket to the staff and is "authorized" to board a specific train. The primary purpose of authentication and authorization policies is to ensure that all requests forwarded to upstream services are legitimate and that they only access resources within the scope of their authority. Some standard policies are as follows:

Basic Auth

The Basic Access Authentication policy is the most straightforward access control technique. The user's HTTP client sends a request header for authentication with each request, usually Authorization: Basic <credentials>, where credentials is the Base64 encoding of the user ID and password joined by a colon (:). This method does not require login pages, cookies, or other complex settings. It authenticates requests using only the credentials in the request header, generally a username and password, which makes it simple to configure and use.

An example of a cURL request with basic authentication is as follows, with username and password:

curl -i -u 'username:password' http://127.0.0.1:8080/hello

Note that the credentials are not encrypted in transit. They are only Base64 encoded, so Basic Auth should be used together with HTTPS to protect the password.
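To make the encoding concrete, here is a minimal Python sketch of how a client could build the header value and how a server could decode it. The function names are illustrative only and are not part of any gateway's API.

```python
import base64


def basic_auth_header(user_id: str, password: str) -> str:
    """Build the Authorization header value for Basic auth."""
    raw = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover (user_id, password) from a Basic Authorization header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Split on the first colon only: the password itself may contain colons.
    user_id, _, password = decoded.partition(":")
    return user_id, password


header = basic_auth_header("username", "password")
```

Because decoding needs no secret, anyone who can read the header can read the password, which is why HTTPS matters here.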

Once this policy is enabled in the gateway, only requests that carry correct credentials are forwarded; requests with missing or wrong credentials are rejected. This policy implements API access verification at minimal cost.

Key Auth

The Key Auth policy restricts API calls by assigning keys to API callers and using those keys to control access to resources. Only requests with a correct key, carried in a request header or in the query string, can access the resources. The key can also be used to distinguish different API callers, so that different policies or resource controls can be applied to each caller. As with Basic Auth, the key is not encrypted, so make sure requests use HTTPS.

Take APISIX's key-auth plugin as an example. To use the plugin, first create a Consumer with a unique key value through the Admin API:

curl http://127.0.0.1:9180/apisix/admin/consumers \
-H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -d '
{
    "username": "jack",
    "plugins": {
        "key-auth": {
            "key": "jack-key"
        }
    }
}'

This request creates a Consumer named jack with the key value jack-key.

When enabling the plugin on a route, you configure where in the request the key is carried and under which field name. By default the location is the header and the field name is apikey, so a correct request to this route is:

curl -i http://127.0.0.1:8080/hello -H 'apikey: jack-key'

When APISIX receives this request, it parses out the key and matches it against all configured keys, finding the Consumer jack. The gateway therefore knows that the request was sent by jack. If no matching key is found, the request is treated as illegitimate and rejected.
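The matching step can be pictured with a small Python sketch. It is a conceptual illustration, not APISIX's actual implementation: the in-memory dictionary and the function name are assumptions made for the example.

```python
from typing import Optional

# Hypothetical in-memory index from key value to Consumer name.
CONSUMERS_BY_KEY = {"jack-key": "jack"}


def find_consumer(headers: dict, header_name: str = "apikey") -> Optional[str]:
    """Return the Consumer that owns the key in the request, or None."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    key = normalized.get(header_name.lower())
    if key is None:
        return None  # no key supplied: the request would be rejected
    return CONSUMERS_BY_KEY.get(key)  # None if the key is unknown
```

A gateway would reject the request when this lookup returns None, and otherwise attach the Consumer's identity so that per-caller policies can apply.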

JSON Web Token

JSON Web Token (JWT) is an open standard that defines a way to securely pass information between parties in the form of JSON objects. The JWT policy can combine authentication and authorization. After the user authenticates, a JWT is issued to the user, and the caller carries this token in all subsequent requests to show that the request is authorized.

In the API gateway, JWT authentication can be added through the JWT policy. This separates the authentication logic from the business code, so developers can focus on implementing the business logic.

Take APISIX's jwt-auth plugin as an example. The plugin must be enabled and configured on a Consumer with its unique key, the public and private keys used to sign tokens, the signing algorithm, the token expiration time, and so on. You also need to enable the plugin on the route and configure where the gateway reads the token from, such as a header, the query string, or a cookie. The plugin adds an API to the gateway for issuing tokens. Before sending a request, the caller first requests a token from this API, and then carries the token in the configured location when sending the request. When the request reaches the gateway, the gateway reads the token from that location and verifies its validity. Once verification passes, the request is forwarded to the upstream service.

Compared with the previous two policies, the JWT policy includes an expiration time. An issued token expires after a set period, whereas Basic Auth and Key Auth credentials remain valid until the server changes the password or key. In addition, a JWT can be shared between multiple services, which is beneficial for single sign-on (SSO) scenarios.
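The following Python sketch shows how signing, verification, and expiration fit together for an HS256 token, using only the standard library. It is a simplified illustration under stated assumptions: the payload field key, the function names, and the explicit now parameter are choices made for this example, and a production system should use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(key: str, secret: bytes, ttl_seconds: int, now: float) -> str:
    """Issue an HS256 token for the caller identified by `key`."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"key": key, "exp": int(now) + ttl_seconds}
    signing_input = (
        _b64url(json.dumps(header).encode()) + "." + _b64url(json.dumps(payload).encode())
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now: float):
    """Return the payload if the signature is valid and the token is unexpired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None  # signature mismatch: forged or tampered token
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return None  # this sketch only accepts HS256
        payload = json.loads(_b64url_decode(payload_b64))
        if now >= payload["exp"]:
            return None  # token has expired
        return payload
    except (ValueError, KeyError):
        return None  # malformed token
```

In real use, now would come from time.time(); passing it in keeps the example deterministic.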

OpenID Connect

OpenID Connect is an authentication and authorization method built on the OAuth 2.0 protocol that provides a relatively complete application solution. The OpenID Connect policy in the API gateway allows the upstream service to obtain user information from the identity provider (IdP), thereby protecting the API. Common IdPs include Keycloak, Ory Hydra, Okta, and Auth0. Taking Apache APISIX as an example, the OpenID Connect policy workflow in the gateway is as follows:

The client sends a request to the gateway
After receiving the request, the gateway sends an authentication request to the IdP
The user will be redirected to the page provided by the IdP to complete the login authentication
The IdP redirects back to the gateway with an authorization code
The gateway exchanges the code for an Access Token at the IdP and uses it to obtain user information
The gateway can carry user information when forwarding the request upstream

This process separates authentication and authorization from the business logic, giving the architecture a cleaner division of responsibilities.
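As a rough illustration of the redirect steps in the workflow above, the Python sketch below builds the authorization request URL and reads the authorization code from the callback. The query parameters are the standard ones for the authorization code flow; the endpoint URLs, client ID, and function names are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(
    authorization_endpoint: str, client_id: str, redirect_uri: str, state: str
) -> str:
    """URL the user is redirected to in order to log in at the IdP."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid",
        "state": state,  # random value echoed back by the IdP
    }
    return authorization_endpoint + "?" + urlencode(params)


def extract_code(callback_url: str, expected_state: str) -> Optional[str]:
    """Return the authorization code from the IdP callback, or None."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        return None  # state mismatch: do not trust this callback
    return query.get("code", [None])[0]
```

The code returned by extract_code is what the gateway would then exchange for an Access Token at the IdP's token endpoint, a step that needs a network call and is omitted here.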

For more APISIX authentication and authorization methods, please refer to API Gateway Authentication.

Security Policy

The API gateway security policy acts like a gatekeeper for API access: legitimate requests are forwarded by the gateway, and illegitimate requests are blocked at the gateway. According to the OWASP API Security Project, APIs face many possible threats and attacks. API gateway security policies can perform security verification on all API requests, which plays an essential role in protecting the API from these threats.

[Figure: API security]

The following are several important API gateway security policies:

IP Access Restrictions

The IP restriction policy controls which clients can access the API by placing specific IP addresses or CIDR ranges in an allowlist or a denylist, which helps prevent malicious access to sensitive data. Configuring this policy properly significantly improves the API's security.
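A minimal Python sketch of the check, using the standard ipaddress module, might look like the following. The function name and the rule that a denylist match takes precedence are assumptions for the example; a real gateway plugin may accept only one of the two lists at a time.

```python
import ipaddress


def is_allowed(client_ip: str, allowlist=(), denylist=()) -> bool:
    """Decide whether a client IP may access the API."""
    ip = ipaddress.ip_address(client_ip)

    def matches(entries) -> bool:
        # Each entry is a single IP ("10.0.0.5") or a CIDR range ("10.0.0.0/8").
        return any(ip in ipaddress.ip_network(entry, strict=False) for entry in entries)

    if matches(denylist):
        return False  # explicitly denied
    if allowlist:
        return matches(allowlist)  # with an allowlist, only listed clients pass
    return True  # no allowlist configured: everyone not denied passes
```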

URI Blocker

The URI blocking policy intercepts potentially dangerous API requests by matching them against a set of URI rules. For example, some attacks probe URI paths to detect potential vulnerabilities before attacking. Apache APISIX provides the uri-blocker plugin to block such requests. The plugin is configured with a list of regular expressions, and the gateway blocks any request whose URI matches one of them. For example, if we configure root.exe and admin, the plugin blocks */root.exe and */admin requests.
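The matching logic can be sketched in a few lines of Python. This is an illustration of the idea rather than the plugin's code; the function names are hypothetical, and the dot in root.exe is escaped here so that it matches a literal dot.

```python
import re


def compile_block_rules(patterns):
    """Compile the configured regular expressions once, up front."""
    return [re.compile(pattern) for pattern in patterns]


def is_blocked(uri: str, rules) -> bool:
    """A request is blocked if any rule matches anywhere in its URI."""
    return any(rule.search(uri) for rule in rules)


rules = compile_block_rules([r"root\.exe", r"admin"])
```

Because the rules match anywhere in the URI, a pattern such as admin also blocks paths like /administrator, so patterns should be as specific as the use case allows.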

CORS

CORS (cross-origin resource sharing) is the browser's security mechanism for cross-origin requests. Before sending an XHR request, the browser checks whether the request address and the current page are in the same origin. Same-origin requests are sent directly. Otherwise, for requests that do not qualify as simple requests, the browser first sends a preflight request using the OPTIONS method. The response headers carry CORS-related settings, such as the methods allowed and whether credentials may be carried in cross-origin requests. The browser decides whether to send the actual request based on this information. For details, please refer to CORS.

Generally, the backend service sets the CORS response headers. If there are many services, however, it is safer and more convenient to handle CORS uniformly at the gateway level. The CORS policy can apply different cross-origin settings to different APIs, and upstream services no longer need to handle this logic.
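As a hedged sketch of what handling CORS at the gateway can mean, the Python function below computes the response headers for a preflight request. The header names are the standard CORS ones; the function name and its per-route allowlists are assumptions for the example.

```python
def preflight_headers(origin, requested_method, allowed_origins, allowed_methods) -> dict:
    """Compute CORS headers for a preflight (OPTIONS) response.

    `origin` is the value of the Origin request header and `requested_method`
    the value of Access-Control-Request-Method. An empty result means the
    browser will not send the actual request.
    """
    if origin not in allowed_origins or requested_method not in allowed_methods:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(allowed_methods)),
        # The response depends on the Origin header, so caches must key on it.
        "Vary": "Origin",
    }
```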

CSRF

CSRF (cross-site request forgery) is an attack that causes end users to perform unintended actions on a website where they are already authenticated. The attack is generally accompanied by social engineering, such as sending a malicious link to the victim via email. When the user clicks the link, the attacker uses the victim's authenticated identity to perform operations on the website. From the website's perspective, these actions look legitimate because the user is already logged in.

Usually, the website's backend service needs additional middleware to handle this logic, and the prevention methods follow established standards. The CSRF policy performs this protection at the gateway layer, which prevents the attack and reduces the complexity of upstream services.

Traffic Processing Policy

Traffic processing policies mainly ensure that the load the API gateway forwards to upstream services stays within a healthy range. They also rewrite requests on demand before they are forwarded, and responses before they are returned to the caller. This type of policy covers functionality such as rate limiting, circuit breaking, caching, and rewriting.

Rate Limiting

With limited resources, there is a limit to the service capacity an API can provide. If calls exceed this limit, the upstream service may crash and trigger a chain reaction. Rate limiting avoids such cases and helps protect APIs against DDoS (distributed denial-of-service) attacks.

In the rate-limiting policy, we configure a time window and the maximum number of requests allowed within it. The API gateway rejects requests beyond that maximum and returns a rate-limit error. Further requests are allowed only once the request count drops below the limit or the next time window opens.

The key used for counting requests can be a request variable or a particular request header. For example, counting by client IP applies a separate limit to each IP, which gives better flexibility.
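The window-and-counter idea can be sketched as a fixed-window limiter in Python. This is one simple way to implement the behavior described above, not the algorithm of any particular gateway; the class name is hypothetical, and the time is passed in explicitly to keep the example deterministic.

```python
class FixedWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each key."""

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._counters = {}  # key -> (window index, request count)

    def allow(self, key: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        current_window, count = self._counters.get(key, (window, 0))
        if current_window != window:
            # A new time window has opened: start counting from zero.
            current_window, count = window, 0
        if count >= self.max_requests:
            self._counters[key] = (current_window, count)
            return False  # over the limit: reject with a rate-limit error
        self._counters[key] = (current_window, count + 1)
        return True
```

Here the key would typically be the client IP or the value of a chosen header, so that each caller gets its own counter.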

Circuit Breaking

The API circuit-breaking policy provides circuit-breaking capability for upstream services. When using this policy, you set which status codes count as healthy and unhealthy so that the gateway can judge the state of the upstream service. You also set the request thresholds that trigger a break or restore health.

When the upstream service continuously returns unhealthy status codes, the gateway breaks the circuit to it for some time. During this period, the gateway no longer forwards requests to the upstream and returns an error directly. This prevents an "avalanche" in which a failing upstream keeps receiving requests, and protects business services. After the period ends, the gateway tries to forward requests to the upstream again. If the upstream still returns an unhealthy status code, the gateway breaks again for a longer time (doubled). Once the upstream returns a certain number of healthy status codes, the gateway considers it healthy again and resumes forwarding traffic to the upstream node.

This policy also requires setting the status code and message to return while the upstream is unhealthy. In that state, requests are answered directly at the gateway level, which protects the stability of business services.
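The state machine described in this section can be sketched in Python as follows. It is a simplified model under stated assumptions: the class and parameter names are hypothetical, and the rule that a single unhealthy response re-opens the breaker while it is recovering is one reading of the doubling behavior, not a description of a specific plugin.

```python
class CircuitBreaker:
    def __init__(self, unhealthy_statuses, failures_to_break, successes_to_heal,
                 base_break_seconds, max_break_seconds):
        self.unhealthy_statuses = set(unhealthy_statuses)
        self.failures_to_break = failures_to_break
        self.successes_to_heal = successes_to_heal
        self.base_break_seconds = base_break_seconds
        self.max_break_seconds = max_break_seconds
        self.failures = 0
        self.successes = 0
        self.break_seconds = 0  # 0 means the upstream is considered healthy
        self.open_until = 0.0

    def allow(self, now: float) -> bool:
        """False while the breaker is open: the gateway answers with an error itself."""
        return now >= self.open_until

    def record(self, status: int, now: float) -> None:
        """Record the status code the upstream returned for a forwarded request."""
        if status in self.unhealthy_statuses:
            self.successes = 0
            self.failures += 1
            # While recovering from a break, one failure is enough to break again.
            threshold = 1 if self.break_seconds > 0 else self.failures_to_break
            if self.failures >= threshold:
                if self.break_seconds == 0:
                    self.break_seconds = self.base_break_seconds
                else:
                    self.break_seconds = min(self.break_seconds * 2, self.max_break_seconds)
                self.open_until = now + self.break_seconds
                self.failures = 0
        else:
            self.failures = 0
            self.successes += 1
            if self.successes >= self.successes_to_heal:
                self.break_seconds = 0  # healthy again: reset the backoff
                self.successes = 0
```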

Traffic Splitting

The traffic splitting policy dynamically controls the proportion of traffic forwarded to different upstream services. It is useful for canary releases and blue-green deployments.

A canary release routes only some requests to the new service, while the rest continue to use the old service. If the new service remains stable, you can increase the proportion and gradually shift requests until all traffic goes to the new service and the upgrade is complete.

Blue-green release is another release mode, which makes it possible to release a new version even during peak periods without interrupting the service. The old and new versions of the service coexist. Typically, the production environment (blue) is copied into an identical but separate container (green). New updates are released to the green environment, and both green and blue then run in production. The green environment can be tested and repaired while users are still accessing the blue system. Requests can then be redirected to the green environment using a load-balancing policy. The blue environment can be kept on standby as a disaster recovery option or used for the next update.

APISIX supports both releases through the traffic-split plugin, making business deployment more convenient and reliable.
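Proportional forwarding boils down to a weighted random choice, which the Python sketch below illustrates. The upstream names and the function name are hypothetical, and the random generator is passed in so that the behavior can be reproduced in tests.

```python
import random


def pick_upstream(weights: dict, rng: random.Random) -> str:
    """Pick an upstream name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[name] for name in names], k=1)[0]


# Canary release: roughly 10% of requests go to the new version.
canary_weights = {"service-v1": 90, "service-v2": 10}

# Blue-green switch: all traffic moves to green in one step.
green_only = {"blue": 0, "green": 100}
```

Raising the weight of service-v2 step by step corresponds to gradually shifting traffic during a canary release.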

Response Rewrite

In modern microservice architectures, services use many different protocols, and there is no uniform response data format. If the conversion logic is implemented separately in each service, the result is redundant code that is challenging to manage. Response rewriting policies can handle protocol conversion, body rewriting, and similar logic at the gateway.

APISIX provides the response-rewrite plugin to modify the body or headers returned by upstream services. It supports adding or deleting response headers, setting rules to modify the response body, and so on. It is helpful in scenarios such as setting CORS response headers for cross-origin requests or setting the Location header for redirection.

For request rewriting, APISIX provides the proxy-rewrite plugin, which modifies the request that is proxied to the upstream service. You can rewrite the request URI, method, headers, and so on, which is convenient for business processing in many scenarios.

Fault Injection

Fault injection is a software testing method that verifies a system's behavior by intentionally introducing errors. Testing is usually done before deployment to ensure there are no potential failures in the production environment. In chaos testing scenarios, it is necessary to inject errors into a service to verify its reliability.

Software fault injection can be categorized into compile-time injection and runtime injection. Compile-time injection changes code or logic while the software is being written. Runtime injection tests the behavior of software by introducing errors into its running environment. The fault injection policy uses runtime injection to simulate faults in an application's network requests. You select a ratio in the policy, and that proportion of requests executes the fault logic, such as responding after a delay or directly returning a configured error code and error message to the caller. This makes the software more resilient: developers can see possible error situations in advance and fix the problems before release.
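The ratio-based decision can be sketched in Python as below. This covers only the abort case (returning a configured error), and the function name, parameters, and injected random generator are assumptions made for the example.

```python
import random


def maybe_inject_fault(percentage: float, abort_status: int, abort_body: str,
                       rng: random.Random):
    """Return (status, body) to answer the request with a fault, or None.

    `percentage` is the share of requests (0 to 100) that should fail.
    None means the request is proxied to the upstream as usual.
    """
    if rng.random() * 100 < percentage:
        return abort_status, abort_body
    return None
```

A delay fault would follow the same pattern, sleeping for the configured time before forwarding the request instead of returning an error.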

Protocol Conversion

Protocol conversion policies convert between standard protocols, such as ordinary HTTP requests and gRPC. Apache APISIX provides the grpc-transcode plugin, which transcodes an HTTP request and forwards it to a gRPC service. The response is returned to the client in HTTP format. In this way, the client only needs to speak HTTP and does not need to know that the upstream service uses gRPC.

Observability Policy

Observability refers to the ability to measure a system's operating status through its output data. In a simple system with relatively few components, a bug can be found by analyzing the status of each component when an error occurs. In a large-scale distributed system, however, the number of microservice components is vast, and checking them one by one is unrealistic. In that case, the system needs to be observable. Observability provides visibility into an extensive system, and when a problem occurs, it gives engineers the control they need to pinpoint it.

[Figure: API Gateway Observability policy]

Data collection can be implemented within application components. Because the API gateway is the entrance for all traffic, implementing observability features there reflects how the system's APIs are used. An API gateway observability policy can help every team:

The engineering team can monitor and resolve API issues.
The product team can understand the usage of the API to discover the business value behind it.
Sales and growth teams can monitor API usage, watch for business opportunities and ensure APIs deliver the correct data.

Observability policies are generally divided into three categories according to output data type: Tracing, Metrics, and Logging.

Tracing

In a large-scale distributed system, the relationships between services are intricate. Tracing (call-chain tracking) follows the complete call chain of a request and supports dependency analysis among applications and request statistics in distributed applications. When a problem occurs, it helps engineers determine the scope and location of the investigation.

The tracing policy integrates distributed tracing systems with the API gateway to collect and record information. For example, services such as Zipkin and SkyWalking can be integrated into the API gateway to collect data on requests and interservice communication. This policy lets engineers know which logs to look at for a specific session or set of related API calls, and narrows the scope of troubleshooting.

Metrics

Metrics are observation data about the software collected over intervals of time while the service is running. The data is structured by default, which makes querying and visualization easy. Analyzing it reveals the system's operating status.

The metrics policy can integrate services such as Prometheus or Datadog with the API gateway to provide monitoring and alerting capabilities for the system. The policy collects data during the gateway's operation through various gateway interfaces and reports it to Prometheus or Datadog. By visualizing this data, developers can see the running status of the gateway, statistics on API requests, and other statistical graphs.

Logging

A log is a text record of system events at specific times. Logs are the first place to check when a problem occurs in the system. Engineers rely on log content to find out what happened and to identify a solution. Log content is generally divided into the API request log and the gateway's own running log. The API request log records every API request handled while the gateway is running. Engineers can use these records to learn about API access and to discover and troubleshoot abnormal requests in time. The gateway's running log records all events that occur while the gateway is working. When the API gateway behaves abnormally, this log is an essential basis for troubleshooting.

The logging policy can store the API gateway's logs on the server's disk or push them to other log servers, such as an HTTP, TCP, or UDP log server, or to other systems such as Kafka, RocketMQ, and ClickHouse.

Summary

This article introduced four categories of policies commonly used in API gateways: authentication and authorization, security, traffic processing, and observability. The API gateway receives request traffic in front of all upstream services, controls where and how requests are forwarded, and directly rejects or restricts insecure and unauthorized requests. All of these behaviors are configured through API gateway policies.

For more API gateway-related information, please visit blogs, or visit API7.ai for commercial support.