Use APISIX, Prometheus, and KEDA to Scale Applications Elastically in Kubernetes
February 17, 2023
Introduction
The amount of traffic that reaches an application varies over time. For example, an online shopping application is much busier during the holiday season than on regular days. Being able to adjust the application's capacity based on traffic provides a much better user experience and service.
Apache APISIX is a high-performance, cloud-native API gateway that can provide meaningful metrics for determining whether applications need to be scaled. As the middleware that processes all traffic sent to the upstream applications, it can collect traffic data along the way.
To perform elastic scaling, KEDA will be used as the controller, and Prometheus will be used to fetch metrics provided by APISIX.
How to Use the Prometheus Scaler in KEDA
KEDA is an event-driven autoscaler for Kubernetes that supports a variety of scalers. In this article, the Prometheus scaler is used to obtain the metrics exposed by APISIX.
Deploy KEDA
Deploying KEDA is relatively simple: just add the corresponding Helm repo and install it.
(MoeLove) ➜ helm repo add kedacore https://kedacore.github.io/charts
"kedacore" has been added to your repositories
(MoeLove) ➜ helm repo update kedacore
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "kedacore" chart repository
Update Complete. ⎈Happy Helming!⎈
(MoeLove) ➜ helm install keda kedacore/keda --namespace keda --create-namespace
NAME: keda
LAST DEPLOYED: Thu Jan 19 00:01:00 2023
NAMESPACE: keda
STATUS: deployed
REVISION: 1
TEST SUITE: None
After the installation, the Pods have a status of Running, indicating that KEDA has been installed successfully.
(MoeLove) ➜ kubectl -n keda get pods
NAME                                               READY   STATUS    RESTARTS   AGE
keda-operator-metrics-apiserver-6d4db7dcff-ck9qg   1/1     Running   0          36s
keda-operator-5dd4748dcd-k8jjz                     1/1     Running   0          36s
Deploy Prometheus
Here we use the Prometheus Operator to deploy Prometheus. The Prometheus Operator helps us quickly deploy a Prometheus instance in Kubernetes and add monitoring rules through declarative configuration.
Complete the installation of Prometheus Operator through the following steps.
(MoeLove) ➜ wget https://github.com/prometheus-operator/prometheus-operator/releases/download/v0.62.0/bundle.yaml
(MoeLove) ➜ kubectl apply --server-side -f bundle.yaml
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com serverside-applied
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com serverside-applied
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator serverside-applied
clusterrole.rbac.authorization.k8s.io/prometheus-operator serverside-applied
deployment.apps/prometheus-operator serverside-applied
serviceaccount/prometheus-operator serverside-applied
service/prometheus-operator serverside-applied
Then use the following as the configuration for Prometheus and apply it to the Kubernetes cluster.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      app: apisix
  serviceMonitorNamespaceSelector:
    matchLabels:
      team: apisix
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: LoadBalancer
  ports:
  - name: web
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus
After applying this to the Kubernetes cluster, you can see that a Prometheus instance is created under the default namespace. Since the Prometheus Service is of type LoadBalancer, users can access Prometheus directly through the public IP of the LoadBalancer.
(MoeLove) ➜ kubectl get svc
NAME                  TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
kubernetes            ClusterIP      10.43.0.1       <none>        443/TCP          96m
prometheus-operator   ClusterIP      None            <none>        8080/TCP         92m
prometheus-operated   ClusterIP      None            <none>        9090/TCP         41m
prometheus            LoadBalancer   10.43.125.194   216.6.66.66   9090:30099/TCP   41m
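Optionally, you can confirm that Prometheus is reachable before moving on. The address below is the EXTERNAL-IP from the output above; substitute your own. The readiness endpoint should report that the server is ready.

(MoeLove) ➜ curl http://216.6.66.66:9090/-/ready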
How to Deploy the API Gateway and Enable Monitoring
Next, deploy the APISIX Ingress controller and use Prometheus for metrics collection.
The method is similar for users who only use APISIX instead of the APISIX Ingress Controller. We will not explain it separately here.
Here, Helm is used for the deployment, which allows APISIX and the APISIX Ingress controller to be deployed to the cluster at the same time.
(MoeLove) ➜ helm repo add apisix https://charts.apiseven.com
"apisix" already exists with the same configuration, skipping
(MoeLove) ➜ helm repo update apisix
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "apisix" chart repository
Update Complete. ⎈Happy Helming!⎈
(MoeLove) ➜ helm upgrade --install apisix apisix/apisix --create-namespace --namespace apisix --set gateway.type=LoadBalancer --set ingress-controller.enabled=true --set ingress-controller.config.apisix.serviceNamespace=apisix
Release "apisix" has been upgraded. Happy Helming!
NAME: apisix
LAST DEPLOYED: Thu Jan 19 02:11:23 2023
NAMESPACE: apisix
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the application URL by running these commands:
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
      You can watch the status of by running 'kubectl get --namespace apisix svc -w apisix-gateway'
  export SERVICE_IP=$(kubectl get svc --namespace apisix apisix-gateway --template "{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}")
  echo http://$SERVICE_IP:80
Next, enable the Prometheus plugin of APISIX. Please refer to the following two documents for the specific configuration methods and related parameters; a minimal example follows the list below.
- prometheus plugins | Apache APISIX®
- How to access Apache APISIX Prometheus metrics on Kubernetes | Apache APISIX®
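For deployments that use the APISIX Ingress controller, the documents above describe enabling the plugin cluster-wide through an ApisixClusterConfig resource. The following is a minimal sketch, assuming the controller watches the default cluster config name; treat it as a starting point rather than a complete configuration.

apiVersion: apisix.apache.org/v2
kind: ApisixClusterConfig
metadata:
  name: default
spec:
  monitoring:
    prometheus:
      enable: true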
Once the plugin is enabled, Prometheus can scrape the metrics exposed by APISIX after a ServiceMonitor resource is created.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    app: apisix
spec:
  selector:
    matchLabels:
      app: apisix
  endpoints:
  - port: web
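As a quick sanity check, you can query Prometheus for one of the APISIX metrics through its HTTP API, again using the LoadBalancer IP from earlier. Note that apisix_http_status only appears after requests have actually passed through APISIX, so an empty result at this point is not necessarily an error.

(MoeLove) ➜ curl -s 'http://216.6.66.66:9090/api/v1/query?query=apisix_http_status'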
Verify the Application’s Elastic Scaling Capability
Create an application first.
(MoeLove) ➜ kubectl create deploy httpbin --image=kennethreitz/httpbin --port=80
deployment.apps/httpbin created
(MoeLove) ➜ kubectl expose deploy httpbin --port 80
Create the following routing rules and apply them to the Kubernetes cluster to proxy requests through APISIX.
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  name: httpserver-route
spec:
  http:
  - name: rule1
    match:
      hosts:
      - local.httpbin.org
      paths:
      - /*
    backends:
    - serviceName: httpbin
      servicePort: 80
Next, create a KEDA ScaledObject and configure the Prometheus-related settings.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: httpbin
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.default.svc:9090
      metricName: apisix_http_status
      threshold: '10'
      query: sum(rate(apisix_http_status{route="httpserver-route"}[1m]))
The above configuration uses sum(rate(apisix_http_status{route="httpserver-route"}[1m])) as the query expression, and scaling out starts once the result reaches 10. (The configuration here is only for demonstration purposes; please adjust it to your own situation.) Then, we make continuous requests to the httpbin service through curl, as shown below.
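A simple loop such as the following can generate that load. It uses the SERVICE_IP variable exported in the Helm NOTES above, and the Host header must match the host configured in the ApisixRoute; stop it with Ctrl-C when you are done.

(MoeLove) ➜ while true; do curl -s -o /dev/null -H "Host: local.httpbin.org" http://$SERVICE_IP/get; done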
Later, if we check the application's Pods, we can see that KEDA has autoscaled them to two replicas.
(MoeLove) ➜ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
httpbin-d46d778d7-chtdw   1/1     Running   0          12m
httpbin-d46d778d7-xanbj   1/1     Running   0          10s
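Behind the scenes, KEDA manages this scaling through a HorizontalPodAutoscaler that it creates for the ScaledObject (named keda-hpa-<ScaledObject name>), which you can inspect as well; the targets shown will depend on the current metric value.

(MoeLove) ➜ kubectl get hpa keda-hpa-prometheus-scaledobject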
After the application has been idle for some time, we will find that the number of Pods has automatically shrunk back to one.
(MoeLove) ➜ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
httpbin-d46d778d7-chtdw   1/1     Running   0          32m
Summary
KEDA uses the Prometheus scaler to collect the metrics exposed by APISIX. Since all traffic passes through APISIX first, gathering statistics on the APISIX side is much simpler and more convenient.
When the volume of business requests increases, the application is automatically scaled out, and when it drops, the application is automatically scaled back in.
This approach removes much of the manual scale-up and scale-down work in production environments and helps ensure the best user experience.