Best Practices of Integrating Prometheus with APISIX

January 13, 2024

Technology

In today's cloud-native architecture, monitoring the metrics of your API gateway is crucial. Apache APISIX, serving as a high-performance API gateway, not only offers extensive functionalities but also supports seamless integration with Prometheus to collect and monitor key API traffic metrics. This article explores how to configure and use Prometheus in Apache APISIX, highlighting essential considerations and recommending common metric configurations.

About Prometheus

Prometheus is an open-source monitoring system that collects and stores time-series data, enabling real-time monitoring and analysis of system performance. When integrated with Apache APISIX, Prometheus becomes instrumental in capturing fine-grained metrics related to API traffic.

Enabling Prometheus Plugin in Apache APISIX

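  1. To enable Prometheus metrics in Apache APISIX, make sure the prometheus plugin is listed under plugins in the config.yaml file. APISIX enables it in its default plugin list; if you define your own plugins list, it replaces the default one, so include every plugin you rely on: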

    plugins:
      - prometheus
    
  2. Enable the Prometheus plugin on each route or service whose traffic you want to measure, or enable it for all traffic with a global rule. Here's an example that enables the plugin on a single route through the Admin API with cURL:

    curl http://127.0.0.1:9180/apisix/admin/routes/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -d '
    {
        "uri": "/hello",
        "plugins": {
            "prometheus":{}
        },
        "upstream": {
            "type": "roundrobin",
            "nodes": {
                "127.0.0.1:80": 1
            }
        }
    }'
    

    For more complex configurations, refer to: Prometheus Plugin Documentation

Configuring Collection Strategy in Prometheus

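In Prometheus, add APISIX as a scrape target in the prometheus.yml file. APISIX serves its metrics at /apisix/prometheus/metrics (on port 9091 by default) rather than at Prometheus's default /metrics path, so set metrics_path explicitly:

scrape_configs:
  - job_name: 'apisix'
    # APISIX's default export path; without it Prometheus would scrape /metrics
    metrics_path: '/apisix/prometheus/metrics'
    static_configs:
    - targets: ['<APISIX_IP>:<APISIX_PORT>']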
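
For Prometheus to reach that target, the APISIX metrics endpoint has to listen on an address Prometheus can connect to. By default the plugin serves metrics at /apisix/prometheus/metrics on 127.0.0.1:9091, which is reachable only from the APISIX host itself. The following is a minimal sketch of the relevant config.yaml section, assuming Prometheus runs on a different machine. The 0.0.0.0 bind address is an illustrative assumption; in practice you would likely bind to a private interface or restrict access with a firewall.

```yaml
# conf/config.yaml on the APISIX side
plugin_attr:
  prometheus:
    export_uri: /apisix/prometheus/metrics  # default export path
    export_addr:
      ip: 0.0.0.0   # assumption: listen on all interfaces (default is 127.0.0.1)
      port: 9091    # default export port
```

With this in place, <APISIX_PORT> in prometheus.yml would be 9091. APISIX has to be reloaded or restarted for the change to take effect.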

Common Metrics in Apache APISIX

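The metrics that matter most will vary by organization, but the following are key metrics in Apache APISIX that provide rich information for system monitoring and analysis. Exact names and labels can differ between versions, so confirm them against the output of your own metrics endpoint: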

  • HTTP Request and Response Metrics:

    • apisix_http_requests_total: Records the total number of client requests received by APISIX, offering an overview of system traffic.
    • apisix_http_latency: A histogram of request latency in milliseconds. Its type label separates total request time (request), time spent in the upstream (upstream), and APISIX's own overhead (apisix), which helps pinpoint performance bottlenecks.
    • Request and response sizes: Current releases of the plugin do not export separate size metrics; byte volumes are covered by apisix_bandwidth, listed under Traffic Metrics.
  • Upstream Service Metrics:

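    • apisix_http_latency with type="upstream": Reflects the response latency of upstream services.
    • apisix_upstream_status: Indicates the health-check status of each upstream node (1 for healthy, 0 for unhealthy). It is reported only when health checks are configured on the upstream.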
  • System Performance Metrics:

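    • apisix_nginx_http_current_connections: Reports the number of current connections by state (for example active, reading, writing, waiting).
    • apisix_shared_dict_capacity_bytes and apisix_shared_dict_free_space_bytes: Offer insights into the usage of APISIX's shared memory zones.
    • Node-level CPU and process memory usage are not exported by the plugin itself; they are usually collected separately, for example with a host-level exporter.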
  • Traffic Metrics:

    • apisix_bandwidth: Counts the bytes flowing through APISIX, with a type label that distinguishes ingress from egress traffic.
  • Error and Exception Metrics:

    • apisix_http_status: Counts HTTP responses by status code (the code label), which makes it straightforward to track 4xx and 5xx errors.

Visualization and Alerts

Leverage Grafana and Prometheus integration to create dashboards for visualizing these metrics. Additionally, Prometheus alerting rules can be configured to set up alerts based on specific conditions.

Grafana Dashboard Example: Create various charts in Grafana, such as time series, bar graphs, or pie charts, to showcase APISIX's performance metrics. For instance, a dashboard displaying HTTP request counts and average response times offers real-time traffic and performance insights.
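
As one possible starting point for such a dashboard, the two panels mentioned above can be backed by Prometheus recording rules, which Grafana can then plot directly. This is a sketch rather than an official dashboard definition: the rule and group names are hypothetical, and the metric names (apisix_http_status, apisix_http_latency) and the route label are those exposed by the APISIX Prometheus plugin at the time of writing, so verify them against your own metrics endpoint.

```yaml
# apisix-dashboard.rules.yml (hypothetical file name, loaded via rule_files in prometheus.yml)
groups:
  - name: apisix-dashboard
    interval: 30s   # assumption: evaluate twice a minute
    rules:
      # Requests per second for each route, averaged over 5 minutes
      - record: apisix:http_requests:rate5m
        expr: sum by (route) (rate(apisix_http_status[5m]))
      # Average end-to-end request latency in milliseconds for each route
      - record: apisix:http_latency_ms:avg5m
        expr: |
          sum by (route) (rate(apisix_http_latency_sum{type="request"}[5m]))
            /
          sum by (route) (rate(apisix_http_latency_count{type="request"}[5m]))
```

A Grafana time series panel could then query apisix:http_requests:rate5m or apisix:http_latency_ms:avg5m instead of repeating the full expressions.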

Prometheus Alerting Example: Alerting rules in Prometheus can be configured for various conditions. For instance, if the average request latency derived from the latency histogram (apisix_http_latency in current APISIX releases) stays above a predefined threshold, Prometheus can fire a critical alert and hand it to Alertmanager for notification.
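
The alerting idea above could be sketched as the rule file below. The thresholds (500 ms average latency, 5% error ratio), the 5-minute windows, and the alert names are illustrative assumptions, not values recommended by the APISIX project. The metric names apisix_http_latency and apisix_http_status are those exposed by the APISIX Prometheus plugin at the time of writing and should be checked against your deployment.

```yaml
# apisix-alerts.rules.yml (hypothetical file name, loaded via rule_files in prometheus.yml)
groups:
  - name: apisix-alerts
    rules:
      # Average request latency = latency sum / request count over the window
      - alert: ApisixHighAverageLatency
        expr: |
          sum(rate(apisix_http_latency_sum{type="request"}[5m]))
            /
          sum(rate(apisix_http_latency_count{type="request"}[5m])) > 500
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Average APISIX request latency has been above 500 ms for 5 minutes"
      # Share of responses with a 5xx status code
      - alert: ApisixHigh5xxRatio
        expr: |
          sum(rate(apisix_http_status{code=~"5.."}[5m]))
            /
          sum(rate(apisix_http_status[5m])) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "More than 5% of APISIX responses are 5xx"
```

The for clause keeps a short spike from firing the alert; the condition has to hold for the whole period first.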

Optimization Considerations

Extensive Prometheus metrics give you more dimensions for monitoring and alerting, but collecting them consumes computational resources on the gateway. The more metrics and label combinations you track, the higher the resource demand, which can affect business traffic.

Since version 3.0, Apache APISIX has significantly optimized the Prometheus plugin by introducing a dedicated process for metric statistics and retrieval. This improvement, contributed by API7.ai, mitigates the impact that heavy metric collection has on business traffic.
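
The cost of metric collection also depends on how often Prometheus scrapes the gateway, since each scrape makes APISIX export its current metrics. A hedged sketch of a less aggressive scrape job follows. The 30s interval and 10s timeout are illustrative assumptions to tune for your own traffic, and metrics_path reflects the plugin's default export path.

```yaml
# prometheus.yml: scrape APISIX less frequently than a short global default
scrape_configs:
  - job_name: 'apisix'
    scrape_interval: 30s   # assumption: one scrape every 30 seconds
    scrape_timeout: 10s    # assumption: must not exceed scrape_interval
    metrics_path: '/apisix/prometheus/metrics'
    static_configs:
      - targets: ['<APISIX_IP>:<APISIX_PORT>']
```

A longer interval lowers the load on the gateway at the price of coarser data, so rate windows in dashboards and alerts should span several scrape intervals.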

Conclusion

By integrating Prometheus with Apache APISIX, enterprises gain deep insight into their API infrastructure, which supports efficient and secure operations. API traffic monitoring is becoming an essential tool for preventing issues proactively, optimizing performance, and ensuring security.

Tags:
Monitoring, APISIX Basics