Global Profiling Engine

Configure historical aggregation of stick table data

In addition to aggregating stick table data from multiple HAProxy Enterprise nodes in real time and pushing that data back to each node, the profiling engine also stores historical data. For example, you can configure it to tell you what the average HTTP request rate was at the same time of day yesterday. Or, you can check what the average rate was at this same time a week ago, and adjust rate limiting to match.

Historical data allows your load balancer to make dynamic decisions based on data from the past, such as setting rate limits that change depending on the hour.

Configure the Global Profiling Engine

Follow these steps to configure historical aggregation of stick table data.

  1. On the Global Profiling Engine server, as shown for real-time aggregation, configure the list of peers in the /etc/hapee-extras/hapee-gpe-stktagg.cfg file.

    hapee-gpe-stktagg.cfg
    haproxy
    global
        stats socket /var/run/hapee-extras/gpe-api.sock
    aggregations data
        from any to .aggregate
        peer gpe 0.0.0.0:10000 local
        peer enterprise1 192.168.50.41:10000
        # list more 'peer' lines for other load balancers in the cluster
        # e.g. peer enterprise2 192.168.50.42:10000
  2. Edit the file /etc/hapee-extras/hapee-gpe.json to configure data retention policies for storing historical data.

    Data is stored in buckets of time. For example, you might keep twelve 5-minute buckets, twenty-four 1-hour buckets, and fourteen 1-day buckets (two weeks of daily data), as shown below. This would allow you to compare a client’s current request rate to the average request rate during the same hour yesterday, for example.

    hapee-gpe.json
    json
    {
        "worker_thread_count": 4,
        "inter_worker_queue_size": 1024,
        "collector_queue_size": 64,
        "httpd_port": 9888,
        "datadir": "/var/cache/hapee-extras/hct_datadir",
        "default_stick_table_handling": 1,
        "prometheus_exporter": 1,
        "ignore_tables": [],
        "detail_tables": [],
        "aggregate_tables": [],
        "stat_retentions": [
            {
                "duration": 300,
                "retention": 12
            },
            {
                "duration": 3600,
                "retention": 24
            },
            {
                "duration": 86400,
                "retention": 14
            }
        ]
    }

    In this example:

    • The httpd_port field sets the port on which to publish historical data, which HAProxy Enterprise servers poll for updates. Here, it hosts the data at port 9888. The default IP address is 0.0.0.0 and can be changed with the httpd_addr option.

    • The aggregate_tables, detail_tables, and ignore_tables fields are all empty because default_stick_table_handling is set to 1, which aggregates all tables.

    • The prometheus_exporter field enables the generation of profiling engine data in Prometheus format from the profiling engine’s /metrics endpoint (see the curl sketch after this list).

    • The stat_retentions section lists data retention policies. Each policy sets a duration in seconds, which is the size of the data bucket, and a retention, which is the number of buckets to keep. A bucket stores counters from your stick tables. For example, if your stick table tracks the HTTP request rate over 10 seconds, a 1-hour bucket might store many thousands of these 10-second request rate data points.

    For each bucket, the server calculates statistics and serves them on the configured port.
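
    If you want to confirm that the Prometheus exporter is active, a minimal check is to request the /metrics endpoint with curl. This sketch assumes you run it on the profiling engine server itself and that the engine listens on the example port 9888:

    nix
    curl http://localhost:9888/metrics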

  3. Restart the Global Profiling Engine:

    nix
    sudo systemctl restart hapee-extras-gpe

Configure HAProxy Enterprise

In this section, you will see examples of how to configure HAProxy Enterprise for historical aggregation.

Example: Use the Global Profiling Engine to enforce rate limits

One use case for historical aggregation is to compare a client’s current request rate against request rates from the past and then make rate-limiting decisions based on that comparison.

Configure each HAProxy Enterprise server to download and use the historical data.

  1. Create an empty file at /etc/hapee-3.0/historical.map.

    Although an in-memory representation of this file will hold historical values received from the profiling engine, the file must exist on the filesystem when HAProxy Enterprise starts.

    HAProxy Enterprise updates a representation of this file in memory only. You will not see the contents of the file itself updated and it will remain empty, but you can see the in-memory values by calling the Runtime API show map method.

  2. Install the Update module, which polls the profiling engine for new data to load into the map file.

    nix
    sudo apt-get install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo apt-get install hapee-3.0r1-lb-update

    nix
    sudo yum install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo yum install hapee-3.0r1-lb-update

    nix
    sudo zypper install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo zypper install hapee-3.0r1-lb-update

    nix
    sudo pkg install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo pkg install hapee-3.0r1-lb-update
  3. Edit the file /etc/hapee-3.0/hapee-lb.cfg.

    In the global section of the file, add a module-load directive to load the Update module:

    hapee-lb.cfg
    haproxy
    global
        module-load hapee-lb-update.so
  4. Configure the Update module to poll the profiling engine’s /aggs endpoint for data by adding a dynamic-update section that contains an update directive.

    The url parameter should use the profiling engine’s IP address.

    hapee-lb.cfg
    haproxy
    dynamic-update
        update id /etc/hapee-3.0/historical.map map url http://192.168.50.40:9888/aggs delay 3600s log

    In this example:

    • The dynamic-update section configures HAProxy Enterprise to poll the profiling engine for historical data updates.
    • The update line’s id parameter sets the local file to update (remember, this file will not be updated on disk, only in HAProxy Enterprise’s runtime memory).
    • The map parameter switches the Update module into map file mode.
    • The url parameter specifies the profiling engine’s IP address and port, along with the /aggs URL path.
    • The delay parameter sets the polling interval to 3600 seconds. The statistic used in this example comes from the 3600-second (one-hour) buckets, so polling the GPE hourly is sufficient.
    • The log parameter enables logging to the HAProxy Enterprise access log.
  5. As you would for real-time aggregation, add a peers section that lists the local node and the profiling engine on server lines.

    Here you will also define stick tables with their .aggregate clones.

    hapee-lb.cfg
    haproxy
    peers mypeers
        bind :10000
        # The local HAProxy Enterprise node, whose peer name is defined by one of the following:
        # 1) the value provided when the load balancer process is started with the -L argument
        # 2) the localpeer name from the global section of the load balancer configuration (suggested method)
        # 3) the hostname as returned by the system hostname command (default)
        server enterprise1
        # The Global Profiling Engine
        server gpe 192.168.50.40:10000
        # stick table definitions
        table request_rates type ip size 100k expire 30s store http_req_rate(10s)
        table request_rates.aggregate type ip size 100k expire 30s store http_req_rate(10s)

    In this example:

    • Define a bind line to set the IP address and port at which this node should receive data back from the Global Profiling Engine server. In this example, the bind directive listens on all IP addresses at port 10000 and receives aggregated data.

    • Define a server line for the current load balancer server. The server name value is important because it must match the name you set in the Global Profiling Engine server’s configuration for the corresponding peer line. The hostname may be one of the following, in order of precedence:

      • the value provided with the -L argument specified on the command line used to start the load balancer process
      • the localpeer name specified in the global section of the load balancer configuration (this method is used in this example; see the sketch after this list)
      • the host name returned by the system hostname command. This is the default, but we recommend using one of the other two methods
    • Define a server line for the Global Profiling Engine server. Set its IP address and port. The name you set here is also important. It must match the corresponding peer line in the Global Profiling Engine server’s configuration.
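
    For the localpeer method of naming the local node, a minimal sketch of the global section is shown below. The name enterprise1 is this guide's example node name; it must match the corresponding peer line in the Global Profiling Engine's configuration:

    hapee-lb.cfg
    haproxy
    global
        localpeer enterprise1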

  6. Use map fetch methods in your frontend section to read information from the local map file and make traffic routing decisions.

    In the example below, we deny clients that have a request rate higher than the 99th percentile of requests from the same hour (3600 seconds) yesterday (86400 seconds ago).

    hapee-lb.cfg
    haproxy
    frontend fe_main
        bind :80
        # add records to the stick table using the client's IP address as the table key
        http-request track-sc0 src table mypeers/request_rates
        # store the 99th percentile rate and the client's current rate in variables
        http-request set-var(req.rate_99percentile) str(/request_rates.http_req_rate.3600sec.86400sec_ago.99p),map(/etc/hapee-3.0/historical.map,1000)
        http-request set-var(req.client_rate) sc_http_req_rate(0,mypeers/request_rates.aggregate)
        # set ACL expressions
        acl historical_rate_greater_than_zero var(req.rate_99percentile) -m int gt 0
        acl client_rate_exceeds_historical_rate var(req.rate_99percentile),sub(req.client_rate) -m int lt 0
        # deny the request if it exceeds the historical rate
        http-request deny deny_status 429 if historical_rate_greater_than_zero client_rate_exceeds_historical_rate
        default_backend webservers

    In this example:

    • The http-request track-sc0 line adds the current client to the stick table, using their source IP address as the primary key.

    • The http-request set-var(req.rate_99percentile) line reads the value of the /request_rates.http_req_rate.3600sec.86400sec_ago.99p statistic from the historical.map data. If that statistic does not exist or has no data (which happens if there was no traffic during that hour), a value of 1000 is used instead. See the Reference guide to learn how these statistics are named.

    • The http-request set-var(req.client_rate) line retrieves the current client’s request rate from the mypeers/request_rates.aggregate table, which uses real-time aggregation to collect data from all load balancers in your cluster.

    • The http-request deny line rejects the request if the client’s current request rate (aggregated across load balancers) exceeds the 99th percentile rate for all users during the same hour yesterday. If the statistic is missing from the map, the default value of 1000 applies; if the historical rate is zero (no traffic during that hour), the historical_rate_greater_than_zero ACL prevents the deny rule from firing.

  7. Restart HAProxy Enterprise.

    nix
    sudo systemctl restart hapee-3.0-lb

Verify the setup. First, check the HAProxy Enterprise admin logs to confirm that the Update module is downloading the map file successfully. Any errors are written there; if everything worked, no errors appear.
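
One way to check for Update module errors, assuming HAProxy Enterprise logs to the systemd journal on your system (adjust if your logs go to syslog files instead), is to filter the journal for the load balancer service:

nix
sudo journalctl -u hapee-3.0-lb | grep -i update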

Also, verify that data is being published by calling the /aggs URL with curl on the aggregation server. You will need to wait for the first bucket to fill before any data appears; how long that takes depends on the bucket’s duration.

nix
curl http://localhost:9888/aggs
output
text
/request_rates.http_req_rate.3600sec.3600sec_ago.cnt 13
/request_rates.http_req_rate.3600sec.3600sec_ago.sum 29
/request_rates.http_req_rate.3600sec.3600sec_ago.avg 0
/request_rates.http_req_rate.3600sec.3600sec_ago.per_sec_avg 0
/request_rates.http_req_rate.3600sec.3600sec_ago.burst_avg 2
/request_rates.http_req_rate.3600sec.3600sec_ago.min 1
/request_rates.http_req_rate.3600sec.3600sec_ago.max 7
/request_rates.http_req_rate.3600sec.3600sec_ago.50p 2
/request_rates.http_req_rate.3600sec.3600sec_ago.75p 3
/request_rates.http_req_rate.3600sec.3600sec_ago.90p 3
/request_rates.http_req_rate.3600sec.3600sec_ago.95p 3
/request_rates.http_req_rate.3600sec.3600sec_ago.99p 3
/request_rates.http_req_rate.3600sec.3600sec_ago.99.9p 3

You can also call the Runtime API’s show map function to see the data stored in the map file.
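
For example, a quick way to inspect the in-memory map, assuming socat is installed and your Runtime API socket is at /var/run/hapee-3.0/hapee-lb.sock (adjust the path to match the stats socket in your configuration):

nix
echo "show map /etc/hapee-3.0/historical.map" | sudo socat stdio /var/run/hapee-3.0/hapee-lb.sock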

Example: Use the Global Profiling Engine to calculate response time percentiles

You can use the Global Profiling Engine to track response time percentiles across your HAProxy Enterprise cluster. You can record these percentiles at a regular interval and then use the data for analysis.

Configure each HAProxy Enterprise server to download and use the historical data.

  1. Create an empty file at /etc/hapee-3.0/historical.map.

    Although an in-memory representation of this file will hold historical values received from the profiling engine, the file must exist on the filesystem when HAProxy Enterprise starts. HAProxy Enterprise updates a representation of this file in memory only. You will not see the contents of the file itself updated and it will remain empty, but you can see the in-memory values by calling the Runtime API show map method.

  2. Install the Update module, which polls the profiling engine for new data to load into the map file.

    nix
    sudo apt-get install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo apt-get install hapee-3.0r1-lb-update

    nix
    sudo yum install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo yum install hapee-3.0r1-lb-update

    nix
    sudo zypper install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo zypper install hapee-3.0r1-lb-update

    nix
    sudo pkg install hapee-<VERSION>-lb-update

    Example for HAProxy Enterprise 3.0r1:

    nix
    sudo pkg install hapee-3.0r1-lb-update
  3. Edit the file /etc/hapee-3.0/hapee-lb.cfg.

    In the global section of the file, add a module-load directive to load the Update module:

    hapee-lb.cfg
    haproxy
    global
        module-load hapee-lb-update.so
  4. Configure the Update module to poll the profiling engine’s /aggs endpoint for data by adding a dynamic-update section that contains an update directive.

    The url parameter should use the profiling engine’s IP address.

    hapee-lb.cfg
    haproxy
    dynamic-update
        update id /etc/hapee-3.0/historical.map map url http://192.168.50.40:9888/aggs delay 10s log

    In this example:

    • The dynamic-update section configures HAProxy Enterprise to poll the profiling engine for historical data updates.
    • The update line’s id parameter sets the local file to update (remember, this file will not be updated on disk, only in HAProxy Enterprise’s runtime memory).
    • The map parameter switches the Update module into map file mode.
    • The url parameter specifies the profiling engine’s IP address and port, along with the /aggs URL path.
    • The delay parameter sets the polling interval to 10 seconds.
    • The log parameter enables logging to the HAProxy Enterprise access log.
  5. As you would for real-time aggregation, add a peers section that lists the local node and the profiling engine on server lines.

    Here you will also define stick tables with their .aggregate clones.

    hapee-lb.cfg
    haproxy
    peers mypeers
        bind :10000
        server enterprise1
        server gpe 192.168.50.40:10000
        table st_responsetime type string len 64 size 100k expire 1h store gpt0
        table st_responsetime.aggregate type string len 64 size 100k expire 1h store gpt0

    Be sure that the server names you specify for the HAProxy Enterprise node and for the Global Profiling Engine instance match the configured hostnames of those instances. Use the hostname command on each instance to retrieve its name.

    nix
    hostname
    output
    text
    enterprise1
  6. In your frontend, track the total response time of each request in the stick table mypeers/st_responsetime. We use the general purpose tag (gpt0) to store the response time value in the stick table. Each record uses a unique ID as its key, where the unique ID’s format is a combination of the client’s IP address, client’s port, frontend IP address, frontend port, a timestamp, a request counter, and the process ID.

    hapee-lb.cfg
    haproxy
    frontend fe_main
        bind :80
        # generate a unique ID
        unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid
        http-request set-var(txn.path) path
        # add records to the stick table using the unique ID as table key
        http-request track-sc0 unique-id table mypeers/st_responsetime
        # prepare and perform the calculation for response times
        http-response set-var-fmt(txn.response_time) %Tr
        http-response set-var-fmt(txn.connect_time) %Tc
        http-response set-var-fmt(txn.queue_time) %Tw
        http-response sc-set-gpt0(0) var(txn.response_time),add(txn.queue_time),add(txn.connect_time)
        # store the 99th percentile rate in variables
        http-request set-var(req.response_time_99percentile) str(/st_responsetime.gpt0.3600sec.3600sec_ago.99p),map(/etc/hapee-3.0/historical.map,1000)
        default_backend webservers
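
    If you want to use the historical percentile during response processing or in logs, one option is to copy it into a txn-scope variable, since req-scope variables are only visible while the request is being processed. This is a sketch under the same map and statistic names as above; the X-Historical-99p header name is hypothetical:

    hapee-lb.cfg
    haproxy
    frontend fe_main
        # ... existing configuration from the example above ...
        # keep the historical 99th percentile in txn scope so it survives into response processing
        http-request set-var(txn.response_time_99percentile) str(/st_responsetime.gpt0.3600sec.3600sec_ago.99p),map(/etc/hapee-3.0/historical.map,1000)
        # expose the value for analysis, for example as a response header
        http-response set-header X-Historical-99p %[var(txn.response_time_99percentile)]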
  7. Restart HAProxy Enterprise.

    nix
    sudo systemctl restart hapee-3.0-lb

Verify the setup. First, check the HAProxy Enterprise admin logs to confirm that the Update module is downloading the map file successfully. Any errors are written there; if everything worked, no errors appear.

Also, verify that data is being published by calling the /aggs URL with curl on the aggregation server. You will need to wait for the first bucket to fill before any data appears; how long that takes depends on the bucket’s duration.

nix
curl http://localhost:9888/aggs
output
text
/st_responsetime.gpt0.3600sec.3600sec_ago.cnt 2
/st_responsetime.gpt0.3600sec.3600sec_ago.sum 24153
/st_responsetime.gpt0.3600sec.3600sec_ago.avg 12076
/st_responsetime.gpt0.3600sec.3600sec_ago.per_sec_avg 0
/st_responsetime.gpt0.3600sec.3600sec_ago.burst_avg 0
/st_responsetime.gpt0.3600sec.3600sec_ago.min 9769
/st_responsetime.gpt0.3600sec.3600sec_ago.max 14384
/st_responsetime.gpt0.3600sec.3600sec_ago.50p 9775
/st_responsetime.gpt0.3600sec.3600sec_ago.75p 14391
/st_responsetime.gpt0.3600sec.3600sec_ago.90p 14391
/st_responsetime.gpt0.3600sec.3600sec_ago.95p 14391
/st_responsetime.gpt0.3600sec.3600sec_ago.99p 14391
/st_responsetime.gpt0.3600sec.3600sec_ago.99.9p 14391

You can also call the Runtime API’s show map function to see the data stored in the map file.
