Global Profiling Engine reference
This section describes all configuration options for the Global Profiling Engine.
GPE peers configuration
The following fields can be set in the /etc/hapee-extras/hapee-gpe-stktagg.cfg file:
global section
The global section supports the following fields:

Field | Description |
---|---|
bind | Available since GPE 1.0. Adds a listener to which aggregations sections can be attached. This allows GPE to listen for several aggregations sections on the same IP/port. You can set multiple bind lines, and more than one can serve the same aggregations section if they use the same use_aggrs values. See the bind syntax below. |
cpu-map | Has the same meaning as the cpu-map directive for HAProxy. It configures the CPU affinity of the GPE processes. |
dynamic-peers | Available since GPE 1.0. Enables the GPE to accept up to 64 concurrent connections from peers that are not defined in the configuration file by aggregations sections. This limit of 64 concurrent peers includes the ones set up by the configuration file. |
hash-table | Available since GPE 1.0. Configures global hash-table settings that are common to all stick tables managed by the process. |
source | Binds an IP/port for outgoing connections. Set it as source <ipv4 or ipv6>[:<port>]. |
stats socket | Creates a listener through which you can interact with the aggregator at runtime over a TCP or UNIX domain socket. |
trash-batch | Sets the number of incoming update messages on a table after which a lookup of expired entries to trash is performed. At most the same number of entries is trashed. Default: 10000. |
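For illustration only, a minimal global section might look like the following sketch. The address, socket path, and values are placeholders based on the field descriptions above, not recommendations; bind and hash-table lines are covered separately below.

```text
global
    # Placeholder source address for outgoing connections
    source 192.168.1.20
    # Runtime API over a UNIX domain socket (placeholder path)
    stats socket /var/run/hapee-extras/gpe.sock
    # Accept connections from peers not declared in aggregations sections
    dynamic-peers
    # Scan for expired entries every 10000 incoming updates (the default)
    trash-batch 10000
```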
The bind line has the following syntax:

```text
bind <ip:port> [ssl] [crt <path>] [ca-file <path>] [verify none | optional | required] [use_aggrs (all | <aggregation ID>[,<aggregation ID>...])]
```
where:

Argument | Description |
---|---|
use_aggrs | This option must be followed by the special value all or by a list of aggregations section identifiers separated by commas. If some sections are not defined in the configuration file, they are ignored. The all special value may be used to specify that you want the listener to accept incoming connections for any of the aggregation configurations defined by aggregations sections. This is also the default value when use_aggrs is not specified. NOTE: As a listener configured on a bind line has no peer name, a peer configured on the load balancer side to connect to such a listener must set a peer with the aggregations section identifier as the peer name. |
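As an illustrative sketch (the addresses, certificate paths, and section identifiers are placeholders):

```text
# Accept peer connections for every aggregations section (the default behavior of use_aggrs)
bind 0.0.0.0:10000 use_aggrs all

# A TLS listener that serves only two specific aggregations sections
bind 0.0.0.0:10443 ssl crt /etc/hapee-extras/gpe.pem ca-file /etc/hapee-extras/ca.pem verify required use_aggrs data1,data2
```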
The hash-table line has the following syntax:

```text
hash-table [load-factor <lf>] [low <low>] [high <high>] [steps <steps>]
```
where:

Argument | Description |
---|---|
load-factor <lf> | Sets the optimal load factor that the GPE will try its best to enforce on the hash-table. The load factor is the ratio of the current number of entries in the table divided by the number of buckets. When the load factor is reached, the GPE automatically tries to grow (and rehash) the hash-table on the fly. It expects values between 1 and 255. Default: 3. The default value is known to offer a good performance/memory ratio, but if memory is not a concern and performance should prevail, setting this value to 2 could have a noticeable impact. In some setups, setting load-factor to 1 may also be relevant. |
low <low> | Sets the pow2 (power of 2) exponent that will be used to compute the number of buckets. It expects values between 0 and 31. All hash-tables are initialized with this value, which means it directly affects the memory footprint of all stick-tables handled by the GPE. A larger value means higher performance (less resizing involved), and a lower value means a smaller upfront memory cost. Default: 12. The default value means the hash-table starts with 4096 buckets (pow2(12) = 4096). |
high <high> | Sets the pow2 (power of 2) exponent that will be used to compute the buckets upper limit (the maximum number of buckets that the hash-table may allocate upon resizes). It expects values between 0 and 31. Default: 20. The default value means the hash-table may allocate up to 1M buckets (pow2(20) = 1048576) upon resize. |
steps <steps> | Takes a list of intermediary pow2 exponents (separated by commas) that will be used for hash-table resizing. Values outside of the <low>...<high> range are ignored: if no valid step is specified between <low> and <high>, then the hash-table jumps directly from <low> to <high> pow2 upon resize. Set <steps> to all to use all possible pow2 exponents between <low> and <high>. Default: 14,16,18,19,20. |
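For example, a sketch that trades memory for performance (the exponents are illustrative, not tuning advice):

```text
# Target a load factor of 2, start at 2^14 buckets, allow growth up to 2^22 buckets,
# and use every intermediate power of 2 as a resizing step
hash-table load-factor 2 low 14 high 22 steps all
```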
aggregations section
You can have one or more aggregations sections, where the following fields are supported:

Field | Description |
---|---|
dynamic-peers | Available since GPE 1.0. This field has exactly the same meaning as dynamic-peers in the global section: it enables the dynamic peers feature. Contrary to the global option, however, the one for an aggregations section limits the feature to that aggregations section. |
forward | This field supports the syntax forward <suffix1,suffix2...>. The stick tables with those suffixes are considered to be forwarded from upward to downward servers. Note that updates to one of those tables coming from a down server are ignored. |
from | Gives some information to the GPE about how to name the destination stick tables (the aggregated stick-tables). It accepts two forms of syntax: from any to <suffix> or from <suffix1,suffix2,...> to <suffix> [accept-no-suffix] [no-feedback] [global-exp]. A suffix string must have . as its first character. Specify multiple suffixes by separating them with commas, with no spaces between them. Use any to aggregate all stick tables, or specify which tables to aggregate by their suffix, which requires that you add suffixes to your stick table names in the load balancer configuration. If accept-no-suffix is specified, then tables that have no suffix at all are aggregated in addition to those with suffixes specified in the list. If no-feedback is specified, then only "up" peers will receive aggregation results. If global-exp is specified, then all connected peers for a given table entry will have their expire tracker refreshed each time one of the peers pushes an update to the entry. |
peer | The peer lines support the syntax peer <peer name> <ip:port> ['local' or 'up' or 'down'] [group <group_id>] [ssl] [crt <path>] [ca-file <path>] ['verify none' or 'verify optional' or 'verify required']. They are made of two mandatory settings, the peer name and IP/port, followed by the optional, exclusive keywords local, up, or down, and possibly the optional group setting for multilevel setups. ssl enables TLS/SSL. crt is the certificate to present to the peer. ca-file is the file containing the trusted CAs in PEM format. verify has three possible values: none means that verification is disabled; optional means that a client certificate is requested but verification is performed only if the peer provides a certificate in response; required means that the peer certificate is mandatory in the response and verification is always performed. Note that an aggregations section can handle at most 64 peers, regardless of whether they are dynamic or declared inside the section, so there can be at most 64 peer lines, not including the local peer. If there is no listener configured by a bind line in the global section to handle the aggregations section, one peer must be local, that is, followed by the local keyword. An example section follows this table. |
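For illustration, a hedged sketch of an aggregations section for two load balancer peers; the section identifier, peer names, addresses, and suffix are placeholders:

```text
aggregations data1
    # Aggregate every stick table into a copy whose name ends with .aggregate
    from any to .aggregate
    # Load balancer peers that push their local stick table updates (placeholder addresses)
    peer hapee1 192.168.1.11:10000 up
    peer hapee2 192.168.1.12:10000 up
    # The GPE itself; required when no bind line in the global section serves this section
    peer gpe1 192.168.1.20:10000 local
```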
Historical stats configuration
The following fields can be set in the /etc/hapee-extras/hapee-gpe.json file:

Field | Description |
---|---|
aggregate_tables | An array of stick table names that should be processed. One set of aggregates will be created for each stick table as a whole. |
collector_queue_size | The size of the message queue between workers and the collector. It must be greater than 2 and a power of 2. Default: 64. |
datadir | The directory in which to store historical data files. |
default_stick_table_handling | Indicates how the server should process stick tables that are not listed in the ignore_tables, detail_tables, or aggregate_tables arrays. Values: 0 = ignore, 1 = aggregate, 2 = detailed processing (Experimental). |
default_toplist_table_handling | Available since GPE 1.0. Indicates whether to enable the generation of toplist statistics for tables not included in enable_toplist_tables. Enabled when 1, disabled when 0. Default: 0. |
detail_tables (Experimental) | An array of stick table names that should be processed. One set of aggregates will be created for every value in the stick table. |
disable_toplist_tables | Available since GPE 1.0. A list of stick tables to not generate toplists for. |
enable_toplist_tables | Available since GPE 1.0. A list of stick tables to generate toplists for. |
httpd_addr | Available since GPE 1.0. The IP address on which to publish historical statistics data. Can be IPv4 or IPv6, with or without brackets. The IPv4 wildcard is *. Default: 0.0.0.0. |
httpd_port | The TCP port on which to publish historical statistics data. Default: 9888. |
ignore_tables | An array of stick table names that should be skipped during processing. |
inter_worker_queue_size | The size of the message queue that handles communication between workers. It must be greater than 2 and a power of 2. Default: 1024. |
prometheus_exporter | Generate profiling engine data in Prometheus format at the profiling engine's /metrics endpoint. Default: 0 (disabled). |
stat_retentions | An array of data retention policies. Each policy should have duration, an integer value in seconds indicating the size of the bucket (i.e. time period) to aggregate data for, and retention, the number of buckets to keep. |
toplist_element_count | Available since GPE 1.0. The number of elements in a toplist. Default: 10. Max: 50. |
worker_thread_count | The number of worker threads to start. Default: 2. |
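The following is a hedged sketch of what a hapee-gpe.json file might look like, assuming the fields above live in a flat JSON object; the directory path, table name, and retention values are illustrative only:

```json
{
  "datadir": "/var/cache/hapee-extras/gpe",
  "httpd_addr": "0.0.0.0",
  "httpd_port": 9888,
  "worker_thread_count": 2,
  "prometheus_exporter": 1,
  "default_stick_table_handling": 1,
  "aggregate_tables": ["request_rates"],
  "stat_retentions": [
    { "duration": 3600, "retention": 24 },
    { "duration": 86400, "retention": 30 }
  ]
}
```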
Historical statistics reference
Calling the /aggs endpoint on port 9888 returns a list of available statistics. For example, if you set the following retention policy in the stat_retentions field:

```json
// 24 1-hour buckets
"duration": 3600,
"retention": 24
```

The /aggs endpoint would return data for each bucket. Each bucket contains one hour of data (3600 seconds), representing one of the hours during the last 24 hours. For example, the following statistics are recorded for the hour that occurred one hour ago:
```text
/request_rates.http_req_rate.3600sec.3600sec_ago.cnt 362
/request_rates.http_req_rate.3600sec.3600sec_ago.sum 3547
/request_rates.http_req_rate.3600sec.3600sec_ago.avg 10
/request_rates.http_req_rate.3600sec.3600sec_ago.per_sec_avg 0
/request_rates.http_req_rate.3600sec.3600sec_ago.burst_avg 10
/request_rates.http_req_rate.3600sec.3600sec_ago.min 8
/request_rates.http_req_rate.3600sec.3600sec_ago.max 11
/request_rates.http_req_rate.3600sec.3600sec_ago.50p 9
/request_rates.http_req_rate.3600sec.3600sec_ago.75p 9
/request_rates.http_req_rate.3600sec.3600sec_ago.90p 9
/request_rates.http_req_rate.3600sec.3600sec_ago.95p 9
/request_rates.http_req_rate.3600sec.3600sec_ago.99p 9
/request_rates.http_req_rate.3600sec.3600sec_ago.99.9p 9
```
Each line is a key and value. The key has this format:
/name-of-stick-table . name-of-counter . bucket-duration . time since bucket occurred . statistic
For example:

```text
/request_rates.http_req_rate.3600sec.86400sec_ago.99p
```

This key refers to the 99th percentile of the http_req_rate counter in the request_rates stick table, for the one-hour bucket recorded 24 hours (86400 seconds) ago.
Some lines have a negative number in them:

```text
/request_rates.http_req_rate.3600sec.-3600sec_ago.cnt
```

This indicates a sliding time window whose begin and end times change at a regular interval (i.e. an hour ago from now, or more realistically, at the next time the smallest bucket is calculated). In contrast, the following metric is updated only at the top of every hour:

```text
/request_rates.http_req_rate.3600sec.3600sec_ago.cnt
```
The table below describes each statistic:
Field | Description |
---|---|
cnt | The count of data points of the counter (e.g. HTTP request rate) recorded in the bucket. |
sum | The sum of all data point values in the bucket. |
avg | An average of all data points in the bucket that preserves the time period of the stick table counter, which makes it easy to work with when comparing it to current request rates in HAProxy Enterprise; the sum of all data points (e.g. 3547) is multiplied by the stick table counter period (e.g. 10 for http_req_rate(10s)), then divided by the duration of the bucket (e.g. 3600). See the worked example after this table. |
per_sec_avg | An average of all data points in the bucket, converted to a 1-second average (discards the period of the stick table counter). |
burst_avg | A traditional, mathematical average; the sum of all data points is divided by the count. |
min | The minimum data point value in the bucket. |
max | The maximum data point value in the bucket. |
50p | The 50th percentile. |
75p | The 75th percentile. |
90p | The 90th percentile. |
95p | The 95th percentile. |
99p | The 99th percentile. |
99.9p | The 99.9th percentile. |
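As a worked example using the sample output above, and assuming the counter period from the table's own example (http_req_rate(10s)):

```text
avg       = sum * counter period / bucket duration = 3547 * 10 / 3600 ≈ 9.85 (reported as 10)
burst_avg = sum / cnt                              = 3547 / 362       ≈ 9.80 (reported as 10)
```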