HAProxy 3.0 maintains the strong momentum of our open-source load balancer into 2024 with improvements to simplicity, security, reliability, flexibility, and more. In this blog post, we'll dive into what’s new in version 3.0, providing examples and context. It’s a long list, so get comfortable and maybe bring a snack!
All these improvements (and more) will be incorporated into HAProxy Enterprise 3.0, releasing later this year.
Simplicity
New crt-store configuration section
The new crt-store configuration section provides a flexible way to store and consume SSL certificates. Replacing crt-list, crt-store separates certificate storage from its use in a frontend and provides better visibility for certificate information by moving it from external files into the main HAProxy configuration. The crt-store section allows you to individually specify the locations of each certificate component, for example certificate files, key files, and OCSP response files. Aliases provide support for human-friendly names for referencing the certificates more easily on bind lines. The ocsp-update argument is now configured in a crt-store instead of a crt-list.
Consider the following example where a frontend references a crt-list for its certificate information, including OCSP response files. This is how you would use ocsp-update prior to crt-store:
frontend main
    bind :443 ssl crt-list /etc/haproxy/certs/crt-list.txt
    default_backend app
The contents of crt-list.txt would be:
/etc/haproxy/certs/site1.pem [alpn h2 ocsp-update on]
And contained in site1.pem would be the public certificate, any intermediate certificates, and the private key. With ocsp-update on, HAProxy knows to look for an .ocsp file with the same name (site1.ocsp) in the same directory as the certificates and keys.
Note that defining the certificate information in this way makes the information less visible, as it is not defined within the HAProxy configuration file.
To remedy this, use a crt-store section. To the HAProxy configuration file, add the following:
crt-store web
    crt-base /etc/haproxy/ssl/certs
    load crt "site1.pem" ocsp "site1.ocsp" alias "site1" ocsp-update on
Note that in this example, the certificates, keys, and OCSP response files are split into their own files, and may be referenced individually. We are specifying a crt-base; this is the directory where we will place the files.
Now, you can reference these certificate elements in a frontend by their alias:
frontend main
    bind *:443 ssl crt "@web/site1"
    default_backend app
You must reference the certificates in a crt-store by using @<crt-store name>/<certificate name or alias>, which in this case is @web/site1. Note here that we also gave our certificates for site1 an alias, and are referencing them by that alias (site1). If you do not provide a name for your crt-store, you can reference its certificates like so: @/site1, leaving out the <crt-store name>.
You can also use crt-store to specify keys separately from their certificates, as in this example:
crt-store web
    crt-base /etc/ssl/certs/
    key-base /etc/ssl/private/
    load crt "site1.crt" alias "site1"
    load crt "site2.crt" key "site2.key"
In this case, for site2, there are separate crt and key files. We specify a crt-base, the location of the certificates, and a key-base, the location of the keys. Once you define your certificates and keys in a crt-store, you can then reference multiple certificates on the bind line:
frontend main
    bind *:443 ssl crt "@web/site1" crt "@web/site2.crt"
Since we did not give site2's certificates an alias, you must use the full filename (site2.crt) to reference the certificate.
You can also specify the crt-base and key-base in your global settings. Note that a crt-base or key-base included in a crt-store takes precedence over the global settings. The same is true for absolute paths used when specifying your certificates and keys.
Security
Protocol glitches
Some HTTP/2 requests are valid from a protocol perspective but pose problems anyway. For example, sending a single header as a large number of CONTINUATION frames could cause a denial of service. HAProxy now counts these so-called glitches and allows you to set a limit on them. You can also track them in a stick table to identify buggy applications or misbehaving clients.
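For example, here is a minimal sketch that uses the fc_glitches fetch (listed under New sample fetches later in this post) to refuse clients whose connection has accumulated too many glitches; the threshold of 100 is an arbitrary illustration:

frontend mysite
    bind :443 ssl crt /etc/haproxy/certs/site1.pem alpn h2
    # Reject requests on connections that have accumulated more than
    # 100 protocol glitches (arbitrary example threshold)
    http-request deny if { fc_glitches gt 100 }
    default_backend webservers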
OpenSSL security level
A new global keyword ssl-security-level
allows you to set globally, that is, on every HAProxy SSL context, the OpenSSL’s internal security level. This enforces the appropriate checks and restrictions per the level specified. Parameters specified, including cipher suite encryption algorithms, supported ECC curves, supported signature algorithms, DH parameter sizes, certificate key sizes, and signature algorithms, inconsistent with the level will be rejected. For more information see: OpenSSL Set Security Level.
Specify a value between 0 and 5 to set the level, where 5 is the most strict:
global
    ssl-security-level 5
Dependency on libsystemd removed
HAProxy's security has been hardened by the removal of the dependency on the libsystemd library. The libsystemd library has many additional dependencies of its own, including the library at fault for the XZ Utils backdoor vulnerability. Removing the dependency on libsystemd means that HAProxy is not exposed to these undesired libraries.
Prevent HAProxy from accepting traffic on privileged ports
Two new global settings, harden.reject-privileged-ports.tcp for TCP connections and harden.reject-privileged-ports.quic for QUIC connections, enable HAProxy to ignore traffic when the client uses a privileged port (0-1023) as its source port. Clients using this range of ports are suspicious, and such behavior often indicates spoofing, for example to launch DNS/NTP amplification attacks. The benefit of using these settings is that during such attacks, CPU is conserved because HAProxy drops packets arriving from privileged source ports instead of parsing them.
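For example, a sketch of enabling both settings in the global section (assuming they take an on/off value; check the documentation for your build):

global
    harden.reject-privileged-ports.tcp on
    harden.reject-privileged-ports.quic on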
Reliability
Persist stats
Reloading HAProxy will no longer reset the HAProxy Stats page, as long as you call the new Runtime API command dump stats-file first to save the current state to a file and then load that file with the stats-file global configuration directive. Upon reload, the new processes set their counters to the previous values (as they appear in the stats file). Only proxy counters are supported, which includes values for frontends, backends, servers, and listeners. Ensure that you've set a GUID on each frontend, backend, listen, and server object by using the new guid keyword. Only objects to which you have assigned a GUID will have their stats persisted across the reload.
To enable this behavior, follow these five steps:
Step 1
Assign GUIDs to each frontend, backend, listen, and server object in your configuration. Here is a simple configuration with one frontend that routes to one backend with one server. The frontend, backend, and server will each have a globally unique identifier assigned:
frontend main
    guid 64aeebba-0911-483c-aa39-8f689cf3f657
    bind *:80
    default_backend app

backend app
    guid 9de9a5de-c3c2-47c4-b945-c4f61b7ece6a
    balance roundrobin
    server app1 127.0.0.1:8080 check guid ae9194cc-f473-43de-ac21-43142340cf9c
Step 2
Reload HAProxy to apply your configuration changes:
$ sudo systemctl reload haproxy
Step 3
Call dump stats-file to create the stats file, redirecting the output to your desired location:
$ echo "dump stats-file" | \ | |
sudo socat stdio tcp4-connect:127.0.0.1:9999 > <path/to/stats/file> |
The data in the file will capture the current state of the objects.
Step 4
Add stats-file to your global settings with the path to the stats file that will be loaded on reload:
global
    stats-file <path/to/stats/file>
Step 5
Reload HAProxy. It will reload the stats counters from the file.
You can also generate GUIDs for items on bind lines. The guid-prefix keyword should be used in this case to specify a string prefix for automatically generated GUIDs. Each listener on the bind line will have a GUID automatically generated for it with the prefix included. Here's an example:
bind :80 guid-prefix my-prefix
HTTP/2: Limit the number of streams per connection
You can limit the total number of streams processed per incoming connection for HTTP/2 using the global keyword tune.h2.fe-max-total-streams. Once this limit is reached, HAProxy will send a graceful GOAWAY frame informing the client that it will close the connection after all pending streams have been closed. Usually, this prompts clients to reestablish their connections. This is helpful in situations where load balancing becomes imbalanced because clients maintain long-lived connections.
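For example, to ask clients to reconnect after 1000 total streams (an arbitrary value chosen for illustration):

global
    tune.h2.fe-max-total-streams 1000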
Diagnostic mode
Diagnostic mode (-dD) will now report on likely problematic ACL pattern values that look like known ACL/sample fetch keywords, in addition to its other warnings about suspicious configuration statements. It does not prevent startup. This is helpful for troubleshooting problematic configurations.
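For example, you can combine diagnostic mode with a configuration check:

$ haproxy -c -dD -f /etc/haproxy/haproxy.cfg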
Consistent hash server mapping
When load balancing using a hash-based algorithm (balance hash), HAProxy must keep track of which server is which. Instead of using numeric IDs to compute hash keys for the servers in a backend, the hash-key directive now supports using the servers' addresses and ports to compute the hash keys. This is useful in cases where multiple HAProxy processes are balancing traffic to the same set of servers, as each independent HAProxy process will calculate the same hash key and therefore agree on routing decisions, even if its list of servers is in a different order.
The new hash-key directive allows you to specify how the node keys, that is, the hash for each server, are calculated. This applies to node keys where you have set hash-type to consistent. There are three options for hash-key:

id: the server's numeric id if set, or its position in the server list
addr: the server's IP address
addr-port: the server's IP address and port
If you were to use hash-key id and had no IDs explicitly set for your servers, the hashes could be inconsistent across load balancers if the server lists were ordered differently on each one. Using addr or addr-port solves this problem.
Consider the following backend:
backend app
    balance hash pathq
    hash-type consistent
    server app1 192.168.56.30:80 check hash-key addr
    server app2 192.168.56.31:80 check hash-key addr
This backend will use the hash load balancing algorithm, calculating the hash using the sample expression pathq (the request's URL path with the query-string). Since we specified the hash-type as consistent and the servers' hash-key as addr (the IP address of the server), the servers' IP addresses are used in the hash key computation. As such, if we had multiple load balancers with the servers listed in different orders, the routing would be consistent across the load balancers.
If multiple instances of HAProxy are running on the same host, for example in Docker containers, using both the IP address and the port (hash-key addr-port) is appropriate for creating the hash key.
Memory limit
To limit the total allocatable memory to a specified number of megabytes across all processes, use -m <limit>. This is useful in situations where resources are constrained, but note that specifying a limit this way may cause connection refusals and slowdowns depending on memory usage requirements. The limiting mechanism now uses RLIMIT_DATA instead of RLIMIT_AS, which prevents an issue where the processes were reaching out-of-memory conditions below the configured limit. In master-worker mode, the memory limit is applied separately to the master and its forked worker process, as memory is not shared between processes.
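For example, to start HAProxy with a limit of 1024 MB:

$ haproxy -m 1024 -f /etc/haproxy/haproxy.cfg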
Unix sockets and namespaces
Version 3.0 supports the namespace keyword for UNIX sockets specified on bind and server lines. If a permissions issue arises, HAProxy will log a permissions error instead of "failed to create socket", since the capability cap_sys_admin is required for using namespaces for sockets.
For example, to specify a UNIX socket with a namespace, use the namespace keyword on your server line:
backend app
    server app1 /var/run/my-application-socket.sock check send-proxy namespace my-namespace
gRPC
With the gRPC protocol, the client can cancel communication with the server, which should be conveyed to the server so that it can clean up resources and perhaps invoke cancellations of its own to upstream gRPC services. Cancellations are useful for a few reasons. Often, gRPC clients configure deadlines so that a call will be canceled if it runs too long. Or a client might invoke a cancellation if it finishes early and no longer needs the stream. Read more about cancellation in the gRPC documentation.
Prior to HAProxy 3.0, cancellations, which are represented as RST_STREAM frames in HTTP/2 or as STOP_SENDING frames in HTTP/3, were not being properly relayed to the server. This has been fixed. Furthermore, HAProxy 3.0 adds new fetch methods that indicate whether the stream was canceled (aka aborted) and the error code. Below, we include them in a custom log format:
log-format "$HAPROXY_HTTP_LOG_FMT fs-aborted: %[fs.aborted] fs-rst-code: %[fs.rst_code] bs-aborted: %[bs.aborted] bs-rst-code: %[bs.rst_code]"
Here's the output when the client cancels:
fs-aborted: 1 fs-rst-code: 8 bs-aborted: 0 bs-rst-code: -
New global keyword: expose-deprecated-directives
If you want to use deprecated directives, you must also use the expose-deprecated-directives global keyword, which will silence warnings. This global keyword applies to deprecated directives that do not have alternative solutions.
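For example:

global
    expose-deprecated-directives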
Emergency memory allocation
The mechanism for emergency memory allocation in low memory conditions has been improved for HTTP/1 and applets. Forward progress in processing is guaranteed as tasks are now queued according to their criticality and which missing buffer they require.
Flexibility
Write HAProxy logs as JSON or CBOR
You can now format log lines as JSON and CBOR. When configuring a custom log format, you will indicate which to use, and then in parentheses set the key for each field.
Here is an example for JSON:
frontend mysite
    bind :80
    log-format "%{+json}o %(client_ip)ci %(client_port)cp %(request_date)tr %(frontend_name)ft %(backend_name)b %(server_name)s %(time_to_receive)TR %(time_waiting)Tw %(time_to_connect)Tc %(time_server_response)Tr %(time_active)Ta %(status_code)ST %(bytes_read)B %(request_cookies)CC %(response_cookies)CS %(termination_state)tsc %(process_active_connections)ac %(frontend_active_connections)fc %(backend_active_connections)bc %(server_active_connections)sc %(retries)rc %(server_queue)sq %(backend_queue)bq %(request_headers)hr %(response_headers)hs %(request_line)r"
    use_backend webservers
This generates messages in the log file that look like this:
{"client_ip": "172.21.0.1", "client_port": 57526, "request_date": "05/Jun/2024:20:59:57.698", "frontend_name": "mysite", "backend_name": "webservers", "server_name": "web1", "time_to_receive": 0, "time_waiting": 0, "time_to_connect": 0, "time_server_response": 0, "time_active": 0, "status_code": 200, "bytes_read": 993, "request_cookies": "", "response_cookies": "", "termination_state": "----", "process_active_connections": 1, "frontend_active_connections": 1, "backend_active_connections": 0, "server_active_connections": 0, "retries": 0, "server_queue": 0, "backend_queue": 0, "request_headers": null, "response_headers": null, "request_line": "GET / HTTP/1.1"} |
If you switch that line to start with %{+cbor} instead of %{+json}, then the generated messages will look like this:
BF69636C69656E745F69706A3137322E32312E302E316B636C69656E745F706F727419E0306C726571756573745F64617465781830352F4A756E2F323032343A32313A30323A33392E3334346D66726F6E74656E645F6E616D657F666D7973697465FF6C6261636B656E645F6E616D656A776562736572766572736B7365727665725F6E616D6564776562316F74696D655F746F5F72656365697665006C74696D655F77616974696E67006F74696D655F746F5F636F6E6E656374007474696D655F7365727665725F726573706F6E7365006B74696D655F616374697665006B7374617475735F636F646518C86A62797465735F726561641903E16F726571756573745F636F6F6B696573F670726573706F6E73655F636F6F6B696573F6717465726D696E6174696F6E5F7374617465642D2D2D2D781A70726F636573735F6163746976655F636F6E6E656374696F6E7301781B66726F6E74656E645F6163746976655F636F6E6E656374696F6E7301781A6261636B656E645F6163746976655F636F6E6E656374696F6E730078197365727665725F6163746976655F636F6E6E656374696F6E73006772657472696573006C7365727665725F7175657565006D6261636B656E645F7175657565006F726571756573745F68656164657273F670726573706F6E73655F68656164657273F66C726571756573745F6C696E657 |
Virtual and optional ACL and Map files
In containerized environments, sometimes you don't want to deal with attaching a volume and mapping that volume to the container's filesystem, especially when it comes to storing data that is dynamic. In other words, wouldn't it be nice to remove the need to attach a volume? With HAProxy 3.0, you can now work with virtual ACL and Map files. By prefixing your ACL and Map files with virt@, HAProxy won't search the filesystem for the files, but rather creates in-memory representations of them only.
The configuration below sets the stage for adding IP addresses to a denylist, where that denylist is virtual:
frontend mysite
    bind :80
    acl denylist src -f virt@denylist.acl
    http-request deny if denylist
    default_backend webservers
Then to add an address to the denylist, use the Runtime API. Below, we deny the IP address 172.20.0.1:
$ echo "add acl virt@denylist.acl 172.20.0.1" | sudo socat stdio tcp4-connect:127.0.0.1:9999 |
You can also prefix the filename with opt@, which marks the file as optional. In that case, HAProxy will check for the file on the filesystem, but if it doesn't find it, will assume it is virtual. That's useful for later saving the contents to a file so that they persist across reloading HAProxy.
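For example, the denylist above could instead be marked optional, so HAProxy loads the file if it exists and otherwise starts with an empty in-memory list:

acl denylist src -f opt@denylist.acl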
Report the key that matched in a map file
A request matched a row in a map file, but which row? This version of HAProxy adds several new map_*_key converters that return the key that was matched, rather than the associated value, making it easier to see why the load balancer rerouted a request or took some other action.
In the example below, we use a Map file for path-based routing, where the request's URL path determines which backend to send a request to. By using a converter that ends in _key, in this case map_beg_key, which will match the beginning of the key in the file and then return the key, we can record which key in the Map file matches the request:
frontend mysite
    bind :80
    # Store path in a variable
    http-request set-var(txn.path) path
    use_backend %[var(txn.path),map_beg(/etc/haproxy/paths.map,webservers)]
    # Custom log format using map_beg_key
    log-format "Matching key: %[var(txn.path),map_beg_key(/etc/haproxy/paths.map)]"
Let's assume that paths.map looks like this:
/api apiservers
/cart cartservers
Then when a user requests a URL that begins with /api, they'll be sent to the backend named apiservers. When they request a URL that begins with /cart, they'll be sent to the backend named cartservers. If a request matches neither, it will record a dash in the logs. Our log shows this:
Matching key: /api
Matching key: /cart
Matching key: -
Explicitly set default TLS certificates
HAProxy can proxy traffic for multiple websites through the same frontend. To choose the right TLS certificate, it will compare the TLS Server Name Indication (SNI) value of the incoming connection with the certificates it finds in the certificates directory. If no certificate in your directory matches, it will resort to using a default one, which is the certificate that was loaded first (was first in the directory). But what if you want to set multiple defaults? For example, suppose you want to default to an ECDSA certificate if the client supports it, and otherwise fall back to an RSA certificate. Now you have multiple ways to do this:
Ensure that the first file in the directory is a multi-cert bundle.
Use the new default-crt argument on a bind line.
If using crt-list, indicate a default certificate by marking it with an asterisk.
To demonstrate setting one or more default-crt arguments on the frontend's bind line, below we set crt to a directory so that HAProxy will select the correct certificate from that directory based on SNI. But if there is no match, it will instead use one of the default files, either ECDSA or RSA:
frontend mysite
    bind :80
    bind :443 ssl default-crt /certs/default.pem.ecdsa /certs/default.pem.rsa crt /certs/
    # Redirects to HTTPS
    http-request redirect scheme https unless { ssl_fc }
Track specific HTTP errors
Until now, you could capture in a stick table the count and rate of client HTTP errors (4xx status codes) and server HTTP errors (5xx status codes), but you could not control specifically which status codes were included. This version adds the global directives http-err-codes and http-fail-codes that let you set the status codes you care about, allowing you to ignore those that don't matter to you. This works for responses from backend servers and for responses from HAProxy.
Tracking client HTTP errors can be useful for discovering misconfigured client applications, such as those that repeatedly use the wrong credentials or that make an unusually high number of requests. Below, we configure a stick table to track only 403 Forbidden errors, but you can also configure HAProxy to track a range of status codes by separating them with commas or indicating a range of codes with a dash:
global
    http-err-codes 403

frontend mysite
    bind :80
    # Create storage for tracking client errors
    stick-table type ip size 1m expire 24h store http_err_cnt
    # Begin tracking requests
    http-request track-sc0 src
    default_backend webservers
Then we use the Runtime API command show table to see the number of 403 errors from each client. Here, the client that has the source IP address 172.19.0.10 has had 20 errors:
$ echo "show table mysite" | socat stdio tcp4-connect:127.0.0.1:9999 | |
# table: mysite, type: ip, size:1048576, used:1 | |
0x7f99498020e8: key=172.19.0.10 use=0 exp=86396991 shard=0 http_err_cnt=20 |
Error counts show you client-side issues, such as requesting a missing page or a forbidden page. On the other hand, you can set the global directive http-fail-codes to track server HTTP errors, such as 500 Internal Server Error and 503 Service Unavailable. Use it with a stick table that tracks http_fail_rate or http_fail_cnt to track server-side failure rates and counts.
Stick table pointers
Previously, you could use the Runtime API commands clear table [table] [key] and set table [table] [key] to remove or update a record in a stick table based on its key. You can now pass the record's unique ID (its pointer) to remove it or update it. You get the pointer from the show table [table] command.
In the example below, the pointer is 0x7f7de4bb50d8:
$ echo "show table mysite" | socat stdio tcp4-connect:127.0.0.1:9999 | |
# table: mysite, type: ip, size:1048576, used:1 | |
0x7f7de4bb50d8: key=172.20.0.1 use=0 exp=86389115 shard=0 http_req_cnt=4 |
To update the record, use its pointer. Below we set the HTTP request count to zero:
$ echo "set table mysite ptr 0x7f7de4bb50d8 data.http_req_cnt 0" | socat stdio tcp4-connect:127.0.0.1:9999 |
Similarly, to remove the record, use its pointer:
$ echo "clear table mysite ptr 0x7f7de4bb50d8" | socat stdio tcp4-connect:127.0.0.1:9999 |
Linux capabilities
Version 2.9 added the ability to preserve previously set Linux capabilities after the HAProxy process starts up. Linux capabilities are permissions granted to a process that it would not otherwise have when running as a non-root user. These permissions become relevant since HAProxy does not run as a root user after startup.
In version 3.0, HAProxy is smarter about checking capabilities and will no longer emit error messages regarding the need for root permissions when running in transparent proxying mode or when binding to a port below 1024 (as is the case when using QUIC), so long as cap_net_admin is set. Additionally, file-system capabilities can now be set on the binary, and if you start HAProxy as a root user, then adding setcap in the configuration is enough for adding those capabilities to the process. HAProxy will move the capabilities set on the binary (Permitted set) to its Effective set as long as the capabilities are also present in the setcap keyword list.
As a refresher, you can indicate which Linux capabilities to preserve after startup by using the setcap global keyword, specifying capabilities with a comma between each one:
global
    setcap cap_net_bind_service,cap_net_admin
Note that due to some security restrictions set in place by modules such as SELinux or Seccomp, HAProxy may be unable to set the required capabilities on the process. In this case, you must also set the capabilities from the command line on the HAProxy binary:
$ sudo setcap cap_net_bind_service,cap_net_admin=p /usr/local/sbin/haproxy
Available capabilities that you can preserve using setcap include:

cap_net_admin: Used for transparent proxying mode and for binding to privileged ports (lower than 1024, for example, for QUIC).
cap_net_raw (subset of cap_net_admin): Used for setting a source IP address that does not belong to the system itself, as is the case with transparent proxying mode.
cap_net_bind_service: Used for binding a socket to a specific network interface. This is required when using QUIC and binding to a privileged port.
cap_sys_admin: Used for creating a socket in a specific network namespace.
Set fwmark on packets to clients and servers
With HAProxy, you can set the fwmark on an IP packet, which classifies it so that, for example, it can use a specific routing table. HAProxy 3.0 now supports setting an fwmark on connections to backend servers as well as to clients connected on the frontend. Use the set-fc-mark and set-bc-mark actions. These replace the set-mark action, which had applied only to frontends.
To test this, first, give HAProxy the cap_net_admin capability, which is required for setting marks, by adding the setcap directive to the global section of your configuration:
global
    setcap cap_net_admin
We'd like to mark packets coming from the servers backend and ensure that they always go out through a specific network interface. Let's set an fwmark with a value of 2 (an arbitrary number) for the current connection. We hardcode the value, but you can also use an expression of fetch methods and converters to set it:
backend servers
    http-request set-bc-mark 2
    server server1 192.168.56.11:80
Now that we're marking packets, we just need to tell the network stack to route those packets through the desired interface, which is eth2 in this case:
$ sudo ip rule add fwmark 2 table 2
$ sudo ip route add table 2 default via 192.168.56.1 dev eth2
To verify that the mark was set and that traffic is using the eth2 interface, you can use iptables to log the traffic:
$ sudo iptables -A OUTPUT --dst 192.168.56.11 -j LOG --log-prefix "output: "
Watch the kernel log. It shows OUT=eth2 and MARK=0x2:
$ sudo tail -f /var/log/kern.log
output: IN= OUT=eth2 SRC=192.168.56.5 DST=192.168.56.11 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=21392 DF PROTO=TCP SPT=54738 DPT=80 WINDOW=64240 RES=0x00 SYN URGP=0 MARK=0x2
Set traffic priority
HAProxy can modify the header of an IP packet to include the Differentiated Services (DS) field. This field classifies the packet so that the network stack can prioritize it higher or lower in relation to other traffic on the network. New in this version of HAProxy, you can set this field on connections to backend servers in addition to frontend connections to clients. To set the value, use the set-fc-tos and set-bc-tos actions (referring to the old Type of Service (TOS) header field, which has been superseded by DS).
First, give the HAProxy binary the cap_net_admin capability, which is required for setting network priority:
$ sudo setcap 'cap_net_admin=ep' /usr/local/sbin/haproxy
Then in the HAProxy configuration file, add the setcap directive to preserve that capability after HAProxy drops root privileges:
global
    setcap cap_net_admin
We would like to prioritize video traffic on the backend, so we use the set-bc-tos action with a value of 26. You can learn more about DSCP in RFC 4594, and find common values on the IANA DSCP webpage. Although we're hardcoding the value, you can also use an expression of fetch methods and converters to set it:
backend servers
    http-request set-bc-tos 26
    server server1 192.168.56.11:80
New sample fetches
New fetch methods introduced in this version expose data that you can use in ACL expressions. They include fetches that return the number of open HTTP streams for a backend or frontend, the size of the backend queue, the allowed number of streams, and a value that indicates whether a connection was redispatched because a server became unreachable. An example follows the list below.

bc_be_queue – The number of streams de-queued while waiting for a connection slot on the target backend
bc_glitches – The number of protocol glitches counted on the backend connection
bc_nb_streams – The number of streams opened on the backend connection
bc_srv_queue – The number of streams de-queued while waiting for a connection slot on the target server
bc_settings_streams_limit – The maximum number of streams allowed on the backend connection
fc_glitches – The number of protocol glitches counted on the frontend connection
fc_nb_streams – The number of streams opened on the frontend connection
fc_settings_streams_limit – The maximum number of streams allowed on the frontend connection
txn.redispatched – True if the connection has experienced redispatch
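As a sketch, you could extend the default HTTP log format with a few of these fetches (the field labels are arbitrary):

frontend mysite
    bind :80
    # Log whether the session was redispatched and how many streams are
    # open on the client connection
    log-format "$HAPROXY_HTTP_LOG_FMT redispatched: %[txn.redispatched] fe-streams: %[fc_nb_streams]"
    default_backend webservers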
Weights in log backends
Log backends were introduced in version 2.9. They allow you to set mode log in a backend to load balance the Syslog protocol. You can connect to backend Syslog servers over TCP or UDP.
In version 3.0, you can now set weights on server lines in your mode log backends. The example below demonstrates a log backend that uses weights:
log-forward syslog
    bind :514
    dgram-bind :514
    log backend@mylog local1

backend mylog
    mode log
    balance roundrobin
    server log1 udp@192.168.56.42:514 weight 10
    server log2 udp@192.168.56.43:514 weight 20
Here, HAProxy listens for incoming log messages on TCP and UDP ports 514. As a simple setup, you can run an NGINX web server as a Docker container, setting up the Docker daemon to forward the container's logs to HAProxy by saving the following as /etc/docker/daemon.json:
{
    "log-driver": "syslog",
    "log-opts": {
        "syslog-address": "udp://127.0.0.1:514",
        "syslog-facility": "local1"
    }
}
Then run the NGINX container and watch the logs come through on your Syslog servers as you make web requests:
$ sudo docker run -p 80:80 -it --rm nginx:latest
Because we've set weights on the servers in HAProxy, the servers will receive different amounts of log traffic.
Support for UUIDv7 identifiers
You can now generate universally unique identifiers that use the UUIDv7 format. UUIDs of this type factor in the current UNIX timestamp and are therefore time sortable, which tells you when a UUID was generated in relation to other UUIDs. In the example below, we set the unique ID format to version 7 and use the unique-id fetch method to get a new UUID to include in the logs:
frontend mysite
    bind :80
    unique-id-format %[uuid(7)]
    log-format "%{+json}o %(id)[unique-id] %(client_ip)ci %(client_port)cp %(request_line)r"
    use_backend webservers
Log messages will look like this, where the first field is the UUID:
{"id": "018fea59-8306-7d01-9a97-01332adf4905", "client_ip": "172.21.0.1", "client_port": 53280, "request_line": "GET /favicon.ico HTTP/1.1"} |
HTTP forward proxy for OCSP updates
HAProxy 2.8 introduced a new and simpler way to configure OCSP stapling. HAProxy periodically contacts your SSL certificate issuer's OCSP server to get the revocation status of your SSL certificate. In version 3.0, if you're operating in an air-gapped environment where the HAProxy server does not have direct access to the Internet, and therefore can't connect to the OCSP server, you can set an HTTP forward proxy to reach the Internet. Add the ocsp-update.httpproxy global directive to indicate the proxy's address:
global
    ocsp-update.httpproxy 192.168.0.10:8000
Then the HTTP forward proxy can relay the OCSP request to the OCSP server.
Reverse HTTP
Reverse HTTP, which is an experimental feature added in HAProxy 2.9, allows HAProxy load balancers to self-register with an edge HAProxy load balancer and to then fill in as backend servers. In version 2.9, the mechanism for matching client requests with the correct backend server / HAProxy instance was via SNI. A new directive named pool-conn-name provides more flexibility, allowing you to set the name to match with an expression of fetch methods and converters.
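As a rough sketch, assuming the rhttp@ address scheme introduced with reverse HTTP in 2.9 and a static string expression (verify the exact placement and syntax against the 3.0 documentation):

backend app
    # Hypothetical: match pre-established reverse connections by a
    # computed name instead of SNI
    server edge rhttp@ pool-conn-name str(instance-1)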
Observability
Prometheus
When configuring the Prometheus exporter, you can now include a URL parameter named extra-counters, which enables additional counters that provide information related to the HTTP protocol, QUIC, and SSL. Set this in your Prometheus server's prometheus.yml file, replacing haproxy with your load balancer's IP address:
scrape_configs:
  - job_name: 'load-balancer-metrics'
    static_configs:
      - targets: ['haproxy:8405']
    params:
      extra-counters: ["on"]
Decrypt TLS 1.3 packet captures to backend servers
When diagnosing network issues, you will often need to analyze TLS-encrypted traffic to see the underlying application-layer protocol messages. For example, when using Wireshark, you can import a packet capture file that contains the traffic you would like to analyze, but then you need a way to decipher the captured packets. The most common way to do this is to import a key log file into Wireshark, which contains the secrets needed to decipher a TLS session.
While prior versions of HAProxy supported producing a key log file for deciphering traffic between the client and HAProxy, HAProxy 3.0 adds the ability to produce a key log file for deciphering TLS traffic between HAProxy and backend servers, specifically when using TLS 1.3.
Follow these seven steps:
Step 1
In your HAProxy configuration, set tune.ssl.keylog to on in the global section. This activates the retrieval of the TLS keys you will use for decryption in Wireshark:
global
    tune.ssl.keylog on
Step 2
Force HAProxy and the backend servers to use TLS 1.3 by adding the ssl-min-ver argument to the servers:
backend servers
    server s1 192.168.56.50:443 ssl verify required ca-file /etc/haproxy/certs/ca.crt ssl-min-ver TLSv1.3
Step 3
Define a custom log format that writes TLS session secrets to the access log:
frontend mysite
    log-format "CLIENT_EARLY_TRAFFIC_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_client_early_traffic_secret]\nCLIENT_HANDSHAKE_TRAFFIC_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_client_handshake_traffic_secret]\nSERVER_HANDSHAKE_TRAFFIC_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_server_handshake_traffic_secret]\nCLIENT_TRAFFIC_SECRET_0 %[ssl_bc_client_random,hex] %[ssl_bc_client_traffic_secret_0]\nSERVER_TRAFFIC_SECRET_0 %[ssl_bc_client_random,hex] %[ssl_bc_server_traffic_secret_0]\nEXPORTER_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_exporter_secret]\nEARLY_EXPORTER_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_early_exporter_secret]"
Step 4
After HAProxy connects to a backend server, the access log will contain the keys for that TLS session, with lines like this:
CLIENT_EARLY_TRAFFIC_SECRET C030AF8EAEE688F1F3A360E5D53E260DEAB346F93CE594153D95E33E4BFD5F80 -
CLIENT_HANDSHAKE_TRAFFIC_SECRET C030AF8EAEE688F1F3A360E5D53E260DEAB346F93CE594153D95E33E4BFD5F80 15ab9abf57145fe49c73d9a617eca9b918d5c4dd455c4bb923c04a936475241facbac21f66bca7c459f5179f753f4afa
SERVER_HANDSHAKE_TRAFFIC_SECRET C030AF8EAEE688F1F3A360E5D53E260DEAB346F93CE594153D95E33E4BFD5F80 09bded135c6b85959d0c2eaf09d177cc4fb9e2d9777cbda5a234d0894ef84b64bbd346cc331a16111d4273d639090d5b
CLIENT_TRAFFIC_SECRET_0 C030AF8EAEE688F1F3A360E5D53E260DEAB346F93CE594153D95E33E4BFD5F80 155b07c8fcef945cbad456f6b11e216fde42f9ac1cdc8c6eff4bed845caf520a2a490ccba3ae06ffe3d9091904674c41
SERVER_TRAFFIC_SECRET_0 C030AF8EAEE688F1F3A360E5D53E260DEAB346F93CE594153D95E33E4BFD5F80 18ed3dc1188b7ed1085cbdf41b0f0388b80904f6f21b8962f57cdf460d5694f2b2d99f7055ac44f0e6afefc9e790626b
EXPORTER_SECRET C030AF8EAEE688F1F3A360E5D53E260DEAB346F93CE594153D95E33E4BFD5F80 9479651fd91e38d549b284ecae7c6430743ae56cc4e8fb899eaf0a4016891d3991b01691c1c4c787d95a10c
Step 5
Copy these lines to a text file. Then import the file into Wireshark via Edit > Preferences > Protocols > TLS > (Pre)-Master-Secret log filename.
Step 6
At the same time, capture traffic between HAProxy and the backend servers. For example, you can run tcpdump on the HAProxy server to get a PCAP file:
$ tcpdump -s 0 port 443 -i eth0 -w mycap.pcap
Step 7
Open this PCAP file in Wireshark to see the deciphered traffic.
Runtime API
show quic verbosity
The Runtime API's show quic command gained finer-grained levels of verbosity and the ability to filter the output to a specific connection. Previously, the verbosity levels you could specify were oneline or full. Now you can also specify a comma-delimited list of values that determine the output. Values include tp, sock, pktns, cc, and mux. Also, in this version you can specify the connection's hexadecimal address to see information for just that connection.
Here are examples that show output for the new verbosity levels:
$ echo "show quic" | socat stdio tcp4-connect:127.0.0.1:9999 | |
# conn/frontend state in_flight infl_p lost_p Local Address Foreign Address local & remote CIDs | |
0x7f3501354e80[04]/mysite ESTAB 0 0 0 172.22.0.5:443 172.22.0.1:40761 2beffb69742599f2 fd1050 | |
$ echo "show quic tp" | socat stdio tcp4-connect:127.0.0.1:9999 | |
* 0x7f44d3b356a0[00]: scid=afa23221abf0a878........................ dcid=d473e9.................................. | |
loc. TPs: odcid=0917fd0b378fc3ae5b761af255 iscid=afa23221abf0a878 | |
midle_timeout=30000ms mudp_payload_sz=2048 ack_delay_exp=3 mack_delay=25ms act_cid_limit=8 | |
md=1687140 msd_bidi_l=16380 msd_bidi_r=16380 msd_uni=16380 ms_bidi=100 ms_uni=3 | |
(no_act_migr,stless_rst_tok) | |
rem. TPs: iscid=d473e9 | |
midle_timeout=30000ms mudp_payload_sz=65527 ack_delay_exp=3 mack_delay=20ms act_cid_limit=8 | |
md=25165824 msd_bidi_l=12582912 msd_bidi_r=1048576 msd_uni=1048576 ms_bidi=16 ms_uni=16 | |
(no_act_migr) versions:chosen=0x00000001,negotiated=0x00000001 | |
st=opened mux=ready expire=04s | |
$ echo "show quic sock" | socat stdio tcp4-connect:127.0.0.1:9999 | |
* 0x7f44d3b36670[01]: scid=36416af6fda6c5d2........................ dcid=47e1d4.................................. | |
st=opened mux=ready expire=18s | |
fd=71 local_addr=172.22.0.5:443 foreign_addr=172.22.0.1:40203 | |
$ echo "show quic pktns" | socat stdio tcp4-connect:127.0.0.1:9999 | |
* 0x7f44d3b36670[01]: scid=36416af6fda6c5d2........................ dcid=47e1d4.................................. | |
st=opened mux=ready expire=07s | |
[01rtt] rx.ackrng=1 tx.inflight=0 | |
$ echo "show quic cc" | socat stdio tcp4-connect:127.0.0.1:9999 | |
* 0x7f44d3b356a0[02]: scid=6efe15fc4097ac7d........................ dcid=5ed3d6.................................. | |
st=opened mux=ready expire=27s | |
srtt=1 rttvar=0 rttmin=1 ptoc=0 cwnd=17093 mcwnd=17093 sentpkts=13 lostpkts=0 reorderedpkts=0 | |
$ echo "show quic mux" | socat stdio tcp4-connect:127.0.0.1:9999 | |
* 0x7f44d3b356a0[02]: scid=6efe15fc4097ac7d........................ dcid=5ed3d6.................................. | |
st=opened mux=ready expire=16s | |
qcc=0x0x7f44cde23090 flags=0x0 sc=0 hreq=0 | |
qcs=0x0x7f44cde36c60 id=2 flags=0x0 st=OPN rxoff=29 | |
qcs=0x0x7f44cde36930 id=3 flags=0x0 st=OPN txoff=17 | |
qcs=0x0x7f44cde36df0 id=6 flags=0x0 st=OPN rxoff=1 | |
qcs=0x0x7f44cde0c050 id=10 flags=0x0 st=OPN rxoff=1 |
Cookies for dynamic servers
Static cookies for session persistence are now supported for dynamically added servers. Dynamic servers refer to servers that do not have an explicit entry within your HAProxy configuration file. They are dynamic in the sense that you can add them programmatically using Runtime API commands. Dynamic servers are valuable in cases where you may have many servers that scale with traffic: when traffic loads are high, you add more servers, and when traffic loads diminish, you remove servers.
Previous versions of HAProxy did not support adding these dynamic servers and also using static cookies with those servers, but as of version 3.0, you can now use the add server command to add the server and specify its static cookie using just one command. Note that when adding a dynamic server, you must choose a load balancing algorithm for your backend that is dynamic (roundrobin, leastconn, or random).
To add a dynamic server to your backend with a static cookie, issue the add server command, specifying your cookie:
$ echo "add server app/app2 172.16.0.12:8080 cookie app2" | socat stdio tcp4-connect:127.0.0.1:31232 |
Here's the output:
New server registered.
You can also enable auto-generated names for session persistence cookies. For more information, see our guide for setting a dynamic cookie key. In that case, if you set the cookie argument on your add server command, the static cookie you specify will take precedence over the backend's setting for dynamic cookies.
As a reminder, servers added via the Runtime API are not persisted after a reload. To ensure that servers you add via the Runtime API persist after a reload, be sure to also add them to your HAProxy configuration file (thus making them static servers).
Del server
Removing a dynamic server with the del server command is faster now that the command can close idle connections, where previously it would wait for the connections to close by themselves. This improvement is made possible by changes to the removal mechanism that allow forceful closure of idle connections.
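For example, deleting a dynamic server once it is drained:

$ echo "del server app/app2" | socat stdio tcp4-connect:127.0.0.1:31232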
Wait command
The new wait command for the Runtime API will wait for some specified time before then performing the following command. You could use this to collect metrics on a certain interval. This is also useful in cases where you need to wait until a server becomes removable (the server has been drained of connections) before running additional commands, such as del server.
The syntax for the command is: wait { -h | <delay> } [<condition> [<args>...]]
If you do not provide any conditions, the command will simply wait for the requested delay (in milliseconds by default) before it continues processing.
With this release, the only supported condition is srv-removable, which will wait until the specified server is removable. When using socat, be sure to extend socat's timeout to account for the wait time.
The following example calls show activity, waits 10 seconds, then calls show activity again. Note that socat's timeout value has been increased to 20 seconds:
$ echo "show activity; wait 10s; show activity" | socat -t 20 stdio tcp4-connect:127.0.0.1:9999 |
Here's the output:
thread_id: 1 (1..2)
date_now: 1717443564.121845
uptime_now: 659.199676
[...]
Done.
thread_id: 1 (1..2)
date_now: 1717443574.123014
uptime_now: 669.200845
[...]
This example disables the server named app/app1, calls shutdown sessions for the server, waits for the server to be removable (using the condition srv-removable), and then once removable, deletes the server:
$ echo "disable server app/app1; shutdown sessions server app/app1; wait 2s srv-removable app/app1; del server app/app1" | socat stdio tcp4-connect:127.0.0.1:31232 |
Finally, here's the output:
Done.
Server deleted.
Performance
Fast forwarding
Zero-copy forwarding was introduced in version 2.9 for TCP, HTTP/1, HTTP/2, and HTTP/3. As of version 3.0, applets, such as the cache, can also take advantage of the fast forwarding mechanism, which avoids queuing more data when the mux buffer is full. This results in significantly less memory usage and higher performance. This behavior can be disabled by using tune.applet.zero-copy-forwarding for applets only, or tune.disable.zero-copy-forwarding globally.
Regarding QUIC, simplification of the internal send API resulted in the removal of one buffer copy. Fast forwarding now considers the flow control, which reduces the number of thread wakeups and optimizes packet filling.
The HTTP/1 mux now also supports zero-copy forwarding for chunks of unknown size, for example, chunks whose size may be larger than the buffer.
Ebtree update
Performance improved for ebtree on non-x86 machines. This results in approximately 3% faster task switching and approximately 2% faster string lookups.
Server name lookup
Configuration parsing time will see an improvement thanks to a change in server name lookups. The lookups now use a tree, which improves lookup time, whereas before the lookup was a linear operation.
QUIC
QUIC users will see a performance increase when using two new global settings:
tune.quic.reorder-ratio
By adjusting tune.quic.reorder-ratio, you can change how quickly HAProxy detects packet losses. This setting applies to outgoing packets. When HAProxy receives an acknowledgement (ACK) for a packet it sent to the destination and that ACK arrived before other expected ACKs, or in other words it arrived out of sequence, it could indicate that packets never reached the destination. If it happens frequently, it indicates a poor network condition. By lowering the ratio, you're lowering the number of packets that can be missing ACKs before HAProxy takes action. That action is to reduce the packet sending window, which forces HAProxy to slow down its rate of transfer, at the cost of slower throughput. The default value is 50%, which means that the latest ACKed packet was halfway up the range of sent packets awaiting acknowledgements, with packets preceding it not yet ACKed.
tune.quic.cc-hystart
Use this setting to enable use of the HyStart++ (RFC9406) algorithm instead of the Cubic algorithm. This provides an alternative to the TCP slow start phase of the congestion control algorithm. This algorithm may show better recovery patterns regarding packet loss.
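A sketch of both settings in the global section (the 25% ratio is an arbitrary example; assuming cc-hystart takes an on/off value):

global
    tune.quic.reorder-ratio 25
    tune.quic.cc-hystart on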
Additionally, the send path for QUIC was improved by cleaning up some of the low-level QUIC sending functions. This includes exclusive use of sendmsg() (a system call for sending messages over sockets), optimizations avoiding unnecessary function calls, and avoiding copies where possible.
Traces
An improvement to traces will enable their use on servers with moderate to high levels of traffic without risk of server impact.
The improvement to traces was made possible by the implementation of near lockless rings. Previously, the locking mechanism limited the possibility of using traces in a production environment. Now that the rings are nearly lockless, allowing for parallel writes per group of threads, performance with traces enabled has been increased up to 20x.
A new global setting, tune.ring.queues, sets the number of write queues in front of ring buffers and can be used for debugging, as changing the value may reveal important behavior during a debugging session. This should only be changed if you are instructed to do so for troubleshooting.
Stick tables
Stick tables have received a performance boost due to a change in the locking mechanism. Previously, when the number of updates was high, stick tables could cause CPU usage to rise, due to the overhead associated with locking. Using peers amplified this issue. Stick tables are now sharded over multiple, smaller tables, each having their own lock, thus reducing lock contention. Also, the interlocking between peers and tables has been significantly reduced.
This means that on systems with many threads, stick table performance improves greatly. On a system with 80 threads, we measured performance gains of approximately 6x. As for systems with low thread counts, performance could be improved by as much as 2x when using peers.
Lua
Single-threaded Lua scripts using lua-load will see a performance improvement. This improvement is the result of a change to the loading mechanism, where the maximum number of instructions is now divided by the number of threads. This makes it so that waiting threads have a shorter wait time and share the time slot more evenly. Safeguards are in place to prevent thread contention for threads waiting for the global Lua lock.
Use tune.lua.forced-yield to tune the thread yielding behavior. For scripts that use lua-load, the optimal (and default) value was found to be the maximum of either 500 instructions or 10000 instructions divided by the number of threads. As for scripts loaded using lua-load-per-thread, in cases where more responsiveness is required, the value can be lowered from the default of 10000 instructions. In cases where the results of the Lua scripts are mandatory for processing the data, the value can be increased, but with caution, as an increase could cause thread contention.
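For example, a sketch that lowers the yield threshold for a per-thread script (the script path is hypothetical):

global
    lua-load-per-thread /etc/haproxy/myscript.lua
    # Yield after 5000 Lua instructions instead of the 10000 default
    tune.lua.forced-yield 5000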
Breaking Changes
Multiple CLI commands no longer supported
Previously, it was occasionally possible to successfully issue multiple commands by separating them with newlines, which had the potential to produce unexpected results for long-running commands that might only partially complete. A warning is now emitted when a \n is detected in a command, and the command will not be accepted. This change has also been backported so that user scripts relying on this behavior can be remedied.
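Semicolon-separated commands on a single line remain supported, as used elsewhere in this post:

$ echo "show info; show stat" | socat stdio tcp4-connect:127.0.0.1:9999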
Enabled keyword rejected for dynamic servers
When defining a dynamic server, use of the enabled keyword is now rejected with an error, whereas previously it was only silently ignored. Here's a sample input:
$ echo "add server app/app2 127.0.0.1:5571 enabled check cookie app2" | socat stdio tcp4-connect:127.0.0.1:9999 |
This produces the following output:
'enabled' option is not accepted for dynamic server
Invalid request targets rejected for non-compliant URIs
Parsing is now more strict during H1 processing for request target validation. This means that where previously, for compatibility, non-standard-compliant URIs were forwarded as-is for HTTP/1, now some invalid request targets are rejected with a 400 Bad Request error. The following rules now apply:
The asterisk-form is now only allowed for OPTIONS and OTHER methods. There must now be only one asterisk and nothing more.
The CONNECT method must have a valid authority-form. All other forms are rejected.
The authority-form is now only supported for the CONNECT method. Origin-form is only checked for the CONNECT method.
Absolute-form must have a scheme and a valid authority.
tune.ssl.ocsp-update renamed to ocsp-update
The tune.ssl.ocsp-update global keyword is now named ocsp-update, as ocsp-update is unrelated to SSL tuning.
Development Improvements
This release brings with it some major improvements for developers and contributors, as well as aids in saving time diagnosing issues and speeding up recovery:
The internal API for applets has been simplified, with new applet code having its own buffers, and keyword handlers for the Runtime API now have their own buffers as well.
Updates to the makefile improve ease of use for packagers, including improved warnings plus easier passing of CFLAGS and LDFLAGS. Unused build options will produce a warning, which will assist in debugging typos for build options with long names.
A new debugging feature has been added to the SSL and HTTP cache that allows assignment of a name to some memory areas so that they are more easily identified in the process map (using /proc/$pid/maps or using pmap on Linux versions 5.17 or greater). This makes it so that you can more easily determine where and why memory is being used. Future iterations will include more places where this debugging feature is implemented, further improving troubleshooting.
By default, HAProxy tries hard to prevent any thread and process creation after it starts. This is particularly important when running HAProxy's own test suite, when executing Lua files of uncertain origin, and when experimenting with development versions, which may still contain bugs whose exploitability is uncertain. Generally speaking, it's a best practice to make sure that no unexpected background activity can be triggered by traffic. However, this may prevent external checks from working, and it may break some very specific Lua scripts which actively rely on the ability to fork. The global option insecure-fork-wanted disables this protection. As of version 3.0, you can also activate this option by using -dI (-d uppercase "i") on the HAProxy command line. Note that it is a bad idea to disable it, as a vulnerability in a library or within HAProxy itself may be easier to exploit once disabled. In addition, forking from Lua, or anywhere else, is not reliable, as the forked process could embed a lock set by another thread and cause operations to never cease execution. As such, we recommend that you use this option with extreme caution, and that you move any workload requiring such a fork to a safer solution (such as using agents instead of external checks).
The DeviceAtlas module has been updated to support the new version of DeviceAtlas.
Support for FreeBSD 14 (and its new sched_setaffinity() system call) has been added.
Conclusion
HAProxy 3.0 was made possible through the work of contributors that pour immense effort into open-source projects like this one. This work includes participating in discussions, bug reporting, testing, documenting, providing help, writing code, reviewing code, and hosting packages.
While it's sadly impossible to include every contributor name here, all of you are invaluable members of the HAProxy community! Thank you.
Subscribe to our blog. Get the latest release updates, tutorials, and deep-dives from HAProxy experts.