HAProxy supports many load-balancing algorithms that suit a wide range of use cases. Cache servers, which mostly deliver the static content of your web applications, may nevertheless require specific load-balancing algorithms.
HAProxy stands in front of your cache server for some good reasons:
SSL offloading
HTTP content-switching capabilities
advanced load balancing algorithms
The main purpose of this article is to show how HAProxy can aggregate the memory storage of several Varnish servers in a kind of “JBOD” mode (as in “Just a Bunch Of Disks”). The examples given here aim to optimize the resources of the caches, mainly their memory, in order to improve the hit rate. This also improves your application’s response time.
Content Switching in HAProxy
This has been covered many times on this blog. As a quick introduction for readers who are not familiar with HAProxy, let’s explain how it works.
Clients connect to HAProxy through a frontend. HAProxy then routes the traffic to a backend (server farm), where the load-balancing algorithm chooses a server.
A frontend can point to multiple backends, and the choice of a backend is made through ACLs and use_backend rules. ACLs are built from fetches. A fetch is a directive that tells HAProxy where to get content from.
Enough theory, let’s look at a practical example: splitting static and dynamic traffic using the following rules:
Static content is hosted on domain names starting with ‘static.’ or ‘images.’
Static content file extensions are ‘.jpg’, ‘.png’, ‘.gif’, ‘.css’ and ‘.js’
A request is static if it matches any of the rules above
Anything that is not static is considered dynamic
The configuration snippet below should be integrated into the HAProxy frontend. It implements the rules above to split the traffic. The Varnish servers sit in the bk_static farm.
frontend ft_public
    <frontend settings>
    acl static_domain hdr_beg(Host) -i static. images.
    acl static_content path_end -i .jpg .png .gif .css .js
    use_backend bk_static if static_domain or static_content
    default_backend bk_dynamic

backend bk_static
    <parameters related to static content delivery>
The configuration above creates two named ACLs, ‘static_domain’ and ‘static_content’, which are used by the use_backend rule to route the traffic to the Varnish servers.
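To make the routing decision concrete, here is a minimal Python sketch of the same logic. It is only an illustration of how the two ACLs combine, not HAProxy code; the function name `choose_backend` and the example host names are hypothetical.

```python
# Hypothetical stand-in for the two ACLs and the use_backend rule above.
STATIC_HOST_PREFIXES = ("static.", "images.")
STATIC_EXTENSIONS = (".jpg", ".png", ".gif", ".css", ".js")


def choose_backend(host: str, path: str) -> str:
    """Return the backend name the frontend rules above would select."""
    # static_domain: the Host header begins with one of the prefixes (-i ignores case)
    static_domain = host.lower().startswith(STATIC_HOST_PREFIXES)
    # static_content: the path (without query string) ends with one of the extensions
    static_content = path.lower().endswith(STATIC_EXTENSIONS)
    # use_backend bk_static if static_domain or static_content
    if static_domain or static_content:
        return "bk_static"
    # default_backend bk_dynamic
    return "bk_dynamic"


print(choose_backend("static.example.com", "/index.php"))   # bk_static
print(choose_backend("www.example.com", "/css/site.CSS"))   # bk_static
print(choose_backend("www.example.com", "/cart/checkout"))  # bk_dynamic
```

Because the two conditions are joined with "or", a request for a ‘.css’ file on any domain goes to the caches, and so does any request on a ‘static.’ domain.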
HAProxy & Hash-Based Load Balancing Algorithms
Later in this article, we’ll make heavy use of HAProxy’s hash-based load-balancing algorithms. Here is some background (not exhaustive; the topic would deserve a long article of its own) for readers who want to understand what happens deep inside HAProxy.
The following parameters are taken into account when a hash is mapped to a server:
number of servers in the farm
weight of each server in the farm
status of the servers (UP or DOWN)
If any of these parameters changes, the result of the hash computation changes as well, so a request may land on a different server than before. For a short period of time, this can hurt the application’s response time, because the newly chosen cache has to fetch the object again. Fortunately, HAProxy supports ‘consistent’ hashing, which means that only the traffic affected by the change is redistributed.
That’s why you’ll see a lot of hash-type consistent directives in the configuration samples below.
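The effect of consistent hashing can be illustrated with a small Python simulation. This is a generic sketch of the technique, not HAProxy’s implementation: MD5 is a stand-in hash function, and the server names, the 100 points per server and the sample URLs are all assumptions made for the demonstration. It compares a simple "hash modulo number of servers" selection with a hash ring when one of three servers goes DOWN.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable 32-bit hash; MD5 is only a stand-in, not HAProxy's hash function."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def pick_modulo(key: str, servers: list[str]) -> str:
    """Simple selection: hash modulo the number of servers."""
    return servers[h(key) % len(servers)]


def build_ring(servers: list[str], points: int = 100) -> list[tuple[int, str]]:
    """Place several points per server on a ring, sorted by hash value."""
    return sorted((h(f"{name}#{i}"), name) for name in servers for i in range(points))


def pick_consistent(key: str, ring: list[tuple[int, str]]) -> str:
    """Walk clockwise from the key's hash to the next server point."""
    index = bisect.bisect_left(ring, (h(key), ""))
    return ring[index % len(ring)][1]  # wrap around at the end of the ring


def moved(keys, before, after) -> int:
    """Count keys whose server differs between two selection functions."""
    return sum(1 for k in keys if before(k) != after(k))


servers = ["varnish1", "varnish2", "varnish3"]
survivors = servers[:2]  # varnish3 goes DOWN
keys = [f"/img/{n}.png" for n in range(1000)]

ring_full, ring_less = build_ring(servers), build_ring(survivors)
moved_modulo = moved(
    keys, lambda k: pick_modulo(k, servers), lambda k: pick_modulo(k, survivors)
)
moved_consistent = moved(
    keys, lambda k: pick_consistent(k, ring_full), lambda k: pick_consistent(k, ring_less)
)
print(f"modulo:     {moved_modulo} of {len(keys)} keys changed server")
print(f"consistent: {moved_consistent} of {len(keys)} keys changed server")
```

With the ring, the only keys that change server are those that were stored on the server that went away; every other key keeps hitting the cache that already holds its object. With the modulo selection, keys on the surviving servers are reshuffled as well.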
Load Balancing Varnish Cache Server
Now, let’s focus on the magic we can add in the bk_static server farm.
Hashing the URL
HAProxy can hash the URL to pick a server. With this load-balancing algorithm, a given URL always hits the same Varnish server, as long as the set of available servers does not change.
Hashing the URL path only
In the example below, HAProxy hashes the URL path, that is, everything from the first slash ‘/’ up to, but not including, the question mark ‘?’:
backend bk_static
    balance uri
    hash-type consistent
Hashing the whole URL, including the query string
In some cases, the query string carries variables that change the response, which means we must include it in the hash:
backend bk_static
    balance uri whole
    hash-type consistent
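The difference between the two ‘balance uri’ variants comes down to which part of the request URI feeds the hash. The following Python sketch illustrates that choice; the function name `uri_hash_input` and the sample URLs are hypothetical, and the real hash function is left out on purpose.

```python
def uri_hash_input(uri: str, whole: bool = False) -> str:
    """Return the part of the request URI that would feed the hash."""
    if whole:
        # "balance uri whole": path and query string together
        return uri
    # "balance uri": only the part before the question mark
    return uri.split("?", 1)[0]


a = "/image.php?width=512"
b = "/image.php?width=1024"

# Path only: both requests share one hash input, hence one Varnish server.
print(uri_hash_input(a) == uri_hash_input(b))                          # True
# Whole URL: the inputs differ, so the two variants may be cached on different servers.
print(uri_hash_input(a, whole=True) == uri_hash_input(b, whole=True))  # False
```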
Query string parameter hash
That said, in some cases (APIs, etc.), hashing the whole URL is not enough. We may want to hash only a particular query string parameter. This works well when the client builds the URL itself and the parameters may appear in any order.
The configuration below tells HAProxy to apply the hash to the query string parameter named ‘id’ (e.g. /image.php?width=512&id=12&height=256):
backend bk_static
    balance url_param id
    hash-type consistent
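To see why hashing a single parameter is robust against parameter ordering, consider this short Python sketch. It only mimics the extraction step; the function name `url_param_hash_input` is hypothetical, and taking the first occurrence of a repeated parameter is an assumption of the sketch.

```python
from urllib.parse import parse_qs, urlsplit


def url_param_hash_input(uri: str, name: str = "id"):
    """Return the value of one query string parameter, or None if it is absent or empty."""
    values = parse_qs(urlsplit(uri).query).get(name)
    # None means there is nothing to hash; HAProxy documents a round-robin
    # fallback when the parameter is missing or has no value.
    return values[0] if values else None


first = "/image.php?width=512&id=12&height=256"
second = "/image.php?height=256&id=12&width=512"

# Same 'id' value, different parameter order: the hash input is identical.
print(url_param_hash_input(first))   # 12
print(url_param_hash_input(second))  # 12
print(url_param_hash_input("/image.php?width=512"))  # None
```

Hashing the whole URL would treat `first` and `second` as two different keys, even though they describe the same object.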
Hash on an HTTP header
HAProxy can apply the hash to a specific HTTP header field. The example below applies it to the Host header. This is useful when hosting many domain names with only a few pages each, such as per-user pages.
backend bk_static
    balance hdr(Host)
    hash-type consistent
Compose your own hash: concatenation of Host header and URL
HAProxy keeps becoming more flexible, and we can take advantage of that flexibility in its configuration.
Imagine that your Varnish configuration uses a storage hash key built from the concatenation of the Host header and the URI. You may then want HAProxy to load-balance on the same key, to optimize your caches.
The configuration below creates a new HTTP header field named X-LB, which contains the Host header (converted to lowercase) concatenated with the request URI (also converted to lowercase).
backend bk_static
    http-request set-header X-LB %[req.hdr(Host),lower]%[url,lower]
    balance hdr(X-LB)
    hash-type consistent
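As a rough illustration of what ends up in the X-LB header, the following Python sketch builds the same kind of key: the lowercased Host header followed by the lowercased request URI. The function name `x_lb_value` and the sample values are hypothetical.

```python
def x_lb_value(host: str, uri: str) -> str:
    """Lowercase the Host header and the request URI, then concatenate them."""
    return host.lower() + uri.lower()


# Two spellings of the same request produce one key, hence one Varnish server.
print(x_lb_value("WWW.Example.com", "/Img/Logo.PNG?v=1"))  # www.example.com/img/logo.png?v=1
print(x_lb_value("www.example.com", "/img/logo.png?v=1"))  # www.example.com/img/logo.png?v=1
```

Note that lowercasing the URI also merges URLs that differ only by case, so this key is only a good fit when the application treats such URLs as the same object.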
Conclusion
HAProxy and Varnish work very well together. Each piece of software benefits from the performance and flexibility of the other.