The ALOHA load balancer and HAProxy make it easy to protect any application or web server against unexpectedly high loads.
The response time of a web server is directly related to the number of requests it has to handle simultaneously. The relationship is not linear: response time appears to grow exponentially with the number of concurrent requests.
Simultaneous Connections Limiting
Simultaneous connections limiting relies on a number (the limit) that a load balancer treats as the maximum number of requests to send to a backend server at the same time. Since HAProxy provides this feature, so does the ALOHA load balancer.
The graph below shows a server response time compared to the number of simultaneous users browsing the website:
Smart Handling of Requests Peak With HAProxy
The goal is to prevent too many requests from being forwarded to an application server by setting a limit on simultaneous requests for each server in the backend.
Fortunately, unlike some load balancers, HAProxy does not reject requests over the limit.
HAProxy uses a queueing system and waits until the backend server is able to answer. This mechanism adds a small delay to requests in the queue, but it has a few advantages:
- no client requests are rejected
- every request can be served faster than with an overloaded backend server
- the delay is still acceptable (a few ms in queue)
- your server won’t crash because of the spike
Simultaneous requests limiting occurs on the server side: HAProxy limits the number of concurrent requests sent to the server regardless of what happens on the client side. HAProxy will never refuse a client connection until the underlying server runs out of capacity.
Concrete Numbers
If you read the graph above carefully, you can see that the more requests your server has to process at the same time, the longer each request takes.
The table below summarizes the time our example server takes to process 250 requests with different simultaneous request limits:
Number of requests | Simultaneous requests limit | Average time per request (ms) | Longest response time (ms) |
---|---|---|---|
250 | 10 | 9 | 225 |
250 | 20 | 9 | 112 |
250 | 30 | 9 | 75 |
250 | 50 | 25 | 125 |
250 | 100 | 100 | 250 |
250 | 150 | 225 | 305 |
250 | 250 | 625 | 625 |
It’s up to the website owner to determine the best limit to configure in HAProxy. You can approximate it with HTTP benchmark tools, by sending a constant number of requests to your backend server at different concurrency levels and comparing the average response times.
From the example above, we can see that we would get the most out of this backend server by setting the limit to 30.
Setting the limit too low would keep requests in the queue longer than necessary, and setting it too high would be counterproductive, because each request would slow down as the server reaches its capacity.
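
One way to read the table is the rough estimate below. It is an interpretation, not a formula stated with the benchmark: it assumes requests are served in successive batches of at most L requests and that each batch takes about the average time per request. The symbols N (total requests), L (simultaneous requests limit) and t_avg (average time per request) are introduced here for illustration only.

```latex
T_{\max} \approx \frac{N}{L} \cdot t_{\text{avg}}

% N = 250 requests, values taken from the table:
L = 10:  \quad \frac{250}{10}  \times 9\,\text{ms}   = 225\,\text{ms}
L = 30:  \quad \frac{250}{30}  \times 9\,\text{ms}   = 75\,\text{ms}
L = 50:  \quad \frac{250}{50}  \times 25\,\text{ms}  = 125\,\text{ms}
L = 250: \quad \frac{250}{250} \times 625\,\text{ms} = 625\,\text{ms}
```

Under this reading, a limit of 30 is the largest value at which the average time per request is still 9 ms, so it gives the shortest total time. The estimate is only an approximation and does not reproduce every row of the table exactly.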
HAProxy Simultaneous Requests Limiting Configuration
The simultaneous requests limit is configured with the maxconn keyword on the server line. Example:
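    frontend APPLI1
        bind :80
        mode http
        # close the server-side connection after each response
        option http-server-close
        default_backend APPLI1

    backend APPLI1
        balance roundrobin
        mode http
        # at most 30 concurrent requests per server; the rest wait in the queue
        server server1 srv1:80 maxconn 30
        server server2 srv2:80 maxconn 30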
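
Because requests over the limit wait in a queue, it can be useful to bound how long they are allowed to wait. The sketch below reuses the APPLI1 backend from the example and adds the timeout queue directive. All timeout values here are assumptions chosen for illustration, not recommendations from this article, and should be tuned against your own benchmarks.

```
defaults
    mode http
    # Assumed values, for illustration only
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # Maximum time a request may wait in the queue for a free server slot
    timeout queue   5s

backend APPLI1
    balance roundrobin
    # Each server handles at most 30 concurrent requests; extra requests queue
    server server1 srv1:80 maxconn 30
    server server2 srv2:80 maxconn 30
```

With a setting like this, a request that is still queued when the timeout expires receives a 503 error instead of waiting indefinitely, so the value should stay well above the few milliseconds a request normally spends in the queue.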