Use rate limiting in HAProxy to stop clients from making too many requests and promote fair usage of your services.
Rate limiting in HAProxy stops a client from making too many requests during a window of time. You might have a policy that stipulates how many requests a client can make, just as a matter of keeping resource usage fair. Or, you may want to put rate-limiting in place to guard against certain types of attacks like application-layer DDoS attacks.
There are several ways for you to turn on rate-limiting. Each technique uses the flexible building blocks of the HAProxy configuration language, combining access control lists (ACLs), stick tables, and maps, to compose a slightly different solution meant for a particular use case.
In this blog post, we’ll zero in on limiting the number of HTTP requests that a client can make. We’ll save other interesting scenarios, such as limiting the number of connections, the bytes flowing in, the bytes flowing out, and the maximum number of errors, for another time. We’ll also avoid itemizing every way that you can react to misbehaving clients and simply focus on denying them. However, there are, in fact, many actions you can take in HAProxy when you see a client exceeding a rate limit. For example, you can tarpit them, send them to a different pool of servers, or ban them for an extended period of time. HAProxy Enterprise adds even more options, such as the ability to present reCAPTCHA and JavaScript challenges.
Setting the Maximum Connections
Before diving into actual rate limiting, note that you can achieve a level of fairness by enabling queuing. Queuing means that you can store excess connections in HAProxy until your servers are freed up to handle them. HAProxy is designed to hold onto lots of connections without a sharp increase in memory or CPU usage. However, queueing has to be turned on before you’ll see the benefit.
Use the maxconn parameter on a server line to cap the number of concurrent connections that will be sent. Here’s an example that sends up to 30 connections at a time to each server. After all servers reach their maximum, the connections queue up in HAProxy:
backend servers
    server s1 192.168.30.10:80 check maxconn 30
    server s2 192.168.31.10:80 check maxconn 30
    server s3 192.168.32.10:80 check maxconn 30
If all 30 connections are being used on all three servers, or in other words 90 connections are active, then new connections will have to wait in line for a slot to free up. This means that the servers themselves won’t become overloaded.
In all likelihood, a server will become available fast enough that the client will never even know the difference. You can define how long clients should be queued by adding the timeout queue setting, like this:
backend servers
    timeout queue 10s
    server s1 192.168.30.10:80 check maxconn 30
    server s2 192.168.31.10:80 check maxconn 30
    server s3 192.168.32.10:80 check maxconn 30
The idea behind setting a timeout is that it’s better to let some clients receive a 503 Service Unavailable error than to allow your servers to become buried under the load. Or, from the client’s perspective, it’s better to get an error and deal with it (programmatically, of course), than to wait an extended amount of time and possibly cause errors that are more difficult to resolve.
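If you’d like to control what that error looks like, you can point HAProxy’s errorfile directive at a custom 503 response. Here’s a minimal sketch; the file path is hypothetical:

backend servers
    timeout queue 10s
    # Serve a custom response body when a queued connection times out
    errorfile 503 /etc/haproxy/errors/503.http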
Sliding Window Rate Limiting
Let’s look at the most straightforward case of rate limiting. In this scenario, you want to limit the number of requests that a user can make within a certain period of time. The period is a sliding window: if you set it to allow no more than 20 requests per client during the last 10 seconds, HAProxy counts each client’s requests over the previous 10 seconds on a rolling basis. Consider this HAProxy configuration:
frontend website
    bind :80
    stick-table type ipv6 size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 20 }
    default_backend servers
The stick-table directive creates a key-value store for storing counters like the HTTP request rate per client. The key is the client’s IP address, as configured by the type parameter, which is used to store and aggregate that client’s number of requests. The http-request track-sc0 src line adds the client as a record in the stick table, and the counters begin to be recorded as soon as the IP is added.
A stick table record expires and is removed after a period of inactivity by the client, as set by the expire parameter. That’s just a way of freeing up space. Without an expire parameter, the oldest records are evicted when the storage becomes full. Here, we’re allowing 100,000 records.
The http-request deny line sets the rate limit threshold and the action to take when someone exceeds it. Here, we’re allowing up to 20 requests per 10-second window and denying additional ones with a 429 Too Many Requests response until the count during the last 10 seconds falls below the threshold again. Other actions include forwarding the client to a dedicated backend or silently dropping the connection. The sc_http_req_rate fetch method returns the client’s current request rate.
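As an illustrative sketch of those alternatives, you could swap the deny line for one of the following; the slow_lane backend name is hypothetical:

    # Hold the connection open for a while, then respond with 429
    http-request tarpit deny_status 429 if { sc_http_req_rate(0) gt 20 }

    # Silently drop the connection without responding at all
    http-request silent-drop if { sc_http_req_rate(0) gt 20 }

    # Route rate abusers to a dedicated (hypothetical) backend
    use_backend slow_lane if { sc_http_req_rate(0) gt 20 }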
You can play with the time period or the threshold. Instead of counting requests over 10 seconds, you might extend it to something like 1000 requests during the last 24 hours. Simply change the counter specified on the stick-table line from http_req_rate(10s) to http_req_rate(24h). Then update the http-request deny line to allow no more than 1000 requests.
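Here’s a sketch of the result. Note that we also extend the expire parameter to 24h so that a record isn’t purged before the 24-hour window it supports has elapsed:

frontend website
    bind :80
    stick-table type ipv6 size 100k expire 24h store http_req_rate(24h)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 1000 }
    default_backend servers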
We covered a similar example in our blog post, Bot Protection with HAProxy. In that post, we demonstrate how to track a client’s error rate, which can be used to detect vulnerability scanners.
Rate Limit by Fixed Time Window
Suppose you wanted to allow up to 1000 requests per day. In the last example, we used a sliding window, so if a person makes 500 requests late on Monday and another 500 early on Tuesday, the combined total counts toward the 1000-request limit for the last 24 hours. If instead you decided that a person should be allowed 1000 requests per calendar day, with the count resetting at midnight, then you’d have to go about it differently.
Rather than using the http_req_rate counter, which takes a time period, you’d use http_req_cnt, which increments indefinitely until it’s reset or the record expires. You would then use the HAProxy Runtime API to clear all records at exactly midnight.
First, update your frontend to look like this:
frontend website
    bind :80
    stick-table type ipv6 size 100k expire 24h store http_req_cnt
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_cnt(0) gt 1000 }
    default_backend servers
Now, when a client makes request 1001, they will be denied. However, you need a way to reset this count at the end of each day. Enable the Runtime API by adding a stats socket directive to the global section of your HAProxy configuration:
global
    stats socket /run/haproxy.sock mode 660 level admin
Next, install the socat utility and use it to invoke the clear table Runtime API command, which clears all records from the stick table:
$ echo "clear table website" | sudo socat stdio /run/haproxy.sock |
You could set up a cron job to do this automatically each day (a sample crontab entry follows below). Set it and forget it. If you need to clear a single record as a one-off, you can include the client’s IP address, as shown:
$ echo "clear table website key 192.168.50.10" | sudo socat stdio /run/haproxy.sock |
Rate Limit by URL
Some pages require more processing time than others, such as pages that query a database to render a report. They might need a stricter rate limit. In that case, you might decide to set the limit threshold depending on the page. In this scenario, we’ll check the URL path as an added dimension.
First, add a file called rates.map to the /etc/haproxy directory. This map file will associate URL paths with their rate limits. Add the following to it, in which three paths are associated with various thresholds:
/urla 10
/urlb 20
/urlc 30
Next, update your HAProxy configuration to look like this:
frontend website
    bind :80
    stick-table type binary len 20 size 100k expire 10s store http_req_rate(10s)
    # Track client by base32+src (Host header + URL path + src IP)
    http-request track-sc0 base32+src
    # Check map file to get rate limit for path
    http-request set-var(req.rate_limit) path,map_beg(/etc/haproxy/rates.map,20)
    # Client's request rate is tracked
    http-request set-var(req.request_rate) base32+src,table_http_req_rate()
    # Subtract the current request rate from the limit
    # If less than zero, set rate_abuse to true
    acl rate_abuse var(req.rate_limit),sub(req.request_rate) lt 0
    # Deny if rate abuse
    http-request deny deny_status 429 if rate_abuse
    default_backend servers
Instead of keying off of IP addresses in the stick table, we’ve specified a type of binary. The key is populated with a hash of the HTTP Host header, the URL path, and the client’s source IP address, which is what you get when the http-request track-sc0 base32+src directive is called. That way, you can track a client’s request rate separately for each web page.
The first http-request set-var line finds the request rate threshold in the rates.map file for the current URL path being requested. If the requested URL is not found in the map file, a default of 20 is used. It stores the result in a variable named req.rate_limit. The next http-request set-var line sets a variable named req.request_rate to the client’s current request rate for the page.
In order to compare the allowed limit with the client’s request rate, we subtract the latter from the former and check whether the difference is negative. For example, if the limit for /urla is 10 and the client has made 12 requests during the window, 10 minus 12 is -2, so the rate_abuse ACL matches and the request is denied because they’ve surpassed the threshold for that page.
Rate Limit by URL Parameter
Here’s a slight variation on rate-limiting by URL path: rate-limiting by URL parameter. You might use this if your clients include an API token in the URL to identify themselves.
frontend website
    bind :80
    stick-table type string size 100k expire 24h store http_req_rate(24h)
    # check for token parameter
    acl has_token url_param(token) -m found
    # check if exceeds limit
    acl exceeds_limit url_param(token),table_http_req_rate() gt 1000
    # start tracking based on token parameter
    http-request track-sc0 url_param(token) unless exceeds_limit
    # Deny if missing token or exceeds limit
    http-request deny deny_status 429 if !has_token or exceeds_limit
    default_backend servers
Here, we’re using a sliding window of 24 hours, during which time a client can make up to 1000 requests. The stick table’s type is a string, and we’re using the http-request track-sc0 line to store a URL parameter named token as the key in the table. So, a user might request a page like this:
http://yourwebsite.com/api/v1/does_a_thing?token=abcd1234
The has_token ACL ensures that a token is included in the URL. The exceeds_limit ACL finds the client’s current request rate for the last 24 hours. The http-request deny line denies the request if the client has exceeded the limit or didn’t give a token. Note that we’ve added an unless exceeds_limit clause to the end of the http-request track-sc0 line, since there’s no point in continuing to increment the counter after they’ve exceeded the limit. It also prevents the client from being perpetually blocked and lets the entry expire.
You may wonder when you should use the http_req_rate(24h) counter versus the http_req_cnt counter in conjunction with an expire parameter set to 24h. The former is a sliding window over the last 24 hours. The latter begins when the user sends their first request and increments from then on until the record expires. However, unless you’re manually clearing the table every 24 hours via the Runtime API, the http_req_cnt could stay in effect for a long time while the client stays active, because the expiration is reset whenever the record is touched.
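Whichever counter you choose, you can check what’s actually being stored by dumping the table’s records with the Runtime API’s show table command, using the same socket as before:

$ echo "show table website" | sudo socat stdio /run/haproxy.sock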
HAProxy Enterprise reCAPTCHA Module
HAProxy Enterprise adds several security-related modules that help you correctly identify bots and respond intelligently. One is the reCAPTCHA module. When an attacker launches a denial-of-service attack, oftentimes they’ll deploy a legion of bots to throw requests at you. When you detect that a client has exceeded your rate limit, rather than just denying them, you can send them a reCAPTCHA challenge.
The benefit of a reCAPTCHA is that it lowers the risk of false positives. Maybe a legitimate user got caught by the limit. The module lets them prove that they’re a human. Those malicious bots will be stopped, or at least slowed to the point that it’s inconvenient for them to keep attacking your service, but true, human visitors will be able to pass the test and continue.
Enabling Rate Limiting in HAProxy
In this blog post, you learned several ways to enable rate-limiting in HAProxy. Using its building blocks—ACLs, stick tables, and maps—various sophisticated techniques are not only possible but easy to implement. A common approach is to track users over a sliding window of time. However, you can use the Runtime API to clear stick table records to achieve fixed-time-period rate limiting, as in midnight-to-midnight. Also, because you have access to all of the information inside of the HTTP request, it’s possible to base your rate limit on the URL’s path or parameters.
If you enjoyed this post and want to see more like it, subscribe to this blog! You can also follow us on Twitter and join the conversation on Slack.
Want to hear about the additional security features available in HAProxy Enterprise? Contact us to learn more and sign up for a free trial.