What is rate limiting?

June 3rd, 2024

Rate limiting is a security measure that restricts how clients consume web applications, APIs, and LLMs. It caps how often a client can access a network or resource, or perform certain actions, within a specified timeframe. Organizations commonly use rate limits to help prevent abuse by both human users and automated bots, which is a key threat to application performance and availability.

In load balancing, the load balancer itself may support rate limiting without requiring extra infrastructure components. When activity is too high, the load balancer enforces a temporary "cooldown" period during which a client's requests are denied or ignored (or otherwise limited by CAPTCHA challenges).
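The cooldown behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular load balancer implements it: the class name `CooldownLimiter`, the fixed counting window, and the default thresholds are all hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CooldownLimiter:
    """Deny a client's requests for a cooldown period after it exceeds a limit."""

    max_requests: int = 5            # assumed: requests allowed per window
    window_seconds: float = 1.0      # assumed: length of the counting window
    cooldown_seconds: float = 10.0   # assumed: how long an offender is denied
    _window_start: dict = field(default_factory=dict)
    _counts: dict = field(default_factory=dict)
    _blocked_until: dict = field(default_factory=dict)

    def allow(self, client, now=None):
        """Return True if the request may proceed, False if it is denied."""
        now = time.monotonic() if now is None else now

        # Client is still cooling down: deny without counting the request.
        if now < self._blocked_until.get(client, float("-inf")):
            return False

        # Start a fresh counting window when the previous one has expired.
        start = self._window_start.get(client)
        if start is None or now - start >= self.window_seconds:
            self._window_start[client] = now
            self._counts[client] = 0

        self._counts[client] += 1
        if self._counts[client] > self.max_requests:
            # Too much activity: begin the cooldown and deny this request.
            self._blocked_until[client] = now + self.cooldown_seconds
            return False
        return True
```

Calling `allow("203.0.113.7")` once per incoming request returns `False` from the moment the client exceeds the limit until the cooldown has elapsed. The optional `now` argument exists only so the timing can be controlled when experimenting.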

How does rate limiting work?

Rate limits are highly configurable and primarily tie request volumes to a client IP address. A load balancer can use "stick tables" to store each client's request rate. These tables live in memory on the load balancer itself and count the number of times important network events occur. Stick tables can also track error rates, which helps support ACL expressions, rate limiting of failed login attempts, and circuit-breaking functionality that complements rate limiting.
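To make the idea of an in-memory, per-client counter concrete, here is a minimal Python sketch of a table keyed by client IP that counts events over a sliding period. It is an analogy only and does not reflect how stick tables are actually implemented; the class `StickTableSketch` and the counter names `"requests"` and `"errors"` are hypothetical.

```python
import time
from collections import deque


class StickTableSketch:
    """In-memory event counts per (counter, key) over a sliding time period."""

    def __init__(self, period_seconds=10.0):
        self.period_seconds = period_seconds  # assumed measurement period
        # Maps (counter name, key) to a deque of event timestamps.
        self._events = {}

    def record(self, counter, key, now=None):
        """Note that one event of type `counter` occurred for `key`."""
        now = time.monotonic() if now is None else now
        self._events.setdefault((counter, key), deque()).append(now)

    def rate(self, counter, key, now=None):
        """Return how many events occurred for `key` within the last period."""
        now = time.monotonic() if now is None else now
        events = self._events.get((counter, key))
        if events is None:
            return 0
        # Drop timestamps that have aged out of the period.
        while events and now - events[0] >= self.period_seconds:
            events.popleft()
        return len(events)
```

A caller would invoke `record("requests", client_ip)` for every request and `record("errors", client_ip)` for every error response, then compare `rate(...)` against a threshold before deciding whether to serve the client. Tracking errors separately is what makes it possible to limit repeated failed logins without penalizing ordinary traffic.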

When a specific source IP makes too many requests too quickly, its requests aren't fulfilled. The user may or may not receive a warning to reduce their activity, or a challenge to prove they're human, indicating that the system has flagged their behavior as suspicious.
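The choice between serving, challenging, and refusing a client might look like the sketch below. The two thresholds and the names `Action`, `decide`, and `deny_response` are assumptions made for illustration. The only standard detail used is HTTP status 429 (Too Many Requests), which may carry a `Retry-After` header telling the client how long to wait.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"          # serve the request normally
    CHALLENGE = "challenge"  # ask the client to prove it is human
    DENY = "deny"            # refuse the request


def decide(requests_in_period, soft_limit=20, hard_limit=50):
    """Pick an action from a client's request count in the current period.

    Both limits are hypothetical example values.
    """
    if requests_in_period > hard_limit:
        return Action.DENY
    if requests_in_period > soft_limit:
        return Action.CHALLENGE
    return Action.ALLOW


def deny_response(retry_after_seconds=30):
    """Build a (status, headers) pair warning the client to slow down."""
    return 429, {"Retry-After": str(retry_after_seconds)}
```

Using a softer limit for challenges and a harder one for outright denial is a design choice: it gives a legitimate but busy user a way to continue, while automated traffic that keeps climbing is cut off.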

Because of that, rate limits are important countermeasures against brute force attacks, DoS and DDoS attacks, and web scrapers. API abuse is also a common concern, and it can happen either through intentional overuse or by accident. In a real-life scenario, a human user who has exceeded their limits might not see their app content load as expected.

Does HAProxy support rate limiting?

Yes! HAProxy uses stick tables to store IP address-related activity, which directly supports our Global Rate Limiting security feature. These tables count the number of client requests made, the total errors triggered, and how often webpages are accessed over a configurable time period. See our rate limiting blog post for configuration and usage examples.

Our Global Profiling Engine requests aggregated stick table data and pushes any relevant findings to all running HAProxy Enterprise nodes, which enables actions such as rate limiting. Check out our Global Profiling Engine documentation to learn more.

Related Content

What is web app and API protection (WAAP)?

What is an AI gateway?

What is an API gateway?

What is a denial-of-service (DoS) attack?