Rate Limiting vs Throttling in Spring Boot (With Custom Implementation)

 

Introduction

In modern backend systems, rate limiting is essential to ensure:

  • System stability
  • Fair usage
  • Protection against abuse (bots, brute force attacks)

Most developers rely on libraries, but understanding how to build one from scratch gives you a serious edge in interviews and real-world system design.

In this blog, we’ll:

  • Understand rate limiting vs throttling
  • Learn the sliding window algorithm
  • Build a custom rate limiter in Spring Boot (no libraries)
  • Cover real-world best practices

What is Rate Limiting?

Rate limiting controls how many requests a user can make within a time window.

Example:

A user can make only 5 API calls per minute

If exceeded → server returns:

HTTP 429 Too Many Requests

What is Throttling?

Throttling controls the speed of requests instead of blocking them.

Example:

  • First few requests → fast
  • Excess requests → delayed

Rate Limiting vs Throttling

Feature     Rate Limiting     Throttling
Behavior    Blocks requests   Slows requests
Use case    Security          Load control
Response    429 error         Delayed response

Real-World Example

Think of an ATM:

  • Max 5 withdrawals/day → Rate Limiting
  • Slow processing during rush → Throttling

Algorithm We Will Use: Sliding Window

Instead of resetting counts every minute, we:

  • Track timestamps of requests
  • Remove old ones (outside time window)
  • Allow only if count is within limit

This is more accurate than a fixed window, where a burst that straddles the reset boundary can let through up to twice the limit.
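The three steps above can be traced with a small, framework-free sketch. The class name `SlidingWindowSketch` and the explicit `nowSeconds` parameter are assumptions made for illustration, so the example can be driven with fixed timestamps instead of the real clock.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sliding-window sketch for a single client (not thread-safe).
class SlidingWindowSketch {
    private final int maxRequests;
    private final long windowSeconds;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowSketch(int maxRequests, long windowSeconds) {
        this.maxRequests = maxRequests;
        this.windowSeconds = windowSeconds;
    }

    // nowSeconds is passed in (an assumption for testability) instead of read from the clock.
    boolean allow(long nowSeconds) {
        // 1. Drop timestamps that have fallen out of the window.
        while (!timestamps.isEmpty() && nowSeconds - timestamps.peekFirst() >= windowSeconds) {
            timestamps.pollFirst();
        }
        // 2. Accept only while the window has room, and record the accepted request.
        if (timestamps.size() < maxRequests) {
            timestamps.addLast(nowSeconds);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        SlidingWindowSketch limiter = new SlidingWindowSketch(5, 60);
        // Five requests late in one minute (t = 50..54) fill the window.
        for (long t = 50; t < 55; t++) {
            System.out.println("t=" + t + " allowed=" + limiter.allow(t));
        }
        // t = 65 is in the next clock minute, but all five requests are still within the last 60 s.
        System.out.println("t=65 allowed=" + limiter.allow(65));
        // At t = 110 the request from t = 50 has expired, so one slot is free again.
        System.out.println("t=110 allowed=" + limiter.allow(110));
    }
}
```

A fixed-window counter that resets at t = 60 would accept the request at t = 65 and four more after it, which is the boundary burst the sliding window avoids.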


Implementation in Spring Boot (No Library)


Step 1: Create Rate Limiter Service

import org.springframework.stereotype.Service;

import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Service
public class CustomRateLimiterService {

    private static final int MAX_REQUESTS = 5;
    private static final long TIME_WINDOW = 60; // seconds

    // One deque of request timestamps (epoch seconds) per client key
    private final Map<String, Deque<Long>> requestStore = new ConcurrentHashMap<>();

    public boolean allowRequest(String key) {
        long currentTime = Instant.now().getEpochSecond();

        // Create the deque atomically on the first request from this key
        Deque<Long> timestamps = requestStore.computeIfAbsent(key, k -> new ArrayDeque<>());

        synchronized (timestamps) {

            // Step 1: Remove expired requests
            while (!timestamps.isEmpty() &&
                    currentTime - timestamps.peekFirst() >= TIME_WINDOW) {
                timestamps.pollFirst();
            }

            // Step 2: Check limit
            if (timestamps.size() < MAX_REQUESTS) {
                timestamps.addLast(currentTime);
                return true;
            }

            return false;
        }
    }
}

Code Explanation

ConcurrentHashMap

Stores request data per user/IP.


Deque<Long>

Maintains timestamps of requests in order.


Removing old timestamps

currentTime - timestamps.peekFirst() >= TIME_WINDOW

Ensures only the last 60 seconds of requests are counted.


Thread Safety

synchronized (timestamps)

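ArrayDeque is not thread-safe, so the check-and-add on each client's deque runs under a lock. Without it, two concurrent requests could both see a free slot and both be allowed.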
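One way to check the thread-safety claim is to fire many simultaneous requests for the same key and count how many get through. The sketch below uses `LimiterUnderLoad`, a hypothetical Spring-free copy of the service logic, so that it runs on a plain JDK; the thread count and the sample IP address are arbitrary.

```java
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LimiterUnderLoad {
    private static final int MAX_REQUESTS = 5;
    private static final long TIME_WINDOW = 60; // seconds

    private final Map<String, Deque<Long>> requestStore = new ConcurrentHashMap<>();

    // Same logic as the service's allowRequest, without the Spring annotation.
    boolean allowRequest(String key) {
        long now = Instant.now().getEpochSecond();
        Deque<Long> timestamps = requestStore.computeIfAbsent(key, k -> new ArrayDeque<>());
        synchronized (timestamps) {
            while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= TIME_WINDOW) {
                timestamps.pollFirst();
            }
            if (timestamps.size() < MAX_REQUESTS) {
                timestamps.addLast(now);
                return true;
            }
            return false;
        }
    }

    // Fires `threads` simultaneous requests for one key and returns how many were allowed.
    static int countAllowed(int threads) {
        LimiterUnderLoad limiter = new LimiterUnderLoad();
        AtomicInteger allowed = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // hold every thread until all are released together
                    if (limiter.allowRequest("203.0.113.7")) {
                        allowed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return allowed.get();
    }

    public static void main(String[] args) {
        System.out.println("allowed = " + countAllowed(50));
    }
}
```

With the `synchronized` block in place, exactly 5 of the 50 requests are allowed. Removing it makes the result unpredictable, because `ArrayDeque` gives no guarantees under concurrent modification.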



Step 2: Create Filter to Intercept Requests

import jakarta.servlet.*;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;

import java.io.IOException;

@Component
public class CustomRateLimitingFilter implements Filter {

    private final CustomRateLimiterService rateLimiterService;

    public CustomRateLimitingFilter(CustomRateLimiterService rateLimiterService) {
        this.rateLimiterService = rateLimiterService;
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {

        HttpServletRequest httpRequest = (HttpServletRequest) request;
        HttpServletResponse httpResponse = (HttpServletResponse) response;

        // The client IP is the rate-limit key
        String ip = httpRequest.getRemoteAddr();

        if (rateLimiterService.allowRequest(ip)) {
            // Within the limit: continue to the controller
            chain.doFilter(request, response);
        } else {
            // Over the limit: stop here with 429 Too Many Requests
            httpResponse.setStatus(429);
            httpResponse.setContentType("text/plain");
            httpResponse.getWriter().write("Too many requests. Try again later.");
        }
    }
}

Code Explanation

Filter

Intercepts every incoming request before it reaches the controller.


getRemoteAddr()

Identifies the client by IP address (can be replaced with a user ID or API key).
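Swapping the key could look like the following sketch. `RateLimitKeyResolver` is a hypothetical helper, and the idea that callers send their key in an `X-API-Key` header is an assumption, not something the filter above defines.

```java
// Picks the rate-limit key: the API key when the caller sent one, otherwise the remote address.
class RateLimitKeyResolver {

    // apiKey may be null or blank when the caller is anonymous.
    static String resolve(String apiKey, String remoteAddr) {
        if (apiKey != null && !apiKey.isBlank()) {
            return "key:" + apiKey.trim();
        }
        // The prefixes keep an API key from ever colliding with an IP address in the store.
        return "ip:" + remoteAddr;
    }

    public static void main(String[] args) {
        System.out.println(resolve("abc123", "198.51.100.4")); // prints key:abc123
        System.out.println(resolve(null, "198.51.100.4"));     // prints ip:198.51.100.4
    }
}
```

In the filter, the call would be along the lines of `resolve(httpRequest.getHeader("X-API-Key"), httpRequest.getRemoteAddr())`, with the result passed to `allowRequest`.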


Decision Logic

  • Allow → pass request
  • Reject → return 429


Step 3: Create Test API

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class TestController {

    @GetMapping("/test")
    public String test() {
        return "Request successful!";
    }
}

Testing

  • Call /test 5 times → ✅ success
  • 6th request → ❌ HTTP 429

Best Practices (Very Important)

Use User ID Instead of IP

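IP addresses are unreliable behind proxies and NAT: many users can share one address, and the application may only see the proxy's address.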


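Use X-Forwarded-For

Behind a reverse proxy or load balancer, getRemoteAddr() returns the proxy's address. The original client address is usually passed along in this header, which may be absent or hold a comma-separated chain of addresses. Clients can set it themselves, so trust it only when a proxy you control writes it:

String forwardedFor = httpRequest.getHeader("X-Forwarded-For");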
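Because the header can be missing or can contain several addresses, a small helper can pick the client address and fall back to `getRemoteAddr()`. This is a minimal sketch: `ClientIpResolver` is a hypothetical name, and it assumes the application sits behind a proxy you control, since a client can otherwise send any value it likes in this header.

```java
class ClientIpResolver {

    // forwardedFor: raw X-Forwarded-For header value (may be null).
    // remoteAddr:   the value returned by getRemoteAddr().
    static String resolve(String forwardedFor, String remoteAddr) {
        if (forwardedFor == null || forwardedFor.isBlank()) {
            return remoteAddr;
        }
        // The header is a chain such as "client, proxy1, proxy2"; the first entry is the client.
        // The limit of 2 guarantees at least one array element, even for input like ",".
        String first = forwardedFor.split(",", 2)[0].trim();
        return first.isEmpty() ? remoteAddr : first;
    }

    public static void main(String[] args) {
        System.out.println(resolve("203.0.113.7, 10.0.0.1", "10.0.0.2")); // prints 203.0.113.7
        System.out.println(resolve(null, "10.0.0.2"));                    // prints 10.0.0.2
    }
}
```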

Add Retry-After Header

Tell rejected clients how many seconds to wait before retrying:

httpResponse.setHeader("Retry-After", "60");
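A fixed value of 60 is the worst case. With a sliding window, the wait can be derived from the oldest timestamp still in the window. The helper below is a sketch under that assumption; `RetryAfterCalculator` and its parameters are illustrative names, not part of the service above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RetryAfterCalculator {

    // Seconds until the oldest recorded request leaves the window.
    // Returns 0 when nothing is recorded, otherwise at least 1.
    static long retryAfterSeconds(Deque<Long> timestamps, long nowSeconds, long windowSeconds) {
        Long oldest = timestamps.peekFirst();
        if (oldest == null) {
            return 0;
        }
        return Math.max(1, windowSeconds - (nowSeconds - oldest));
    }

    public static void main(String[] args) {
        Deque<Long> timestamps = new ArrayDeque<>();
        for (long t = 100; t < 105; t++) {
            timestamps.addLast(t); // five requests at t = 100..104
        }
        // At t = 130 the oldest request (t = 100) expires in 60 - 30 = 30 seconds.
        System.out.println(retryAfterSeconds(timestamps, 130, 60)); // prints 30
    }
}
```

The result would be written with `String.valueOf(...)` as the header value in the 429 branch of the filter.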

Avoid Memory Leaks

  • Clean unused entries
  • Add expiry logic
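A cleanup pass over the store might look like this sketch. `RequestStoreCleaner` is a hypothetical helper that assumes the same `Map<String, Deque<Long>>` layout as the service; in a Spring application it could be called periodically from a `@Scheduled` method.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RequestStoreCleaner {

    // Removes keys whose newest timestamp is already outside the window.
    // Returns the number of removed keys (approximate if requests arrive during the pass).
    static int evictIdle(Map<String, Deque<Long>> store, long nowSeconds, long windowSeconds) {
        int before = store.size();
        store.entrySet().removeIf(entry -> {
            Deque<Long> timestamps = entry.getValue();
            synchronized (timestamps) { // same lock the limiter takes for this client
                Long newest = timestamps.peekLast();
                return newest == null || nowSeconds - newest >= windowSeconds;
            }
        });
        return before - store.size();
    }

    public static void main(String[] args) {
        Map<String, Deque<Long>> store = new ConcurrentHashMap<>();
        store.computeIfAbsent("idle-client", k -> new ArrayDeque<>()).addLast(10L);
        store.computeIfAbsent("active-client", k -> new ArrayDeque<>()).addLast(100L);
        System.out.println(evictIdle(store, 120, 60)); // prints 1
        System.out.println(store.keySet());            // prints [active-client]
    }
}
```

One caveat: a request that races with eviction can be recorded in a deque that was just removed from the map, so that single request goes uncounted. A stricter version would do both the update and the removal inside `ConcurrentHashMap.compute`.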

Different Limits per API

Critical APIs (payments) → stricter limits
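Per-endpoint limits can be sketched as a lookup from path prefix to limit. Everything here is illustrative: `EndpointLimits`, the paths, and the numbers are assumptions, and the service above would need its limit passed in as a parameter instead of the `MAX_REQUESTS` constant.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EndpointLimits {
    private static final int DEFAULT_LIMIT = 5;

    // Path prefix -> max requests per window, checked in insertion order.
    private static final Map<String, Integer> LIMITS = new LinkedHashMap<>();
    static {
        LIMITS.put("/api/payments", 2); // critical: stricter
        LIMITS.put("/api/search", 20);  // cheap reads: looser
    }

    // Returns the limit of the first matching prefix, or the default when none matches.
    static int limitFor(String path) {
        for (Map.Entry<String, Integer> rule : LIMITS.entrySet()) {
            if (path.startsWith(rule.getKey())) {
                return rule.getValue();
            }
        }
        return DEFAULT_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println(limitFor("/api/payments/charge")); // prints 2
        System.out.println(limitFor("/test"));                // prints 5
    }
}
```

When limits differ per endpoint, the store key should combine the client and the matched prefix, so that traffic to one endpoint does not consume another endpoint's budget.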



Limitations of Custom Approach

  • Not distributed (won’t work across multiple instances)
  • Memory grows with users
  • Requires manual cleanup

Production systems use:

  • Redis
  • API Gateway (Spring Cloud Gateway, NGINX)

Bonus: Convert to Throttling

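Instead of blocking:

if (!rateLimiterService.allowRequest(ip)) {
    Thread.sleep(500); // throws InterruptedException, which doFilter must catch
}
chain.doFilter(request, response); // the request still proceeds, only later

Now excess requests are slowed instead of rejected. Note that sleeping holds a server thread for the whole delay, so keep delays short.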
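A fixed 500 ms pause delays every excess request by the same amount regardless of load. A different technique, shown here only as a sketch, is pacing: each request reserves the next free time slot and waits just long enough to keep requests a minimum interval apart. `RequestPacer` and its method names are hypothetical, and the clock value is passed in so the behavior can be traced by hand.

```java
// Spaces requests at least minIntervalMillis apart by handing out time slots.
class RequestPacer {
    private final long minIntervalMillis;
    private long nextFreeSlotMillis = 0;

    RequestPacer(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Reserves the next slot and returns how long the caller should wait (0 if a slot is free now).
    synchronized long reserveDelayMillis(long nowMillis) {
        long slotStart = Math.max(nowMillis, nextFreeSlotMillis);
        nextFreeSlotMillis = slotStart + minIntervalMillis;
        return slotStart - nowMillis;
    }

    public static void main(String[] args) {
        RequestPacer pacer = new RequestPacer(200);
        // Three requests arrive at the same instant: each waits one interval longer than the last.
        System.out.println(pacer.reserveDelayMillis(1000)); // prints 0
        System.out.println(pacer.reserveDelayMillis(1000)); // prints 200
        System.out.println(pacer.reserveDelayMillis(1000)); // prints 400
        // After a quiet period there is no backlog, so the request goes straight through.
        System.out.println(pacer.reserveDelayMillis(2000)); // prints 0
    }
}
```

The caller would pass `System.currentTimeMillis()` and sleep for the returned delay before continuing the filter chain. A real implementation would also cap the delay and reject beyond it, so a backlog cannot grow without bound.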


Conclusion

Building your own rate limiter helps you:

  • Understand system design deeply
  • Crack backend interviews
  • Design scalable APIs

Key Takeaways

Rate Limiting = control request count
Throttling = control request speed
Sliding Window = accurate algorithm
Custom solution = great for learning
Use Redis/Gateway in production
