Introduction
In modern backend systems, rate limiting is essential to ensure:
- System stability
- Fair usage
- Protection against abuse (bots, brute force attacks)
Most developers rely on libraries, but understanding how to build one from scratch gives you a serious edge in interviews and real-world system design.
In this blog, we’ll:
- Understand rate limiting vs throttling
- Learn the sliding window algorithm
- Build a custom rate limiter in Spring Boot (no libraries)
- Cover real-world best practices
What is Rate Limiting?
Rate limiting controls how many requests a user can make within a time window.
Example:
A user can make only 5 API calls per minute
If exceeded → server returns:
HTTP 429 Too Many Requests
What is Throttling?
Throttling controls the speed of requests instead of blocking them.
Example:
- First few requests → fast
- Excess requests → delayed
Rate Limiting vs Throttling
| Feature | Rate Limiting | Throttling |
|---|---|---|
| Behavior | Blocks requests | Slows requests |
| Use case | Security | Load control |
| Response | 429 error | Delayed response |
Real-World Example
Think of an ATM:
- Max 5 withdrawals/day → Rate Limiting
- Slow processing during rush → Throttling
Algorithm We Will Use: Sliding Window
Instead of resetting counts every minute, we:
- Track timestamps of requests
- Remove old ones (outside time window)
- Allow only if count is within limit
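This is more accurate than a fixed window, which resets its counter at every window boundary and can therefore let through up to twice the limit in a short burst that straddles the boundary.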
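To make the difference concrete, here is a minimal sketch that replays the same arrival times through both approaches. The class name `WindowComparison`, the limit of 5, and the 60-second window are illustrative assumptions; timestamps are passed in as plain seconds so the behavior is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical demo class: compares fixed-window and sliding-window counting.
class WindowComparison {
    static final int LIMIT = 5;
    static final long WINDOW = 60; // seconds

    // Fixed window: the counter resets whenever time enters a new 60-second bucket.
    static int allowedByFixedWindow(long[] arrivals) {
        long currentBucket = -1;
        int countInBucket = 0;
        int allowed = 0;
        for (long t : arrivals) {
            long bucket = t / WINDOW;
            if (bucket != currentBucket) {
                currentBucket = bucket;
                countInBucket = 0;
            }
            if (countInBucket < LIMIT) {
                countInBucket++;
                allowed++;
            }
        }
        return allowed;
    }

    // Sliding window: only accepted requests from the last 60 seconds count.
    static int allowedBySlidingWindow(long[] arrivals) {
        Deque<Long> accepted = new ArrayDeque<>();
        int allowed = 0;
        for (long t : arrivals) {
            // Drop timestamps that have fallen out of the window
            while (!accepted.isEmpty() && t - accepted.peekFirst() >= WINDOW) {
                accepted.pollFirst();
            }
            if (accepted.size() < LIMIT) {
                accepted.addLast(t);
                allowed++;
            }
        }
        return allowed;
    }
}
```

With five requests at second 59 and five more at second 60, the fixed window accepts all ten because the counter resets at the boundary, while the sliding window accepts only the first five.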
Implementation in Spring Boot (No Library)
Step 1: Create Rate Limiter Service
import org.springframework.stereotype.Service;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Service
public class CustomRateLimiterService {

    private static final int MAX_REQUESTS = 5;
    private static final long TIME_WINDOW = 60; // seconds

    // One deque of request timestamps (epoch seconds) per client key
    private final Map<String, Deque<Long>> requestStore = new ConcurrentHashMap<>();

    public boolean allowRequest(String key) {
        long currentTime = Instant.now().getEpochSecond();
        // Atomically create the deque on first use instead of allocating one on every call
        Deque<Long> timestamps = requestStore.computeIfAbsent(key, k -> new ArrayDeque<>());
        synchronized (timestamps) {
            // Step 1: Remove expired requests
            while (!timestamps.isEmpty() &&
                    currentTime - timestamps.peekFirst() >= TIME_WINDOW) {
                timestamps.pollFirst();
            }
            // Step 2: Check limit
            if (timestamps.size() < MAX_REQUESTS) {
                timestamps.addLast(currentTime);
                return true;
            }
            return false;
        }
    }
}
Code Explanation
ConcurrentHashMap
Stores request data per user/IP.
Deque<Long>
Maintains timestamps of requests in order.
Removing old timestamps
currentTime - timestamps.peekFirst() >= TIME_WINDOW
Ensures only the last 60 seconds of requests are counted.
Thread Safety
synchronized (timestamps)
Prevents race conditions in concurrent environments.
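One way to convince yourself that the locking holds up is to fire many simultaneous requests for a single key and count how many pass. The sketch below copies the limiter logic into a plain class so it runs without Spring; `ConcurrencyCheck` and `countAllowed` are hypothetical names used only for this illustration.

```java
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-alone check; mirrors the service logic without Spring.
class ConcurrencyCheck {
    static final int MAX_REQUESTS = 5;
    static final long TIME_WINDOW = 60; // seconds
    static final Map<String, Deque<Long>> STORE = new ConcurrentHashMap<>();

    static boolean allowRequest(String key) {
        long now = Instant.now().getEpochSecond();
        Deque<Long> timestamps = STORE.computeIfAbsent(key, k -> new ArrayDeque<>());
        synchronized (timestamps) {
            while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= TIME_WINDOW) {
                timestamps.pollFirst();
            }
            if (timestamps.size() < MAX_REQUESTS) {
                timestamps.addLast(now);
                return true;
            }
            return false;
        }
    }

    // Fires `threads` simultaneous requests for one key and counts how many pass.
    static int countAllowed(String key, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch startSignal = new CountDownLatch(1);
        AtomicInteger allowed = new AtomicInteger();
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    startSignal.await(); // line every thread up before starting
                    if (allowRequest(key)) {
                        allowed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        startSignal.countDown(); // release all threads at once
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return allowed.get();
    }
}
```

Calling `ConcurrencyCheck.countAllowed("client-a", 20)` should return exactly 5. Without the `synchronized` block, two threads could both read a size of 4 and both add a timestamp, letting a sixth request through.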
Step 2: Create Filter to Intercept Requests
import jakarta.servlet.*;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import java.io.IOException;

@Component
public class CustomRateLimitingFilter implements Filter {

    private final CustomRateLimiterService rateLimiterService;

    public CustomRateLimitingFilter(CustomRateLimiterService rateLimiterService) {
        this.rateLimiterService = rateLimiterService;
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        HttpServletResponse httpResponse = (HttpServletResponse) response;
        String ip = httpRequest.getRemoteAddr();
        if (rateLimiterService.allowRequest(ip)) {
            chain.doFilter(request, response);
        } else {
            httpResponse.setStatus(429);
            httpResponse.getWriter().write("Too many requests. Try again later.");
        }
    }
}
Code Explanation
Filter
Intercepts every incoming request before it reaches the controller.
getRemoteAddr()
Identifies the client (can be replaced with a user ID or API key).
Decision Logic
- Allow → pass request
- Reject → return 429
Step 3: Create Test API
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class TestController {

    @GetMapping("/test")
    public String test() {
        return "Request successful!";
    }
}
Testing
- Call /test 5 times → ✅ success
- 6th request → ❌ HTTP 429
Best Practices (Very Important)
Use User ID Instead of IP
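IP can be unreliable behind proxies: when requests pass through a proxy or load balancer, getRemoteAddr() returns the proxy's address, so many clients end up sharing one limit.
Use X-Forwarded-For
String ip = httpRequest.getHeader("X-Forwarded-For");
The header can be missing, and it can hold a comma-separated list (client, proxy1, proxy2) where the left-most entry is the original client. Trust it only when your own proxy sets it, because clients can send a forged value.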
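Since the X-Forwarded-For value may be absent or contain several comma-separated addresses, it helps to normalize it before using it as the limiter key. A minimal sketch, assuming a trusted proxy sets the header; `ClientKeyResolver` and `resolve` are hypothetical names, not part of Spring or the Servlet API.

```java
// Hypothetical helper: picks the client address to use as the rate-limit key.
class ClientKeyResolver {

    // forwardedFor: raw X-Forwarded-For value (may be null)
    // remoteAddr:   address of the direct peer, e.g. from getRemoteAddr()
    static String resolve(String forwardedFor, String remoteAddr) {
        if (forwardedFor == null || forwardedFor.isBlank()) {
            return remoteAddr;
        }
        // The left-most entry is the original client; later entries are proxies
        int comma = forwardedFor.indexOf(',');
        String first = (comma >= 0 ? forwardedFor.substring(0, comma) : forwardedFor).trim();
        return first.isEmpty() ? remoteAddr : first;
    }
}
```

In the filter this could be called as `ClientKeyResolver.resolve(httpRequest.getHeader("X-Forwarded-For"), httpRequest.getRemoteAddr())`.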
Add Retry Header
httpResponse.setHeader("Retry-After", "60"); // seconds the client should wait
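A fixed value of 60 is the worst case. With a sliding window, the client only has to wait until the oldest counted request leaves the window. The sketch below computes that wait; it assumes the limiter can expose the oldest timestamp for a key, which the service in this post does not do yet, and `RetryAfter` is a hypothetical helper name.

```java
// Hypothetical helper: derives a Retry-After value from the sliding window state.
class RetryAfter {

    // All arguments are in seconds. oldestTimestamp is the earliest request
    // still counted for the client; now is the current time.
    static long secondsUntilRetry(long oldestTimestamp, long now, long windowSeconds) {
        long remaining = oldestTimestamp + windowSeconds - now;
        // Never advertise less than one second
        return Math.max(1, remaining);
    }
}
```

For example, if the oldest counted request arrived at second 100 and the current time is second 130, a 60-second window frees a slot in 30 seconds.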
Avoid Memory Leaks
- Clean unused entries
- Add expiry logic
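As a rough sketch of what such cleanup could look like, the method below removes every key whose newest timestamp is already outside the window. `StaleEntryCleaner` is a hypothetical name, and the map layout is assumed to match the `requestStore` used in the service. In a Spring application it could be invoked periodically, for instance from a method annotated with `@Scheduled`.

```java
import java.util.Deque;
import java.util.Map;

// Hypothetical helper: evicts clients that have been idle for a full window.
class StaleEntryCleaner {

    // now and windowSeconds are in seconds, matching the stored timestamps.
    static void removeStale(Map<String, Deque<Long>> store, long now, long windowSeconds) {
        store.entrySet().removeIf(entry -> {
            Deque<Long> timestamps = entry.getValue();
            synchronized (timestamps) {
                Long newest = timestamps.peekLast();
                // Empty deque, or even the newest request has expired
                return newest == null || now - newest >= windowSeconds;
            }
        });
    }
}
```

One caveat: a request that fetched a deque just before it was evicted will record its timestamp in a detached deque, so that single request goes uncounted. For a learning project this is usually acceptable; a stricter version would do the check and the update inside one `compute` call on the map.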
Different Limits per API
Critical APIs (payments) → stricter limits
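One simple way to express per-endpoint limits is a lookup from path prefix to limit, with a default for everything else. The prefixes and numbers below are made up for illustration, and `EndpointLimits` is a hypothetical class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table: request path prefix -> max requests per window.
class EndpointLimits {
    static final int DEFAULT_LIMIT = 5;

    // LinkedHashMap keeps insertion order, so the first matching prefix wins
    private static final Map<String, Integer> LIMITS_BY_PREFIX = new LinkedHashMap<>();
    static {
        LIMITS_BY_PREFIX.put("/payments", 2); // assumed stricter limit
        LIMITS_BY_PREFIX.put("/search", 20);  // assumed looser limit
    }

    static int limitFor(String path) {
        for (Map.Entry<String, Integer> entry : LIMITS_BY_PREFIX.entrySet()) {
            if (path.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return DEFAULT_LIMIT;
    }
}
```

To keep the counters separate, the limiter key would also need to include the matched prefix (for example the client ID plus the prefix), otherwise calls to one endpoint would use up the budget of another.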
Limitations of Custom Approach
- Not distributed (each instance keeps its own counters, so the limit is not enforced across multiple instances)
- Memory grows with users
- Requires manual cleanup
Production systems use:
- Redis
- API Gateway (Spring Cloud Gateway, NGINX)
Bonus: Convert to Throttling
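Instead of rejecting the request, delay it and then let it through:
if (!rateLimiterService.allowRequest(ip)) {
    try { Thread.sleep(500); } // delay the excess request by 500 ms
    catch (InterruptedException e) { Thread.currentThread().interrupt(); }
}
chain.doFilter(request, response); // then pass it on
Now excess requests are slowed instead of rejected. Keep in mind that Thread.sleep holds a servlet worker thread for the whole delay, so heavy excess traffic can exhaust the thread pool.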
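A fixed pause treats the first excess request the same as the fiftieth. One possible refinement is to scale the delay with how far over the limit the client is, up to a cap. This is only a sketch: it assumes the limiter also reports how many requests the client has attempted in the current window, which the service in this post does not track, and `ThrottleDelay` is a hypothetical name.

```java
// Hypothetical helper: computes a delay that grows with the amount of excess traffic.
class ThrottleDelay {

    // requestsInWindow: attempts by this client in the current window (assumed available)
    // limit:            requests allowed without any delay
    // stepMillis:       extra delay per request over the limit
    // maxDelayMillis:   upper bound on the delay
    static long delayMillis(int requestsInWindow, int limit, long stepMillis, long maxDelayMillis) {
        int excess = requestsInWindow - limit;
        if (excess <= 0) {
            return 0; // within the limit: no delay
        }
        return Math.min(maxDelayMillis, excess * stepMillis);
    }
}
```

With a limit of 5, a step of 500 ms, and a cap of 2000 ms, the 7th request in a window would wait 1000 ms, and anything past the 9th would wait the full 2000 ms.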
Conclusion
Building your own rate limiter helps you:
- Understand system design deeply
- Crack backend interviews
- Design scalable APIs
Key Takeaways
Rate Limiting = control request count
Throttling = control request speed
Sliding Window = accurate algorithm
Custom solution = great for learning
Use Redis/Gateway in production