Understanding an API Rate Limiter (System Design)

Harsh Vardhan Gautam
3 min read · Jan 21, 2023


As the name suggests, a rate limiter is a system that limits the rate of something. In the case of an API rate limiter, that something is none other than API requests.

Image: API requests being filtered through an API rate limiter

Why have an API rate limiter in the first place? Let’s quickly look at its advantages.

  1. An API rate limiter protects a system from a possible DoS (Denial of Service) attack.
  2. By limiting API requests, the server is also protected from being overloaded.
  3. As an outcome of point 2, the company saves cost, since fewer servers can handle the traffic efficiently.

Let’s start with the system design of our API rate limiter. First of all, note that we’ll be designing a basic version of the system. If you want a more in-depth treatment of the same design, let me know in the comments and I’ll share resources with you.

Okay, so to get started, let’s jot down the components that will play a role in our system.

  1. Client
  2. API Rate limiter
  3. API Servers

Our API Rate limiter’s overall system design would look something like this.

System design of a basic API rate limiter
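To make the flow concrete, here is a minimal sketch (my own illustration, not a definitive implementation) of where the rate limiter sits in front of the API logic, using Node’s built-in http module. The isAllowed function is just a placeholder here; we’ll fill it in with an actual algorithm in a moment.

```javascript
const http = require("http");

// Placeholder check: the real algorithm (token bucket) comes later in the article.
function isAllowed(clientId) {
  return true; // always allow for now
}

const server = http.createServer((req, res) => {
  // Identify the client, e.g. by IP address or an API key header.
  const clientId = req.socket.remoteAddress;

  if (isAllowed(clientId)) {
    // Forward/handle the request as the API server normally would.
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ message: "request served" }));
  } else {
    // Drop the request with 429 Too Many Requests.
    res.writeHead(429, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "rate limit exceeded" }));
  }
});

server.listen(3000);
```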

Let’s design the API rate limiter. If you think about it technically, you’ll find that an API rate limiter is actually a very simple thing, yet quite a powerful one.

An API rate limiter performs its check in the steps below (a small sketch follows the list).

  1. The rate limiter needs to keep track of the number of requests per client.
  2. It needs to validate whether an incoming client request exceeds the number of requests allowed for that client.
  3. It needs to reset the per-client request count at a preset frequency so that the client can make requests again.
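
As a quick illustration of these three steps, here is a sketch of the simplest possible approach, a fixed-window counter (this is just for intuition; we’ll pick a different algorithm below). Step 1 is the per-client count, step 2 is the limit check, and step 3 is the periodic reset. The limit and window values are arbitrary examples.

```javascript
const LIMIT = 5;          // max requests allowed per client per window
const WINDOW_MS = 60000;  // reset frequency: one minute

const requestCounts = new Map(); // step 1: track the number of requests per client

function isAllowed(clientId) {
  const count = requestCounts.get(clientId) || 0;
  if (count >= LIMIT) return false;       // step 2: validate against the allowed limit
  requestCounts.set(clientId, count + 1);
  return true;
}

// Step 3: reset the counts at a preset frequency so clients can make requests again.
setInterval(() => requestCounts.clear(), WINDOW_MS);
```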

There are plenty of algorithms an API rate limiter can use to cover the above three points. I’ll be using one known as the Token Bucket algorithm.

This algorithm is very simple and intuitive. As its name suggests, we keep a bucket for each client and pre-fill it with a set number of tokens. Each time a request comes in from a client, we check that client’s bucket and validate whether enough tokens are available. If yes, the API rate limiter forwards the client request to the appropriate server; otherwise the request is dropped.
Simultaneously, we keep refilling the client buckets at a preset frequency so that the client can make requests again after the defined duration.
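
Here is a minimal token bucket sketch in JavaScript based on the description above (my own illustration; the exact code in the sandbox linked below may differ). Each client gets a bucket pre-filled with tokens, each request consumes a token if one is available, and a timer refills the buckets at a preset frequency. The capacity and refill numbers are arbitrary example values.

```javascript
const BUCKET_CAPACITY = 10; // tokens each client's bucket is pre-filled with
const REFILL_MS = 1000;     // how often buckets are refilled
const REFILL_TOKENS = 10;   // tokens added per refill (capped at capacity)

const buckets = new Map();  // clientId -> remaining tokens

function isAllowed(clientId) {
  // Lazily create a pre-filled bucket for a client we haven't seen before.
  if (!buckets.has(clientId)) buckets.set(clientId, BUCKET_CAPACITY);

  const tokens = buckets.get(clientId);
  if (tokens <= 0) return false;     // no tokens left: drop the request

  buckets.set(clientId, tokens - 1); // consume one token and forward the request
  return true;
}

// Refill every bucket at the preset frequency, never exceeding its capacity.
setInterval(() => {
  for (const [clientId, tokens] of buckets) {
    buckets.set(clientId, Math.min(BUCKET_CAPACITY, tokens + REFILL_TOKENS));
  }
}, REFILL_MS);
```

Plugged into the earlier server sketch, this isAllowed would replace the placeholder check sitting in front of the API servers.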

I’ve implemented a very simple API rate limiter using JavaScript. If you’ve come this far, you’ll find that in the code I tried to mock exactly the same architecture we discussed today. Please do check out the CodeSandbox here: https://codesandbox.io/embed/agitated-gwen-0hpjer?expanddevtools=1&fontsize=14&theme=dark

That’s quite a wonderful bit of learning, isn’t it?

Thank you so much for taking some time out to learn something new. I appreciate it.

I hope this article has added some value to your learning. If you have any feedback, please do share it in the comments.

Thank You!
