Making a custom Redis command for rate limiting (TypeScript + Lua)
What we’re building
In order to add the free tier for smudge.ai, I needed a way to limit the number of commands free users could run over a set period of time. Existing Node.js rate limiters didn’t quite fit the bill (more on that below), so I rolled my own*.
I used Redis as the data store. One catch for a TypeScript dev like myself: Redis scripting is all Lua. The approach I took was to prototype using the language I know, then port the solution to Lua once I had proved that the logic was sound. While the specific problem of building a custom rate limiter is fairly niche, defining custom Redis commands that you can call from Node.js is quite useful! With that in mind, I’m sharing what I’ve learned about this so far and hope you get something out of it, too.
In this post we’ll…
- Prototype the logic for a custom rate limiter in TypeScript
- Add persistence by swapping out the in-memory store for a Redis store
- Then move the heavy lifting into a custom Redis command that’ll be defined in a Lua script
Before we get into all that, there’s an important question to answer.
What is a day?
In order to limit free tier users to n commands per day, we first need to answer the question: what is a day? Seems simple, but there are some nuances.
- A day could be a fixed window beginning at midnight in the user’s timezone and lasting for 24 hours. While this definition aligns with most people’s natural understanding of a day, implementing it requires tracking each user’s timezone as well as handling complexities like daylight saving time or users changing timezones.
- A day can also be a 24-hour sliding window during which a maximum of n commands can be run. With this approach, commands become available one by one as they are freed up. Another way to think of this is: “the limit defines the maximum number of windows that can exist at any given time”. For my use case, users tend to use the demo in short bursts, so they’d be frustrated by the slow drip of commands being freed up sequentially and might find it hard to keep track of when commands become available again.
- A day can also be a 24-hour fixed window starting with your first command. This is the approach we ended up using. With a fixed window and a user-defined start time, we don’t need any timezone-handling logic, and predictability is better than with a sliding window. An important tradeoff with this approach is that it’s less clear when the next window begins compared to, say, a fixed window in the user’s timezone, so we show the time remaining in the UI and there is no guesswork.
Let’s look at some code.
We’ll need a function to record each hit to the rate limiter, which returns whether the request should be blocked (`limited: true`) and, if so, how long until a retry is allowed (`retryAfter: windowDuration - timeSinceWindowCreation`).
To allow each user 10 requests per 24-hour period, we’ll set the following parameters:
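As a sketch, with the window expressed in milliseconds (matching the units `pexpire` will use later):

```typescript
// Allow up to `max` hits per fixed window of `windowDuration` milliseconds.
const max = 10;
const windowDuration = 24 * 60 * 60 * 1000; // 24 hours
```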
We also need a place to store the records containing the number of times each user has triggered the rate limiter since their `windowStart` timestamp.
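For the prototype, that store can simply be an in-memory `Map` keyed by a per-user identifier (the record shape follows the `windowStart` and count fields described in this post):

```typescript
type RateLimitRecord = {
  windowStart: number; // ms timestamp of the user's first hit in the current window
  count: number;       // hits recorded since windowStart
};

// Keyed by a per-user identifier (e.g. a user ID or IP address).
const store = new Map<string, RateLimitRecord>();
```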
Lastly, we need the function that registers a hit, incrementing a user’s `count` in the `store` and returning whether they should be `limited`.
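Here’s one possible sketch of that function (the name `rateLimit` and the self-contained re-declarations of `max`, `windowDuration`, and `store` are assumptions for illustration):

```typescript
const max = 10;
const windowDuration = 24 * 60 * 60 * 1000;

type RateLimitRecord = { windowStart: number; count: number };
const store = new Map<string, RateLimitRecord>();

type RateLimitResult =
  | { limited: false }
  | { limited: true; retryAfter: number };

function rateLimit(id: string): RateLimitResult {
  const now = Date.now();
  const record = store.get(id);

  // No record yet, or the previous window has expired: start a fresh window.
  if (!record || now - record.windowStart >= windowDuration) {
    store.set(id, { windowStart: now, count: 1 });
    return { limited: false };
  }

  // Window still active and the user has hit the max: block the request.
  if (record.count >= max) {
    return {
      limited: true,
      retryAfter: windowDuration - (now - record.windowStart),
    };
  }

  // Window still active with capacity remaining: count the hit.
  record.count += 1;
  return { limited: false };
}
```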
Problems with this solution
It’s in memory.
- Rate limiting is isolated to a single long-running process, so say goodbye to horizontal scaling or serverless code.
- Restarting the server will nuke the `store`, resetting everybody’s rate limit windows.
It’s not configurable. The `max` and `windowDuration` are hard-coded constants.
- Easy fix: wrap the whole thing in a function that accepts those two parameters. Leaving this as an exercise for the reader.
It grows indefinitely. Not an issue in practice unless you have an enormous number of IDs in the `store`, but because expired records are never pruned, this will consume more and more memory with use.
Adding persistence and scalability with Redis
Let’s address the first issue by moving the store off-server and giving it persistence. Redis (or any of its forks) tends to be the go-to solution for rate limiting thanks to its speed and built-in support for expiring keys, which make it a breeze to build a rate limiter.
We can essentially swap out the in-memory `store` for a Redis hash map.
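As a sketch (not the post’s exact code), the Redis-backed version might look like the following. To keep the example self-contained and runnable, the client is typed as just the minimal slice of an ioredis-style API the function uses, and the `rate-limit:` key prefix is an assumption; note that the race condition between the read and the writes is intentionally still present at this stage:

```typescript
// Minimal slice of an ioredis-style client: only the commands this sketch uses.
interface RedisLike {
  hgetall(key: string): Promise<Record<string, string>>;
  hset(key: string, data: Record<string, number>): Promise<unknown>;
  pexpire(key: string, ms: number): Promise<unknown>;
  hincrby(key: string, field: string, increment: number): Promise<number>;
}

const max = 10;
const windowDuration = 24 * 60 * 60 * 1000;

async function rateLimit(
  store: RedisLike,
  id: string,
): Promise<{ limited: boolean; retryAfter?: number }> {
  const key = `rate-limit:${id}`; // key prefix is illustrative
  const record = await store.hgetall(key); // {} when the key doesn't exist

  // No window yet (or pexpire already deleted an old one): start fresh.
  if (!record.windowStart) {
    await store.hset(key, { windowStart: Date.now(), count: 1 });
    await store.pexpire(key, windowDuration); // auto-delete at window end
    return { limited: false };
  }

  // Redis hash values come back as strings, so convert before comparing.
  const timeElapsed = Date.now() - Number(record.windowStart);
  if (Number(record.count) >= max) {
    return { limited: true, retryAfter: windowDuration - timeElapsed };
  }

  await store.hincrby(key, "count", 1);
  return { limited: false };
}
```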
`timeElapsed` can never exceed `windowDuration` now that there’s a `pexpire`. On the other hand, I am not sure how much `pexpire` can be relied upon between restarts. If anyone with a definitive answer wants to let me know, please do!

Problems with this solution
- Race conditions. In the time between `await store.hgetall(id)` and the commands that update the record, a parallel update could have arrived. It’s for this reason we can’t just solve all our problems with a pipeline.
- Redis return types. I confess! The code above won’t run. I simplified some Redis types to avoid a distracting, noisy diff.
(Rant!)
…was more sensible than a named parameters object?!
If your Node.js Redis client really must maintain full parity with the underlying Redis API’s terse syntax, make it opt-in and nested under a `redis.raw` namespace or something. The current API means you need to memorize or look up the Redis docs for all but the most basic commands.
/rant. Probably just a skill issue. Back to work.
- It can go faster. We’re making sequential async requests to another server. What if there were a way to run these all in a single command?
Lua time!
You can run as many Redis functions as you would like within a single async command using Lua scripting, reducing the waterfall of round trips to just a single request/response.
Using ioredis, we’ll define a `rateLimit` command in Lua that gets called in place of most of our previous logic. (If you prefer not to use ioredis, you can always use another Redis client and call the underlying `SCRIPT LOAD` and `EVALSHA` commands manually.)
The goal will be to write a script that we can pass all the same inputs to, which will still return whether the request was limited, and if so include the time remaining until a retry is allowed.
A brief implementation note: instead of returning an object like `{ limited: true, retryAfter: 2500 }`, for this implementation we’ll use a tuple, like `[true, 2500]`, to represent the same information. But because booleans from the Lua script show up as numbers, that tuple will actually be `[1, 2500]`.

`{ false, 0 }` in Lua becomes `[null, 0]` in JS, whereas a return value of `{ true, 0 }` becomes `[1, 0]`. So, for consistency, I’m sticking with numbers.

The arguments passed in to the custom `rateLimit` command from our TypeScript code are then read within the Lua script much like one would read CLI arguments.
Finally, here’s the Lua equivalent to the logic from the previous iteration.
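For illustration, here’s one way such a script could look, with the Lua body held in a string the way ioredis’s `defineCommand` expects. The body below is a sketch reconstructed from the logic described above, not a drop-in implementation:

```typescript
// Options for ioredis's defineCommand. The Lua body is a sketch that mirrors
// the previous iteration: create a self-expiring window on the first hit,
// then count hits until `max` is reached.
const rateLimitCommand = {
  numberOfKeys: 1,
  lua: `
    -- KEYS[1]: the per-user hash key, e.g. "rate-limit:<id>"
    -- ARGV[1]: max hits allowed per window
    -- ARGV[2]: window duration in milliseconds
    local max = tonumber(ARGV[1])
    local windowDuration = tonumber(ARGV[2])

    local windowStart = redis.call("HGET", KEYS[1], "windowStart")

    -- No active window: start one that deletes itself when it expires.
    if not windowStart then
      local now = redis.call("TIME")
      local nowMs = now[1] * 1000 + math.floor(now[2] / 1000)
      redis.call("HSET", KEYS[1], "windowStart", nowMs, "count", 1)
      redis.call("PEXPIRE", KEYS[1], windowDuration)
      return { 0, 0 }
    end

    local count = tonumber(redis.call("HGET", KEYS[1], "count"))

    -- Over the limit: report how long until the window expires.
    if count >= max then
      local now = redis.call("TIME")
      local nowMs = now[1] * 1000 + math.floor(now[2] / 1000)
      return { 1, windowDuration - (nowMs - tonumber(windowStart)) }
    end

    -- Under the limit: record the hit.
    redis.call("HINCRBY", KEYS[1], "count", 1)
    return { 0, 0 }
  `,
};
```

With ioredis, this would be registered once with `redis.defineCommand("rateLimit", rateLimitCommand)` and then called like any other command, e.g. `await redis.rateLimit("rate-limit:user-123", 10, windowDuration)`.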
Why this is worthwhile
We make a lot of concessions in this code. There’s a call to another language, which needs to be loaded and evaluated. We sacrifice some type safety and readability. And the Lua code is in a string! But in exchange we get:
- Speed. In the hot path, there’s a single request to our Redis instance instead of 2-3 round trips. For our app, the Redis instance that the server connects to will be in the same datacenter, so rate limiting typically has sub-millisecond timing.
- Reliability. We’ve removed the race conditions from the previous iteration.
- Memory efficiency. With `PEXPIRE`, any expired windows will delete themselves automatically.
- Scalability. We can scale our application server horizontally without potentially spreading the window records across multiple servers.
- Persistence. When the server restarts, the store in Redis remains intact.
Conclusion
Honestly, you should probably just use something like Upstash for your rate limiting and call it a day! If your use case requires a fixed window (and you don’t care about the exact window start time), or if it requires a sliding window or token bucket approach, then their API is a joy to use.
On the other hand, if you need finer control over the start time for your fixed window rate limiter or simply enjoy building things yourself, consider giving an approach like this a try.
Where to go from here
There are many further improvements I’d like to make. Adding an ephemeral in-memory cache would allow skipping the hit to Redis while the rate limiter is hot. For multi-region apps, the limiter could support multiple Redis instances, selecting the one closest to the requesting server’s region and then propagating updates across the other instances. And adding support for multiple limiters, each with its own configurable `limit` and `windowDuration`, can be done by prefixing each key with a string unique to that limiter.
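That last idea can be sketched as a small change to how keys are built (the names here are hypothetical):

```typescript
// Namespacing keys per limiter keeps each limiter's records separate,
// so the same user ID can have independent windows in different limiters.
const limiterKey = (limiterName: string, id: string) =>
  `rate-limit:${limiterName}:${id}`;
```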
What would you change? Feel free to submit corrections, ideas, or any feedback to feedback@smudge.ai.
Thank you for reading!
And a huge thank you to @onsclom for inspiring this project, working with me to make it happen, and showing me how to make those canvas visualizations.
Here’s the final code from this article all together.
Smudge.ai is a Chrome extension that lets you save custom ChatGPT commands into your right-click menu. If that’s something you’re interested in, as a thank you for taking the time to read this post, you can take 20% off forever with the discount code RATELIMIT20. Cheers!