Making a custom redis command for rate limiting (TypeScript + Lua)

What we’re building

In order to add the free tier for smudge.ai, I needed a way to limit the number of commands free users could run over a set period of time. Existing Node.js rate limiters didn’t quite fit the bill (more on that below), so I rolled my own*.

* Edit: May 2, 2024: Well, you live and learn. It turns out we could have relied on an off-the-shelf token bucket limiter instead. Regardless, I had a great time and learned a lot by building this rate limiter from scratch and invoking a custom Redis command from Node.js.

I used Redis as the data store. One catch for a TypeScript dev like myself: Redis scripting is all Lua. The approach I took was to prototype using the language I know, then port the solution to Lua once I had proved that the logic was sound. While the specific problem of building a custom rate limiter is fairly niche, defining custom Redis commands that you can call from Node.js is quite useful! With that in mind, I’m sharing what I’ve learned about this so far and hope you get something out of it, too.

In this post we’ll…

Before we get into all that, there’s an important question to answer.

What is a day?

In order to limit free tier users to n commands per day, we first need to answer the question: what is a day? Seems simple, but there are some nuances.

  • A day could be a fixed window beginning at midnight in the user’s timezone and lasting for 24 hours. While this definition aligns with most people’s natural understanding of a day, implementing it requires tracking each user’s timezone as well as handling complexities like daylight savings or users changing timezones.
    (In the interactive visualizations on the original page, one marker represents a successful request and another a request blocked by the rate limiter; the Hit button adds a simulated request manually, pausing the automatic stream.)
  • A day can also be a 24-hour sliding window during which a maximum of n commands can be run. With this approach, commands become available one by one as they are freed up. Another way to think of this is: “the limit defines the maximum number of windows that can exist at any given time”. For my use case, users tend to use the demo in short bursts, so they’d be frustrated by the slow drip of commands being freed up sequentially and might find it hard to keep track of when commands become available again.
  • A day can also be a 24-hour fixed window starting with your first command. This is the approach we ended up using. With a fixed window and a user-defined start time, we don’t need any timezone-handling logic, and the result is more predictable than a sliding window.
    One important tradeoff with this approach is that it’s less clear when the next window begins compared to, say, a fixed window in the user’s timezone, so we show the time remaining in the UI to remove the guesswork.
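For contrast, the sliding-window variant can be sketched as a log of hit timestamps. This is purely illustrative and is not the approach used in the rest of this post; all names here are my own.

```typescript
// Illustrative only: a sliding-window limiter kept as a log of hit timestamps.
const limit = 10;
const windowDuration = 24 * 60 * 60 * 1000; // 24h in ms

type HitResult = { limited: false } | { limited: true; retryAfter: number };

const hitLog = new Map<string, number[]>(); // id -> timestamps of recent hits

function slidingHit(id: string, now = Date.now()): HitResult {
  // keep only hits that still fall inside the trailing 24h window
  const recent = (hitLog.get(id) ?? []).filter((t) => now - t < windowDuration);
  if (recent.length >= limit) {
    // the oldest surviving hit is the next one to free up a slot
    return { limited: true, retryAfter: recent[0] + windowDuration - now };
  }
  recent.push(now);
  hitLog.set(id, recent);
  return { limited: false };
}
```

Each allowed hit effectively opens its own 24-hour window, which is one way to read “the limit defines the maximum number of windows that can exist at any given time”.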

Let’s look at some code.

We’ll need a function to record each hit to the rate limiter, which returns whether the request should be blocked (limited: true) and if so, how long until a retry is allowed (retryAfter: windowDuration - timeSinceWindowCreation).

To allow each user 10 requests per 24-hour period, we’ll set the following parameters:

const limit = 10;
const windowDuration = 24 * 60 * 60 * 1000;

We also need a place to store the records containing the number of times each user has triggered the rate limiter since their windowStart timestamp.

type LimitRecord = {
  count: number;
  windowStart: Date;
};

const store = new Map<string, LimitRecord>();

Lastly, we need the function that registers a hit, incrementing a user’s count in the store and returning whether they should be limited.

// hit('user-001'); { limited: false }
// hit('user-001'); { limited: false }
// ... 8 more times ...
// hit('user-001'); { limited: true, retryAfter: 24h }
export function hit(id: string) {
  const now = new Date();
  const record = store.get(id);
  const timeElapsed = record ? now.getTime() - record.windowStart.getTime() : 0;
  // if no record or it expired, set a new one
  if (!record || timeElapsed > windowDuration) {
    store.set(id, { count: 1, windowStart: now });
    return { limited: false };
  }
  // increment the counter if it's within the limit
  if (record.count < limit) {
    record.count += 1;
    return { limited: false };
  }
  return {
    limited: true, // the request has been rate limited
    retryAfter: windowDuration - timeElapsed,
  };
}
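To see the reset path in action, here is the same logic exercised end to end (restated so the snippet runs standalone; backdating the stored record stands in for actually waiting 24 hours):

```typescript
const limit = 10;
const windowDuration = 24 * 60 * 60 * 1000;

type LimitRecord = { count: number; windowStart: Date };
const store = new Map<string, LimitRecord>();

// same hit() as above, restated here so this snippet runs on its own
function hit(id: string): { limited: boolean; retryAfter?: number } {
  const now = new Date();
  const record = store.get(id);
  const timeElapsed = record ? now.getTime() - record.windowStart.getTime() : 0;
  if (!record || timeElapsed > windowDuration) {
    store.set(id, { count: 1, windowStart: now });
    return { limited: false };
  }
  if (record.count < limit) {
    record.count += 1;
    return { limited: false };
  }
  return { limited: true, retryAfter: windowDuration - timeElapsed };
}

// use up the whole budget...
for (let i = 0; i < 10; i++) hit('user-001');
console.log(hit('user-001').limited); // true -- over the limit

// ...then backdate the record to simulate the 24h window elapsing
store.get('user-001')!.windowStart = new Date(Date.now() - windowDuration - 1);
console.log(hit('user-001').limited); // false -- a fresh window began
```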

Problems with this solution

  1. It’s in memory.

    • Rate limiting is isolated to a single long-running process, so say goodbye to horizontal scaling or serverless code.
    • Restarting the server will nuke the store, resetting everybody’s rate limit windows.
  2. It’s not configurable. The max and windowDuration are hard-coded constants.

    • Easy fix: wrap the whole thing in a function that accepts those two parameters. Leaving this as an exercise for the reader.
  3. It grows indefinitely. Not an issue in practice unless you have an enormous number of IDs in the store, but because expired records are never pruned this will consume more and more memory with use.
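For what it’s worth, that wrapper might look something like this (the createRateLimiter name is hypothetical; the body is the same in-memory logic as above):

```typescript
type LimitRecord = { count: number; windowStart: Date };
type HitResult = { limited: false } | { limited: true; retryAfter: number };

// hypothetical factory: the two hard-coded constants become parameters
function createRateLimiter(limit: number, windowDuration: number) {
  const store = new Map<string, LimitRecord>();
  return function hit(id: string): HitResult {
    const now = new Date();
    const record = store.get(id);
    const timeElapsed = record ? now.getTime() - record.windowStart.getTime() : 0;
    if (!record || timeElapsed > windowDuration) {
      store.set(id, { count: 1, windowStart: now });
      return { limited: false };
    }
    if (record.count < limit) {
      record.count += 1;
      return { limited: false };
    }
    return { limited: true, retryAfter: windowDuration - timeElapsed };
  };
}

// two independent limiters with different budgets
const freeTierHit = createRateLimiter(10, 24 * 60 * 60 * 1000);
const demoHit = createRateLimiter(3, 60 * 1000);
```

Note that the in-memory and unbounded-growth problems remain; this only addresses configurability.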

Adding persistence and scalability with Redis

Let’s address the first issue and move the store somewhere off-server and with persistence. Redis (or any of its forks) tends to be the go-to solution when it comes to rate limiting due to its speed and built-in functions that support expiring keys, which make it a breeze to build a rate limiter.

We can essentially swap out the in-memory store for a Redis hash map.

import { Redis } from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

-export function hit(id: string) {
+export async function hit(id: string) {
   const now = new Date();
-  const record = store.get(id);
+  const record = await redis.hgetall(id);
   const timeElapsed = record ? now.getTime() - record.windowStart.getTime() : 0;
   // if no record or it expired, set a new one
   if (!record || timeElapsed > windowDuration) { // [1]
-    store.set(id, { count: 1, windowStart: now });
+    await redis.hset(id, 'windowStart', now, 'count', 1);
+    await redis.pexpire(id, windowDuration); // new!
     return { limited: false };
   }
   // increment the counter if it's within the limit
   if (record.count < limit) {
-    record.count += 1;
+    await redis.hincrby(id, 'count', 1);
     return { limited: false };
   }
   return {
     limited: true, // the request has been rate limited
     retryAfter: windowDuration - timeElapsed,
   };
 }
[1] Aside: I’ve been curious whether I can safely remove the second half of this condition, since timeElapsed can never exceed windowDuration now that there’s a pexpire. On the other hand, I’m not sure how much pexpire can be relied upon between restarts. If anyone with a definitive answer wants to let me know, please do!

Problems with this solution

  1. Race conditions. In the time between await redis.hgetall(id) and the commands that update the record, a parallel update could have arrived. It’s for this reason we can’t just solve all our problems with a pipeline.
  2. Redis return types. I confess! The code above won’t run. I simplified some Redis types to avoid a distracting noisy diff.
(Rant!)
Who decided that this…
redis.set(
  'myKey', // key
  'Hello, world!', // value
  'EX', // parameter name
  300, // parameter value
  'NX', // flag
  'GET', // flag
);

…was more sensible than a named parameters object?!

redis.set({
  key: 'myKey',
  value: 'Hello, world!',
  expireSeconds: 300,
  onlySetIfNotExists: true,
  returnPreviousValue: true,
});

If your Node.js Redis client really must maintain full parity with the underlying Redis API’s terse syntax, make it opt-in and nested under a redis.raw namespace or something. The current API means you need to memorize or look up the Redis docs for all but the most basic commands.

/rant. Probably just a skill issue. Back to work.

  3. It can go faster. We’re making sequential async requests to another server. What if there were a way to run them all in a single command?

Lua time!

You can run as many Redis functions as you would like within a single async command using Lua scripting, reducing the waterfall of round trips to just a single request/response.

Using ioredis we’ll define a rateLimit command in Lua that gets called in place of most of our previous logic. (If you prefer not to use ioredis, you can always use other Redis clients and call the underlying SCRIPT LOAD and EVALSHA commands manually.)
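For the manual route, the shape is roughly the following. The interface below is a minimal stand-in I’m defining for illustration; the method names mirror what ioredis auto-generates for the SCRIPT and EVALSHA commands, but verify the exact signatures against the client you use.

```typescript
// minimal stand-in for the client surface we need (illustrative, not ioredis itself)
interface ScriptingClient {
  script(subcommand: 'LOAD', lua: string): Promise<string>;
  evalsha(sha: string, numKeys: number, ...args: (string | number)[]): Promise<unknown>;
}

// load the script, then invoke it by its sha with 1 key and 3 args
async function callRateLimit(
  client: ScriptingClient,
  lua: string,
  key: string,
  limit: number,
  windowDuration: number,
  now: number,
): Promise<unknown> {
  const sha = await client.script('LOAD', lua); // cache this sha in practice
  return client.evalsha(sha, 1, key, limit, windowDuration, now);
}
```

A production version would cache the sha and fall back to a plain EVAL when Redis replies with a NOSCRIPT error, which can happen after a restart if the script cache was flushed.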

The goal will be to write a script that we can pass all the same inputs to, which will still return whether the request was limited, and if so include the time remaining until a retry is allowed.

A brief implementation note: instead of returning an object like { limited: true, retryAfter: 2500 }, for this implementation we’ll use a tuple like [true, 2500] to represent the same information. But because booleans from the Lua script show up as numbers, that tuple will actually be [1, 2500].
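To keep that convention visible on the TypeScript side, a labeled tuple type can name the two slots (my own naming, not part of ioredis):

```typescript
// [limited, retryAfter]: 1 means limited; retryAfter only matters when limited === 1
type RateLimitReply = [limited: 0 | 1, retryAfter: number];

const allowed: RateLimitReply = [0, 0];
const blocked: RateLimitReply = [1, 2500];

console.log(blocked[0] === 1 ? `retry in ${blocked[1]}ms` : 'allowed'); // "retry in 2500ms"
```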

import { Redis, type Result } from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

// 1. write a script in lua to execute in redis,
// then use it to define a custom command
redis.defineCommand('rateLimit', {
  numberOfKeys: 1,
  lua: `[a string of lua]`, // implemented later
});

// 2. let ts know the lua command's types [2]
declare module 'ioredis' {
  interface RedisCommander<Context> {
    rateLimit(
      key: string,
      limit: number,
      windowDuration: number,
      now: number,
    ): Result<[number, number], Context>;
  }
}

export async function hit(id: string) {
  const now = new Date();
  // 3. run the custom command
  const [limited, retryAfter] = await redis.rateLimit(
    id,
    limit,
    windowDuration,
    now.getTime(),
  );
  // we're representing bools as numbers [3]
  if (limited === 0) {
    return { limited: false };
  }
  return { limited: true, retryAfter };
}
[2] Because the TypeScript compiler has no way to know the custom command’s types, we have to declare them manually. There’s no type-safety with this. We have to carefully ensure that the Lua script adheres to these types. (Probably a good case for a unit test.) The type declaration is copied from this scripting example.
[3] From my tests, returning { false, 0 } in Lua becomes [null, 0] in js, whereas a return value of { true, 0 } becomes [1, 0]. So, for consistency, I’m sticking with numbers.

The arguments passed in to the custom rateLimit command from our TypeScript code are then read within the Lua script similar to how one would read CLI arguments.

-- (within the lua string above)
local key = KEYS[1] -- lua uses 1-based indexing
local limit = tonumber(ARGV[1])
local windowDuration = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

Finally, here’s the Lua equivalent to the logic from the previous iteration.

-- (continued from above)

-- attempt to fetch an existing record
-- (see https://redis.io/commands/hmget/)
local record = redis.call('HMGET', key, 'windowStart', 'count')
local windowStart = record[1]
local count = record[2]

-- if no record or it expired, set a new one
-- (we get all false values if no record exists)
if windowStart == false then
  redis.call('HSET', key, 'windowStart', now, 'count', 1)
  redis.call('PEXPIRE', key, windowDuration)
  return { 0, 0 } -- becomes the tuple `[0, 0]` in js
end

-- increment the counter if it's within the limit
-- (tonumber because all values are saved as strings)
if tonumber(count) < limit then
  redis.call('HINCRBY', key, 'count', 1)
  return { 0, 0 }
end

local timeElapsed = now - tonumber(windowStart)

-- the request has been rate limited
-- (return `1` for `true`)
return { 1, windowDuration - timeElapsed }

Why this is worthwhile

We make a lot of concessions in this code. There’s a call to another language, which needs to be loaded and evaluated. We sacrifice some type safety and readability. And the Lua code is in a string! But in exchange we get:

  • Speed. In the hot path, there’s a single request to our Redis instance instead of 2-3 round trips. For our app, the Redis instance that the server connects to will be in the same datacenter, so rate limiting typically has sub-millisecond timing.
  • Reliability. We’ve removed the race conditions from the previous iteration.
  • Memory efficiency. With PEXPIRE any expired windows will delete themselves automatically.
  • Scalability. We can scale our application server horizontally without potentially spreading the window records across multiple servers.
  • Persistence. When the server restarts, the store in Redis remains intact.

Conclusion

Honestly, you should probably just use something like Upstash for your rate limiting and call it a day! If your use case requires a fixed window (and you don’t care about the exact window start time), or if it requires a sliding window or token bucket approach, then their API is a joy to use.

On the other hand, if you need finer control over the start time for your fixed window rate limiter or simply enjoy building things yourself, consider giving an approach like this a try.

Where to go from here

There are many further improvements I’d like to make. Adding an ephemeral cache in memory would allow skipping the hit to Redis while the rate limiter is hot. For multi-region apps, the limiter could support multiple Redis instances, selecting the one closest to the requesting server’s region and then propagating updates across the other instances. And adding support for multiple limiters, each with its own configurable limit and windowDuration, can be done by prefixing each key with a string unique to that limiter.
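That last idea, multiple named limiters, mostly comes down to a key-naming convention. A quick sketch (the registry shape and names here are my own, not from the code above):

```typescript
type LimiterConfig = { limit: number; windowDuration: number };

// hypothetical registry: each limiter gets its own budget and key namespace
const limiters: Record<string, LimiterConfig> = {
  freeTier: { limit: 10, windowDuration: 24 * 60 * 60 * 1000 },
  demo: { limit: 3, windowDuration: 60 * 60 * 1000 },
};

// prefixing the Redis key keeps each limiter's records from colliding
function buildKey(limiterName: string, userId: string): string {
  return `ratelimit:${limiterName}:${userId}`;
}

// the custom command would then be invoked per limiter, e.g.:
//   redis.rateLimit(buildKey('demo', id), limiters.demo.limit, limiters.demo.windowDuration, Date.now())
console.log(buildKey('freeTier', 'user-001')); // "ratelimit:freeTier:user-001"
```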

What would you change? Feel free to submit corrections, ideas, or any feedback to feedback@smudge.ai.

Thank you for reading!


And a huge thank you to @onsclom for inspiring this project, working with me to make it happen, and showing me how to make those canvas visualizations.


Here’s the final code from this article all together.
import { Redis, type Result } from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

const rateLimitScript = `
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local windowDuration = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local record = redis.call('HMGET', key, 'windowStart', 'count')
local windowStart = record[1]
local count = record[2]
if windowStart == false then
  redis.call('HSET', key, 'windowStart', now, 'count', 1)
  redis.call('PEXPIRE', key, windowDuration)
  return { 0, 0 }
end
if tonumber(count) < limit then
  redis.call('HINCRBY', key, 'count', 1)
  return { 0, 0 }
end
local timeElapsed = now - tonumber(windowStart)
return { 1, windowDuration - timeElapsed }`;

redis.defineCommand('rateLimit', {
  numberOfKeys: 1,
  lua: rateLimitScript,
});

declare module 'ioredis' {
  interface RedisCommander<Context> {
    rateLimit(
      key: string,
      limit: number,
      windowDuration: number,
      now: number,
    ): Result<[number, number], Context>;
  }
}

const limit = 10;
const windowDuration = 24 * 60 * 60 * 1000;

export async function hit(id: string) {
  const now = new Date();
  const [limited, retryAfter] = await redis.rateLimit(
    id,
    limit,
    windowDuration,
    now.getTime(),
  );
  if (limited === 0) {
    return { limited: false };
  }
  return { limited: true, retryAfter };
}

Smudge.ai is a Chrome extension that lets you save custom ChatGPT commands into your right-click menu. If that’s something you’re interested in, as a thank you for taking the time to read this post, you can take 20% off forever with the discount code RATELIMIT20. Cheers!