Why I Obsessed Over JavaScript Generators
So here’s how my obsession with JavaScript generators started…
A few days back, I was working on a task where I had to write a data-fix/migration script. At first, it looked easy and ran well for a few users, but then we had to run it for all users. That’s when it started taking more time and causing memory and CPU issues, as it had to fetch all users’ data and store it while processing.
I understood that to handle this efficiently, we needed to process data in batches. We considered a database-level approach using cursors, but after studying them more, I found out that they can also cause performance issues. I started searching for better approaches, and that’s when I discovered a solution with generators.
So, enough of the past. Let’s focus on the present.
Understanding the Foundations
Before diving into generators, we need to understand a few concepts like iterator and iterable. Let’s get started.
What are Iterators?
Simple definition: an iterator is any object that follows the iterator protocol. This means any object that has a method called `next()` which returns an object containing two keys, `value` and `done`:
- `value` is the value produced by the current iteration
- `done` indicates whether the iteration has finished
Here’s an example:
```javascript
const getUsers = {
  index: 0,
  next() {
    const users = ["Tejas", "Omkar", "Yash", "Manthan"];
    return {
      value: users.at(this.index++),
      done: this.index > users.length,
    };
  },
};

getUsers.next(); // {value: 'Tejas', done: false}
getUsers.next(); // {value: 'Omkar', done: false}
getUsers.next(); // {value: 'Yash', done: false}
getUsers.next(); // {value: 'Manthan', done: false}
getUsers.next(); // {value: undefined, done: true}
// ...this still continues if called
getUsers.next(); // {value: undefined, done: true}
```
In the snippet above, we signal that iteration has ended by setting `done` to `true` once the index moves past the array bounds, but nothing stops us from calling `next()` again. There's no automatic termination here.
By the way, we can also have an infinite iterator like this:
```javascript
const iAmInfinity = {
  index: 0,
  next() {
    return {
      value: this.index++,
      done: false,
    };
  },
};

iAmInfinity.next(); // {value: 0, done: false}
iAmInfinity.next(); // {value: 1, done: false}
iAmInfinity.next(); // {value: 2, done: false}
iAmInfinity.next(); // {value: 3, done: false}
iAmInfinity.next(); // {value: 4, done: false}
// ...continues indefinitely
```
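Of course, an infinite iterator is only safe to consume if the caller decides when to stop. Here's a small sketch with a `take()` helper (my own illustration, not part of any library) that pulls a fixed number of values by calling `next()` manually:

```javascript
// Pulls at most `count` values from any iterator by calling next() manually.
function take(iterator, count) {
  const values = [];
  for (let i = 0; i < count; i++) {
    const { value, done } = iterator.next();
    if (done) break; // stop early if the iterator ever finishes
    values.push(value);
  }
  return values;
}

const iAmInfinity = {
  index: 0,
  next() {
    return { value: this.index++, done: false };
  },
};

console.log(take(iAmInfinity, 5)); // [0, 1, 2, 3, 4]
```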
So, that's the iterator protocol. Not exciting, right? It feels as plain as functions and closures. But bear with me for the next ten minutes and you'll see what I'm getting at.
What are Iterables?
Again, a simple definition: an iterable is any object that follows the iterable protocol. This means any object that has a method called `[Symbol.iterator]()` which returns an iterator.
Here’s an example:
```javascript
const getUsers = {
  [Symbol.iterator]() {
    return {
      index: 0,
      next() {
        const users = ["Tejas", "Omkar", "Yash", "Manthan"];
        return {
          value: users.at(this.index++),
          done: this.index > users.length,
        };
      },
    };
  },
};
```
Now you might ask: what’s the difference? Just some weird syntax change? But wait a minute…
You know what `for...of` takes as input to loop over? That's right, an iterable!
So you can do the following with a custom-defined iterable:
```javascript
for (const user of getUsers) {
  console.log(user); // "Tejas", "Omkar", "Yash", "Manthan"
}
```
And since we can do similar things with arrays, objects, etc., this means I can also destructure it like:
```javascript
const [u1, u2, ...others] = getUsers;
console.log(u1); // "Tejas"
console.log(u2); // "Omkar"
console.log(others); // ['Yash', 'Manthan']
```
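The spread operator works for the same reason, since `...` also consumes the iterable protocol. A self-contained sketch (the iterable is repeated here so the snippet runs on its own):

```javascript
const getUsers = {
  [Symbol.iterator]() {
    let index = 0;
    const users = ["Tejas", "Omkar", "Yash", "Manthan"];
    return {
      // Arrow function + closure over `index`, so no `this` binding needed
      next: () => ({ value: users.at(index++), done: index > users.length }),
    };
  },
};

// Spread pulls every value out of the iterable into a plain array
const allUsers = [...getUsers];
console.log(allUsers); // ['Tejas', 'Omkar', 'Yash', 'Manthan']
```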
Enter Generators
Now generators enter the scene. They give you an object that is both an iterator and an iterable, but with much cleaner syntax.
```javascript
function* getUsers() {
  yield "Tejas";
  yield "Omkar";
  yield "Yash";
  yield "Manthan";
}

const getUsersGenerator = getUsers();

getUsersGenerator.next(); // {value: 'Tejas', done: false}
getUsersGenerator.next(); // {value: 'Omkar', done: false}
getUsersGenerator.next(); // {value: 'Yash', done: false}
getUsersGenerator.next(); // {value: 'Manthan', done: false}
getUsersGenerator.next(); // {value: undefined, done: true}
// ...this continues if called
getUsersGenerator.next(); // {value: undefined, done: true}

// We can also use them in for...of
for (const user of getUsers()) {
  console.log(user); // "Tejas", "Omkar", "Yash", "Manthan"
}
```
As you can see, it's the same as a normal function but with a `*` after the `function` keyword, which marks it as a generator function. We also see a new keyword, `yield`, which produces the value for the current iteration. Using `yield`, we can pause and resume our processing/execution.

A normal function runs from start to finish, but with `yield` we can pause the function and later resume it from exactly where it stopped. It keeps going until it reaches the last `yield` or a `return` statement.
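To see the pausing in action, here's a tiny sketch: the `console.log` lines between the `yield`s only run when the consumer asks for the next value.

```javascript
function* steps() {
  console.log("step 1 starts");
  yield "one";
  console.log("step 2 starts"); // runs only on the second next() call
  yield "two";
  console.log("done");
  return "finished";
}

const it = steps();
it.next(); // logs "step 1 starts", returns {value: 'one', done: false}
it.next(); // logs "step 2 starts", returns {value: 'two', done: false}
it.next(); // logs "done", returns {value: 'finished', done: true}
```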
Async Generators: The Real Magic
There are also asynchronous generators, which can `await` async work between yields. Here's how I used one to solve my use case:
```javascript
async function* getUsersInBatch(batchSize = 10) {
  const usersCount = await getAllUsersCount();
  const totalBatches = Math.ceil(usersCount / batchSize);

  for (let batchNumber = 0; batchNumber < totalBatches; batchNumber++) {
    const offset = batchNumber * batchSize;
    const limit = Math.min(batchSize, usersCount - offset);

    const usersBatch = await getUsers(limit, offset);

    yield {
      usersBatch,
      batchNumber: batchNumber + 1,
      totalBatches,
    };
  }
}

// Usage
for await (const { usersBatch, batchNumber, totalBatches } of getUsersInBatch()) {
  await performDataFixForUsers(usersBatch); // some function for processing
  console.log(`Completed ${batchNumber}/${totalBatches} batches`);
}
```
This might look like a lot to digest, but let me explain how it works step by step:
- Initialize the generator: We call `getUsersInBatch()`, which returns a generator object
- Calculate batches: We first fetch the total user count, then calculate how many batches are needed to cover all users
- Process in batches: We loop, fetching users with `limit` and `offset`, and `yield` each batch after fetching it
- Pause and resume: The `yield` statement sits inside a `for` loop, so the function keeps resuming until all batches are completed (all users are fetched)
Here’s the beautiful part: we pause execution after fetching each batch and hand that batch to a function that processes only that particular batch. Once processing finishes, execution resumes: `batchNumber` is incremented and we fetch the next batch using the new `limit` and `offset`.
Also, this generator function is generic, meaning we can reuse it anywhere we need to fetch users in batches and process them.
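To make the flow concrete, here's a self-contained sketch of the same pattern with in-memory stand-ins for `getAllUsersCount` and `getUsers` (the real versions would query the database):

```javascript
// In-memory stand-ins for the database helpers.
const ALL_USERS = Array.from({ length: 23 }, (_, i) => `user-${i + 1}`);
const getAllUsersCount = async () => ALL_USERS.length;
const getUsers = async (limit, offset) => ALL_USERS.slice(offset, offset + limit);

async function* getUsersInBatch(batchSize = 10) {
  const usersCount = await getAllUsersCount();
  const totalBatches = Math.ceil(usersCount / batchSize);

  for (let batchNumber = 0; batchNumber < totalBatches; batchNumber++) {
    const offset = batchNumber * batchSize;
    const limit = Math.min(batchSize, usersCount - offset);

    // Only this one batch is held in memory at a time
    yield {
      usersBatch: await getUsers(limit, offset),
      batchNumber: batchNumber + 1,
      totalBatches,
    };
  }
}

(async () => {
  for await (const { usersBatch, batchNumber, totalBatches } of getUsersInBatch()) {
    console.log(`batch ${batchNumber}/${totalBatches}: ${usersBatch.length} users`);
  }
})();
// batch 1/3: 10 users
// batch 2/3: 10 users
// batch 3/3: 3 users
```

With 23 mock users and a batch size of 10, the generator pauses three times, handing over 10, 10, and finally 3 users.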
Why This Approach Won
This solution solved my original problem perfectly:
- Memory efficient: Only one batch is loaded in memory at a time
- Pausable execution: Processing can be paused and resumed
- Clean code: Much more readable than manual pagination logic
- Reusable: The generator can be used across different scenarios
So yes, that’s how I fell in love with generators.
See you in the next blog post!