Node.js Performance: The Stuff I Wish Someone Had Told Me

by @menghnanimohit89, February 26th, 2025
I have been in the tech industry for more than a decade, and during this time, I must admit I have broken production multiple times. Through this article, I aim to share some useful things that I wish I had known earlier.

The Day Our Server was on Fire (Not Literally, Thank God)

I led the release of a feature that let users bulk-upload their product catalogs. It was supposed to be simple CSV-processing logic; what could really go wrong?


What actually happened:


// My terrible, horrible, no-good code
const { readFileSync } = require('fs');

router.post('/upload', (req, res) => {
  const products = readFileSync(req.files[0].path, 'utf-8') // blocks the event loop for the whole file
    .split('\n')
    .map(line => parseLine(line)); // heavy CPU work
  
  saveToDatabase(products);
  res.send('Done!');
});


Let's break down what actually happened: User 1 uploads 100 rows, User 2 uploads 1,000 rows (still manageable), but then User 3 decides to upload their entire product catalog of 50,000 rows, and everything goes haywire. The synchronous read and parse block the event loop, so every other request just sits there waiting.
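
We didn't have hard numbers at the time, but if you want to see that kind of blockage for yourself, Node's built-in perf_hooks module can report event loop delay while a big upload runs. This snippet is just an illustration, not something from the original incident:

// Watch event loop delay - if the p99 climbs into the hundreds of ms, something is blocking
const { monitorEventLoopDelay } = require('perf_hooks');

const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

setInterval(() => {
  console.log(`event loop p99 delay: ${(histogram.percentile(99) / 1e6).toFixed(1)}ms`);
  histogram.reset();
}, 5000);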


After some research, we realized we should have been processing rows in batches all along to keep the event loop from going crazy.

// Assuming fast-csv for parsing and an upload middleware (e.g. multer) populating req.files
const { createReadStream } = require('fs');
const csv = require('fast-csv');

router.post('/upload', async (req, res) => {
  // Tell the user we got their file before we process it
  res.send('Upload received, processing started');
  
  const readable = createReadStream(req.files[0].path)
    .pipe(csv.parse({ headers: true }));
    
  // Process 100 rows at a time to keep the event loop happy
  let chunk = [];
  for await (const row of readable) {
    chunk.push(row);
    
    if (chunk.length >= 100) {
      await processChunk(chunk);
      chunk = [];
      // Let other requests breathe
      await new Promise(r => setTimeout(r, 10));
    }
  }
  
  // Don't forget the last chunk!
  if (chunk.length > 0) {
    await processChunk(chunk);
  }
});
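
By the way, processChunk isn't anything fancy; conceptually it's just one batched database write per chunk. Here's a minimal sketch, assuming a MongoDB 'products' collection and the getDb() helper from the connection pool section below (swap in whatever storage you actually use):

// Hypothetical processChunk: one bulk insert per chunk of ~100 rows
async function processChunk(rows) {
  const db = await getDb(); // shared connection pool, see the next section
  await db.collection('products').insertMany(rows, { ordered: false });
}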

The Memory Leak That Wasn't

A few years ago, I spent ages debugging a memory leak and got nowhere, until it turned out the problem was actually with ... MongoDB's connection pool.


All of our API endpoints looked like this:


app.get('/api/whatever', async (req, res) => {
  const client = await MongoClient.connect(url); // brand new connection on EVERY request
  try {
    const data = await client.db().collection('stuff').find().toArray();
    res.json(data);
  } finally {
    await client.close(); // thought I was being clever here
  }
});


The problem with this approach: creating and closing connections for every request is slow, and the connection churn was more than Node.js garbage collection could keep up with. Our memory graph looked like a very drunk snake on a rollercoaster.


So, what should we do here? Create one connection pool and reuse it.

// db.js - The file I should've written on day one
let _client = null;

async function getDb() {
  if (!_client) {
    const client = new MongoClient(url, {
      maxPoolSize: 50,
      minPoolSize: 10,
      // Found these numbers through trial and error
      // AKA: "let's change it and see what breaks"
    });
    
    _client = await client.connect();
    
    // The lifesaver: close the pool cleanly on shutdown
    process.on('SIGINT', async () => {
      await _client.close();
      process.exit();
    });
  }
  return _client.db();
}
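
With that in place, every endpoint just borrows from the shared pool instead of opening its own connection. Roughly what the handlers look like afterwards, assuming the getDb() above and the same 'stuff' collection:

app.get('/api/whatever', async (req, res) => {
  const db = await getDb(); // borrows from the pool, no connect/close per request
  const data = await db.collection('stuff').find().toArray();
  res.json(data);
});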

Real Talk About Worker Threads

The right technology is only truly effective when applied in the right context. Lately, it’s become a trend to use worker threads everywhere. But in my opinion, they should only be used when really needed. Let’s take a look at what not to do:


// Don't copy this. Seriously.
const { Worker } = require('worker_threads');

const workers = new Array(100).fill(null).map(() => 
  new Worker('./worker.js')
);

function getNextWorker() {
  return workers[Math.floor(Math.random() * workers.length)];
}


What this really does is spawn 100 workers up front, each with its own V8 instance and event loop; think about what that does to memory usage.


Instead, keep only 2 workers always ready, with the option to scale up only when it's actually needed, and kill idle workers after a fixed amount of time.


const WorkerPool = require('./worker-pool'); // wrote this myself after much pain
const pool = new WorkerPool({
  min: 2,          // always keep 2 workers ready
  max: 10,         // spin up more as needed
  idleTimeout: 30000  // kill idle workers after 30s
});

app.post('/process-image', async (req, res) => {
  const worker = await pool.acquire();
  try {
    const result = await worker.process(req.body);
    res.json(result);
  } finally {
    pool.release(worker);
  }
});
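
Since worker-pool is something I wrote myself, here's a bare-bones sketch of the same idea if you want to roll your own: keep min workers warm, grow up to max, and terminate idle ones after a timeout. It hands back raw worker_threads Workers, so wrapping postMessage into something like the process() helper used above is left out.

// worker-pool.js - a bare-bones sketch of the idea, not a drop-in library
const { Worker } = require('worker_threads');

class WorkerPool {
  constructor({ min = 2, max = 10, idleTimeout = 30000 } = {}) {
    this.min = min;
    this.max = max;
    this.idleTimeout = idleTimeout;
    this.idle = [];     // workers waiting for work
    this.waiting = [];  // callers waiting for a worker
    this.size = 0;      // total workers alive
    for (let i = 0; i < min; i++) this.idle.push(this._spawn());
  }

  _spawn() {
    this.size++;
    return new Worker('./worker.js');
  }

  async acquire() {
    if (this.idle.length > 0) return this.idle.pop();
    if (this.size < this.max) return this._spawn();
    // At max capacity: wait until someone calls release()
    return new Promise(resolve => this.waiting.push(resolve));
  }

  release(worker) {
    const next = this.waiting.shift();
    if (next) return next(worker); // hand it straight to the next caller

    this.idle.push(worker);
    // Kill idle workers above the minimum after idleTimeout
    setTimeout(() => {
      const i = this.idle.indexOf(worker);
      if (i !== -1 && this.size > this.min) {
        this.idle.splice(i, 1);
        this.size--;
        worker.terminate();
      }
    }, this.idleTimeout).unref();
  }
}

module.exports = WorkerPool;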

The Simplest and Quickest Performance Fix Ever

As products scale and companies grow, acquiring more clients and generating more revenue, the demand for better performance only intensifies. One of the simplest performance fixes I've ever shipped is this one line:


app.use(express.json({ limit: '1mb' }));


Previously, we let users send JSON bodies with no limit at all. A few customers were sending payloads of up to 50MB, and Express would spend forever parsing them in memory.
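
One nicety: when a body blows past that limit, Express (via body-parser) rejects it with an error whose type is 'entity.too.large'. If you'd rather clients get a clean 413 than a generic error response, a tiny error handler does it. This is a sketch, not something from our actual codebase:

// Turn the "payload too large" rejection into a clean 413
app.use((err, req, res, next) => {
  if (err.type === 'entity.too.large') {
    return res.status(413).json({ error: 'Request body too large, keep it under 1MB' });
  }
  next(err);
});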

Monitoring Without Losing Your Mind

The last thing I want to cover today is monitoring. I keep a small script that watches memory usage and logs significant changes over time. It looks like this:


// Poor man's performance monitoring
const lastReadings = {};

setInterval(() => {
  const used = process.memoryUsage();
  
  Object.keys(used).forEach(key => {
    const lastUsage = lastReadings[key] || used[key];
    const delta = used[key] - lastUsage;
    
    if (Math.abs(delta) > 1000000) { // 1MB change
      console.log(`Memory ${key}: ${(delta / 1024 / 1024).toFixed(2)}MB change`);
      // Slack webhook goes here if you're fancy
    }
    
    lastReadings[key] = used[key];
  });
}, 30000);


It may not be elegant, but it gets the job done.
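
And if you do want the Slack part, the wiring is tiny. Here's a sketch, assuming Node 18+ for the global fetch and a hypothetical SLACK_WEBHOOK_URL environment variable:

// Hypothetical Slack alert for the "if you're fancy" branch above
async function alertSlack(message) {
  if (!process.env.SLACK_WEBHOOK_URL) return;
  await fetch(process.env.SLACK_WEBHOOK_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: message }),
  });
}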

What I Actually Learned

Let’s do a quick recap of the concepts we discussed above:

  1. Batch Processing is your friend – Processing CSV uploads in batches prevents the event loop from choking.


  2. Connection Pools are a lifesaver – Reusing database connections avoids excessive memory churn and improves performance. When something is slow, it's probably doing more work than you think.


  3. Use Worker Threads Wisely – Keep a minimal number of workers ready and scale only when necessary. They are like jalapeños; you probably need fewer than you think.


  4. Limit JSON Uploads – Restricting payload size prevents excessive memory usage and slowdowns.


  5. Monitor Memory Trends – Measure and track memory changes over time to catch issues before they escalate. If there is smoke, there has to be a fire somewhere!


Got stories of your own server disasters? Please drop them in the comments. I am always looking for new ideas!