Diagnosing observer leaks in Meteor

An observer leak silently drains your Meteor server's RAM and Mongo CPU until the inevitable OOM. Here's how to find them — and stop them shipping.

You have a Meteor app. RAM has been creeping up for weeks. Connection count is flat. You restart the process and it's fine for a few hours, then it starts climbing again. You're paying for double the boxes you should need, and Mongo is hot.

You've got an observer leak.

This guide shows you how to confirm it, find the leaky publication, and ship a fix.

What an observer leak looks like

LiveQuery — Meteor's reactive layer — keeps a server-side observer open for each subscription. If your code starts an observer manually (via cursor.observeChanges() inside a publication) and forgets to register a this.onStop() callback to stop it, that observer is never cleaned up when the client disconnects. It just sits there, replaying changes from MongoDB to a phantom client that has long since left.

Multiply by every page-view from every user over a week and you have a classic memory leak. RAM goes up. Mongo CPU goes up. P99 latency goes up. Everyone gets paged.

Step 1 — confirm the leak exists

In the UptimeClarity dashboard, open Reactive → Overview. Sort by Observers descending. The leak looks like this:

Publication           Observers   Conns   Δ observers / hr
posts.byUser              4,812     142            +218 ⚠
inbox.unread              1,210     142             ±0
team.members                 47     142             ±0

posts.byUser has 34× as many observers as connections, and the count is strictly rising.

If you're using a different APM, look for:

  • A livequery.observers.count metric. If it's monotonically rising while ddp.connections is flat, that's the same signal.
  • Or check Mongo's own cursor metrics: db.serverStatus().metrics.cursor reports open-cursor counts, and leaked observers hold their oplog or change-stream cursors open forever (see the snippet below).
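
From mongosh, a minimal sketch of that check. The field paths are standard serverStatus output; the threshold for "too many" is yours to calibrate:

// mongosh — count open cursors; a total that rises while app
// connections stay flat is the same leak signal from Mongo's side.
const c = db.serverStatus().metrics.cursor;
printjson({
  open: c.open.total,          // all cursors currently open on this node
  noTimeout: c.open.noTimeout, // never-expiring cursors (oplog tails live here)
});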

Step 2 — find the unstopped observer

Open the offending publication (if you're not sure where it lives, grep the server tree for observe( and observeChanges( calls). The bug is always the same shape:

server/publications/posts.js
import { Meteor } from 'meteor/meteor';
import { check } from 'meteor/check';
import { Posts } from '/imports/api/posts'; // adjust to wherever your collection lives

Meteor.publish('posts.byUser', function (userId) {
  check(userId, String);
  // BUG: handle is created but never stopped.
  const handle = Posts.find({ userId }).observeChanges({
    added:   (id, fields) => this.added('posts', id, fields),
    changed: (id, fields) => this.changed('posts', id, fields),
    removed: (id) => this.removed('posts', id),
  });
  this.ready();
});

The fix is one line:

server/publications/posts.js
import { Meteor } from 'meteor/meteor';
import { check } from 'meteor/check';
import { Posts } from '/imports/api/posts'; // adjust to wherever your collection lives

Meteor.publish('posts.byUser', function (userId) {
  check(userId, String);
  const handle = Posts.find({ userId }).observeChanges({
    added:   (id, fields) => this.added('posts', id, fields),
    changed: (id, fields) => this.changed('posts', id, fields),
    removed: (id) => this.removed('posts', id),
  });
  this.onStop(() => handle.stop()); // ← fix the leak
  this.ready();
});
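
Worth noting: when the manual observer just mirrors a cursor one-to-one, as it does here, you can sidestep this entire class of bug by returning the cursor and letting Meteor manage the observer's lifecycle. A minimal sketch of the same publication:

server/publications/posts.js
Meteor.publish('posts.byUser', function (userId) {
  check(userId, String);
  // Returning a cursor makes Meteor start the observer, send ready(),
  // and, crucially, stop the observer when the subscription ends.
  return Posts.find({ userId });
});

Manual observeChanges() is only worth the ceremony when you're transforming documents or merging multiple sources on the way to the client.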

Step 3 — ship the fix safely

Once the fix is deployed, watch the Observers chart in the dashboard. The observer count should:

  1. Briefly drop as old observers tied to disconnecting users finally release.
  2. Then plateau at a count proportional to active connections (typically 1–3× concurrent users for a hot pub).
  3. Stay flat overnight. If you re-deploy 24 hours later and the count is still flat, you've fixed it.
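
If you want a second signal alongside the dashboard, you can log raw counts from the server itself. A rough sketch, with a loud caveat: it leans on undocumented Meteor internals (_observeMultiplexers, Meteor.server.sessions), so treat it as a temporary debugging probe, not production code:

server/observer-probe.js
import { Meteor } from 'meteor/meteor';
import { MongoInternals } from 'meteor/mongo';

Meteor.setInterval(() => {
  const mongo = MongoInternals.defaultRemoteCollectionDriver().mongo;
  // Identical cursors share one multiplexer, so this undercounts raw
  // handles; the trend is what matters: flat is healthy, rising is a leak.
  const observers = Object.keys(mongo._observeMultiplexers).length;
  const sessions = Meteor.server.sessions.size; // a Map on recent Meteor releases
  console.log(`[observer-probe] observers=${observers} sessions=${sessions}`);
}, 60000);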

Bonus — switch the driver while you're here

If the leaky publication was on the polling driver (10s re-query loop, visible as Driver: polling in the Reactive view), now is the time to fix that too. Polling adds about 6× the Mongo load of change_stream and is the second most common cause of Meteor scaling pain.

If your cluster is a replica set on Mongo 4.0+, change streams are usually just on — but a single subtle query operator ($where, $text, certain geo predicates) will kick the publication back to polling silently. The LiveQuery & observers doc has the full driver-fallback ladder.
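
To make the fallback concrete: $where runs arbitrary JavaScript per document, which the oplog and change-stream observers can't evaluate incrementally, so Meteor silently reverts that cursor to polling. A hypothetical selector with that problem, and the same intent expressed as a plain selector that stays on the fast path:

// Forces the polling driver: arbitrary JS can't be matched against
// oplog entries or change-stream events.
Posts.find({ $where: 'this.tags && this.tags.length > 0' });

// Equivalent plain selector: stays on the oplog / change-stream driver.
Posts.find({ 'tags.0': { $exists: true } });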

TL;DR

  1. Watch observers ÷ connections. Rising means leaking.
  2. Audit every manual observe/observeChanges for missing this.onStop().
  3. Re-check the chart 24h after deploy.
  4. While you're there, get off polling if you can.