Steven's Knowledge

Real-Time Communication

WebSocket, SSE, long polling, WebTransport — pushing data to clients and scaling persistent connections


Request-response covers most use cases. But when the server needs to push data to the client — chat messages, live dashboards, collaborative editing, notifications — you need a persistent connection.

This page covers the protocols, when to use each, and the scaling challenges that arise when your server holds thousands of long-lived connections open instead of serving short requests and releasing them.

The Options

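Protocol     | Direction                   | Transport                | Browser support | Use case
-------------|-----------------------------|--------------------------|-----------------|--------------------------------------
WebSocket    | Bidirectional               | TCP (upgraded from HTTP) | Yes             | Chat, gaming, collaboration
SSE          | Server → Client             | HTTP/1.1 or HTTP/2       | Yes             | Live feeds, notifications, dashboards
Long Polling | Server → Client (simulated) | HTTP                     | Yes             | Legacy fallback
WebTransport | Bidirectional               | HTTP/3 (QUIC)            | Partial         | Low-latency, unreliable delivery OK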

WebSocket

WebSocket provides a full-duplex, persistent TCP connection between client and server. After an HTTP upgrade handshake, both sides can send messages at any time.

Connection Lifecycle

Client                          Server
  │                               │
  │── GET /ws (Upgrade: websocket) →│
  │                               │
  │← 101 Switching Protocols ─────│
  │                               │
  │◄═══════ Full-duplex ═════════►│
  │        (messages flow         │
  │         both directions)      │
  │                               │
  │── Close frame ───────────────→│
  │← Close frame ─────────────────│
  │                               │
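
The 101 response in the diagram carries a Sec-WebSocket-Accept header that proves the server understood the upgrade request. Libraries such as ws compute it for you; the following is only a sketch of that one step as RFC 6455 describes it, and the function name computeAcceptKey is made up for illustration.

```typescript
import { createHash } from 'node:crypto';

// Fixed GUID defined by RFC 6455 for the opening handshake
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key:
// append the GUID, hash with SHA-1, and base64-encode the digest
function computeAcceptKey(secWebSocketKey: string): string {
  return createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Sample key taken from the handshake example in RFC 6455
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
```

For the sample key in the RFC this yields s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which is the value the client checks before treating the connection as upgraded.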

Server Implementation (Node.js)

// WebSocket is imported too: it is used below as a type and for WebSocket.OPEN
import { WebSocket, WebSocketServer } from 'ws';
import http from 'http';

const server = http.createServer();
const wss = new WebSocketServer({ server });

// Track connected clients by room
const rooms = new Map<string, Set<WebSocket>>();

wss.on('connection', (ws, req) => {
  const userId = authenticateFromHeaders(req);  // Auth on connect
  if (!userId) {
    ws.close(4001, 'Unauthorized');
    return;
  }

  ws.on('message', (data) => {
    let msg;
    try {
      msg = JSON.parse(data.toString());
    } catch {
      return;  // Ignore malformed JSON instead of letting it crash the process
    }

    switch (msg.type) {
      case 'join':
        joinRoom(msg.room, ws);
        break;
      case 'message':
        broadcastToRoom(msg.room, {
          type: 'message',
          from: userId,
          text: msg.text,
          timestamp: Date.now(),
        }, ws);
        break;
    }
  });

  ws.on('close', () => {
    removeFromAllRooms(ws);
  });

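  // Heartbeat to detect dead connections
  (ws as LiveSocket).isAlive = true;
  ws.on('pong', () => { (ws as LiveSocket).isAlive = true; });
});

// ws sockets have no isAlive field; this type adds one for the heartbeat
type LiveSocket = WebSocket & { isAlive?: boolean };

// Ping every 30s, terminate if no pong arrived since the previous ping
const heartbeat = setInterval(() => {
  wss.clients.forEach((client) => {
    const ws = client as LiveSocket;
    if (!ws.isAlive) return ws.terminate();
    ws.isAlive = false;
    ws.ping();
  });
}, 30_000);

// Stop pinging once the server shuts down
wss.on('close', () => clearInterval(heartbeat));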

function broadcastToRoom(room: string, msg: object, sender: WebSocket) {
  const clients = rooms.get(room);
  if (!clients) return;
  const payload = JSON.stringify(msg);
  for (const client of clients) {
    if (client !== sender && client.readyState === WebSocket.OPEN) {
      client.send(payload);
    }
  }
}

server.listen(8080);
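
The server above calls joinRoom and removeFromAllRooms without showing them. One possible shape for that bookkeeping is sketched below. It is generic over the connection type so it runs without the ws package, and the class name RoomRegistry and its methods are assumptions for illustration, not part of the original code.

```typescript
// Generic over the connection type so the sketch runs without the ws package
class RoomRegistry<Conn> {
  private rooms = new Map<string, Set<Conn>>();

  // Add a connection to a room, creating the room on first join
  join(room: string, conn: Conn): void {
    let members = this.rooms.get(room);
    if (!members) {
      members = new Set<Conn>();
      this.rooms.set(room, members);
    }
    members.add(conn);
  }

  // Remove a connection from every room and delete rooms that become empty,
  // so the map does not grow without bound as clients come and go
  removeFromAll(conn: Conn): void {
    this.rooms.forEach((members, room) => {
      members.delete(conn);
      if (members.size === 0) this.rooms.delete(room);
    });
  }

  // Connections currently in a room (empty if the room does not exist)
  members(room: string): Conn[] {
    return Array.from(this.rooms.get(room) ?? []);
  }

  roomCount(): number {
    return this.rooms.size;
  }
}

// Strings stand in for sockets here
const registry = new RoomRegistry<string>();
registry.join('general', 'conn-a');
registry.join('general', 'conn-b');
registry.removeFromAll('conn-a');
console.log(registry.members('general'));
```

In the server, joinRoom(msg.room, ws) would map to registry.join and the close handler to registry.removeFromAll. Deleting empty rooms matters for long-running processes, where rooms that are never cleaned up turn into a slow memory leak.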

Client Implementation

class ReconnectingWebSocket {
  private ws: WebSocket | null = null;
  private reconnectDelay = 1000;
  private maxDelay = 30_000;

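  constructor(
    private url: string,
    // Called with each parsed message; supplied by the caller
    private handleMessage: (msg: unknown) => void,
  ) {
    this.connect();
  }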

  private connect() {
    this.ws = new WebSocket(this.url);

    this.ws.onopen = () => {
      this.reconnectDelay = 1000;  // Reset on success
      console.log('Connected');
    };

    this.ws.onmessage = (event) => {
      const msg = JSON.parse(event.data);
      this.handleMessage(msg);
    };

    this.ws.onclose = (event) => {
      if (event.code !== 1000) {  // Not a clean close
        this.scheduleReconnect();
      }
    };

    this.ws.onerror = () => {
      this.ws?.close();
    };
  }

  private scheduleReconnect() {
    const jitter = Math.random() * 1000;
    setTimeout(() => this.connect(), this.reconnectDelay + jitter);
    this.reconnectDelay = Math.min(this.reconnectDelay * 2, this.maxDelay);
  }

  send(msg: object) {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(msg));
    }
  }
}

Always implement reconnection with exponential backoff and jitter. WebSocket connections drop — network changes, server deploys, load balancer timeouts. The client must handle this gracefully.
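
To make the delay schedule concrete, here is a small sketch of the same calculation as a pure function. The name reconnectDelay and the injectable random parameter are assumptions added so the schedule can be inspected; the defaults mirror the class above.

```typescript
// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt, capped
// at maxMs, plus up to jitterMs of random extra delay.
// `random` is injectable so the result can be made deterministic.
function reconnectDelay(
  attempt: number,
  baseMs = 1000,
  maxMs = 30_000,
  jitterMs = 1000,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(baseMs * Math.pow(2, attempt), maxMs);
  return exponential + random() * jitterMs;
}

// With jitter disabled: 1s, 2s, 4s, 8s, 16s, then capped at 30s
const schedule = [0, 1, 2, 3, 4, 5, 6].map((n) => reconnectDelay(n, 1000, 30_000, 0));
console.log(schedule);
```

The jitter is what keeps thousands of clients from reconnecting in the same instant after a deploy or load balancer restart. Without it, every client that dropped together retries together.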

When to Use WebSocket

  • Bidirectional, low-latency communication. Chat, multiplayer games, collaborative editing.
  • High-frequency updates in both directions. The client sends and receives frequently.
  • Binary data. WebSocket supports binary frames natively.

When to Avoid WebSocket

  • Server-to-client only. Use SSE: it's simpler and works with ordinary HTTP load balancers and proxies.
  • Infrequent updates. Polling every 30 seconds is simpler than maintaining a persistent connection.
  • You need HTTP semantics. After the handshake, WebSocket messages carry no status codes, headers, or caching. If you need those, WebSocket is the wrong tool.

Server-Sent Events (SSE)

SSE is a one-way stream from server to client over a standard HTTP connection. The client uses the EventSource API; the server keeps a single response open with Content-Type text/event-stream and writes events to it as they occur.

// Server (Node.js / Express)
import express from 'express';

const app = express();

app.get('/events', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');  // HTTP/1.1 only; not valid in HTTP/2
  res.flushHeaders();

  // Send a comment every 15s to keep the connection alive
  const keepAlive = setInterval(() => {
    res.write(': keepalive\n\n');
  }, 15_000);

  // Send events
  const sendEvent = (event: string, data: object) => {
    res.write(`event: ${event}\n`);
    res.write(`data: ${JSON.stringify(data)}\n\n`);
  };

  // Subscribe to events (e.g., from Redis pub/sub)
  const unsubscribe = eventBus.subscribe((event) => {
    sendEvent(event.type, event.payload);
  });

  req.on('close', () => {
    clearInterval(keepAlive);
    unsubscribe();
  });
});

// Client (Browser)
const source = new EventSource('/events');

source.addEventListener('notification', (event) => {
  const data = JSON.parse(event.data);
  showNotification(data);
});

source.addEventListener('update', (event) => {
  const data = JSON.parse(event.data);
  updateDashboard(data);
});

source.onerror = () => {
  // EventSource auto-reconnects — no manual reconnection needed
  console.log('Connection lost, reconnecting...');
};

SSE Advantages Over WebSocket

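  • Auto-reconnection. The EventSource API reconnects automatically and sends the ID of the last event it received in the Last-Event-ID header, so the server can resume the stream if it sets an id: field on its events.
  • Works with HTTP/2. Multiple SSE streams are multiplexed over a single TCP connection, so an extra stream does not cost an extra connection.
  • Works with standard HTTP infrastructure. Load balancers, CDNs, and proxies treat it as an ordinary HTTP response, though any intermediary that buffers responses needs buffering turned off for the stream.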
  • Simpler to implement and debug. It's just HTTP. curl works.
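
Resuming after a reconnect only works if the server tags each event with an id: field and can replay what the client missed. The following is a minimal sketch of both pieces, assuming numeric, monotonically increasing IDs and an in-memory buffer. The names SseEvent, formatSseEvent, and ReplayBuffer are hypothetical.

```typescript
interface SseEvent {
  id: number;
  event: string;
  data: object;
}

// Serialize one event in text/event-stream framing. JSON.stringify output has
// no raw newlines, so a single "data:" line is enough; the blank line ends the event.
function formatSseEvent(e: SseEvent): string {
  return `id: ${e.id}\nevent: ${e.event}\ndata: ${JSON.stringify(e.data)}\n\n`;
}

// Keeps the most recent events so a reconnecting client can catch up
class ReplayBuffer {
  private events: SseEvent[] = [];
  private nextId = 1;

  constructor(private capacity = 1000) {}

  // Assign the next ID, store the event, and evict the oldest when full
  push(event: string, data: object): SseEvent {
    const e: SseEvent = { id: this.nextId++, event, data };
    this.events.push(e);
    if (this.events.length > this.capacity) this.events.shift();
    return e;
  }

  // Events newer than the ID the browser sent in the Last-Event-ID header.
  // A missing or non-numeric header means a fresh client: nothing to replay.
  since(lastEventId: string | undefined): SseEvent[] {
    const last = Number(lastEventId);
    if (!lastEventId || Number.isNaN(last)) return [];
    return this.events.filter((e) => e.id > last);
  }
}

const buffer = new ReplayBuffer();
const first = buffer.push('update', { value: 1 });
buffer.push('update', { value: 2 });
console.log(formatSseEvent(first));
console.log(buffer.since('1').length);
```

In the Express handler, the replay would run once at the top, reading the header as req.headers['last-event-id'] and writing formatSseEvent(e) for each missed event before subscribing to live ones. Events that have already been evicted from the buffer are lost to the client, so the capacity is a trade-off between memory and how long a disconnect can be tolerated.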

SSE Limitations

  • Server-to-client only. The client cannot send messages through the SSE connection (use regular HTTP requests).
  • Text only. No binary frames (base64-encode if needed, but that's overhead).
  • Connection limit. Over HTTP/1.1, browsers allow only about 6 connections per domain, shared across all tabs. Over HTTP/2 the limit is the negotiated maximum number of concurrent streams, which defaults to 100.

Long Polling

Long polling is the legacy fallback: the client makes an HTTP request, the server holds it open until there's data (or a timeout), then responds. The client immediately makes another request.

// Server
app.get('/poll', async (req, res) => {
  const lastId = req.query.lastId;
  const timeout = 30_000;

  try {
    const events = await waitForEvents(lastId, timeout);
    res.json({ events, lastId: events.at(-1)?.id ?? lastId });  // Keep the old cursor if nothing new
  } catch {
    res.json({ events: [], lastId }); // Timeout, no new events
  }
});
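
// Client
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function poll(lastId?: string) {
  while (true) {
    try {
      const res = await fetch(`/poll?lastId=${encodeURIComponent(lastId ?? '')}`);
      if (!res.ok) throw new Error(`Poll failed: ${res.status}`);
      const { events, lastId: newId } = await res.json();
      lastId = newId ?? lastId;  // Keep the old cursor if the server sent none
      events.forEach(handleEvent);
    } catch {
      await sleep(5000);  // Back off on error
    }
  }
}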

Use long polling only when WebSocket and SSE are not available, for example behind corporate firewalls or old proxies that break streaming responses. Otherwise, prefer SSE or WebSocket.
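
The server snippet relies on a waitForEvents helper that is not shown. One way it could work is sketched below with an in-memory log. The names PollEvent and EventLog are hypothetical, IDs are assumed to be numeric (so the lastId query string would need parsing first), and unlike the handler above this version resolves with an empty array on timeout rather than throwing.

```typescript
interface PollEvent {
  id: number;
  payload: unknown;
}

class EventLog {
  private events: PollEvent[] = [];
  private waiters = new Set<() => void>();
  private nextId = 1;

  // Record an event and wake every request currently being held open
  append(payload: unknown): PollEvent {
    const e: PollEvent = { id: this.nextId++, payload };
    this.events.push(e);
    const pending = Array.from(this.waiters);
    this.waiters.clear();
    pending.forEach((wake) => wake());
    return e;
  }

  // Events with an ID greater than lastId (all events if lastId is absent)
  since(lastId?: number): PollEvent[] {
    return this.events.filter((e) => lastId === undefined || e.id > lastId);
  }

  // Resolve at once if newer events exist; otherwise hold until one arrives
  // or timeoutMs elapses, in which case the result is an empty array
  waitForEvents(lastId: number | undefined, timeoutMs: number): Promise<PollEvent[]> {
    const ready = this.since(lastId);
    if (ready.length > 0) return Promise.resolve(ready);
    return new Promise((resolve) => {
      const wake = () => {
        clearTimeout(timer);
        resolve(this.since(lastId));
      };
      const timer = setTimeout(() => {
        this.waiters.delete(wake);
        resolve([]);
      }, timeoutMs);
      this.waiters.add(wake);
    });
  }
}

const log = new EventLog();
log.append({ text: 'first' });
log.append({ text: 'second' });
console.log(log.since(1));
```

Every held request costs a timer and a closure on the server, which is part of why long polling scales worse than a single streamed response.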

WebTransport

WebTransport runs over HTTP/3 (QUIC) and offers features that WebSocket cannot:

  • Unreliable delivery. Send datagrams that may be lost — useful for gaming and live video where old data is worthless.
  • Multiple streams. Independent streams within one connection — head-of-line blocking in one stream doesn't affect others.
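  • Per-stream loss recovery. QUIC's congestion control is still per connection, but a lost packet only stalls the streams whose data it carried, and flow control applies to each stream as well as to the connection.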

Browser support is still limited. Use WebTransport for latency-critical applications where you control both client and server.

Scaling Real-Time Connections

A single server can handle ~10K-100K WebSocket connections (depending on message frequency and payload size). Beyond that, you need to scale horizontally — and that introduces a coordination problem.

The Problem

User A connected to Server 1  ──→  sends message to Room X
User B connected to Server 2  ──→  needs to receive the message

If users in the same room are on different servers, the servers need a way to forward messages.

Solution: Pub/Sub

Use an external pub/sub system (Redis, NATS, Kafka) as a message bus:

import Redis from 'ioredis';

const pub = new Redis();
const sub = new Redis();

// When this server receives a message for a room
function handleRoomMessage(room: string, msg: object) {
  // Publish to Redis — all servers subscribed to this room will receive it
  pub.publish(`room:${room}`, JSON.stringify(msg));
}

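// Subscribe to room channels. One fixed room is shown here; in practice, subscribe
// when the first local client joins a room and unsubscribe when the last one leaves.
sub.subscribe('room:general');
sub.on('message', (channel, message) => {
  const room = channel.slice('room:'.length);  // Strip only the prefix
  const msg = JSON.parse(message);
  // Broadcast to local WebSocket clients in this room.
  // The publishing server receives its own message here too, so it should not
  // also broadcast directly in handleRoomMessage.
  broadcastToLocalClients(room, msg);
});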
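
To see the fan-out without running Redis, the sketch below swaps in an in-memory bus and two simulated servers. InMemoryBus and ChatServer are hypothetical names, and the delivered array stands in for the sockets each server would write to.

```typescript
type Handler = (message: string) => void;

// Stand-in for Redis pub/sub: delivers each published message to every subscriber
class InMemoryBus {
  private subscribers = new Map<string, Set<Handler>>();

  subscribe(channel: string, handler: Handler): void {
    if (!this.subscribers.has(channel)) {
      this.subscribers.set(channel, new Set<Handler>());
    }
    this.subscribers.get(channel)!.add(handler);
  }

  publish(channel: string, message: string): void {
    this.subscribers.get(channel)?.forEach((handler) => handler(message));
  }
}

// One process in the cluster; `delivered` stands in for its local sockets
class ChatServer {
  delivered: string[] = [];

  constructor(private name: string, private bus: InMemoryBus, rooms: string[]) {
    rooms.forEach((room) => {
      bus.subscribe(`room:${room}`, (message) => {
        this.delivered.push(message);
      });
    });
  }

  // A local client sent a message: publish it rather than broadcasting directly,
  // so this server's own clients receive it through the same subscription path
  handleRoomMessage(room: string, text: string): void {
    this.bus.publish(`room:${room}`, JSON.stringify({ from: this.name, text }));
  }
}

const bus = new InMemoryBus();
const server1 = new ChatServer('server-1', bus, ['general']);
const server2 = new ChatServer('server-2', bus, ['general']);
server1.handleRoomMessage('general', 'hello');
console.log(server2.delivered);
```

Both servers end up with the message exactly once, including the one that published it. That is why the publishing server should not also broadcast to its local clients on the way in.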

Sticky Sessions

An alternative (or complement) to pub/sub: ensure that every connection from a given client, or every connection for a given group if you route on a group identifier, goes to the same server.

# Nginx sticky sessions based on the client IP address
upstream websocket_servers {
    ip_hash;  # pins each client IP to one server; cookie-based stickiness needs NGINX Plus or a third-party module
    server backend1:8080;
    server backend2:8080;
}

Sticky sessions are simpler but fragile. If a server goes down, all its connections need to reconnect and may land on a different server. Pub/sub is more resilient.
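
Pinning a whole room to one server means routing on the room rather than on the client IP, which is a different technique from the ip_hash config above. One option is rendezvous (highest-random-weight) hashing, sketched here under the assumption that the router knows the room ID and the list of live servers. The function names are illustrative.

```typescript
// 32-bit FNV-1a hash over the string's UTF-16 code units
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Rendezvous hashing: score every (room, server) pair and pick the highest.
// When a server disappears, only the rooms it owned move elsewhere.
function serverForRoom(room: string, servers: string[]): string {
  if (servers.length === 0) throw new Error('no servers available');
  let best = servers[0];
  let bestScore = -1;
  for (const server of servers) {
    const score = fnv1a(`${room}|${server}`);
    if (score > bestScore) {
      best = server;
      bestScore = score;
    }
  }
  return best;
}

const servers = ['backend1:8080', 'backend2:8080'];
console.log(serverForRoom('general', servers));
```

This limits the damage of the failure case described above: losing one server remaps only that server's rooms, whereas a plain hash modulo the server count would reshuffle most of them.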

Connection State

Real-time servers are stateful — they hold open connections. This complicates deployment:

  • Graceful shutdown. On deploy, stop accepting new connections, drain existing ones (send a "reconnect" message), wait, then terminate.
  • Health checks. A server with 50K connections is healthy but shouldn't receive new ones if it's about to be drained. Separate the "accepting connections" health check from the "is alive" health check.
  • Connection limits. Set per-server limits and reject new connections when full. The client will reconnect to another server.
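
The three points above boil down to a small amount of per-server state. The sketch below shows one way to hold it; the class name ConnectionGate and its methods are assumptions, and wiring isReady and isAlive to actual health check endpoints is left out.

```typescript
// Tracks whether this server should receive new connections
class ConnectionGate {
  private active = 0;
  private draining = false;

  constructor(private maxConnections: number) {}

  // Call on each new connection; false means the caller should reject it
  tryAccept(): boolean {
    if (this.draining || this.active >= this.maxConnections) return false;
    this.active++;
    return true;
  }

  // Call when a connection closes
  release(): void {
    if (this.active > 0) this.active--;
  }

  // Call at the start of a deploy: stop admitting and let clients move away
  startDraining(): void {
    this.draining = true;
  }

  // Liveness: the process is up, regardless of load or draining
  isAlive(): boolean {
    return true;
  }

  // Readiness: the load balancer may send new connections here
  isReady(): boolean {
    return !this.draining && this.active < this.maxConnections;
  }

  // True once draining has started and every connection has closed
  isDrained(): boolean {
    return this.draining && this.active === 0;
  }
}

const gate = new ConnectionGate(2);
gate.tryAccept();
gate.startDraining();
console.log(gate.isAlive(), gate.isReady(), gate.isDrained());
```

A draining server stays alive but stops being ready, so the load balancer routes new connections elsewhere while existing clients reconnect at their own pace. The process can exit once isDrained returns true or a deadline passes.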

The Decision

Server needs to push data to the client, client doesn't send much back?
  → SSE (simplest, auto-reconnect, works with HTTP infra)

Both sides send data frequently, low latency matters?
  → WebSocket

Need unreliable delivery or multiple independent streams?
  → WebTransport (if browser support is sufficient)

Stuck behind restrictive infrastructure?
  → Long polling (last resort)
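
The questions above can be read as a small function. The field names and the order in which the checks run are assumptions made for this sketch, not a rule from the text: restrictive infrastructure is checked first because it rules out the other three options.

```typescript
type Protocol = 'SSE' | 'WebSocket' | 'WebTransport' | 'Long polling';

interface Requirements {
  // Firewalls or proxies break streaming responses and upgrades
  streamingBlocked: boolean;
  // Unreliable datagrams or multiple independent streams are required
  needsDatagramsOrStreams: boolean;
  // The target browsers support WebTransport
  webTransportSupported: boolean;
  // The client sends data frequently and latency matters
  clientSendsFrequently: boolean;
}

function chooseProtocol(r: Requirements): Protocol {
  if (r.streamingBlocked) return 'Long polling';
  if (r.needsDatagramsOrStreams && r.webTransportSupported) return 'WebTransport';
  if (r.clientSendsFrequently) return 'WebSocket';
  return 'SSE';
}

// A live dashboard: server pushes, client rarely sends
console.log(chooseProtocol({
  streamingBlocked: false,
  needsDatagramsOrStreams: false,
  webTransportSupported: false,
  clientSendsFrequently: false,
}));
```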

For most applications, SSE covers the bulk of real-time needs (notifications, live feeds, dashboards). Reach for WebSocket when you genuinely need bidirectional, high-frequency communication.
