Case Study · Doubt & Discussion
Four chat types, one socket layer.
Inside a learning platform, students, instructors, and admins needed four kinds of conversation. This is how one module ended up handling all of them.
- <100 ms message delivery
- −70% API latency
- −60% DB round-trips
01 / Context
Four conversations, one module.
The platform needed four kinds of conversation. A student asking a private question. A small group working through a problem together. A whole class discussing a topic with their instructor. Someone announcing something to everyone at once.
Four interfaces. Four scopes. The easy path was four modules — built once each, owned by whoever needed them.
We didn't take it.
02 / The decision
One realtime layer.
The decision that shaped every part of the build was the refusal to fragment. One realtime layer. One message shape. One delivery primitive. Whether you were sending a private message or a campus-wide announcement, the path was the same — only the routing changed.
That call made every later feature cheaper. Read receipts worked the same way for a direct message and an announcement. Mentions, replies, file attachments — written once, available everywhere. What looked like four products from the outside was one engine on the inside.
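A minimal sketch of what "one message shape, one delivery primitive" could look like in TypeScript. The type names, the room-name formats, and the `emit` callback are illustrative assumptions, not the platform's actual code.

```typescript
// Hypothetical types: the case study does not show its real message schema.
type Scope =
  | { kind: "direct"; peerId: string }
  | { kind: "group"; groupId: string }
  | { kind: "class"; classId: string }
  | { kind: "announcement" };

interface ChatMessage {
  id: string;
  senderId: string;
  scope: Scope;
  body: string;
  sentAt: number; // epoch milliseconds
}

// Routing is the only part that varies by chat type.
function roomFor(scope: Scope, senderId: string): string {
  switch (scope.kind) {
    case "direct": {
      // Sort the pair so both participants resolve to the same room name.
      const pair = [senderId, scope.peerId].sort();
      return `dm:${pair.join(":")}`;
    }
    case "group":
      return `group:${scope.groupId}`;
    case "class":
      return `class:${scope.classId}`;
    case "announcement":
      return "announce:all";
  }
}

// One delivery primitive: every message takes the same path.
function deliver(
  msg: ChatMessage,
  emit: (room: string, msg: ChatMessage) => void
): string {
  const room = roomFor(msg.scope, msg.senderId);
  emit(room, msg);
  return room;
}
```

Anything written against `ChatMessage` (receipts, mentions, attachments) then works for all four scopes without knowing which one it is handling.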
03 / Delivery
A user lives in two rooms at a time.
Every connected person joins a personal room the moment they sign in. That's the address for anything aimed at them specifically — unread counts, new-message badges, conversation-list updates.
When they open a chat, they also join the room for that conversation. Messages get sent to the room; everyone inside sees them in real time. Leave the chat, and you stop receiving its live traffic — the personal room is enough to know something new arrived.
The two-room model was easy to reason about and easy to scale. Adding a feature meant choosing which room it belonged to. Nothing more.
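The two-room model can be sketched with an in-memory registry standing in for the socket server. `RoomHub`, the `user:` and `conv:` prefixes, and the handler names are assumptions made for illustration.

```typescript
// In-memory stand-in for a socket server's room registry (hypothetical API).
class RoomHub {
  private rooms = new Map<string, Set<string>>();

  join(room: string, userId: string): void {
    let members = this.rooms.get(room);
    if (!members) {
      members = new Set<string>();
      this.rooms.set(room, members);
    }
    members.add(userId);
  }

  leave(room: string, userId: string): void {
    this.rooms.get(room)?.delete(userId);
  }

  // Users who would receive an event sent to this room.
  recipients(room: string): string[] {
    const members = this.rooms.get(room);
    return members ? Array.from(members).sort() : [];
  }
}

const personalRoom = (userId: string): string => `user:${userId}`;
const conversationRoom = (conversationId: string): string => `conv:${conversationId}`;

// Room 1: joined at sign-in, kept for the whole session.
function onSignIn(hub: RoomHub, userId: string): void {
  hub.join(personalRoom(userId), userId);
}

// Room 2: joined only while a conversation is open.
function onOpenChat(hub: RoomHub, userId: string, conversationId: string): void {
  hub.join(conversationRoom(conversationId), userId);
}

function onCloseChat(hub: RoomHub, userId: string, conversationId: string): void {
  hub.leave(conversationRoom(conversationId), userId);
}
```

Badges and unread counts would target `personalRoom(userId)`; live messages would target `conversationRoom(conversationId)`.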
04 / The small move
Reading the room — literally.
The most satisfying part of the build came from a small UX problem. If the recipient was already looking at the thread when a new message arrived, marking it unread felt wrong. The module should know they'd already seen it.
So the system kept a quiet, running record of which conversation each person was currently viewing — small, cheap to update, faster than any database hit. When a message arrived, the system checked: is the recipient already here? If yes, the message arrived already read. If no, it queued an unread badge for later.
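The check described above fits in a few lines. This sketch uses plain `Map`s in place of the real store, and `ReadTracker` and its method names are hypothetical. Clearing the unread count when a conversation is opened is also an assumption.

```typescript
type Delivery = "read" | "unread";

class ReadTracker {
  // userId -> conversation the user is currently looking at
  private viewing = new Map<string, string>();
  // "userId:conversationId" -> unread count
  private unread = new Map<string, number>();

  private key(userId: string, conversationId: string): string {
    return `${userId}:${conversationId}`;
  }

  openConversation(userId: string, conversationId: string): void {
    this.viewing.set(userId, conversationId);
    // Assumption: opening a thread clears its badge.
    this.unread.delete(this.key(userId, conversationId));
  }

  closeConversation(userId: string): void {
    this.viewing.delete(userId);
  }

  // Called once per recipient when a message arrives.
  onMessage(recipientId: string, conversationId: string): Delivery {
    if (this.viewing.get(recipientId) === conversationId) {
      return "read"; // recipient is already here
    }
    const k = this.key(recipientId, conversationId);
    this.unread.set(k, (this.unread.get(k) ?? 0) + 1);
    return "unread";
  }

  unreadCount(userId: string, conversationId: string): number {
    return this.unread.get(this.key(userId, conversationId)) ?? 0;
  }
}
```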
Nothing visible to the user except that the module just felt right inside the larger platform. The kind of detail nobody notices until you take it away.
05 / Presence
Who's here, in O(log n).
That "quiet, running record" is a Redis sorted set. Every heartbeat is a ZADD with the timestamp as the score, so "who's online" is a single range query over recent scores. Stale entries fall outside that score window on their own, so reads never depend on a sweeper job. Presence reads never touch PostgreSQL.
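To make the sorted-set pattern concrete, here is an in-memory model of it. The `Presence` class, the TTL value, and the method names are assumptions. The comments name the Redis commands each step would map to: ZADD, ZRANGEBYSCORE, and ZREMRANGEBYSCORE.

```typescript
// In-memory model of "member -> score". A Map scan is O(n); the real sorted
// set answers the same range query in O(log n + m) for m results.
class Presence {
  private lastSeen = new Map<string, number>(); // userId -> last heartbeat (ms)
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Redis equivalent: ZADD presence <now> <userId>
  heartbeat(userId: string, now: number): void {
    this.lastSeen.set(userId, now);
  }

  // Redis equivalent: ZRANGEBYSCORE presence <now - ttl> +inf
  online(now: number): string[] {
    const cutoff = now - this.ttlMs;
    const result: string[] = [];
    this.lastSeen.forEach((seen, userId) => {
      if (seen >= cutoff) result.push(userId);
    });
    return result.sort();
  }

  // Optional housekeeping, not needed for correct reads.
  // Redis equivalent: ZREMRANGEBYSCORE presence -inf (<now - ttl>
  trim(now: number): number {
    const cutoff = now - this.ttlMs;
    const stale: string[] = [];
    this.lastSeen.forEach((seen, userId) => {
      if (seen < cutoff) stale.push(userId);
    });
    for (const userId of stale) this.lastSeen.delete(userId);
    return stale.length;
  }
}
```

Because `online` filters by score, a user who stops sending heartbeats drops out of the result without anything deleting them. Trimming only reclaims memory.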
With Redis fronting the hot path (presence, unread counts, the conversation list), measured API latency dropped 70% under concurrency, and message delivery stayed under 100 ms end to end.
06 / History
Scrollback without OFFSET.
Chat history breaks naive pagination. OFFSET walks every skipped row, so page fifty of a busy thread costs roughly fifty times page one, and rows shift under you as new messages land. Keyset pagination fixes both: remember the last message's id, ask for what came before it, let the index do the work. Every page costs one O(log n) index seek plus a read of the page itself, no matter how deep you scroll.
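A sketch of the keyset idea over an in-memory array sorted by id, where a binary search plays the role of the index seek. The names are hypothetical, and the SQL in the comment assumes a `messages` table with `conversation_id` and a monotonically increasing `id`.

```typescript
// SQL equivalent (assumed schema):
//   SELECT * FROM messages
//   WHERE conversation_id = $1 AND id < $2
//   ORDER BY id DESC
//   LIMIT $3;

interface Message {
  id: number;
  body: string;
}

interface Page {
  items: Message[];          // newest first
  nextCursor: number | null; // pass as beforeId to fetch the next older page
}

// Index of the first message whose id is >= targetId (array sorted ascending).
function lowerBound(messages: Message[], targetId: number): number {
  let lo = 0;
  let hi = messages.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (messages[mid].id < targetId) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Assumes limit >= 1. beforeId === null means "start from the newest message".
function pageBefore(messages: Message[], beforeId: number | null, limit: number): Page {
  const end = beforeId === null ? messages.length : lowerBound(messages, beforeId);
  const start = Math.max(0, end - limit);
  const items = messages.slice(start, end).reverse();
  const nextCursor = start > 0 ? messages[start].id : null;
  return { items, nextCursor };
}
```

The cursor is a message id rather than a position, so messages that land while the user scrolls cannot shift the page boundaries.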
The other classic failure was N+1: load twenty conversations, fire twenty more queries for their senders and attachments. DataLoader collapses those into one batched query per entity type within a tick. DB round-trips for high-concurrency chat history fell by 60% — the same screens, a fraction of the database's attention.
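The difference between N+1 and batching shows up in a round-trip counter. This sketch batches by hand against a fake database. DataLoader automates the same collect-and-dedupe step for every load requested within a tick. `FakeDb` and both function names are made up for illustration.

```typescript
interface User {
  id: string;
  name: string;
}

interface Conversation {
  id: string;
  lastSenderId: string;
}

// Fake database that counts how many queries it receives.
class FakeDb {
  roundTrips = 0;
  private users = new Map<string, User>();

  constructor(users: User[]) {
    for (const u of users) this.users.set(u.id, u);
  }

  // Like: SELECT * FROM users WHERE id = $1
  userById(id: string): User | undefined {
    this.roundTrips += 1;
    return this.users.get(id);
  }

  // Like: SELECT * FROM users WHERE id = ANY($1)
  usersByIds(ids: string[]): User[] {
    this.roundTrips += 1;
    const found: User[] = [];
    for (const id of ids) {
      const u = this.users.get(id);
      if (u) found.push(u);
    }
    return found;
  }
}

// N+1: one query per conversation.
function sendersNaive(db: FakeDb, conversations: Conversation[]): string[] {
  return conversations.map((c) => db.userById(c.lastSenderId)?.name ?? "unknown");
}

// Batched: collect ids, dedupe, one query, then map results back in order.
function sendersBatched(db: FakeDb, conversations: Conversation[]): string[] {
  const uniqueIds = Array.from(new Set(conversations.map((c) => c.lastSenderId)));
  const byId = new Map<string, User>();
  for (const user of db.usersByIds(uniqueIds)) byId.set(user.id, user);
  return conversations.map((c) => byId.get(c.lastSenderId)?.name ?? "unknown");
}
```

Both functions return the same names in the same order. Only the number of queries differs.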
Architecture
The system on one page.

FIG. — System architecture, production
What looked like four modules from the outside was one engine on the inside.