Now
updated · May 2026
What I'm on, right now.
A live page. Reads more like a notebook than a portfolio. Updated when the answer to “what are you working on” actually changes.
Distributed systems, slowly.
- Working through Designing Data-Intensive Applications by Martin Kleppmann. The chapter on replication is the one I keep coming back to.
- Skimming recent LLM-serving papers when they cross my feed — interested in the queuing and batching side, less the model architecture side.
Small experiments around inference.
- A toy RAG harness over my own blog content. Mostly an excuse to think about chunking and vector recall at small scale.
- Sharpening LeetCode mediums in the background — system-design interviews are where the next conversation lives.
Open questions on my desk.
- How do LLM inference systems handle backpressure when the queue saturates and the user is still typing?
- At what concurrency does client-side inference stop being cheaper than centralised inference? The proctoring rebuild answered it for one workload. Is there a general shape?
Reach out
If you're working on something in this space, I'd like to hear about it.