ParaLLM: 1600+ tok/s on a MacBook
June 23, 2024
Batched KV caching for fast parallel LLM inference in MLX.
Thoughts, musings, works-in-progress, info-dumps.
June 05, 2024
I organized all my favorite LLM explainers into a 'book'.
April 29, 2024
A summary of my PhD + thoughts on what comes next.