Notes on task design, model evaluation, failure attribution, and real-world validation.
Blogs
Featured Blogs
Notes and article series on AI evaluation, data products, and LLM fundamentals.
Blog Series
Start with these series
Recent / Featured Posts
Growth Data Agent Product Design
Data Agent
How do we know whether a model is strong?
LLM Evaluation
How LLMs are trained: from data to dialogue
LLM Training
Transformer Block: the building block of LLMs
Transformer
Attention: how does a model decide where to look?
Transformer
What problem did the Transformer actually solve?
Transformer