Research depth. Operator instinct. Institutional rigor. Built for the agentic era.
It all began in a small studio nestled in the heart of New York.


01
The Reinforcement Learning Renaissance (Part 1)
As pre-training plateaus near GPT-4-level performance, reinforcement learning emerges as the new scaling paradigm. Exploring how RL is driving the next wave of AI breakthroughs, from reasoning models to data-efficient learning.
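To give a flavor of the mechanism the post explores, here is a minimal sketch of the policy-gradient loop behind RL-trained reasoning models: sample completions, score them with a verifiable reward, and push the policy toward the ones that scored well. The names `policy`, `policy.sample`, `reward_fn`, and `prompts` are illustrative placeholders, not an API from the post.

```python
# Hypothetical sketch of one REINFORCE-style update for a language model.
# `policy.sample` is assumed to return completions plus the summed
# per-sequence log-probabilities (with gradients attached).
import torch

def rl_step(policy, optimizer, prompts, reward_fn):
    # Sample completions and score them with a task reward,
    # e.g. "did the final answer check out?"
    completions, logprobs = policy.sample(prompts)
    rewards = torch.tensor([reward_fn(c) for c in completions],
                           dtype=torch.float32)
    advantages = rewards - rewards.mean()      # simple mean baseline
    loss = -(advantages * logprobs).mean()     # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice labs layer variance reduction on top of this objective (learned baselines, PPO- or GRPO-style clipping), but the core signal is the same: reward good completions and update the policy toward them.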

02
Continual Learning: The Promised Land
This post is a builder’s map of the landscape: what works, what’s emerging, and what it unlocks. The underlying tension is simple: plasticity (the ability to learn new things) trades off against stability (retaining what you already know). That tradeoff shows up everywhere — in weights, in context, and in the loops that connect them.
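To make the tradeoff concrete, here is a minimal sketch in the spirit of Elastic Weight Consolidation: a quadratic penalty anchors weights that mattered on old tasks while the task loss pulls toward the new one. The names `model`, `old_params`, `fisher`, and `lam` are illustrative assumptions, not from the post.

```python
# Hypothetical sketch: the stability-plasticity tradeoff as one
# regularized loss, in the spirit of Elastic Weight Consolidation.
import torch

def continual_loss(model, task_loss, old_params, fisher, lam=100.0):
    """Plasticity: minimize the new task's loss.
    Stability: penalize drift from weights that mattered on old tasks,
    weighted by a diagonal Fisher importance estimate per parameter."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return task_loss + (lam / 2) * penalty
```

Raising `lam` buys stability (less forgetting) at the cost of plasticity on the new task; lowering it does the opposite. That single knob is the weight-space face of the tradeoff the post maps.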
