[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI

2025-12-31

Josh McGrath of OpenAI discusses how post-training evolved from GPT-4.1 to GPT-5.1, covering RLVR, agent efficiency, and token efficiency. He also shares insights from his own work, which ranges from pre-training data curation to shipping several GPT models.
