The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731
Mahesh Sathiamoorthy joins the TWIML AI Podcast to explain how reinforcement learning (RL) is used to build custom AI agents on top of foundation models. The discussion covers data curation, evaluation, RL as a more robust alternative to prompting, and limitations of supervised…