The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Technology
About
Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers, and tech-savvy business and IT leaders. The show is hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator, and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science, and more.
Episodes
- How to Find the Agent Failures Your Evals Miss with Scott Clark - #767
In this episode, Scott Clark, co-founder and CEO of Distributional, explores how teams can reliably operate and improve complex LLM systems and agents in production. He introduces a Maslow’s hierarchy of observability and discusses how map…
- How to Engineer AI Inference Systems with Philip Kiely - #766
Philip Kiely, head of AI education at Baseten, discusses AI inference engineering. He covers topics such as GPU programming, distributed systems, the difference between inference and model serving, and the role of batching, quantization, s…
- How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765
Rashmi Shetty from Capital One discusses the design and deployment of multi-agent AI systems in a regulated environment. The episode covers their platform-centric approach, governance, developer experience, and strategies for building and…
- The Race to Production-Grade Diffusion LLMs with Stefano Ermon - #764
Stefano Ermon of Stanford University and Inception Labs discusses the development of diffusion language models for text and code generation. He explains the technical transition from image to language applications, the performance of Mercu…
- Agent Swarms and Knowledge Graphs for Autonomous Software Development with Siddhant Pardeshi - #763
Siddhant Pardeshi of Blitzy joins the podcast to discuss building autonomous software development systems. The conversation covers the use of agent swarms, knowledge graphs, and hybrid search approaches to deliver production-ready code at…
- AI Trends 2026: OpenClaw Agents, Reasoning LLMs, and More with Sebastian Raschka - #762
Sebastian Raschka joins The TWIML AI Podcast to discuss the evolution of the LLM landscape and predictions for 2026. The episode covers shifts toward reasoning-focused post-training, agentic workflows, architecture trends, and navigating A…
- The Evolution of Reasoning in Small Language Models with Yejin Choi - #761
Yejin Choi joins the TWIML AI Podcast to discuss enhancing reasoning in small language models. The conversation covers using diverse data, synthetic generation, and reinforcement learning to improve capabilities, the risks of model homogeneity…
- Intelligent Robots in 2026: Are We There Yet? with Nikita Rudin - #760
Nikita Rudin joins the TWIML AI Podcast to discuss the current state of robotic capabilities, the challenges in achieving full autonomy, and the progress made through reinforcement learning and simulation. The conversation covers the sim2r…
- Rethinking Pre-Training for Agentic AI with Aakanksha Chowdhery - #759
Aakanksha Chowdhery argues that pre-training methods must be fundamentally rethought to build agentic AI, moving beyond static benchmarks and next-token prediction to support multi-step workflows and long-form reasoning.
- Why Vision Language Models Ignore What They See with Munawar Hayat - #758
This episode features Munawar Hayat discussing challenges in Vision-Language Models, such as object hallucination and the use of attention-guided alignment. The conversation also covers contrastive learning for retrieval tasks and the Mult…
- Scaling Agentic Inference Across Heterogeneous Compute with Zain Asgar - #757
Zain Asgar joins TWIML AI to discuss Gimlet Labs’ strategy for scaling agentic inference across heterogeneous compute. The discussion covers Gimlet’s approach to disaggregating workloads across various hardware, optimizing unit economics,…
- Proactive Agents for the Web with Devi Parikh - #756
Devi Parikh joins the TWIML AI Podcast to discuss proactive autonomous agents for web interaction. The episode covers technical challenges, the benefits of visually-grounded models over the DOM, Yutori's training pipeline, and how agents h…
- AI Orchestration for Smart Cities and the Enterprise with Robin Braun and Luke Norris - #755
Robin Braun and Luke Norris discuss AI orchestration for automating workflows and deriving value from enterprise data, focusing on smart city use cases in Vail, Colorado, such as accessibility compliance and fire detection. They also cover…
- Building an AI Mathematician with Carina Hong - #754
Carina Hong discusses Axiom's work on an "AI Mathematician," leveraging LLMs, formal proof languages, and code generation. The episode covers technical challenges like data gaps and autoformalization, and potential applications in formal v…
- High-Efficiency Diffusion Models for On-Device Image Generation and Editing with Hung Bui - #753
Hung Bui from Qualcomm explains high-efficiency diffusion models for on-device generative AI, detailing SwiftBrush and SwiftEdit, which achieve text-to-image generation and editing in a single step using a novel distillation framework.
- Vibe Coding's Uncanny Valley with Alexandre Pesant - #752
Alexandre Pesant joins the TWIML AI Podcast to discuss vibe coding, AI's potential to shift software development towards intent expression, and the capabilities and limitations of coding agents. The conversation covers context engineering,…
- Dataflow Computing for AI Inference with Kunle Olukotun - #751
This episode features Kunle Olukotun discussing dataflow computing for AI inference. He explains how dynamically configured architectures match AI model dataflow graphs, benefiting LLM inference by reducing memory bandwidth issues and enha…
- Recurrence and Attention for Long-Context Transformers with Jacob Buckman - #750
Jacob Buckman joins the TWIML AI Podcast to discuss achieving long context in transformers. The episode covers bottlenecks, techniques like windowed attention and power retention, compute architecture reasoning, and Manifest AI's open sour…
- The Decentralized Future of Private AI with Illia Polosukhin - #749
Illia Polosukhin joins TWIML AI to discuss his vision for decentralized, private AI, leveraging the NEAR Protocol with confidential computing and blockchain. The conversation covers trust through open training, verifiable inference, and fo…
- Inside Nano Banana 🍌 and the Future of Vision-Language Models with Oliver Wang - #748
Oliver Wang, tech lead for Gemini 2.5 Flash Image (Nano Banana), joins the TWIML AI Podcast to discuss the capabilities of this new vision-language model, including image generation and editing. The discussion covers the shift to multimoda…
- Is It Time to Rethink LLM Pre-Training? with Aditi Raghunathan - #747
Aditi Raghunathan joins the TWIML AI Podcast to discuss limitations in current LLM pre-training, exploring concepts like "Roll the dice & look before you leap" for more creative outputs. The episode also covers "catastrophic overtraining"…
- Building an Immune System for AI Generated Software with Animesh Koratana - #746
Animesh Koratana joins TWIML AI Podcast to discuss PlayerZero's platform, which addresses the challenges of production-ready AI-assisted coding tools. The platform uses code simulations and an ensemble of LLMs to create an "immune system"…
- Autoformalization and Verifiable Superintelligence with Christian Szegedy - #745
Christian Szegedy joins the TWIML AI Podcast to discuss autoformalization, an AI-driven method to translate mathematical concepts into formal, verifiable logic. This approach aims to improve AI safety and create verifiable data for trainin…
- Multimodal AI Models on Apple Silicon with MLX with Prince Canuma - #744
Prince Canuma, an ML engineer, joins TWIML AI to discuss optimizing AI inference on Apple Silicon with MLX. They cover adapting models, trade-offs between GPU and Neural Engine, optimization techniques, and Prince's contributions like Fusi…
- Genie 3: A New Frontier for World Models with Jack Parker-Holder and Shlomi Fruchter - #743
Researchers Jack Parker-Holder and Shlomi Fruchter discuss Google DeepMind's Genie 3 model, which generates playable virtual worlds. They cover its scaled-up capabilities, architecture, visual memory, and potential as a training environmen…
- Closing the Loop Between AI Training and Inference with Lin Qiao - #742
Lin Qiao shares insights on the generative AI development lifecycle, emphasizing the importance of aligning training and inference systems for efficient production pipelines. She discusses leveraging proprietary data for model improvement,…
- Context Engineering for Productive AI Agents with Filip Kozera - #741
Filip Kozera explains his approach to creating agentic workflows using natural language as a programming interface, detailing the architecture, reflection loops, and tool-calling capabilities of these "background agents." The discussion co…
- Infrastructure Scaling and Compound AI Systems with Jared Quincy Davis - #740
Jared Quincy Davis explains compound AI systems, which compose multiple AI models for better speed, accuracy, and cost. The episode covers techniques like laconic decoding, inference-time scaling, and the co-design of AI algorithms and clo…
- Building Voice AI Agents That Don’t Suck with Kwindla Kramer - #739
Kwindla Kramer joins the TWIML AI Podcast to discuss building production-ready conversational voice AI. The episode covers the full stack, modular vs. end-to-end models, interruption handling, and future trends in voice AI technology.
- Distilling Transformers and Diffusion Models for Robust Edge Use Cases with Fatih Porikli - #738
This episode features Fatih Porikli of Qualcomm AI Research discussing their CVPR papers on DiMA, an autonomous driving system using distilled LLMs, and SharpDepth, which uses diffusion distillation for accurate depth estimation. The discu…
- Building the Internet of Agents with Vijoy Pandey - #737
Vijoy Pandey joins the TWIML AI Podcast to discuss the integration challenges of specialized AI agents and introduces Cisco's 'Internet of Agents' vision and the open-source platform AGNTCY. The episode covers agent collaboration phases, c…
- LLMs for Equities Feature Forecasting at Two Sigma with Ben Wellington - #736
Ben Wellington discusses Two Sigma's AI-driven approach to equities feature forecasting, detailing their methods for creating features, managing data, and building predictive models. The episode also covers the influence of multimodal LLMs…
- Zero-Shot Auto-Labeling: The End of Annotation for Computer Vision with Jason Corso - #735
Jason Corso from Voxel51 joins the show to discuss zero-shot auto-labeling for computer vision. The conversation covers the FiftyOne platform, research into how auto-labeling rivals human performance, and workflows for minimizing human rev…
- Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734
Charles Martin joins the TWIML AI Podcast to discuss WeightWatcher, an open-source tool for analyzing Deep Neural Networks (DNNs) based on theoretical physics. The conversation covers the tool's ability to detect learning phases like grok…
- Google I/O 2025 Special Edition - #733
This episode from the TWIML AI Podcast features a crossover recording at Google I/O 2025, interviewing guests from Google DeepMind and Daily. They discuss Gemini model enhancements, the Gemini API, real-time voice AI challenges, and new fe…
- RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann - #732
Sebastian Gehrmann joins the TWIML AI Podcast to discuss the risks associated with retrieval-augmented generation (RAG) systems, explaining how they can unintentionally decrease model safety. The conversation also covers generative AI safe…
- From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731
Mahesh Sathiamoorthy joins the TWIML AI Podcast to explain how reinforcement learning (RL) is used to build custom AI agents on foundation models. The discussion covers data curation, evaluation, RL as a more robust alternative to promptin…
- How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730
Josh Tobin of OpenAI joins the TWIML AI Podcast to explain how OpenAI develops AI agents like Deep Research, Operator, and Codex CLI. The discussion covers their evolution from simple LLM workflows to reinforcement learning for complex, mu…
- CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729
Nidhi Rastogi joins TWIML AI to discuss CTIBench, a new benchmark for evaluating Large Language Models (LLMs) in Cyber Threat Intelligence. The discussion covers the evolution of AI in cybersecurity, the benefits and challenges of using LL…
- Generative Benchmarking with Kelly Hong - #728
Kelly Hong of Chroma discusses Generative Benchmarking, a method for evaluating retrieval systems using synthetic data. The conversation covers the limitations of traditional benchmarks, the process of generating realistic queries, and the…
- Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727
Emmanuel Ameisen discusses two papers on the biology of LLMs and circuit tracing. His team uses mechanistic interpretability to understand Claude's internal workings, revealing how it plans ahead, performs math, and processes concepts acro…
- Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen - #726
This episode features Maohao Shen discussing Satori, a system that uses reinforcement learning and a Chain-of-Action-Thought approach to improve LLM reasoning abilities, allowing for self-reflection and correction.
- Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725
Drago Anguelov from Waymo details their use of foundation models, including vision-language and generative AI, to enhance autonomous driving systems. The discussion covers their custom model, multimodal sensor integration, safety validatio…
- Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724
This episode features Julie Kallini discussing her research on dynamic token merging for efficient byte-level language models, addressing tokenization issues and exploring byte-level alternatives. They also delve into "Mission: Impossible…
- Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723
Jonas Geiping joins the TWIML AI Podcast to discuss his paper on scaling test-time compute with latent reasoning using a recurrent depth approach. The discussion covers internal vs. verbalized reasoning, latent space search, dynamic comput…
- Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722
Chengzu Li joins the TWIML AI Podcast to discuss his paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” They delve into the motivations behind MVoT, its relation to cognitive science principles, the framework's…
- Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721
This episode features Niklas Muennighoff discussing his s1 reasoning model, which uses test-time scaling. The discussion compares s1 to models like OpenAI o1 and DeepSeek R1, covering its training, data curation, and a novel "budget forcin…
- Accelerating AI Training and Inference with AWS Trainium2 with Ron Diamant - #720
Ron Diamant joins the TWIML AI Podcast to discuss AWS Trainium2, a chip designed for accelerating AI training and inference. The episode covers its architecture, performance, tooling ecosystem, and various deployment options within AWS.
- π0: A Foundation Model for Robotics with Sergey Levine - #719
On the TWIML AI Podcast, Sergey Levine discusses π0, a foundation model for robotics. The episode covers the model architecture, training methods, data collection, the FAST tokenizer, and the open-sourcing of π0.
- AI Trends 2025: AI Agents and Multi-Agent Systems with Victor Dibia - #718
Victor Dibia joins the TWIML AI Podcast to discuss AI trends for 2025, including AI agents and multi-agent systems. The episode covers their unique abilities like reasoning and adapting, the rise of agentic foundation models, and emerging…