The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Technology
About
Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers, and tech-savvy business and IT leaders. The show is hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator, and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science, and more.
Episodes
- How to Find the Agent Failures Your Evals Miss with Scott Clark - #767
In this episode, Scott Clark, co-founder and CEO of Distributional, explores how teams can reliably operate and improve complex LLM systems and agents in production. He introduces a Maslow’s hierarchy of observability and discusses how map…
- How to Engineer AI Inference Systems with Philip Kiely - #766
Philip Kiely, head of AI education at Baseten, discusses AI inference engineering. He covers topics such as GPU programming, distributed systems, the difference between inference and model serving, and the role of batching, quantization, s…
- How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765
Rashmi Shetty from Capital One discusses the design and deployment of multi-agent AI systems in a regulated environment. The episode covers their platform-centric approach, governance, developer experience, and strategies for building and…
- The Race to Production-Grade Diffusion LLMs with Stefano Ermon - #764
Stefano Ermon of Stanford University and Inception Labs discusses the development of diffusion language models for text and code generation. He explains the technical transition from image to language applications, the performance of Mercu…
- Agent Swarms and Knowledge Graphs for Autonomous Software Development with Siddhant Pardeshi - #763
Siddhant Pardeshi of Blitzy joins the podcast to discuss building autonomous software development systems. The conversation covers the use of agent swarms, knowledge graphs, and hybrid search approaches to deliver production-ready code at…
- AI Trends 2026: OpenClaw Agents, Reasoning LLMs, and More with Sebastian Raschka - #762
Sebastian Raschka joins The TWIML AI Podcast to discuss the evolution of the LLM landscape and predictions for 2026. The episode covers shifts toward reasoning-focused post-training, agentic workflows, architecture trends, and navigating A…
- The Evolution of Reasoning in Small Language Models with Yejin Choi - #761
Yejin Choi joins the TWIML AI Podcast to discuss enhancing reasoning in small language models. The conversation covers using diverse data, synthetic generation, and reinforcement learning to improve capabilities, the risks of model homogeneity…
- Intelligent Robots in 2026: Are We There Yet? with Nikita Rudin - #760
Nikita Rudin joins the TWIML AI Podcast to discuss the current state of robotic capabilities, the challenges in achieving full autonomy, and the progress made through reinforcement learning and simulation. The conversation covers the sim2r…
- Rethinking Pre-Training for Agentic AI with Aakanksha Chowdhery - #759
Aakanksha Chowdhery argues that pre-training methods must be fundamentally rethought to build agentic AI, moving beyond static benchmarks and next-token prediction to support multi-step workflows and long-form reasoning.
- Why Vision Language Models Ignore What They See with Munawar Hayat - #758
This episode features Munawar Hayat discussing challenges in Vision-Language Models, such as object hallucination and the use of attention-guided alignment. The conversation also covers contrastive learning for retrieval tasks and the Mult…
- Scaling Agentic Inference Across Heterogeneous Compute with Zain Asgar - #757
Zain Asgar joins TWIML AI to discuss Gimlet Labs’ strategy for scaling agentic inference across heterogeneous compute. The discussion covers Gimlet’s approach to disaggregating workloads across various hardware, optimizing unit economics,…
- Proactive Agents for the Web with Devi Parikh - #756
Devi Parikh joins the TWIML AI Podcast to discuss proactive autonomous agents for web interaction. The episode covers technical challenges, the benefits of visually-grounded models over the DOM, Yutori's training pipeline, and how agents h…
- AI Orchestration for Smart Cities and the Enterprise with Robin Braun and Luke Norris - #755
Robin Braun and Luke Norris discuss AI orchestration for automating workflows and deriving value from enterprise data, focusing on smart city use cases in Vail, Colorado, such as accessibility compliance and fire detection. They also cover…
- Building an AI Mathematician with Carina Hong - #754
Carina Hong discusses Axiom's work on an "AI Mathematician," leveraging LLMs, formal proof languages, and code generation. The episode covers technical challenges like data gaps and autoformalization, and potential applications in formal v…
- High-Efficiency Diffusion Models for On-Device Image Generation and Editing with Hung Bui - #753
Hung Bui from Qualcomm explains high-efficiency diffusion models for on-device generative AI, detailing SwiftBrush and SwiftEdit, which achieve text-to-image generation and editing in a single step using a novel distillation framework.
- Vibe Coding's Uncanny Valley with Alexandre Pesant - #752
Alexandre Pesant joins the TWIML AI Podcast to discuss vibe coding, AI's potential to shift software development towards intent expression, and the capabilities and limitations of coding agents. The conversation covers context engineering,…
- Dataflow Computing for AI Inference with Kunle Olukotun - #751
This episode features Kunle Olukotun discussing dataflow computing for AI inference. He explains how dynamically configured architectures match AI model dataflow graphs, benefiting LLM inference by reducing memory bandwidth issues and enha…
- Recurrence and Attention for Long-Context Transformers with Jacob Buckman - #750
Jacob Buckman joins the TWIML AI Podcast to discuss achieving long context in transformers. The episode covers bottlenecks, techniques like windowed attention and power retention, compute architecture reasoning, and Manifest AI's open sour…
- The Decentralized Future of Private AI with Illia Polosukhin - #749
Illia Polosukhin joins TWIML AI to discuss his vision for decentralized, private AI, leveraging the NEAR Protocol with confidential computing and blockchain. The conversation covers trust through open training, verifiable inference, and fo…
- Inside Nano Banana 🍌 and the Future of Vision-Language Models with Oliver Wang - #748
Oliver Wang, tech lead for Gemini 2.5 Flash Image (Nano Banana), joins the TWIML AI Podcast to discuss the capabilities of this new vision-language model, including image generation and editing. The discussion covers the shift to multimoda…
- Is It Time to Rethink LLM Pre-Training? with Aditi Raghunathan - #747
Aditi Raghunathan joins the TWIML AI Podcast to discuss limitations in current LLM pre-training, exploring concepts like "Roll the dice & look before you leap" for more creative outputs. The episode also covers "catastrophic overtraining"…
- Building an Immune System for AI Generated Software with Animesh Koratana - #746
Animesh Koratana joins TWIML AI Podcast to discuss PlayerZero's platform, which addresses the challenges of production-ready AI-assisted coding tools. The platform uses code simulations and an ensemble of LLMs to create an "immune system"…
- Autoformalization and Verifiable Superintelligence with Christian Szegedy - #745
Christian Szegedy joins the TWIML AI Podcast to discuss autoformalization, an AI-driven method to translate mathematical concepts into formal, verifiable logic. This approach aims to improve AI safety and create verifiable data for trainin…
- Multimodal AI Models on Apple Silicon with MLX with Prince Canuma - #744
Prince Canuma, an ML engineer, joins TWIML AI to discuss optimizing AI inference on Apple Silicon with MLX. They cover adapting models, trade-offs between GPU and Neural Engine, optimization techniques, and Prince's contributions like Fusi…
- Genie 3: A New Frontier for World Models with Jack Parker-Holder and Shlomi Fruchter - #743
Researchers Jack Parker-Holder and Shlomi Fruchter discuss Google DeepMind's Genie 3 model, which generates playable virtual worlds. They cover its scaled-up capabilities, architecture, visual memory, and potential as a training environmen…
- Closing the Loop Between AI Training and Inference with Lin Qiao - #742
Lin Qiao shares insights on the generative AI development lifecycle, emphasizing the importance of aligning training and inference systems for efficient production pipelines. She discusses leveraging proprietary data for model improvement,…
- Context Engineering for Productive AI Agents with Filip Kozera - #741
Filip Kozera explains his approach to creating agentic workflows using natural language as a programming interface, detailing the architecture, reflection loops, and tool-calling capabilities of these "background agents." The discussion co…
- Infrastructure Scaling and Compound AI Systems with Jared Quincy Davis - #740
Jared Quincy Davis explains compound AI systems, which compose multiple AI models for better speed, accuracy, and cost. The episode covers techniques like laconic decoding, inference-time scaling, and the co-design of AI algorithms and clo…
- Building Voice AI Agents That Don’t Suck with Kwindla Kramer - #739
Kwindla Kramer joins the TWIML AI Podcast to discuss building production-ready conversational voice AI. The episode covers the full stack, modular vs. end-to-end models, interruption handling, and future trends in voice AI technology.
- Distilling Transformers and Diffusion Models for Robust Edge Use Cases with Fatih Porikli - #738
This episode features Fatih Porikli of Qualcomm AI Research discussing their CVPR papers on DiMA, an autonomous driving system using distilled LLMs, and SharpDepth, which uses diffusion distillation for accurate depth estimation. The discu…
- Building the Internet of Agents with Vijoy Pandey - #737
Vijoy Pandey joins the TWIML AI Podcast to discuss the integration challenges of specialized AI agents and introduces Cisco's 'Internet of Agents' vision and the open-source platform AGNTCY. The episode covers agent collaboration phases, c…
- LLMs for Equities Feature Forecasting at Two Sigma with Ben Wellington - #736
Ben Wellington discusses Two Sigma's AI-driven approach to equities feature forecasting, detailing their methods for creating features, managing data, and building predictive models. The episode also covers the influence of multimodal LLMs…
- Zero-Shot Auto-Labeling: The End of Annotation for Computer Vision with Jason Corso - #735
Jason Corso from Voxel51 joins the show to discuss zero-shot auto-labeling for computer vision. The conversation covers the FiftyOne platform, research into how auto-labeling rivals human performance, and workflows for minimizing human rev…
- Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734
Charles Martin joins the TWIML AI Podcast to discuss WeightWatcher, an open-source tool for analyzing Deep Neural Networks (DNNs) based on theoretical physics. The conversation covers the tool's ability to detect learning phases like grok…
- Google I/O 2025 Special Edition - #733
This episode from the TWIML AI Podcast features a crossover recording at Google I/O 2025, interviewing guests from Google DeepMind and Daily. They discuss Gemini model enhancements, the Gemini API, real-time voice AI challenges, and new fe…
- RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann - #732
Sebastian Gehrmann joins the TWIML AI Podcast to discuss the risks associated with retrieval-augmented generation (RAG) systems, explaining how they can unintentionally decrease model safety. The conversation also covers generative AI safe…
- From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731
Mahesh Sathiamoorthy joins the TWIML AI Podcast to explain how reinforcement learning (RL) is used to build custom AI agents on foundation models. The discussion covers data curation, evaluation, RL as a more robust alternative to promptin…
- How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730
Josh Tobin of OpenAI joins the TWIML AI Podcast to explain how OpenAI develops AI agents like Deep Research, Operator, and Codex CLI. The discussion covers their evolution from simple LLM workflows to reinforcement learning for complex, mu…
- CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729
Nidhi Rastogi joins TWIML AI to discuss CTIBench, a new benchmark for evaluating Large Language Models (LLMs) in Cyber Threat Intelligence. The discussion covers the evolution of AI in cybersecurity, the benefits and challenges of using LL…
- Generative Benchmarking with Kelly Hong - #728
Kelly Hong of Chroma discusses Generative Benchmarking, a method for evaluating retrieval systems using synthetic data. The conversation covers the limitations of traditional benchmarks, the process of generating realistic queries, and the…
- Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727
Emmanuel Ameisen discusses two papers on the biology of LLMs and circuit tracing. His team uses mechanistic interpretability to understand Claude's internal workings, revealing how it plans ahead, performs math, and processes concepts acro…
- Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen - #726
This episode features Maohao Shen discussing Satori, a system that uses reinforcement learning and a Chain-of-Action-Thought approach to improve LLM reasoning abilities, allowing for self-reflection and correction.
- Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725
Drago Anguelov from Waymo details their use of foundation models, including vision-language and generative AI, to enhance autonomous driving systems. The discussion covers their custom model, multimodal sensor integration, safety validatio…
- Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724
This episode features Julie Kallini discussing her research on dynamic token merging for efficient byte-level language models, addressing tokenization issues and exploring byte-level alternatives. They also delve into "Mission: Impossible…
- Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723
Jonas Geiping joins the TWIML AI Podcast to discuss his paper on scaling test-time compute with latent reasoning using a recurrent depth approach. The discussion covers internal vs. verbalized reasoning, latent space search, dynamic comput…
- Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722
Chengzu Li joins the TWIML AI Podcast to discuss his paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” They delve into the motivations behind MVoT, its relation to cognitive science principles, the framework's…
- Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721
This episode features Niklas Muennighoff discussing his s1 reasoning model, which uses test-time scaling. The discussion compares s1 to models like OpenAI o1 and DeepSeek R1, covering its training, data curation, and a novel "budget forcin…
- Accelerating AI Training and Inference with AWS Trainium2 with Ron Diamant - #720
Ron Diamant joins the TWIML AI Podcast to discuss AWS Trainium2, a chip designed for accelerating AI training and inference. The episode covers its architecture, performance, tooling ecosystem, and various deployment options within AWS.
- π0: A Foundation Model for Robotics with Sergey Levine - #719
On the TWIML AI Podcast, Sergey Levine discusses π0, a foundation model for robotics. The episode covers the model architecture, training methods, data collection, the FAST tokenizer, and the open-sourcing of π0.
- AI Trends 2025: AI Agents and Multi-Agent Systems with Victor Dibia - #718
Victor Dibia joins the TWIML AI Podcast to discuss AI trends for 2025, including AI agents and multi-agent systems. The episode covers their unique abilities like reasoning and adapting, the rise of agentic foundation models, and emerging…