The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Dataflow Computing for AI Inference with Kunle Olukotun - #751

This episode features Kunle Olukotun discussing dataflow computing for AI inference. He explains how dynamically configured architectures match the dataflow graphs of AI models, benefiting LLM inference by easing memory bandwidth bottlenecks and improving performance. The discussion also…
