The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Infrastructure Scaling and Compound AI Systems with Jared Quincy Davis - #740
Jared Quincy Davis explains compound AI systems, which compose multiple AI models to improve speed and accuracy while lowering cost. The episode covers techniques such as laconic decoding and inference-time scaling, as well as the co-design of AI algorithms and cloud infrastructure.