The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Why Vision Language Models Ignore What They See with Munawar Hayat - #758
In this episode, Munawar Hayat discusses challenges in vision-language models such as object hallucination, along with the use of attention-guided alignment. The conversation also covers contrastive learning for retrieval tasks and the MultiHuman Testbench for generative models.