Poster

CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios

Qilang Ye ⋅ Zitong Yu ⋅ Rui Shao ⋅ Xinyu Xie ⋅ Philip Torr ⋅ Xiaochun Cao
2024 Poster

Abstract
