All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

Ashmal Vayani, Dinura Dissanayake, Hasindri Watawana, Noor Ahsan, Nevasini Sasikumar, Omkar Thawakar, Henok Biadglign Ademtew, Yahya Hmaiti, Amandeep Kumar, Kartik Kuckreja, Mykola Maslych, Wafa Al Ghallabi, Chao Qin, Abdelrahman M Shaker, Mike Zhang, Mahardika Krisna Ihsani, Amiel Esplana, Monil Gokani, Shachar Mirkin, Harsh Singh, Ashay Srivastava, Endre Hamerlik, Fathinah Asma Izzati, Fadillah Adamsyah Maani, Sebastian Cavada, Jenny Chim, Rohit Gupta, Sanjay Manjunath, Kamila Zhumakhanova, Feno Heriniaina Rabevohitra, Azril Amirudin, Muhammad Ridzuan, Daniya Kareem, Ketan More, Kunyang Li, Pramesh Shakya, Muhammad Saad, Amirpouya Ghasemaghaei, Amirbek Djanibekov, Dilshod Azizov, Branislava Jankovic, Naman Bhatia, Johan Obando-Ceron, Olympiah Otieno, Fabian Farestam, Muztoba Rabbani, Sanoojan Baliah, Santosh Sanjeev, Abduragim Shtanchaev, Maheen Fatima, Thao Nguyen, Amrin Kareem, Toluwani Aremu, Nathan Xavier, Amit Bhatkal, Hawau Toyin, Aman Chadha, Hisham Cholakkal, Rao Muhammad Anwer, Michael Felsberg, Jorma Laaksonen, Thamar Solorio, Monojit Choudhury, Ivan Laptev, Mubarak Shah, Salman Khan, Fahad Khan

Publication: Contribution to book/anthology/report/conference proceeding › Conference article in proceedings › Research › peer-reviewed


Abstract

Existing Large Multimodal Models (LMMs) generally focus on only a few regions and languages. As LMMs continue to improve, it is increasingly important to ensure they understand cultural contexts, respect local sensitivities, and support low-resource languages, all while effectively integrating corresponding visual cues. In pursuit of culturally diverse global multimodal models, our proposed All Languages Matter Benchmark (ALM-bench) represents the largest and most comprehensive effort to date for evaluating LMMs across 100 languages. ALM-bench challenges existing models by testing their ability to understand and reason about culturally diverse images paired with text in various languages, including many low-resource languages traditionally underrepresented in LMM research. The benchmark offers a robust and nuanced evaluation framework featuring various question formats, including true/false, multiple choice, and open-ended questions, which are further divided into short and long-answer categories. ALM-bench design ensures a comprehensive assessment of a model’s ability to handle varied levels of difficulty in visual and linguistic reasoning. To capture the rich tapestry of global cultures, ALM-bench carefully curates content from 13 distinct cultural aspects, ranging from traditions and rituals to famous personalities and celebrations. Through this, ALM-bench not only provides a rigorous testing ground for state-of-the-art open and closed-source LMMs but also highlights the importance of cultural and linguistic inclusivity, encouraging the development of models that can serve diverse global populations effectively. Our benchmark is publicly available at https://mbzuai-oryx.github.io/ALM-Bench/.
Original language: English
Title: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Number of pages: 11
Publisher: IEEE (Institute of Electrical and Electronics Engineers)
Publication date: 10 Jun 2025
Pages: 19565-19575
Article number: 11094031
ISBN (Print): 979-8-3315-4365-5
ISBN (Electronic): 979-8-3315-4364-8
DOI
Status: Published - 10 Jun 2025
Event: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025 - Nashville, USA
Duration: 11 Jun 2025 - 15 Jun 2025
https://cvpr.thecvf.com/Conferences/2025

Conference

Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
Country/Territory: USA
City: Nashville
Period: 11/06/2025 - 15/06/2025
Internet address
Series: IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
ISSN: 1063-6919

Fingerprint

Dive into the research topics of 'All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages'. Together they form a unique fingerprint.
  • All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

    Vayani, A., Dissanayake, D., Watawana, H., Ahsan, N., Sasikumar, N., Thawakar, O., Ademtew, H. B., Hmaiti, Y., Kumar, A., Kuckreja, K., Maslych, M., Ghallabi, W. A., Qin, C., Shaker, A. M., Zhang, M., Ihsani, M. K., Esplana, A., Gokani, M., Mirkin, S. & Singh, H. & 47 more, Srivastava, A., Hamerlik, E., Izzati, F. A., Maani, F. A., Cavada, S., Chim, J., Gupta, R., Manjunath, S., Zhumakhanova, K., Rabevohitra, F. H., Amirudin, A., Ridzuan, M., Kareem, D., More, K., Li, K., Shakya, P., Saad, M., Ghasemaghaei, A., Djanibekov, A., Azizov, D., Jankovic, B., Bhatia, N., Obando-Ceron, J., Otieno, O., Farestam, F., Rabbani, M., Baliah, S., Sanjeev, S., Shtanchaev, A., Fatima, M., Nguyen, T., Kareem, A., Aremu, T., Xavier, N., Bhatkal, A., Toyin, H., Chadha, A., Cholakkal, H., Anwer, R. M., Felsberg, M., Laaksonen, J., Solorio, T., Choudhury, M., Laptev, I., Shah, M., Khan, S. & Khan, F., 2025, arXiv, 26 pp.

    Publication: Working paper/Preprint › Preprint

    Open access
