Personal profile
Research profile
I am a Postdoctoral Researcher in Natural Language Processing (NLP), focusing on NLP applications and, in particular, NLP for Education. I am advised by Prof. Johannes Bjerva and Prof. Euan Lindsay.
I hold a PhD in NLP from the IT University of Copenhagen (ITU), where I was advised by Prof. Barbara Plank and Assoc. Prof. Rob van der Goot. I was part of NLPnorth at ITU and MaiNLP at the Ludwig Maximilian University of Munich (LMU). I worked on Computational Job Market Analysis (or NLP for HR), where we investigated how to extract information (e.g., skills) from job advertisements and match it to existing resources (e.g., taxonomies).
I am interested in:
- NLP x Education (Postdoc): Can we improve students’ learning by giving them automatic feedback from NLP tools (e.g., language models)? How can we do this over time?
- NLP x HR (PhD): How can we extract relevant skills from job ads, and how can we match them with existing taxonomies to help job centers match candidates to jobs more effectively?
- Resource Creation: More generally, I am interested in resource creation, such as developing annotation guidelines, creating (multilingual) datasets in both general and specific domains, and training language models at small and large scale.
Keywords
- Computer Science
Projects
- 1 Ongoing
- Digital Twins for Abundant Feedback: Novel Feedback Paradigms via Explainable Multilingual Natural Language Processing
  Bjerva, J. (Principal Investigator), Lindsay, E. (Principal Investigator) & Zhang, M. (Project participant)
  01/01/2024 → 31/12/2025
  Project: Research
- All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
  Vayani, A., Dissanayake, D., Watawana, H., Ahsan, N., Sasikumar, N., Thawakar, O., Ademtew, H. B., Hmaiti, Y., Kumar, A., Kuckreja, K., Maslych, M., Ghallabi, W. A., Qin, C., Shaker, A. M., Zhang, M., Ihsani, M. K., Esplana, A., Gokani, M., Mirkin, S., Singh, H. & 47 more, 2025, arXiv, 26 p.
  Publication: Working paper/Preprint › Preprint
  Open access
- All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
  Vayani, A., Dissanayake, D., Watawana, H., Ahsan, N., Sasikumar, N., Thawakar, O., Ademtew, H. B., Hmaiti, Y., Kumar, A., Kuckreja, K., Maslych, M., Ghallabi, W. A., Qin, C., Shaker, A. M., Zhang, M., Ihsani, M. K., Esplana, A., Gokani, M., Mirkin, S., Singh, H. & 47 more, 10 Jun 2025, 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (Institute of Electrical and Electronics Engineers), pp. 19565-19575, 11 p., 11094031. (IEEE Conference on Computer Vision and Pattern Recognition. Proceedings).
  Publication: Contribution to book/anthology/report/conference proceeding › Article in proceedings › Research › peer-review
  1 Citation (Scopus), 157 Downloads (Pure)
- Cross-Lingual Sentence-Level Skill Identification in English and Danish Job Advertisements
  Musazade, N., Zhang, M. & Mezei, J., 3 Aug 2025, (Accepted/In press) International Conference on Natural Language and Speech Processing 2025. Odense, Denmark: Association for Computational Linguistics.
  Publication: Contribution to book/anthology/report/conference proceeding › Article in proceedings › Research › peer-review
  Open access, File, 35 Downloads (Pure)
- DaKultur: Evaluating the Cultural Awareness of Language Models for Danish with Native Speakers
  Müller-Eberstein, M., Zhang, M., Bassignana, E., Brunsgaard Trolle, P. & van der Goot, R., 29 Apr 2025, 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP). Association for Computational Linguistics.
  Publication: Contribution to book/anthology/report/conference proceeding › Article in proceedings › Research › peer-review
  15 Downloads (Pure)
- HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings
  Aavang, R. T., Rizzi, G., Bøggild, R., Iolov, A., Zhang, M. & Bjerva, J., Feb 2025, (Submitted).
  Publication: Working paper/Preprint › Preprint
Research datasets
- HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings
  Jensen, R. T. A. (Creator), Rizzi, G. (Creator), Bøggild, R. (Creator), Iolov, A. (Creator), Zhang, M. (Creator) & Bjerva, J. (Supervisor), Hugging Face, 21 Feb 2025
  DOI: 10.57967/hf/4619
  Dataset
- Retrieval Augmented Skill Extraction
  Zhang, M. (Speaker)
  28 Apr 2025. Activity: Talks and presentations › Talks and presentations in private or public companies
  File
- SnakModel: Lessons Learned from Training an Open Danish Large Language Model
  Zhang, M. (Speaker)
  19 Dec 2024. Activity: Talks and presentations › Talks and presentations in private or public companies
- Evaluating Large Language Models: A Cultural Perspective
  Zhang, M. (Speaker)
  12 Dec 2024. Activity: Talks and presentations › Talks and presentations in private or public companies
- Evaluating Large Language Models: A Cultural Perspective
  Zhang, M. (Speaker)
  3 Dec 2024. Activity: Talks and presentations › Talks and presentations in private or public companies
- Pre-training Large Language Models
  Zhang, M. (Speaker)
  20 Sep 2024. Activity: Talks and presentations › Guest lecture
Press/Media
- Forsker: AI-modeller skal fodres med dansk kultur (Researcher: AI models must be fed Danish culture)
  21/10/2025 → 24/10/2025
  4 items of Media coverage
  Press/Media