Personal profile
Research profile
I am a Postdoctoral Researcher in Natural Language Processing (NLP), working on NLP applications with a particular focus on NLP for Education. I am advised by Prof. Johannes Bjerva and Prof. Euan Lindsay.
I hold a PhD in NLP from the IT University of Copenhagen (ITU), where I was advised by Prof. Barbara Plank and A/P. Rob van der Goot. I was part of NLPnorth at ITU and MaiNLP at the Ludwig Maximilian University of Munich (LMU). I worked on Computational Job Market Analysis (or NLP for HR), where we investigated how to extract information (e.g., skills) from job ads and match it to existing resources (e.g., taxonomies).
I am interested in:
- NLP x Education (Postdoc): Can we improve students’ learning by giving them automatic feedback from NLP tools (e.g., language models)? How can we do this over time?
- NLP x HR (PhD): How can we extract relevant skills from job ads, and how can we match them with existing taxonomies to help job centers match candidates to jobs more effectively?
- Resource Creation: More generally, I am interested in resource creation, such as developing annotation guidelines, creating (multilingual) datasets in both general and specific domains, and training language models at small and large scale.
Keywords
- Computer Science
- Natural Language Processing
- NLP
- Large Language Models
- Education
- Human Resources
- Artificial Intelligence
Projects
- 1 Active
-
Digital Twins for Abundant Feedback: Novel Feedback Paradigms via Explainable Multilingual Natural Language Processing
Bjerva, J. (PI), Lindsay, E. (PI) & Zhang, M. (Project Participant)
01/01/2024 → 31/12/2025
Project: Research
-
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
Vayani, A., Dissanayake, D., Watawana, H., Ahsan, N., Sasikumar, N., Thawakar, O., Ademtew, H. B., Hmaiti, Y., Kumar, A., Kuckreja, K., Maslych, M., Ghallabi, W. A., Qin, C., Shaker, A. M., Zhang, M., Ihsani, M. K., Esplana, A., Gokani, M., Mirkin, S., Singh, H. & 47 others, 10 Jun 2025, The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025. IEEE (Institute of Electrical and Electronics Engineers)
Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review
-
DaKultur: Evaluating the Cultural Awareness of Language Models for Danish with Native Speakers
Müller-Eberstein, M., Zhang, M., Bassignana, E., Brunsgaard Trolle, P. & van der Goot, R., 4 May 2025, (Accepted/In press) 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP). Association for Computational Linguistics, ACL Anthology
Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review
Open Access
-
HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings
Aavang, R. T., Rizzi, G., Bøggild, R., Iolov, A., Zhang, M. & Bjerva, J., Feb 2025, (Submitted).
Research output: Working paper/Preprint › Preprint
-
How Do Hackathons Foster Creativity? Towards AI Collaborative Evaluation of Creativity at Scale
Falk, J., Chen, Y., Rafner, J., Zhang, M., Bjerva, J. & Nolte, A., Apr 2025, CHI '25: Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery (ACM), 34 p.
Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review
Open Access
-
Humanity's Last Exam
Center for AI Safety, 24 Jan 2025, (In preparation).
Research output: Working paper/Preprint › Preprint
Datasets
-
HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings
Jensen, R. T. A. (Creator), Rizzi, G. (Creator), Bøggild, R. (Creator), Iolov, A. (Creator), Zhang, M. (Creator) & Bjerva, J. (Supervisor), Hugging Face, 21 Feb 2025
DOI: 10.57967/hf/4619
Dataset
-
SnakModel: Lessons Learned from Training an Open Danish Large Language Model
Zhang, M. (Lecturer)
19 Dec 2024
Activity: Talks and presentations › Talks and presentations in private or public companies
-
Evaluating Large Language Models: A Cultural Perspective
Zhang, M. (Lecturer)
12 Dec 2024
Activity: Talks and presentations › Talks and presentations in private or public companies
-
Evaluating Large Language Models: A Cultural Perspective
Zhang, M. (Lecturer)
3 Dec 2024
Activity: Talks and presentations › Talks and presentations in private or public companies
-
Pre-training Large Language Models
Zhang, M. (Lecturer)
20 Sept 2024
Activity: Talks and presentations › Guest lecturers