TY - UNPB
T1 - Scaling Course Evaluations with Large Language Models: Semester-level Digestible Student Feedback for Program Leaders
AU - Zhang, Mike
AU - Lindsay, Euan
AU - Quitzau, Maj-Britt
AU - Bjerva, Johannes
PY - 2025/4/1
Y1 - 2025/4/1
N2 - End-of-semester student evaluations represent the primary feedback mechanism for academics' teaching practices. However, at the department or semester level, the sheer volume of feedback renders traditional analysis methods impractical. This paper addresses a gap in previous work, which explored only course-level synthesis, by using open-source generative AI to create factual, actionable, and appropriate summaries of student feedback across an entire department. Our study analyses 28 semester-level evaluation reports containing student comments (approximately 25,000 words and 170,000 characters) spanning the department, from which the model produces insights at several levels: degree level, semester level, year level, and department level. Through structured prompting, we developed a methodology that meets the specific needs of study board chairs, who previously faced a high workload from manually reviewing evaluations twice yearly. Our prompts allow the model to systematically check for predetermined themes while also identifying emergent patterns across courses. This approach enables targeted professional development initiatives at the departmental scale. Our contribution demonstrates that generative AI can effectively synthesize student feedback at a large organizational level, providing a cost-effective mechanism to support educational development and quality improvement across an entire academic unit.
AB - End-of-semester student evaluations represent the primary feedback mechanism for academics' teaching practices. However, at the department or semester level, the sheer volume of feedback renders traditional analysis methods impractical. This paper addresses a gap in previous work, which explored only course-level synthesis, by using open-source generative AI to create factual, actionable, and appropriate summaries of student feedback across an entire department. Our study analyses 28 semester-level evaluation reports containing student comments (approximately 25,000 words and 170,000 characters) spanning the department, from which the model produces insights at several levels: degree level, semester level, year level, and department level. Through structured prompting, we developed a methodology that meets the specific needs of study board chairs, who previously faced a high workload from manually reviewing evaluations twice yearly. Our prompts allow the model to systematically check for predetermined themes while also identifying emergent patterns across courses. This approach enables targeted professional development initiatives at the departmental scale. Our contribution demonstrates that generative AI can effectively synthesize student feedback at a large organizational level, providing a cost-effective mechanism to support educational development and quality improvement across an entire academic unit.
M3 - Preprint
BT - Scaling Course Evaluations with Large Language Models: Semester-level Digestible Student Feedback for Program Leaders
ER -