Machine Learning with Limited Data (PhD Course)

Activity: Attending an event - Organisation or participation in workshops, courses, seminars, exhibitions or similar

Description

When using and developing Machine Learning methods, one frequently faces scenarios in which the available training data is highly limited. For instance, in Natural Language Processing (NLP), large amounts of data are commonly available for certain languages (e.g., English, Spanish, Chinese) but much less for others (e.g., Danish, Faroese, Haitian Creole). Importantly, however, when data is limited there is often rich domain knowledge available. For instance, when classifying medical images, there is an abundance of knowledge from the medical field that can be used, and when dealing with natural languages, the field of linguistics offers a wealth of insights. Whereas many methods for transfer learning simply ignore such domain knowledge, this course highlights the advantages of using it, and the necessity of interdisciplinary collaboration, rather than simply addressing data points without the appropriate domain context.
This course aims to: (i) provide an overview of methods for dealing with limited data in machine learning settings; (ii) provide concrete interactive code examples for participants to develop and potentially use for their own research; and (iii) highlight the importance of interdisciplinary collaboration and discussion with domain experts.

ECTS: 3

Lecturers:
Johannes Bjerva (AAU)
Jonas Pfeiffer (Google Zurich)
Benjamin Roth (University of Vienna)
Jonas Lotz (University of Copenhagen)
Jiaao Chen (Stanford University)
As everyone speaks a language, the examples used in the course will focus on NLP, and participants will be challenged to use their own intuitions about language to develop and analyze ML/NLP systems.
Period: 26 Sept 2023 to 29 Sept 2023
Event type: Course
Location: Copenhagen, Denmark
Degree of Recognition: International

Keywords

  • Natural Language Processing
  • Language Models
  • Machine Learning