Abstract
We study temporal aspects of authorship attribution, a task that aims to distinguish automatically between texts written by different authors by measuring textual features. This task is important in a number of areas, including plagiarism detection in secondary education, which is the setting we study in this work. As students’ academic abilities evolve during their studies, so does their writing style. These changes in writing style form a type of temporal context, which we study in the authorship attribution process by focussing on the students’ more recent writing samples. Experiments with real-world data from Danish secondary school students show 84% prediction accuracy when using all available material and 71.9% prediction accuracy when using only the five most recent writing samples from each student.
Original language | English |
---|---|
Title of host publication | Multidisciplinary Information Retrieval : 7th Information Retrieval Facility Conference, IRFC 2014, Copenhagen, Denmark, November 10-12, 2014, Proceedings |
Editors | David Lamas, Paul Buitelaar |
Number of pages | 19 |
Publisher | Springer |
Publication date | 1 Jan 2014 |
Pages | 22-40 |
ISBN (Print) | 978-3-319-12978-5 |
ISBN (Electronic) | 978-3-319-12979-2 |
DOIs | |
Publication status | Published - 1 Jan 2014 |
Event | The 3rd Open Interdisciplinary MUMIA Conference and 7th Information Retrieval Facility Conference, Aalborg University Copenhagen, Copenhagen, Denmark. Duration: 11 Nov 2014 → 12 Nov 2014. Conference number: 7 |
Conference
Conference | The 3rd Open Interdisciplinary MUMIA Conference and 7th Information Retrieval Facility Conference |
---|---|
Number | 7 |
Location | Aalborg University Copenhagen |
Country/Territory | Denmark |
City | Copenhagen |
Period | 11/11/2014 → 12/11/2014 |
Series | Lecture Notes in Computer Science |
---|---|
Volume | 8849 |
ISSN | 0302-9743 |
Keywords
- Authorship attribution
- Automatic classification
- Secondary education