Improving Plagiarism Detection in Coding Assignments by Dynamic Removal of Common Ground

Christian Domin, Henning Pohl, Markus Krause

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

15 Citations (Scopus)

Abstract

Plagiarism in online learning environments has a detrimental effect on the trust in online courses and on their viability. Automatic plagiarism detection systems do exist, yet the specific situation in online courses restricts their use. To allow for easy automated grading, online assignments are usually less open-ended and instead require students to fill in small gaps. Solutions therefore tend to be very similar without necessarily being plagiarized. In this paper we propose a new approach to detecting code re-use that increases prediction accuracy by dynamically removing the parts that appear in almost every assignment: the so-called common ground. Our approach achieves significantly better F-measure and Cohen's kappa scores than state-of-the-art algorithms such as Moss or JPlag. The proposed method is also language agnostic, to the point that training and test data sets can be taken from different programming languages.
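
The common-ground idea from the abstract can be sketched as follows. This is an illustrative simplification, not the authors' actual algorithm: the line-level granularity, the frequency threshold, and the Jaccard comparison are all assumptions made for the sketch.

```python
from collections import Counter

def strip_common_ground(submissions, threshold=0.8):
    """Remove lines that occur in at least `threshold` of all
    submissions (the 'common ground'), so that only the parts
    that can meaningfully indicate re-use remain."""
    line_sets = [
        {line.strip() for line in s.splitlines() if line.strip()}
        for s in submissions
    ]
    counts = Counter(line for lines in line_sets for line in lines)
    cutoff = threshold * len(submissions)
    common = {line for line, n in counts.items() if n >= cutoff}
    return [lines - common for lines in line_sets]

def jaccard(a, b):
    """Similarity of two stripped submissions (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical gap-fill submissions: the shared scaffold line
# "def solve(x):" is common ground and gets removed before comparison.
subs = [
    "def solve(x):\n    return x + 1",
    "def solve(x):\n    return x + 1",
    "def solve(x):\n    y = x * 2\n    return y - x + 1",
]
stripped = strip_common_ground(subs, threshold=0.9)
```

After stripping, identical gap contents still score a high similarity, while the scaffold that every student received no longer inflates the score.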
Original language: Undefined/Unknown
Title of host publication: Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems - CHI EA '16
Number of pages: 7
Publication date: 2016
Pages: 1173-1179
Publication status: Published - 2016
Externally published: Yes
Event: 2016 CHI Conference on Human Factors in Computing Systems - San Jose, United States
Duration: 7 May 2016 - 12 May 2016
Internet address: http://chi2016.acm.org/wp/
