Abstract
Plagiarism is a complex problem and is considered one of the most serious in the publishing of scientific, engineering, and other types of documents. It has increased with the widespread use of the Internet, which makes large amounts of digital data readily available. Plagiarism is not limited to direct copying; it also includes paraphrasing, rewording, adapting parts of a work, omitting references, and citing incorrectly, which makes the problem harder to handle adequately. Plagiarism detection techniques differ depending on whether they target natural language or programming languages. Our proposed detection process targets natural language and works by comparing documents. A similarity score is computed for each pair of documents that match significantly. We have implemented SCAM (Stanford Copy Analysis Mechanism), a relative measure that detects overlap by comparing the set of words common to the test document and a registered document. Our plagiarism detection system, like many Information Retrieval systems, is evaluated using the metrics of precision and recall.
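To make the word-overlap measure and the evaluation metrics concrete, here is a minimal sketch of a SCAM-style relative-frequency similarity together with precision and recall. It is not the authors' implementation. It assumes uniform word weights, a simple regex tokenizer, and a closeness tolerance `epsilon = 2.5`, none of which are given in the abstract, and it does not use WordNet or Apache Lucene. The function names (`word_counts`, `subset_measure`, `scam_similarity`, `precision_recall`) are hypothetical.

```python
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count word occurrences (assumed tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def subset_measure(f1, f2, epsilon=2.5):
    """Asymmetric overlap: how much of document 1 is covered by document 2.

    A shared word contributes only if its frequencies in the two documents
    are close, i.e. epsilon - (a/b + b/a) > 0. epsilon is an assumed value.
    """
    numerator = 0
    for word in f1.keys() & f2.keys():
        a, b = f1[word], f2[word]
        if epsilon - (a / b + b / a) > 0:
            numerator += a * b
    # Normalise by the squared word frequencies of document 1.
    denominator = sum(count * count for count in f1.values())
    return numerator / denominator if denominator else 0.0


def scam_similarity(test_doc, registered_doc, epsilon=2.5):
    """Symmetric score: the larger of the two subset measures, capped at 1."""
    f1, f2 = word_counts(test_doc), word_counts(registered_doc)
    score = max(subset_measure(f1, f2, epsilon),
                subset_measure(f2, f1, epsilon))
    return min(score, 1.0)


def precision_recall(flagged, relevant):
    """Precision and recall of flagged documents against known plagiarised ones."""
    flagged, relevant = set(flagged), set(relevant)
    hits = len(flagged & relevant)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In this sketch, two identical documents score 1.0 and documents with no shared words score 0.0. Taking the maximum of the two subset measures lets a short registered passage embedded in a longer test document still produce a high score, which is the sense in which the measure is relative rather than symmetric like cosine similarity.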
Original language | English |
---|---|
Title of host publication | Proceedings of the International MultiConference on Engineers and Computer Scientists 2011 |
Number of pages | 6 |
Volume | Volume I |
Place of Publication | Hong Kong |
Publisher | Newswood Limited, International Association of Engineers, IAENG |
Publication date | 2011 |
Pages | 272-277 |
ISBN (Print) | 978-988-18210-3-4 |
Publication status | Published - 2011 |
Event | International MultiConference on Engineers and Computer Scientists 2011 - Hong Kong, China; Duration: 16 Mar 2011 → 18 Mar 2011 |
Conference
Conference | International MultiConference on Engineers and Computer Scientists 2011 |
---|---|
Country/Territory | China |
City | Hong Kong |
Period | 16/03/2011 → 18/03/2011 |
Keywords
- Plagiarism
- SCAM
- WordNet
- Apache Lucene