External plagiarism detection systems compare suspicious texts against a reference collection to identify the original source(s). The suspicious text may not contain a verbatim copy of text from the reference collection, since plagiarists often try to disguise their behaviour by altering the text. For large reference collections, such as those accessible via the internet, it is not practical to compare the suspicious text with every document in the reference collection. Consequently, many approaches to plagiarism detection begin by identifying a set of candidate documents from the reference collection. We report an IR-based approach to the candidate document selection problem that uses query expansion to identify candidates that have been altered. The reported system outperforms a previously reported approach and is also robust to changes in the reference collection text. © 2012 Springer-Verlag Berlin Heidelberg.
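The general idea of candidate selection with query expansion can be illustrated with a minimal sketch. This is not the authors' system: the synonym table, the TF-IDF-style scoring, and the names `SYNONYMS`, `expand` and `rank_candidates` are all assumptions made purely for illustration. The suspicious text is treated as a query, its terms are expanded with related words, and the reference documents are ranked by a simple weighted term overlap.

```python
import math
from collections import Counter

# Hypothetical synonym table (an assumption for this sketch); a real system
# would draw expansion terms from a lexical resource or other source.
SYNONYMS = {"car": {"automobile"}, "quick": {"fast"}}


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def expand(terms, synonyms=SYNONYMS):
    """Return the query terms followed by any synonyms found for them."""
    expanded = list(terms)
    for term in terms:
        expanded.extend(sorted(synonyms.get(term, ())))
    return expanded


def rank_candidates(query_text, collection, top_k=2, use_expansion=True):
    """Rank reference documents against a suspicious text used as a query.

    Scoring is a basic TF-IDF-style sum, chosen for brevity; it is not the
    retrieval model used in the paper.
    """
    docs = {doc_id: Counter(tokenize(text)) for doc_id, text in collection.items()}
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for counts in docs.values() for term in counts)

    terms = tokenize(query_text)
    if use_expansion:
        terms = expand(terms)

    scores = {}
    for doc_id, counts in docs.items():
        score = 0.0
        for term in terms:
            if counts[term]:
                idf = math.log((n_docs + 1) / (doc_freq[term] + 1)) + 1.0
                score += counts[term] * idf
        if score > 0:
            scores[doc_id] = score

    # Highest-scoring documents first; only documents with a match are kept.
    ranked = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return ranked[:top_k]
```

In this toy setting, a suspicious text whose words have all been replaced by synonyms shares no terms with its source, so an unexpanded query retrieves nothing, whereas the expanded query can still surface the source as a candidate.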
CITATION STYLE
Nawab, R. M. A., Stevenson, M., & Clough, P. (2012). Retrieving candidate plagiarised documents using query expansion. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7224 LNCS, pp. 207–218). https://doi.org/10.1007/978-3-642-28997-2_18