The utility of information extraction in the classification of books

Abstract

We describe work on automatically assigning classification labels to books using the Library of Congress Classification scheme. This task is non-trivial due to the volume and variety of books that exist. We explore the utility of Information Extraction (IE) techniques within this text categorisation (TC) task, automatically extracting structured information from the full text of books. Experimental evaluation of performance involves a corpus of books from Project Gutenberg. Results indicate that a classifier which combines methods and tools from IE and TC significantly improves over a state-of-the-art text classifier, achieving a classification performance of F(β=1) = 0.8099. © Springer-Verlag Berlin Heidelberg 2007.
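
For readers unfamiliar with the reported metric, the F-measure with β = 1 is the harmonic mean of precision and recall. The sketch below shows the standard general formula only; the function name `f_beta` and the example precision and recall values are illustrative assumptions, not figures from the paper, and the abstract does not say how scores were averaged across classes.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (standard F-measure).

    With beta = 1 this reduces to 2 * P * R / (P + R).
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define the score as zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Illustrative values only (hypothetical, not taken from the paper).
example_score = f_beta(precision=0.8, recall=0.6)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a classifier cannot reach a high score by maximising precision or recall alone.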

Cite

Betts, T., Milosavljevic, M., & Oberlander, J. (2007). The utility of information extraction in the classification of books. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4425 LNCS, pp. 295–306). Springer Verlag. https://doi.org/10.1007/978-3-540-71496-5_28
