Automatic hidden web database classification

Abstract

This paper presents a method for the automatic classification of hidden Web databases. In our approach, the classification tree for hidden Web databases is constructed by tailoring the well-accepted classification tree of the DMOZ Directory. The features of each class are then extracted from randomly selected Web documents in the corresponding category. For each Web database, query terms are selected from the class features according to their weights. The hidden Web database (HW-DB) is then probed with these class-specific queries, and the returned results are analyzed to classify it. To improve performance further, we also use Web pages that link to the HW-DB as a second important source for representing the database. Our final classification solution combines link-based evaluation with query-based probing. Experiments show that the combined method classifies hidden Web databases considerably better than either source used alone. © Springer-Verlag Berlin Heidelberg 2007.
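The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how probe results are scored or how the two evidence sources are merged, so the match-count scoring, the linear weighting with `alpha`, and all function names (`select_query_terms`, `probe_scores`, `link_scores`, `classify`) are assumptions made for illustration. The `count_matches` callable is a hypothetical stand-in for submitting a query to a database's search interface and reading the number of hits.

```python
from collections import Counter


def select_query_terms(class_features, k=3):
    """Pick the k highest-weighted feature terms of each class as probe queries."""
    return {
        cls: [t for t, _ in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for cls, weights in class_features.items()
    }


def _normalize(raw):
    """Scale scores so they sum to 1 (all zeros if there is no evidence)."""
    total = sum(raw.values())
    return {cls: (v / total if total else 0.0) for cls, v in raw.items()}


def probe_scores(query_terms, count_matches):
    """Query-based probing: score a class by the hits its queries return.

    Assumption: raw hit counts are summed per class; the paper may score differently.
    """
    return _normalize(
        {cls: sum(count_matches(t) for t in terms) for cls, terms in query_terms.items()}
    )


def link_scores(class_features, linking_pages):
    """Link-based evaluation: weighted term overlap with pages linking to the database."""
    tokens = Counter(w for page in linking_pages for w in page.lower().split())
    return _normalize(
        {
            cls: sum(wt * tokens[t] for t, wt in weights.items())
            for cls, weights in class_features.items()
        }
    )


def classify(class_features, count_matches, linking_pages, alpha=0.5, k=3):
    """Combine both sources; alpha is an assumed mixing weight, not from the paper."""
    q = probe_scores(select_query_terms(class_features, k), count_matches)
    l = link_scores(class_features, linking_pages)
    combined = {cls: alpha * q[cls] + (1 - alpha) * l[cls] for cls in class_features}
    return max(combined, key=combined.get), combined
```

In use, `class_features` would map each category of the tailored classification tree to its weighted terms, and `linking_pages` would hold the text of pages that link to the database. Setting `alpha` to 1 or 0 reduces the sketch to pure query-based probing or pure link-based evaluation, which makes it easy to compare the two sources with their combination.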

Citation (APA)

Gong, Z., Zhang, J., & Liu, Q. (2007). Automatic hidden web database classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4702 LNAI, pp. 454–461). Springer Verlag. https://doi.org/10.1007/978-3-540-74976-9_46
