Automatic hidden web database classification

Zhiguo Gong; Jingbai Zhang; Qian Liu

Conference Proceedings · Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4702 LNAI 454-461

DOI: 10.1007/978-3-540-74976-9_46

Abstract

This paper presents a method for automatically classifying Hidden-Web databases. The classification tree for Hidden-Web databases is constructed by tailoring the widely accepted classification tree of the DMOZ Directory. The features of each class are then extracted from randomly selected Web documents in the corresponding category. For each Web database, query terms are selected from the class features according to their weights; the database is then probed with these class-specific queries and the results are analyzed. To improve performance further, we also use Web pages that link to the Hidden-Web database (HW-DB) as another important source for representing the database. Our final classification solution combines link-based evaluation with query-based probing. Experiments show that the combined method yields much better classification performance for Hidden-Web databases. © Springer-Verlag Berlin Heidelberg 2007.
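
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the term-weighting scheme, the number of probe queries, or how the two evidence sources are merged. The sketch assumes relative term frequency as the weight, the top `k` terms as queries, and a linear mix controlled by `alpha`. All function names, and the `match_count` callback that stands in for querying a real search interface, are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List

Weights = Dict[str, float]


def class_features(docs_by_class: Dict[str, List[str]]) -> Dict[str, Weights]:
    """Weight each term by its relative frequency in the class's sample documents
    (an assumed weighting scheme; the abstract does not name one)."""
    features = {}
    for cls, docs in docs_by_class.items():
        counts = Counter(word for doc in docs for word in doc.lower().split())
        total = sum(counts.values())
        features[cls] = {term: n / total for term, n in counts.items()}
    return features


def select_queries(weights: Weights, k: int) -> List[str]:
    """Pick the k highest-weighted terms; ties are broken alphabetically."""
    return sorted(weights, key=lambda t: (-weights[t], t))[:k]


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to sum to 1 so the two evidence sources are comparable."""
    total = sum(scores.values())
    return {c: (s / total if total else 0.0) for c, s in scores.items()}


def probe_scores(features: Dict[str, Weights],
                 match_count: Callable[[str], int], k: int) -> Dict[str, float]:
    """Query-based probing: sum the match counts returned for each class's queries."""
    raw = {cls: float(sum(match_count(t) for t in select_queries(w, k)))
           for cls, w in features.items()}
    return normalize(raw)


def link_scores(features: Dict[str, Weights],
                linking_pages: List[str]) -> Dict[str, float]:
    """Link-based evaluation: score the text of pages that link to the database."""
    words = Counter(w for page in linking_pages for w in page.lower().split())
    raw = {cls: sum(weight * words[t] for t, weight in w.items())
           for cls, w in features.items()}
    return normalize(raw)


def classify(features: Dict[str, Weights], match_count: Callable[[str], int],
             linking_pages: List[str], alpha: float = 0.5, k: int = 3) -> str:
    """Combine both scores linearly (assumed) and return the best class."""
    probe = probe_scores(features, match_count, k)
    link = link_scores(features, linking_pages)
    combined = {c: alpha * probe[c] + (1 - alpha) * link[c] for c in features}
    return max(combined, key=combined.get)
```

In use, `match_count` would wrap a call to the database's search form and return the number of reported matches for a term. Setting `alpha` to 1 or 0 reduces the sketch to probing-only or link-only classification, which is one way to compare the combined method against each source alone.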

Cite

Gong, Z., Zhang, J., & Liu, Q. (2007). Automatic hidden web database classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4702 LNAI, pp. 454–461). Springer Verlag. https://doi.org/10.1007/978-3-540-74976-9_46
