As the Web grows, more and more data have become “hidden” or “deep”. Web crawlers should not only be able to fetch the Publicly Indexable Web (PIW), but also to plunge into the Hidden Web in search of more useful information. In this paper, we present an approach for exploring and identifying domain-specific search interfaces (DSIs) using an SVM classification scheme. We also introduce a method for integrating DSIs into a topical crawling system. This work is based on our observation that DSIs are concise and representative of their domains. Although the approach is intuitive and simple, it appears well suited to identifying DSIs. Experimental results show that this practical approach achieves good performance.
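To make the classification step concrete, the following is a minimal sketch of how a search form might be labelled as belonging to a domain with a linear SVM. It is an illustration under stated assumptions, not the paper's method: the abstract does not specify the features, kernel, or training data. The bag-of-words features, the bias-free subgradient training loop, the toy book-domain examples, and the names `tokens`, `train_linear_svm`, and `is_domain_interface` are all hypothetical.

```python
import re

def tokens(form_text):
    """Set of lowercased words from a form's labels and field names (assumed features)."""
    return set(re.findall(r"[a-z]+", form_text.lower()))

def train_linear_svm(examples, epochs=50, eta=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss.

    Simplifications for brevity: binary word features, fixed step size,
    and no bias term. `examples` is a list of (form_text, label) pairs
    with label +1 (domain interface) or -1 (anything else).
    """
    w = {}
    for _ in range(epochs):
        for text, y in examples:
            x = tokens(text)
            margin = y * sum(w.get(t, 0.0) for t in x)
            # Regularisation step: shrink every weight slightly.
            for t in w:
                w[t] *= 1.0 - eta * lam
            # Hinge-loss step: only examples inside the margin update w.
            if margin < 1.0:
                for t in x:
                    w[t] = w.get(t, 0.0) + eta * y
    return w

def is_domain_interface(w, form_text):
    """True when the form scores on the positive side of the hyperplane."""
    return sum(w.get(t, 0.0) for t in tokens(form_text)) > 0.0

# Hypothetical toy training set for a "books" domain.
TRAINING = [
    ("title author isbn search", +1),
    ("author publisher keyword search", +1),
    ("isbn title publisher find", +1),
    ("username password login", -1),
    ("email subscribe newsletter", -1),
    ("departure arrival flight search", -1),
]

weights = train_linear_svm(TRAINING)
print(is_domain_interface(weights, "Title Author ISBN"))
print(is_domain_interface(weights, "username password login"))
```

In a topical crawler, a predicate of this kind could be applied to each form found on a fetched page, so that only forms judged to belong to the target domain are queued for deeper exploration. A real system would need a much larger labelled set and richer features.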