Returning results that are relevant to the query is the most crucial part of XML keyword search. Developing an XML keyword search approach that addresses user search intention, keyword ambiguity, and the grading (ranking) of query results remains challenging. In this paper, we propose a novel approach called XDMA for keyword search in XML databases that builds two indices to resolve these problems. A keyword search technique based on two-level matching between the two indices is then presented. Further, using logarithmic and probability functions, we define a Mutual Score to find the desired T-typed node. We also introduce a similarity measure to retrieve the exact data through the selected T-typed node. In addition, grading is applied to query results that have comparable relevance scores. Finally, we demonstrate the effectiveness of the proposed approach, XDMA, with a comprehensive experimental evaluation on the DBLP, WSU and eBay datasets.
In this paper, we propose a method for constructing an appropriate data granularity of XML elements for keyword search in an XML document. The keywords must be related through a hierarchical architecture. The proposed method constructs several XML element segments from the DTD and uses them to find the corresponding XML element segments in an XML document. Each identified element segment presents a hierarchical architecture of instance data that includes the user-specified keywords, so users can browse data at an appropriate granularity. Finally, an example illustrates how the proposed method works.
XML is emerging as a de facto standard for information exchange over the Web, while businesses and enterprises generate and exchange large amounts of XML data daily. One of the major challenges is how to query this data efficiently. Queries typically can be represented as twig patterns. Some researchers have developed algorithms that reduce the intermediate results that are generated during query processing, while others have introduced labeling schemes that encode the position of elements, enabling queries to be answered by accessing the labels without traversing the original XML documents. In this paper we outline optimizations that are based on semantics of the data being queried, and introduce efficient algorithms for content and keyword searches in XML databases. If the semantics are known we can further optimize the query processing, but if the semantics are unknown we revert to the traditional query processing approaches.
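The labeling-scheme idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a region (containment) encoding, one well-known family of labeling schemes and not necessarily the one used in the paper: each element receives a `(start, end, level)` label during a depth-first walk, and structural relationships are then decided from the labels alone. The function names `label_tree`, `is_ancestor` and `is_parent`, and the sample document, are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import count


def label_tree(root):
    """Assign a (start, end, level) region label to every element (assumed scheme)."""
    labels = {}
    counter = count(1)

    def visit(node, level):
        start = next(counter)          # position when the element is entered
        for child in node:
            visit(child, level + 1)
        end = next(counter)            # position when the element is left
        labels[node] = (start, end, level)

    visit(root, 0)
    return labels


def is_ancestor(a, d):
    """True if label a strictly contains label d."""
    return a[0] < d[0] and d[1] < a[1]


def is_parent(a, d):
    """True if a contains d and d is exactly one level deeper."""
    return is_ancestor(a, d) and d[2] == a[2] + 1


# Hypothetical sample document, used only to exercise the labels.
doc = ET.fromstring(
    "<dblp><article><author>Lee</author><title>XML</title></article>"
    "<book><title>RDF</title></book></dblp>"
)
labels = label_tree(doc)
article = doc.find("article")
author = article.find("author")
book_title = doc.find("book/title")
```

Once the labels are built, a check such as `is_ancestor(labels[article], labels[author])` compares integers only and never walks the document again, which is the property that makes label-based twig matching attractive.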
Existing RDF keyword search studies focus on constructing the smallest trees or subgraphs that contain all query keywords, but neglect the semantic association between RDF data. This paper therefore proposes the keyword parallel search over RDF data based on semantic association (KPSRSA) algorithm, which uses a score function that combines an OWL ontology with a probability model to measure semantic association. It uses the distributed database HBase as the storage medium and MapReduce to run queries in parallel: the Map phase finds sub-clusters with semantic association, and the Reduce phase constructs a series of associated clusters as query results. The experimental results demonstrate that the KPSRSA algorithm improves the precision of search results and their relevance to the query keywords. In addition, distributed storage and parallel query processing improve scalability.
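The Map/Reduce split described in the abstract above can be sketched in plain Python. This is a toy, single-process illustration and not the KPSRSA algorithm: it omits the OWL-based score function, HBase, and a real MapReduce runtime. Grouping matches by triple subject is an assumption made here for simplicity, and all names (`TRIPLES`, `map_phase`, `reduce_phase`, `run`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical RDF-like triples (subject, predicate, object).
TRIPLES = [
    ("paper1", "title", "keyword search over RDF"),
    ("paper1", "author", "Alice"),
    ("paper2", "title", "XML twig queries"),
    ("paper2", "author", "Alice"),
]


def map_phase(triple, keywords):
    """Emit (subject, (keyword, triple)) for every keyword the triple mentions."""
    s, p, o = triple
    text = f"{s} {p} {o}".lower()
    for kw in keywords:
        if kw.lower() in text:
            yield s, (kw, triple)


def reduce_phase(subject, values, keywords):
    """Keep a subject's cluster only if its triples cover all keywords."""
    matched = {kw for kw, _ in values}
    if matched == set(keywords):
        return subject, sorted({t for _, t in values})
    return None


def run(triples, keywords):
    # Shuffle step: group map output by key, as a MapReduce runtime would.
    groups = defaultdict(list)
    for t in triples:
        for key, value in map_phase(t, keywords):
            groups[key].append(value)
    results = (reduce_phase(k, v, keywords) for k, v in groups.items())
    return [r for r in results if r is not None]


result = run(TRIPLES, ["rdf", "alice"])
```

In this sketch only `paper1` survives the Reduce step for the query `["rdf", "alice"]`, because `paper2` matches a single keyword. In a real deployment each `map_phase` call would run on a separate data partition, which is where the scalability gain would come from.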