Keyword Parallel Search over RDF Data Based on Semantic Association
Existing studies of keyword search over RDF data focus on constructing the smallest trees or subgraphs that contain all query keywords, but neglect the semantic association between RDF data. This paper therefore proposes the keyword parallel search over RDF data based on semantic association (KPSRSA) algorithm, which uses a score function that combines an OWL ontology with a probability model to measure semantic association. The algorithm uses the distributed database HBase as its storage medium and MapReduce to execute queries in parallel: the Map phase retrieves semantically associated sub-clusters, and the Reduce phase assembles them into a series of associated clusters that are returned as query results. Experimental results demonstrate that the KPSRSA algorithm improves the precision of search results and their relevance to the query keywords. In addition, distributed storage and parallel query processing improve scalability.
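The division of work between the Map and Reduce phases can be illustrated with a small in-memory sketch. It is not the KPSRSA implementation: it runs in a single process rather than on HBase and MapReduce, it groups matching triples by subject as a stand-in for sub-cluster construction, and it replaces the ontology-and-probability score function with a placeholder (the fraction of query keywords a cluster covers). The names `map_phase`, `reduce_phase`, and `search` are hypothetical.

```python
from collections import defaultdict


def map_phase(triples, keywords):
    """Emit (subject, (keyword, triple)) for each triple mentioning a keyword.

    Assumption: grouping by subject stands in for sub-cluster retrieval.
    """
    for s, p, o in triples:
        text = f"{s} {p} {o}".lower()
        for kw in keywords:
            if kw.lower() in text:
                yield s, (kw, (s, p, o))


def reduce_phase(pairs, keywords):
    """Merge emitted pairs per key into clusters and rank them."""
    if not keywords:
        return []
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    clusters = []
    for subject, values in grouped.items():
        covered = {kw for kw, _ in values}
        triples = sorted({t for _, t in values})
        # Placeholder score (assumption): share of query keywords covered.
        # The paper's score function combines an OWL ontology with a
        # probability model, which is not reproduced here.
        score = len(covered) / len(keywords)
        clusters.append({"subject": subject, "triples": triples, "score": score})
    # Highest score first; ties broken by subject name for determinism.
    clusters.sort(key=lambda c: (-c["score"], c["subject"]))
    return clusters


def search(triples, keywords):
    """Run the map step, then the reduce step, over a list of RDF triples."""
    return reduce_phase(map_phase(triples, keywords), keywords)


if __name__ == "__main__":
    data = [
        ("alice", "worksAt", "acme"),
        ("alice", "knows", "bob"),
        ("bob", "worksAt", "initech"),
    ]
    for cluster in search(data, ["acme", "bob"]):
        print(cluster["subject"], cluster["score"], cluster["triples"])
```

In this toy data set the cluster around `alice` covers both keywords and is ranked ahead of the cluster around `bob`, which covers only one. In a real deployment the map step would run over HBase rows in parallel and the score would reflect semantic association rather than simple keyword coverage.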