In this paper, we describe a query approximation system that uses the Multi-Layered Database (MLDB), a collection of summarized relational data generated using domain-based concept hierarchies. The system generates approximate answers to queries in order to handle environmental constraints and access control levels, thus preserving the privacy and security of the data. Using concept hierarchies (CHs), we generalize attributes to transform base relations into layers of summarized relations, each corresponding to an access control level. The summary databases thus formed compress the tuples of the main database using the CH constructed from the domain set. A query is rewritten by traversing the MLDB layers according to the user's access control level. We present summarization methods, query rewriting algorithms, the implementation, and experimental results of the system. In addition, we analyze some of the known inferences in Multi-Level Secure (MLS) databases and explore their effects on an approximate query processor that uses the MLDB model. By analyzing inferential queries, we identify the relationships they have in common and use these to devise possible solutions for detecting and preventing inference problems. These solutions are added as patches to the MLDB query processor, forming a system that provides approximate results while preserving privacy and blocking the possible inferences. We have observed that these patches introduce only a small overhead in MLDB generation and query processing.
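The generalization and query rewriting steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `CITY_CH` hierarchy, the function names, and the convention that a higher access level number means a less privileged user (and therefore a more generalized layer) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical concept hierarchy for a "city" attribute (child -> parent).
CITY_CH = {
    "Chennai": "South India",
    "Bengaluru": "South India",
    "Delhi": "North India",
    "South India": "India",
    "North India": "India",
}


def generalize(value, hierarchy, levels):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in hierarchy:
            break
        value = hierarchy[value]
    return value


def build_layer(cities, hierarchy, level):
    """Summarize base values into {generalized value: tuple count}."""
    return dict(Counter(generalize(c, hierarchy, level) for c in cities))


def build_mldb(cities, hierarchy, max_level):
    """Layer 0 is the base data; each higher layer is more generalized."""
    return {lvl: build_layer(cities, hierarchy, lvl)
            for lvl in range(max_level + 1)}


def answer_count_query(mldb, hierarchy, city, access_level):
    """Rewrite COUNT(city = X) against the layer for the user's level.

    Assumption: access_level indexes the layer the user may see, so the
    queried constant is generalized by the same number of steps.
    """
    target = generalize(city, hierarchy, access_level)
    return target, mldb[access_level].get(target, 0)


base = ["Chennai", "Chennai", "Bengaluru", "Delhi"]
mldb = build_mldb(base, CITY_CH, max_level=2)
print(answer_count_query(mldb, CITY_CH, "Chennai", 0))
print(answer_count_query(mldb, CITY_CH, "Chennai", 1))
```

In this sketch a user at level 0 receives the exact count for Chennai, while a user at level 1 receives only the count for the generalized region, which is the sense in which the answer is approximate.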
Clustering data in a high-dimensional space is of great interest in many data mining applications. In this paper, we propose a method for clustering web usage data in a high-dimensional space based on a concept hierarchy model. In this method, the relationships present in the web usage data are mapped into a fuzzy proximity relation on user transactions. We also describe an approach for presenting a preference set of URLs to a new user transaction based on its match score with the clusters. The study demonstrates that our approach is general and effective for mining web data for web personalization.
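The idea of scoring a new user transaction against existing clusters and returning a preference set of URLs can be sketched as follows. The abstract does not define the match score, so this example substitutes Jaccard similarity between URL sets as a stand-in; the cluster contents and function names are hypothetical.

```python
def match_score(transaction, cluster_urls):
    """Jaccard similarity between a transaction and a cluster's URL set.

    Assumption: this stands in for the paper's match score, which the
    abstract does not specify.
    """
    t, c = set(transaction), set(cluster_urls)
    union = t | c
    return len(t & c) / len(union) if union else 0.0


def recommend(transaction, clusters):
    """Return the best-matching cluster and its URLs not yet visited."""
    best = max(clusters, key=lambda name: match_score(transaction, clusters[name]))
    preference_set = sorted(set(clusters[best]) - set(transaction))
    return best, preference_set


# Hypothetical clusters of URLs derived from past user transactions.
clusters = {
    "sports": ["/sports", "/sports/cricket", "/scores"],
    "news": ["/news", "/news/world", "/weather"],
}
new_transaction = ["/sports", "/scores"]
print(recommend(new_transaction, clusters))
```

A fuzzy variant would replace the exact set intersection with graded proximity between URLs, for example proximity derived from their positions in a concept hierarchy.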
When attempting knowledge discovery on spatial data, certain additional constraints on and relationships among the data must be considered. These include spatially or locationally explicit attributes, as well as more implicit topological relationships. Given such additional constraints, many generalized data mining techniques and algorithms may be specially tailored for mining in spatial data. This chapter introduces several adapted techniques and algorithms that may be applied in a spatial data mining task.