As XML has grown in popularity for data representation and exchange over the Web, the number of XML documents in circulation has increased rapidly. Turning these documents into a more useful information utility has become a challenge for researchers. In this paper, we introduce a novel clustering algorithm, PCXSS, that groups heterogeneous XML documents into clusters according to their structural and semantic similarity. We develop a global criterion function, CPSim, that progressively measures the similarity between an XML document and the existing clusters, avoiding the need to compute the similarity between each pair of individual documents. The experimental analysis shows the method to be fast and accurate.
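The progressive, document-to-cluster comparison described above can be sketched as follows. This is only an illustration under stated assumptions, not the PCXSS algorithm itself: the function names, the `threshold` parameter, and the Jaccard overlap of root-to-leaf tag paths used in place of CPSim are all hypothetical, and semantic similarity between tag names is not modelled.

```python
import xml.etree.ElementTree as ET


def leaf_paths(xml_text):
    """Return the set of root-to-leaf tag paths of an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:
            paths.add(path)
        for child in children:
            walk(child, path)

    walk(root, "")
    return paths


def path_overlap(doc_paths, cluster_paths):
    """Hypothetical stand-in for CPSim: Jaccard overlap of two path sets."""
    union = doc_paths | cluster_paths
    return len(doc_paths & cluster_paths) / len(union) if union else 0.0


def cluster_progressively(documents, threshold=0.5):
    """Assign each document to the most similar existing cluster, or open a new one.

    Each document is compared with cluster summaries only, never with
    other individual documents.
    """
    clusters = []  # each cluster: {"paths": set of paths, "members": [doc indices]}
    for index, xml_text in enumerate(documents):
        paths = leaf_paths(xml_text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = path_overlap(paths, cluster["paths"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["paths"] |= paths  # fold the document into the cluster summary
            best["members"].append(index)
        else:
            clusters.append({"paths": set(paths), "members": [index]})
    return clusters


docs = [
    "<book><title/><author/></book>",
    "<book><title/><author/><year/></book>",
    "<person><name/><email/></person>",
]
for cluster in cluster_progressively(docs):
    print(cluster["members"])
```

With this design each incoming document costs one comparison per existing cluster, so the number of similarity computations grows with the number of clusters rather than with the number of document pairs.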
The paper describes a technique called ISE for image segmentation using entropy. The relation between the entropy of an image domain and the entropy of its subdomains is explored as a uniformity predicate. This entropy is obtained by analyzing the image histogram, associating a Gaussian distribution with the maximum frequency of gray levels.
To implement the model, we use a well-known problem-solving technique. In our model, the most important roles are played by the Evaluation Function (EF) and the Control Strategy. The EF is related to the ratio between the entropy of a region or zone of the picture and the entropy of the entire picture, while the Control Strategy determines the optimal path in the search tree (a quadtree) so that the nodes on the optimal path have minimal entropy.
The paper shows some comparisons between ISE and classical edge detection techniques.
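The entropy ratio used as a uniformity predicate can be illustrated with a small quadtree split. This is a minimal sketch under stated assumptions, not the ISE method: the names `entropy`, `region_pixels`, and `split`, and the parameters `max_ratio` and `min_size`, are hypothetical; the entropy is the plain Shannon entropy of the raw gray-level histogram, without the Gaussian modelling described in the abstract; and a simple threshold-driven recursion stands in for the Control Strategy. The image is assumed to be square with a side that is a power of two.

```python
import math
from collections import Counter


def entropy(pixels):
    """Shannon entropy, in bits, of a flat list of gray levels."""
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def region_pixels(image, top, left, size):
    """Collect the gray levels of the square region starting at (top, left)."""
    return [image[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


def split(image, max_ratio=0.5, min_size=1):
    """Quadtree split: a region is kept whole when the ratio of its entropy
    to the entropy of the entire image does not exceed max_ratio.

    Returns a list of (top, left, size) tuples for the accepted regions.
    """
    side = len(image)
    whole = entropy(region_pixels(image, 0, 0, side))
    regions = []

    def visit(top, left, size):
        h = entropy(region_pixels(image, top, left, size))
        ratio = h / whole if whole > 0 else 0.0
        if ratio <= max_ratio or size <= min_size:
            regions.append((top, left, size))
            return
        half = size // 2
        for dt in (0, half):
            for dl in (0, half):
                visit(top + dt, left + dl, half)

    visit(0, 0, side)
    return regions


# A 4x4 image whose left half is dark and whose right half is bright.
image = [[0, 0, 255, 255] for _ in range(4)]
print(split(image))
```

In this toy image the root region has the same entropy as the whole picture, so it is split once; each quadrant then contains a single gray level, has zero entropy, and is accepted as uniform.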