Computing approximate patterns in strings or sequences has important applications in DNA sequence analysis, data compression, musical text analysis, and so on. In this paper, we introduce approximate k-covers and study them under various commonly used distance measures. We propose the following problem: "Given a string x of length n, a set U of m strings of length k, and a distance measure, compute the minimum number t such that U is a set of approximate k-covers for x with distance t". To solve this problem, we present three algorithms with time complexity O(km(n - k)), O(mn^2) and O(mn^2) under Hamming, Levenshtein and edit distance, respectively. A World Wide Web server interface has been established at for automated use of the programs.
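The Hamming-distance case can be illustrated with a short sketch. It is not the paper's algorithm: it assumes one reading of the definition, namely that U covers x with distance t when every position of x lies inside some length-k window whose Hamming distance to at least one string in U is at most t. The function names `hamming` and `min_cover_distance` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def min_cover_distance(x, U, k):
    """Smallest t such that windows within Hamming distance t of some
    string in U cover every position of x (assumed definition)."""
    n = len(x)
    if n < k or not U or any(len(u) != k for u in U):
        raise ValueError("need len(x) >= k and every string in U of length k")

    # d[i]: distance from the window x[i:i+k] to its closest string in U.
    # This step costs O(k * m * (n - k)) character comparisons.
    d = [min(hamming(x[i:i + k], u) for u in U) for i in range(n - k + 1)]

    # best[i]: smallest achievable maximum distance over a chain of windows
    # that starts at window 0, ends at window i, and leaves no gap, i.e.
    # consecutive chosen windows start at most k positions apart.
    best = [d[0]]
    for i in range(1, len(d)):
        prev = min(best[max(0, i - k):i])
        best.append(max(d[i], prev))

    # The last position of x is only covered by the last window.
    return best[-1]
```

For example, `min_cover_distance("abcabd", ["abc"], 3)` evaluates the windows `abc`, `bca`, `cab` and `abd`; the cover must end with `abd`, which differs from `abc` in one position. Under Levenshtein or edit distance the windows of x would no longer have fixed length, so this sketch does not carry over directly.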
Feature saliency estimation and feature selection are important tasks in machine learning applications. Filters, such as distance measures, are commonly used as an efficient means of estimating the saliency of individual features. However, feature rankings derived from different distance measures are frequently inconsistent. This can present reliability issues when the rankings are used for feature selection. Two novel consensus approaches to creating a more robust ranking are presented in this paper. Our experimental results show that the consensus approaches can improve reliability over a range of feature parameterizations and various seabed texture classification tasks in sidescan sonar mosaic imagery.
Simple approaches to texture discrimination based on histogram analysis are useful in real-time applications but often yield inadequate results. On the other hand, methods based on higher-order statistics (e.g., co-occurrence matrices) provide a more complete statistical characterisation but are extremely time-consuming. In this paper, methods based on first-order statistical analysis are reviewed and the significance of the relevant representative features is analyzed. Then, rank functions are considered and appropriate distance functions are introduced that prove to have substantial advantages over classical histogram-based approaches.
The concept of the moving average is studied. We analyze several extensions by using generalized aggregation operators, obtaining the generalized moving average. The main advantage is that it provides a general framework that includes a wide range of specific cases, including the geometric and the quadratic moving average. This analysis is extended by using the generalized ordered weighted averaging (GOWA) and the induced GOWA (IGOWA) operator. Thus, we get the generalized ordered weighted moving average (GOWMA) and the induced GOWMA (IGOWMA) operator. Some of their main properties are studied. We further extend this approach by using distance measures, suggesting the concepts of the distance moving average and the generalized distance moving average. We also consider the case with the OWA and the IOWA operator, obtaining the generalized ordered weighted moving averaging distance (GOWMAD) and the induced GOWMAD (IGOWMAD) operator. The paper ends with an application in multi-period decision making.
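A minimal sketch of how a generalized moving average can contain the arithmetic, geometric and quadratic cases, assuming the generalized (power) mean with equal weights as the aggregation operator. The parameter name `lam` and both function names are hypothetical, and the ordered weighted and induced variants are not covered.

```python
import math


def generalized_mean(values, lam):
    """Equal-weight power mean of positive numbers.

    lam = 1 gives the arithmetic mean, lam = 2 the quadratic mean,
    and lam = 0 is treated as the geometric-mean limit.
    """
    if lam == 0:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** lam for v in values) / len(values)) ** (1.0 / lam)


def generalized_moving_average(series, window, lam):
    """Apply the power mean to every run of `window` consecutive values."""
    return [
        generalized_mean(series[i:i + window], lam)
        for i in range(len(series) - window + 1)
    ]
```

With `series = [1, 2, 4, 8]` and `window = 2`, setting `lam = 1` yields the usual moving average, while `lam = 0` and `lam = 2` yield the geometric and quadratic moving averages of the same windows. The sketch assumes strictly positive inputs, since the logarithm and fractional powers are otherwise undefined.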
A hesitant fuzzy linguistic term set (HFLTS) is a set of ordered consecutive linguistic terms, and is very useful in situations where people are hesitant in providing their linguistic assessments. Wang [H. Wang, Extended hesitant fuzzy linguistic term sets and their aggregation in group decision making, International Journal of Computational Intelligence Systems 8(1) (2015) 14–33.] removed the consecutive condition to introduce the notion of extended HFLTS (EHFLTS). The generalized form has wider applications in linguistic group decision making. By introducing distance measures for EHFLTSs, in this paper we develop a novel multi-criteria group decision making model to deal with hesitant fuzzy linguistic information. The model collects group linguistic information by using EHFLTSs and avoids the possible loss of information. Moreover, it can assess the importance weights of criteria according to their subjective and objective information and rank alternatives based on the rationale of TOPSIS. In order to illustrate the applicability of the proposed algorithm, two examples are given and comparisons are made with other existing methods.
Distance-based algorithms are nonparametric methods that can be used for classification. These algorithms classify objects by the dissimilarity between them as measured by distance functions. Several candidate distance functions are reviewed in this chapter along with two particular classification algorithms. Some of the current applications related to distance-based algorithms are also addressed.
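As an illustration of classifying by dissimilarity, the sketch below pairs three standard distance functions with a nearest-neighbor rule. The abstract does not name the distance functions or the two algorithms it reviews, so these choices, and the name `nearest_neighbor_label`, are assumptions made only for the example.

```python
def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(p - q) for p, q in zip(a, b))


def nearest_neighbor_label(query, points, labels, distance=euclidean):
    """Return the label of the training point closest to `query`."""
    closest = min(range(len(points)), key=lambda i: distance(query, points[i]))
    return labels[closest]
```

Passing the distance as a parameter keeps the classifier unchanged while the dissimilarity measure varies, which is the sense in which such methods are driven by the choice of distance function.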
The properties of the dissimilarity measure between intuitionistic fuzzy points are discussed in this paper. A new axiomatic definition of distance measures between intuitionistic fuzzy sets is proposed, and a general formula for the distance measure is presented based on the dissimilarity measure between intuitionistic fuzzy points. Finally, the similarity measures generated by the two new distance measures are analyzed by some patterns.
The method of k-nearest neighbors (k-NN) is used for estimation of the conditional expectation (regression) of an output Y given the value of an input vector x. Such a regression problem arises, for example, in insurance, where the pure premium for a new client (policy) x is to be found as the conditional mean of the loss. In accordance with the supervised learning set-up, a training set is assumed. We apply the k-NN method to a real data set by proposing solutions for feature weighting, distance weighting, and the choice of k. All the optimization procedures are based on cross-validation techniques. Comparisons with other methods of estimation of the regression function, such as regression trees and generalized linear models (quasi-Poisson regression), are drawn, demonstrating the high competitiveness of the k-NN method.
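The three ingredients named above (feature weighting, distance weighting, and the choice of k by cross-validation) can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: it uses a weighted Euclidean distance, inverse-distance weights, and leave-one-out squared error, and all function and parameter names are hypothetical.

```python
def weighted_distance(a, b, feature_weights):
    """Euclidean distance with one non-negative weight per feature."""
    return sum(w * (p - q) ** 2 for w, p, q in zip(feature_weights, a, b)) ** 0.5


def knn_regress(query, X, y, k, feature_weights, eps=1e-9):
    """Inverse-distance-weighted mean of the k nearest training outputs."""
    ranked = sorted(
        (weighted_distance(query, xi, feature_weights), yi) for xi, yi in zip(X, y)
    )
    nearest = ranked[:k]
    # eps avoids division by zero when the query coincides with a training point.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * yi for w, (_, yi) in zip(weights, nearest)) / sum(weights)


def choose_k_loo(X, y, candidates, feature_weights):
    """Pick the candidate k with the lowest leave-one-out mean squared error."""
    def loo_error(k):
        total = 0.0
        for i in range(len(X)):
            # Hold out observation i and predict it from the rest.
            X_rest = X[:i] + X[i + 1:]
            y_rest = y[:i] + y[i + 1:]
            total += (knn_regress(X[i], X_rest, y_rest, k, feature_weights) - y[i]) ** 2
        return total / len(X)

    return min(candidates, key=loo_error)
```

Here `X` is a list of feature tuples and `y` a list of numeric outputs. The same leave-one-out loop could in principle be reused to tune `feature_weights`, though the abstract does not say how the authors carry out that optimization.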