To address the inaccurate results and poor real-time performance of archive business data push in colleges, a push system based on the B/S (browser/server) structure is designed using ASP.NET, ADO.NET, and related technologies. The system consists of a database management module, a management and maintenance module, and a query module. The database management module classifies and manages large-scale archive data using an archive business data classification method optimized on the basis of the initial cluster centers. The management and maintenance module handles the encryption and decryption of archive business data through key management and key verification. After the user logs in with a valid key, the query module uses an association rule-based push method to extract from the database the archive data that satisfies the strong weighted association rule conditions, returns it to the front-end browser interface, and thereby completes the push. In testing, the system's average accuracy over multiple archive data pushes exceeds 0.95, and the maximum time consumed by an archive business data push is reduced by 748 ms.
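The notion of a "strong" association rule in the abstract above can be illustrated with the usual support and confidence tests. This is only a minimal sketch: the abstract does not say how the rules are weighted, so weights are omitted, and the names `support`, `confidence`, `is_strong`, and the `QUERY_LOG` data are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the transaction list."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base > 0 else 0.0


def is_strong(transactions, antecedent, consequent, min_support, min_confidence):
    """A rule is 'strong' when it clears both user-chosen thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(transactions, both) >= min_support
            and confidence(transactions, antecedent, consequent) >= min_confidence)


# Hypothetical query log: each entry is the set of archive categories
# that one user opened in a session (invented data, for illustration only).
QUERY_LOG = [
    {"transcripts", "enrollment"},
    {"transcripts", "enrollment", "awards"},
    {"transcripts", "awards"},
    {"enrollment"},
]
```

With this toy log, the rule "transcripts implies enrollment" has support 0.5 and confidence 2/3, so whether it counts as strong depends entirely on the thresholds chosen.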
This paper presents a weighted support vector machine (WSVM) to reduce the outlier sensitivity of the standard support vector machine (SVM) in two-class data classification. The basic idea is to assign different weights to different data points so that the WSVM training algorithm learns the decision surface according to the relative importance of the data points in the training set. The weights used in WSVM are generated by a robust fuzzy clustering algorithm, the kernel-based possibilistic c-means (KPCM) algorithm, whose partition yields relatively high values for important data points and low values for outliers. Experimental results indicate that the proposed method reduces the effect of outliers and yields a higher classification rate than the standard SVM when outliers exist in the training data set.
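One common way to write a per-sample weighted soft-margin SVM is shown below. It is a sketch of the general idea, not necessarily the exact formulation of the paper: the symbols $s_i$ (weight of point $i$, for example a KPCM membership value), $C$, $\xi_i$, and $\phi$ are assumed notation.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{l} s_i\,\xi_i \\
\text{s.t.}\quad & y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1,\dots,l
\end{aligned}
```

Setting every $s_i = 1$ recovers the standard soft-margin SVM; a small $s_i$ makes a margin violation by point $i$ cheap, so a suspected outlier has less influence on the decision surface.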
An important task in data analysis is to identify correlation structures in unlabeled high-dimensional data and to classify that data, which usually requires iterative experiments on clustering parameters, attribute weights and instances. For a large dataset, the number of clusters may be huge, and exploring such a large space is a great challenge. People usually have an overall understanding of some of the data: for example, they may judge that data item A is better than data item B without knowing which attributes are important. A powerful interactive analysis tool can therefore greatly improve the effectiveness of exploratory clustering analysis. This paper provides a visual analysis method for sorting and classifying multivariate data. It determines the weight of each attribute through user interaction, generates a sorting from those weights, and then completes the classification according to the sorting results. Through visual display, users can understand the characteristics of the data and of each category intuitively and quickly, which helps them improve the sorting and classification results.
Cloud computing is an emerging technology that has attracted significant research attention. However, privacy is a major concern in the cloud, as the confidentiality of shared data must be managed. In earlier work, a privacy preservation model was developed using a newly designed Kronecker product based Bat algorithm. Here, that work is extended by developing a classification algorithm for the privacy preserved database. Initially, the Kronecker product based Bat algorithm derives the privacy preserved database from the original medical data. Then, ontology based features are extracted from the privacy preserved database and passed to the data classifier. A new classifier, named Whale based Sine Cosine Algorithm with Support Vector Neural Network (WSCA-SVNN), is developed for the data classification. The proposed WSCA algorithm optimally chooses the weights of the SVNN classifier, and the resulting WSCA-SVNN classifier classifies the medical data. The proposed privacy preserved data classification network is simulated using the heart disease database. The analysis shows that the proposed WSCA-SVNN classifier achieves an accuracy of 90.29% in medical data classification.
Most multi-label learning (MLL) techniques perform classification by analyzing only the physical features of the data, which means they are unable to consider high-level features, such as structural and topological ones. Consequently, they have trouble detecting the semantic meaning of the data (e.g., formation pattern). To handle this problem, a high-level framework has recently been proposed for the MLL task, in which the high-level features are extracted by analyzing complex network measures. In this paper, we extend that work by evaluating different combinations of four complex network measures, namely clustering coefficient, assortativity, average degree and average path length. Experiments conducted over seven real-world data sets showed that low-level techniques can often improve their predictive performance when combined with high-level ones, and also demonstrated that no single measure provides the best results, i.e., different problems may call for different network properties in order to have their high-level patterns efficiently detected.
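Two of the four network measures named above, average degree and clustering coefficient, are simple enough to compute by hand on a small graph. The sketch below assumes an undirected graph stored as a dictionary of neighbour sets; the function names and the toy `GRAPH` are hypothetical, and how the paper builds its networks from data is not covered here.

```python
def average_degree(adj):
    """Mean number of neighbours per node in an undirected graph."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair to check
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Toy undirected graph: a triangle a-b-c with a pendant node d attached to c.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
```

On this toy graph the average degree is 2.0, and the local clustering coefficients are 1, 1, 1/3 and 0 for a, b, c and d, giving an average of 7/12.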
Fifth-generation (5G) technology is expected to provide connectivity to billions of devices, known as the Internet of Things (IoT). The primary benefit of 5G is its high bandwidth, which can extend service beyond cell phones to standard internet service for conventionally fixed connections to homes, offices, factories, etc. However, IoT devices will unavoidably be a primary target of diverse kinds of cyberattacks, notably distributed denial of service (DDoS) attacks. Since conventional DDoS mitigation techniques are ineffective for 5G networks, machine learning (ML) approaches are helpful for achieving better security. With this motivation, this study addresses the network security issues posed by network devices in 5G networks and mitigates the harmful effects of DDoS attacks. This paper presents a new pigeon-inspired optimization-based feature selection with optimal functional link neural network (FLNN) model, PIOFS-OFLNN, for mitigating DDoS attacks in the 5G environment. The proposed PIOFS-OFLNN model aims to detect DDoS attacks through feature selection and classification, and it incorporates pre-processing, feature selection, classification, and parameter tuning. The PIOFS algorithm is employed to choose an optimal subset of features from the pre-processed data. The OFLNN-based classification model is then applied to detect DDoS attacks, with the Rat Swarm Optimizer (RSO) tuning the parameters of the FLNN model. FLNN is a higher-order neural network with low computational complexity: it has no hidden layers, and its input vector is functionally expanded to produce non-linear solutions. More details can be found in work applying nature-inspired methods to Odia handwritten numeral recognition. To validate the improved DDoS detection performance of the proposed model, a benchmark dataset is used.
Point-of-Sale (POS) data analysis is commonly used to explore sales performance in business commerce. This manuscript aims to combine unsupervised clustering and supervised classification methods in an integrated data analysis framework to analyze real-world POS data. A clustering method is applied to the sales dataset to group the stores into several clusters. The clustering results, used as data labels, are then combined with other information in the store features dataset as inputs to a classification model, which predicts the cluster labels from the store features. The non-dominated sorting genetic algorithm II (NSGA-II) is applied in the framework to handle the multiple objectives of clustering and classification. The experimental case study shows that the clustering results can reveal the hidden structure of the sales performance of retail stores, while classification can reveal the major factors that affect sales performance in different groups of retail stores. The correlations between sales clusters and store information can be obtained sequentially through a series of data analyses with the proposed framework.
Combinatorial metaheuristic optimization algorithms have recently become a prominent approach for handling real-world and engineering design optimization problems. In this paper, the Whale Optimization Algorithm (WOA) and the Woodpecker Mating Algorithm (WMA) are combined as HWMWOA. WOA is an effective algorithm with good global search ability and very few control parameters. However, WOA is prone to getting trapped in local optima and losing population diversity, and therefore suffers from premature convergence. The fundamental goal of the HWMWOA algorithm is to overcome these drawbacks of WOA. The improvement includes three basic mechanisms. First, a modified WMA position update equation with efficient exploration ability is embedded into HWMWOA. Second, a new self-regulation Cauchy mutation operator is added to the proposed hybrid method. Finally, an arithmetic spiral movement with a novel search guide pattern is used in the suggested HWMWOA algorithm. The efficiency of the suggested algorithm is evaluated on 48 test functions, and the optimal outcomes are compared with those of 15 of the most popular and newest metaheuristic optimization algorithms. Moreover, the HWMWOA algorithm is applied to simultaneously optimize the parameters of an SVM (Support Vector Machine) and the feature weighting for data classification on several real-world datasets from the UCI database. The outcomes show the superiority of the suggested hybrid algorithm over both WOA and WMA. In addition, the results show that the HWMWOA algorithm substantially outperforms other efficient techniques.
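A Cauchy mutation perturbs a candidate solution with heavy-tailed noise, so that occasional large jumps can help a population escape local optima. The sketch below shows a plain Cauchy mutation with a fixed `scale`; the self-regulation rule used in HWMWOA is not described in the abstract and is not reproduced here, and all names are hypothetical.

```python
import math
import random


def cauchy_from_uniform(u):
    """Map u in (0, 1) to a standard Cauchy variate via the inverse CDF."""
    return math.tan(math.pi * (u - 0.5))


def cauchy_mutate(position, scale, lower, upper, rng):
    """Add scaled Cauchy noise to each coordinate, then clamp to the bounds.

    `scale` is a fixed step size here (an assumption); a self-regulating
    variant would adapt it during the search.
    """
    mutated = []
    for x in position:
        step = scale * cauchy_from_uniform(rng.random())
        mutated.append(min(upper, max(lower, x + step)))
    return mutated


# Usage: mutate a 3-dimensional candidate inside the box [-5, 5]^3.
rng = random.Random(0)
candidate = [0.0, 1.0, -2.0]
child = cauchy_mutate(candidate, scale=0.1, lower=-5.0, upper=5.0, rng=rng)
```

Compared with Gaussian noise, the Cauchy distribution has no finite variance, which is why clamping to the search bounds is included in the sketch.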
The goal of this work is to define a notion of a “quantum neural network” to classify data, which exploits the low-energy spectrum of a local Hamiltonian. As a concrete application, we build a binary classifier, train it on some actual data and then test its performance on a simple classification task. More specifically, we use Microsoft’s quantum simulator, LIQUi|⟩, to construct local Hamiltonians that can encode trained classifier functions in their ground space, and which can be probed by measuring the overlap with test states corresponding to the data to be classified. To obtain such a classifier Hamiltonian, we further propose a training scheme based on quantum annealing which is completely closed-off to the environment and which does not depend on external measurements until the very end, avoiding unnecessary decoherence during the annealing procedure. For a network of size n, the trained network can be stored as a list of O(n) coupling strengths. We address the question of which interactions are most suitable for a given classification task, and develop a qubit-saving optimization for the training procedure on a simulated annealing device. Furthermore, a small neural network to classify colors into red versus blue is trained and tested, and benchmarked against the annealing parameters.
In human–robot interaction developments, detection, tracking and identification of moving objects (DATMO) constitute an important problem. More specifically, in mobile robots this problem becomes harder and more computationally expensive as the environments become dynamic and more densely populated. The problem can be divided into a number of sub-problems, which include the compensation of the robot's motion, measurement clustering, feature extraction, data association, targets' trajectory estimation and finally, target classification. Here, a mobile robot uses 2D laser range data to identify and track moving targets. A Joint Probabilistic Data Association with Interacting Multiple Model (JPDA-IMM) tracking algorithm associates the available laser data to track and provide an estimated state vector of targets' position and velocity. Potential moving objects are initially learned in a supervised manner and later on are autonomously classified in real-time using a trained Fuzzy ART neural network classifier. The recognized targets are fed back to the tracker to further improve the track initiation process. The resulting technique introduces a computationally efficient approach to already existing target-tracking and identification research, which is especially suited for real time application scenarios.
Data classification is one of the main tasks in the field of knowledge discovery in databases. Its goal is to allow the correct classification of new objects (records from a database), unknown to the classifier, based on knowledge extracted from objects whose classes are known a priori. The known data can be used to generate a classification model, or simply to infer the class of new objects from those whose classes are known. This paper proposes a classification algorithm, called the Constructive Particle Swarm Classifier (cPSClass), which uses mechanisms from the Particle Swarm Clustering algorithm and Artificial Immune Systems to determine dynamically the number of prototypes from a database and use them to predict the class to which a new input object should belong. For performance evaluation, cPSClass was applied to several datasets from the literature, and its performance was compared with that of its predecessor, the nonconstructive Particle Swarm Classifier, and with some classic algorithms from the literature.
In this paper, a Sugeno-type fuzzy system based on fuzzy clustering is developed for a variety of datasets. The number of rules for each dataset is based on the optimum number of clusters in that dataset. The rule sets provide the knowledge base for classifying the data. Each rule set is fine-tuned using the Grey Wolf Optimizer (GWO) with the aim of improving the classification. The approach is compared with previous work on similar data sets using a variety of techniques, including nature-inspired algorithms such as genetic algorithms and swarm-based algorithms. Statistical analysis of the performance of GWO shows that it is better than five other algorithms 95% of the time.
Data sets with imbalanced class sizes, where one class size is much smaller than that of others, occur exceedingly often in many applications, including those with biological foundations, such as disease diagnosis and drug discovery. Therefore, it is extremely important to be able to identify data elements of classes of various sizes, as a failure to do so can result in heavy costs. Nonetheless, many data classification procedures do not perform well on imbalanced data sets as they often fail to detect elements belonging to underrepresented classes. In this work, we propose the BTDT-MBO algorithm, incorporating Merriman–Bence–Osher (MBO) approaches and a bidirectional transformer, as well as distance correlation and decision threshold adjustments, for data classification tasks on highly imbalanced molecular data sets, where the sizes of the classes vary greatly. The proposed technique not only integrates adjustments in the classification threshold for the MBO algorithm in order to help deal with the class imbalance, but also uses a bidirectional transformer procedure based on an attention mechanism for self-supervised learning. In addition, the model implements distance correlation as a weight function for the similarity graph-based framework on which the adjusted MBO algorithm operates. The proposed method is validated using six molecular data sets and compared to other related techniques. The computational experiments show that the proposed technique is superior to competing approaches even in the case of a high class imbalance ratio.
By simulating humanlike stylistic classification behaviors, a novel design methodology called S2CM for stylistic data classification is developed in this study. The core of S2CM is to build a social network consisting of subnetworks corresponding to each data class in the training dataset, and then compute both the influence of each node and the authority of each subnetwork such that style information existing in the training dataset can be well expressed according to the philosophy of social networks. With the built social network, the prediction of S2CM for an unseen sample can be cheaply implemented. Experimental results on artificial and benchmarking datasets show that S2CM outperforms the comparison methods on stylistic data.
In classification and decision making, combining classifiers is a common approach, forming what is known as a classifier ensemble. The idea behind this approach is to improve classification accuracy through diversity. In these systems, perhaps the most important part is combining the outputs of the individual classifiers. However, most approaches in the literature use simple combination methods such as majority voting or weighted means. In this paper, we present new approaches for combining the outputs of classifiers in a classifier ensemble: Fuzzy Majority Voting and Fuzzy Plurality Voting, which are fuzzy counterparts of classical majority and plurality voting. The results obtained show that both are promising methods for use in these systems.
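For reference, the two classical combination rules that the fuzzy methods above generalize can be sketched as follows. Only the crisp (non-fuzzy) versions are shown, since the abstract does not define the fuzzy variants; the function names are hypothetical.

```python
from collections import Counter


def plurality_vote(labels):
    """Return the label predicted by the most classifiers.

    Ties resolve to the tied label that appears first in `labels`.
    """
    counts = Counter(labels)
    label, _ = counts.most_common(1)[0]
    return label


def majority_vote(labels):
    """Return the label chosen by more than half the classifiers, else None."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label if count > len(labels) / 2 else None
```

For the ensemble outputs `["a", "b", "a", "c"]`, plurality voting returns `"a"`, while majority voting returns `None` because no label is chosen by more than half of the classifiers.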
In this paper, we use pseudo gradient search to solve classification problems. In most classifiers, the goal is to reduce the misclassification rate, which is discrete. Because pseudo gradient search is a local search method, the objective function must be real-valued before the method can be applied to classification problems. A penalty technique is used for this purpose.
This paper presents a multi-class data classification approach based on hyper-boxes using a mixed integer linear programming (MILP) model. In contrast to other discriminant classifiers, hyper-boxes are adopted to capture the disjoint regions and define the boundaries of each class so as to minimise the total number of misclassified samples. Non-overlapping constraints are specified to prevent boxes that belong to different classes from overlapping. To improve training and testing accuracy, an iterative solution approach is presented that assigns multiple boxes to a single class. Finally, the applicability of the proposed approach is demonstrated through two illustrative examples from machine learning databases. According to the computational results, our approach is competitive in terms of prediction accuracy compared with various standard classifiers.
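Once hyper-boxes have been fitted, predicting the class of a new point reduces to a containment test. The sketch below covers only that prediction step with hand-picked boxes; the MILP that would actually place the boxes is not shown, and the names `in_box`, `classify`, and `BOXES` are hypothetical.

```python
def in_box(point, box):
    """True if `point` lies inside the axis-aligned box (lower, upper)."""
    lower, upper = box
    return all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))


def classify(point, boxes_by_class):
    """Return the class whose boxes contain `point`, or None if none do."""
    for label, boxes in boxes_by_class.items():
        if any(in_box(point, box) for box in boxes):
            return label
    return None


# Hand-picked, non-overlapping 2-D boxes (illustrative only, not MILP output).
# Class "A" uses two boxes to cover two disjoint regions.
BOXES = {
    "A": [((0.0, 0.0), (1.0, 1.0)), ((3.0, 3.0), (4.0, 4.0))],
    "B": [((1.5, 0.0), (2.5, 2.0))],
}
```

Because boxes of different classes do not overlap, at most one class can match; how a point that falls outside every box should be handled is left open here and simply returns `None`.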
We propose an information organizer for effective clustering and similarity-based retrieval of text and video data. Instead of supplying or authoring keywords, we use a vector space model and DCT image coding to extract the characteristics of the data. The data are clustered by Kohonen's self-organizing map, and the result is visualized in 3D form, which enables similarity-based retrieval. We implemented a prototype system and report experimental results. We consider that our system effectively promotes the reuse of distributed text and image data assets.
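In a vector space model, each text becomes a vector of term counts, and similarity is typically measured by the cosine of the angle between vectors. The following is a minimal sketch under that assumption: raw term frequencies with whitespace tokenization, not necessarily the weighting the authors used, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (lowercased, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Usage: compare two short texts that share one term.
doc1 = term_vector("self organizing map")
doc2 = term_vector("map data")
score = cosine_similarity(doc1, doc2)
```

Vectors of this kind could then be fed to a clustering method such as a self-organizing map, so that similar texts end up near each other.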