Epilepsy is a common neurological disorder that is characterized by the recurrence of seizures. Electroencephalogram (EEG) signals are widely used to diagnose seizures. Because of the non-linear and dynamic nature of EEG signals, it is difficult to decipher the subtle changes in these signals by visual inspection or with linear techniques. Therefore, non-linear methods are being researched to analyze EEG signals. In this work, we use the recorded EEG signals to construct Recurrence Plots (RPs), and extract Recurrence Quantification Analysis (RQA) parameters from the RPs in order to classify the EEG signals into normal, ictal, and interictal classes. A Recurrence Plot is a graph that shows all the times at which a state of the dynamical system recurs. Studies have reported significantly different RQA parameters for the three classes. However, more studies are needed to develop classifiers that use these promising features and achieve good classification accuracy in differentiating the three types of EEG segments. Therefore, in this work, we have used ten RQA parameters to quantify the important features in the EEG signals. These features were fed to seven different classifiers: Support Vector Machine (SVM), Gaussian Mixture Model (GMM), Fuzzy Sugeno Classifier, K-Nearest Neighbor (KNN), Naive Bayes Classifier (NBC), Decision Tree (DT), and Radial Basis Probabilistic Neural Network (RBPNN). Our results show that the SVM classifier was able to identify the EEG class with an average efficiency of 95.6%, and a sensitivity and specificity of 98.9% and 97.8%, respectively.
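To make the recurrence plot idea concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a one-dimensional signal with no time-delay embedding, a fixed distance threshold `eps`, and it computes only the recurrence rate, which is one common RQA parameter. The function names are hypothetical.

```python
def recurrence_matrix(signal, eps):
    """Binary recurrence matrix: entry (i, j) is 1 when samples i and j
    lie within eps of each other (assumption: no phase-space embedding)."""
    n = len(signal)
    return [
        [1 if abs(signal[i] - signal[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(matrix):
    """Fraction of recurrent points in the plot (one simple RQA measure)."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)


# Toy periodic signal: every other sample revisits the same state.
signal = [0.0, 1.0, 0.0, 1.0]
rp = recurrence_matrix(signal, eps=0.1)
rr = recurrence_rate(rp)
```

In practice the signal would usually be embedded in a higher-dimensional phase space before distances are computed, and further RQA measures such as determinism or laminarity would be derived from the diagonal and vertical line structures of the matrix.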
The goal of designing an ensemble of simple classifiers is to improve the accuracy of a recognition system. However, the performance of ensemble methods is problem-dependent and the classifier learning algorithm has an important influence on ensemble performance. In particular, base classifiers that are too complex may result in overfitting. In this paper, the performance of Bagging, Boosting and Error-Correcting Output Code (ECOC) is compared for five decision tree pruning methods. A description is given for each of the pruning methods and the ensemble techniques. AdaBoost.OC which is a combination of Boosting and ECOC is compared with the pseudo-loss based version of Boosting, AdaBoost.M2 and the influence of pruning on the performance of the ensembles is studied. Motivated by the result that both pruned and unpruned ensembles made by AdaBoost.OC give similar accuracy, pruned ensembles are compared with ensembles of Decision Stumps. This leads to the hypothesis that ensembles of simple classifiers may give better performance for some problems. Using the application of face recognition, it is shown that an AdaBoost.OC ensemble of Decision Stumps outperforms an ensemble of pruned C4.5 trees for face identification, but is inferior for face verification. The implication is that in some real-world tasks to achieve best accuracy of an ensemble, it may be necessary to select base classifier complexity.
Over the last decade, sentiment analysis of languages such as English and Chinese has received particular attention, whereas resource-poor languages such as Urdu, the focus of this research, have been mostly ignored by the research community. After acquiring data from various blogs covering about 14 different genres, the data are annotated with the help of human annotators. Three well-known classifiers, namely Support Vector Machine, Decision Tree and k-Nearest Neighbor (k-NN), are tested, their outputs are compared, and their results are improved over several iterations through a number of steps that include stop-word removal, feature extraction, and the identification and extraction of important features. Initially, the performance of the classifiers is not satisfactory, as the accuracy achieved by all three is below 50%. An ensemble of classifiers is also tried, but it does not yield high accuracy. The results are analyzed carefully and improvements are made, including feature extraction, that raise the performance of these classifiers to a satisfactory level. It is further concluded that k-NN performs better than Support Vector Machine and Decision Tree in terms of accuracy, precision, recall and f-measure.
A practically deployable gesture recognition system is developed. It combines a robust hand detection method, implemented with a motion-based image segmentation process and a two-level bare-hand classification model, with a classification system for 58 gestures that uses new robust features. Since detection of the bare hand is affected by nonideal conditions, multiple color-texture features are analyzed in this study. In the second stage of the system, 18 new ASCII characters are introduced and analyzed along with the existing 40 characters (alphabets, numbers, and arithmetic operators). New 15-dimensional features are introduced along with the existing features to enhance the classification accuracy of the gestures. The significance of the features is statistically tested using one-way analysis of variance (ANOVA), Kruskal–Wallis and Friedman tests; the features are then sequentially ranked and evaluated using the incremental feature selection (IFS) method. The performance of the proposed hand detection system is observed to be 12.5% higher than that of the existing hand detection system under clean conditions and 46.4% higher under nonideal conditions. The performance of the 58-gesture classification model improved by 12.08% (Naïve Bayes), 8.86% (ELM), 10.83% (SVM), 8.02% (kNN), and 6.61% (ANN) after using the new features. A majority voting-based classifier fusion method further improves the performance of the gesture recognition system by 3.88%, which is validated by Tukey’s HSD test.
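Majority voting-based fusion can be illustrated with a short Python sketch. This is a generic illustration, not the fusion rule used in the paper. It assumes each classifier outputs one hard label per sample, and it breaks ties in favour of the classifier listed first. The name `majority_vote` and the toy labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse hard labels from several classifiers.

    predictions: list of per-classifier label lists, all of equal length.
    Ties go to the label seen first (assumption), i.e. the vote of the
    earliest classifier among the tied labels.
    """
    fused = []
    for votes in zip(*predictions):  # votes for one sample across classifiers
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Three hypothetical classifiers labelling three gesture samples.
clf_outputs = [
    ["A", "B", "C"],
    ["A", "B", "B"],
    ["B", "B", "C"],
]
fused_labels = majority_vote(clf_outputs)
```

Here each sample receives the label chosen by at least two of the three classifiers, so a single classifier's error on a sample is outvoted.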
The advancement of healthcare technology is impossible without machine learning (ML). There have been numerous advances in ML to analyze, predict, and diagnose medical data. Integrating a centralized scheme and therapy for classifying and diagnosing illnesses and disorders is a major obstacle in modern healthcare. To standardize all medical data into a single repository, researchers have proposed ML using the centralized artificial neural network model (ML-CANNM). Random tree, support vector machine, and gradient booster are just a few of the proposed ML classifiers. Artificial neural networks (ANNs) have been trained using a variety of medical datasets to predict and analyze outcomes. ML-CANNM collects patient data from various studies and uses ML and ANNs to determine the results. Three layers make up an ANN. In the input layer, ML is used to classify the given patients’ data. In the hidden layer, classification data are compared to a training dataset. The output layer’s job is to identify, classify, and diagnose diseases. As a result, disease diagnosis and detection are integrated into a single healthcare database. The proposed framework shows that ML-CANNM works with higher accuracy and shorter execution time. The numerical results suggest that ML-CANNM achieves an accuracy ratio of 99.2% and a prediction ratio of 97.5%. The findings further show that the execution time is enhanced by less than 2h, decision table using ML and results in an efficiency ratio of 97.5%.
Piecewise-linear mathematical structures form a convenient and important framework for implementing trainable and adaptive pattern classifiers. Neural networks and genetic algorithms offer additional approaches with important benefits for the design of such classifiers. In this paper we show how neural modeling and genetic selection can be applied to piecewise-linear structures to optimize both the topology and the parameter values of the network forming the classifier. Such a classifier will tend to have a low error rate and high robustness. We describe applications of these techniques to an adaptive detector of abnormal tissue in mammograms and a detector of straight lines and edges in noisy aerial images.
Infertility is becoming a public health issue in almost all countries. Assisted Reproductive Technology (ART) is considered a method of last resort for treating infertility. ART treatment is highly expensive and painful, and the probability of success is low because success is affected by a large number of variables. Researchers are now trying to identify patterns comprising significant variables, their impact on success, and the interdependence of different variables, in order to assess the status of the patient and to help doctors and biologists prescribe treatment that improves the probability of success of ART. Machine learning is a tool used by various researchers in the field of ART to identify the links between the variables. The objective of this review paper is to survey the application of machine learning techniques in ART and to identify the improvements needed in future research. From the literature, it is found that some research works have used machine learning techniques to predict ART outcome. On analyzing the reviews qualitatively and quantitatively, it is understood that various classifiers are used for ART outcome prediction, but they are trained using a limited amount of static data collected from fertility centers. The prediction of ART outcome may be improved by training the classifier with a large amount of dynamic data. However, building such a classifier is difficult with existing techniques. This may be made possible by introducing Big Data Analytics in ART.
Osteoporosis is a disease of the bones that leads to an increased risk of fracture and is characterized by low bone mineral density and micro-architectural deterioration of bone tissue. In this article, the dataset consists of 3426 subjects (1083 pathological and 2343 healthy cases) whose diagnosis was based on laboratory and osteal bone densitometry examination. In all cases, four diagnostic factors for osteoporosis risk prediction, namely age, sex, height and weight, were stored for later evaluation with the selected classifiers. In order to categorize subjects into two classes (osteoporosis, non-osteoporosis), twenty machine learning techniques were assessed, chosen for their popularity and frequency of use in biomedical engineering problems. All classifiers have been evaluated using the well-known 10-fold cross-validation method and the results are reported analytically. In addition, a feature selection method identified that similar performance could be achieved with only two diagnostic factors (age and weight). The aim of the proposed exhaustive methodology is to assist therapists in osteoporosis prediction, avoiding unnecessary further testing with bone densitometry.
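As an illustration of how 10-fold cross-validation partitions a dataset of this size, the following Python sketch splits sample indices into folds. It is a generic sketch, not the evaluation code used in the article. It assumes contiguous, unshuffled and unstratified folds, and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs.

    Folds are contiguous and unshuffled (assumption). The first
    n_samples % k folds receive one extra sample each.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# 3426 subjects, as in the abstract, split into 10 folds.
folds = k_fold_indices(3426, 10)
test_sizes = [len(test) for _, test in folds]
```

Each subject appears in exactly one test fold, so every classifier is trained ten times on roughly 90% of the data and tested on the remaining 10%. With an imbalanced dataset such as this one, shuffling and stratifying by class before splitting would be a common refinement.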
One of the most sought-after research areas in object detection is pedestrian detection, owing to its applications, especially in automated surveillance and robotics. Traditional methods use hand-crafted features to characterize pedestrians. In this work, we have proposed a new hand-crafted feature extraction method that concatenates shape, color and texture features, which are then classified using a Support Vector Machine (SVM). In recent years, deep learning models such as Convolutional Neural Networks (CNNs) have become the state of the art in detection challenges and, unlike manually designed feature extraction mechanisms, yield higher accuracy. Therefore, we have also proposed a CNN, a modification of the pre-trained ResNet-18 named Multi-layer Feature Fused-ResNet (MF2-ResNet). We have used the proposed modification for (1) feature extraction, followed by classification using an SVM; (2) end-to-end feature extraction and classification by the CNN; and (3) an MF2-ResNet based Faster R-CNN that includes region proposals in the detection pipeline. To evaluate the proposed method, it is compared with existing pre-trained CNNs. The MF2-ResNet based Faster R-CNN is compared with state-of-the-art detection methods. Three benchmark pedestrian datasets are used in this work: INRIA, NICTA and Daimler.
Epilepsy is a common neurological disorder characterized by recurrent seizures. Alcoholism causes organic changes in the brain, resulting in seizure attacks similar to epileptic fits. Hence, it is challenging to determine whether the cause of fits is epileptic or alcoholic, which is important for deciding on the treatment in the neurology ward. The focus of this paper is to automatically differentiate epileptic, normal, and alcoholic electroencephalogram (EEG) signals. As EEG signals are non-linear and dynamic in nature, it is difficult to detect the subtle changes in these signals with linear techniques or by the naked eye. Therefore, to analyze the normal (control), epileptic, and alcoholic EEG signals, two non-linear methods, recurrence plots (RPs) and recurrence quantification analysis (RQA), are adopted. Approximately 10 RQA parameters have been used to classify the EEG signals into three distinct classes, i.e., normal, epileptic, and alcoholic. Six classifiers, namely support vector machine (SVM), radial basis probabilistic neural network (RBPNN), decision tree (DT), Gaussian mixture model (GMM), k-nearest neighbor (kNN), and fuzzy Sugeno classifiers, have been developed to accomplish this task. Results show that the GMM classifier outperformed the other classifiers with a classification sensitivity of 99.6%, specificity of 98.3%, and accuracy of 98.6%.
Owing to the large number of professional glossaries and unknown patent classifications, analysts often fail to collect and analyze patents efficiently. One solution to this problem is to conduct patent analysis using a patent classification system. However, in a corpus such as cloud patents, many keywords are common among different classes, making it difficult to classify documents of unknown class using the machine learning techniques proposed by previous studies. To remedy this problem, this study aims to establish an efficient classification system with a special focus on feature extraction and the application of extension theory. We first propose a compound method to determine the features, and then propose an extension-based classification method to develop an efficient patent classification system. Using cloud computing patents as the database, the experimental results show that our proposed scheme can outperform the classification quality of traditional classifiers.
The aim of this paper is to develop a model to classify the stance expressed in social media texts. More specifically, the work presented focuses on tweets. In stance detection (SD) tasks, the objective is to identify the stance of a person towards a target of interest. In this paper, a model for SD is established and its variations are evaluated using different classifiers. The single models differ in their pre-processing and in the combination of features. To reduce the dimensionality of the feature space, an analysis of variance (ANOVA) test is used. Then, two classifiers, Random Forests (RF) and Support Vector Machines (SVM), are employed as base learners. Experimental analyses are conducted on the SemEval dataset, which is used as a benchmark for SD. Finally, the base learners that resulted from the different design alternatives are combined into three ensemble models. Experimental results show the significance of the features used and the effectiveness of a manually built dictionary that is used in the pre-processing stage. Moreover, the proposed ensembles outperform the state-of-the-art models in the overall test score, which suggests that ensemble learning is the best tool for effective SD in tweets.
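The abstract does not say how the ANOVA test is applied, so the following Python sketch only shows the general idea. It computes a one-way ANOVA F-statistic for a single feature from its values grouped by class. Features with a larger F separate the classes better and would be kept first. The function name and the toy values are hypothetical, and the sketch assumes a non-zero within-class variance.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature.

    groups: list of lists, each holding the feature values of one class.
    F = (between-class sum of squares / (k - 1))
        / (within-class sum of squares / (N - k)).
    Assumes the within-class sum of squares is non-zero.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical feature values for two stance classes.
f_weak = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])    # overlapping classes
f_strong = anova_f([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]])  # well-separated classes
```

Ranking all features by this statistic and keeping the top-scoring ones is one common way to reduce dimensionality before training the base learners.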
Associative classifiers are a new classification approach that uses association rules for classification. An important advantage of these classification systems is that, using association rule mining (ARM), they are able to examine several features at a time. Many applications can benefit from a good classification model. Associative classifiers are especially suited to applications where the model may assist domain experts in their decisions. Medical diagnosis is a domain where the maximum accuracy of the model is desired. In this paper, we propose a framework, the weighted associative classifier (WAC), that assigns different weights to different attributes according to their predictive capability. We use maximum likelihood estimation (MLE) theory to calculate the weight of each attribute from training data. We also show how the existing Apriori algorithm can be modified in a weighted environment to infer association rules from a medical dataset with numeric-valued attributes, as conventional ARM usually deals with transaction databases with categorical values. Experiments have been performed on a benchmark data set to evaluate the performance of WAC in terms of accuracy, the number of rules generated and the impact of the minimum support threshold on WAC outcomes. The results reveal that WAC is a promising alternative in medical prediction and certainly deserves further attention.
Early detection of breast abnormalities remains the primary prevention against breast cancer despite the advances in breast cancer diagnosis and treatment. The presence of a mass in breast tissue is highly indicative of breast cancer. The research work presented in this paper investigates the significance of different types of features, using a proposed neural network based classification technique to classify mass-type breast abnormalities in digital mammograms as malignant or benign. Fourteen gray-level-based features, four BI-RADS features, a patient age feature and a subtlety value feature have been explored using the proposed research methodology to attain the maximum classification rate on the test dataset. The proposed technique attained a 91% testing classification rate with a 100% training classification rate on digital mammograms taken from the DDSM benchmark database.
This paper presents a novel technique for the classification of suspicious areas in digital mammograms. The proposed technique is based on clustering the input data into numerous clusters and amalgamating them with a Support Vector Machine (SVM) classifier. The technique is called multi-cluster support vector machine (MCSVM) and is designed to provide a fast-converging technique with good generalization abilities, leading to improved classification into the benign or malignant class. The proposed MCSVM technique has been evaluated on data from the Digital Database for Screening Mammography (DDSM) benchmark database. The experimental results showed that the proposed MCSVM classifier achieves better results than the standard SVM. A paired t-test and ANOVA analysis showed that the results are statistically significant.
Nutrition diagnosis plays a key role in crop growth and has mainly been carried out in the field by agricultural workers. Currently, automatic nutrition recognition technologies are widely used in this field. This paper proposes a procedure, based on multifractal theory, for the non-destructive qualitative diagnosis of nitrogen nutrition in rapeseed. Twelve texture parameters are obtained by multifractal detrended fluctuation analysis (MF-DFA), comprising six generalized Hurst exponents and six related multifractal parameters, which are used as features of the rapeseed leaf images for identifying two nitrogen levels, namely N-mezzo and N-wane. For the base leaves, central leaves and top leaves of the rapeseed plant, and for the three-section mixed samples, three parameter combinations are selected to conduct the work. Five classifiers, namely Fisher’s linear discriminant algorithm (LDA), extreme learning machine (ELM), support vector machine and kernel method (SVMKM), random decision forests (RF) and K-nearest neighbor algorithm (KNN), are employed to calculate the diagnostic accuracy. An interesting finding is that the best diagnostic accuracy is obtained from the base leaves of the rapeseed plant, which is explained by the base leaf being the most sensitive to nitrogen deficiency. The diagnostic performance on the base-leaf samples significantly exceeds existing results for the same leaf samples. For the mixed samples, the averaged discriminant accuracy reaches 97.12% and 97.56% with the SVMKM and RF methods, respectively, under 10-fold cross-validation. The resulting high accuracy in N-level identification shows the feasibility and efficiency of our method.
Support Vector Machine (SVM) methods have become a popular tool for predictive data mining problems and novelty detection. They show good generalization performance on many real-life datasets and are motivated theoretically through convex programming formulations. There are relatively few free parameters to adjust using cross-validation, and the architecture of the SVM learning machine does not need to be found by experimentation as in the case of Artificial Neural Networks (ANNs). We discuss the fundamentals of SVMs with emphasis on multiclass classification problems and applications in science, business and engineering.
We propose a method for fuzzy rule generation directly from numerical data for designing classifiers. First, a fuzzy partition is imposed on the domain of each feature, which results in a set of fuzzy values for each feature. Then a descriptor-pattern table is constructed using the training data and the fuzzy feature values. Rules are then discovered from the descriptor-pattern table. The rule generation process finds the distinct descriptors to discover simple rules and, if required, generates further rules using a conjunction of common descriptors or a conjunction of common descriptors and the negation of distinct descriptors. A rule minimization process is then initiated to retain a small set of rules adequate to learn the training data. We suggest three possible schemes for generating the initial fuzzy partitioning of the feature space, and a genetic algorithm based tuning scheme is used to refine the rule base. Finally, the proposed scheme is tested on some real data. Unlike most classifiers, the proposed method can detect ambiguous data and declare them to be unclassified; this is a distinct advantage.
This work investigated the ability of hyperspectral imagery to identify Agathosma (A) Betulina and Agathosma (A) Crenulata plants. The plants have been used as traditional medicines to treat conditions such as urinary tract infections, stomach complaints and kidney diseases, for washing and cleaning wounds, and for symptomatic relief of rheumatism. The species are normally identified on the basis of their leaf shapes: A. Betulina has round leaves, while A. Crenulata has oval leaves. Recognition based on morphology is no longer adequate because of extensive cultivation; new hybrids now exist whose leaves are not easily separable. The study proposed implementing the Local Polynomial Approximation (LPA) algorithm and Principal Component Analysis (PCA) for the data processing. The data generated by the two methods are subjected to classification procedures for plant identification. Various classifiers were used for the data separation. The results obtained reveal that most of the classifiers performed better on LPA-processed data than on PCA-processed data.