To address the high uncertainty and dynamic nature of coal transportation costs and to reduce the subjectivity of manual prediction, a method for coal transportation cost prediction under mixed uncertainty based on a graph attention network (GAT) is proposed. The analytic hierarchy process (AHP) is used to build the factor model of coal transportation cost prediction, calculate the weight coefficient of each factor, and clarify the relative importance of each factor in the prediction process. Principal component analysis (PCA) is used to reduce the dimensionality of the various factors affecting coal transportation cost, eliminating noise and redundant information in the data while retaining the main informative features. The processed data set is input into the STDGAT model, which combines the GAT with a long short-term memory (LSTM) network. The GAT extracts the spatial correlation features in coal transportation demand, and the LSTM captures the dynamic characteristics in the time dimension, realizing joint spatio-temporal prediction of the coal transportation cost. The experimental results show that the predictions closely track the actual cost, and the error remains low across time periods, across short-, medium-, and long-distance transportation, and across transportation modes. By varying the number of attention heads, it was found that the model performed best with 6 attention heads.
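The GAT component can be illustrated in isolation. Below is a minimal single-head graph-attention layer in NumPy, following the standard GAT formulation (projected features, per-edge attention scores, LeakyReLU, adjacency masking, softmax aggregation). It is a sketch of the generic mechanism only; the STDGAT architecture, its LSTM branch, and all function names here are our assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, stable against large entries.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def gat_layer(H, A, W, a, leaky=0.2):
    # One graph-attention head: project node features, score each edge
    # with a shared attention vector, mask by adjacency, normalise,
    # then aggregate neighbour features.
    Z = H @ W                               # (n, f') projected features
    f = Z.shape[1]
    src = (Z @ a[:f])[:, None]              # a^T [z_i || z_j], split in two halves
    dst = (Z @ a[f:])[None, :]
    e = src + dst
    e = np.where(e > 0, e, leaky * e)       # LeakyReLU on attention scores
    e = np.where(A > 0, e, -np.inf)         # attend only to graph neighbours
    alpha = softmax(e)                      # attention coefficients per node
    return alpha @ Z                        # weighted neighbour aggregation
```

Each output row is a convex combination of the projected features of that node's neighbors, which is what lets the model pool spatially correlated demand signals.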
Over the past decades, the number of people affected by diabetes, a chronic illness, has increased markedly. Early prediction of diabetes is still a challenging problem, as it requires clear and sound datasets for precise prediction. In this era of ubiquitous information technology, big data helps to collect a large amount of information regarding healthcare systems. Due to the explosion in the generation of digital data, selecting appropriate data for analysis remains a complex task. Moreover, missing values and insignificantly labeled data restrict prediction accuracy. In this context, with the aim of improving the quality of the dataset, missing values are effectively handled in three major phases: (1) pre-processing, (2) feature extraction, and (3) classification. Pre-processing involves outlier rejection and filling of missing values. Feature extraction is done by principal component analysis (PCA), and finally, precise prediction of diabetes is accomplished by an effective distance adaptive-KNN (DA-KNN) classifier. The experiments were conducted on the Pima Indian Diabetes (PID) dataset, and the performance of the proposed model was compared with state-of-the-art models. The analysis shows that the proposed model outperforms conventional models such as NB, SVM, KNN, and RF in terms of accuracy and ROC.
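The PCA-then-KNN core of such a pipeline can be sketched as follows. The abstract does not specify the DA-KNN's exact distance adaptation, so this sketch substitutes a generic inverse-distance-weighted KNN; the function names and the SVD-based PCA fit are our assumptions.

```python
import numpy as np

def pca_fit(X, n_components):
    # Fit PCA via SVD of the centred data; return mean and components.
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_project(X, mean, components):
    # Project (possibly new) samples onto the fitted components.
    return (X - mean) @ components.T

def weighted_knn_predict(X_train, y_train, x, k=3, eps=1e-9):
    # Distance-weighted vote: closer neighbours count more.
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + eps)
    votes = {}
    for label, weight in zip(y_train[idx], w):
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)
```

Test samples must be projected with the mean and components fitted on the training data, never refitted per sample.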
We present a noise-robust PCA algorithm that extends the Oja subspace algorithm and allows tuning of the noise sensitivity. We derive a loss function that is minimized by this algorithm and interpret it in a noisy PCA setting. Results on the local stability analysis of this algorithm are given, and it is shown that the locally stable equilibria are those that minimize the loss function.
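For reference, the plain Oja subspace algorithm that this work extends can be sketched as below. The paper's noise-sensitivity tuning parameter is not included, since the abstract does not specify its form; this is only the classical baseline rule.

```python
import numpy as np

def oja_subspace(X, k, eta=0.01, epochs=20, seed=0):
    # Oja's subspace learning rule: W is updated so that its rows
    # converge to an orthonormal basis of the k-dimensional
    # principal subspace of the data.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((k, X.shape[1])) * 0.1
    for _ in range(epochs):
        for x in X:
            y = W @ x                                   # project sample
            W += eta * (np.outer(y, x) - np.outer(y, y) @ W)
    return W
```

For k=1 this reduces to Oja's single-neuron rule, whose fixed points are the eigenvectors of the data covariance.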
The development of efficient stroke-detection methods is of significant importance in today's society due to the effects and impact of stroke on health and the economy worldwide. This study focuses on Human Activity Recognition (HAR), a key component in developing an early stroke-diagnosis tool. The proposed global approach, able to discriminate normal resting from stroke-related paralysis, is detailed. The main contributions include an extension of the Genetic Fuzzy Finite State Machine (GFFSM) method and a new hybrid feature selection (FS) algorithm involving Principal Component Analysis (PCA) and a voting scheme that combines the cross-validation results. Experimental results show that the proposed approach is a well-performing HAR tool that can be successfully embedded in devices.
The commercial quality of Japanese Angelica radices — Angelica acutiloba Kitagawa (Yamato-toki) and A. acutiloba Kitagawa var. sugiyama Hikino (Hokkai-toki) — used in Kampo traditional herbal medicines was studied using omics technologies. Complementary and alternative medicine providers have observed in their clinical experience that differences in radix commercial quality are reflected in differences in pharmacological responses; however, there has been little scientific examination of this phenomenon. The omics approach, including metabolomics, transcriptomics, genomics, and informatics, revealed a distinction between the radix-quality grades based on their metabolites, gene expression in human subjects, and plant genome sequences. Systems biology, which constructs a network of omics data to analyze this complex system, is expected to be a powerful tool for enhancing the study of radix quality and furthering a comprehensive understanding of all medicinal plants.
This paper presents an integrated approach for assessing export performance based on principal component analysis (PCA) and Numerical Taxonomy (NT). The integrated assessment is driven by shaping factors such as export value, production value, export growth, R&D expenditure, and value added. Iranian chemical industries are selected as a case study, following the format of the International Standard Industrial Classification (ISIC), over a five-year period. The modeling approach of this paper could be used to analyze other sectors and countries. This study shows how total export efficiency is obtained through the proposed approach, whereas previous studies considered a conventional productivity approach based on a single indicator.
Different eigenspace-based approaches have been proposed for the recognition of faces. They differ mostly in the kind of projection method used and in the similarity matching criterion employed. The aim of this paper is to present a comparative study of some of these approaches. The study considers theoretical aspects as well as experiments performed on a face database with a small number of classes (Yale) and one with a large number of classes (FERET).
Computer vision systems for monitoring people and collecting valuable demographic information in a social environment constitute an important research problem. It is expected that such systems will play an increasingly important role in enhancing the user's experience and can significantly improve the intelligibility of a human-computer interaction (HCI) system. For example, a robust gender classification system can provide a basis for passive surveillance and access to a smart building using demographic information, or can provide valuable consumer statistics in a public place. The option of an audio cue in addition to the visual cue promises a robust solution with high accuracy and ease of use in human-computer interaction systems.
This paper investigates gender classification using Support Vector Machines (SVMs). Visual (thumbnail frontal face) and audio (features from speech data) cues were considered for designing the classifier. Three representations of the data, namely raw data, principal component analysis (PCA), and non-negative matrix factorization (NMF), were used in the experiments with the visual signal. For speech, mel-cepstral coefficients and pitch were used. The best overall classification rates obtained using the SVM for the visual and speech data were 95.31% and 100%, respectively, on a dataset collected in a laboratory environment. The performance of the SVM was compared with two simple classifiers, namely the nearest prototype neighbor and the k-nearest neighbor, on all feature sets; the SVM outperformed both on all datasets. To further examine robustness, the proposed approach was applied to a large balanced database (roughly equal distribution of gender, ethnicity, and age group) consisting of 8000 faces collected in a real-world environment. While the results are very promising, they indicate that more remains to be done before a statistically meaningful conclusion can be drawn.
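Of the three representations compared, NMF is the least standard; a minimal sketch of it, using the classical Lee–Seung multiplicative updates, is shown below. This is the generic algorithm only, with our own function names; the abstract does not state which NMF variant or rank the authors used.

```python
import numpy as np

def nmf(V, r, n_iter=500, seed=0):
    # Lee & Seung multiplicative updates minimising ||V - W H||_F,
    # keeping both factors entrywise non-negative throughout.
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + 0.1
    H = rng.random((r, m)) + 0.1
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)   # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)   # update basis
    return W, H
```

The columns of W act as non-negative, parts-based basis images, which is why NMF is often contrasted with the holistic eigenfaces produced by PCA.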
A two-stage face recognition method is presented in this paper. In the first stage, the set of candidate patterns is narrowed down, taking global similarity into account. In the second stage, a synergetic approach is employed to perform further recognition. The face image is segmented into meaningful regions, each of which is represented as a prototype vector. The similarity between a given region of the test pattern and a stored sample is obtained as the order parameter, which serves as an element of the order vector. Finally, a modified definition of the potential function is given, and the dynamic model of recognition is derived from it. The effectiveness of the proposed method is experimentally confirmed.
Automatic face recognition is becoming increasingly important due to the security applications derived from it. Although facial recognition research has focused on 2D images, the proliferation of 3D scanning hardware has recently made 3D face recognition a feasible application. This 3D approach does not need any color information and thus has two main advantages over more traditional 2D approaches: (1) robustness under lighting variations and (2) more relevant information. In this paper we present a new 3D facial model based on the curvature properties of the surface. Our system is able to select, from a large set, the subset of facial characteristics with the highest discrimination power. The robustness of the model is tested by comparing recognition rates in controlled and uncontrolled environments with respect to facial expressions and rotations. The difference of only 5% between the recognition rates of the two environments shows that the model is highly robust against pose and facial expressions. We consider this robustness sufficient for facial recognition applications, which achieve up to a 91% correct recognition rate. A public 3D face database containing face rotations and expressions has been created for the recognition experiments.
Facial expression recognition is one of the most challenging research areas in the image recognition field and has been actively studied since the 1970s. Smile recognition, for instance, has been studied because the smile is considered an important facial expression in human communication and is therefore likely to be useful for human–machine interaction. Moreover, if a smile can be detected and its intensity estimated, new applications become possible: quantifying the emotion at low computation cost and high accuracy. To this end, we use a new support vector machine (SVM)-based approach that integrates a weighted combination of local binary pattern (LBP)- and principal component analysis (PCA)-based approaches. Furthermore, we construct the smile detector considering the evolution of the emotion along its natural life cycle. As a consequence, we achieve both low computation cost and high performance on video sequences.
To improve accuracy, reduce time consumption, and obtain the number of faults, a fault detection method based on AP (affinity propagation) clustering and PCA (principal component analysis) is proposed. First, discontinuous points in seismic horizons are found by the connected-component labeling method. Second, the AP clustering algorithm is used to cluster the discontinuous points; the points of the same cluster determine one fault, and, at the same time, the faults present in a seismic section are counted. Finally, PCA is used to calculate the principal direction of the discontinuous points in each cluster. The cluster center and the principal direction together determine a straight line, and the segment intercepted by the cluster's edge is the sought fault. The proposed method reduces the time consumption of the correlation calculation of the traditional method, simplifies the computation, and yields the number of faults in the seismic section. To confirm the feasibility and advantages of the proposed method, comparative experiments were conducted on seismic model data and a real seismic section. The results show that the proposed method is more accurate and greatly reduces the time cost.
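The final PCA step, fitting a line through one cluster of discontinuous points, amounts to a centroid plus the leading eigenvector of the cluster covariance. A minimal sketch, with our own function name, is:

```python
import numpy as np

def principal_direction(points):
    # Centroid plus leading eigenvector of the covariance matrix:
    # together they define the straight line through the cluster,
    # i.e. the orientation of the candidate fault.
    c = points.mean(axis=0)
    cov = np.cov((points - c).T)
    vals, vecs = np.linalg.eigh(cov)
    direction = vecs[:, np.argmax(vals)]  # direction of maximal spread
    return c, direction
```

The line through `c` along `direction`, clipped to the cluster's extent, is what the method reports as a fault segment.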
In this paper, we propose a prior fusion and feature transformation-based principal component analysis (PCA) method for saliency detection. It relies on the inner statistics of the patches in the image to identify unique patterns, and all the processing is done only once. First, three low-level priors are incorporated and act as guidance cues in the model; second, to ensure the validity of the PCA distinctness model, a linear transform of the feature space is designed and trained; furthermore, an extended optimization framework is used to generate a smoothed saliency map based on the consistency of adjacent patches. We compare three versions of our model with seven previous methods on several benchmark datasets. Various evaluation strategies are adopted, and the results demonstrate that our model achieves state-of-the-art performance.
A new method for multi-fault condition monitoring of slurry pumps based on principal component analysis (PCA) and the sequential probability ratio test (SPRT) is proposed. The method identifies the condition of the slurry pump by analyzing the vibration signal. The experimental model is established using a normal impeller and faulty impellers, and the collected vibration signals are preprocessed using the wavelet packet transform (WPT). The characteristic parameters of the vibration signals are extracted by time-domain signal analysis, and the dimensionality of the data is reduced by PCA. The principal components with the largest contribution rates are chosen as the input to the SPRT to assess the proposed algorithm. The new methodology is reasonable and practical for multi-fault diagnosis of slurry pumps.
In this paper, a software toolchain is presented for the fully automatic alignment of a 3D human face model. Starting from a point cloud of a human head (previously segmented from its background), pose normalization is obtained using an innovative, purely geometrical approach. To resolve the six degrees of freedom of this problem, we first exploit the human face's natural mirror symmetry; second, we analyze the frontal profile shape; and finally, we align the model's bounding box according to the position of the tip of the nose. The whole procedure is cast as a two-fold, multivariable optimization problem, addressed by multi-level genetic algorithms and a greedy search stage, the latter being compared against standard PCA. Experiments were conducted on the GavabDB database and included proper preprocessing stages for noise filtering and head-model reconstruction. The results demonstrate the validity of this approach, albeit at the price of high computational complexity.
The performance of the landing gear retraction mechanism in an aircraft directly affects its safe operation. It is therefore important to analyze and evaluate the mechanism's comprehensive performance during the design process. Multiple individual kinematic and dynamic performance indexes of the landing gear retraction mechanism can be computed with CAD/CAE software. Weighting factors for each individual performance index, obtained by the expert investigation method, are used to distinguish their different contributions to the comprehensive evaluation. Combining a priori information about the mechanism, the comprehensive performance of the landing gear retraction mechanism can be analyzed by the Relative Principal Component Analysis (RPCA) method, and the scale of the landing gear retraction mechanism with the best comprehensive performance can be effectively selected. Furthermore, RPCA can also provide a scientific reference for the optimization design of the landing gear retraction mechanism.
With the generation and analysis of Big Data following the development of various information devices, older data processing and management techniques reveal their hardware and software limitations. The hardware limitations can be overcome by CPU and GPU advancements, but the software limitations depend on those hardware advances. This study therefore addresses the increasing analysis costs of dense Big Data from a software perspective instead of depending on hardware. A modified K-means algorithm with ideal points is proposed to address the analysis-cost issue of dense Big Data. The proposed algorithm finds an optimal cluster by applying Principal Component Analysis (PCA) to the multi-dimensional structure of dense Big Data and categorizes the data with the predicted ideal points as the initial cluster centers. Its clustering validity index and F-measure results were compared with those of existing algorithms and found to be similar. It was also compared with data classification techniques from previous studies, yielding an improvement of about 3–6% in analysis costs.
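The idea of PCA-informed seeding can be sketched as below: spread the initial centers along the first principal axis rather than choosing them at random. The abstract does not define how the "ideal points" are predicted, so the quantile-based seeding here is a stand-in assumption, and all names are ours.

```python
import numpy as np

def pca_seeded_kmeans(X, k, n_iter=50):
    # Seed centres by spreading them along the first principal axis --
    # a stand-in for the paper's "ideal points" (exact construction
    # is not given in the abstract).
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = (X - mean) @ Vt[0]                       # 1-D projection
    qs = np.quantile(scores, np.linspace(0.1, 0.9, k))
    centers = mean + np.outer(qs, Vt[0])              # seeds on the axis
    for _ in range(n_iter):
        # Standard Lloyd iterations: assign, then recompute means.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    return labels, centers
```

Because the seeds already straddle the dominant direction of variance, fewer Lloyd iterations are typically needed than with random initialization.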
Principal Component Analysis (PCA) is a classical dimensionality reduction technique that computes a low-rank representation of the data. Recent studies have shown how to compute this low-rank representation from most of the data, excluding a small amount of outlier data. We show how to convert this problem into a graph search, and describe an algorithm that solves it optimally by applying a variant of the A* algorithm to search for the outliers. The results obtained by our algorithm are optimal in terms of accuracy and are more accurate than those of the current state-of-the-art algorithms, which are shown not to be optimal. This comes at the cost of running time, which is typically slower than the current state of the art. We also describe a related variant of the A* algorithm that runs much faster than the optimal variant and produces a solution guaranteed to be near-optimal. This variant is shown experimentally to be more accurate than the current state of the art, with a comparable running time.
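To make the underlying optimization concrete: the task is to choose which m points to exclude so that the remaining points are best explained by a rank-r subspace. The sketch below solves it by exhaustive enumeration, which is only feasible for tiny instances; the paper's contribution is replacing this exponential search with an A*-guided one. Function names are ours.

```python
import numpy as np
from itertools import combinations

def recon_error(X, r):
    # Energy of the centred data outside its best rank-r subspace.
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    return float(np.sum(s[r:] ** 2))

def best_outliers_bruteforce(X, r, m):
    # Exhaustive search over which m points to exclude so the rest
    # fit a rank-r subspace best. Exponential in m and n; the paper's
    # A* search finds the same optimum far more efficiently.
    n = X.shape[0]
    best = min(combinations(range(n), m),
               key=lambda out: recon_error(np.delete(X, out, axis=0), r))
    return set(best)
```

Casting the enumeration as a search tree over exclusion sets is what allows an admissible heuristic, and hence A*, to prune it.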
In this paper, we examine the properties of the Jones polynomial using dimensionality reduction learning techniques combined with ideas from topological data analysis. Our data set consists of more than 10 million knots up to 17 crossings and two other special families up to 2001 crossings. We introduce and describe a method for using filtrations to analyze infinite data sets where representative sampling is impossible or impractical, an essential requirement for working with knots and the data from knot invariants. In particular, this method provides a new approach for analyzing knot invariants using Principal Component Analysis. Using this approach on the Jones polynomial data, we find that it can be viewed as an approximately three-dimensional subspace, that this description is surprisingly stable with respect to the filtration by the crossing number, and that the results suggest further structures to be examined and understood.
Currently, the entire world is fighting the coronavirus (COVID-19). More than three million people worldwide had died from COVID-19 as of April 2021. A recent study conducted in China suggests that chest CT and X-ray images can be used as a preliminary test for COVID detection. This paper proposes a transfer learning-based COVID detection model that integrates a pre-trained model with a Random Forest Tree (RFT) classifier. As the available COVID dataset is noisy and imbalanced, Principal Component Analysis (PCA) and Generative Adversarial Networks (GANs) are used to extract the most prominent features and to balance the dataset, respectively. A Bayesian Cross-Entropy Loss function is used to penalize false detections differently according to class sensitivity (i.e., a COVID patient should not be classified into the Normal or Pneumonia class). Because the dataset is small, pre-trained models (VGGNet-19, ResNet50, and Inception_ResNet_V2) were chosen to extract features, which were then used to train the RFT for the classification task. The experimental results show that ResNet50 gives the highest accuracy: 99.51%, 98.21%, and 97.2% for the training, validation, and testing phases, respectively, and none of the COVID chest X-ray images were classified into the Normal or Pneumonia classes.
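The class-sensitivity idea behind such a loss can be sketched with a plain class-weighted cross-entropy, where misclassifying the sensitive class (COVID) costs more than the others. The abstract describes the Bayesian Cross-Entropy Loss only at a high level, so the weighting scheme and names below are our simplifying assumptions.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    # Class-weighted cross-entropy: errors on a sensitive class
    # (e.g. COVID) are penalised more heavily than on other classes.
    # probs: (n, c) predicted probabilities; labels: (n,) true classes.
    n = labels.shape[0]
    p = np.clip(probs[np.arange(n), labels], 1e-12, 1.0)
    w = class_weights[labels]                    # per-sample weight
    return float(np.sum(-w * np.log(p)) / np.sum(w))
```

Raising the weight of the COVID class makes a confident miss on a COVID sample dominate the loss, which is exactly the asymmetry the abstract asks for.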