  • Article (No Access)

    A Data-Driven Model to Construct the Influential Factors of Online Product Satisfaction

    Online shopping is becoming more prevalent, with consumers turning to e-commerce platforms to search for information about the goods and services they need. Users usually check other consumers' reviews on the platform as a reference while shopping. Online retailers can collect and analyze these online reviews to monitor consumer opinions about product quality, logistics services, packaging and other attributes, providing an accurate basis for product improvement and service optimization. This paper applies the Latent Dirichlet Allocation (LDA) algorithm to extract the critical factors that affect consumer satisfaction. More than 30,000 reviews across seven 3C (computer, communication, and consumer electronics) product categories, collected with a web crawler, are analyzed. Then, the DEMATEL-ANP (DANP) method is applied to the extracted framework to build a cause-and-effect diagram of the 3C product satisfaction model. The innovative LDA-DANP hybrid model clarifies the causal influence of the evaluation dimensions for 3C products sold online. The results show that brand value is the most important dimension affecting consumers' online product satisfaction. Appearance design, logistics awareness service and product performance also have a positive influence on perceived service and brand value. Finally, some management implications and practical suggestions are proposed.
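
    The DEMATEL step behind the cause-and-effect diagram can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 2×2 direct-influence matrix is hypothetical, and the total-relation matrix T = N(I − N)⁻¹ is approximated by summing powers of the normalized matrix N, which assumes the series converges.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def normalize(direct):
    """Scale the direct-influence matrix by its largest row or column sum."""
    n = len(direct)
    row_max = max(sum(row) for row in direct)
    col_max = max(sum(direct[i][j] for i in range(n)) for j in range(n))
    scale = max(row_max, col_max)
    return [[value / scale for value in row] for row in direct]


def total_relation(direct, terms=200):
    """Approximate T = N + N^2 + N^3 + ... (equal to N (I - N)^-1 when it converges)."""
    n_mat = normalize(direct)
    total = [row[:] for row in n_mat]
    power = [row[:] for row in n_mat]
    for _ in range(terms - 1):
        power = matmul(power, n_mat)
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, power)]
    return total


def prominence_and_relation(total):
    """Return (r + c, r - c) per factor: overall importance and net cause/effect."""
    n = len(total)
    r = [sum(total[i]) for i in range(n)]
    c = [sum(total[i][j] for i in range(n)) for j in range(n)]
    return [(r[i] + c[i], r[i] - c[i]) for i in range(n)]


# Hypothetical expert scores: factor 0 influences factor 1 more than the reverse.
direct_influence = [[0, 2], [1, 0]]
T = total_relation(direct_influence)
scores = prominence_and_relation(T)
```

    A factor with positive r − c acts as a net cause and one with negative r − c as a net effect; in DANP the total-relation matrix would then feed the ANP weighting, which is not shown here.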

  • Article (No Access)

    RECURSIVE DISCRIMINANT REGRESSION ANALYSIS TO FIND HOMOGENEOUS GROUPS

    The main motivation of this paper is to propose a method that extracts the output structure and finds the input data manifold that best represents that output structure in a multivariate regression problem. A graph similarity viewpoint is used to develop an algorithm based on LDA and to find different output models, each learned as an input subspace. The main novelty of the algorithm lies in finding differently structured groups and applying different models to better fit those structures. Finally, the proposed method is applied to a real remote sensing retrieval problem in which physical parameters are recovered from an energy spectrum.

  • Article (No Access)

    Identifying Suitable Brain Regions and Trial Size Segmentation for Positive/Negative Emotion Recognition

    The development of suitable EEG-based emotion recognition systems has become a main target in recent decades for Brain Computer Interface (BCI) applications. However, algorithms and procedures for real-time classification of emotions are scarce. The present study investigates the feasibility of real-time emotion recognition by selecting parameters such as the appropriate time window segmentation, target bandwidths and cortical regions. We recorded the EEG activity of 24 participants while they watched and listened to an audiovisual database composed of positive and negative emotional video clips. We tested 12 different temporal window sizes, 6 ranges of frequency bands and 60 electrodes located across the entire scalp. Our results showed a correct classification rate of 86.96% for positive stimuli. The rate for negative stimuli was slightly lower (80.88%). The best time window size, among the tested 1s to 12s segments, was 12s. Although more studies are still needed, these preliminary results provide a reliable way to develop accurate EEG-based emotion classification.
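
    The time window segmentation can be illustrated with a small sketch. The abstract does not give the sampling rate or say whether windows overlap, so the rate used below and the choice of consecutive non-overlapping windows are assumptions made only for illustration.

```python
def segment_windows(samples, sampling_rate_hz, window_seconds):
    """Split a 1-D signal into consecutive non-overlapping windows.

    Samples left over after the last full window are dropped.
    """
    size = int(sampling_rate_hz * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


# Hypothetical 24-second single-channel recording sampled at 4 Hz.
signal = list(range(96))
windows_by_size = {seconds: segment_windows(signal, 4, seconds)
                   for seconds in range(1, 13)}  # the 1s to 12s sizes tested
```

    Each window would then be band-filtered and classified separately; longer windows give fewer but more informative segments per trial.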

  • Article (No Access)

    A WORD POSITION-RELATED LDA MODEL

    LDA (Latent Dirichlet Allocation), proposed by Blei, is a generative probabilistic model of a corpus in which documents are represented as random mixtures over latent topics and each topic is characterized by a distribution over words; it does not, however, model the positions of words within each document. In this paper, a Word Position-Related LDA Model is proposed that takes into account the word positions in every document of the corpus, with each word characterized by a distribution over word positions. The precision of topic-word interpretability is improved by integrating the word-position distribution with the appropriate word degree, taking into account that word degree differs across word positions. Finally, a new size-aware word intrusion method is proposed to better assess topic-word interpretability. Experimental results on the NIPS corpus show that the Word Position-Related LDA Model improves the precision of topic-word interpretability, with an average improvement of about 9.67%. Comparisons across experimental data also show that the size-aware word intrusion method interprets the semantic information of topic words more comprehensively and more effectively.
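
    For orientation, the generative process of standard LDA (without the word-position extension proposed in the paper) can be sketched as below. The vocabulary, topic-word distributions and Dirichlet parameter are hypothetical, and the topic-word distributions are fixed rather than learned.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, vocab, length, rng):
    """Generate one document under standard LDA.

    theta ~ Dirichlet(alpha); for each word, a topic z ~ Categorical(theta)
    and then a word w ~ Categorical(topic_word[z]).
    """
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        z = rng.choices(range(len(theta)), weights=theta)[0]
        words.append(rng.choices(vocab, weights=topic_word[z])[0])
    return theta, words


# Hypothetical two-topic model over a four-word vocabulary.
vocab = ["neuron", "spike", "kernel", "margin"]
topic_word = [[0.45, 0.45, 0.05, 0.05],   # a "neuroscience" topic
              [0.05, 0.05, 0.45, 0.45]]   # a "kernel methods" topic
theta, words = generate_document(topic_word, [0.5, 0.5], vocab, 20,
                                 random.Random(0))
```

    Note that nothing in this process depends on where a word occurs in the document, which is the gap the position-related model addresses.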

  • Article (No Access)

    Multi-Channel Mapping Image Segmentation Method Based on LDA

    To improve the segmentation accuracy of plant lesion images, a multi-channel segmentation algorithm for plant disease images is proposed, based on linear discriminant analysis (LDA) mapping and K-means clustering. First, six color channels were obtained from the RGB and HSV models, and the six channel values of all pixels were laid out in six columns. One of these channels was then regarded as the label and the others as sample features. These data were grouped for linear discriminant analysis, and the values of the other five channels were mapped into the eigenvector space according to the three largest eigenvalues. Second, the mapped values were used as the input data for K-means, with the points of minimum and maximum pixel value used as the initial cluster centers, which removes the randomness of selecting initial cluster centers in K-means. The segmented pixels were then assigned to background and foreground, so that the proposed segmentation method becomes a two-class clustering problem. Finally, the experimental results showed that the proposed LDA mapping-based method segments better than the K-means, ExR and CIVE methods.
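
    The deterministic initialization described above can be sketched for the two-class case. As a simplification, each pixel is represented here by a single scalar (an assumed one-dimensional mapped value) rather than the three-dimensional LDA mapping, and the pixel values are invented.

```python
def two_class_kmeans(values, max_iter=100):
    """Two-cluster K-means on scalars, seeded with the minimum and maximum value."""
    centers = [min(values), max(values)]   # deterministic initial centers
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: nearest center wins.
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in (0, 1):
            members = [v for v, label in zip(values, labels) if label == k]
            new_centers.append(sum(members) / len(members) if members
                               else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical mapped pixel values: low values as background, high as lesion.
pixels = [0.1, 0.2, 0.15, 0.9, 0.8, 0.85]
labels, centers = two_class_kmeans(pixels)
```

    Because the seeds are fixed by the data, repeated runs give the same segmentation, unlike K-means with random initialization.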

  • Article (No Access)

    A Hybrid Fuzzy System via Topic Model for Recommending Highlight Topics of CQA in Developer Communities

    Question-answering (QA) websites supply a quickly growing source of useful information in numerous areas. These platforms present novel opportunities for online users to supply solutions, but they also pose numerous challenges as the QA community keeps growing. QA sites let users cooperate by asking questions or giving answers. Stack Overflow is a massive source of information for both industry and academic practitioners, and its analysis can supply useful insights. Topic modeling of Stack Overflow is very beneficial for pattern discovery and behavior analysis in programming knowledge. In this paper, we propose a framework based on the Latent Dirichlet Allocation (LDA) algorithm and fuzzy rules for question topic mining and for recommending highlight latent topics in a community question-answering (CQA) forum of a developer community. We consider a real dataset of 170,091 programmer questions from the R language forum on the Stack Overflow website. Our results show that LDA topic models combined with novel fuzzy rules can play an effective role in extracting meaningful concepts and in semantic mining in question-answering forums of developer communities.

  • Article (No Access)

    Mining Hidden Interests from Twitter Based on Word Similarity and Social Relationship for OLAP

    Online Analytical Processing, or OLAP, is an approach to answering multidimensional analytical (MDA) queries interactively. However, traditional OLAP approaches can only deal with structured data, not with unstructured textual data such as tweets. To address this problem, we propose a Latent Dirichlet Allocation (LDA)-based model, called Multilayered Semantic LDA (MS-LDA), which detects hidden layered interests in Twitter data. The layered dimension of interests can then be used to apply OLAP techniques to Twitter data. Furthermore, MS-LDA exploits the semantic similarity among the words of tweets, based on word2vec, as well as the social relationships among Twitter users, to improve its effectiveness. Extensive experiments demonstrate that MS-LDA can effectively extract the dimension hierarchy of Twitter users' interests for OLAP.

  • Article (No Access)

    Research Notes: User Influence in Microblog Based on Interest Graph

    Microblog is currently the largest social networking platform in China. In recent years, its influence as a social medium has continued to expand. Highly influential users play a guiding role in the spread of microblog posts and can even steer trends in public opinion. We therefore propose an influence analysis method for finding highly influential microblog users, which is of great significance for microblog research and mining. User influence analysis in microblog is difficult because of the limited amount of information in each post, quick updates and nonstandard language. First, we use the label propagation algorithm combined with the LDA algorithm to divide users by the user interest graph, according to the social relationships of microblog users and the content they generate. Then, for each interest area, an improved PageRank algorithm based on user interaction behavior is proposed to calculate each user's influence. Experiments on real datasets show that the proposed method outperforms the traditional algorithms.
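
    The abstract does not specify how PageRank is improved, so the sketch below shows only a generic weighted PageRank by power iteration, in which edge weights stand in for interaction counts such as reposts or comments. The user names and weights are hypothetical.

```python
def weighted_pagerank(links, damping=0.85, iterations=100):
    """PageRank by power iteration on a weighted directed graph.

    links maps a user to {target_user: interaction_weight}. A user's rank is
    split among targets in proportion to the weights; users with no outgoing
    links spread their rank uniformly over all users.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = links.get(u, {})
            total = sum(targets.values())
            if total == 0:
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
            else:
                for v, weight in targets.items():
                    new_rank[v] += damping * rank[u] * weight / total
        rank = new_rank
    return rank


# Hypothetical interaction graph within one interest area.
interactions = {"a": {"c": 3}, "b": {"c": 1, "a": 1}, "c": {"a": 1}}
influence = weighted_pagerank(interactions)
```

    Running this once per interest community, as the abstract describes, would yield a separate influence ranking for each area.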

  • Article (No Access)

    Features-Level Fusion of Reflectance and Illumination Images in Finger-Knuckle-Print Identification System

    In Finger-Knuckle-Print (FKP) recognition, feature extraction plays a very important role in overall system performance. This paper merges two types of histogram of oriented gradients (HOG)-based features, extracted from reflectance and illumination images, for FKP-based identification. The Adaptive Single Scale Retinex (ASSR) algorithm is used to extract the illumination and reflectance images from each FKP image. Serial feature fusion is used to form a large feature vector for each user and to extract the distinctive features in the higher-dimensional vector space. Finally, the cosine similarity measure is used for classification. The Hong Kong Polytechnic University (PolyU) FKP database is used in all tests. Experimental results show that the proposed system achieves better results than other state-of-the-art systems.
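
    Serial fusion followed by cosine-similarity matching can be sketched as below. The short vectors stand in for HOG descriptors of the reflectance and illumination images, and the gallery names are hypothetical; the HOG and ASSR steps are not shown.

```python
import math


def serial_fusion(reflectance_features, illumination_features):
    """Serial fusion: concatenate the two feature vectors into one."""
    return list(reflectance_features) + list(illumination_features)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def identify(probe, gallery):
    """Return the enrolled identity whose template is most similar to the probe."""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))


# Hypothetical enrolled templates and a probe image's fused features.
gallery = {
    "alice": serial_fusion([1.0, 0.0], [0.0, 1.0]),
    "bob": serial_fusion([0.0, 1.0], [1.0, 0.0]),
}
probe = serial_fusion([0.9, 0.1], [0.1, 0.8])
best_match = identify(probe, gallery)
```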

  • Article (No Access)

    Intelligent Analysis and Positioning of Political Public Opinion in Universities

    With the rapid development of Internet technology, the network has become an indispensable part of life for undergraduates. The correct guidance of public opinion has also become an important part of ideological work in universities. Undergraduates are in an important period of forming and developing their thoughts, during which they are easily incited by cyber-rumors. It is therefore particularly important to obtain data on political public opinion in universities and to position the hot topics, so that tendencies in political public opinion can be detected early and the outbreak of major security incidents avoided. With this in mind, this paper obtains multi-source political public opinion data from the BBS, Tieba and Weibo of Sun Yat-sen University (SYSU) using a web crawler. We study a text feature extraction method based on Word2Vec & LDA (Latent Dirichlet Allocation), which alleviates the high-dimensional sparsity of the traditional Vector Space Model (VSM) text representation. Building on the classical Single-pass clustering algorithm, this paper also studies the Single-pass & HAC clustering algorithm. In addition, a hot-topic measure is defined to calculate the heat value of political public opinion, and a dictionary- and rule-based method is used to improve the accuracy of sentiment tendency analysis. The experimental results demonstrate that topic detection and positioning based on LDA & Word2Vec and the Single-pass & HAC algorithm perform better than other methods.
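
    The classical Single-pass clustering step mentioned above can be sketched as follows; the HAC refinement studied in the paper is not shown. The cosine similarity measure, the 0.8 threshold and the toy vectors are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def single_pass(vectors, threshold):
    """Assign each vector, in arrival order, to the most similar cluster.

    A vector joins the best cluster if the similarity to its centroid reaches
    the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": [...], "members": [indices]}
    for index, vector in enumerate(vectors):
        best, best_sim = None, -1.0
        for cluster in clusters:
            sim = cosine(vector, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(index)
            m = len(best["members"])
            # Running mean keeps the centroid equal to the average member.
            best["centroid"] = [(c * (m - 1) + x) / m
                                for c, x in zip(best["centroid"], vector)]
        else:
            clusters.append({"centroid": list(vector), "members": [index]})
    return clusters


# Hypothetical document vectors (e.g. topic or embedding features).
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
topics = single_pass(docs, threshold=0.8)
```

    Single-pass clustering is order dependent, which is one reason to follow it with a hierarchical merging step.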

  • Article (No Access)

    1D BAR CODE READING ON CAMERA PHONES

    The availability of camera phones provides people with a mobile platform for decoding bar codes, whereas conventional scanners lack mobility. However, using a normal camera phone in such applications is challenging because of the out-of-focus problem. In this paper, we present our research on bar code reading algorithms using a VGA camera phone, the NOKIA 7650. EAN-13, a widely used 1D bar code standard, is taken as an example to show the efficiency of the method. A wavelet-based bar code region location scheme and a knowledge-based bar code segmentation scheme are applied to extract bar code characters from poor-quality images. All the segmented bar code characters are input to the recognition engine, and the bar code character string with the smallest total recognition distance is output as the final recognition result. To train an efficient recognition engine, a modified Generalized Learning Vector Quantization (GLVQ) method is designed for optimizing a feature extraction matrix and the class reference vectors. The training process involves 19,584 samples segmented from more than 1000 bar code images captured by the NOKIA 7650. In testing on 292 bar code images taken by the same phone, the correct recognition rate over the entire bar code set reaches 85.62%. We are confident that auto focus or macro modes on camera phones will bring the presented method into real-world mobile use.
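
    EAN-13 codes end in a check digit computed from the first twelve digits, which a reader can use to reject implausible candidate strings. The abstract does not say whether the authors use it this way, so the sketch below only illustrates the standard checksum rule.

```python
def ean13_check_digit(first12):
    """Compute the EAN-13 check digit from the first twelve digits.

    Digits in odd positions (1st, 3rd, ...) have weight 1 and digits in even
    positions have weight 3; the check digit brings the total to a multiple of 10.
    """
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected a string of exactly 12 digits")
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code):
    """True if code is 13 digits and its last digit matches the checksum."""
    return (len(code) == 13 and code.isdigit()
            and ean13_check_digit(code[:12]) == int(code[12]))


check = ean13_check_digit("400638133393")  # weighted sum is 89, so the digit is 1
```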

  • Article (No Access)

    FUNGIFORM PAPILLAE HYPERPLASIA (FPH) IDENTIFICATION BY TONGUE TEXTURE ANALYSIS

    Computerized tongue diagnosis can make use of a number of pathological features of the tongue. To date, few computerized applications have focused on Fungiform Papillae Hyperplasia (FPH), a very commonly used and distinctive diagnostic and textural feature of the tongue. In this paper, we propose a computer-aided system for identifying the presence or absence of FPH. We first define and partition a region of interest (ROI) for texture acquisition. After preprocessing to detect and remove reflective points, a set of 2D Gabor filter banks is used to extract and represent textural features. Then, we apply Linear Discriminant Analysis (LDA) to identify the data sets from the tongue image database. The experimental results reasonably demonstrate the effectiveness of the method described in this paper.
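
    A two-class linear discriminant of the kind used for the presence/absence decision can be sketched in two dimensions. This is Fisher's discriminant with a midpoint threshold, which is one common form of LDA; the two features stand in for Gabor responses, and all data points are invented.

```python
def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, m):
    """Within-class scatter (sxx, sxy, syy) of 2-D points around mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_discriminant(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0) and a midpoint threshold."""
    m0, m1 = mean2(class0), mean2(class1)
    a0, b0, d0 = scatter2(class0, m0)
    a1, b1, d1 = scatter2(class1, m1)
    a, b, d = a0 + a1, b0 + b1, d0 + d1          # pooled scatter [[a, b], [b, d]]
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    w = ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)
    mid = ((m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2)
    return w, w[0] * mid[0] + w[1] * mid[1]


def predict(x, w, threshold):
    """Class 1 (e.g. FPH present) if the projection exceeds the threshold."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


# Hypothetical texture features: class 0 = FPH absent, class 1 = FPH present.
absent = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.8), (2.2, 2.4)]
present = [(6.0, 5.0), (7.0, 6.5), (6.5, 5.2), (7.2, 6.0)]
w, threshold = fisher_discriminant(absent, present)
```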

  • Article (No Access)

    A PIECEWISE-DEFINED SEVERITY DISTRIBUTION-BASED LOSS DISTRIBUTION APPROACH TO ESTIMATE OPERATIONAL RISK: EVIDENCE FROM CHINESE NATIONAL COMMERCIAL BANKS

    Following the Basel II Accord, with its increased focus on operational risk as an aspect distinct from credit and market risk, quantifying operational risk has been a major challenge for banks. This paper analyzes the implications of the advanced measurement approach for estimating operational risk. When the severity of losses is modeled realistically, our preliminary tests indicate that classic distributions cannot fit the entire range of operational risk data samples (collected from public information sources) well. We therefore propose a piecewise-defined severity distribution (PSD) that combines a parametric form for ordinary losses with a generalized Pareto distribution (GPD) for large losses, and estimate operational risk by the loss distribution approach (LDA) with Monte Carlo simulation. We compare the operational risk measured with the PSD-based LDA (PSD-LDA) with that obtained from the basic indicator approach (BIA), and compare the ratios of operational risk regulatory capital of some major international banks with those of Chinese commercial banks. The empirical results reveal the rationality and promise of applying the PSD-LDA to Chinese national commercial banks.
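
    A loss distribution approach with a spliced severity can be sketched with a small Monte Carlo simulation. Everything numeric here is hypothetical, and several modeling choices are assumptions not stated in the abstract: Poisson loss frequency, a lognormal body truncated at the threshold, a fixed tail probability, and a GPD shape parameter different from zero.

```python
import math
import random


def sample_poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_severity(rng, threshold, tail_prob, mu, sigma, xi, beta):
    """Piecewise severity: lognormal body below the threshold, GPD tail above it."""
    if rng.random() < tail_prob:
        u = rng.random()
        # Inverse CDF of the GPD for the excess over the threshold (xi != 0).
        return threshold + beta / xi * ((1.0 - u) ** (-xi) - 1.0)
    while True:  # rejection sampling truncates the lognormal at the threshold
        x = rng.lognormvariate(mu, sigma)
        if x <= threshold:
            return x


def simulate_annual_losses(n_years, lam, seed=0, **severity):
    """Aggregate loss per simulated year: a Poisson number of severity draws."""
    rng = random.Random(seed)
    return [sum(sample_severity(rng, **severity)
                for _ in range(sample_poisson(lam, rng)))
            for _ in range(n_years)]


def empirical_quantile(values, q):
    """Empirical q-quantile, e.g. q = 0.999 for a 99.9% value-at-risk."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


# Hypothetical parameters, chosen only to make the sketch run.
params = dict(threshold=5.0, tail_prob=0.05, mu=0.0, sigma=1.0, xi=0.5, beta=2.0)
losses = simulate_annual_losses(2000, lam=3.0, seed=1, **params)
var_999 = empirical_quantile(losses, 0.999)
```

    In practice far more simulated years are needed for a stable 99.9% quantile, and the body and tail parameters would be fitted to loss data rather than chosen by hand.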

  • Article (No Access)

    An Entity Extraction and Categorization Technique on Twitter Streams

    As social media platforms have gained huge momentum in recent years, the amount of information generated on social media sites is growing exponentially, which poses a great challenge for information retrieval systems that extract potential named entities. Researchers have used semantic annotation to retrieve entities from unstructured documents, but this mechanism returns too many ambiguous entities. In this work, the DBpedia knowledge base is adopted for entity extraction and categorization. To perform the entity extraction task precisely, a two-step process is proposed: (a) train Word2Vec on the unstructured datasets and classify the entities into their respective categories; (b) crawl web pages, forums, and other web sources to identify the entities that are not present in DBpedia. The evaluation shows higher precision and a promising F1 score.

  • Article (No Access)

    MONOTONOUS TASKS AND ALCOHOL CONSUMPTION EFFECTS ON THE BRAIN BY EEG ANALYSIS USING NEURAL NETWORKS

    We propose an analysis of Electroencephalogram (EEG) signals recorded while performing a monotonous task and after drinking alcohol, using principal component analysis (PCA) and linear discriminant analysis (LDA) for feature extraction and Neural Networks (NNs) for classification. The EEG is captured while the subject performs a monotonous task that can adversely affect the brain and possibly cause stress. Moreover, we investigate the effects of alcohol on the brain by capturing the data continuously after consumption of equal amounts of alcohol. We hope that our work will shed more light on the relationship between such actions and EEG, and we investigate whether there is any relation between the tasks and mental stress. EEG signals offer a rare look at brain activity, while monotonous activities are well known to cause irritation, which may contribute to mental stress. We apply PCA and LDA to characterize the change in each component, extract it and discriminate using an NN. The experiments showed that PCA and LDA are effective methods for EEG signal analysis.

  • Article (Open Access)

    Pre-Training Clustering Models to Summarize Vietnamese Texts

    Our investigation aims at pre-training clustering models to summarize Vietnamese texts. For this purpose, we create a large-scale dataset of 1,101,101 documents by collecting Vietnamese articles from newspaper websites and extracting their plain text. We propose a new single-document extractive text summarization model based on clustering models. Our proposal clusters the documents with the hard clustering k-means algorithm and the soft clustering LDA (Latent Dirichlet Allocation) algorithm. Then, based on the pre-trained clustering models, a summary model selects the salient sentences in the input text to construct the summary. The empirical results showed that our summary model achieved 51.22% ROUGE-1, 17.62% ROUGE-2 and 29.16% ROUGE-L on the testing set. Besides traditional word representations such as BoW (Bag-of-Words), we also use word meaning-based tools such as FastText and BERT (Bidirectional Encoder Representations from Transformers) in our model. An additional benefit of the proposed extractive summary model is that the output summary is a long-text, readable document. Furthermore, the model's architecture is straightforward and easy to understand, and it runs on cost-efficient resources such as ARM CPUs and GPUs.
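
    The ROUGE-1 and ROUGE-2 figures reported above are n-gram overlap scores between a generated summary and a reference. A minimal sketch follows; it uses lowercase whitespace tokenization, which is an assumption (Vietnamese text normally needs word segmentation first), and it does not cover ROUGE-L.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of clipped n-gram overlap."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())   # counts clipped to the minimum
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# English toy sentences used only to show the arithmetic.
rouge1 = rouge_n("the cat is on the mat", "the cat sat on the mat", n=1)
rouge2 = rouge_n("the cat is on the mat", "the cat sat on the mat", n=2)
```

    Here five of six unigrams and three of five bigrams match, so ROUGE-2 is lower than ROUGE-1, the same pattern as in the reported scores.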

  • Chapter (No Access)

    Automatic analysis of microblogging data to aid in emergency management

    Microblogging platforms like Twitter have, in recent years, become an important source of information for a wide spectrum of users. As a result, these platforms have become great resources for supporting emergency management. During any crisis, it is necessary to sift through a huge amount of social media text within a short span of time to extract meaningful information. Extracting emergency-specific information, such as topic keywords, landmarks or the geo-locations of sites, from these texts plays a significant role in building an application for emergency management. This paper highlights different aspects of the automatic analysis of tweets that help in developing such an application. It focuses on: (1) identifying crisis-related tweets using machine learning; (2) exploring topic model implementations and their effectiveness on short messages (as short as 140 characters), performing an exploratory data analysis on short crisis-related texts collected from Twitter, and examining different visualizations to understand the commonalities and differences between topics and between different crisis-related data; and (3) providing a proof of concept for identifying and retrieving geo-locations from tweets and extracting GPS coordinates from this data to plot them approximately on a map.

  • Chapter (No Access)

    Research on Mining Negative Online Reviews on E-commerce Platforms Based on Social Network Analysis and LDA Model

    Negative online reviews have become essential decision-making information for businesses. This chapter conducts text mining on negative online reviews of e-commerce platforms to accurately identify problems in online platform transactions, uses social network analysis to clarify the correlation between critical factors in negative reviews, and applies the LDA topic model to mine eight significant themes of negative reviews: platform rider disputes, education refund difficulties, difficulty in canceling or changing reservations, damage or loss of goods, taxi disputes, payment harassment complaints, platform member disputes, and slow customer service response. The chapter is of great significance for improving the quality of products and services, enhancing customer satisfaction, and helping the government regulate e-commerce platforms effectively.

  • Chapter (No Access)

    INTERACTIVE CLASSIFICATION ORIENTED SUPERRESOLUTION OF MULTISPECTRAL IMAGES

    Classification techniques are routinely applied to satellite images. Pansharpening techniques can provide super-resolved multispectral images that improve the performance of classification methods. So far, these pansharpening methods have been explored only as a preprocessing step. In this work we address the problem of adaptively modifying the pansharpening method to improve the precision and recall of the classification of a given class without significantly deteriorating the performance of the classifier on the other classes. The validity of the proposed technique is demonstrated using a real Quickbird image.
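
    The per-class precision and recall that the method aims to improve can be computed as in the sketch below. The land-cover class names and label sequences are hypothetical.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall for one class, treating all other classes as negative."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical per-pixel ground truth and classifier output.
truth = ["water", "water", "urban", "veg", "urban"]
predicted = ["water", "urban", "urban", "veg", "water"]
water_scores = precision_recall(truth, predicted, "water")
```

    Evaluating every class this way before and after changing the pansharpening step shows whether a gain on the target class came at the expense of the others.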

  • Chapter (No Access)

    A Method of Modelling Software Evolution Confirmation Based on LDA

    This paper studies a method for confirming software evolution based on Latent Dirichlet Allocation (LDA). LDA can analyze the interdependency among words, topics and documents, and express that interdependency as probabilities. We adopt LDA to model software evolution, treating each package in the source code as a document and function (method) names, variable names and comments as words, and compute the probabilities relating words, topics and documents. Comparing the results with the update reports makes it possible to confirm whether a new version of the software is consistent with those reports.