
  Bestsellers

  • article (No Access)

    Research on Intelligent Mining Algorithm of English Translation Text Big Data Based on Deep Learning

    This paper explores the translation of English texts by a deep learning algorithm that aims to intelligently extract relevant information from a large corpus of English-translated texts, addressing in particular its shortcomings in processing difficult texts and handling long-term dependencies. The research builds on the modern long short-term memory (LSTM) network framework with the aim of improving translation accuracy and processing efficiency. Using a large amount of experimental data, the LSTM models are comprehensively trained and rigorously optimized. The approach first adapts the model to the language-specific properties of the translated text, and then applies a set of evaluation metrics to assess the model’s overall performance across a wide range of text types. The results show that the method significantly speeds up information extraction and processing while maintaining translation integrity. In addition, the study demonstrates the ability of deep learning models to recognize complex contextual nuances and subtle forms of language expression. The contributions are multifaceted, bringing substantial improvements to machine translation systems and facilitating the development of state-of-the-art text analysis tools. These advances apply primarily to sectors with considerable data volume and complexity, including the legal, scientific, and technological fields. The study therefore lays a foundation for future work on improving machine understanding of language and automating translation methods across professional domains.
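The LSTM framework this abstract builds on can be illustrated with a minimal single-step forward pass; this is a generic textbook LSTM cell in NumPy, not the paper’s actual model:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step: input, forget and output gates plus a candidate cell
    state, computed from the input x and the previous hidden state h_prev."""
    z = W @ x + U @ h_prev + b          # stacked pre-activations, shape (4H,)
    H = h_prev.shape[0]
    i = 1 / (1 + np.exp(-z[:H]))        # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))     # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))   # output gate
    g = np.tanh(z[3*H:])                # candidate cell state
    c = f * c_prev + i * g              # new cell state
    h = o * np.tanh(c)                  # new hidden state
    return h, c
```

Stacking such steps over a token sequence, with a task-specific output layer on the final hidden state, gives the kind of model the abstract describes.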

  • article (Open Access)

    Research on Automatic Identification of Power Big Data Anomalies Based on Improved Decision Tree

    With the advancement of smart grid technology, the issue of power system network security has become increasingly critical. To fully utilize the power grid’s vast data resources and enhance the efficiency of anomaly detection, this paper proposes an improved decision tree (DT)-based automatic identification approach for anomalies in electric power big data. The method employs six-dimensional features extracted from the dimensions of volatility, trend, and variability to characterize the time series of power data. These features are integrated into a hybrid DT-SVM-LSTM framework, combining the strengths of DTs, support vector machines, and long short-term memory networks. Experimental results demonstrate that the proposed method achieves an accuracy of 96.8%, a precision of 95.3%, a recall of 94.8%, and an F1-score of 95.0%, outperforming several state-of-the-art methods cited in the literature. Moreover, the approach exhibits strong robustness to noise, maintaining high detection accuracy even under low signal-to-noise ratio conditions. These findings highlight the effectiveness of the method in efficiently detecting anomalies and addressing noise interference.
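The abstract names three feature dimensions (volatility, trend, variability) yielding six features, but not the exact statistics; the following NumPy sketch picks two plausible features per dimension purely for illustration:

```python
import numpy as np

def six_dim_features(x):
    """Hypothetical six-dimensional feature vector for a power time series:
    two illustrative statistics each for volatility, trend, and variability
    (the paper names the dimensions but not the exact features)."""
    x = np.asarray(x, dtype=float)
    diffs = np.diff(x)
    slope = np.polyfit(np.arange(len(x)), x, 1)[0]     # fitted linear slope
    return np.array([
        x.std(),                                        # volatility: standard deviation
        np.abs(diffs).mean(),                           # volatility: mean absolute change
        slope,                                          # trend: slope of linear fit
        x[-1] - x[0],                                   # trend: net change over the window
        x.max() - x.min(),                              # variability: range
        np.percentile(x, 75) - np.percentile(x, 25),    # variability: interquartile range
    ])
```

Such a vector would then be fed to the DT-SVM-LSTM stages for classification.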

  • article (No Access)

    Coal Transportation Cost Prediction under Mixed Uncertainty Based on Graph Attention Networks

    To address the high uncertainty and dynamics of coal transportation costs and reduce the subjectivity of manual prediction, a method for coal transportation cost prediction under mixed uncertainty based on a graph attention network (GAT) is proposed. The analytic hierarchy process (AHP) is used to build the factor model of coal transportation cost prediction, calculate the weight coefficient of each factor, and clarify the relative importance of each factor in the prediction process. Principal component analysis (PCA) is used to reduce the dimensionality of the factors affecting coal transportation cost, eliminate noise and redundant information in the data, and retain the main information characteristics. The processed dataset is input into the STDGAT model, which combines the GAT with a long short-term memory (LSTM) network. The GAT extracts the spatial correlation characteristics of coal transportation demand, and the LSTM captures the dynamic characteristics along the time dimension, realizing joint spatio-temporal prediction of coal transportation cost. The experimental results show that the predictions closely match actual costs, with errors remaining low across time periods, across short-, medium-, and long-distance transportation, and across different transportation modes. By varying the number of attention heads, it was found that the model performed best with six attention heads.
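The GAT component can be sketched with the standard single-head graph attention computation (the STDGAT internals are not given in the abstract, so this follows the original GAT formulation; the adjacency matrix is assumed to include self-loops):

```python
import numpy as np

def gat_attention(H, W, a, adj):
    """Single-head graph attention: LeakyReLU-scored attention coefficients
    over each node's neighbours (per adj), softmax-normalized, then used to
    aggregate projected node features."""
    Z = H @ W                                    # (N, F') projected features
    N = Z.shape[0]
    e = np.full((N, N), -np.inf)                 # non-edges get zero weight
    for i in range(N):
        for j in range(N):
            if adj[i, j]:
                s = a @ np.concatenate([Z[i], Z[j]])
                e[i, j] = np.where(s > 0, s, 0.2 * s)   # LeakyReLU(0.2)
    alpha = np.exp(e - e.max(axis=1, keepdims=True))    # row-wise softmax
    alpha[~adj.astype(bool)] = 0.0
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha, alpha @ Z                      # coefficients, aggregated features
```

In STDGAT-style models, the aggregated node features per time step would then feed the LSTM for temporal modeling.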

  • article (No Access)

    A hybrid model using 1D-CNN with Bi-LSTM, GRU, and various ML regressors for forecasting the consumption of electrical energy

    To address power consumption challenges using Artificial Intelligence (AI) techniques, this research presents an innovative hybrid time series forecasting approach. The suggested model combines GRU-BiLSTM with several regressors and is benchmarked against three other models to ensure maximum reliability. It uses a specialized dataset from the Ministry of Electricity in Baghdad, Iraq. For every model architecture, three optimizers are tested: Adam, RMSprop and Nadam. Performance assessments show that the hybrid model is highly reliable, offering a practical option for sequence-modeling applications that need fast computation and comprehensive context knowledge. Notably, the Adam optimizer outperforms the others by promoting faster convergence and avoiding entrapment in local minima. Adam adapts the learning rate of each parameter separately, according to estimates of the first and second moments of its gradients. Furthermore, because of its tolerance for outliers and its emphasis on fitting within a specified margin, the SVR regressor outperforms the stepwise and polynomial regressors, obtaining a lower MSE of 0.008481 with the Adam optimizer. SVR’s regularization also reduces overfitting, especially when paired with Adam’s adaptive learning rates. The research concludes that the properties of the target dataset, the processing demands and the task complexity should all be considered when selecting a model and optimizer.
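The Adam behavior described here (separate first- and second-moment estimates per parameter, with bias correction) corresponds to the standard update rule, sketched below in NumPy:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m) and
    its square (v), bias-corrected, scale each parameter's step separately."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)                     # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)                     # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

The per-parameter scaling by `sqrt(v_hat)` is what gives Adam its adaptive learning rates mentioned in the abstract.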

  • article (No Access)

    Manifold Matrices-based Attention Mechanisms on 3D Skeletons for Human Action Recognition

    Skeleton-based human action recognition is currently one of the most popular research fields in computer vision. Because the Lie group on Riemannian manifolds can precisely describe 3D geometric relationships among rigid bodies, it is widely used in skeleton-based action recognition approaches to construct action feature descriptors. Unfortunately, the majority of these approaches overlook crucial body parts and skeletons, focusing solely on spatio-temporal descriptors of an action as a whole. A manifold-based rigid-body motion attention mechanism is proposed to assign varying degrees of importance to the relative geometries of different limb motions, and a skeleton-based spatial attention module is then constructed on the basis of limb motions for more efficient extraction of spatial features from skeletons. Furthermore, a Lie group-based temporal attention mechanism built on the first two stages selects significant skeleton frames to raise action recognition accuracy even higher. Experiments on four of the most influential action datasets demonstrate that the proposed approach achieves better action recognition accuracy than many state-of-the-art skeleton-based approaches.

  • article (No Access)

    OPTIMUM LEARNING MODEL FOR TEMPERATURE PROFILE PREDICTION IN ADDITIVE MANUFACTURING PROCESS

    In recent years, several industries have made extensive use of additive manufacturing (AM), as it creates complicated parts by layer-by-layer deposition and offers high customization and rapid production. Nonetheless, hardware failure or thermal stress during the AM process can result in defects. In addition, the thermal history of the AM process is typically simulated using finite element analysis, which is expensive and time-consuming. In this research, an essential element of a methodological approach for developing real-time control systems based on data-driven models is designed and developed. Finite element techniques are used to generate the database and solve the time-dependent heat equations. The proposed approach uses the adaptive falcsreech buzpullet search algorithm (FBSO-LSTM) model, which forecasts the temperatures of subsequent voxels from inputs such as laser characteristics and the temperatures of preceding voxels, to address problems with highly unpredictable solutions. The adaptive FBSO-LSTM model achieves the lowest errors on the GAMMA database, with values of 6.00, 107.51, 5.24, and 0.26 for MAE, MAPE, MSE and RMSE, respectively; on the FEM database, the corresponding values are 6.00, 49.51, 5.26, and 0.27.

  • article (No Access)

    Swin-Caption: Swin Transformer-Based Image Captioning with Feature Enhancement and Multi-Stage Fusion

    The objective of image captioning is to enable computers to autonomously generate human-like sentences describing a provided image. To tackle the challenges of insufficient accuracy in image feature extraction and underutilization of visual information, we present a Swin Transformer-based model for image captioning with feature enhancement and multi-stage fusion (Swin-Caption). Initially, the Swin Transformer is employed as an encoder to extract image features, while feature enhancement is adopted to gather additional image feature information. Subsequently, a multi-stage image and semantic fusion module is constructed to utilize the semantic information from past time steps. Lastly, a two-layer LSTM is utilized to decode semantic and image data, generating captions. The proposed model outperforms the baseline model in experimental tests and instance analysis on the public datasets Flickr8K, Flickr30K, and MS-COCO.

  • article (No Access)

    Distributed Unmanned Vehicle Platoon Control: A Combining Intention Recognition Method

    Unmanned Systems, 18 Mar 2025

    This paper aims to improve road traffic safety by designing a vehicle platoon controller based on a vehicle lane-changing intention recognition model. Deep learning is used to detect the lane-changing intention of surrounding vehicles on the road in real time. First, a Generative Adversarial Network (GAN) is used to augment the lane-changing samples in the original data, and a deep neural network based on the Long Short-Term Memory (LSTM) network is trained, reaching an accuracy of 93.54%. Based on the intention recognition model, the leading vehicle adjusts its longitudinal distance from surrounding vehicles according to the detected lane-change probability, to ensure the safety of the platoon while surrounding vehicles change lanes. Finally, simulation verification is carried out in MATLAB/Simulink together with PreScan, and the experimental results show that the proposed method detects a lane-change intention within 0.4 s of its actual onset.
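The gap-adjustment policy is only described qualitatively; a minimal sketch, assuming a linear widening of the longitudinal gap with the detected lane-change probability (the function name and constants are illustrative, not from the paper):

```python
def safety_gap(base_gap, p_lane_change, max_extra=15.0):
    """Hypothetical platoon spacing rule: the leading vehicle widens its
    longitudinal gap (meters) in proportion to the detected lane-change
    probability of a surrounding vehicle."""
    p = min(max(p_lane_change, 0.0), 1.0)   # clamp probability to [0, 1]
    return base_gap + max_extra * p
```

At p = 0 the platoon keeps its nominal spacing; as the recognized intention probability rises toward 1, the gap grows to leave room for the cut-in.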

  • article (No Access)

    A New Delay Connection for Long Short-Term Memory Networks

    Connections play a crucial role in neural network (NN) learning because they determine how information flows in NNs. Suitable connection mechanisms may greatly enlarge the learning capability and reduce the negative effect of gradient problems. In this paper, a new delay connection is proposed for the Long Short-Term Memory (LSTM) unit to develop a more sophisticated recurrent unit, called Delay Connected LSTM (DCLSTM). The proposed delay connection brings two main merits to DCLSTM while introducing no extra parameters. First, it gives the output of the DCLSTM unit a delayed self-connection that is absent in the standard LSTM unit. Second, the proposed delay connection helps bridge the error signals to previous time steps, allowing them to be back-propagated across several layers without vanishing too quickly. To evaluate the performance of the proposed delay connections, the DCLSTM model, with and without peephole connections, was compared with four state-of-the-art recurrent models on two sequence classification tasks; the DCLSTM model outperformed the other models with higher accuracy and F1-score. Furthermore, networks with multiple stacked DCLSTM layers and with standard LSTM layers were evaluated on Penn Treebank (PTB) language modeling, where the DCLSTM model achieved lower perplexity (PPL) and bits-per-character (BPC) than the standard LSTM model. The experiments demonstrate that the learning of the DCLSTM models is more stable and efficient.

  • article (No Access)

    A Modified Long Short-Term Memory Cell

    Machine Learning (ML), among other things, facilitates Text Classification, the task of assigning classes to textual items. Classification performance in ML has been significantly improved due to recent developments, including the rise of Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), and Transformer Models. Internal memory states with dynamic temporal behavior can be found in these kinds of cells. This temporal behavior in the LSTM cell is stored in two different states: “Current” and “Hidden”. In this work, we define a modification layer within the LSTM cell which allows us to perform additional state adjustments for either state, or even simultaneously alter both. We perform 17 state alterations. Out of these 17 single-state alteration experiments, 12 involve the Current state whereas five involve the Hidden one. These alterations are evaluated using seven datasets related to sentiment analysis, document classification, hate speech detection, and human-to-robot interaction. Our results showed that the highest performing alteration for Current and Hidden state can achieve an average F1 improvement of 0.5% and 0.3%, respectively. We also compare our modified cell performance to two Transformer models, where our modified LSTM cell is outperformed in classification metrics in 4/6 datasets, but improves upon the simple Transformer model and clearly has a better cost efficiency than both Transformer models.
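The modification layer can be sketched generically: after an ordinary LSTM step, an optional alteration function is applied to the Current (cell) state, the Hidden state, or both. The specific 17 alterations are not given in the abstract, so the functions here are placeholders:

```python
import numpy as np

def altered_lstm_states(h, c, alter_current=None, alter_hidden=None):
    """Sketch of a modification layer inside an LSTM cell: apply an optional
    elementwise alteration to the Current (cell) state c, the Hidden state h,
    or both simultaneously, before they are passed to the next time step."""
    if alter_current is not None:
        c = alter_current(c)
    if alter_hidden is not None:
        h = alter_hidden(h)
    return h, c
```

Each of the paper’s experiments would correspond to one concrete choice of `alter_current` and/or `alter_hidden` plugged into every time step.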

  • article (No Access)

    Evolving a Pipeline Approach for Abstract Meaning Representation Parsing Towards Dynamic Neural Networks

    Abstract Meaning Representation (AMR) parsing aims to represent a sentence as a structured Directed Acyclic Graph (DAG), in an attempt to extract meaning from text. This paper extends an existing 2-stage pipeline AMR parser with state-of-the-art techniques in dependency parsing. First, Pointer-Generator Networks are used for out-of-vocabulary words in the concept identification stage, with an improved initialization via the use of word- and character-level embeddings. Second, the performance of the Relation Identification module is improved by jointly training the Heads Selection and the Arcs Labeling components. Last, we underline the difficulty of end-to-end training with recurrent modules in a static deep neural network construction approach and explore a dynamic construction implementation, which continuously adapts the computation graph, thus potentially enabling end-to-end training in the proposed pipeline solution.

  • article (Open Access)

    Spatio-Temporal Image-Based Encoded Atlases for EEG Emotion Recognition

    Emotion recognition plays an essential role in human–human interaction since it is a key to understanding the emotional states and reactions of human beings when they are subject to events and engagements in everyday life. Moving towards human–computer interaction, the study of emotions becomes fundamental because it is at the basis of the design of advanced systems to support a broad spectrum of application areas, including forensic, rehabilitative, educational, and many others. An effective method for discriminating emotions is based on ElectroEncephaloGraphy (EEG) data analysis, which is used as input for classification systems. Collecting brain signals on several channels and for a wide range of emotions produces cumbersome datasets that are hard to manage, transmit, and use in varied applications. In this context, the paper introduces the Empátheia system, which explores a different EEG representation by encoding EEG signals into images prior to their classification. In particular, the proposed system extracts spatio-temporal image encodings, or atlases, from EEG data through the Processing and transfeR of Interaction States and Mappings through Image-based eNcoding (PRISMIN) framework, thus obtaining a compact representation of the input signals. The atlases are then classified through the Empátheia architecture, which comprises branches based on convolutional, recurrent, and transformer models designed and tuned to capture the spatial and temporal aspects of emotions. Extensive experiments were conducted on the Shanghai Jiao Tong University (SJTU) Emotion EEG Dataset (SEED) public dataset, where the proposed system significantly reduced its size while retaining high performance. The results obtained highlight the effectiveness of the proposed approach and suggest new avenues for data representation in emotion recognition from EEG signals.

  • article (No Access)

    The Impact of Corporate Executives’ Behavior on Company Performance: An Analysis Based on Voice Emotion Classification System and Deep Learning

    Using deep learning methods, this study provides insights into the significant impact of corporate executive behavior on firm performance, particularly through the lens of vocal emotions. Considering that emotions play a crucial role in leadership effectiveness as well as corporate success, this paper employs a Long Short-Term Memory (LSTM) network to meticulously categorize the emotions in executive speeches into positive, neutral, and negative categories. The initial stage involves rigorous pre-processing of the speech signal, including collection, denoising and feature extraction using Mel-Frequency Cepstral Coefficients (MFCC). Subsequently, LSTM models are trained on these preprocessed data for sentiment classification. This study further innovates by combining sentiment analysis with Key Performance Indicators (KPIs) to scrutinize the correlation between executives’ emotional expressions and company performance. Through statistical analysis and machine learning techniques, we assess the significance of this correlation and present evidence that highlights the predictive power of executives’ emotional expressions on firm performance metrics. Our findings not only contribute to an understanding of the nuanced ways in which leadership behavior impacts firm performance, but also open avenues for enhancing executive training and performance assessment methods. This paper demonstrates the classification accuracy of our model and its effectiveness in correlating executive emotions with firm performance, providing valuable insights into the interplay between leadership emotional intelligence and firm success.
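The correlation step between emotion scores and KPIs can be illustrated with a plain Pearson correlation; the per-period emotion score used here (e.g. the fraction of positive speech segments) is a hypothetical aggregate, not the paper’s exact measure:

```python
import numpy as np

def emotion_kpi_correlation(emotion_scores, kpi_values):
    """Pearson correlation between a per-period executive emotion score
    series and a KPI series: standardize both, then average the product."""
    e = np.asarray(emotion_scores, dtype=float)
    k = np.asarray(kpi_values, dtype=float)
    e = (e - e.mean()) / e.std()
    k = (k - k.mean()) / k.std()
    return float((e * k).mean())
```

A correlation near +1 would indicate that periods with more positive vocal emotion coincide with stronger KPI values; significance testing would follow separately.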

  • article (No Access)

    Long short-term memory neural network-based multi-level model for smart irrigation

    Rice is a staple food crop around the world, and its demand is likely to rise significantly with growth in population. Increasing rice productivity and production largely depends on the availability of irrigation water. Thus, the efficient application of irrigation water, such that the crop doesn’t experience moisture stress, is of utmost importance. In the present study, a long short-term memory (LSTM)-based neural network with logistic regression has been used to predict the daily irrigation schedule of drip-irrigated rice. A correlation threshold of 0.75 was used for the selection of features, which helped in limiting the number of input parameters. In addition, a dataset based on the recommendation of a domain expert, and another used by the tool Agricultural Production Systems Simulator (APSIM), were used for comparison. Field data comprising weather station data and past irrigation schedules has been used to train the model. A grid search algorithm has been used to optimize the hyperparameters of the model, and nested cross-validation has been used for validating the results. The results show that the correlation-based selected dataset is as effective as the domain expert-recommended dataset in predicting the water requirement using LSTM as the base model. The models were evaluated on different parameters, and a multi-criteria decision evaluation (Technique for Order of Preference by Similarity to Ideal Solution [TOPSIS]) was used to find the best-performing model.
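The correlation-based feature selection with the stated 0.75 threshold can be sketched as follows (the feature names and data are illustrative, not from the study):

```python
import numpy as np

def select_features(X, y, names, threshold=0.75):
    """Keep the features whose absolute Pearson correlation with the target
    meets the threshold (0.75 in the study), limiting the model's inputs."""
    y = (y - y.mean()) / y.std()
    kept = []
    for j, name in enumerate(names):
        col = X[:, j]
        c = (col - col.mean()) / col.std()
        if abs(float((c * y).mean())) >= threshold:
            kept.append(name)
    return kept
```

Only the surviving columns would then be fed to the LSTM-based irrigation model.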

  • article (No Access)

    LSTM Deep Neural Networks Postfiltering for Enhancing Synthetic Voices

    Recent developments in speech synthesis have produced systems capable of generating speech which closely resembles natural speech, and researchers now strive to create models that more accurately mimic human voices. One such development is the incorporation of multiple linguistic styles in various languages and accents. Speech synthesis based on Hidden Markov Models (HMM) is of great interest to researchers, due to its ability to produce sophisticated features with a small footprint. Despite some progress, its quality has not yet reached the level of the currently predominant unit-selection approaches, which select and concatenate recordings of real speech, and work has been conducted to try to improve HMM-based systems. In this paper, we present an application of long short-term memory (LSTM) deep neural networks as a postfiltering step in HMM-based speech synthesis. Our motivation stems from a similar desire to obtain characteristics which are closer to those of natural speech. The paper analyzes four types of postfilters obtained using five voices, ranging from a single postfilter that enhances all the parameters to a multi-stream proposal that separately enhances groups of parameters. The different proposals are evaluated using three objective measures and are statistically compared to determine any significance between them. The results described in the paper indicate that HMM-based voices can be enhanced using this approach, especially for the multi-stream postfilters on the considered objective measures.

  • article (No Access)

    Falling Detection Research Based on Elderly Behavior Infrared Video Image Contours Ellipse Fitting

    Throughout the world, the proportion of elderly people in the total population is increasing dramatically, and home-based care has become the most important form of elder care. Falling is the most common cause of accidents among the elderly at home and poses a huge threat to their health and lives. To protect the privacy of the elderly, an accidental-fall detection algorithm for elders at home is proposed in this paper. First, contour-based infrared motion video images are used instead of high-definition cameras to record elderly behaviors, protecting their privacy. Second, ellipse fitting is performed on the infrared video images of five behaviors: standing, sitting, squatting, bending and falling. Five geometric characteristic variables of the contour-fitting ellipses are extracted: the number of ellipses, centroid positions, ellipse areas, horizontal inclinations and major-to-minor axis ratios. Next, an LSTM model is established using these variables as inputs for feature extraction and classification. Finally, infrared video images of different activities of elders aged 50 to 70 have been selected as the IFD database for classification detection; 60% of the IFD images are used as the training dataset and 40% as the test dataset, and the results are compared with classification on the URFD dataset, which contains optical RGB HD video of the same behaviors. The experimental results show the effectiveness of the proposed algorithm, which combines contour ellipse fitting of infrared video images with LSTM feature extraction: the average correct classification rate for normal and falling behaviors is above 95%, comparable to that on the optical RGB dataset. The approach thus protects the privacy of the elderly while providing reliable accidental-fall detection for elders living alone.
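The five geometric variable families named above can be collected into a per-frame feature vector; the aggregation by simple means is an assumption, and the ellipse parameterization (semi-axes `a` ≥ `b`, orientation `angle`) is illustrative:

```python
import math

def ellipse_features(ellipses):
    """Per-frame summary from contour-fitted ellipses, following the five
    variable families named in the paper: count, centroid, area, horizontal
    inclination, and major-to-minor axis ratio (means are an assumed
    aggregation). Each ellipse is a dict with cx, cy, a, b, angle."""
    n = len(ellipses)
    cx = sum(e["cx"] for e in ellipses) / n
    cy = sum(e["cy"] for e in ellipses) / n
    area = sum(math.pi * e["a"] * e["b"] for e in ellipses) / n   # pi*a*b per ellipse
    incl = sum(e["angle"] for e in ellipses) / n
    ratio = sum(e["a"] / e["b"] for e in ellipses) / n
    return [n, cx, cy, area, incl, ratio]
```

A sequence of such vectors over consecutive frames would form the LSTM’s input for behavior classification.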

  • article (Free Access)

    An Algorithm for Network Security Situation Assessment Based on Deep Learning

    Existing network security situation assessment methods suffer from low efficiency and uncertainty in complex network environments. By constructing the characteristic elements of network security big data, a typical deep learning model, the long short-term memory (LSTM) network, is established to assess the network security situation over time series. The hidden relationships and trends in the network security situation are automatically mined and analyzed through the deep learning algorithm, which greatly improves the prediction accuracy of the security situation. Experimental analysis shows that this method assesses network threats more effectively than traditional network situation assessment methods, learns more efficiently, and has strong representation ability in the face of network threats. It can more accurately and effectively assess the future trend of the big data security situation.

  • article (No Access)

    Multi-Network-Based Ensemble Deep Learning Model to Forecast Ross River Virus Outbreak in Australia

    Ross River virus (RRV) disease is one of the most epidemiologically significant mosquito-borne diseases in Australia. Its major consequences for public health require a precise and accurate model for predicting forthcoming outbreaks. Several models have been developed by machine learning (ML) researchers, and many studies have been published as a result. Deep learning models were later introduced and have shown tremendous success in forecasting, especially the long short-term memory (LSTM) network, which performs significantly better than traditional machine learning approaches. Previously developed models face four common problems: exploding gradients, vanishing gradients, uncertainty and parameter bias. LSTM addresses the first two, and the remaining two are overcome by n-LSTM. However, developing a prediction model for RRV disease is challenging because the disease presents a wide range of symptoms and accurate information about it is scarce. To address these challenges, we propose a data-driven ensemble deep learning model using multiple LSTM networks for RRV disease forecasting in Australia. Data was collected between 1993 and 2020 from the Health Department of the Government of Australia; data from 1993 to 2016 is used to train the model, while the data from 2016 to 2020 is used as a test dataset. Previous research has demonstrated the efficacy of both ARIMA and exponential smoothing techniques in time-series forecasting, so our study evaluated the proposed model against these established parametric methods, including ARIMA and ARMA, as well as more recent deep learning approaches such as encoder–decoder and attention mechanism models. The results show that n-LSTM achieves higher accuracy and a lower mean-squared error. We also discuss the comparison of the models in detail. Such forecasting offers insight for being well prepared and handling an outbreak situation.
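The stated data partition (1993–2016 for training, 2016–2020 for testing) and a simple multi-network ensemble can be sketched as follows; averaging is one common combination rule, not necessarily the paper’s:

```python
def chronological_split(records, split_year=2016):
    """Split (year, value) records chronologically: years up to 2016 for
    training, 2016 onward for testing (2016 appears in both ranges as
    stated in the abstract)."""
    train = [r for r in records if r[0] <= split_year]
    test = [r for r in records if r[0] >= split_year]
    return train, test

def ensemble_forecast(member_forecasts):
    """Combine forecasts from multiple LSTM networks by simple averaging
    at each time step."""
    n = len(member_forecasts)
    length = len(member_forecasts[0])
    return [sum(f[t] for f in member_forecasts) / n for t in range(length)]
```

Chronological splitting (rather than random shuffling) keeps the test set strictly in the future of the training data, which is essential for honest time-series evaluation.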

  • article (Free Access)

    Medical Image Segmentation Using Grey Wolf-Based U-Net with Bi-Directional Convolutional LSTM

    In recent years, deep learning-based networks have achieved state-of-the-art performance in medical image segmentation. U-Net, one of the currently available networks, has proven effective for the segmentation of medical images. A Convolutional Neural Network’s (CNN) performance is heavily dependent on the network’s architecture and associated parameters. Many layers and parameters must be set up to create a CNN manually, making it a complex procedure, and the use of a variety of connections to increase the network’s complexity makes designing a network even more difficult. Evolutionary computation can be used as an optimization strategy to set the parameters of a CNN and/or organize its layers. This paper proposes an automatic evolutionary method, based on the Grey Wolf Optimization algorithm, for finding an optimal network topology and its parameters for the segmentation of clinical images. In addition, a bidirectional convolutional LSTM is integrated into the skip connections to extract dense feature representations by nonlinearly combining feature maps from the encoding path and the previous decoding path (MIS-GW-U-Net-BiDCLSTM). The experimental results demonstrate that the proposed method attains 98.49% accuracy with minimal parameters, which is much better than the other methods.
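The Grey Wolf Optimization search at the core of the method follows a well-known update scheme; this sketch minimizes a generic objective rather than the paper’s actual segmentation-network search space:

```python
import numpy as np

def gwo_minimize(f, dim, bounds, n_wolves=10, iters=50, seed=0):
    """Core Grey Wolf Optimization (GWO) loop: candidate solutions ("wolves")
    are pulled toward the three best solutions found so far (alpha, beta,
    delta), with an exploration factor that decays from 2 to 0."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, (n_wolves, dim))
    for it in range(iters):
        order = np.argsort([f(x) for x in X])
        leaders = X[order[:3]].copy()            # alpha, beta, delta
        a = 2.0 - 2.0 * it / iters               # decaying exploration factor
        for i in range(n_wolves):
            new = np.zeros(dim)
            for leader in leaders:
                A = a * (2.0 * rng.random(dim) - 1.0)
                C = 2.0 * rng.random(dim)
                D = np.abs(C * leader - X[i])    # distance to this leader
                new += leader - A * D            # candidate pulled toward leader
            X[i] = np.clip(new / 3.0, lo, hi)    # average of the three pulls
    return min(X, key=f)
```

In the paper’s setting, each "position" would encode a candidate network topology and its parameters, and `f` would score segmentation quality on a validation set.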

  • article (No Access)

    Aerial Gaze Target Recognition Based on Head and Eye Movements

    Aerial gaze target recognition is an important step in aerial eye-control interaction. To achieve accurate aerial gaze target recognition, the VT4LM recognition algorithm is constructed in this paper. In this algorithm, facial images containing the eyes of flight operators are first input into a Vision Transformer (ViT) to extract local features, while the operators’ head posture is input into four LSTMs to extract global features. The local and global features are then passed through three fully connected layers, two dropout layers and a softmax classifier to obtain the aerial gaze target recognition result. The effectiveness of the VT4LM algorithm was verified through the identification of four aerial gaze targets (Head-up display, Accelerator push rod, Control lever and Rudder) by four flight operators during simulated flight. The experimental results showed that the VT4LM algorithm reached an accuracy of 89.29% with a cross-entropy loss of 1.45, the highest recognition accuracy and lowest loss among the four recognition methods compared. When the VT4LM algorithm was used to detect the four simulated flight operators gazing at the four aerial gaze targets, all recognition accuracies were higher than 85.00%, showing that the algorithm performs well in aerial gaze target recognition.