Search: Keyword "Convolutional Neural Network" (253 results), searched on 26 Mar 2025.
The characteristics of hyperspectral remote sensing images, such as weak feature representativeness, single-level features, and complex information content, can lead to unstable classification results. We propose a lightweight dense network model that injects channel attention through the dense connections between network layers (DSE-DN) for the classification of hyperspectral images. In the DSE-DN network, principal component analysis (PCA) is first applied to reduce redundancy in the hyperspectral images. A densely connected network is then constructed, incorporating channel attention mechanisms through the dense connections to enhance the analysis of spectral image features. Finally, the processed hyperspectral images are classified using a fully connected layer. We evaluate DSE-DN on two classical hyperspectral datasets against a 2D CNN, a 3D CNN, ResNet, and a network that injects channel attention layer by layer. The experimental results demonstrate the utility of the DSE-DN network in hyperspectral image classification and its superiority over the other networks.
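The PCA preprocessing step described in this abstract can be sketched in a few lines of numpy. The cube dimensions and the number of retained components below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Hypothetical hyperspectral cube: 10x10 pixels, 200 spectral bands
# (sizes are illustrative, not taken from the paper).
rng = np.random.default_rng(0)
cube = rng.normal(size=(10, 10, 200))

def pca_reduce(cube, n_components):
    """Flatten the spatial dims, center each band, and project every
    pixel's spectrum onto the top principal components."""
    h, w, bands = cube.shape
    X = cube.reshape(h * w, bands)
    X = X - X.mean(axis=0)                # center each band
    # SVD of the centered data yields the principal axes in Vt
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    reduced = X @ Vt[:n_components].T     # project onto top components
    return reduced.reshape(h, w, n_components)

reduced = pca_reduce(cube, 30)
```

A cube reduced this way keeps the spatial layout while shrinking the spectral axis, which is what makes the subsequent dense network lightweight.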
The artistry of literary aesthetics carries the core of an artist's thought and is an important quality for attracting visitors. As visitors increasingly pursue personalized and diversified artistry, traditional ways of presenting artwork can no longer meet their needs. This paper therefore integrates virtual reality technology with an embedded AI system-on-chip to construct an artistic embodiment model of literary aesthetics, using virtual reality to enhance visitors' sensory experience. At the same time, the embedded AI system-on-chip analyzes visitors' preferences and behavioral data to provide artists with richer elements of artistic expression. The experimental results show that the proposed model can optimize an artwork while preserving its original characteristics and strengthen its artistic expressiveness, with significant improvements in innovation, novelty of emotional expression, and diversity of presentation. In addition, the model integrates virtual reality technology and embedded AI system-on-chip data more effectively, shortens the processing and response time for artwork interaction data, improves the interaction recognition rate, and enhances the completeness and timeliness of artistic expression.
Failing to account for users' basic attributes, behavioral characteristics, value attributes, social attributes, interest attributes, and psychological attributes leads to poor user experience, information overload, interference, and other negative effects. To develop more accurate marketing strategies, optimize user experience, and improve the conversion rate and user satisfaction of e-commerce platforms, an accurate construction method for e-commerce user profiles based on artificial intelligence algorithms and big data analysis is proposed. Using big data analysis technology, the basic attributes, behavioral characteristics, value attributes, social attributes, interest attributes, and psychological attributes of e-commerce users are collected and integrated across multiple dimensions. An improved sequential pattern mining algorithm (PBWL) is applied to mine frequent sequential patterns in e-commerce user behavior and reveal users' behavioral habits. A comprehensive attribute representation of e-commerce users is obtained by combining the LINE network model with a convolutional neural network. A firefly-optimized K-means clustering algorithm then groups users by the similarity of their attribute information, creating different types of user clusters and achieving the accurate construction of e-commerce user profiles. The experimental results show that this method builds accurate e-commerce user profiles and provides strong support for personalized recommendation and precision marketing on e-commerce platforms. The method mines e-commerce users' behavioral habits in depth and accurately reflects their interest preferences and consumption characteristics; it clusters users quickly and stably, with optimal profile clustering quality, and divides the data into meaningful groups according to consumption behavior, revealing the characteristics and value of each group.
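The clustering stage of such a pipeline rests on K-means; a minimal numpy sketch of the core Lloyd iteration is below. The firefly optimization of the paper is replaced here by plain random initialization, and the user-attribute vectors are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical user-attribute vectors (rows = users, cols = attribute
# features), drawn as two well-separated synthetic groups.
users = np.vstack([rng.normal(0, 0.3, (50, 4)), rng.normal(3, 0.3, (50, 4))])

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's K-means; the paper's firefly-based initialization
    is replaced by a random choice of starting centroids."""
    r = np.random.default_rng(seed)
    centroids = X[r.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        d = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned points,
        # keeping the old centroid if a cluster goes empty
        centroids = np.array([X[labels == j].mean(axis=0)
                              if np.any(labels == j) else centroids[j]
                              for j in range(k)])
    return labels, centroids

labels, centroids = kmeans(users, 2)
```

Each resulting cluster corresponds to one user group whose shared attribute profile can then be summarized.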
When predicting the optimization quality of a teaching mode, a single convolutional neural network method is affected by multi-source data, such as students' behavioral data and teachers' evaluations, which is prone to modal collapse and degrades prediction quality. We therefore propose a method for predicting the optimization quality of history course teaching modes based on the fused analysis of multiple data categories. After the data on history course teaching-mode optimization quality are collected, a denoising autoencoder network removes noise from the collection, and historical and current data are combined through a dynamic multi-feature-matrix slicing method. The fused multi-category data are then input into a convolutional neural network, which an improved multi-objective genetic algorithm optimizes; this stabilizes the model and avoids the modal collapse caused by the multi-source attributes and categories of the data. The optimized convolutional neural network extracts deep features of teaching-mode optimization quality, and these features are fed as independent variables into a multiple linear regression model to obtain the predicted quality scores.
Experimental verification shows that the method thoroughly cleans and denoises the data for each feature of history course teaching modes, that the five collected features related to optimization quality correlate strongly with the final quality score, and that the predicted optimization-quality scores are essentially consistent with expert evaluations.
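The final stage of the pipeline, regressing a quality score on the extracted deep features, is ordinary multiple linear regression. The sketch below uses synthetic features with a known linear relation (all names and coefficients are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical deep features (100 samples x 5 features) and a known
# noise-free linear relation to the quality score.
features = rng.normal(size=(100, 5))
true_w = np.array([1.5, -2.0, 0.5, 3.0, 1.0])
scores = features @ true_w + 4.0          # intercept of 4

# Fit y = Xw + b by ordinary least squares: append a column of ones
# so the intercept is estimated jointly with the weights.
X = np.column_stack([features, np.ones(len(features))])
coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
w_hat, b_hat = coef[:-1], coef[-1]
```

With noise-free data the least-squares fit recovers the generating coefficients exactly; with real feature data the residual quantifies how well the deep features explain the quality score.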
Each local feature in the appearance image of a cigarette pack is a key element reflecting the corresponding brand information. A single convolutional neural network may lose the contextual information of sequential data, resulting in an insufficient grasp of the overall information. To achieve deep feature extraction from cigarette pack appearance images and detect their appearance with higher accuracy and speed, a new detection method is proposed that combines the convolutional and recurrent neural network methods of deep learning. An image acquisition card collects the appearance images of cigarette packs, and contrast enhancement and rotation correction are performed on the collected images to effectively improve their quality and provide a solid basis for subsequent feature extraction and detection. The preprocessed images are input into a convolutional neural network for deep feature extraction. The extracted appearance features are then passed to the long short-term memory (LSTM) units and gated recurrent units of a recurrent neural network, which process the temporal information on top of the efficiently extracted image details, retain the sequence and context of the input data, and ultimately achieve accurate detection of the cigarette pack appearance. Experimental analysis shows that the method effectively identifies appearance defects, marks them immediately, and presents them intuitively, so that staff can quickly locate problems and take corresponding measures. The method detects appearance defects larger than 1.59 mm × 1.59 mm with high accuracy, and for various types of appearance defect the detection rate exceeds 98%, providing strong support for quality control and product upgrading in the tobacco industry.
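The gated recurrent unit mentioned above can be written out in a few lines of numpy. This is a generic GRU step, not the paper's network; the feature and hidden dimensions are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One gated recurrent unit step: the update gate z decides how much
    of the previous state to keep, the reset gate r gates the candidate."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(x @ Wz + h @ Uz + bz)             # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)             # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh + bh)  # candidate state
    return (1 - z) * h + z * h_cand

# Hypothetical sizes: 8-dim CNN feature per image region, 16-dim hidden state.
rng = np.random.default_rng(3)
d_in, d_h = 8, 16
params = [rng.normal(scale=0.1, size=s)
          for s in [(d_in, d_h), (d_h, d_h), (d_h,)] * 3]
h = np.zeros(d_h)
for t in range(5):      # run over a 5-step sequence of CNN features
    h = gru_step(rng.normal(size=d_in), h, params)
```

Because the new state is a convex combination of the old state and a tanh-bounded candidate, the hidden state stays bounded while accumulating context across the feature sequence.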
Insect recognition plays a crucial role in agricultural production and maintaining ecological balance. However, the vast variety and differing forms of insects make traditional image recognition methods, which rely on manual identification, low in accuracy and efficiency. This study aims to explore the application of deep learning technology in insect image recognition by proposing a deep learning-based method to enhance recognition accuracy and efficiency through automatic extraction of image features using a deep learning model. We utilized a combination of web-collected insect images and existing datasets to provide comprehensive training data, and developed an advanced deep learning model named Advanced Overlap Patch Relevance Extraction and Classification (AOPREC). The model’s design includes overlapping patch operations to minimize redundancy in background information, enhancing the model’s ability to manage image details and local features. Additionally, we replaced the multilayer perceptron (MLP) in the vision transformer (ViT) with a more efficient classifier that combines convolutional neural networks (CNN) and long short-term memory (LSTM) structures, significantly improving classification performance. Furthermore, Gaussian error linear unit (GELU) was employed as the activation function to optimize training efficiency and generalization, ensuring consistent performance across various images. Experimental results demonstrate that the AOPREC model excels in insect image classification and object detection tasks, significantly improving recognition accuracy.
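The GELU activation named in this abstract has a closed form, x·Φ(x), with Φ the standard normal CDF; a stdlib-only sketch:

```python
import math

def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF,
    computed via the error function."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Unlike ReLU, GELU is smooth at 0 and weights inputs by their
# probability under a standard normal rather than hard-gating them.
values = [gelu(v) for v in (-2.0, 0.0, 1.0, 2.0)]
```

For large positive inputs GELU approaches the identity, for large negative inputs it approaches zero, and in between it is smooth, which is the property credited here with stabilizing training.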
Optical Character Recognition (OCR) is widely used to digitize printed documents, extract information from forms, automate data entry, and enable text recognition in applications ranging from license plate reading to handwritten document conversion. Machine learning and deep learning models have recently improved OCR performance; however, hyperparameter tuning remains an issue. To address this, this paper proposes an efficient method for recognizing Brahmi script characters that combines a Convolutional Neural Network (CNN) with a random forest classifier. First, a CNN-based autoencoder extracts features from Brahmi script images, which are then input into the random forest classification model. The hyperparameters of both models are optimized using a genetic algorithm (GA). Extensive experimental results show that the proposed approach achieves an accuracy of 97%, significantly better than competitive models.
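The GA-based tuning stage follows the usual select/crossover/mutate loop. The toy sketch below optimizes a stand-in fitness function with a known optimum at x = 3; in the paper's setting, fitness would instead be the validation accuracy of the CNN or random forest under a candidate hyperparameter setting:

```python
import random

def fitness(x):
    # Stand-in objective with its maximum at x = 3; a real run would
    # train and score a model configured with hyperparameter x.
    return -(x - 3.0) ** 2

def genetic_search(pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [rng.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]     # selection: keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0          # crossover: average two parents
            child += rng.gauss(0, 0.1)     # mutation: small Gaussian jitter
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = genetic_search()
```

Because model training dominates the cost, the population size and generation count are the main levers for keeping such a search affordable.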
In personalized learning resource recommendation, recommendation systems usually combine text data about the resources themselves with text data about the learners, analyze learners' needs, interests, and preferences algorithmically, and then select matching learning resources from the resource library for recommendation. To achieve accurate and effective recognition of text emotions in personalized learning resource recommendations, a text emotion recognition method based on deep transfer learning is proposed. Based on control-value theory and an emotional-attribute induction method, we construct an emotional attribute index system for recommended text that covers the types and levels of textual emotional attributes, and we collect text data spanning all emotional attributes. We then reconstruct the text dataset through data cleaning, text analysis, and stop-word removal, extract deep text features with convolutional neural networks (CNNs), and finally apply deep transfer learning to classify and recognize the sentiment of recommended text. The experimental results show that the recognition rates of positive and negative emotions are 93.5% and 98.2% for source domain text and 98.9% and 96.2% for target domain text, respectively, and the mean square error of the emotion recognition results is below 0.1. This indicates that the knowledge learned from the source data transfers well to the target data of personalized learning resource recommendation text, effectively improving generalization on low-resource datasets and enabling reasonable emotional judgments on recommended text.
In order to support precise expression and efficient creation in the animation production process, this paper proposes a feature extraction method for animation script creation by introducing deep learning algorithms from artificial intelligence. First, the basic elements of animation script creation, including screen content, shot motion and time length, were analyzed. Subsequently, in the TF-IDF algorithm, the importance of keywords in the script is quantified by calculating word frequency and inverse text frequency. In the image block sparse representation method, the sparsity degree is used to represent the number of blocks and the target state is described by extracting image features. Finally, using convolutional neural network methods, feature extraction of segmented scripts for animation script creation is achieved through steps such as constructing two-dimensional matrices, performing convolution operations, segment pooling and feature extraction. The experimental results show that the method proposed in this paper has excellent accuracy in extracting features from shot scripts in animation script creation. It can support precise expression and efficient creation in the animation production process, improve the accuracy of feature extraction and provide strong support for the visualization of animation scripts and the design of shot language.
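The TF-IDF weighting described in this abstract multiplies a term's in-document frequency by the log of its inverse document frequency. A stdlib-only sketch over hypothetical script fragments (the corpus and tokenization are illustrative):

```python
import math
from collections import Counter

# Tiny illustrative corpus of script fragments, pre-tokenized by spaces.
docs = [
    "camera pans across the night skyline".split(),
    "the hero runs across the rooftop".split(),
    "night falls and the city sleeps".split(),
]

def tf_idf(docs):
    n = len(docs)
    # document frequency: in how many documents each word appears
    df = Counter(w for doc in docs for w in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({w: (tf[w] / len(doc)) * math.log(n / df[w])
                        for w in tf})
    return weights

weights = tf_idf(docs)
```

A word occurring in every document (such as "the" here) gets weight zero, while distinctive keywords receive positive weight, which is exactly what makes TF-IDF useful for quantifying keyword importance in a script.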
Malicious firmware upgrading represents a critical security vulnerability in Internet of Things (IoT) devices. This study introduces HyCNNAt, a novel hybrid deep learning network for IoT malware detection that synergistically combines Convolutional Neural Networks (CNNs) with transformer attention mechanisms. HyCNNAt’s architecture vertically and horizontally stacks convolution and attention layers, enhancing the network’s generalization capabilities, capacity, and overall effectiveness. We evaluated HyCNNAt using a publicly available IoT firmware dataset, where it demonstrated superior performance with the highest accuracy (97.11%±1.02%), F1-score (99.992%±0.004%), and recall (97.48%±2.6556%), highlighting its robust classification capabilities, although its precision (91.27%±45.08%) exhibited variability compared to state-of-the-art models such as CoAtNet, MobileViT, MobileNet, and MobileNet variants using transfer learning. These results underscore HyCNNAt’s potential as a robust solution for addressing the pressing challenge of IoT malware detection.
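The transformer attention layers combined with convolutions in such hybrids are built on scaled dot-product attention; a generic numpy sketch (token count and dimension are illustrative, not from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys,
    and the values are mixed by the resulting probability weights."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # query-key similarities
    weights = softmax(scores, axis=-1)   # each row is a distribution
    return weights @ V, weights

# Hypothetical feature-map tokens: 6 tokens of dimension 8.
rng = np.random.default_rng(4)
Q, K, V = (rng.normal(size=(6, 8)) for _ in range(3))
out, w = attention(Q, K, V)
```

Where a convolution mixes only a local neighborhood, each attention output row is a weighted mix of all tokens, which is the global context such a hybrid stacks on top of the convolutional features.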
Accurate and rapid traffic object detection has attracted intensive attention due to its potential applications in autonomous driving, traffic flow monitoring, augmented reality (AR), and other fields. However, traffic object detection faces many difficulties, such as occlusion and aggregation between objects, insufficient feature extraction, and in particular the presence of a large number of small objects, all of which pose great challenges. This paper proposes an improved traffic object detection model based on You Only Look Once version 5 small (YOLOv5s) to address these issues. Spatial pyramids extract multi-scale spatial features, Squeeze-and-Excitation (SE) channel attention captures more global and local semantic features, and, most notably, a sub-network designed in the neck fuses the high-resolution information of shallow layers with the more accurate semantic information of deep layers, enhancing the detection sensitivity to object features. Moreover, incorporating a decoupled head into the network yields outstanding performance with high detection accuracy and rapid detection speed. Experimental results on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) and Laboratory for Intelligent and Safe Automobiles (LISA) traffic sign datasets both show that the modified model significantly improves detection accuracy while maintaining high real-time performance. The modified model can effectively address many difficulties of traffic object detection in complex scenes, which should prove greatly helpful for its potential applications.
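The SE channel attention used here squeezes each channel to a single descriptor, passes the descriptors through a small bottleneck, and rescales the channels with sigmoid gates. A numpy sketch with illustrative channel counts and an assumed reduction ratio of 4:

```python
import numpy as np

def se_block(feat, W1, W2):
    """Squeeze-and-Excitation: global average pooling squeezes each
    channel to one number; a two-layer bottleneck with ReLU then sigmoid
    produces per-channel gates that rescale the feature map."""
    squeezed = feat.mean(axis=(1, 2))             # (C,) channel descriptors
    hidden = np.maximum(0, squeezed @ W1)         # reduction + ReLU
    gates = 1.0 / (1.0 + np.exp(-(hidden @ W2)))  # (C,) gates in (0, 1)
    return feat * gates[:, None, None]

# Hypothetical feature map: 16 channels of 8x8, reduction ratio 4.
rng = np.random.default_rng(5)
feat = rng.normal(size=(16, 8, 8))
W1 = rng.normal(scale=0.1, size=(16, 4))
W2 = rng.normal(scale=0.1, size=(4, 16))
out = se_block(feat, W1, W2)
```

Since every gate lies strictly between 0 and 1, the block can only attenuate channels, letting the network emphasize informative channels relative to the rest.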
Fractal dimension (FD), as an important parameter of the surface morphology of machined parts, can be used to analyze the friction characteristics of contacting surfaces, so accurate and rapid measurement of the FD is of significant importance. To this end, this paper presents a novel approach based on a three-dimensional convolutional neural network (3D-CNN) for recognizing the three-dimensional FDs of machined surfaces. We construct a dataset of anisotropic rough surfaces with different FDs using the Weierstrass–Mandelbrot (WM) fractal function and use a one-factor experimental design to analyze the influence of network parameters (network depth, filter size, and filter quantity) on FD identification accuracy, finding the optimal combination of network parameters. By comparing our 3D-CNN method with three other methods (differential box-counting (DBC), triangular prism surface area (TPSA), and fractal Brownian motion (FBM)), we validate the effectiveness of the proposed method. The experimental results show that the average absolute percentage error of the FD calculated by the 3D-CNN method can be kept within 2%, and the method exhibits small errors throughout the full dynamic range of FDs. We also apply the proposed method to calculate the FD of a vertical milling surface and compare the results with the TPSA and FBM methods. The FD obtained by our method is close to that of the other two methods and can be used to calculate the FD of three-dimensional surface profiles, thereby providing a new approach for the dynamic modeling and parameter identification of contact surfaces.
When the quantity of parameters or values in systems being tested is substantial, there is a significant escalation in the number of combinatorial test cases. The execution of all test cases will require a substantial allocation of time and resources. Prioritization technology enables early detection of system faults and enhances test efficiency. Most existing prioritization methods rely on historical empirical data, which can be challenging to obtain in many cases. In the meantime, random prioritization often leads to lower fault detection rates. This paper presents a prioritization method for combinatorial test cases based on a convolutional neural network (CNN) model. The weights of suspected fault-inducing interactions are initially extracted through a convolution operation in the subset of upfront test cases. Second, the features related to fault-inducing interactions are derived using wide multi-layer kernels convolutional neural network (WCNN). Third, the deep WCNN-SVM model undergoes training and makes predictions on the entire set of test cases. The predicted results are then combined with weights to prioritize combinatorial test cases. Test cases of equal priority are adjusted based on distance entropy. Application experiments on UAV demonstrate that the proposed method effectively enhances both fault detection speed and fault detection rate.
In the field of structural health monitoring, vibration-based damage identification remains a formidable challenge. Key to this challenge is establishing a reliable association between observed vibration characteristics and the actual state of structural damage (e.g. stiffness reduction); this association should indicate not only the presence of damage but also its location and severity. To solve this complex pattern identification problem, a large number of approaches, including deep learning, have emerged in recent years. In this paper, we propose a new structural damage identification method that utilizes the vibration information of the structure and a convolutional neural network based on an improved AlexNet. The method calculates the acceleration response power spectral density of the damaged and undamaged structures separately under impact loading, takes the difference between the two power spectra, and then feeds these power spectral difference data into the convolutional neural network for training. Using power spectral density analysis as a preprocessing step converts the time-domain signals into frequency-domain signals, allowing the convolutional neural network to capture and learn the specific frequency characteristics of the data and thus facilitating the learning of the neural network model. The effectiveness of the method is critically evaluated through numerical simulation and experimental validation, with 3% and 5% noise added in the numerical study to test robustness. During training, the optimal mean squared error (MSE) is 5×10−6 without added noise and 1.3×10−5 with added noise.
Both the results of simulations and experiments confirm the high accuracy and good robustness of the method in localizing structural damage.
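The power-spectral-density-difference preprocessing can be sketched with a numpy periodogram. The two signals below are illustrative sinusoids whose peak frequency shifts slightly with "damage", a stand-in for the measured acceleration responses:

```python
import numpy as np

# Simulated acceleration responses, sampled at fs Hz for n points.
fs, n = 1000, 4096
t = np.arange(n) / fs
undamaged = np.sin(2 * np.pi * 50 * t)
damaged = np.sin(2 * np.pi * 48 * t)   # illustrative: damage lowers the peak

def psd(signal, fs):
    """One-sided periodogram estimate of the power spectral density."""
    spec = np.fft.rfft(signal)
    return (np.abs(spec) ** 2) / (fs * len(signal))

# The difference of the two spectra is the kind of frequency-domain
# input the abstract describes feeding to the network.
diff = psd(damaged, fs) - psd(undamaged, fs)
freqs = np.fft.rfftfreq(n, 1 / fs)
```

The difference spectrum concentrates the damage signature at the shifted resonance, which is the frequency-domain structure a convolutional network can pick up far more easily than raw time-domain samples.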
Structural damage detection is crucial for ensuring the safety of civil building structures in operational environments. Recently, deep learning-based methods have gained increasing attention from engineers and researchers. The performance of conventional deep learning methods for structural damage detection relies on large numbers of labeled training datasets; however, it is difficult or impossible to obtain datasets covering the various damage scenarios of in-service structures, and little research has addressed identifying both damage severity and location with limited labeled measurement data. A novel transfer learning-based method for structural damage identification with limited measurements is proposed, utilizing frequency response functions (FRFs) as the input. The real structure is regarded as the target domain and its numerical model as the source domain. Samples for various damage scenarios are generated using the numerical model, and a designed deep convolutional neural network (CNN) is pre-trained. The knowledge of the pre-trained network is then transferred to identify the damage location and severity of the real structure using limited measurement data. Numerical and experimental studies on a three-story building structure verify the performance of the proposed method. To aid the understanding of transfer learning and model interpretability, t-SNE feature visualization is adopted to show how the feature distribution changes during transfer learning. Numerical and experimental results show that the proposed approach outperforms conventional CNN models and is effective and accurate in identifying structural damage location and severity in real structures with limited measurement data.
With the rapid pace of economic and social development, the complexity and diversity of building structures continue to evolve. Over their operational lifespan, structures are subjected to various forms of degradation, including material aging and environmental erosion, which can significantly diminish their durability. Consequently, the development of robust structural health monitoring systems becomes imperative: such systems not only track the evolving performance trends of structures but also predict potential failures, extending operational lifespans and ensuring the safety of occupants and assets. This study addresses the challenges inherent in extracting detailed structural feature information and enhancing the accuracy of predictive models. Taking a cantilever beam test model as the research framework, the study explores an innovative approach to structural state prediction that integrates Variational Mode Decomposition (VMD) and Convolutional Neural Network (CNN) methodologies to analyze and forecast structural conditions effectively. By comparing the CNN's predictions on original structural signals with those on VMD-processed signals, the study demonstrates a significant improvement in prediction accuracy and overall model performance. The results show that when the number of modes K=5, compared with the original signal (K=0), the growth rate of the R-value at each measuring point reaches a maximum of 288.13%, with an average growth rate of 101.07%. This indicates that VMD-CNN can significantly improve the prediction performance of the model.
As an important form of expression in modern art, printmaking has a rich variety of types and a prominent sense of artistic hierarchy, and it is highly favored around the world for its unique artistic characteristics. Classifying print types through image feature elements improves people's understanding of print creation. Convolutional neural networks (CNNs) perform well in image classification, so a CNN is used for printmaking analysis. Considering that the classification performance of traditional convolutional image classification models is easily affected by the activation function, the T-ReLU activation function is introduced: its adjustable parameters enhance the model's soft saturation characteristics and avoid vanishing gradients, yielding a T-ReLU convolutional model. Building on this model, and addressing the subpar multi-level feature fusion of deep convolutional image classification models, an improved convolutional image classification model is proposed: the visual input is analyzed with normalization, an eleven-layer convolutional network with residual units in the convolutional layers is constructed, and cascading is used to fuse multi-level convolutional features. Performance tests showed that on data covering different styles of artificial prints, the GT-ReLU model obtains the best image classification accuracy, at 0.978. In multi-dataset classification tests, the GT-ReLU model maintains a classification accuracy above 94.4%, higher than that of other image classification models. The research thus provides a good reference for applying visual processing technology to the classification of prints.
Insects and rodents constantly cause trouble for farmers, leading to different kinds of diseases in crops, so pest control and crop maintenance are essential tasks for ensuring crop health. However, control measures bring various social and environmental issues: excessive pesticide usage can contaminate soil and water and is highly toxic to plants, and constant exposure makes pests more resistant, pushing farmers to use ever heavier pesticides. Genetic seed manipulation can provide high robustness against pest attacks but is too expensive for practical deployment. Implementation of the Internet of Things (IoT) in the agricultural domain has brought marked improvement to on-field pest management, and several pest detection and classification models based on effective techniques have been implemented in prior works. The main purpose of this survey paper is to provide a literature review of IoT-aided pest detection and classification using different kinds of images. The datasets used for pest detection and classification, the simulation platforms, and the performance measures are analyzed, and recent machine learning and deep learning methods in this field are reviewed and categorized. The survey thus supports early-stage pest detection, which improves crop production and the protection of crops, helps minimize human error, and advances automated monitoring systems for large fields.
Due to its significant applications in security, iris recognition has been among the most active research areas over the last few decades. The iris recognition framework is widely utilized for security applications because the iris offers a rich set of features and does not change its character over time. In recent times, emerging deep learning techniques have attained huge success, particularly in iris recognition, yet conventional models do not fully exploit the remarkable capability of deep learning to attain superior performance. To handle these issues, a novel heuristic-aided deep learning framework is implemented for iris recognition. Initially, the required source iris images are gathered from the data sources, followed by a pre-processing stage that yields the pre-processed images. The image segmentation process is then carried out by adaptive DeepLabv3+ layers, whose parameters are optimized using the Modified Weighted Flow Direction Algorithm (MWFDA). Finally, iris recognition is accomplished by a hybrid Multiscale Dilated-Assisted Learning (MDAL) network composed of both a Convolutional Neural Network (CNN) and a Residual Network (ResNet), with the parameters of the CNN and ResNet tuned by the MWFDA to achieve optimal recognition results. The experimental results are estimated with the help of distinct measures, and the empirical results show that, contrary to conventional methods, the recommended model achieves the desired values and enhances recognition performance.
Sports technology and 3D motion recognition require models that can accurately identify athletes' movements, which is crucial for training analysis, game strategy development, and refereeing assistance decisions. To maintain a high recognition rate across different competition scenes, athlete styles, and environmental conditions, and to ensure the practicality and reliability of the model, two independent 3D convolutional neural networks are applied to construct a Two-stream 3D Residual Networks action recognition model. Based on the temporal-spatial characteristics of human movements in video, the model introduces an attention mechanism and combines the time dimension to build a Two-stream 3D Residual Networks model integrating time-channel attention. The average Top-1 and Top-5 accuracies of the Two-stream 3D Residual Networks model with a pre-activation structure are 68.97% and 91.68%, showing that pre-activated residual blocks enhance the model's effectiveness. The average Top-1 and Top-5 precisions of the two-stream 3D spatio-temporal residual network integrating time-channel attention are 85.73% and 92.05%, which is higher still. This model is accurate and achieves good recognition results on volleyball action recognition in real scenes.