Data security and privacy have become areas of concern as image recognition and computer vision technologies have advanced. In Internet of Things (IoT) systems for remote sensing, conventional encryption methods are computationally heavy and ill-suited to multimedia data, which require strong protection mechanisms. This study proposes a novel Authentication Vision Optimizer (ACVO)-driven module that combines advanced cryptographic approaches with computer vision to build a privacy-preserving image identification system for e-archives in IoT environments. The framework comprises four modules: encryption, secure sharing, authentication, and recognition optimization, and uses high-resolution satellite images captured via electronic remote sensing, gathered from Kaggle. The encryption module employs the Advanced Encryption Standard (AES) algorithm, while the secure sharing module uses Visual Cryptography (VC) for human-readable reconstruction. The authentication module relies on Trusted Execution Environments (TEEs) to ensure data authenticity. The image recognition optimization module applies transfer learning to fine-tune a pre-trained Efficient Golden Jackal Tuned Deep Convolute Neuronet (EGJ-DCNN) on small-scale encrypted datasets. The study achieved high recognition accuracy on encrypted satellite images through transfer-learning optimization while reducing computational time with AES and visual cryptography. Data integrity is maintained through blind watermarking verification, and the privacy protection shows high resistance to unauthorized access.
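The secure-sharing idea behind Visual Cryptography can be illustrated with a minimal XOR-based (2, 2) secret-sharing sketch. This is an assumption for illustration only (the abstract does not specify the VC scheme, and the function names here are hypothetical): each share alone is uniformly random and reveals nothing, while combining both shares reconstructs the image exactly.

```python
import random

def make_shares(bits):
    """Split a binary image (list of 0/1 pixels) into two XOR shares.

    Share 1 is uniformly random; share 2 is the pixel-wise XOR of the
    secret with share 1, so each share alone carries no information.
    """
    share1 = [random.randint(0, 1) for _ in bits]
    share2 = [b ^ s for b, s in zip(bits, share1)]
    return share1, share2

def overlay(share1, share2):
    """Reconstruct the secret by combining the two shares pixel-wise."""
    return [a ^ b for a, b in zip(share1, share2)]

secret = [1, 0, 1, 1, 0, 0, 1, 0]   # toy 8-pixel binary "image"
s1, s2 = make_shares(secret)
assert overlay(s1, s2) == secret
```

Classical VC schemes use subpixel expansion so the overlay is human-readable on printed transparencies; the XOR variant above keeps the same share-and-recombine structure in its simplest form.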
As society advances, computer vision will play an increasingly crucial role in digital and intelligent transformations. Convolutional Neural Networks (CNNs), a class of deep learning models, have emerged as a key component of computer vision due to their superior performance in automatically detecting image features, handling high-dimensional data, and performing large-scale classification tasks. This paper examines the development of CNNs, leverages the strengths of current mainstream image recognition methods, and proposes a Self-Distillation and Attention-based Convolutional Neural Network (SDACNN) model to further enhance CNN accuracy. Experimental results demonstrate that the proposed model effectively accomplishes image recognition tasks.
In contrast to earlier artificial neural networks (ANNs), spiking neural networks (SNNs) operate on temporal coding. In the proposed SNN, the number of neurons, the neuron model, the encoding method, and the learning algorithm are described clearly and precisely. We also discuss how optimizing the SNN parameters based on physiology, and maximizing the information they pass, leads to a more robust network. In this paper, inspired by the “center-surround” structure of receptive fields in the retina and the amount of overlap between them, a robust SNN is implemented. It is based on the Integrate-and-Fire (IF) neuron model and uses time-to-first-spike coding to train the network with a newly proposed method. The Iris and MNIST datasets were employed to evaluate the proposed network, whose accuracy with 60 input neurons was 96.33% on the Iris dataset. The network was trained in only 45 iterations, indicating a reasonable convergence rate. For the MNIST dataset, when the gray level of each pixel was fed to the network, 600 input neurons were required and the accuracy was 90.5%. Next, 14 structural features were used as input, so the number of input neurons decreased to 210 and the accuracy rose to 95%, meaning that an SNN with fewer input neurons and good skill was implemented. The ABIDE1 dataset was also applied to the proposed SNN: of the 184 samples, 79 are from healthy subjects and 105 from people with autism. One characteristic that can differentiate these two classes is the entropy of the data, so Shannon entropy is used for feature extraction. Applying these values to the proposed SNN, an accuracy of 84.42% was achieved in only 120 iterations, which compares well with recent results.
Deep Feedforward Neural Networks (FNNs) with skip connections have revolutionized various image recognition tasks. In this paper, we propose a novel architecture called the bidirectional FNN (BiFNN), which uses skip connections to aggregate features between its forward and backward paths. The BiFNN accepts any FNN as a plugin, so any general FNN model can be incorporated into its forward path, introducing only a few additional parameters in the cross-path connections. The backward path is implemented as a parameter-free layer using a discretized form of the neural memory Ordinary Differential Equation (nmODE), named the ϵ-net. We provide a proof of convergence for the ϵ-net and analyze its initial value problem. The proposed architecture is evaluated on diverse image recognition datasets, including Fashion-MNIST, SVHN, CIFAR-10, CIFAR-100, and Tiny-ImageNet. The results demonstrate that BiFNNs offer significant improvements over embedded models such as ConvMixer, ResNet, ResNeXt, and Vision Transformer. Furthermore, BiFNNs can be fine-tuned to achieve performance comparable to the embedded models on the Tiny-ImageNet and ImageNet-1K datasets by loading the same pretrained parameters.
Here, we review the classical Hamming associative memory and we discuss a cellular implementation. We show how the patterns to be stored can be superimposed or enfolded onto a single memory element with exponential storage capacity and how these memory elements can be organized in a cellular network architecture suitable for pattern association.
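The core recall operation of a Hamming associative memory can be sketched directly: given a noisy probe, the memory returns the stored pattern at minimum Hamming distance. This is a minimal reference implementation of the classical idea, not the cellular architecture the abstract describes.

```python
def hamming_recall(stored, probe):
    """Return the stored pattern nearest to the (possibly noisy) probe,
    measured by Hamming distance (number of differing bits)."""
    def hd(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(stored, key=lambda p: hd(p, probe))

patterns = [(1, 1, 1, 0, 0), (0, 0, 1, 1, 1), (1, 0, 1, 0, 1)]
noisy = (1, 1, 0, 0, 0)      # first pattern with one bit flipped
assert hamming_recall(patterns, noisy) == (1, 1, 1, 0, 0)
```

The cellular implementation discussed in the paper distributes this minimum-distance search across a network of locally connected memory elements rather than computing it centrally.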
The application of machine vision to industrial robots is a hot topic in current robotics research. A welding robot with machine vision was developed; its six-degrees-of-freedom (DOF) manipulator reaches the welding point conveniently and flexibly, singularities along its motion trail are prevented, and the stability of the mechanism is fully guaranteed. As the precision industrial camera captures the optical features of the workpiece onto its CCD sensor, the workpiece is identified and located by a visual pattern recognition algorithm based on gray-scale processing, the gradient direction of edge pixels, or geometric elements, so that high-speed visual acquisition, image preprocessing, feature extraction and recognition, and target location are integrated and hardware processing power is improved. Another task is planning the control strategy of the control system, with the upper-computer software programmed so that the multi-axis motion trajectory is optimized and servo control is accomplished. Finally, a prototype was developed, and validation experiments show that the welding robot achieves high stability, efficiency, and precision, even when welding joints are randomly placed and workpiece contours are irregular.
Fire is one of the most common serious disasters in human society: a burning phenomenon that is out of control in time and space. When a fire occurs, detecting it quickly and extinguishing it in its early stage is the key task of fire control work. Outdoor fires are common in daily life, and once one occurs without effective and timely control, it causes huge losses, so studying an intelligent alarm system for outdoor fire is particularly important. Fire detection technology can generally be divided into sensor-based and image-based approaches. Sensor-based detection is low cost and easy to design, but its application domain is limited, and external interference leads to false alarms and missed detections. Image-based detection can achieve a certain detection capability through manually designed features and classifiers, but it still falls short in diverse real-world environments. In recent years, neural network technology has made great breakthroughs in image recognition: its decisions are learned from large amounts of training data, and its automatic feature extraction and classification let it adapt effectively to the external environment. Therefore, this paper proposes an end-to-end two-stream neural network model to detect fires, trains the algorithm on fire videos from the web, and then tests it on a fire database. Compared with existing fire detection algorithms, the proposed method shows good practicality and versatility, and provides a useful reference for the development of fire detection technology.
Deep learning here refers to the Convolutional Neural Network (CNN), which is used for image recognition in this study. The dataset, Fruits-360, is obtained from Kaggle. Seventy percent of the pictures are selected as training data and the rest are used for testing. In this study, the image size is 100×100×3. Training is realized using Stochastic Gradient Descent with Momentum (sgdm), Adaptive Moment Estimation (adam), and Root Mean Square Propagation (rmsprop). The threshold for training is set at 98%: when accuracy exceeds 98%, training stops. The final validation accuracy is computed with the trained network; more than 98% of the predicted labels match the true labels of the validation set. Test-set accuracies for sgdm, adam, and rmsprop are 98.08%, 98.85%, and 98.88%, respectively. Clearly, fruits are recognized with good accuracy.
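The 98% stopping rule described above amounts to a simple accuracy-threshold early stop. The sketch below (a hypothetical helper, with a simulated accuracy curve standing in for real training) shows that rule in isolation:

```python
def train_with_threshold(epoch_accuracies, threshold=0.98):
    """Halt as soon as accuracy exceeds the threshold; return the
    (stop_epoch, accuracy) pair, or the final epoch if never reached."""
    for epoch, acc in enumerate(epoch_accuracies, start=1):
        if acc > threshold:
            return epoch, acc
    return len(epoch_accuracies), epoch_accuracies[-1]

# simulated per-epoch accuracy curve for one run
curve = [0.62, 0.81, 0.90, 0.955, 0.971, 0.983, 0.986]
assert train_with_threshold(curve) == (6, 0.983)
```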
With the development of digital image processing technology, the application scope of image recognition has widened to touch all aspects of life. In particular, rapid urbanization and the popularization of automobiles in recent years have sharply increased traffic problems in many countries, making intelligent transportation technology based on image processing and optimized control an important research field of intelligent systems. Based on an analysis of application requirements for intelligent transportation systems, this paper designs a set of high-definition checkpoint ("bayonet") systems for intelligent transportation. It combines data mining with distributed parallel Hadoop technology to design the architecture of an intelligent traffic operation state data analysis system, together with a mining algorithm suited to that system. Experiments with real traffic big data prove the feasibility of the system and aim to provide decision support on traffic state. Using the deployed Hadoop server cluster and an AdaBoost algorithm under an improved MapReduce programming model, the example processes large traffic datasets, performs traffic-flow and speeding analysis, and extracts information conducive to traffic control, demonstrating the feasibility and effectiveness of mining massive traffic information on the Hadoop platform.
In the image recognition field, several techniques allow identifying patterns in digital images, correlation being one of them. In correlation, the goal is to obtain an output plane that is as clean as possible. To measure the sharpness of the correlation peak and the cleanliness of the output plane, a performance metric called Peak to Correlation Energy (PCE) is used.
In this paper, the fractional correlation is applied to recognize real phytoplankton images. This fractional correlation guarantees a higher PCE compared to the conventional correlation. The PCE results are two orders of magnitude higher than those obtained with the conventional correlation, and the method identifies 91.23% of the images, while the conventional correlation identifies only 87.42% of them.
This methodology was tested on images corrupted with salt-and-pepper or Gaussian noise, and the fractional correlation output plane is always cleaner and generates a better-defined correlation peak than the classical correlation.
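The PCE metric used above can be sketched directly from its usual definition: the squared magnitude of the correlation peak divided by the total energy of the output plane, so a sharp peak on a clean background scores higher. This is a generic illustration of the metric, not the paper's fractional-correlation pipeline.

```python
def pce(plane):
    """Peak-to-Correlation-Energy of a correlation output plane
    (list of rows): |peak|^2 / sum over the plane of |value|^2."""
    energies = [abs(v) ** 2 for row in plane for v in row]
    return max(energies) / sum(energies)

sharp = [[0, 0, 0], [0, 10, 0], [0, 0, 0]]   # one dominant peak
noisy = [[3, 2, 3], [2, 10, 2], [3, 2, 3]]   # same peak on a noisy floor
assert pce(sharp) > pce(noisy)               # cleaner plane -> higher PCE
```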
Because logistics sorting warehouses are dimly lit and complex, and differences between express packages are not obvious, a fast method for recognizing sorting images based on deep learning and the dual-tree complex wavelet transform was studied. Sorting images are unclear due to factors such as the enclosed environment and the weak lighting of the warehouse. First, the dual-tree complex wavelet transform is used to preprocess the sorting image for noise reduction and other preprocessing. Second, a convolutional neural network (CNN) was designed: on the basis of the AlexNet network, the parameters of the convolutional, ReLU, and pooling layers are redefined to accelerate learning. Lastly, for the new image classification task, the last three layers of the network (the fully connected layer, the softmax layer, and the classification output layer) are redefined to adapt to the new recognition task. The proposed fast sorting image recognition technique based on deep learning attains higher training speed and recognition accuracy on complicated sorting images, meeting the experimental requirements. Rapid identification of sorting images is of great significance for improving logistics efficiency in unmanned warehouses.
Real-time and accurate measurement of coal quantity is key to energy saving and speed regulation of belt conveyors. The electronic belt scale and the nuclear scale are the commonly used methods for detecting coal quantity. However, the electronic belt scale uses contact measurement with low accuracy and a large error range, and although nuclear detection methods are highly accurate, radiation gives them serious safety hazards. For these reasons, this paper presents a method of coal quantity detection and classification based on machine vision and deep learning. An industrial camera collects dynamic images of coal on the conveyor belt illuminated by a laser transmitter. After preprocessing, skeleton extraction, laser-line thinning, disconnection repair, image fusion, and filling, the collected images yield coal-flow cross-sectional images. From the cross-sectional area and the belt speed, the coal volume per unit time is obtained, realizing dynamic coal quantity detection. On this basis, to realize dynamic classification of coal quantity, the cross-sectional images corresponding to different coal quantities are divided into classes to establish a coal quantity dataset. A Dense-VGG network for dynamic coal classification is then built from the VGG16 network. After training, the dynamic classification performance of the method is verified on an experimental platform. The results show that the classification accuracy reaches 94.34%, and the processing time of a single frame is 0.270 s.
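The volume computation at the heart of the detection step is simply cross-sectional area times belt speed. A one-line sketch (units and function name are illustrative assumptions, not from the paper):

```python
def coal_flow_rate(cross_section_area_m2, belt_speed_m_s):
    """Volumetric coal flow (m^3/s) from the laser-line cross-section
    area and the conveyor belt speed: volume/time = area * speed."""
    return cross_section_area_m2 * belt_speed_m_s

# e.g. a 0.05 m^2 cross-section on a belt moving at 2.0 m/s
assert abs(coal_flow_rate(0.05, 2.0) - 0.1) < 1e-12
```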
Deep learning algorithms have shown superior performance to traditional algorithms on computationally intensive tasks in many fields; deep learning models perform well and can improve recognition accuracy in computer vision applications. TensorFlow is a flexible open-source machine learning platform from Google that runs on a variety of platforms, such as CPUs, GPUs, and mobile devices, and supports today's popular deep learning models. In this paper, an image recognition toolkit based on TensorFlow is designed and developed to simplify the development of the growing number of image recognition applications. The toolkit uses a convolutional neural network as its training model, consisting of two convolutional layers, with a batch normalization layer before and a pooling layer after each convolutional layer; the last two layers of the model are fully connected layers that output the recognition results. The optimizer uses mini-batch gradient descent, which combines the advantages of batch gradient descent and stochastic gradient descent, greatly reducing the number of iterations to convergence with little effect on the converged result. The model has 1.7 million trainable parameters in total. To prevent overfitting, a dropout layer with a threshold of 0.5 is added before each fully connected layer. The model is trained and tested on the MNIST set in TensorFlow, and achieves 99% recognition accuracy on the MNIST test set. The toolkit provides strong technical support for developing various image recognition applications, reduces their difficulty, and improves the efficiency of resource utilization.
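The dropout regularization mentioned above can be sketched without any framework. This is a generic "inverted dropout" illustration (the standard formulation, assumed here; the toolkit's own implementation is not shown in the abstract): during training each activation is zeroed with probability p and survivors are scaled by 1/(1-p) so the expected activation is unchanged, while at test time the layer is the identity.

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each activation with probability p during
    training and scale survivors by 1/(1-p); identity at test time."""
    if not training:
        return list(activations)
    return [0.0 if random.random() < p else a / (1.0 - p)
            for a in activations]

out = dropout([1.0, 2.0, 3.0, 4.0], p=0.5)
# every output is either dropped (0.0) or exactly doubled
assert all(v == 0.0 or v in (2.0, 4.0, 6.0, 8.0) for v in out)
assert dropout([1.0, 2.0], training=False) == [1.0, 2.0]
```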
The aggregated residual network (ResNeXt) can improve accuracy without increasing parameter complexity while also reducing the number of hyperparameters, and it is one of the popular convolutional neural network models for image recognition. Maize diseases strongly affect maize yield, quality, and farmers' income, so rapid and effective identification of disease severity plays an important role in accurate control and precise pesticide use. The general ResNeXt model extracts coarse, large-scale image features, but corn disease spots are small and their extracted features are not obvious, which hurts recognition accuracy. Therefore, an improved ResNeXt model is proposed to recognize the severity of corn diseases. First, the original data on maize disease severity are classified according to national standards. Second, the original data are extended through data augmentation. Third, the original ResNeXt101 model is improved: the first-layer convolution kernel is replaced with three 3×3 convolution kernels, and the cardinality is set to 64. Finally, the improved model is verified. Its recognition accuracy for corn disease severity is 89.667%, 0.98% higher than the original model, and on 276 actually collected corn disease images, the recognition accuracy is 90.22%. This method is therefore feasible for diagnosing corn disease severity and can provide an important basis for accurate prevention and control.
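A common motivation for replacing one large first-layer kernel with three stacked 3×3 kernels (as the improved model does) is that the stack covers the same receptive field with fewer parameters and more nonlinearities. The standard receptive-field recurrence makes this concrete; the sketch below is a generic illustration, not code from the paper.

```python
def stacked_receptive_field(kernel_sizes, strides=None):
    """Receptive field of a stack of conv layers, via the standard
    recurrence r <- r + (k - 1) * jump, jump <- jump * stride."""
    rf, jump = 1, 1
    strides = strides or [1] * len(kernel_sizes)
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf

# Three stacked 3x3 convolutions see the same 7x7 region as one 7x7
# kernel, with fewer parameters and two extra nonlinearities between.
assert stacked_receptive_field([3, 3, 3]) == 7
assert stacked_receptive_field([7]) == 7
```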
This paper presents a technique for image recognition, reconstruction, and processing using a novel massively parallel system. This device is a physical implementation of a Boltzmann machine type of neural network based on the use of magnetic thin films and opto-magnetic control. Images or patterns in the form of pixel arrays are imposed on the magnetic film using a laser in an external magnetic field. These images are learned and can be recalled later when a similar image is presented. A stored image is recallable even when a partial, noisy, or corrupted version of that image is imposed on the film. The system can also be used for feature detection and image compression. The operation and construction of the physical system are described, together with a discussion of the physical basis for its operation.
The authors have developed Monte Carlo-style computer simulations of the system for a variety of platforms, including serial workstations and hypercube-configured parallel systems. They describe some of the factors involved in simulating the system, which can be fast and relatively simple to implement. Simulation results are presented and, in particular, the behavior of the model under simulated annealing is discussed in the light of statistical physics. The simulation itself can be used as a neural network model capable of the functions ascribed to the physical device.
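The Monte Carlo/simulated-annealing procedure underlying Boltzmann-machine simulations can be sketched on a toy system. The example below anneals a tiny 1-D Ising ring with the Metropolis rule (a standard textbook illustration, standing in for the paper's magnetic-film model, which it does not reproduce): propose a random spin flip, accept it if it lowers the energy, otherwise accept with Boltzmann probability exp(-dE/T) while T is cooled.

```python
import math
import random

def chain_energy(s):
    """Energy of a ferromagnetic 1-D Ising ring: low when spins align."""
    n = len(s)
    return -sum(s[i] * s[(i + 1) % n] for i in range(n))

def anneal(spins, steps=2000, t_start=5.0, t_end=0.1):
    """Metropolis simulated annealing with a linear cooling schedule."""
    s = list(spins)
    for step in range(steps):
        t = t_start + (t_end - t_start) * step / (steps - 1)
        i = random.randrange(len(s))
        e_before = chain_energy(s)
        s[i] = -s[i]                      # propose a spin flip
        d_e = chain_energy(s) - e_before
        if d_e > 0 and random.random() >= math.exp(-d_e / t):
            s[i] = -s[i]                  # reject the uphill move
    return s

random.seed(1)
start = [1, -1, 1, -1, 1, -1, 1, -1]      # alternating: maximal energy
final = anneal(start)
assert chain_energy(final) < chain_energy(start)
```

In the full Boltzmann machine, the same accept/reject dynamics run over learned pairwise weights instead of uniform nearest-neighbor couplings.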
We compare the classification ability of the BPN (Back-Propagation Neural Network) and k-NN (k-Nearest Neighbor) classification methods, using voice data and patellar subluxation images. The average recognition rate of BPN was 9.2 percent higher than that of the k-NN method. Although k-NN classification is simple in theory, its classification time was fairly long, so real-time recognition seems difficult. The BPN method, by contrast, has a long learning time but a very short recognition time. Especially when the dimensionality of the samples is large, BPN can be said to outperform k-NN in classification ability.
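The timing asymmetry described above follows from how k-NN works: every prediction scans the entire training set, so recognition is slow even though there is no learning phase. A minimal generic k-NN sketch (illustrative only, not the paper's implementation):

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify a query by majority vote among its k nearest training
    samples (squared Euclidean distance). Note the full scan of the
    training set on every call: this is why k-NN recognition is slow."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b")]
assert knn_predict(train, (0.2, 0.4), k=3) == "a"
```

A trained BPN, by contrast, compresses the training set into fixed weights, so recognition is a single forward pass regardless of how many samples were used in training.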
This paper presents a methodology for integrating connectionist and symbolic approaches to 2D image recognition. The proposed integration paradigm exploits the synergy of the two approaches in both the training and the recognition phases of an image recognition system. In the training phase, a symbolic module provides an approximate solution to a given image-recognition problem in terms of symbolic models. Such models are hierarchically organized into different abstraction levels and include contextual descriptions. After mapping these models into a complex neural architecture, a neural training process is carried out to optimize the solution of the recognition problem. The resulting neural networks are used during the recognition phase for pattern classification. In this phase, the role of the symbolic modules is to manage the complex aspects of information processing: abstraction levels, contextual information, and global recognition hypotheses. A hybrid system implementing the proposed integration paradigm is presented, and its advantages over the single approaches are assessed. Results on Magnetic Resonance image recognition are reported, and comparisons with some well-known classifiers are made.
A new algebraic feature extraction method for image recognition is presented. An optimal transform of image matrices is proposed to extract features from images. The Frobenius norm of matrices is first introduced as a measure of the distance between two matrices; based on it, the within-class and between-class distances of image samples are defined. The ratio of the between-class to the within-class distance of the transformed image sample set is taken as the criterion function J(T), and the optimal transform matrix T is calculated by maximizing J(T) under some constraints. Experiments were conducted on both human face and handwritten character images, and the results indicate that the algebraic features extracted by the present method possess very strong discriminant power. An important conclusion is that the traditional linear discriminant method can be considered a special case of the present method when each image sample consists of a single column vector.
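The criterion J(T) above can be sketched by evaluating the between-class to within-class scatter ratio with Frobenius distances. The sketch fixes T to the identity for illustration (the paper optimizes T, which is not attempted here; all function names are hypothetical):

```python
def frob_dist(A, B):
    """Squared Frobenius distance between two same-shape matrices."""
    return sum((a - b) ** 2
               for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def mean_matrix(mats):
    """Element-wise mean of a list of same-shape matrices."""
    n, rows, cols = len(mats), len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) / n for j in range(cols)]
            for i in range(rows)]

def criterion_J(classes):
    """Between-class over within-class Frobenius scatter: the J(T)
    ratio, evaluated here with T = identity for illustration."""
    class_means = [mean_matrix(c) for c in classes]
    global_mean = mean_matrix([m for c in classes for m in c])
    between = sum(len(c) * frob_dist(mu, global_mean)
                  for c, mu in zip(classes, class_means))
    within = sum(frob_dist(m, mu)
                 for c, mu in zip(classes, class_means) for m in c)
    return between / within

class_a = [[[0.0, 0.0], [0.0, 0.0]], [[0.2, 0.0], [0.0, 0.2]]]
class_b = [[[5.0, 5.0], [5.0, 5.0]], [[5.2, 5.0], [5.0, 5.2]]]
assert criterion_J([class_a, class_b]) > 1.0   # well-separated classes
```

Maximizing this ratio over T drives the transform to spread class means apart while compacting each class, which is what gives the extracted algebraic features their discriminant power.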
The existing tobacco curing process assumes a uniform distribution of temperature and humidity in the barn without considering the surface, texture, and biochemical properties of the leaves, leading to low-quality or even inferior end products. This paper proposes a novel curing process combining image recognition and data analysis techniques that aims to intelligently improve the curing quality of tobacco leaves. Specifically, an image recognition technique is first proposed to classify tobacco leaves and determine their placement in the curing barn. Then, data analysis of the biochemical spectrum of the tobacco leaves is conducted to correlate temperature and humidity with biochemical data features. Extensive experimental results show that the proposed curing process achieves 98.68% accuracy in image recognition for tobacco position control and provides an accurate mapping between tobacco state and biochemical spectrum signals.
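The correlation step above, relating temperature or humidity to biochemical features, can be sketched with the plain Pearson coefficient (a generic choice assumed here; the paper does not name its correlation measure, and the data values below are invented for illustration):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

temps = [38.0, 40.0, 42.0, 45.0, 48.0]        # hypothetical barn temps
feature = [0.12, 0.18, 0.22, 0.30, 0.37]      # hypothetical spectral feature
assert pearson(temps, feature) > 0.95          # strongly correlated
```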
In the operation of railway vehicles, the quality of the bogies directly affects operating quality and driving safety. The wheel set is one of the most important components of the bogie, so its maintenance is very important. For a long time, the inspection of train wheel sets in China has remained at the stage of manual measurement, with backward technology and low efficiency. This paper proposes a new automatic detection method for wheel flange and tread defects based on fuzzy-neural-network image processing. The method collects an original image of the tested wheel set with a digital camera, inputs it into a computer, processes it, and compares it with a model built on a fuzzy neural network, so as to detect flange and tread defects. First, the research status of wheel tread defect detection is summarized. Second, the basic principles of digital image technology are studied, the image processing models are confirmed, and an image processing method based on a fuzzy neural network is established. Finally, eight wheel-set treads are selected for defect detection, and the analysis shows that the proposed method achieves good inspection precision.