The monitoring and maintenance of grid equipment have become increasingly crucial due to the continual progress in smart grid technology. Efficient identification technology for grid equipment is crucial for enabling equipment status monitoring and fault diagnosis, directly influencing the operational stability of the grid concerning precision and timely functionality. Nevertheless, the reliance of current image recognition methods on intricate models and extensive computational resources poses implementation challenges in resource-limited field environments, thereby restricting their use in operations such as drone-based power line inspections. In response to this obstacle, the paper introduces a streamlined identification approach for grid equipment through model compression. This method aims to uphold recognition precision while minimizing the computational workload and storage demands of the model, making it well-suited for integration into drone-based power line inspections. Introducing a target recognition network, this method integrates tailored multi-scale information for grid equipment and embeds an attention mechanism within the network to enhance the model’s capacity for identifying crucial features. Expanding on this approach, model compression techniques are utilized to condense the trained model. This process maintains accuracy by removing redundant weights, thereby shrinking the model’s size and computational complexity, ultimately achieving a lightweight network.
We propose a general framework for broadcasting in ad hoc networks through self-pruning. The approach is based on selecting a small subset of hosts (also called nodes) to form a forward node set that carries out the broadcast process. Each node, upon receiving a broadcast packet, determines whether or not to forward the packet based on two neighborhood coverage conditions proposed in this paper. These coverage conditions depend on neighbor connectivity and the history of visited nodes and, in general, rely on global network information. Using local information such as k-hop neighborhood information, the forward node set is selected through a distributed and local pruning process. The forward node set can be constructed and maintained through either a proactive process (i.e., "up-to-date") or a reactive process (i.e., "on-the-fly"). Several existing broadcast algorithms can be viewed as special cases of the coverage conditions with k-hop neighborhood information. Simulation results show that new algorithms, which are more efficient than existing ones, can be derived from the coverage conditions, and that self-pruning based on 2- or 3-hop neighborhood information is relatively more cost-effective.
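The forwarding decision described above can be illustrated with the simplest neighbor-coverage rule. This is a minimal sketch under stated assumptions, not the two coverage conditions proposed in the paper: a node stays silent when each of its neighbors is either the sender or a neighbor of the sender, using only 1-hop neighbor sets. The function name `should_forward` and the adjacency dictionaries are hypothetical.

```python
def should_forward(node, sender, adjacency):
    """Return True if `node` should rebroadcast a packet received from `sender`.

    Assumed rule (simplest special case): forward only if some neighbor of
    `node` is not already reached by the sender's own transmission.
    """
    covered = adjacency[sender] | {sender}   # nodes that heard the sender
    uncovered = adjacency[node] - covered    # neighbors still missing the packet
    return len(uncovered) > 0


# Hypothetical topologies: a 3-node line and a 3-node triangle.
line = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}

print(should_forward("b", "a", line))      # b must relay so that c is reached
print(should_forward("b", "a", triangle))  # c already heard a, so b can prune itself
```

In the line topology the middle node must relay, while in the triangle it can safely stay silent; richer conditions would also use k-hop information and the history of visited nodes.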
The global extended Kalman filtering (EKF) algorithm for recurrent neural networks (RNNs) is plagued by the drawback of high computational cost and storage requirement. In this paper, we present a local EKF training-pruning approach that can solve this problem. In particular, the by-products obtained along with the local EKF training can be utilized to measure the importance of the network weights. Compared with the original global approach, the proposed local approach results in much lower computational cost and storage requirement. Hence, it is more practical for solving real-world problems. Simulations showed that our approach is an effective joint training-pruning method for RNNs under online operation.
This paper describes a new method for pruning artificial neural networks, using a measure of the neural complexity of the neural network. This measure is used to determine the connections that should be pruned. The measure computes the information-theoretic complexity of a neural network, which is similar to, yet different from, previous research on pruning. The method proposed here shows how overly large and complex networks can be reduced in size, whilst retaining learnt behaviour and fitness. The technique proposed here helps to discover a network topology that matches the complexity of the problem it is meant to solve. This novel pruning technique is tested in a robot control domain, simulating a racecar. It is shown that the proposed pruning method is a significant improvement over the most commonly used pruning method, Magnitude Based Pruning. Furthermore, some of the pruned networks prove to be faster learners than the benchmark network that they originate from. This means that this pruning method can also help to unleash hidden potential in a network, because the learning time decreases substantially for a pruned network, due to the reduction of dimensionality of the network.
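For context, the baseline named above, magnitude-based pruning, can be sketched in a few lines. This illustrates only the baseline, not the complexity-based measure the paper proposes; the function name `magnitude_prune` and the flat weight list are assumptions made for the example.

```python
def magnitude_prune(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0  # a zero weight stands for a removed connection
    return pruned


print(magnitude_prune([0.5, -0.1, 0.03, -0.9], fraction=0.5))
```

The rule looks only at weight size, which is why measures that account for a connection's role in the network can behave differently.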
While feedforward neural networks have been widely accepted as effective tools for solving classification problems, the issue of finding the best network architecture remains unresolved, particularly so in real-world problem settings. We address this issue in the context of credit card screening, where it is important to not only find a neural network with good predictive performance but also one that facilitates a clear explanation of how it produces its predictions. We show that minimal neural networks with as few as one hidden unit provide good predictive accuracy, while having the added advantage of making it easier to generate concise and comprehensible classification rules for the user. To further reduce model size, a novel approach is suggested in which network connections from the input units to this hidden unit are removed by a very straightforward pruning procedure. In terms of predictive accuracy, both the minimized neural networks and the rule sets generated from them are shown to compare favorably with other neural network based classifiers. The rules generated from the minimized neural networks are concise and thus easier to validate in a real-life setting.
This paper presents a pruning method for artificial neural networks (ANNs) based on the 'Lempel-Ziv complexity' (LZC) measure. We call this method the 'silent pruning algorithm' (SPA). The term 'silent' is used in the sense that SPA prunes ANNs without causing much disturbance during the network training. SPA prunes hidden units during the training process according to their ranks computed from LZC. LZC extracts the number of unique patterns in a time sequence obtained from the output of a hidden unit and a smaller value of LZC indicates higher redundancy of a hidden unit. SPA has a great resemblance to biological brains since it encourages higher complexity during the training process. SPA is similar to, yet different from, existing pruning algorithms. The algorithm has been tested on a number of challenging benchmark problems in machine learning, including cancer, diabetes, heart, card, iris, glass, thyroid, and hepatitis problems. We compared SPA with other pruning algorithms and we found that SPA is better than the 'random deletion algorithm' (RDA) which prunes hidden units randomly. Our experimental results show that SPA can simplify ANNs with good generalization ability.
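To make the idea of counting unique patterns in a hidden unit's output sequence concrete, here is a rough sketch. It assumes the outputs are binarized at their mean and uses a simple dictionary-based (LZ78-style) phrase count, which may differ from the exact LZC variant used in the paper; `binarize` and `lz_phrase_count` are hypothetical names.

```python
def binarize(outputs):
    """Turn a real-valued output sequence into a '0'/'1' string (threshold: mean)."""
    mean = sum(outputs) / len(outputs)
    return "".join("1" if x > mean else "0" for x in outputs)


def lz_phrase_count(symbols):
    """Count phrases in a left-to-right parse where each phrase is new to the dictionary."""
    seen = set()
    phrase = ""
    count = 0
    for s in symbols:
        phrase += s
        if phrase not in seen:   # first time this pattern appears
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:                   # trailing, already-seen fragment
        count += 1
    return count


print(lz_phrase_count("0000000000"))  # constant sequence: few phrases
print(lz_phrase_count("0110100110"))  # varied sequence: more phrases
```

Under this sketch, a hidden unit whose binarized output yields a lower count would rank as more redundant, matching the ranking idea described above.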
Deep Convolutional Neural Networks (CNNs) show remarkable performance in many areas. However, most of the applications require huge computational costs and massive memory, which are hard to obtain in devices with relatively weak performance, such as embedded devices. To reduce the computational cost while preserving the performance of the trained deep CNN, we propose a new filter pruning method using an additional dataset derived by downsampling the original dataset. Our method takes advantage of the fact that information in high-resolution images is lost in the downsampling process. Each trained convolutional filter reacts differently to this information loss. Based on this, the importance of each filter is evaluated by comparing the gradients obtained from images at two different resolutions. We validate the superiority of our filter evaluation method using a VGG-16 model trained on the CIFAR-10 and CUB-200-2011 datasets. The network pruned with our method shows an average of 2.66% higher accuracy on the latter dataset, compared to existing pruning methods, when about 75% of the parameters are removed.
Learning when limited to modification of some parameters has a limited scope; capability to modify the system structure is also needed to get a wider range of the learnable. In the case of artificial neural networks, learning by iterative adjustment of synaptic weights can only succeed if the network designer predefines an appropriate network structure, i.e. the number of hidden layers, units, and the size and shape of their receptive and projective fields. This paper advocates the view that the network structure should not, as is usually done, be determined by trial-and-error but should be computed by the learning algorithm. Incremental learning algorithms can modify the network structure by addition and/or removal of units and/or links. A survey of current connectionist literature is given on this line of thought. “Grow and Learn” (GAL) is a new algorithm that learns an association at one shot due to its being incremental and using a local representation. During the so-called “sleep” phase, units that were previously stored but which are no longer necessary due to recent modifications are removed to minimize network complexity. The incrementally constructed network can later be finetuned off-line to improve performance. Another method proposed that greatly increases recognition accuracy is to train a number of networks and vote over their responses. The algorithm and its variants were tested on recognition of handwritten numerals and seem promising especially in terms of learning speed. This makes the algorithm attractive for on-line learning tasks, e.g. in robotics. The biological plausibility of incremental learning is also discussed briefly.
Having more hidden units than necessary can produce a neural network that generalizes poorly. This paper proposes a new algorithm for pruning unnecessary hidden units away from single-hidden layer feedforward neural networks, resulting in a Spartan network. Our approach is simple and easy to implement, yet produces very good results. The idea is to train the network until it begins to lose its generalization. Then the algorithm measures the sensitivity and automatically prunes away the most irrelevant unit. We define this sensitivity as the absolute difference between the desired output and the output of the pruned network. Unlike other pruning methods, our algorithm is distinct in calculating the sensitivity from the validation set, instead of the training set, without increasing the asymptotic time complexity of the back-propagation algorithm. In addition, for a classification problem, we raise the point that the sensitivities of some well-known pruning algorithms may still underestimate the irrelevance of a hidden unit even though the validation set is used in measuring the sensitivity. We resolve this problem by considering the number of misclassified patterns as the main concern. The Spartan simplicity algorithm is applied to three artificial and seven standard benchmarks. In most problems, the algorithm can produce a compact-sized network with high generalization ability in comparison with other pruning algorithms.
Meta-learning has been widely used in medical image analysis. However, it requires a large amount of storage space and computing resources to train and use neural networks, especially model-agnostic meta-learning (MAML) models, making networks difficult to deploy on embedded systems and low-power devices for smart healthcare. To address this problem, we explore compressing a MAML model with pruning methods for disease diagnosis. First, for each task, we find the connections in MAML that are unimportant and redundant for its classification. Next, we find the unimportant connections common to most tasks by taking intersections. Finally, we prune the common unimportant connections of the initial network. We conduct experiments to assess the proposed model by comparison with MAML on the Omniglot and MiniImagenet datasets. The results show that our method reduces the parameters of the raw models by 40% without incurring accuracy loss, demonstrating the potential of the proposed method for disease diagnosis.
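The three steps above (per-task unimportant connections, their intersection across tasks, then pruning) can be sketched as follows. This is an illustrative reading only: the abstract does not say how importance is scored or how "most tasks" is quantified, so the per-task sets and the `min_fraction` parameter are assumptions, as is the function name `common_unimportant`.

```python
from collections import Counter


def common_unimportant(per_task_sets, min_fraction=1.0):
    """Return connections flagged as unimportant in at least `min_fraction` of tasks.

    With min_fraction=1.0 this is a plain set intersection over all tasks.
    """
    counts = Counter(conn for task in per_task_sets for conn in task)
    needed = min_fraction * len(per_task_sets)
    return {conn for conn, n in counts.items() if n >= needed}


# Hypothetical connection indices flagged as unimportant for three tasks.
tasks = [{1, 2, 3}, {2, 3, 4}, {2, 3, 5}]
print(sorted(common_unimportant(tasks)))
```

Only the returned connections would then be removed from the shared initial network, leaving task-specific connections untouched.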
Deep neural networks have made surprising achievements in natural language processing, image pattern classification and recognition, and other domains in the last few years. They are still hard to apply to hardware-constrained or mobile equipment because of the huge number of parameters and the high storage and computing costs. In this paper, a new sparse iteration neural network architecture is proposed. First, the pruning method is used to compress the model size and make the network sparse. Then the architecture is iterated on the sparse network model, and the network performance is improved without adding additional parameters. Finally, the hybrid deep learning model is evaluated on CV and NLP tasks with ANN, CNN, and Transformer models. Compared with the sparse network architecture, accuracy on the MNIST, CIFAR10, PASCAL VOC 2012, and SQuAD datasets is improved by 0.47%, 0.64%, 3.75%, and 15.06%, respectively.
Testing activities for software product lines should be different from those of single software systems, due to significant differences between software product line engineering and single software system development. The cost of testing in software product lines is generally higher compared with single software systems; therefore, there should exist a certain balance between cost, quality of final products, and the time spent performing testing activities. As decreasing testing cost is an important challenge in software product line integration testing, the contribution of this paper is a method for early integration testing in software product lines (SPLs) based on the feature model (FM), which prioritizes test cases in order to decrease integration testing costs in SPLs. In this method, we focus on reusing domain engineering artifacts and on prioritized selection and execution of integration test cases. It also uses separation of concerns and pruning techniques on FMs to help prioritize the test cases. The method proves promising when applied to some case studies in the sense that it decreases the costs of performing integration testing by about 82% and also detects about 44% of integration faults in domain engineering.
Error based pruning can be used to prune a decision tree and it does not require the use of validation data. It is implemented in the widely used C4.5 decision tree software. It uses a parameter, the certainty factor, that affects the size of the pruned tree. Several researchers have compared error based pruning with other approaches, and have shown results that suggest that error based pruning results in larger trees that give no increase in accuracy. They further suggest that as more data is added to the training set, the tree size after applying error based pruning continues to grow even though there is no increase in accuracy. It appears that these results were obtained with the default certainty factor value. Here, we show that varying the certainty factor allows significantly smaller trees to be obtained with minimal or no accuracy loss. Also, the growth of tree size with added data can be halted with an appropriate choice of certainty factor. Methods of determining the certainty factor are discussed for both small and large data sets. Experimental results support the conclusion that error based pruning can be used to produce appropriately sized trees with good accuracy when compared with reduced error pruning.
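How the certainty factor influences pruning can be seen from the pessimistic error estimate used to compare a subtree with a leaf. The sketch below uses the normal-approximation formula commonly given in textbook descriptions of C4.5; it is an approximation under that assumption, not C4.5's exact code, and the function name `pessimistic_error` is hypothetical. The default of 0.25 corresponds to C4.5's default certainty factor.

```python
from math import sqrt
from statistics import NormalDist


def pessimistic_error(errors, n, certainty_factor=0.25):
    """Upper confidence bound on the error rate of a node covering n training cases."""
    f = errors / n                                    # observed error rate
    z = NormalDist().inv_cdf(1.0 - certainty_factor)  # smaller CF gives larger z
    numerator = f + z * z / (2 * n) + z * sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return numerator / (1 + z * z / n)


# A node with 2 errors out of 6 cases, at two certainty factors.
print(round(pessimistic_error(2, 6, 0.25), 3))
print(round(pessimistic_error(2, 6, 0.05), 3))
```

A lower certainty factor inflates the estimated error, and the inflation is larger for nodes covering few cases, so small subtrees are more readily replaced by a single leaf and the pruned tree becomes smaller.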
Extreme learning machine (ELM) is an efficient training algorithm for single-hidden layer feed-forward neural networks (SLFNs). Two pruned ELMs, named P-ELM1 and P-ELM2, were proposed by Rong et al. P-ELM1 and P-ELM2 employ χ² and information gain, respectively, to measure the association between the class labels and individual hidden nodes. For continuous-valued data sets, however, P-ELM1 and P-ELM2 must estimate the probability distributions of the data sets with discretization methods in order to calculate χ² and information gain, and the discretization leads to information loss. Furthermore, the discretization results in high computational complexity. To deal with these problems, this paper proposes an improved pruned-ELM algorithm based on tolerance rough sets, which can overcome the drawbacks mentioned above. Experimental results along with statistical analysis on 8 UCI data sets show that the improved algorithm outperforms the pruned-ELM in computational complexity and testing accuracy.
The amount of voltage fault data that can be collected is limited by signal acquisition instruments and simulation software. Generative adversarial networks (GANs) have been successfully applied to data generation tasks. However, there is no theoretical basis for the selection of the network structure and parameters of the generators and discriminators in these GANs. It is difficult to achieve the optimal selection by relying on experience or repeated attempts, resulting in high cost and time-consuming deployment of GAN computing in practical applications. The existing methods of neural network optimization are mainly used to compress and accelerate deep neural networks in classification tasks. Due to different goals and training processes, they cannot be directly applied to the data generation task of a GAN. In the three-generation scenario, the hidden layer filter nodes of the initial GAN generator and discriminator are first grown, then the GAN parameters after the structure adjustment are optimized by particle swarm optimization (PSO), and then the node sensitivity is analyzed. The nodes with a small contribution to the output are pruned, and then the GAN parameters after the structure adjustment are optimized using the PSO algorithm to obtain the GAN with optimal structure and parameters (GP-PSO-GAN). The results show that GP-PSO-GAN has good performance. For example, the simulation results of generating unidirectional fault data show that the generated error of GP-PSO-GAN is reduced by 70.4% and 15.2% compared with parameter optimization based only on PSO (PSO-GAN) and pruning-PSO-GAN (P-PSO-GAN), respectively. The convergence curve shows that GP-PSO-GAN has good convergence.
The art of mimicking a human’s responses and behavior in a programming machine is called artificial intelligence (AI). AI has been incorporated in games in such a way as to make them interesting, especially in chess games. This paper proposes a hybrid optimization tuned neural network (NN) to establish a winning strategy in the chess game by generating the possible next moves in the game. Initially, the images from a Portable Game Notation (PGN) file are used to train the NN classifier. The proposed Locust Mayfly algorithm is utilized to optimally tune the weights of the NN classifier. The proposed Locust Mayfly algorithm inherits the characteristic features of hybrid survival and social interacting search agents. The NN classifier finds all the possible moves on the board, among which the best move is obtained using the mini-max algorithm. Finally, the performance of the proposed Locust Mayfly-based NN method is evaluated with the help of performance metrics such as specificity, accuracy, and sensitivity. The proposed Locust Mayfly-based NN method attained a specificity of 98%, accuracy of 98%, and a sensitivity of 98%, which demonstrates the productiveness of the proposed Locust Mayfly-based NN method in pruning.
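The mini-max step mentioned above, choosing the best move among the candidates, can be illustrated generically. This is a textbook sketch over a toy game tree, assuming leaves hold position scores from the maximizing player's point of view; it is not the paper's chess implementation, and the nested-list tree encoding is hypothetical.

```python
def minimax(node, maximizing):
    """Return the minimax value of a game tree given as nested lists of leaf scores."""
    if not isinstance(node, list):   # leaf: a static evaluation score
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Two candidate moves, each answered by one of two opponent replies.
tree = [[3, 5], [2, 9]]
print(minimax(tree, maximizing=True))
```

The maximizing player picks the first move here because the opponent's best reply to it leaves a better worst-case score than the best reply to the second move.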
A recently developed quantitative model of cortical activity is used that permits data comparison with experiment using a quantitative and standardized means. The model incorporates properties of neurophysiology including axonal transmission delays, synapto-dendritic rates, range-dependent connectivities, excitatory and inhibitory neural populations, and intrathalamic, intracortical, corticocortical and corticothalamic pathways. This study tests the ability of the model to determine unique physiological properties in a number of different data sets varying in mean age and pathology. The model is used to fit individual electroencephalographic (EEG) spectra from post-traumatic stress disorder (PTSD), schizophrenia, first episode schizophrenia (FESz), attention deficit hyperactivity disorder (ADHD), and their age/sex matched controls. The results demonstrate that the model is able to distinguish each group in terms of a unique cluster of abnormal parameter deviations. The abnormal physiology inferred from these parameters is also consistent with known theoretical and experimental findings from each disorder. The model is also found to be sensitive to the effects of medication in the schizophrenia and FESz group, further supporting the validity of the model.
Many traditional pruning methods assume that all datasets are equally probable and equally important, so they apply equal pruning to all datasets. However, in real-world classification problems, not all datasets are equal, and applying an equal pruning rate tends to generate a decision tree with a large size and a high misclassification rate.
In this paper, we present a practical algorithm to deal with the data specific classification problem when there are datasets with different properties. Another key motivation of the data specific pruning in the paper is "trading accuracy and size". A new algorithm called Expert Knowledge Based Pruning (EKBP) is proposed to solve this dilemma. We propose to integrate error rate, missing values and expert judgment as factors for determining data specific pruning for each dataset. We show by analysis and experiments that, using this pruning, we can scale both accuracy and generalisation for the tree that is generated. Moreover, the method can be very effective for high dimensional datasets. We conduct an extensive experimental study on 40 openly available real-world datasets from the UCI repository. In all these experiments, the proposed approach shows a considerable reduction in tree size with equal or better accuracy compared to several benchmark decision tree methods proposed in the literature.
Associative classification (AC) is a promising data mining approach that integrates classification and association rule discovery to build classification models (classifiers). In the last decade, several AC algorithms have been proposed, such as Classification based Association (CBA), Classification based on Predicted Association Rule (CPAR), Multi-class Classification using Association Rule (MCAR), Live and Let Live (L3) and others. These algorithms use different procedures for rule learning, rule sorting, rule pruning, classifier building and class allocation for test cases. This paper sheds light on and critically compares common AC algorithms with reference to the abovementioned procedures. Moreover, data representation formats in AC mining are discussed along with potential new research directions.
Diversity is a key component for building a successful ensemble classifier. One approach to diversifying the base classifiers in an ensemble classifier is to diversify the data they are trained on. While sampling approaches such as bagging have been used for this task in the past, we argue that since they maintain the global distribution, they do not create diversity. Instead, we make a principled argument for the use of k-means clustering to create diversity. Expanding on previous work, we observe that when creating multiple clusterings with multiple k values, there is a risk of different clusterings discovering the same clusters, which would in turn train the same base classifiers. This would bias the ensemble voting process. We propose a new approach that uses the Jaccard Index to detect and remove similar clusters before training the base classifiers, not only saving computation time, but also reducing classification error by removing repeated votes. We empirically demonstrate the effectiveness of the proposed approach compared to the state of the art on 19 UCI benchmark datasets.
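The cluster de-duplication step described above can be sketched with the Jaccard Index, |A ∩ B| / |A ∪ B|, computed on the sets of sample indices in each cluster. This is a minimal sketch: the abstract does not state the similarity threshold or the order in which clusters are compared, so `threshold=0.8` and the greedy keep-first strategy are assumptions, as are the function names.

```python
def jaccard(a, b):
    """Jaccard Index of two sets; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_similar_clusters(clusters, threshold=0.8):
    """Keep a cluster only if it is not too similar to any cluster already kept."""
    kept = []
    for cluster in clusters:
        if all(jaccard(cluster, k) < threshold for k in kept):
            kept.append(cluster)
    return kept


# Hypothetical clusters (sets of sample indices) found with different k values.
clusters = [{1, 2, 3, 4}, {1, 2, 3, 4, 5}, {6, 7}]
print(drop_similar_clusters(clusters))
```

The second cluster overlaps the first with a Jaccard Index of 0.8 and is dropped, so no base classifier is trained on near-duplicate data and no vote is counted twice.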