The liver is one of the vital organs of the human body, and detecting its location is of great significance for computer-aided diagnosis. Two problems arise when existing algorithms based on convolutional neural networks are applied directly to liver detection. One is that the pooling operations characteristic of these algorithms cause local information loss; the other is that predefined anchor boxes computed purely from box areas do not align well with the overall data distribution. As a solution, this paper proposes a liver detection algorithm based on local information fusion. First, the area calculation is complemented with the target aspect ratio as a constraint term to generate predefined anchor boxes more in line with the actual data distribution. Second, a local feature fusion (LFF) structure is proposed to compensate for the local information loss caused by pooling. Finally, LFF is used to optimize the neural network employed in YOLOv3 for liver detection. The experimental results show that the optimized algorithm achieves an average intersection over union (IoU) in liver detection three percentage points higher than the YOLOv3 algorithm and portrays local details more accurately. In object detection on the public dataset, Average Precision for medium objects (APm) and Average Precision for large objects (APl) are 2.8% and 1.7% higher, respectively, than their counterparts from the YOLOv3 algorithm.
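The reported gains are measured with the intersection-over-union metric. For reference, a minimal sketch of IoU and the average-IoU figure used in such comparisons; this is the standard metric definition only, not the paper's LFF structure or its anchor-generation formula, which the abstract does not spell out.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def average_iou(gt_boxes, pred_boxes):
    """Average, over ground-truth boxes, of the best IoU with any prediction."""
    return float(np.mean([max(iou(g, p) for p in pred_boxes) for g in gt_boxes]))
```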
Region-of-interest cueing by hyperspectral imaging systems for tactical reconnaissance has emphasized wide area coverage, low false alarm rates, and the search for manmade objects. Because they often appear embedded in complex environments and can exhibit large intrinsic spectral variability, these targets usually cannot be characterized by consistent signatures that might facilitate the detection process. Template matching techniques that focus on distinctive and persistent absorption features, such as those characterizing gases or liquids, prove ineffectual for most hard-body targets. High-performance autonomous detection requires instead the integration of limited and uncertain signature knowledge with a statistical approach. Effective techniques devised in this way using Gaussian models have transitioned to fielded systems. These first-generation algorithms are described here, along with heuristic modifications that have proven beneficial. Higher-performance Gaussian-based algorithms are also described, but sensitivity to parameter selection can prove problematical. Finally, a next-generation parameter-free non-Gaussian method is outlined whose performance compares favorably with the best Gaussian methods.
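The first-generation Gaussian-based algorithms referenced here belong to the spectral-matched-filter family. A minimal sketch under a single global multivariate-Gaussian background assumption follows; the fielded variants, their locally adaptive statistics, and the heuristic modifications mentioned above are not reproduced.

```python
import numpy as np

def spectral_matched_filter(cube, target_sig):
    """Gaussian-background matched-filter statistic for each pixel.

    cube: (rows, cols, bands) hyperspectral image; target_sig: (bands,) signature.
    Assumes the background follows a multivariate Gaussian with one global
    mean and covariance (a simplification of locally adaptive fielded variants).
    """
    h, w, b = cube.shape
    x = cube.reshape(-1, b)
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(b)   # regularized covariance
    cov_inv = np.linalg.inv(cov)
    s = target_sig - mu
    num = (x - mu) @ cov_inv @ s
    den = s @ cov_inv @ s
    return (num / den).reshape(h, w)
```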
This paper studies small target detection in infrared images with heavy clutter backgrounds. In most infrared images, ship targets appear dim against the relatively dark sea surface background, and scan-line disturbance and noise further complicate detection. Dim objects must be distinguished from a dark background, and the small targets must also be distinguished from clutter. Through analysis of the targets and background, we build characteristic models of small ship targets, noise, and sea background, and identify their differences in the spatial and frequency domains. Based on principles of signal processing, pattern recognition, and artificial intelligence, we propose a combined algorithm for detecting small sea-surface targets. In this algorithm, the background and noise components are first suppressed by a purpose-designed multilevel filter while the target components of interest are enhanced. The pixels of candidate targets are then discriminated by a minimum-risk Bayes test. Finally, using a priori knowledge about the targets, such as the range of their sizes, the targets of interest are detected. The probability distributions used by the statistical decision are obtained by offline learning from typical training samples. Experiments show that the algorithm performs well on this kind of target detection and is robust to noise.
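The candidate-pixel discrimination step is a minimum-risk Bayes test. A minimal sketch of the standard two-class rule follows; the class-conditional densities stand in for the distributions the paper learns offline, and the cost values are illustrative assumptions.

```python
def min_risk_bayes_decision(x, p_x_target, p_x_clutter, prior_target,
                            cost_miss=10.0, cost_false_alarm=1.0):
    """Two-class minimum-risk Bayes test: declare 'target' when the likelihood
    ratio exceeds the cost- and prior-dependent threshold.  p_x_target and
    p_x_clutter are class-conditional densities evaluated at pixel feature x
    (in the paper these are learned offline from training samples)."""
    prior_clutter = 1.0 - prior_target
    threshold = (cost_false_alarm * prior_clutter) / (cost_miss * prior_target)
    likelihood_ratio = p_x_target(x) / p_x_clutter(x)
    return likelihood_ratio > threshold
```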
Target detection is an important research topic in hyperspectral image processing and has long been a focus of research. Compared with many other target detection methods, subspace target detection methods are superior. This paper proposes a hyperspectral subspace target detection method, LBSE-AMUSE, based on the Local Background Subspace Estimation (LBSE) method and the AMUSE algorithm (Algorithm for Multiple Unknown Signals Extraction). HyMap airborne hyperspectral remote sensing data are used as the data source, and the performance of the LBSE and LBSE-AMUSE methods is compared using quantitative indicators such as false alarm rate, detection rate, and ROC curves. The experimental results show that, compared with the LBSE method, the LBSE-AMUSE method has a simple structure, a short running time, and a low false alarm rate, and can to some extent suppress the background while highlighting the target.
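The abstract does not give the detector equations. A common local-background-subspace formulation projects each pixel onto the orthogonal complement of the estimated background subspace, as in the generic sketch below; this is not necessarily the exact LBSE or LBSE-AMUSE formulation.

```python
import numpy as np

def subspace_residual_statistic(x, background_pixels, k=5):
    """Project pixel x (bands,) onto the orthogonal complement of a background
    subspace spanned by the top-k principal directions of locally selected
    background pixels (n_pixels, bands); large residual energy suggests a
    target.  A generic subspace detector, not the paper's exact method."""
    mean_bg = background_pixels.mean(axis=0)
    B = background_pixels - mean_bg
    _, _, vt = np.linalg.svd(B, full_matrices=False)
    U = vt[:k].T                                   # (bands, k), orthonormal columns
    proj_perp = np.eye(U.shape[0]) - U @ U.T       # orthogonal-complement projector
    r = proj_perp @ (x - mean_bg)
    return float(r @ r)
```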
With the popularization of video detection and recognition systems and the advancement of video image processing technology, applied research on intelligent transportation systems based on computer vision has received increasing attention. Such systems combine image processing, pattern recognition, artificial intelligence, and other technologies. They process and analyze the video image sequences collected by the detection system, intelligently understand the video content, and handle tasks such as accident information judgment, pedestrian and vehicle classification, traffic flow parameter detection, and moving target tracking. This makes intelligent transportation systems more intelligent and practical and provides comprehensive, real-time traffic status information for traffic management and control. Research on traffic information detection methods based on computer vision therefore has important theoretical and practical significance. The detection and recognition of video targets is an important research direction in intelligent transportation and computer vision. However, owing to background complexity, illumination changes, target occlusion, and other factors in the detection and recognition environment, applications still face many difficulties, and the robustness and accuracy of detection and recognition need further improvement. This paper studies several key problems in video object detection and recognition, including accurate segmentation of target and background, shadow handling in complex scenes, accurate classification of extracted foreground targets, and target recognition against complex backgrounds, and proposes a corresponding solution for each.
Watermelon is a crop susceptible to diseases, and rapid, effective detection of watermelon diseases is of great significance for ensuring yield. To address the interference from the environment and obstacles in natural settings, which leads to low target detection accuracy and poor robustness, this paper takes watermelon leaves as the research object, using anthracnose, leaf blight, leaf spot, and normal leaves as examples, and proposes a disease recognition method based on deep learning. The pre-selected (prior) box setting formula of the SSD model is improved and tested on multiple SSD variants. Experiments show that the final SSD768 model achieves an average accuracy of 92.4% and an average IoU of 88.9%, indicating that the method can be used to detect watermelon diseases in the natural environment.
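The abstract states that the SSD prior-box setting formula was improved but does not give the new formula. For context, a sketch of the standard SSD default-box scales and aspect-ratio sizing that such modifications typically start from; all parameter values shown are the usual SSD defaults, not the paper's.

```python
import math

def ssd_default_box_sizes(num_feature_maps=6, s_min=0.2, s_max=0.9,
                          aspect_ratios=(1.0, 2.0, 0.5)):
    """Standard SSD prior-box setting: the scale s_k grows linearly with the
    feature-map index k, s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1),
    and each aspect ratio a yields a box of size (s_k*sqrt(a), s_k/sqrt(a))
    relative to the input image."""
    boxes = []
    m = num_feature_maps
    for k in range(1, m + 1):
        s_k = s_min + (s_max - s_min) * (k - 1) / (m - 1)
        for a in aspect_ratios:
            boxes.append((k, s_k * math.sqrt(a), s_k / math.sqrt(a)))
    return boxes
```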
As technological advancements progress and energy conservation and emission reduction policies gain traction, an increasing amount of clean energy is being integrated into the power grid system. This influx of new energy imposes stringent demands on the transmission lines within the power grid system. In recent years, the State Grid has implemented a plethora of intelligent transmission line inspection strategies, with the intelligent inspection of Unmanned Aerial Vehicle (UAV) transmission lines receiving significant promotion and widespread application. However, practical application has revealed that the prevalent transmission line detection algorithms yield a substantial quantity of false detections, particularly in the detection of nut defects in small-sized metallic fittings, voltage balancing ring defects, and defects in uninsulated conductors. To address these issues, this paper employs deep learning algorithms for target detection, critical point detection, and instance segmentation, focusing on aspects such as algorithmic logic, algorithmic models, and data processing. The aim is to enhance the precision of these three types of defect detection, diminish the rate of false detections, and augment the practicality of intelligent grid inspection.
Social distance monitoring is of great significance for public health in the era of the COVID-19 pandemic. However, existing monitoring methods cannot detect social distance effectively in terms of efficiency, accuracy, and robustness. In this paper, we propose a social distance monitoring method based on an improved YOLOv4 algorithm. Specifically, our method constructs and pre-processes a dataset, screens the valid samples, and improves the K-means clustering algorithm using an IoU-based distance. It then detects pedestrians with the trained improved YOLOv4 algorithm and obtains the locations of the pedestrian detection boxes. Finally, it defines observation depth parameters, generates a 3D feature space, and clusters the offending (too-close) groups based on the L2 distance, realizing pedestrian social distance monitoring for 2D video. Experiments show that the proposed method can accurately detect pedestrian locations in video images, where the pre-processing operation and the improved K-means algorithm improve pedestrian detection accuracy, and that it can cluster offending groups without calibration or mapping transformation.
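The improved K-means step clusters boxes with an IoU-based distance. A minimal sketch of the common formulation, d = 1 - IoU over widths and heights as popularized for YOLO anchor selection, follows; the paper's specific screening and improvement are not detailed in the abstract and are left out.

```python
import numpy as np

def iou_wh(wh, centroids):
    """IoU between one (w, h) box and each centroid, with boxes corner-aligned."""
    inter = np.minimum(wh[0], centroids[:, 0]) * np.minimum(wh[1], centroids[:, 1])
    union = wh[0] * wh[1] + centroids[:, 0] * centroids[:, 1] - inter
    return inter / union

def kmeans_iou(boxes_wh, k=9, iters=100, seed=0):
    """K-means over an (N, 2) array of box widths/heights using d = 1 - IoU
    instead of Euclidean distance; returns the k anchor centroids."""
    rng = np.random.default_rng(seed)
    centroids = boxes_wh[rng.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iters):
        assign = np.array([np.argmax(iou_wh(b, centroids)) for b in boxes_wh])
        new = np.array([boxes_wh[assign == j].mean(axis=0) if np.any(assign == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids
```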
In this research, a 3D visual recognition system was developed based on the Wei deep learning algorithm running on a GPU. The proposed system consists of a GPU with a depth image function library that performs image data acquisition, depth information computation, coordinate conversion, image contour search, and convolutional neural network model training, and achieves object picking via TCP/IP communication with the motion control system. The experimental results show that the recognition rate of the developed algorithm for target objects at different positions was as high as 92%. Recognition rates for different viewing angles were relatively lower but still reached 87%, and accuracy rates under different luminance values reached 89%. The errors of the robot hand in clamping targets fell within 1–4 mm, exceeding experimental expectations.
To address the problem of missed diagnoses in rib fracture detection from CT scans, this study introduces an enhanced model, Faster-RCNN-SE-FA, built upon the traditional Faster-RCNN architecture. The proposed model integrates a novel filter-anchor method and takes into account the specific imaging characteristics of ribs in CT images. After image preprocessing, a Squeeze-and-Excitation (SE) module is applied to enhance feature discrimination in the channel dimension while preserving the location sensitivity important for target detection tasks, leading to a significant improvement in model performance. Experiments conducted on CT sequences of 130 rib fracture cases provided by the First Affiliated Hospital of Ningbo University demonstrate that the Faster-RCNN-SE-FA model achieves better sensitivity (Sen) and accuracy than traditional methods, including the baseline Faster-RCNN.
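The SE module referred to here is the standard Squeeze-and-Excitation channel-attention block. A minimal PyTorch sketch follows; its exact placement inside Faster-RCNN-SE-FA is an assumption beyond what the abstract states.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard Squeeze-and-Excitation channel attention: global average pool
    ('squeeze'), a two-layer bottleneck MLP ('excitation'), then channel-wise
    rescaling of the input feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # squeeze: (b, c)
        w = self.fc(w).view(b, c, 1, 1)  # excitation: per-channel weights
        return x * w                     # reweight channels, keep spatial layout
```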
In this paper, a new method is presented for the joint estimation of delay and Doppler in bistatic radars using a model-based signal processing framework. In the proposed method, the time history of each cell in the ambiguity function is modeled as a stochastic process consisting of a stationary component caused by clutter and noise and a possible transient component caused by a target. Some cells are then flagged as target candidates based on estimates of the higher-order statistics of these stochastic processes. Finally, a spatial processing scheme is applied to prune false candidates. To evaluate the proposed method, its performance is simulated under two scenarios inspired by real conditions: slow and fast moving targets. Comparison with alternative methods shows better performance: the proposed method detects fast targets at least 8% better, and slow targets at least 4.3% better, than the best of the examined methods. Both improvements are obtained while the false alarm rate of the proposed method shows no meaningful difference from the other examined algorithms.
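The abstract says candidate cells are flagged from higher-order statistics of each cell's time history. Kurtosis is one representative choice, assumed here since the exact statistic is not stated; a minimal sketch follows.

```python
from scipy.stats import kurtosis

def flag_candidate_cells(af_history, threshold=1.0):
    """af_history: (num_frames, num_delay, num_doppler) time history of the
    ambiguity-function magnitude in each delay-Doppler cell.  Cells whose
    excess kurtosis over time exceeds the threshold are flagged as possible
    target-induced transients on top of the stationary clutter-plus-noise."""
    k = kurtosis(af_history, axis=0, fisher=True)  # excess kurtosis per cell
    return k > threshold
```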
Intelligent agriculture has become the future development trend of agriculture, with a wide range of research and application scenarios. Using machine learning to complete basic tasks for people has become a reality, and this capability is also exploited in machine vision. To save time in the fruit picking process and reduce labor costs, robots are used for automatic picking in the orchard environment, and cherry detection algorithms based on deep learning have been proposed to identify and pick cherries. However, most existing methods are aimed at relatively sparse fruits and cannot solve the detection problem for small and dense fruits. In this paper, we propose a cherry detection model based on YOLOv5s. First, shallow feature information is enhanced by convolving the two-fold downsampled feature maps in the Backbone of the original network and feeding them to the inputs of the second and third CSP modules. In addition, the depth of the CSP modules is adjusted and an RFB module is added in the feature extraction stage to enhance feature extraction capability. Finally, Soft Non-Maximum Suppression (Soft-NMS) is used to minimize the loss of targets caused by occlusion. We test the performance of the model, and the results show that the improved YOLOv5s-cherry model has the best detection performance for small and dense cherries, which is conducive to intelligent picking.
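The final Soft-NMS step decays, rather than deletes, the scores of boxes that overlap a selected detection, which helps retain occluded fruits. A minimal sketch of the common Gaussian variant follows; whether the paper uses the linear or Gaussian decay is not stated in the abstract.

```python
import numpy as np

def _iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: instead of deleting boxes that overlap the current
    top detection, decay their scores by exp(-IoU^2 / sigma).
    Returns indices of kept boxes in selection order."""
    scores = scores.astype(float).copy()
    idxs = np.arange(len(scores))
    keep = []
    while len(idxs) > 0:
        top = idxs[np.argmax(scores[idxs])]
        keep.append(int(top))
        idxs = idxs[idxs != top]
        if len(idxs) == 0:
            break
        ious = np.array([_iou(boxes[top], boxes[i]) for i in idxs])
        scores[idxs] *= np.exp(-(ious ** 2) / sigma)  # decay overlapping scores
        idxs = idxs[scores[idxs] > score_thresh]      # drop negligible boxes
    return keep
```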
Automatic fault diagnosis for power system equipment has always been an essential concern in the industry. Conventionally, such work is carried out by manual patrol inspection, which consumes considerable human labor and expert knowledge. Infrared images, however, can reveal diagnosis areas inside the equipment through thermal sensing. In this context, this work uses deep neural networks to construct a dedicated infrared image processing framework for automatic fault diagnosis. First, a pulse-coupled neural network structure is employed to enhance feature representation for infrared images of the equipment. Next, a fuzzy C-means (FCM)-based segmentation method is developed to extract diagnosis areas from the infrared images. Finally, a convolution-based fault diagnosis operator is adopted to identify the fault types. Simulation experiments are then conducted on real-world infrared images of power system equipment to evaluate the performance of the proposed approach. The proposal realizes an end-to-end process of feature extraction, fault detection, and identification, avoiding the problems of relying on a single, manually extracted fault feature and the resulting inability to detect and identify faults effectively in certain situations and scenarios.
It is of great importance to accurately locate vessels and oil platforms, because these numerous marine man-made targets can cause oil spills or illegal intrusion problems. The many different types of marine targets in the complex marine environment make accurate target detection and classification very difficult. This paper proposes a robust Laplacian-of-Gaussian operator, connected-domain controller, squeeze-excitation residual network (LCSE-ResNet) for marine man-made target classification based on optical remote sensing imagery. Vessel and oil platform candidate regions in remote sensing images are extracted by the Laplacian of Gaussian (LoG) operator and the connected-domain controller and then fed to the classification neural network. The method forms an end-to-end structure from the original remote sensing images to precise target information. Clouds, ripples, and other man-made disturbances can be excluded effectively, which indicates better robustness in practical applications. Shape and size features of vessels and oil platforms are considered in the LCSE-ResNet structure, which improves the interpretability of the final results. Experimental results demonstrate that the proposed LCSE-ResNet can effectively detect marine man-made targets and distinguish vessels from oil platforms.
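A minimal sketch of the LoG-plus-connected-domain candidate extraction stage using standard SciPy operators; the smoothing scale, threshold, and minimum area are illustrative assumptions, not the paper's settings.

```python
from scipy import ndimage

def log_candidate_regions(gray_image, sigma=2.0, thresh=3.0, min_area=20):
    """Laplacian-of-Gaussian blob response followed by thresholding and
    connected-component labelling; returns bounding slices of candidate
    vessel / oil-platform regions to be passed to the classification network."""
    response = -ndimage.gaussian_laplace(gray_image.astype(float), sigma=sigma)
    mask = response > thresh * response.std()      # bright-blob response
    labels, _ = ndimage.label(mask)                # connected-domain step
    regions = ndimage.find_objects(labels)
    candidates = []
    for i, r in enumerate(regions, start=1):
        if r is not None and (labels[r] == i).sum() >= min_area:
            candidates.append(r)
    return candidates
```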
The purpose of this study is to obtain more useful data for urban planning and better building recognition algorithms for high-point monitoring images of the city. A new algorithm based on Faster R-CNN (Faster Region-based Convolutional Neural Network) is proposed; the optimization methods involved are systematically described after a brief description of the Faster R-CNN algorithm to be optimized. Subsequently, experiments are carried out on the MS COCO (Microsoft Common Objects in Context) dataset, and a series of experiments demonstrate the optimization scheme. Different feature extraction networks are used, several practical optimization methods are added, the training method is modified, and both speed and accuracy are attended to, so the expected target detection goal is achieved. For high-point monitoring images, the speed and accuracy of building recognition are greatly improved, giving urban managers more reliable information for urban planning, which has a positive impact on urban development.
Sea clutter refers to the radar returns from a patch of ocean surface. Accurate modeling of sea clutter and robust detection of low observable targets within sea clutter are important problems in remote sensing and radar signal processing applications. Due to lack of fundamental understanding of the nature of sea clutter, however, no simple and effective methods for detecting targets within sea clutter have been proposed. To help solve this important problem, we apply three types of fractal scaling analyses, fluctuation analysis (FA), detrended fluctuation analysis (DFA), and the wavelet-based fractal scaling analysis to study sea clutter. Our analyses show that sea clutter data exhibit fractal behaviors in the time scale range of about 0.01 seconds to a few seconds. The physical significance of these time scales is discussed. We emphasize that time scales characterizing fractal scaling break are among the most important features for detecting patterns using fractal theory. By systematically studying 392 sea clutter time series measured under various sea and weather conditions, we find very effective methods for detecting targets within sea clutter. Based on the data available to us, the accuracy of these methods is close to 100%.
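Of the three fractal analyses, detrended fluctuation analysis (DFA) is the most widely used. A minimal sketch follows that computes the fluctuation function F(n); the log-log slope of F(n) against n gives the scaling exponent, and a change of slope marks the scaling break whose characteristic time scales are used here as detection features.

```python
import numpy as np

def dfa(x, window_sizes):
    """Detrended fluctuation analysis.  Returns F(n) for each window size n."""
    y = np.cumsum(x - np.mean(x))                 # integrated (profile) series
    F = []
    for n in window_sizes:
        n_windows = len(y) // n
        t = np.arange(n)
        sq_res = []
        for i in range(n_windows):
            seg = y[i * n:(i + 1) * n]
            coef = np.polyfit(t, seg, 1)          # linear local trend
            sq_res.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        F.append(np.sqrt(np.mean(sq_res)))
    return np.array(F)

# Example: window sizes spanning roughly 0.01 s to a few seconds at a
# hypothetical 1 kHz sampling rate (actual sampling rates vary by radar).
# F = dfa(clutter_series, window_sizes=[10, 30, 100, 300, 1000, 3000])
```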
Fast and robust vehicle recognition from remote sensing images (RSIs) has valuable applications in economic analysis, emergency management, and traffic surveillance, and vehicle density and location data are vital for intelligent transportation systems. However, correct and robust vehicle recognition in RSIs remains difficult. Conventional approaches depend on handcrafted features extracted in sliding windows at distinct scales. Recently, convolutional neural networks have been applied to aerial image object recognition and have achieved promising results. This research presents an automatic vehicle detection and classification technique using an imperialist competitive algorithm with a deep convolutional neural network (VDC-ICADCNN). The primary purpose of the VDC-ICADCNN technique is to process the RSI and apply deep learning (DL) models for the recognition and identification of vehicles. Three main procedures are involved. In the first stage, the VDC-ICADCNN technique employs EfficientNetB7 as the feature extractor. Then, hyperparameter fine-tuning of the EfficientNet model is performed with ICA, which helps attain improved classification performance. Finally, the VDC-ICADCNN technique uses a variational autoencoder (VAE) model for vehicle recognition. Extensive experiments were conducted to establish the superiority of the VDC-ICADCNN technique; the results show accuracies of 96.77% and 98.59% on the Vehicle Detection in Aerial Imagery (VEDAI) and Potsdam datasets, respectively, surpassing other DL approaches.
This research proposes a methodology for determining the location of an object with image processing. The objectives are to capture the target area and determine the location of the object from the image. To locate the dropped object on the image plane efficiently, consecutive images are analyzed and a threshold operation is applied, because the accuracy of locating the dropped object from the difference of consecutive images is usually affected by noise. Moreover, a transformation unit is adopted to map XY coordinates on the image plane into world coordinates so that the dropped object's position is accurate. Once the actual XY coordinate of the dropped object is obtained, the distance from the target point (center) and the clock direction of the dropped object relative to the center can also be found. In addition, a single digital video camera mounted on a tower is panned to capture images of the target area and detect the object dropped from the air to the ground, which makes the proposed methodology easily portable for detecting dropped objects in any area.
In recent studies, YOLOv3, a deep learning-based target detection algorithm, has become extensively used in object recognition, especially for guiding the visually impaired. Current YOLOv3-based assistive technology can achieve high-precision, real-time object recognition. Even though the algorithm has several flaws, including the inability to estimate distances and difficulty recognizing objects accurately in fog or haze, it can perform well in waste management. Therefore, this study proposes an Intelligent Garbage Monitoring Scheme based on an improved YOLOv3 Target Detection Algorithm (IGMS-iYTDA) to classify garbage in an Internet of Things (IoT)-enabled trash can. The performance of the proposed scheme has been evaluated using various classification metrics, and the results show the highest classification accuracy of 99.9% compared with existing models.
In this paper, we consider the problem of detecting the echo from a point target in the presence of reverberation. Because of its strong correlation with the transmitted signal and its non-Rayleigh envelope arising from the sparse distribution of scatterers, reverberation degrades the performance of the conventional matched filter (MF) receiver. We propose a nonlinear receiver that uses a bistable system together with a first-order discrete-time autoregressive model, AR(1). The signals are pre-processed by the bistable system before the MF receiver. Numerical simulations are carried out to compare the performance of the proposed nonlinear receiver with the conventional MF receiver. The reverberation is generated using the point-scatterer simulation method, and the results show that in a reverberation-limited environment the proposed nonlinear receiver may perform better and is much more robust to the number of scatterers than the conventional MF receiver.
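The pre-processing stage passes the received signal through a bistable system. A typical realization integrates the overdamped dynamics dx/dt = a*x - b*x**3 + s(t) with a forward Euler step, as sketched below; the parameters are illustrative, and the paper's AR(1) reverberation modeling and parameter tuning are not reproduced.

```python
import numpy as np

def bistable_filter(s, a=1.0, b=1.0, dt=0.01, x0=0.0):
    """Pass signal s through the overdamped bistable system
    dx/dt = a*x - b*x**3 + s(t), integrated with a forward Euler step.
    The output is then fed to the conventional matched-filter receiver."""
    x = np.empty(len(s), dtype=float)
    xk = x0
    for k, sk in enumerate(s):
        xk = xk + dt * (a * xk - b * xk ** 3 + sk)
        x[k] = xk
    return x
```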