In the evaluation of cervical spine disorders, precise localization of anatomo-physiological landmarks is fundamental for calculating diverse measurement metrics. Although deep learning has achieved impressive results in keypoint localization, it still faces many limitations on medical images. First, these methods struggle with the inherent variability of cervical spine datasets, which arises from imaging factors. Second, the keypoints to be predicted occupy only 4% of the X-ray image area, which poses a significant challenge. To tackle these issues, we propose a deep neural network architecture, NF-DEKR, specifically tailored for predicting keypoints in cervical spine physiological anatomy. By leveraging the neural memory ordinary differential equation, with its distinctive separation of memory and learning and its convergence to a single global attractor, our design effectively mitigates inherent data variability. We also introduce a Multi-Resolution Focus module that preprocesses feature maps before they enter the disentangled regression branch and the heatmap branch. By applying a differentiated strategy to feature maps of varying scales, this approach yields more accurate predictions of densely localized keypoints. We construct a medical dataset, SCUSpineXray, comprising X-ray images annotated by orthopedic specialists, and conduct similar experiments on the publicly available UWSpineCT dataset. Experimental results demonstrate that, compared to the baseline DEKR network, our proposed method improves average precision by 2% to 3%, with only a marginal increase in model parameters and floating-point operations (FLOPs). The code is available at https://github.com/Zhxyi/NF-DEKR.
A new multi-resolution scheme based on the interpolating scaling function (ISF) on adaptive gridding (AG) shows promise for first-principles calculations. It is simpler than the wavelet scheme and fully implements the fast wavelet transformation. Although the scheme is similar to the AG scheme in real space, ISF represents fields more effectively and needs fewer grid points than the real-space scheme does. This simple and effective method provides an alternative to both real-space and wavelet methods in first-principles calculations.
Green plant species identification plays an important role in many areas, such as ecological environment protection, Chinese medicine preparation, and agricultural and horticultural applications. This paper proposes a method for green plant recognition based on the wavelet transform and variable local edge patterns (VLEP). First, the original image is decomposed by wavelet transformation. Then texture features are extracted using VLEPs. Block-based and multi-resolution ideas are combined to extract features from the wavelet-transformed images. Finally, the fused texture features are classified by the nearest neighbor method. The experimental results show that, compared with other state-of-the-art methods, the proposed method is promising for recognizing common green plants against natural and complex backgrounds, and that combining block-based and multi-resolution ideas further improves the accuracy rate.
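The pipeline described above (wavelet decomposition, texture features per resolution, nearest-neighbor classification) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Haar wavelet is an assumed choice, the mean absolute value of each detail subband is a simple stand-in for the VLEP descriptor (which the abstract does not define), and the function names `haar2d`, `subband_features`, and `nearest_neighbor` are hypothetical.

```python
from math import dist


def haar2d(img):
    """One level of a 2D Haar transform (averaging form).

    `img` is a list of rows with even height and width.
    Returns the approximation band and three detail bands.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 4)  # local average
            row_lh.append((a - b + d - e) / 4)  # left minus right
            row_hl.append((a + b - d - e) / 4)  # top minus bottom
            row_hh.append((a - b - d + e) / 4)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def subband_features(img, levels=2):
    """Mean absolute detail coefficient per subband and level.

    Assumption: this energy-like statistic replaces the VLEP features.
    """
    feats = []
    current = img
    for _ in range(levels):
        current, lh, hl, hh = haar2d(current)
        for band in (lh, hl, hh):
            count = len(band) * len(band[0])
            feats.append(sum(abs(v) for row in band for v in row) / count)
    return feats


def nearest_neighbor(query, train):
    """Return the label of the training feature vector closest to `query`."""
    return min(train, key=lambda item: dist(item[0], query))[1]


# Toy 4x4 textures: vertical stripes versus horizontal stripes.
vertical = [[0, 1, 0, 1] for _ in range(4)]
horizontal = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]
train = [(subband_features(vertical), "vertical"),
         (subband_features(horizontal), "horizontal")]
noisy = [[0.1, 0.9, 0.0, 1.0] for _ in range(4)]
label = nearest_neighbor(subband_features(noisy), train)
```

The image sides must be divisible by 2 raised to `levels`; a real system would pad or crop the input accordingly.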
In this paper, we propose an end-to-end multi-resolution three-dimensional (3D) capsule network for detecting actions of multiple actors in a video scene. Unlike previous capsule network-based action recognition, which does not specifically address the individual actions of multiple actors in a single scene, our 3D capsule network uses a multi-resolution technique to detect different actions of multiple actors that have different sizes, scales, and aspect ratios. Our 3D capsule network is built on top of a 3D convolutional neural network (3DCNN) that extracts spatio-temporal features from video frames inside regions of interest generated by Faster RCNN object detection. We first apply our method to the problem of detecting cheating activities in a classroom examination scene with multiple subjects involved. Second, we test our system on the publicly available and extensively studied UCF-101 dataset. We compare our method with several state-of-the-art 3DCNN-based methods: the multi-resolution 3DCNN, the single-resolution 3D capsule network, and a combination of both models. We show that models containing 3D capsule networks have a slight advantage over the conventional 3DCNN and the multi-resolution 3DCNN. Our 3D capsule networks not only classify these actions but also generate videos of single actions. Our experimental results show that using multi-resolution pathways in the 3D capsule networks improves the results further. These findings also hold when we use pre-trained C3D (convolutional 3D) features to train the networks. We believe that the multiple resolutions capture lower-level features at different scales, while the 3D capsule layers combine these features in more complex ways than conventional convolutional models.
Face recognition is widely used and is one of the most challenging tasks in computer vision. In recent years, many face recognition methods based on dictionary learning have been proposed. However, most methods focus only on the resolution of the original image, and changes in resolution may affect the recognition results in practical problems. To address these problems, a method of multi-resolution dictionary learning combined with sample reverse representation is proposed and applied to face recognition. First, the dictionaries associated with images at multiple resolutions are learnt to obtain the first representation error. Then different auxiliary samples are generated for each test sample, and a dictionary consisting of the test sample, the auxiliary samples, and training samples from other classes is established to sequentially represent all training samples at this resolution and to obtain the second representation error. Finally, a weighted fusion scheme is used to obtain the final classification result. Experimental results on four widely used face datasets show that the proposed method achieves better performance and is robust to resolution changes.
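The final weighted-fusion step can be sketched as below. This is only an illustration under assumptions: the abstract does not give the fusion formula, so a convex combination of the two per-class errors is assumed, each error set is normalised to sum to one before fusion, and the names `fuse_and_classify` and `weight` are hypothetical.

```python
def fuse_and_classify(first_errors, second_errors, weight=0.5):
    """Fuse two per-class representation errors and pick the best class.

    Both arguments map class label -> error (smaller is better).
    Assumption: fused = weight * first + (1 - weight) * second.
    """
    def normalise(errors):
        total = sum(errors.values())
        if total == 0:
            return dict(errors)
        return {label: value / total for label, value in errors.items()}

    e1, e2 = normalise(first_errors), normalise(second_errors)
    fused = {label: weight * e1[label] + (1 - weight) * e2[label]
             for label in e1}
    # The predicted class is the one with the smallest fused error.
    return min(fused, key=fused.get), fused


first = {"A": 0.2, "B": 0.8}    # errors from the multi-resolution dictionaries
second = {"A": 0.6, "B": 0.4}   # errors from the reverse representation
balanced_label, _ = fuse_and_classify(first, second, weight=0.5)
skewed_label, _ = fuse_and_classify(first, second, weight=0.1)
```

With equal weights the first error dominates the decision for class A; shifting the weight toward the second error flips the result, which is why the weight would normally be tuned on validation data.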
The vocal fold is a significant body structure responsible for phonation, which regulates air motion into and out of the lungs. Disorders of the vocal fold affect quality of life, so their diagnosis is much needed, and CT of the neck is employed as an effective imaging scheme. Accordingly, this paper proposes an advanced multi-resolution algorithm (MRA) that optimally identifies and classifies pathologies. The vocal regions are acquired using the genetic k-means algorithm. The pathology features are generated using the local directional pattern (LDP) and fed to pathology classification using moth search-rider optimization-based deep convolutional neural networks (MRA-based DCNN). The hybrid optimization (MRA) integrates the standard rider optimization algorithm (ROA) and the moth search algorithm (MS) to train the deep learning classifier (DCNN). Analysis on real databases shows that the proposed pathology detection module obtained an accuracy, specificity, and sensitivity of 97.020%, 91.698%, and 96.624%, respectively.
The real-time, viewpoint-based rendering of high-quality, non-uniform scenes has always been one of the most difficult problems in computer graphics. In this paper, we propose an efficient algorithm that solves this problem by merging texture synthesis and discrete wavelet transform (DWT) techniques. From a single normal-sized input image, we can efficiently obtain textures at different resolutions and update them during real-time rendering with the help of the DWT. Our experimental results show that the algorithm can smoothly and efficiently render non-uniform scenes based on viewpoint.
Industrial fume emissions are a major contributor to global warming, and accurate monitoring is necessary. However, current segmentation techniques for video monitoring of industrial fumes are limited by ineffective edge segmentation, lack of consideration of dynamic characteristics, and poor segmentation accuracy. To address these issues, a deep learning-based semantic video segmentation network is proposed in this paper. The network combines fume deformation information, employs LR-ASPP design for real-time performance, and spatio-temporal consistency to enhance semantic information in the dynamic region. A residual hybrid attention network is constructed for the motion region to minimize loss of motion information. The proposed network demonstrates strong anti-interference capability in complex environments, achieving over 10% improvement in IoU and F-score under a high-speed segmentation of 53 FPS. This technology can be integrated with automated systems to enable timely responses to hazardous situations, minimizing risks to workers and nearby communities. In summary, the proposed deep learning-based semantic video segmentation network has significant implications for improving environmental monitoring and management practices in industrial settings.
Multi-resolution representation has wide application demands in the field of Geographic Information System (GIS). The existing fractal interpolation methods can implement the reconstruction of natural linear features from low resolution to high resolution, but they are unable to quantify the relationship between the number of fractal iterations and the resolution, and the fractal interpolation results cannot maintain the geographical characteristics of natural linear features. This study presents a multi-resolution reconstruction method for natural linear features in GIS based on the restricted fractal interpolation. First, the geographical characteristics of natural linear features are extracted to restrict the parameters of fractal interpolation function and control the process of fractal interpolation. Second, a mathematical function is deduced for identifying the relationship between the number of fractal iterations and the resolution. The experiment results demonstrate that this study can reconstruct various preset high-resolution linear features from the low-resolution linear data while the geographical characteristics and random fractal characteristics of natural linear features are kept well.
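The link between iteration count and resolution can be illustrated with a generic midpoint-displacement sketch. This is not the restricted fractal interpolation function or the relationship derived in the study, neither of which is given in the abstract. It only assumes that each iteration inserts one point per segment, so the nominal point spacing halves per iteration; `iterations_for_spacing`, `midpoint_refine`, and `roughness` are hypothetical names.

```python
import math
import random


def iterations_for_spacing(initial_spacing, target_spacing):
    """Iterations needed so the nominal point spacing reaches the target.

    Assumption: every iteration halves the spacing, so
    n = ceil(log2(initial / target)).
    """
    if target_spacing >= initial_spacing:
        return 0
    return math.ceil(math.log2(initial_spacing / target_spacing))


def midpoint_refine(points, iterations, roughness=0.3, seed=0):
    """Refine a polyline by random midpoint displacement.

    Each midpoint is moved perpendicular to its segment by a random
    offset proportional to the segment length. Original vertices are kept.
    """
    rng = random.Random(seed)
    for _ in range(iterations):
        refined = [points[0]]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            mx, my = (x0 + x1) / 2, (y0 + y1) / 2
            length = math.hypot(x1 - x0, y1 - y0)
            if length > 0:
                offset = rng.uniform(-1, 1) * roughness * length
                # Unit normal of the segment times the random offset.
                mx += -(y1 - y0) / length * offset
                my += (x1 - x0) / length * offset
            refined.append((mx, my))
            refined.append((x1, y1))
        points = refined
    return points


coarse = [(0.0, 0.0), (100.0, 20.0), (200.0, 0.0)]
n = iterations_for_spacing(100.0, 12.5)
fine = midpoint_refine(coarse, n)
```

A polyline with N vertices has (N - 1) * 2^n + 1 vertices after n iterations, which gives a simple way to pick n for a preset target resolution.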
Owing to its good approximation characteristics of trigonometric functions and the multi-resolution local characteristics of wavelet, the trigonometric Hermite wavelet function is used as the element interpolation function. The corresponding trigonometric wavelet beam element is formulated based on the principle of minimum potential energy. As the order of wavelet can be enhanced easily and the multi-resolution can be achieved by the multi-scale of wavelet, the hierarchical and multi-resolution trigonometric wavelet beam element methods are proposed for the adaptive analysis. Numerical examples have demonstrated that the aforementioned two methods are effective in improving the computational accuracy. The trigonometric wavelet finite element method (WFEM) proposed herein provides an alternative approach for improving the computational accuracy, which can be tailored for the problem considered.
The probability density evolution method (PDEM) provides a feasible approach for the dynamic response analysis of nonlinear stochastic structures. The key step is to solve a generalized density evolution equation (GDEE) in order to establish the probability density function (PDF). Previously, the finite difference method (FDM) has often been used to solve the GDEE. However, one may encounter mesh sensitivity when applying the FDM to the PDEM. To this end, the present paper proposes a novel difference-wavelet method that improves the finite difference result by means of a nonlinear wavelet density estimation method. By exploiting the multi-resolution property of wavelet functions and choosing the optimal scale at each instant, the troublesome mesh sensitivity of the finite difference method is expected to be overcome to some extent and a better probability density result obtained. To verify the proposed method, a single-degree-of-freedom (SDOF) oscillator and an 8-story frame structure are investigated in detail. The results show the notable superiority of the proposed method over the finite difference method.
Reconciling scene realism with interactivity has emerged as one of the most important areas in making virtual reality feasible for large-scale CAD data sets consisting of several millions of primitives. Level of detail (LoD) and multi-resolution modeling techniques in virtual reality can be used to speed up the process of virtual design and virtual prototyping. In this paper, we present an automatic LoD generation and rendering algorithm, which is suitable for CAD models and propose a new multi-resolution representation scheme called MRM (multi-resolution model), which can support efficient extraction of fixed resolution and variable resolution for multiple objects in the same scene. MRM scheme supports unified selective simplifications and selective refinements over the mesh. Furthermore, LoD and multi-resolution models may be used to support real-time geometric transmission in collaborative virtual design and prototyping.
Liver diseases are a common medical problem, especially amongst the population of developing countries. Magnetic Resonance Cholangio Pancreatography (MRCP) has become the popular non-invasive, non-ionizing examination for analysis of the hepatobiliary structure in the liver. Unfortunately, conventional 2D MRCP images can be difficult to analyze for biliary tree anomalies, especially with volume effect, artefacts and noise present in these images, whilst good 3D MRI systems are costly for less affluent nations. This paper proposes a scale-space multi-resolution approach to a segment-based implementation of the popular region growing algorithm, to identify the hierarchical structure of the biliary tree in conventional 2D MRCP images. Results obtained are promising in aiding automatic processing of these images to assist medical practitioners in analyzing the biliary tract more efficiently. Application of the algorithm may be extended for telemedicine.
This paper proposes a novel approach for feature extraction based on the segmentation and morphological alteration of handwritten multi-lingual characters. We explore multi-resolution and multi-directional transforms such as the wavelet, curvelet, and ridgelet transforms to extract classifying features from handwritten multi-lingual images. After weighing the pros and cons of each multi-resolution algorithm, we conclude that curvelet-based feature extraction is the most promising for multi-lingual character recognition. We also apply morphological operations such as thinning and thickening, and then perform feature-level fusion to create a robust feature vector for classification. Classification is performed with K-nearest neighbor (K-NN) and support vector machine (SVM) classifiers, and their relative performance is reported. We experiment with our in-house dataset, compiled in our lab from more than 50 people.
In this paper, an efficient approximate sparse representation (SR) algorithm with a multi-selection strategy is used to solve the image fusion problem. We show that the approximate SR is effective for image fusion even if the sparse coefficients are not the sparsest ones possible. A multi-selection strategy is used to accelerate the generation of the approximate sparse coefficients, which guide the fusion of image patches. The relevant parameters are also investigated experimentally to further reduce the computational time. The proposed method is compared with some state-of-the-art image fusion approaches on several pairs of multi-source images. The experimental results show that the proposed method yields superior fusion results in less computation time.
Built-up area detection is very important for applications such as urban planning, urban growth detection, and land use monitoring. In this paper, we address the problem of built-up area detection from the perspective of visual saliency computation. Generally, areas containing buildings attract more attention than forests, open land, and other backgrounds. This paper explores a Bayesian saliency model to automatically detect urban areas. First, the prior probability is computed using a fast multi-scale edge distribution. Then the likelihood is obtained by modeling the distributions of color and orientation. Built-up areas are then detected by segmenting the final saliency map using the Graph Cut algorithm. Experimental results demonstrate that the proposed method can extract built-up areas efficiently and accurately.
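The Bayesian combination of prior and likelihood can be sketched per pixel as below. This is a minimal illustration under assumptions, not the paper's model: the prior and likelihood maps are taken as given, the posterior is simply their product rescaled to a peak of one, and a fixed threshold stands in for the Graph Cut segmentation. The names `posterior_saliency` and `threshold_mask` are hypothetical.

```python
def posterior_saliency(prior, likelihood):
    """Pixel-wise Bayes rule: posterior is proportional to prior * likelihood.

    Both inputs are equally sized 2D lists; the result is rescaled so the
    most salient pixel has value 1.
    """
    product = [[p * l for p, l in zip(prior_row, like_row)]
               for prior_row, like_row in zip(prior, likelihood)]
    peak = max(max(row) for row in product)
    if peak == 0:
        return product
    return [[value / peak for value in row] for row in product]


def threshold_mask(saliency, threshold=0.5):
    """Binary built-up mask. Assumption: a plain threshold replaces Graph Cut."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in saliency]


prior = [[0.9, 0.1], [0.8, 0.2]]        # e.g. from an edge distribution
likelihood = [[0.8, 0.5], [0.2, 0.1]]   # e.g. from color and orientation
saliency = posterior_saliency(prior, likelihood)
mask = threshold_mask(saliency)
```

Only the pixel where both the prior and the likelihood are high survives the threshold, which is the behaviour the Bayesian formulation is meant to encourage.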
To improve the pathological diagnosis capability and feature resolution of 3D human brain CT images, this paper proposes a threshold segmentation method for multi-resolution 3D human brain CT images based on edge pixel grayscale feature decomposition. First, the original 3D human brain image information is collected, the collected CT images are filtered using the gradient value decomposition method, and edge contour features of the 3D human brain CT image are extracted. Then, threshold segmentation is applied to the regional pixel feature blocks of the 3D human brain CT image, dividing the image into block vectors with high-resolution feature points, and the 3D human brain CT image is reconstructed with the salient feature point as the center. Simulation results show that the proposed method provides accuracy up to 100% when the signal-to-noise ratio is 0, and that the accuracy remains stable at 100% as the signal-to-noise ratio increases. Comparison results show that the proposed method is significantly better than traditional methods in pathological feature estimation accuracy, and that it effectively improves rapid pathological diagnosis and positioning recognition for CT images.
This paper first discusses the modeling of a 3D highway integrated with a terrain model, using a new method based on judging the maximum angle to eliminate distorted triangles on the boundary of the model. Such modeling is a fundamental issue for rapid visualization and analysis in 3D GIS. For strip-distributed terrain data, this paper proposes a new dynamic multi-resolution model. Unlike other models and algorithms, this algorithm does not need to pre-partition the model regularly and can achieve multi-resolution representation within one tile. In addition, the approach takes into account that, when roaming, the resolution level of only some triangles changes between adjacent frames, namely those located at the joints between adjacent resolution levels. The speed of the algorithm can therefore be improved. Finally, a software prototype, “VRHighWay”, developed with VC++ 6.0 and OpenGL, is introduced. The experimental results demonstrate that the proposed method achieves better accuracy for the multi-resolution representation of terrains with embedded roads.
Drawing on the rich theory of wavelets, this paper proposes a new method for vehicle license plate image denoising with edge preservation and enhancement, based on multi-resolution image decomposition by a redundant wavelet transform. As the experimental results show, the method yields a noticeable improvement in number plate image quality (less noise without loss of important details). Furthermore, it provides a good basis for subsequent segmentation or recognition.
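The idea of denoising with a redundant (undecimated) wavelet transform can be sketched in one dimension as follows. This is an assumption-laden illustration rather than the paper's method: it uses a single-level undecimated Haar transform with periodic boundaries and soft thresholding of the detail coefficients, and it omits the edge enhancement step, which the abstract does not specify. The names `soft_threshold` and `denoise_undecimated_haar` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero by `threshold`."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def denoise_undecimated_haar(signal, threshold):
    """Single-level undecimated Haar denoising with periodic boundaries.

    Every sample keeps an approximation and a detail coefficient (no
    downsampling), so each sample can be rebuilt from two neighbouring
    coefficient pairs; the two estimates are averaged.
    """
    n = len(signal)
    approx = [(signal[i] + signal[(i + 1) % n]) / 2 for i in range(n)]
    detail = [(signal[i] - signal[(i + 1) % n]) / 2 for i in range(n)]
    detail = [soft_threshold(d, threshold) for d in detail]
    # approx[i] + detail[i] and approx[i-1] - detail[i-1] both estimate
    # sample i; index -1 wraps around, matching the periodic boundary.
    return [((approx[i] + detail[i]) + (approx[i - 1] - detail[i - 1])) / 2
            for i in range(n)]


# A step edge (0 -> 10) with small alternating noise of amplitude 0.2.
noisy = [0.0, 0.2, 0.0, 0.2, 10.0, 10.2, 10.0, 10.2]
clean = denoise_undecimated_haar(noisy, threshold=0.5)
unchanged = denoise_undecimated_haar(noisy, threshold=0.0)
```

Small detail coefficients caused by noise fall below the threshold and vanish, while the large coefficient at the step is only shrunk, so the edge survives; a threshold of zero reproduces the input.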