The computational model on which the algorithms are developed is the array with reconfigurable optical buses (AROB). It integrates the advantages of both optical transmission and electronic computation. The main contributions of this paper are in designing several optimal and/or optimal speed-up template matching algorithms with varying degrees of parallelism on the AROB model. For an N × N digitized image and an M × M template, when the domains of the image and the template are O(log N)-bit integers, we first design several basic operations for window broadcasting and rotation. Then, based on these basic operations, three efficient and scalable algorithms for template matching are derived using various numbers of processors on a two-dimensional (2-D) or 3-D AROB. For 1 ≤ r ≤ N and 1 ≤ p ≤ M ≤ q ≤ N, one runs in […] time using r × r processors, another runs in […] (resp. […]) time using pN × pN/log M (resp. pN × pN × log N) processors, and the other runs in […] (resp. […]) time using pq × pq/log M (or pq × pqN × log N) processors. The latter two algorithms can be tuned to run in O(1) time on a 2-D AROB. To the best of our knowledge, there are no algorithms which can reach this time complexity for this problem on a 2-D array architecture.
We consider fundamental data manipulation operations such as broadcasting, prefix sum, data sum, data shift, data accumulation, consecutive sum, adjacent sum, sorting, and random access reads and writes, and show how these may be performed on the distributed memory bus computer (DMBC). In addition, we study two image processing applications: shrinking and expanding, and template matching. The DMBC algorithms are generally simpler than corresponding algorithms of the same time complexity developed for other reconfigurable bus computers.
A mesh-connected computer enhanced by a reconfigurable bus system is referred to as a reconfigurable mesh (RM). Given an n×n grey-scale image and an m×m template, this paper presents an O(log m) time parallel algorithm for template matching on an RM with O(m²n²) processors. When the image and template are binary, an O(1) time algorithm is presented on an RM with O(m³n²) processors. Both algorithms are superior to the best known algorithms on RMs.
Segmentation of handwritten text into lines, words and characters is one of the important steps in the handwritten text recognition process. In this paper, we propose a float fill algorithm for segmentation of unconstrained Devanagari text into words. Here, a text image is directly segmented into individual words. Rectangular boundaries are drawn around the words, and horizontal lines are detected with template matching. A mask is designed for detecting the horizontal line and is applied to each word from left to right and top to bottom of the document. Header lines are removed for character separation. New segment code features are extracted for each character. We also present the results of multiple classifier combination for offline handwritten Devanagari characters. The use of regular expressions for handwritten characters is a novel concept, and the expressions are defined so that they are more robust to noise.
We have achieved an accuracy of 94% for word-level segmentation, 95% for coarse classification and 85% for fine classification of character recognition. In experiments with a dataset of 5000 character samples, the overall recognition rate observed is 95% when the top five choices are considered. The proposed combined classifier can be applied to handwritten character recognition of other languages, such as English, Chinese, and Arabic, and can recognize the characters with the same accuracy [18]. For printed characters we have achieved an accuracy of 100% by applying the regular expression classifier alone [17].
Using image processing technology to extract important information from meteorological facsimile charts, such as isolines and weather systems, is conducive to integration with other information and has important practical value in navigation operations, marine weather forecasting, target recognition, and image retrieval. Meteorological facsimile charts contain many types of isolines, lines that are dense in some areas, and multiple kinds of superimposed information, such as isolines overlapping isoline characters and intersecting weather system symbols. For the different characteristics of contours, numeric characters, weather system symbols and other objects, corresponding extraction and recognition methods are proposed. The latitude and longitude lines and the coastline are removed from the chart by basemap matching. The figure boxes are extracted according to their position and shape features, and the different character tagging information is separated and removed. After the triangles and semicircles in the weather symbols of the frontal system are identified, the frontal symbols are extracted based on circumscribed triangles and template matching. To recognize numeric characters, the contour characters on the fax image are first expanded into block-shaped connected regions; the position of the character information is determined from the number of pixels in each connected region, and rotation and template matching are then used to identify the numeric characters. Experiments on meteorological facsimile charts from the US Meteorological Center and the Japan Meteorological Center show that the method extracts the symbols of frontal weather systems completely and accurately, and reduces the computational complexity of contour detection, isoline extraction and numeric recognition.
The methods can detect some information from weather charts properly and the error rate is very low.
A vision-based method for detecting cracks in the concrete sleepers of railway tracks is introduced in this paper. The method detects and partially classifies the cracks in two successive steps based on image processing and pattern recognition techniques. The method has been applied to acquired image frames, followed by analysis, experimental and comparison results, and evaluation. The presented results are reasonable, which indicates the soundness of the introduced method. The preliminary results of this work have been presented in [A. Delforouzi, A. H. Tabatabaei, M. H. Khan and M. Grzegorzek, A vision-based method for automatic crack detection in railway sleepers, in Kurzynski, M., Wozniak, M., Burduk, R. (eds.), Proceedings of the 10th International Conference on Computer Recognition Systems CORES 2017, Polanica Zdroj, Poland. CORES 2017. Advances in Intelligent Systems and Computing, Vol. 578 (Springer, Cham, 2018), pp. 130–139, doi: 10.1007/978-3-319-59162-9_14].
There is an increasing demand for biometric security systems in several fields. This study presents a highly accurate facial recognition method that uses high-speed transformation and facial morphing using region-limited log-polar transformation based on a center point calculated from the coordinates of both eyes and corners of the mouth. Log-polar transformation is limited to the region, so that the region including the feature can be suppressed to the minimum range, thereby facilitating high-speed transformation. Additionally, after facial morphing, the shapes of the eyes and mouth are altered based on the outline of the face, enabling high-precision facial recognition. The efficacy of the proposed method is verified experimentally. Therefore, we can confirm that 91.72% of images using the color FERET database and 96% of images using the FEI face database can be recognized using our method.
Traditional template matching-based motion estimation is a popular but time-consuming method for vibration vision measurement. In this study, the particle swarm optimization (PSO) algorithm is improved to solve this time-consumption problem. The convergence speed of the algorithm is increased using the adjacent frames search method in the particle swarm initialization process. A flag array is created to avoid repeated calculation in the termination strategy. The subpixel positioning accuracy is ensured by applying the surface fitting method. The robustness of the algorithm is ensured by applying the zero-mean normalized cross correlation. Simulation results demonstrate that the average extraction error of the improved PSO algorithm is less than 1%. Compared with the commonly used three-step search algorithm, diamond search algorithm, and local search algorithm, the improved PSO algorithm consumes the least number of search points. Moreover, tests on real-world image sequences show good estimation accuracy at very low computational cost. The improved PSO algorithm proposed in this study is fast, accurate, and robust, and is suitable for plane motion estimation in vision measurement.
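The zero-mean normalized cross correlation mentioned above can be illustrated with a minimal sketch. The function name `zncc` and the list-of-rows image layout are assumptions made for this example only; the PSO search, the flag array, and the surface-fitting subpixel step from the abstract are not shown.

```python
import math

def zncc(patch, template):
    """Zero-mean normalized cross correlation of two equally sized 2-D patches.

    Both arguments are lists of rows of numbers (an assumed layout).
    The score lies in [-1, 1]; 1 means the patches match up to gain and offset.
    """
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patch and template must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Correlate the mean-removed intensities.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    if den == 0:
        # A constant patch has no texture to correlate against.
        return 0.0
    return num / den

# A template that is brighter by a constant gain still correlates perfectly.
score_same = zncc([[1, 2], [3, 4]], [[10, 20], [30, 40]])
score_flipped = zncc([[1, 2], [3, 4]], [[4, 3], [2, 1]])
print(score_same, score_flipped)
```

Because the mean is subtracted and the result is normalized, the score is insensitive to uniform changes in brightness and contrast, which is the usual reason this measure is chosen for robustness.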
The recognition of patterns is an important task in robot and computer vision. The patterns themselves could be one- or two-dimensional, depending upon the application. Pattern matching is a computationally intensive and time consuming operation. The design of special purpose hardware could speed up the matching task considerably, making real-time responses possible. Advances in parallel processing and VLSI technologies have made it possible to implement inexpensive, efficient and very fast custom designs. Many approaches and solutions have been proposed in the literature for hardware implementations of pattern matching techniques. In this paper, we present a detailed overview of some of the important contributions in the area of hardware algorithms and architectures for pattern matching.
The recognition of polygons in 3-D space is an important task in robot vision. Advances in VLSI technology have now made it possible to implement inexpensive, efficient and very fast custom designs. The authors have earlier proposed a class of VLSI architectures for this computationally intensive task, which makes use of a set of local shape descriptors for polygons which are invariant under affine transformations, i.e. translation, scaling, rotation and orthographic projection from 3-D to any 2-D plane. This paper discusses the design and implementation of PMAC, a prototype for polygon matching, as a custom CMOS VLSI chip. The recognition procedure is based on the matching of edge-length ratios using a simplified version of the dynamic programming procedure commonly employed for string matching. The matching procedure also copes with partial occlusions of polygons. The implemented architecture is systolic and fully utilizes the principles of pipelining and parallelism in order to obtain high speed and throughput.
We present four algorithms to perform template matching of an N×N image with an M×M template. The first algorithm is sequential, and the others are based on a SIMD mesh connected computer with N×N processors. In the first three algorithms, both the image and the template are represented by quadtrees, whereas in the last one, the template is represented by a quadtree, and the image is represented by a matrix. The time complexities of the four algorithms are respectively upper-bounded by α₁N²M², α₂N + β₂M², α₃N + β₃M², and β₄M², where α₁, α₂, β₂, α₃, β₃, and β₄ are constants.
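To make the N²M² order of the sequential bound concrete, the following is a minimal sketch of brute-force sequential template matching on plain matrices. It is not the quadtree-based algorithm of the abstract: the sum-of-squared-differences score and the name `match_template_ssd` are assumptions chosen for illustration.

```python
def match_template_ssd(image, template):
    """Slide the template over the image and return (row, col, score)
    for the placement with the smallest sum of squared differences.

    image and template are lists of rows of numbers (an assumed layout).
    The four nested loops give on the order of N^2 * M^2 operations.
    """
    n_rows, n_cols = len(image), len(image[0])
    t_rows, t_cols = len(template), len(template[0])
    best = None
    for r in range(n_rows - t_rows + 1):          # every valid top-left row
        for c in range(n_cols - t_cols + 1):      # every valid top-left column
            score = 0
            for i in range(t_rows):               # compare the whole window
                for j in range(t_cols):
                    diff = image[r + i][c + j] - template[i][j]
                    score += diff * diff
            if best is None or score < best[2]:
                best = (r, c, score)
    return best

image = [
    [0, 0, 0, 0],
    [0, 5, 6, 0],
    [0, 7, 8, 0],
    [0, 0, 0, 0],
]
template = [[5, 6], [7, 8]]
print(match_template_ssd(image, template))  # -> (1, 1, 0)
```

The parallel algorithms in the abstract reduce this cost by distributing the window comparisons over the N×N processors of the mesh.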
This paper describes some issues in building a 3-D human face modeling system which mainly consists of three parts:
• Modeling human faces;
• Analyzing facial motions;
• Synthesizing facial expressions.
A variety of techniques developed for this system are described in detail in this paper. Some preliminary results of applying this system to computer animation, video sequence compression and human face recognition are also shown.
The objective of this study is to analyze and compare three different recognition approaches to machine printed Arabic characters. The first approach is a template matching and a correlation technique where an input character is compared to a standard set of stored prototype images. The second and the third approaches are based on feature analysis and matching. The features in the second approach are extracted from the horizontal and vertical projections of the images of characters. The third approach is a structural approach where the features are extracted from the geometry of the segments that make a character. In all approaches, the same neural network structure, feedforward with the back propagation learning algorithm, is used for classification. A centering and scaling normalization preprocessing stage precedes the feature extraction process and is used to achieve a size and a position invariant recognition system. The study focuses on the 28 basic Arabic characters of the Cairo font. The performance of the recognition algorithms in each approach is evaluated and the results are compared.
The emCGA is a new extension of the compact genetic algorithm (CGA) that includes elitism and a mutation operator. These improvements do not significantly increase the computational cost or the memory consumption, yet they increase the overall performance in comparison with other similar works. The emCGA is applied to the problem of object recognition in digital images. The objective is to find a reference image (template) in a landscape image, subject to distortions and degradation in quality. Two models for dealing with the images are proposed, both based on the intensity of light. Several experiments were done with reference and landscape images under different situations. The emCGA was compared with an exhaustive search algorithm and another CGA proposed in the literature, and was found to be more efficient for this problem than the other algorithms. We also compared the two proposed models for the object. One of them is more suitable for images with rich details, and the other for images with a low illumination level. Both models seem to perform equally in the presence of distortions. Overall, the results suggest the efficiency of the emCGA for template matching in images and encourage future developments.
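For readers unfamiliar with the underlying technique, here is a minimal sketch of the plain compact genetic algorithm, which keeps a probability vector instead of a population. It deliberately omits the elitism and mutation operators that define the emCGA, since the abstract does not specify them. The function name, the parameters `virtual_pop` and `max_iters`, and the bit-counting fitness used in the demo are all assumptions for illustration.

```python
import random

def compact_ga(fitness, n_bits, virtual_pop=50, max_iters=10000, seed=0):
    """Plain compact GA: evolve a probability vector over bit strings.

    fitness     -- function mapping a list of 0/1 bits to a number (higher is better)
    virtual_pop -- simulated population size; the update step is 1 / virtual_pop
    """
    rng = random.Random(seed)
    probs = [0.5] * n_bits            # each bit starts equally likely to be 0 or 1
    step = 1.0 / virtual_pop
    for _ in range(max_iters):
        # Sample two candidate solutions from the probability vector.
        a = [1 if rng.random() < p else 0 for p in probs]
        b = [1 if rng.random() < p else 0 for p in probs]
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        # Shift probabilities toward the winner where the two candidates differ.
        for i in range(n_bits):
            if winner[i] != loser[i]:
                delta = step if winner[i] == 1 else -step
                probs[i] = min(1.0, max(0.0, probs[i] + delta))
        # Stop once every probability has (numerically) converged to 0 or 1.
        if all(p < 1e-9 or p > 1.0 - 1e-9 for p in probs):
            break
    return [1 if p >= 0.5 else 0 for p in probs]

# Demo on a toy problem: maximize the number of 1 bits.
solution = compact_ga(sum, n_bits=8)
print(solution)
```

In a template matching setting the bit string would encode a candidate placement of the template and the fitness would be a similarity score; that encoding is not shown here.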
Research reported in this article is motivated, in part, by current U.S. military programs aimed at the development of efficient data integration and sensor management methods capable of handling large sensor suites and achieving robust target recognition performance in real time scenarios. Modern sensor systems have shown good recognition abilities against a few isolated targets. However, these capabilities decline steeply when multiple sensors are acting against large target groups under realistic conditions requiring dynamic allocation of the sensor resources and efficient on-line integration and disambiguation of multiple sensor outputs. Neural networks and other sensor integration technologies have been inspired by cognitive models attributing human perceptual integration to parallel processing and convergence of simultaneous data streams. This article explores a different model emphasizing serial processing and association of consecutive memory traces in the Long Term Memory (LTM) into a globally connected memory structure called a Virtual Associative Network (VAN). Information integration in VAN is called blending. Target representation is constructed dynamically from the segments of virtual net matched serially against the input segments in the Short Term Memory (STM). This article will elaborate the concept of blending, reference its biological foundations, explain the difference between information blending and conventional sensor fusion techniques, and demonstrate blending applications in a large scale sensor management task.
The nondestructive determination of the maturity of durian is an important process for quality control. Since durian has specific properties, such as a thick peel, non-uniform shape, rough prickly skin, and large size, the nondestructive determination of maturity is hard to perform. This paper proposes two approaches for determining the maturity of durian nondestructively, using vibration and ultrasound. The vibration or ultrasonic signal is transferred directly through the durian in the region between prickles located at the middle of the fruit. The frequency response of the vibrating durian is measured by laser Doppler. The measured signal is processed with the wavelet transform to extract its high-frequency part, which is converted into spectrum form. Template matching is performed by correlating the high-frequency spectrum of the signal with templates in order to determine the maturity of the durian. To evaluate the proposed method, two prototype systems, one using forced vibration and one using ultrasound, are constructed. The experimental results are compared with those of the dry-weight percentage method, a destructive method that is taken as the reference. The results of template matching by correlation are also compared with those of conventional signal processing methods based on vibration velocity and elastic constants. The results reveal that the methods using forced vibration and ultrasound with template matching are more accurate, at approximately 95%.
In Khmer printed text, the same character takes various shapes depending on the font, and some characters are very similar in shape. In this paper we address these problems and propose a method of Khmer printed character recognition using wavelet descriptors. In the recognition process, the Khmer printed character images are first converted to skeleton forms, and the skeletons are then converted to the temporal domain. The templates are obtained from the wavelet coefficients of the character training set. To match input characters with templates, the character recognition method using deformable wavelet descriptors is adapted to use fixed templates and a Euclidean distance classifier. The template with the smallest distance gives the recognition result. The deformation step is skipped because it might lower the recognition rate for similar characters. The experiment consists of two parts. The first part evaluates the overall recognition rate of input characters in three different sizes (22-point, 18-point and 12-point) from 10 different fonts of Khmer printed characters. Twenty styles of characters are used as the training set. The recognition rates are 92.85, 91.66, and 89.27 percent for 22-point, 18-point, and 12-point characters, respectively. The second part evaluates the system on one 21-page document of Khmer printed characters at different resolutions, obtained from a scanner and from a facsimile (fax) machine. The document is initially printed at 300 dpi (dots per inch) and then scanned at three different resolutions: 600 dpi, 300 dpi and 150 dpi. The document received from the fax machine is scanned at 300 dpi. The recognition rates are 92.99, 88.61, and 80.05 percent for the 300 dpi and 150 dpi resolutions and for the fax input, respectively.
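The fixed-template matching step with a Euclidean distance classifier can be sketched as a nearest-template lookup. The feature vectors below are made-up stand-ins for wavelet coefficients, and the labels `"ka"` and `"kha"` as well as the function name `classify_nearest` are hypothetical; skeletonization and the wavelet transform itself are not shown.

```python
import math

def classify_nearest(features, templates):
    """Return (label, distance) of the template closest to the feature vector.

    features  -- sequence of numbers describing the input character
    templates -- dict mapping a label to a reference vector of the same length
    """
    best_label, best_dist = None, math.inf
    for label, reference in templates.items():
        dist = math.dist(features, reference)   # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

# Hypothetical two-coefficient templates for two characters.
templates = {"ka": [0.0, 1.0], "kha": [1.0, 0.0]}
label, distance = classify_nearest([0.1, 0.9], templates)
print(label, round(distance, 4))
```

With fixed templates there is no per-input deformation to optimize, so classification reduces to one distance computation per stored template.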
This study aimed to measure dynamic responses of structural systems using smartphone videos and vision-based sensing processes. Two algorithms, based on template matching and feature extraction, are employed for this purpose. They were verified on shake table experiments with single- and multi-degree-of-freedom steel specimens, whose videos were captured by an ordinary smartphone during excitation. Dynamic responses such as strains at discrete sections as well as displacements, velocities, and accelerations at the floor levels of the specimens were obtained by following the signs of physical or virtual markers during the video recording. Through comprehensive experiments and one available full-scale multi-story shake table experiment in the literature, the developed vision-based algorithms were validated for use with smartphone videos. It was also shown that, regardless of the quality of the video record, substantial characteristics of a specimen or a structure could be determined reasonably from smartphone videos, since the absolute mean relative differences varied between 10% and 20%.
In this paper, we describe a method for recognizing lung nodule shadows in X-ray CT images using 3-dimensional nodule and blood vessel models. From these 3D object models, artificial CT images are generated as templates. The templates are then applied to input images that contain suspicious shadows. If the parameters of a suspicious shadow match a nodule template rather than any blood vessel template, the shadow is determined to be abnormal. Otherwise, it is determined to be normal. By applying our new method to the actual lung CT images of 38 patients, the false positive ratio is reduced to 4.31 [shadow/patient] with the sensitivity exceeding 95%.
We study the problem of thresholding the residual of template matching as a preprocess for selecting the correct matches between feature points in two images. In order to determine the threshold dynamically, we introduce a statistical model of the residual and compute an optimal threshold according to that model. The model parameters are estimated from the histogram of the residuals of candidate matches. Using real images, we show that our method can substantially upgrade the quality of the initial matches by simply adjusting the threshold.