Advances in Handwriting Recognition contains selected key papers from the 6th International Workshop on Frontiers in Handwriting Recognition (IWFHR '98), held in Taejon, Korea, from 12 to 14 August 1998. Most of the papers have been expanded or extensively revised to include helpful discussions, suggestions or comments made during the workshop.
Sample Chapter(s)
Handwriting Recognition or Reading? Situation at the Dawn of the 3rd Millennium (633 KB)
https://doi.org/10.1142/9789812797650_fmatter
The following sections are included:
https://doi.org/10.1142/9789812797650_0001
During the last forty years, Human Handwriting Processing (HHP) has most often been investigated within the frameworks of Character Recognition (OCR) and Pattern Recognition. An evolution has recently occurred, and today HHP can much more readily be viewed as an automatic Handwriting Reading (HR) task for the machine. At the dawn of the 3rd millennium, we expect that HHP will increasingly be considered a perceptual and interpretation task closely connected with research into Human Language. This paper gives some guidelines and examples for designing systems able to perceive and to interpret, i.e. to read, handwriting automatically.
https://doi.org/10.1142/9789812797650_0002
A fast HMM algorithm is proposed for on-line handwritten character recognition. After preprocessing, input strokes are discretized so that a discrete HMM can be used. This particular discretization naturally leads to a simple procedure for assigning initial state and state transition probabilities. In the training phase, complete marginalization with respect to states is not performed. A criterion based on a normalized log-likelihood ratio is given for deciding when to create a new model for the same character in the learning phase, in order to cope with stroke-order variations and large shape variations. Experiments are done on the Kuchibue database from the Tokyo University of Agriculture and Technology. The algorithm appears to be very robust against stroke-number variations and to have reasonable robustness against stroke-order variations and large shape variations. The results seem encouraging.
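The abstract's criterion for spawning a new model can be sketched as follows; the function name, the length normalization, and the threshold value are illustrative assumptions, not the paper's actual formulation:

```python
def needs_new_model(best_log_likelihood, num_observations, threshold=-5.0):
    # Normalize the best existing model's log-likelihood by sequence
    # length so that long and short character samples are comparable,
    # then spawn a new model when the score falls below a threshold
    # (the threshold value here is an arbitrary example).
    normalized = best_log_likelihood / num_observations
    return normalized < threshold
```

A sample scoring -120 over 20 observations (normalized -6.0) would trigger a new model under this example threshold, while -80 over 20 (normalized -4.0) would not.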
https://doi.org/10.1142/9789812797650_0003
We address the problem of determining the best training text for large-vocabulary, writer-dependent, unconstrained English handwriting recognition. Our goal is to achieve maximum recognition accuracy, while minimizing the duration and tedium of the user's task of writing training text. We explore recognition accuracy as a function of three dimensions of training text: length, choice of character-coverage criterion, and relative priority of keeping the text interesting vs. optimizing to the chosen character-coverage criterion. Our results show various advantages to using coverage criteria based on (1) balancing occurrences of character unigrams and (2) incorporating most-common bigrams. We also find that preserving a theme in the training text causes relatively little harm to coverage or recognition accuracy.
https://doi.org/10.1142/9789812797650_0004
In this paper, we propose a simple yet robust structural approach for recognizing on-line handwriting. Our approach is designed to achieve reasonable speed, fairly high accuracy and sufficient tolerance to variations. At the same time, it maintains a high degree of reusability and hence facilitates extensibility. Experimental results show that the recognition rates are 98.60% for digits, 98.49% for uppercase letters, 97.44% for lowercase letters, and 97.40% for the combined set. When the rejected cases are excluded from the calculation, the rates increase to 99.93%, 99.53%, 98.55% and 98.07%, respectively. On average, the recognition speed is about 7.5 characters per second running in Prolog on a Sun SPARC 10 Unix workstation, and the memory requirement is reasonably low.
https://doi.org/10.1142/9789812797650_0005
Out-of-order diacriticals introduce significant complexity to the design of an on-line handwriting recognizer, because they require some reordering of the time domain information. It is common in cursive writing to write the body of a ‘t’, ‘j’ or ‘i’ (and sometimes an ‘x’ or ‘f’) during the writing of the word, and then to return and dot or cross the letter once the word is complete. The difficulty arises because we have to look ahead, when scoring one of these letters, to find the mark occurring later in the writing stream that completes the letter. We should also remember that we have used this mark, so that we don't use it again for a different letter, and we should also penalize a word if there are some marks that look like diacriticals that are not used. This paper describes an efficient method that provides a natural mechanism for considering alternative treatments of potential diacriticals, to see whether it is better to treat a given mark as a diacritical or not and directly compare the two outcomes by score.
https://doi.org/10.1142/9789812797650_0006
In this paper, we introduce a framework for handwriting recognition based on perceptual cycles. This framework permits us to coherently integrate different sources of information, such as segmentation, letter spatial coherency and lexical knowledge. We then present a recognition system built upon this framework.
https://doi.org/10.1142/9789812797650_0007
This paper presents a comparison of several advanced modeling techniques for HMM-based handwriting recognition. The performance of these methods is tested on an extremely challenging task, namely the improvement of a baseline very-large-vocabulary HMM-based handwriting recognition system with a vocabulary of 200 000 German words. The use of sophisticated HMM technology allows the construction of such a baseline system. It is, however, extremely difficult to improve such a system further, because the very large vocabulary leads to implementation problems related to memory and search-space limitations. The paper investigates some advanced techniques for further improvement of such systems and demonstrates how several special effects of these techniques can be exploited to increase recognition performance without enlarging the memory or search-space requirements. This is mainly achieved by introducing a novel hybrid connectionist/HMM approach to handwriting recognition, based on so-called Maximum Mutual Information Neural Networks (MMINN), which serve as a neural vector quantizer for the HMM-based recognition system. The result of this investigation is one of the first available 200k writer-dependent on-line handwriting recognition systems; it can be demonstrated on a PC and has achieved an average recognition rate of more than 90% in evaluations with several test writers.
https://doi.org/10.1142/9789812797650_0008
Within the proposed approach, handwritten words are represented as left-to-right sequences of graphemes and can be described by ergodic HMMs. In contrast to conventional HMMs, a feedforward multilayer neural network provides the grapheme observation probabilities, while the transition probabilities are used as in conventional HMMs. Training of the hybrid proceeds by an EM-type algorithm, in which the HMMs provide the targets for the discriminant training of the neural network. Our system is designed for small-vocabulary problems. We report results obtained on a large database of words, showing recognition rates close to 93% for the 28-word vocabulary relevant to French legal amounts.
https://doi.org/10.1142/9789812797650_0009
This paper is concerned with an approach to the automated design of filters for background removal from bankchecks. The approach is derived from a simplified version of morphological subtraction and its interpretation, using the 2D-Histogram. From an empty bankcheck, a mask is computed, which can be added to every filled-in bankcheck of the same kind, filtering out the user-entered information within the bankcheck image. This approach works well for many types of bankchecks, but it fails for EC bankchecks. By considering the 2D-Histograms of empty and filled-in EC bankcheck images, a means is given for improving the result for EC checks, using the so-called 2D-Lookup algorithm. Moreover, this algorithm allows for the use of the filled-in image only, if there is a way to specify two filter operations which process the filled-in image so as to obtain a separable 2D-Histogram. An evolutionary algorithm, genetic programming (GP), is used for performing this adaptation task. The evolutionarily generated filters show very good performance for EC bankcheck background removal. A detailed study shows the importance of a family of image processing operations, the OWA operations, for this removal.
https://doi.org/10.1142/9789812797650_0010
In this paper we overview an architecture which allows form readers to be built using a toolkit of software components. The application of this architecture is demonstrated by the rapid instantiation of two quite different prototype form reader systems. It is suggested that future generations of our architecture will aid the construction of readers which show a marked improvement in accuracy and flexibility compared with current form readers.
https://doi.org/10.1142/9789812797650_0011
In the automated reading of bank checks by an OCR program, a check is usually rejected if the recognition results for the courtesy and the legal amounts do not agree. This paper presents a new method for the localization, analysis and correction of recognition errors that cause the disagreement. The correction is based on a second, more sophisticated classification of detected potential errors in the legal amount. On a database with real Swiss postal checks, the acceptance rate can be improved from 58% to 65% without increasing the error rate. Since only local parts and not the whole amount are re-estimated, the method is very efficient.
https://doi.org/10.1142/9789812797650_0012
Handwriting recognition is a rather easy task for humans, but it remains extremely difficult for classical algorithmic methods. Most of the systems developed so far do not take into account the great variability of handwriting. Thus, their processing cost is independent of the difficulty of the task and corresponds to the worst case. We suggest an approach which allows the processing to be bounded by adapting it to the difficulty of the problem. For that purpose, we use global word recognition and adaptive classifiers.
https://doi.org/10.1142/9789812797650_0013
Scanning a filled form results in a single image which contains both the form's preprinted background design and its handwritten foreground. Form dropout is concerned with taking this image and splitting it to produce one image containing only the form's foreground components, and another containing the form's background. This process is important since many stages in a form reader require that foreground and background images are available separately for analysis. In this paper we present a colour classification algorithm which can be used for fast and accurate dropout of colour forms. After describing the algorithm we illustrate its performance over a diverse test set of 40 form designs.
https://doi.org/10.1142/9789812797650_0014
A text-line extraction algorithm for unconstrained handwritten documents, based on the Hough Transform, is proposed. A natural learning method, similar to the learning procedure of human beings, is used in this algorithm. First, some general information is obtained by applying the Hough Transform. Then natural learning is used during clustering. The algorithm gains more information step by step through natural learning until the final segmentation result is reached.
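As a rough illustration of the Hough step (the voting resolution and accumulator layout here are our own assumptions, not the paper's procedure), points such as connected-component centroids can vote for candidate lines in (theta, rho) space, and the most-voted cell approximates a dominant text line:

```python
import math
from collections import defaultdict

def hough_peak(points, theta_steps=180, rho_res=1.0):
    """Each point votes for every line through it, parameterized as
    rho = x*cos(theta) + y*sin(theta); return the most-voted cell."""
    acc = defaultdict(int)
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_res))] += 1
    return max(acc, key=acc.get)
```

For four collinear centroids on the horizontal line y = 5, the winning cell has rho near 5 with theta near vertical, as expected for a horizontal text line.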
https://doi.org/10.1142/9789812797650_0015
This paper is devoted to the description of a handwritten word recognizer. It works with a whole word or a phrase written as one string. The recognizer is based on a holistic approach that allows us to avoid the segmentation stage and the corresponding segmentation mistakes. Recognition involves feature extraction, feature ordering into a linear sequence, and matching the input word's feature representation against all lexicon entry representations. The set of features reflects the importance of vertical extrema, which are the most stable elements of writing. We introduce a measure of similarity between different types of features. It gives us the high flexibility in the matching process necessary for the recognition of sloppy handwriting. The performance measured on word images cut from envelopes is 86.4% for a lexicon size of 1000. The recognizer is used as the core of the check reader licensed by NCR for its industrial check processing system.
https://doi.org/10.1142/9789812797650_0016
This paper describes techniques for handwritten text recognition that will enable recognition of ordinary handwritten text in documents. Image analysis concerns problems such as text-line detection, text-line extraction, noise removal, underline removal, and recognition of digits, alphabets, and special symbols. The second stage, word segmentation and recognition, concerns the isolation of words from a text-line image and handwritten word recognition algorithms with and without a dictionary. Postprocessing uses linguistic constraints to intelligently parse and recognize text. Key ideas employed in each functional module are described in the paper. The system is written in C, has about 30,000 lines of code, and takes less than one minute on a Sun SPARC 20 to process an 8.5 in × 11 in page scanned at 300 dpi binary.
https://doi.org/10.1142/9789812797650_0017
The paper addresses the problem of recognizing cursive phrases when word segmentation is very difficult or truly impossible, as in the case of literal amounts on Italian cheques, where words are written connected together. Since the general approach adopted for recognizing handwriting is that of generating a graph of segmentation hypotheses, the technical problem becomes that of searching for the path in the segmentation graph that optimally matches all the possible sequences of letters, themselves represented by paths in a graph that describes the phrase grammar.
The paper first describes the basic algorithm, discussing in particular the important issue of the form of the cost function, chosen so as not to bias recognition towards shorter phrases. We then discuss a hierarchical decomposition of the search algorithm that fits the usual hierarchical forms of linguistic knowledge representation, i.e. a grammar level and a lexical level. Accordingly, we describe a Hierarchical Dynamic Programming (HDP) algorithm, composed of a Sentence-Level and a Word-Level DP (WLDP). With a proper design, the HDP algorithm comes out very simple, and the WLDP results in a simple extension of the common search algorithms used to recognize single words.
https://doi.org/10.1142/9789812797650_0018
An approach to handwritten word recognition is described which attempts to combine the properties of hidden Markov modeling with those of segmentation-by-recognition. The approach is based on a heuristic segmentation of word images which is designed to find all character frontiers and which also tends to propose superfluous segmentation cuts. Bitmaps delimited by neighboring cut hypotheses are input to a character recognition module whose outputs serve as the observations of the hidden Markov model. Through the use of a character recognizer applied to recombined bitmaps, the method is able to recover the identity of pseudo-characters using contextual information. By considering the recognizer outputs as HMM observations instead of using them directly in a scoring function, the approach inherits the well-founded and efficient estimation algorithms of hidden Markov models.
https://doi.org/10.1142/9789812797650_0019
This paper first summarizes a number of findings in human reading of handwriting. A method is proposed to uncover more detailed information about the geometrical features which human readers use in the reading of Western script. The results of an earlier experiment on the use of ascender/descender features were used for a second experiment aimed at more detailed features within words. A convenient experimental setup was developed, based on image enhancement by local mouse-clicks under time pressure. The readers had to develop a cost-effective strategy to identify the letters in the word. Results revealed a left-to-right strategy over time, with, however, extra attention to the initial, leftmost parts and the final, rightmost parts of words across a range of word lengths. The results confirm high hit rates on ascenders, descenders, crossings and points of high curvature in the handwriting pattern.
https://doi.org/10.1142/9789812797650_0020
In this paper we present a system for the off-line recognition of handwritten sentences. The system takes scanned images of handwritten text as input. An input image is first binarized. Next, the lines are extracted. Then feature extraction is performed by a neural network. The extracted features are input to a hidden Markov model, where word recognition takes place. After word recognition, contextual postprocessing is performed by means of a language model. Using a simple statistical bigram model, we could improve the recognition rate of the system from 74% to 85% at the word level.
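The bigram postprocessing step might look roughly like the following sketch, assuming per-position candidate lists from the word recognizer and a bigram table; all names and the floor probability for unseen bigrams are illustrative assumptions:

```python
import math

def rescore(candidates, bigram, floor=1e-6):
    """Viterbi search over per-position candidate lists.

    candidates: list of positions, each a list of (word, log_score)
    pairs from the recognizer; bigram: dict mapping (prev, word) to a
    probability. Unseen bigrams receive a small floor probability.
    """
    # paths: last word -> (total log score, best word sequence so far)
    paths = {w: (s, [w]) for w, s in candidates[0]}
    for position in candidates[1:]:
        new = {}
        for w, s in position:
            prev = max(paths, key=lambda p: paths[p][0]
                       + math.log(bigram.get((p, w), floor)))
            total = paths[prev][0] + math.log(bigram.get((prev, w), floor)) + s
            new[w] = (total, paths[prev][1] + [w])
        paths = new
    return max(paths.values())[1]
```

Even when the recognizer slightly prefers a wrong word at some position, a strong bigram with the preceding word can flip the decision, which is the mechanism behind the reported 74% to 85% improvement.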
https://doi.org/10.1142/9789812797650_0021
This paper introduces a new handwriting recognition system that is currently under development. Our application is the reading of German handwritten addresses for automatic mail sorting. The quality of the handwritten words is often bad in this application, because writers are not very cooperative. Therefore we have developed some suitable and efficient preprocessing operations to clean the image and normalize the writing.
Because the words are often difficult to segment into letters, we have chosen a segmentation-free approach for recognition with semi-continuous Hidden Markov Models. We are applying the technique of context modelling in a model hierarchy in order to train more specific letter models.
For training and evaluation, we have used a large sample of 15000 handwritten city and street names. A number of experiments have been performed to evaluate strategies for feature space reduction (Karhunen-Loeve transform, linear discriminant analysis). On a 100 word lexicon, we achieve recognition rates of up to 90% on large independent test sets.
https://doi.org/10.1142/9789812797650_0022
Two methods for stroke segmentation from a global point of view are presented and compared. One is based on thinning, the other on contour curve fitting. In both cases, an input image is binarized. For the former, Hilditch's thinning method is used. Crossing points are then sought, around which a domain is constructed. Outside the domain, a set of line segments is identified. These lines are connected and approximated by cubic B-spline curves, and smoothly connected lines are selected as segmented curves. This method works well for a limited class of crossing lines, as shown experimentally. In the latter method, a contour line is approximated by a cubic B-spline curve, along which curvature is measured. The contour line is segmented according to the extreme points of the curvature graph, based on which line segmentation is performed. Experimental results are shown for some difficult cases.
https://doi.org/10.1142/9789812797650_0023
Character segmentation for handwritten cursive scripts is known to be a difficult problem, due to both the high variability of handwriting style and the connectivity of adjacent characters. Recently, deformable models have been studied extensively for the extraction of non-rigid patterns. The success of using deformable models to recognize manually segmented, isolated characters has been demonstrated in our earlier work. However, the approach works poorly in the presence of gross errors which are commonly found in handwritten cursive scripts in the form of stroke anomalies and closely cluttered characters. The problem persists even with good model initialization. In this paper, robust statistical techniques (M-estimation) are integrated with deformable pattern matching for the extraction of individual characters from handwritten cursive scripts that exhibit a large degree of gross errors, or outliers. The effectiveness of our algorithm is demonstrated by applying it to character segmentation using data in the CEDAR database.
https://doi.org/10.1142/9789812797650_0024
This paper describes a new algorithm for segmenting handwritten text lines into characters, used in a newly developed ICR system. The segmentation process consists of three major components: preliminary character separation, grouping of fragmented characters, and splitting of touching characters. Preliminary character separation extracts portions of the text line that contain groups of pixels clearly separated from the other pixels in the row. Grouping of fragmented characters identifies components of the image that can be merged to obtain a single character. Splitting of touching characters finds the cuts along which the components of more than one character can be separated in order to isolate every single character. Experiments on more than two hundred numeric and alphabetic handwritten strings provided by a line-finder process have shown that the proposed approach is robust enough to cope with both highly fragmented characters and touching characters.
https://doi.org/10.1142/9789812797650_0025
This paper presents a new technique for cursive word segmentation and describes its application to a system for cursive amount recognition of Italian bank checks. The technique is based on a hypothesis-then-verification strategy. Hypothesis generation is obtained by a segmentation technique simulating a “drop-falling” process. Hypothesis verification is accomplished by evaluating the recognition confidence of the segmented elements and its consistency with the constraints of the contextual knowledge of the specific domain of application.
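A minimal sketch of a drop-falling pass on a binary image, under assumptions of our own (a single drop, sliding right before left), might look like:

```python
def drop_fall(grid, start_col):
    """A 'drop' descends row by row from the top of a binary image; on
    hitting ink (1) it slides right, then left, to go around the
    stroke; when blocked on both sides it cuts straight through.
    Returns the candidate cut path as a list of (row, col) positions."""
    rows, cols = len(grid), len(grid[0])
    path, c = [], start_col
    for r in range(rows):
        if grid[r][c] == 1:
            if c + 1 < cols and grid[r][c + 1] == 0:
                c += 1          # slide right around the obstacle
            elif c - 1 >= 0 and grid[r][c - 1] == 0:
                c -= 1          # slide left around the obstacle
            # otherwise cut through the stroke at column c
        path.append((r, c))
    return path
```

The resulting path is only a segmentation hypothesis; in the paper's hypothesis-then-verification strategy it would still have to pass the recognition-confidence and contextual-consistency checks.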
https://doi.org/10.1142/9789812797650_0026
In this paper, we propose a character segmentation method which obtains shape feature vectors from segmentation candidates, evaluates the candidates by linearly transforming their feature vectors, and searches for the segmentation path with the best sum of evaluated values. The advantage of the proposed method is that the parameters of the linear transformation can be optimized by a steepest-gradient method to obtain the best segmentation rate on training samples. Since the optimization can be carried out as on-line learning, the segmentation results can be gradually adapted to the writer. Experiments demonstrate the efficiency of the proposed method.
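The search for the path with the best sum of evaluated values is a classic dynamic program; a minimal sketch, with `score(j, i)` standing in for the paper's linearly transformed feature evaluation (our placeholder, not the actual evaluator), could be:

```python
def best_segmentation(n, max_len, score):
    """best[i] = best total score for segmenting the first i units,
    where score(j, i) evaluates units j..i-1 as one character
    candidate and max_len bounds the candidate width."""
    NEG = float('-inf')
    best = [NEG] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            s = best[j] + score(j, i)
            if s > best[i]:
                best[i], back[i] = s, j
    # walk back-pointers to recover the chosen cut positions
    cuts, i = [], n
    while i > 0:
        cuts.append(i)
        i = back[i]
    return best[n], sorted(cuts)
```

Because the total is a sum of per-segment scores, the gradient of the segmentation objective with respect to the transformation parameters decomposes over segments, which is what makes the paper's steepest-gradient optimization tractable.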
https://doi.org/10.1142/9789812797650_0027
In this paper, a probabilistic model is proposed for precise candidate selection in the recognition of large character sets. The model exploits the redundancy of minimum-distance classification and selects a reduced set of candidate classes. First, the output class probabilities are estimated from rank-ordered distances by minimizing the squared error of the probability ratio. Then the classifier's behavior knowledge is incorporated by diagnostic inference to deduce the input class probabilities. Based on the inferred input probabilities, the candidate set is determined by thresholding the probability ratio. The efficiency of this method was demonstrated in coarse recognition on the ETL8B2 and ETL9B databases. Compared to minimum-distance classification with a fixed number of candidates, the proposed method selects fewer than half as many candidates while preserving precision.
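A loose sketch of ratio-threshold candidate selection follows, with a softmax over negative distances standing in for the paper's probability estimation (an assumption on our part, not the paper's estimator):

```python
import math

def select_candidates(distances, ratio_threshold=0.05):
    # Convert per-class distances to pseudo-probabilities via a softmax
    # on the negative distance (shifted by the minimum for numerical
    # stability), then keep every class whose probability ratio to the
    # top class is at least the threshold.
    m = min(distances.values())
    raw = {c: math.exp(-(d - m)) for c, d in distances.items()}
    z = sum(raw.values())
    probs = {c: p / z for c, p in raw.items()}
    top = max(probs.values())
    return sorted(c for c, p in probs.items() if p / top >= ratio_threshold)
```

In contrast to a fixed candidate count, the selected set shrinks automatically when one class clearly dominates and grows when several classes are nearly equidistant.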
https://doi.org/10.1142/9789812797650_0028
This paper presents a study of off-line handwritten Hangul (Korean) character recognition based on a modular neural network employing partial connections between the hidden nodes and one global and ten local receptive fields. A modular architecture called the modular partially connected neural network (MPCNN) is proposed. The MPCNN combines three partially connected neural networks (PCNNs), trained on three different feature sets, into a hierarchically organized MLP with their truncated subnetworks as basic building blocks. This subnetwork combination is achieved via internal hidden-layer coupling rather than the conventional output combination that has attracted considerable attention recently. The performance of the proposed classifier is verified on the recognition of 18 off-line handwritten Hangul characters widely used on business cards in Korea.
https://doi.org/10.1142/9789812797650_0029
In this paper, a new method for extracting document structure is presented. The method is especially suitable for extracting character lines from an unformatted document image. Because character-line extraction greatly influences all subsequent processes, extraction of correct lines is one of the most important issues in document image understanding. Maximum a posteriori probability estimation is used to solve this problem by modeling knowledge of the target documents and extracting the most suitable document structures for input images. With some assumptions, the model can be expressed very simply. Moreover, the model is obtained automatically by learning from sample target images. This enables the proposed system to handle diversely styled documents in detail without parameter tuning by a human operator.
https://doi.org/10.1142/9789812797650_0030
In this paper we propose a method of extracting strokes directly from handwritten Kanji characters, without thinning. The method extracts strokes by decomposing a character into region segments, classifying them into simple and complex regions, extending simple regions in their directions, and finally merging them into one stroke if certain conditions are satisfied. It first extracts vertical and horizontal strokes independently by two-way raster scans, and combines them at an early stage by referring to the mutual scanning results and the original pattern. Experiments using 200 handwritten Kanji characters, 'ki' and 'kuchi', from the ETL9B database show a successful stroke-extraction rate of more than 94%.
https://doi.org/10.1142/9789812797650_0031
The state of the art in character recognition has advanced from primitive schemes for the recognition of printed numerals to sophisticated techniques for the recognition of handprinted characters and symbols. However, handwritten Hangul character recognition remains a big challenge to researchers because of the wide variety of writing styles. Moreover, the shape of handwritten Hangul characters can vary to such proportions that the recognition rate greatly depends on character quality. Therefore, developing an automatic method for evaluating the quality of handwritten Hangul characters is very effective for improving the performance of handwritten Hangul recognition algorithms. In this paper, an automatic method for evaluating the quality of the handwritten Hangul database KU-1 is proposed. To verify its performance, an evaluation on the KU-1 database has been performed. The experimental results reveal that the proposed method is a valuable tool for evaluating the quality of handwritten Hangul characters objectively and automatically, in accordance with human criteria.
https://doi.org/10.1142/9789812797650_0032
Since a Korean character consists of nat-jas (graphemes), it seems natural to recognize its constituent nat-jas in order to recognize a handwritten Korean character. This paper proposes a new method which recognizes all the constituent nat-jas of a given character image. In this approach, we first decompose the image into small pieces, called segments, then reconstruct the nat-ja images by combining these segments and recognize them. In order to reduce the number of segment combinations to be examined and to perform the recognition process effectively, we build a constraint-satisfaction graph and traverse it efficiently by making use of human knowledge of Korean characters. An initial experimental result shows that the proposed approach has high potential to attack the problem of handwritten Korean character recognition.
https://doi.org/10.1142/9789812797650_0033
In order to apply an already-developed, high-performance on-line handwritten Hangul recognizer to off-line handwritten Hangul recognition, this paper presents a method to reconstruct the temporal information (stroke sequence) from an off-line handwritten Hangul character. We divide the problems which make temporal-information recovery difficult into four: 1) side effects of thinning, 2) touching between graphemes in a Hangul character, 3) stroke separation within a grapheme, and 4) stroke-sequence recovery. We then propose methods to solve these four problems by 1) minimizing thinning side effects through smoothing, 2) grapheme segmentation based on Hangul structural information, 3) stroke composition based on the overlap of connected components and vowel structural information, and 4) stroke-sequence recovery based on heuristic rules, respectively.
With the PE92 Hangul database, we perform off-line handwritten recognition experiments using an on-line handwritten recognizer, Bongnet. In the experiments, a stroke sequence is extracted from an off-line handwritten character by the proposed methods and then presented as input to the on-line recognizer. Experimental results show the effectiveness of the proposed methods.
https://doi.org/10.1142/9789812797650_0034
A method for the recognition of handwritten Korean characters, based on a hierarchical random graph representation, is proposed. The characteristic of a Korean character, being composed of two or more graphemes on the 2D plane, is naturally embedded in the proposed hierarchical graph. In the hierarchical graph, the bottom layer models various strokes, while the next two upper layers represent spatial and structural relationships between strokes and between graphemes, respectively. Model parameters of the hierarchical graph are estimated automatically from training data by the EM algorithm and an embedded training technique. Recognition experiments were conducted with unconstrained handwritten Hangul characters to show that the proposed method can absorb large shape variations with a small number of models.
https://doi.org/10.1142/9789812797650_0035
A holistic method for recognizing touching digits in numeric strings is presented. A method for weighting the contribution of features from the various zones of the touching-digit pattern is described. The method supports our intuitive knowledge that the central part of the pattern lacks information useful for classification. Recognition rates are in the mid-nineties for classes with a sufficient number of training samples. These rates are significantly higher than the accuracy obtained by segmentation-based methods on the same set of images.
https://doi.org/10.1142/9789812797650_0036
This paper adopts a new kind of neural network, the Quantum Neural Network (QNN), to recognize handwritten numerals. The QNN combines the advantages of neural modelling and fuzzy-theoretic principles. Novel experiments have been designed for in-depth study of applying the QNN to both synthesized confusing images and real data. Tests on synthesized data examine the QNN's fuzzy feature space, with the intention of illustrating its mechanism and characteristics, while studies on real data prove its great potential as a handwritten numeral classifier and the special role it plays in multi-expert systems. Detailed comparisons and analyses of the experimental results are given. An effective decision-fusion system is proposed, and a high reliability of 99.10% has been accomplished.
https://doi.org/10.1142/9789812797650_0037
In this paper it is shown how an existing polynomial classifier can be improved by iterative (reinforced) learning. The effects of this algorithm are evaluated in several experiments. Additionally, it is shown how the learning factor should be related to the length of the polynomial to obtain good convergence and a reduced error rate. Furthermore, for the first time a complete quadratic polynomial classifier over 256 features, resulting in a polynomial length of 33152, could be trained and evaluated with this algorithm.
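The polynomial length quoted above follows from counting the linear terms x_i plus the quadratic terms x_i·x_j (i ≤ j): 256 + 256·257/2 = 33152. A minimal sketch of such a classifier with a simple iterative least-mean-squares update follows; it is an illustrative stand-in, not the paper's exact reinforced-learning rule or learning-factor schedule.

```python
import numpy as np

def quadratic_expand(x):
    # Linear terms x_i plus quadratic terms x_i * x_j (i <= j); for 256
    # features this yields 256 + 256*257/2 = 33152 terms.
    terms = list(x)
    n = len(x)
    for i in range(n):
        for j in range(i, n):
            terms.append(x[i] * x[j])
    return np.array(terms)

class PolynomialClassifier:
    """Quadratic polynomial classifier trained by iterative LMS updates."""

    def __init__(self, n_features, n_classes, lr=0.1):
        dim = n_features + n_features * (n_features + 1) // 2
        self.W = np.zeros((dim, n_classes))
        self.lr = lr  # the "learning factor"; the paper ties it to the polynomial length

    def train_step(self, x, label):
        phi = quadratic_expand(x)
        target = np.zeros(self.W.shape[1])
        target[label] = 1.0
        # Iterative (reinforced) update toward the one-hot class target.
        self.W += self.lr * np.outer(phi, target - phi @ self.W)

    def predict(self, x):
        return int(np.argmax(quadratic_expand(x) @ self.W))
```

Repeated `train_step` calls over the training set drive the polynomial outputs toward the one-hot targets, which is the least-mean-squares view of polynomial classifier training.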
https://doi.org/10.1142/9789812797650_0038
Since the ten digits have very important applications in our daily lives, such as counting, financial accounting, banking, trading and business transactions, their analysis and recognition have been a very active topic of research in character recognition. In this paper, we apply new techniques based on crucial combinations to exploit the distinctiveness and similarities of handwritten numerals in four- and six-partitions. Confusion combinations are also identified to enhance pattern analysis. In-depth analyses and comparisons have been made using the most common models of numeric handprints. The results of this investigation open a new direction in finding and using the distinctive parts of characters for the recognition of handwritten numerals.
https://doi.org/10.1142/9789812797650_0039
This paper proposes adequate feature candidates and a multiple-network fusion strategy for the recognition of totally unconstrained handwritten numerals. The feature extraction methods consist of a zoning feature, a crossing-point feature and a directional feature. After extracting these feature sets, we use the fuzzy integral to overcome the long convergence time, high computational complexity and limited accuracy that arise when a single neural network architecture with a high input dimension is used. Using the Concordia database, we obtained a 97.85% recognition rate without rejection. The experimental results show that the proposed feature sets and classification methods have good capability for practical application.
https://doi.org/10.1142/9789812797650_0040
The major concern of this paper is to compare the performance of several statistical and neural network classifiers in both theoretical and practical respects. The practical discussion is based on the results of experiments run with the handwritten digit images from NIST Special Database 3. The statistical classifiers discussed in the paper can be divided into two types: parametric and non-parametric. The former includes an LDF, a QDF and an RDF, while the latter includes a k-NN classifier. We also adopt an MLP, one of the neural network approaches, for comparison with the statistical classifiers.
https://doi.org/10.1142/9789812797650_0041
This paper describes a new numeral string recognition method. We classify touchings between numeral characters into six types. These touching types are easily detected by comparing the length of a vertical black pixel run with that of the horizontally adjacent one, and they are obtained when touching characters are segmented into character candidate patterns. We propose to use the touching types to verify isolated character recognition. A verification unit consists of a pair of character codes and their touching type. Verification units which are impossible in actual character patterns are used to reject isolated character recognition results. We constructed a postal code recognition system, which demonstrated the performance of the proposed recognition method.
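As a rough illustration of the run-length comparison described above, the sketch below extracts vertical black runs from image columns and compares the longest run in horizontally adjacent columns; the comparison as a touching cue is a simplified, hypothetical reading of the paper's detection step, and the six touching types themselves are not modeled.

```python
def vertical_runs(column):
    """Return (start_row, length) for each black (1) run in a pixel column."""
    runs, start = [], None
    for r, px in enumerate(column):
        if px and start is None:
            start = r
        elif not px and start is not None:
            runs.append((start, r - start))
            start = None
    if start is not None:
        runs.append((start, len(column) - start))
    return runs

def run_length_ratio(img, col):
    """Longest vertical run in a column vs. its right-hand neighbor.

    A large jump in run length between adjacent columns is one plausible
    cue for a touching point between two digits (illustrative heuristic
    inspired by, but not identical to, the paper's detection)."""
    a = max((l for _, l in vertical_runs([row[col] for row in img])), default=0)
    b = max((l for _, l in vertical_runs([row[col + 1] for row in img])), default=0)
    return a, b
```

Here `img` is a binary image given as a list of pixel rows; comparing the returned pair against a threshold would flag candidate touching columns.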
https://doi.org/10.1142/9789812797650_0042
It is very important for a classifier to be able to distinguish legible digits from anti-digit patterns in order to deal with ambiguous character segmentation in connected handwritten digit recognition. We present in this paper a new template representation and extraction method to tackle the problem. Instead of using the shape of a digit, a template is represented by its distance distribution map, which is approximated by a rational B-spline surface with a set of control knots. A neural network approach is then applied to extract templates, with a transfer function that takes into account both the amplitude and the gradient direction of each point in the distribution map. With 1,000 templates extracted from 10,426 training samples in NIST Special Database 3, our approach can successfully reject 87.5% of anti-digit patterns while achieving a 95.8% correct classification rate without rejection on an independent test set of isolated digits. Our approach compares favorably with two other techniques tested in terms of the rejection rate on anti-digit patterns.
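A crude stand-in for a distance distribution map is a distance transform of the stroke pixels. The sketch below computes a chessboard-distance map by breadth-first search; the paper additionally approximates such a map with a rational B-spline surface, which is not reproduced here.

```python
from collections import deque

def distance_map(img):
    """Chessboard distance from each pixel to the nearest stroke (1) pixel.

    Multi-source BFS over the 8-neighborhood: stroke pixels get 0, and each
    ring of unvisited neighbors gets one more than its BFS parent.
    """
    h, w = len(img), len(img[0])
    UNSET = -1
    dist = [[UNSET] * w for _ in range(h)]
    q = deque()
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                dist[y][x] = 0
                q.append((y, x))
    while q:
        y, x = q.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] == UNSET:
                    dist[ny][nx] = dist[y][x] + 1
                    q.append((ny, nx))
    return dist
```

Using the 8-neighborhood makes the BFS levels coincide with chessboard (Chebyshev) distance, which keeps the example short; other metrics would need a different propagation.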
https://doi.org/10.1142/9789812797650_0043
Novel pattern recognition techniques using multiple agents for the recognition of handwritten text are proposed in this paper. The concept of intelligent agents and innovative multi-agent architectures for pattern recognition tasks is introduced for combining and elaborating the classification hypotheses of several classifiers. The architecture of a distributed digit-recognition system dispatching recognition tasks to a set of recognizers and combining their results is presented. This concept is being developed in the MAPR project, where intelligent agent architectures are built for pattern recognition tasks.
https://doi.org/10.1142/9789812797650_0044
A new approach for modeling the trajectory of the writing tool in the context of off-line character recognition is presented in this paper. Pieces of writing are first divided into segments which represent regular strokes interconnected with one another; at each interconnection a decision has to be taken in order to reconstruct the temporal sequence of segments. In practice, the problem is solved as a global search in a graph which models the entire piece of writing (a character or a word). Three different graph models are presented; the most accurate takes into account pen-down, pen-up and retracing strokes. These models are weighted by measures of different kinds extracted from the strokes, such as geometric and photometric information, since the strokes are extracted as a set of gray-level cross-sections. Results are evaluated on a database of isolated cursive handwritten characters; nearly 90% of the characters are correctly reconstructed.
https://doi.org/10.1142/9789812797650_0045
Contour detection of handwriting is a step-edge detection problem. A special class of wavelets, modular-angle-separated wavelets, is applied to detect step edges. The characteristics of the wavelet transform with respect to the step-edge function are studied in this paper. A significant property is proved: the wavelet transform of the step edge is a non-zero constant which is independent of both the gradient direction and the scale of the wavelet transform. Based on this property, an algorithm is derived that can handle a noisy, multi-edge image effectively.
https://doi.org/10.1142/9789812797650_0046
Computation of the k nearest neighbors of an unknown pattern generally requires a large number of expensive distance computations. The method nicknamed PURD presented in this paper is proposed to speed up this task. It is based on a study by Portegys and uses the concept of relative distance between two patterns. PURD consists of a reorganization of the database as a tree structure, followed by a tree search based on rules which allow portions of the tree to be cut off. The price to pay is an increase in the amount of memory needed to store the preprocessed data. This technique has been used in a handwriting recognition multi-agent system to classify digit samples derived from NIST data. Experimental results demonstrate the efficiency of the proposed algorithm: a speed-up of more than an order of magnitude has been obtained over classical k-NN for the same recognition rate. In comparison with other computational time optimization methods, PURD's performance is quite competitive.
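The pruning idea behind relative-distance methods can be illustrated with a single pivot: the triangle inequality gives d(q, p) ≥ |d(q, pivot) − d(p, pivot)|, so any candidate whose precomputed pivot distance is far from the query's can be skipped without computing d(q, p). The sketch below shows this single-pivot variant for 1-NN search; PURD's actual tree reorganization and cut-off rules are more elaborate.

```python
import math

class PivotNN:
    """1-NN search with triangle-inequality pruning via a single pivot.

    Preprocessing stores each point's distance to the pivot; the query
    pays one distance to the pivot, then skips any candidate p with
    |d(q, pivot) - d(p, pivot)| >= current best distance, since the
    triangle inequality guarantees d(q, p) can be no smaller.
    """

    def __init__(self, points):
        self.pivot = points[0]
        self.entries = sorted((math.dist(p, self.pivot), p) for p in points)

    def query(self, q):
        dq = math.dist(q, self.pivot)
        best, best_p = float("inf"), None
        for dp, p in self.entries:
            if abs(dq - dp) >= best:  # lower bound on d(q, p): prune
                continue
            d = math.dist(q, p)
            if d < best:
                best, best_p = d, p
        return best_p, best
```

Every pruned entry saves one full distance computation; with many pivots or a tree of relative distances, whole subtrees can be discarded at once, which is where the order-of-magnitude speed-up comes from.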
https://doi.org/10.1142/9789812797650_0047
This paper presents a novel data structure for representing large lexicons that allows fast searches. It is based on the concept of directly addressing a table (the Existence Table) in which there is a 1-bit slot for each word in the lexicon. To obtain a small table, successive reductions of the number of bits used to represent each word are performed using look-up tables (Translation Tables). The data structure is very flexible and can be used not only for English lexicons, but also for those with large data sets, such as Japanese or Chinese.
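A minimal sketch of the Existence Table idea: one bit per reduced word code, set at insertion time and probed at lookup time. Here a simple rolling hash stands in for the paper's cascaded Translation Tables, so, as with any reduction to a smaller code space, rare false positives are possible; the table size and hash are illustrative choices, not the paper's.

```python
class ExistenceTable:
    """Bit-vector lexicon with direct addressing by a reduced word code."""

    def __init__(self, n_bits=1 << 20):
        self.n_bits = n_bits
        self.bits = bytearray(n_bits // 8)  # one bit per possible code

    def _index(self, word):
        # Stand-in for the cascaded Translation Tables: reduce the word
        # to a code in [0, n_bits) with a rolling hash.
        h = 0
        for ch in word:
            h = (h * 31 + ord(ch)) % self.n_bits
        return h

    def add(self, word):
        i = self._index(word)
        self.bits[i // 8] |= 1 << (i % 8)

    def __contains__(self, word):
        i = self._index(word)
        return bool(self.bits[i // 8] >> (i % 8) & 1)
```

Lookup touches a single bit, so membership testing is constant-time regardless of lexicon size, which is what makes the structure attractive for large lexicons.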
https://doi.org/10.1142/9789812797650_0048
Results of a comparison of adaptive recognition techniques for on-line recognition of handwritten Latin alphabets are presented. The emphasis is on five adaptive classification strategies described in this paper. The strategies are based on first generating a user-independent set of prototype characters and then modifying this set to adapt it to each user's personal writing style. The initial set is formed by a simple clustering algorithm. The modification of the prototype set is performed using three modes of operation: 1) new prototypes are added, 2) existing prototypes are reshaped to better match the input, and 3) prototypes which produce false classifications are removed. The classification decision uses the k-Nearest Neighbor (k-NN) rule on the distances between the unknown character and the stored prototypes. The distances are calculated by template matching with Dynamic Time Warping (DTW). The reshaping of the existing prototypes is performed using a modified version of the Learning Vector Quantization (LVQ) algorithm. The presented experiments show that the recognition system is able to adapt well to the user's writing style with only a few – say one hundred – handwritten characters.
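The DTW distance at the core of the template matching can be sketched as the classic dynamic program over two point sequences; this is the generic textbook formulation with a Manhattan point cost, not necessarily the authors' exact variant.

```python
def dtw(a, b):
    """Dynamic Time Warping distance between two (x, y) point sequences.

    D[i][j] holds the cost of the best warping path aligning a[:i] with
    b[:j]; each cell extends the cheaper of a match, an insertion, or a
    deletion with the local point-to-point cost.
    """
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1][0] - b[j - 1][0]) + abs(a[i - 1][1] - b[j - 1][1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

In the setting above, an unknown character would be classified by the k-NN rule over DTW distances to the stored prototypes, with an LVQ-style update reshaping the winning prototypes.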
https://doi.org/10.1142/9789812797650_0049
In character recognition, using multiple dictionaries for one category is effective, but it raises problems of dictionary size and processing time. We propose a method for integrating the plural dictionaries into a single dictionary. The recognition system uses relaxation matching, with features extracted by polygonal approximation; however, these features are unstable under deformation of handprinted characters, which makes an integrated dictionary important. The system was tested on the common database ETL9, achieving a recognition rate of 96.07% on unknown data.
https://doi.org/10.1142/9789812797650_0050
This paper presents a 2-D Overlap-Save Method for the contour extraction of handwriting by applying the spline wavelet transform. To speed up the computation, the character is divided into many small separate sections in an overlap-save manner. A dyadic wavelet is first applied to the contour extraction of each section. The Fast Number Theoretic Transform (FNTT) is then utilized as an effective mathematical tool. A computational example is presented, as well as a practical experiment extracting the contours of Chinese handwriting. The positive results show the effectiveness of the 2-D Overlap-Save Method in contour extraction.
https://doi.org/10.1142/9789812797650_0051
In this paper, we present a novel combination of shape matrices and Hidden Markov Models (HMMs) for the rotation-, translation- and scale-invariant recognition of hand-drawn pictograms. The feature extraction is based on a polar subsampling technique which takes into consideration not only a shape's outer geometry, but its inner geometry as well. This shape description technique is also known as a shape matrix. Within the HMM framework the features are used to classify the pictogram and to estimate the rotation angle of the pattern, using the combined segmentation and classification abilities of the Markov models. Three variations of the classifier design are presented, giving the option to choose between recognition with preferred rotation angles and fully rotation-invariant recognition. The proposed techniques show high recognition accuracies of up to 99.5% on two large pictogram databases consisting of 20 classes, where significant shape variations occur within each class due to differences in how each element is drawn. For a detailed evaluation of our methods, experimental results for conventional approaches utilizing moments and neural networks are given for comparison. The techniques can easily be adapted to handle gray-scale or color images, and we demonstrate this by showing some results of our experimental image-retrieval-by-user-sketch system, which also serves as an example of future applications.
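The shape-matrix feature can be sketched as polar subsampling around the shape's centroid: rows index radius, columns index angle, and a rotation of the pattern appears (approximately) as a circular shift of the columns, which is what the HMM can absorb. The sketch below is an illustrative simplification that assumes a non-empty binary image; the grid resolution is an arbitrary choice.

```python
import math

def shape_matrix(img, n_angles=8, n_radii=4):
    """Polar subsampling of a binary image into an n_radii x n_angles matrix.

    Sampling rays emanate from the foreground centroid; radii reach out to
    the most distant foreground pixel, so the descriptor is scale- and
    translation-normalized, and rotation becomes a column shift.
    """
    pts = [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v]
    cy = sum(p[0] for p in pts) / len(pts)
    cx = sum(p[1] for p in pts) / len(pts)
    rmax = max(math.hypot(p[0] - cy, p[1] - cx) for p in pts) or 1.0
    M = [[0] * n_angles for _ in range(n_radii)]
    for k in range(n_radii):
        rad = rmax * (k + 0.5) / n_radii  # sample mid-ring radii
        for a in range(n_angles):
            th = 2 * math.pi * a / n_angles
            y = int(round(cy + rad * math.sin(th)))
            x = int(round(cx + rad * math.cos(th)))
            if 0 <= y < len(img) and 0 <= x < len(img[0]) and img[y][x]:
                M[k][a] = 1
    return M
```

Because the matrix covers interior sample points as well as the boundary, it captures the inner geometry of the shape, not just its outline.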
https://doi.org/10.1142/9789812797650_0052
In this paper, a new approach for multiple classifier integration based on relevant classifiers is introduced. The design of the approach takes into account two facts. One is that there are intra-disparities (peculiar features) within different pattern modes of one class, which are the main reason for classification errors. The other is that there are always intra-similarities (universal features) within patterns of the same class, which are the underlying basis on which different classes can be distinguished. These two facts lead to the interrelation of classifiers based on different feature sets in a multiple classifier integration system. In our approach, the interrelations of classifiers are discovered by the integrating unit through learning from samples and grouping features appropriately, so that contradictory decisions can be analysed and classifiers invoked to apply features dynamically in order to reach a consistent decision. This provides a flexible new framework for the classification and verification of distorted, multi-mode complex patterns, such as handwritten signatures.
https://doi.org/10.1142/9789812797650_0053
This paper presents an approach for the recognition of on-line handwritten mathematical expressions. Using the simultaneous segmentation and recognition capabilities of Hidden Markov Models (HMMs), it is possible to avoid the complex and crucial handwriting segmentation step during pre-processing. The segmentation and recognition result is used in a further step for the interpretation of the symbols and their spatial relationships. Ambiguities can be resolved by taking into account some constraints on the handwriting production process. For visualization purposes, the results are transformed into TeX syntax. Some recognition results, obtained with a writer-dependent system, are presented in Sect. 5.
https://doi.org/10.1142/9789812797650_0054
This paper describes a fully operational on-line signature verification system. From a hardware point of view, its heart is the SMARTpen™, a special input device allowing the recording of force and angle signals. The most important software aspect that we focus on here is the exploitation of the Baum-Welch procedure in the feature extraction process. This algorithm provides a mathematical basis for classifying a signature while taking into account the relative importance of both the different signals under observation and the distinct phenomena present in them. The usefulness of the approach is illustrated by presenting the results of a full-scale field test.
https://doi.org/10.1142/9789812797650_0055
This paper reports a function-based on-line signature recognition system. The system is translation, rotation and size invariant, and it does not require any special writing pen; the signature is obtained through a normal writing pad. The orientation and scale of the input signature are first estimated, and then the signature is normalized. To extract the features of a signature, a number of timing functions have been evaluated, and four of them are selected for signature representation. Regression analysis is adopted to perform the distance measurement. The output of the system is a score ranging from zero to one indicating the similarity between the unknown signature and a signature in the reference database. A database with eight hundred signatures is used to evaluate the proposed system. The database consists of signatures obtained from one hundred persons, with each person giving five genuine signatures and three forgeries. The accuracy of the proposed system is 98%, while both type I and type II errors are less than 2%. The system is developed on a Pentium II 300 MHz personal computer. The recognition time is less than 1 second, while the time for verifying one signature is less than 0.5 second.
https://doi.org/10.1142/9789812797650_0056
This paper discusses signature verification using the distribution of the angular direction of pen-point movement in the signing process. First, the locus of pen-point movement is approximated by a finite number of line segments of constant length. The angular direction sequence of the line segments is treated as a time series. Then the distribution of the angular direction sequence in each segmented time interval is computed. With the computed distributions of the angular direction sequence, signature verification experiments were conducted on a database of four hundred KANJI signatures. In our experiments, type I (false rejection) error rates were 2.5 – 7.5%, while type II (false acceptance) error rates were 0 – 15%.
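The two processing steps, approximating the locus by constant-length segments and computing the distribution of segment directions, can be sketched as follows; the segment length and bin count are illustrative choices, not the paper's.

```python
import math

def resample(points, seg_len):
    """Approximate a pen locus by line segments of (at most) constant length.

    Walks the polyline, emitting a new point every seg_len of arc length;
    the residual shorter piece at the end of each stroke is dropped.
    """
    out = [points[0]]
    acc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d = math.hypot(x1 - x0, y1 - y0)
        while acc + d >= seg_len:
            t = (seg_len - acc) / d
            x0, y0 = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            d = math.hypot(x1 - x0, y1 - y0)
            out.append((x0, y0))
            acc = 0.0
        acc += d
    return out

def direction_histogram(points, n_bins=8):
    """Normalized distribution of segment directions over n_bins angle bins."""
    hist = [0] * n_bins
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        ang = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        hist[int(ang / (2 * math.pi) * n_bins) % n_bins] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]
```

Computing such a histogram per time interval and comparing it against the reference signature's histograms is the basic shape of the verification scheme described above.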
https://doi.org/10.1142/9789812797650_0057
This paper describes our research on enhancing handwriting-based user interfaces. It consists of building infrastructure, advancing handwriting recognition technology, studying human interfaces and developing applications. Our goal is to provide consistent and creative human interfaces across a wide range of pen input devices such as PDAs, desktop tablets and electronic whiteboards. On-line handwritten character recognition is the key technology. Our method has achieved 90 to 95% correct recognition rates, without learning, on a large database of on-line handwritten Japanese text. Its recognition speed is about 0.02 sec/character on a 200 MHz Pentium processor and roughly 0.2 sec/character on a small PDA machine. The method is not only robust to stroke connections and pattern distortions but also highly customizable for personal use. Upon a request to learn an input pattern, it identifies a deformed subpattern (radical), registers the (sub)pattern and extends the effect to all the character categories whose shapes include it. This paper also describes educational applications which benefit from pen interfaces and the handwriting recognition engine.