Document Analysis

MULTIMODAL COMPUTER-ASSISTED TRANSCRIPTION OF TEXT IMAGES AT CHARACTER-LEVEL INTERACTION

DANIEL MARTÍN-ALBO

Instituto Tecnológico de Informática, Universitat Politècnica de València, Camino de Vera s/n, 46071 Valencia, Spain

VERÓNICA ROMERO

Instituto Tecnológico de Informática, Universitat Politècnica de València, Camino de Vera s/n, 46071 Valencia, Spain

ALEJANDRO H. TOSELLI

Instituto Tecnológico de Informática, Universitat Politècnica de València, Camino de Vera s/n, 46071 Valencia, Spain

ENRIQUE VIDAL

Instituto Tecnológico de Informática, Universitat Politècnica de València, Camino de Vera s/n, 46071 Valencia, Spain

https://doi.org/10.1142/S0218001412630037

Abstract

Current automatic handwriting recognition systems perform poorly on unconstrained handwritten documents. Obtaining perfect transcriptions therefore requires heavy human intervention to validate and correct the system output. Because this post-editing process is inefficient and uncomfortable, a multimodal interactive approach has been proposed in previous works, which aims at obtaining correct transcriptions with minimum human effort. In this approach, the user interacts with the system by means of an e-pen and/or more traditional devices such as a keyboard or mouse. The user's feedback improves system accuracy, while multimodality improves system ergonomics and user acceptability. Until now, multimodal interaction has been considered only at the whole-word level. In this work, multimodal interaction at the character level is studied, which may lead to more effective interactivity, since writing a single character is faster and easier than writing a whole word. We present developments that take advantage of interaction-derived context to significantly improve feedback decoding accuracy. Empirical tests on three cursive handwriting tasks suggest that, despite losing the deterministic accuracy of traditional peripherals, this approach can save significant amounts of user effort with respect to both fully manual transcription and noninteractive post-editing correction.
