Research interest in Chinese character recognition has been intense in Taiwan in recent years, owing partly to cultural considerations and partly to advances in computer hardware. This chapter addresses coarse character classification, candidate selection, statistical character recognition, recognition based on structural character primitives such as line segments, strokes and radicals, as well as postprocessing and model development.
Coarse character classification and candidate selection are used to reduce matching complexity; statistical methods of character recognition are shown to be effective, and feature matching with good performance is reported; and structure-based methods, which are able to distinguish between similar characters, are investigated thoroughly. Since no temporal information is available for off-line recognition systems, the character test base is still limited. Methods used to extract structural primitives are also investigated.
Language models based on syntactic or semantic considerations are used to select the most probable characters from sets of candidates, and are applied in the postprocessing of input sentence images. These models generally employ dynamic programming methods. To increase identification capacity, various ways of grouping Chinese words into a reasonable number of classes are also proposed.
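As an illustration of how dynamic programming can select characters from candidate sets, the following is a minimal sketch and not the method of any particular system surveyed here. It assumes a hypothetical bigram language model supplied as a dictionary of log-probabilities, and a recognizer that returns, for each character position, a list of distinct candidates with log-scores. The function name `select_characters`, the parameter `default_logp` and the placeholder symbols are all assumptions made for this example.

```python
import math


def select_characters(candidates, bigram_logp, default_logp=-10.0):
    """Pick the best character sequence from per-position candidate lists.

    candidates:   list of positions; each position is a list of
                  (character, recognizer log-score) pairs (assumed format).
    bigram_logp:  dict mapping (previous_char, char) to a log-probability
                  (a hypothetical bigram language model).
    default_logp: log-probability used for unseen bigrams (an assumption).
    """
    if not candidates:
        return []

    # best[i][c] = (score of the best path ending in c at position i,
    #               predecessor character on that path)
    best = [{c: (score, None) for c, score in candidates[0]}]

    for position in candidates[1:]:
        layer = {}
        for c, score in position:
            # Choose the predecessor that maximises path score + transition.
            prev_char, prev_total = max(
                (
                    (p, p_score + bigram_logp.get((p, c), default_logp))
                    for p, (p_score, _) in best[-1].items()
                ),
                key=lambda item: item[1],
            )
            layer[c] = (prev_total + score, prev_char)
        best.append(layer)

    # Backtrack from the highest-scoring final character.
    current = max(best[-1], key=lambda k: best[-1][k][0])
    path = [current]
    for i in range(len(best) - 1, 0, -1):
        current = best[i][current][1]
        path.append(current)
    path.reverse()
    return path


# Placeholder symbols stand in for candidate characters.
example_candidates = [
    [("A", math.log(0.6)), ("B", math.log(0.4))],
    [("X", math.log(0.5)), ("Y", math.log(0.5))],
]
example_bigrams = {
    ("A", "X"): math.log(0.1),
    ("B", "X"): math.log(0.9),
    ("A", "Y"): math.log(0.2),
    ("B", "Y"): math.log(0.1),
}
print(select_characters(example_candidates, example_bigrams))
```

In this toy input the recognizer alone would prefer "A" at the first position, but the language model makes the sequence "B", "X" the most probable overall, which is the kind of correction postprocessing is meant to provide.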