A NEURAL NETWORK-BASED COLOR DOCUMENT SEGMENTATION APPROACH
Document segmentation is a necessary pre-processing stage for most document processing systems. In a color document, text, pictures and graphs can appear in millions of different colors. This paper proposes a new color document segmentation approach based on neural networks and wavelets. The approach provides both a lower classification error rate and a better visual result when segmenting a complex document with a color background and embedded objects. Firstly, an adaptive color reduction technique is used to obtain the optimal number of colors and to convert the document into these principal colors. Secondly, a modified context-based multiscale segmentation algorithm is designed to segment each new binary document, based on classification features extracted from wavelet-transformed images. Thirdly, a merging procedure segments the document page into three classes: text, graph and picture regions. Finally, the paper presents experimental results and conclusions.
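As a rough illustration of what "classification features extracted from wavelet-transformed images" could look like, the sketch below computes a one-level 2D Haar decomposition of a small binary image and uses the mean energy of the three detail subbands as a feature vector. The choice of the Haar wavelet, the energy features, and all function names (`haar_2d`, `subband_energy`, `wavelet_features`) are assumptions made for illustration; the abstract does not specify the wavelet or the features actually used.

```python
# Illustrative sketch only. The Haar wavelet and the subband-energy features
# are assumptions; the paper's actual wavelet and features are not stated here.

def haar_2d(image):
    """One-level 2D Haar transform of a 2D list with even height and width.

    Returns four subbands, each half the size of the input:
    approx (local average), top_bottom (difference between the two rows of
    each 2x2 block), left_right (difference between the two columns), and
    diagonal (difference between the two diagonals).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    approx, top_bottom, left_right, diagonal = [], [], [], []
    for r in range(0, rows, 2):
        a_row, tb_row, lr_row, dg_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 4.0)
            tb_row.append((a + b - d - e) / 4.0)
            lr_row.append((a - b + d - e) / 4.0)
            dg_row.append((a - b - d + e) / 4.0)
        approx.append(a_row)
        top_bottom.append(tb_row)
        left_right.append(lr_row)
        diagonal.append(dg_row)
    return approx, top_bottom, left_right, diagonal


def subband_energy(band):
    """Mean squared coefficient of a subband."""
    count = len(band) * len(band[0])
    return sum(v * v for row in band for v in row) / count


def wavelet_features(binary_image):
    """Feature vector: energies of the three detail subbands (hypothetical choice)."""
    _, top_bottom, left_right, diagonal = haar_2d(binary_image)
    return [
        subband_energy(top_bottom),
        subband_energy(left_right),
        subband_energy(diagonal),
    ]


if __name__ == "__main__":
    flat = [[1, 1, 1, 1] for _ in range(4)]      # uniform region
    stripes = [[1, 0, 1, 0] for _ in range(4)]   # fine vertical strokes
    print(wavelet_features(flat))
    print(wavelet_features(stripes))
```

In this toy setting, a uniform region yields zero detail energy while fine vertical strokes concentrate energy in the left-right subband, which hints at why such features might help separate text-like regions from smooth picture or background regions. A classifier such as a neural network would then operate on feature vectors of this kind.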