A NEW KERNEL-BASED FORMALIZATION OF MINIMUM ERROR PATTERN RECOGNITION

DOI: https://doi.org/10.1142/9789812775320_0003

Abstract:

Previous expositions of the Minimum Classification Error (MCE) framework for discriminative training of pattern recognition systems describe the use of a smoothed version of the error count as the criterion function for classifier design, but do not specify the origin and nature of the smoothing used. In this chapter we show that the same optimization criterion can be derived from the classic Parzen window approach to smoothing in the context of non-parametric density estimation. The density estimated is not that of the category pattern distributions, as in conventional non-discriminative methods such as maximum likelihood estimation, but rather that of a transformational variable comparing the correct and best incorrect categories. The density estimate can easily be integrated over the domain corresponding to classification mistakes, yielding a cost function that is closely related to the original MCE cost function. The risk estimate formulation presented here provides a new link, via Parzen estimation, between the empirical cost function calculated on a finite training set and the true theoretical classification risk. The classic Parzen practice of tying the kernel width to the amount of training data can likewise be carried over to discriminative training, expressing the intuitive idea of using a smaller margin as the amount of training data increases.
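
To make the stated connection concrete, the following sketch (an illustration under assumed choices, not the chapter's exact formulation) takes per-sample values of a misclassification measure d, where d > 0 means the sample is misclassified, forms a Parzen density estimate of d with a logistic kernel of width h, and integrates it over the error region d > 0. Numerically, the result coincides with the mean sigmoid loss 1/(1 + exp(-d/h)) familiar from MCE training; the variable names, the logistic kernel, and the synthetic d values are assumptions made for the example.

```python
import numpy as np

# Hypothetical misclassification-measure values d over a training set:
# d = (best incorrect category score) - (correct category score),
# so d > 0 corresponds to a classification mistake.
rng = np.random.default_rng(0)
d = rng.normal(loc=-0.5, scale=1.0, size=1000)

h = 0.3  # Parzen kernel width; smaller h means a sharper error count


def logistic_pdf(u):
    # Logistic kernel; its CDF is the sigmoid used as the MCE loss.
    # The |u| form exploits symmetry and avoids overflow for large |u|.
    return np.exp(-np.abs(u)) / (1.0 + np.exp(-np.abs(u))) ** 2


def parzen_error_estimate(d, h):
    # Parzen estimate of the density of d, integrated over the error
    # region d > 0 with a simple Riemann sum on a fine grid.
    grid = np.linspace(-10.0, 10.0, 4001)
    step = grid[1] - grid[0]
    dens = np.mean(logistic_pdf((grid[None, :] - d[:, None]) / h) / h, axis=0)
    return np.sum(dens[grid >= 0.0]) * step


def mean_sigmoid_loss(d, h):
    # Smoothed empirical error count of MCE training, with slope 1/h.
    return np.mean(1.0 / (1.0 + np.exp(-d / h)))


# The two quantities agree up to quadrature error, illustrating that
# integrating the Parzen density over the mistake region recovers the
# sigmoid-smoothed empirical error rate.
print(parzen_error_estimate(d, h))
print(mean_sigmoid_loss(d, h))
```

The agreement follows from a one-line change of variables: the integral of each kernel over d > 0 is the kernel's CDF evaluated at d_i/h, which for a logistic kernel is exactly the sigmoid, so the Parzen risk estimate is the average sigmoid loss over the training set.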