BIAS-VARIANCE CONTROL VIA HARD POINTS SHAVING
Abstract
In this paper, we propose a regularization technique for AdaBoost. The method implements a bias-variance control strategy to avoid overfitting in classification tasks on noisy data. It is based on a notion of easy and hard training patterns that emerges from an analysis of the dynamical evolution of the AdaBoost weights. The procedure consists of sorting the training points by a hardness measure and progressively eliminating the hardest, stopping at an automatically selected threshold. The effectiveness of the method is tested and discussed on both synthetic and real data.
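
To make the shaving idea concrete, the following is a minimal Python sketch of the general scheme, not the paper's exact procedure: it tracks the per-sample AdaBoost weight distribution over rounds, uses the mean weight of each point as an illustrative hardness proxy, and removes a fixed fraction of the hardest points (the paper instead selects the cut-off automatically). All function names and the choice of hardness measure here are assumptions made for illustration.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_weight_traces(X, y, n_rounds=50):
        """Run discrete AdaBoost with decision stumps and record the
        per-sample weight distribution at every round."""
        n = len(y)
        w = np.full(n, 1.0 / n)
        traces = np.empty((n_rounds, n))
        for t in range(n_rounds):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            miss = stump.predict(X) != y
            err = np.clip(np.dot(w, miss), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - err) / err)
            # Up-weight misclassified points, down-weight correct ones, renormalize.
            w = w * np.exp(alpha * np.where(miss, 1.0, -1.0))
            w /= w.sum()
            traces[t] = w
        return traces

    def shave_hard_points(X, y, frac_removed=0.1, n_rounds=50):
        """Illustrative 'hard point shaving': score each training point by its
        mean AdaBoost weight over rounds (a proxy hardness measure) and drop
        the hardest fraction; a fixed fraction stands in for the paper's
        automatic threshold selection."""
        X, y = np.asarray(X), np.asarray(y)
        hardness = adaboost_weight_traces(X, y, n_rounds).mean(axis=0)
        order = np.argsort(hardness)  # easiest points first
        keep = order[: int(len(y) * (1 - frac_removed))]
        return X[keep], y[keep]

After shaving, AdaBoost (or any other classifier) would be retrained on the reduced set (X_keep, y_keep); the intent, as in the abstract, is that removing the hardest, typically noisy, points reduces variance without a large increase in bias.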