Automatic Video Event Detection for Imbalance Data Using Enhanced Ensemble Deep Learning
Abstract
With the explosion of multimedia data, semantic event detection from videos has become an important and challenging topic. When the data distribution is skewed, detecting events of interest must also address the data imbalance problem. The recent proliferation of deep learning has made it an essential part of many Artificial Intelligence (AI) systems. To date, various deep learning architectures have been proposed for applications such as Natural Language Processing (NLP) and image processing. Nonetheless, it remains impractical for a single model to work well across different applications. Hence, this paper proposes a new ensemble deep learning framework that can be applied to various scenarios and datasets. The proposed framework handles the over-fitting issue as well as the information loss caused by relying on a single model. Moreover, it alleviates the imbalanced data problem in real-world multimedia data. The framework consists of a suite of deep learning feature extractors integrated with an enhanced ensemble algorithm based on performance metrics for imbalanced data. A Support Vector Machine (SVM) classifier is used as the last layer of each deep learning component and also as the weak learner in the ensemble module. The framework is evaluated on two large-scale, imbalanced video datasets (namely, disaster and TRECVID). Extensive experimental results illustrate the effectiveness of the proposed framework and demonstrate that it outperforms several well-known deep learning methods, as well as conventional features integrated with different classifiers.
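
As an illustration only, the idea of combining weak learners according to a performance metric suited to imbalanced data can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's algorithm: it assumes binary labels (1 for the rare event class), uses the F1 score as the metric, and takes hard predictions as stand-ins for the outputs of the per-extractor SVMs. The function names `f1_score`, `ensemble_weights`, and `weighted_vote`, and all example data, are hypothetical.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 score of the positive (minority) class; 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ensemble_weights(y_val: Sequence[int], val_preds: List[Sequence[int]]) -> List[float]:
    """Assumed weighting rule: each learner's weight is its normalized validation F1."""
    scores = [f1_score(y_val, preds) for preds in val_preds]
    total = sum(scores)
    if total == 0:
        # No learner detects the minority class: fall back to uniform weights.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def weighted_vote(test_preds: List[Sequence[int]], weights: Sequence[float],
                  threshold: float = 0.5) -> List[int]:
    """Label a sample positive when the weighted share of positive votes reaches the threshold."""
    n_samples = len(test_preds[0])
    return [
        1 if sum(w * preds[i] for w, preds in zip(weights, test_preds)) >= threshold else 0
        for i in range(n_samples)
    ]


# Hypothetical validation labels: 2 positives out of 8 samples (imbalanced).
y_val = [1, 0, 0, 0, 0, 0, 1, 0]
val_preds = [
    [1, 0, 0, 0, 0, 0, 1, 0],  # learner A: finds both rare events
    [0, 0, 0, 0, 0, 0, 0, 0],  # learner B: always predicts the majority class
    [1, 1, 0, 0, 0, 0, 0, 0],  # learner C: one hit, one false alarm, one miss
]
weights = ensemble_weights(y_val, val_preds)

# Hypothetical test-time predictions from the same three learners.
test_preds = [[1, 0, 0], [0, 0, 0], [0, 1, 0]]
final = weighted_vote(test_preds, weights)
```

In this toy setting, learner B reaches 75% accuracy by always predicting the majority class, yet it receives zero weight because its minority-class F1 is zero. This is the motivation for weighting by an imbalance-aware metric rather than by accuracy.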