World Scientific

Self-Labeling Learning Ensemble via Deep Recurrent Neural Network and Self-Representation for Speech Emotion Recognition

https://doi.org/10.1142/S0218001424520177

Speech emotion recognition (SER) methods rely on frames to analyze speech data. However, existing methods typically divide a speech sample into smaller frames and label all of them with a single emotion tag, which ignores the possibility that multiple emotion tags coexist within one speech sample. To address this limitation, we present a novel approach, self-labeling learning ensemble via DRNN and self-representation (En-DRNN-SR), for SER. The method automatically segments a speech sample into speech frames; a deep recurrent neural network (DRNN) is then applied to learn deep features from them; next, a self-representation model is built to obtain a relational degree matrix; finally, this matrix is used to divide the frames into three parts: key emotional frames, compatible emotional frames, and noise frames. The emotion tags of the compatible emotional frames are learned adaptively and cyclically from the key emotional frames via the relational degree matrix, while the emotion tags associated with the key frames are also checked. In addition, we introduce a new self-labeling criterion based on fuzzy membership degree for SER. To evaluate the feasibility and effectiveness of the proposed En-DRNN-SR, we conducted extensive experiments on the IEMOCAP, EMODB, and SAVEE databases, on which it achieves 69.13%, 82.83%, and 52.31%, respectively, outperforming all competing algorithms. The experimental results demonstrate that the proposed approach surpasses state-of-the-art SER methods in both feature learning and classification.
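The self-representation and frame-partition steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a ridge-regularized self-representation objective min_C ||X − CX||² + λ||C||² over per-frame feature vectors X (stand-ins for DRNN outputs), and it assumes that key, compatible, and noise frames are separated by quantile thresholds on the relational-degree scores; the function names and thresholds are hypothetical.

```python
import numpy as np

def self_representation(X, lam=0.1):
    """Relational degree matrix C via ridge-regularized self-representation.

    X: (n_frames, d) array of per-frame deep features (stand-ins for DRNN
    outputs). Closed-form solution of min_C ||X - C X||^2 + lam ||C||^2:
    C = G (G + lam I)^{-1}, where G = X X^T (G and G + lam I commute).
    """
    n = X.shape[0]
    G = X @ X.T  # (n, n) Gram matrix of frame features
    return np.linalg.solve(G + lam * np.eye(n), G)

def partition_frames(C, key_q=0.75, noise_q=0.25):
    """Split frames into key / compatible / noise by relational degree.

    Each frame is scored by how strongly the other frames rely on it for
    reconstruction (column sums of |C|); the quantile thresholds here are
    illustrative assumptions, not values from the paper.
    """
    score = np.abs(C).sum(axis=0)
    hi, lo = np.quantile(score, key_q), np.quantile(score, noise_q)
    key = np.where(score >= hi)[0]    # key emotional frames
    noise = np.where(score <= lo)[0]  # noise frames
    compat = np.setdiff1d(np.arange(len(score)), np.union1d(key, noise))
    return key, compat, noise
```

In the full method, the emotion tags of the compatible frames would then be propagated cyclically from the key frames using the entries of C; the sketch stops at the partition step.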