World Scientific
Skip main navigation

Cookies Notification

We use cookies on this site to enhance your user experience. By continuing to browse the site, you consent to the use of our cookies. Learn More
×

System Upgrade on Tue, May 28th, 2024 at 2am (EDT)

Existing users will be able to log into the site and access content. However, E-commerce and registration of new users may not be available for up to 12 hours.
For online purchase, please visit us again. Contact us at customercare@wspc.com for any enquiries.

Wavelet-Based Weighted Low-Rank Sparse Decomposition Model for Speech Enhancement Using Gammatone Filter Bank Under Low SNR Conditions

    https://doi.org/10.1142/S0219477523500207Cited by:0 (Source: Crossref)

    Estimating noise-related parameters in unsupervised speech enhancement (SE) techniques is challenging in low SNR and non-stationary noise environments. In the recent SE approaches, the best results are achieved by partitioning noisy speech spectrograms into low-rank noise and sparse speech parts. However, a few limitations reduce the performance of these SE methods due to the use of overlap and add in STFT process, noisy phase, due to inaccurate estimation of low rank in nuclear norm minimization and Euclidian distance measure in the cost function. These aspects can cause a loss of information in the reconstructed signal when compared to clean speech. To solve this, we propose a novel wavelet-based weighted low-rank sparse decomposition model for enhancing speech by incorporating a gammatone filter bank and Kullback–Leibler divergence. The proposed framework differs from other strategies in which the SE is carried entirely in time domain without the need for noise estimation. Further, to reduce the word error rate, these algorithms were trained and tested on a typical automatic speech recognition module. The experimental findings indicate that the proposed cascaded model has shown significant improvement under low SNR conditions over individual and traditional methods with regard to SDR, PESQ, STOI, SIG, BAK and OVL.

    Communicated by Andras Der