Optimized wavelet-domain filtering under noisy and reverberant conditions

  • Randy Gomez (a1), Tatsuya Kawahara (a2) and Kazuhiro Nakadai (a1)

Abstract

This paper addresses robust wavelet-based speech enhancement for automatic speech recognition in reverberant and noisy conditions. We propose a novel scheme for improving the speech, late reflection, and noise power estimates obtained from the observed contaminated signal. The improved estimates are used to calculate the Wiener gain for filtering out the late reflections and additive noise. In the proposed scheme, the wavelet family and its parameters are optimized using an acoustic model (AM). In the offline mode, the optimal wavelet family is selected separately for the speech, late reflections, and background noise based on the AM likelihood. The parameters of the selected wavelet family are then optimized specifically for each signal subspace. As a result, we obtain wavelets that are each sensitive to the speech, the late reflections, or the additive noise, and that can independently and accurately estimate these signals directly from an observed contaminated signal. For speech recognition, the most suitable wavelet is identified from the pre-stored wavelets, and wavelet-domain filtering is applied to the noisy and reverberant speech signal. Experimental evaluations using real reverberant data demonstrate the effectiveness and robustness of the proposed method.
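To make the filtering step concrete, the following is a minimal sketch of wavelet-domain Wiener filtering and likelihood-based wavelet selection. It is an illustration under stated assumptions, not the authors' implementation: the gain form (speech power divided by the sum of speech, late reflection, and noise power) is the textbook Wiener gain, a one-level Haar transform stands in for the optimized wavelet, and the names `haar_dwt`, `haar_idwt`, `wiener_gain`, `wavelet_wiener_filter`, and `select_wavelet` are hypothetical. The AM likelihood is represented only by a caller-supplied scoring function.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence -> (approx, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def wiener_gain(speech_pow, late_pow, noise_pow, eps=1e-12):
    """Assumed gain form: S / (S + L + N); eps avoids division by zero."""
    return speech_pow / (speech_pow + late_pow + noise_pow + eps)


def wavelet_wiener_filter(observed, speech_pow, late_pow, noise_pow):
    """Scale each wavelet band by its Wiener gain, then reconstruct.

    Each *_pow argument is a pair (approx_band_power, detail_band_power).
    In the paper these estimates come from separately optimized wavelets;
    here they are simply passed in.
    """
    approx, detail = haar_dwt(observed)
    g_a = wiener_gain(speech_pow[0], late_pow[0], noise_pow[0])
    g_d = wiener_gain(speech_pow[1], late_pow[1], noise_pow[1])
    return haar_idwt([g_a * a for a in approx], [g_d * d for d in detail])


def select_wavelet(candidates, am_log_likelihood):
    """Pick the candidate with the highest score from a caller-supplied
    function standing in for the acoustic-model likelihood (hypothetical)."""
    return max(candidates, key=am_log_likelihood)
```

With zero late reflection and noise power the gain approaches one and the signal passes through unchanged; as the interference power in a band grows, that band is attenuated toward zero.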


Copyright

This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Corresponding author

Corresponding author: R. Gomez Email: r.gomez@jp.honda-ri.com

APSIPA Transactions on Signal and Information Processing
  • ISSN: 2048-7703
  • EISSN: 2048-7703
  • URL: /core/journals/apsipa-transactions-on-signal-and-information-processing