
Blind bandwidth extension of audio signals based on non-linear prediction and hidden Markov model

Published online by Cambridge University Press:  30 July 2014

Xin Liu
Affiliation:
Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124, China. Phone: +86 10 6739 1635
Changchun Bao*
Affiliation:
Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124, China. Phone: +86 10 6739 1635
*Corresponding author: C. Bao. Email: baochch@bjut.edu.cn

Abstract

The bandwidth limitation of wideband (WB) audio systems degrades the subjective quality and naturalness of audio signals. In this paper, a new method for blind bandwidth extension of WB audio signals is proposed, based on non-linear prediction and a hidden Markov model (HMM). The high-frequency (HF) components in the 7–14 kHz band are artificially restored using only the low-frequency information of the WB audio. State-space reconstruction is used to map the fine spectrum of the WB audio into a multi-dimensional space, and non-linear prediction based on nearest-neighbor mapping is then applied in this state space to restore the fine spectrum of the HF components. The spectral envelope of the resulting HF components is estimated by an HMM from features extracted from the WB audio. The proposed method and the reference methods are applied to the ITU-T G.722.1 WB audio codec and compared with the ITU-T G.722.1C super WB audio codec. Objective quality evaluation results indicate that the proposed method performs better than the reference bandwidth extension methods. Subjective listening results show that the proposed method achieves audio quality comparable to that of G.722.1C and improves the extension performance relative to the reference methods.
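
The two core ideas named in the abstract, state-space reconstruction and nearest-neighbor prediction, can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a generic one-dimensional sequence rather than on the fine spectrum of a WB frame, and the function names `delay_embed` and `nn_extend` as well as the parameters `dim` (embedding dimension) and `tau` (delay) are hypothetical choices made here for illustration. Each new value is predicted as the successor of the past state vector closest to the current one.

```python
import numpy as np


def delay_embed(x, dim, tau):
    """Build delay-embedded state vectors [x[i], x[i+tau], ..., x[i+(dim-1)*tau]]."""
    n = len(x) - (dim - 1) * tau
    if n < 2:
        raise ValueError("sequence too short for this dim and tau")
    return np.stack([x[i * tau:i * tau + n] for i in range(dim)], axis=1)


def nn_extend(x, n_new, dim=3, tau=1):
    """Extend x by n_new values using nearest-neighbor prediction in state space.

    Illustrative only: dim and tau are fixed by hand here.
    """
    x = [float(v) for v in x]
    for _ in range(n_new):
        arr = np.asarray(x)
        states = delay_embed(arr, dim, tau)
        query = states[-1]          # most recent state vector
        candidates = states[:-1]    # earlier states, each with a known successor
        dist = np.linalg.norm(candidates - query, axis=1)
        j = int(np.argmin(dist))    # index of the nearest neighbor
        # The successor is the sample right after the neighbor's last coordinate.
        x.append(arr[j + (dim - 1) * tau + 1])
    return np.asarray(x)


if __name__ == "__main__":
    base = [0, 1, 2, 3] * 5
    print(nn_extend(base, 4, dim=2, tau=1)[-4:])
```

For a strictly periodic input such as the one in the usage lines, the nearest neighbor is an exact match and the extension continues the period. Real spectra would need the embedding parameters chosen from the data.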

Information

Type
Original Paper
Creative Commons
Creative Commons License: CC BY
The online version of this article is published within an Open Access environment subject to the conditions of the Creative Commons Attribution licence http://creativecommons.org/licenses/by/3.0/
Copyright
Copyright © The Authors, 2014

Fig. 1. Block diagram of the proposed BWE method.

Fig. 2. Fine spectrum for a frame of violin signals.

Fig. 3. Example of autocorrelation function for fine spectrum of violin signals.

Fig. 4. Relationship between the ratio of FNN to all state vectors and the state-space dimension.

Fig. 5. Example of state trajectory for LF fine spectrum of audio signal.

Fig. 6. Example of state trajectory for HF fine spectrum of audio signal.

Fig. 7. Block diagram of non-linear prediction for fine spectrum using NNM.

Fig. 8. The comparison of fine spectrum for audio signals from violin. (a) Original spectrum; (b) truncated spectrum; (c) extended spectrum.

Fig. 9. εNMSE of proposed BWE method with different thresholds in autocorrelation method.

Fig. 10. εNMSE of proposed BWE method with different ratios of FNN.

Table 1. Comparison of normalized mean square error for four BWE methods.

Table 2. Time-frequency features for describing WB audio signals.

Fig. 11. Block diagram of a priori knowledge training.

Fig. 12. Block diagram of G.722.1 encoder.

Fig. 13. Block diagram of G.722.1 decoder with BWE.

Table 3. Algorithm of G.722.1 decoder with the BWE function.

Fig. 14. LSD for different BWE methods.

Fig. 15. SNRseg for different BWE methods.

Fig. 16. Mean subjective scores with 95% confidence intervals for the MUSHRA listening test.

Fig. 17. Distributions of listener ratings in CCR tests. (a) Comparison between NNM and TDNP. (b) Comparison between TDNP and G.722.1C. (c) Comparison between G.722.1C and NNM.

Table 4. Algorithm complexity of proposed BWE method.