Development of speech technologies to support hearing through mobile terminal users

Published online by Cambridge University Press:  12 October 2015

Taro Togawa
Affiliation:
Media Processing Laboratories, Fujitsu Laboratories Ltd., Kawasaki, Japan
Takeshi Otani*
Affiliation:
Media Processing Laboratories, Fujitsu Laboratories Ltd., Kawasaki, Japan
Kaori Suzuki
Affiliation:
Advanced Technologies Division, Ubiquitous Business Strategy Unit, Fujitsu Limited, Kawasaki, Japan
Tomohiko Taniguchi
Affiliation:
Network Systems Laboratories, Fujitsu Laboratories Ltd., Kawasaki, Japan
*Corresponding author: T. Otani. Email: otani.takeshi@jp.fujitsu.com

Abstract

Mobile terminals have become our most familiar communication tool, and a wide range of people now use them in many different environments. As a result, telephone conversations in noisy surroundings, or with someone who speaks quickly, have become more common, and in these situations it is sometimes difficult to hear the other person's voice. To make the voice received through mobile terminals easier to hear, we have developed two technologies. One is a voice enhancement technology that emphasizes the caller's voice according to the noise surrounding the recipient; the other is a speech rate conversion technology that slows speech while maintaining voice quality. In this paper, we explain the trends and features of these technologies and discuss ways to implement their algorithms on mobile terminals.

Information

Type
Industrial Technology Advances
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
Copyright © The Authors, 2015
Fig. 1. Block diagram of spectral subtraction method.

Fig. 2. Block diagram of MMSE-STSA method.

Fig. 3. Block diagram of the reverberation suppression method.

Fig. 4. Block diagram of the post-filters used in ACELP.

Fig. 5. Outline of voice enhancement technology.

Fig. 6. Block diagram of developed voice enhancement technology.

Fig. 7. Principle of high-frequency gain control according to the type of noise.

Fig. 8. Evaluation test results (developed voice enhancement technology).

Fig. 9. Basic principles of speech rate conversion.

Fig. 10. Principles of OverLap-Add method.

Fig. 11. Block diagram of developed speech rate conversion technology.

Fig. 12. Example using developed speech rate conversion technology.

Fig. 13. Evaluation test results (developed speech rate conversion technology).

Fig. 14. Factors of received speech quality.

Fig. 15. Attaining “standard speech quality” with adaptive speech quality control. (a) Without adaptive speech control (upper). (b) With adaptive speech quality control (lower).

Fig. 16. Diagram of speech quality control.

Fig. 17. VAD mechanism.

Fig. 18. Hearing ability decreases with increasing age.

Fig. 19. Amplification amount control according to input volume. Sound pressure level (SPL).