Chapter 2 kicks off the presentation with an overview of the standard metrics and methodologies followed by a description of specialized tools employed for obtaining subjective voice-quality scores through genuine opinion surveys and via computer modeling emulating human perceptive evaluation of speech quality. It then relates voice-quality scores obtained via surveys or computer evaluations to the perception of worth. It elaborates on the relationships between opinion scores and the potential return on investment in voice-quality technology. It examines results of voice-quality studies with reference to the three popular GSM codecs – full rate (FR), enhanced full rate (EFR) and half rate (HR). The presentation includes a discussion of the effect of noise and transmission errors on the relative performance of these codecs.
Introduction
It is widely agreed that the most vital element affecting voice-quality performance in wireless networks is the type of codec used in the communication session. It is also well known that spectrum is the most expensive (per channel) building block contributing to the viability of wireless infrastructure. Yet most European GSM wireless operators, throughout the nineties and the first half of the following decade, opted to restrict their service offerings to full-rate (FR) or enhanced full-rate (EFR), rather than embracing the half-rate (HR) option, which could cut their spectrum consumption in half and save them millions of dollars in operating expenses.
Chapter 6 is dedicated to the subject of level-control optimization. The presentation is divided into three parts and an introduction. The chapter starts the ball rolling in the introduction by defining standard methodologies for measuring and quantifying signal levels. The first part deals with automatic level control (ALC), how it works, and its placement within the network. The second part describes the adaptive level control, a.k.a. noise compensation (NC), how it works under different codecs, and where it is placed in the network. The third part describes the high-level compensation procedure along the same outline.
Basic signal-level measurements and definitions
One of the most critical speech-quality attributes is the perceived-speech level. When it is too loud, it may either overload or hurt the eardrum. When it is too soft, the listener or even the codec may have difficulties picking up words. There are several different measures and metrics used to measure speech levels. The common thread linking them together to a standardized scale is the unit of measurement defined as decibel and abbreviated as dB.
The human auditory system has a dynamic range of 100,000,000,000,000 (10^14) intensity units. This dynamic range is better represented by a logarithmic scale, as a ratio of two intensities, P1 and P2. The expression “log(P1/P2)” is labeled a bel. The resulting number is still too large so, instead, the measure employed is one tenth of a bel or a decibel (dB).
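As a minimal sketch of the definition above, the following Python snippet converts a ratio of two intensities to decibels. The function name `intensity_ratio_to_db` is an illustrative assumption rather than anything taken from the text. Applied to the dynamic range of 100,000,000,000,000 intensity units quoted above, it gives about 140 dB.

```python
import math


def intensity_ratio_to_db(p1: float, p2: float) -> float:
    """Return 10 * log10(p1 / p2): the ratio of two intensities in decibels.

    log10(p1 / p2) is the ratio in bels; multiplying by 10 converts to
    tenths of a bel, i.e. decibels.
    """
    if p1 <= 0 or p2 <= 0:
        raise ValueError("intensities must be positive")
    return 10.0 * math.log10(p1 / p2)


# Dynamic range of human hearing as quoted in the text (a ratio of 1e14).
dynamic_range_db = intensity_ratio_to_db(1e14, 1.0)

# Doubling the intensity adds roughly 3 dB.
doubling_db = intensity_ratio_to_db(2.0, 1.0)
```

The logarithmic scale turns multiplication of intensity ratios into addition of decibel values, which is one reason level budgets along a transmission path are usually stated in dB.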
Chapter 3 provides an overview of echo in telecommunications networks, its root causes, and its parameters. It follows the presentation with the methods used for controlling electrical echo, including network loss, echo suppression, linear convolution, non-linear processing, and comfort noise injection. The chapter covers the application of echo cancelation in wireless communications. And, in view of the fact that today's wireless networks include long-distance circuit switched, VoIP, and VoATM infrastructures (specifically as part of third-generation architectures), the chapter covers echo cancelation in long-distance and voice-over-packet applications.
Electrical echo
Many people have never experienced echo on a telephone call. They either never made calls outside their vicinity, or never encountered a malfunctioning echo canceler on their long-distance or wireless calls. In reality, echo does exist in the network. It accompanies every call involving an analog PSTN phone, but in most cases it either gets canceled before reaching the listener's ear or it arrives so quickly, with so little delay, that it sneaks in undetected.
To hear an echo, we must first generate a sound. Then the sound must travel over a substantial distance or a slow terrain (a.k.a. complex processing) to accumulate delay. Next, the sound must be reflected and then travel back towards us. Finally, when the reflected sound reaches our ear, it must be loud enough to be heard. In addition, the amount of delay that the original sound signal incurs influences our perception of “reverberated echo” (i.e., increasing delay produces a greater echo effect).
Most people view sleep as a biological necessity designed to rejuvenate and invigorate body and mind so that human beings can function effectively when they are awake. While awake we tend to eat, dispose of waste, spawn the new generation, and take care of business, so that when the day is over we can go back to sleep.
When asked: “What is life's purpose?” we immediately think of life as the time we spend on earth while being awake.
Now let's reverse the paradigm. Let's pretend for a brief moment that life's purpose is a good sleep, and that all of the energetic activities above are designed to sustain our ability to go back and “live” (while asleep) the next day (see Figure 16.1).
I know. This is a weird thought, but let's see if we can apply the concept to voice-quality engineering in wireless networks, so that it makes sense.
Voice-quality systems are designed to operate on voice and to disable themselves on detection of data transmission. In other words, when turning on the VQS, voice-enhancement applications are enabled automatically while the system goes on guard, watching for any data that might require a disabling action.
Now let's reverse the paradigm. Let's disable all or a particular VQ application upon system turn-on, and continue to monitor the channel until voice is detected. When this occurs, the system enables the VQ application, and then puts it back to sleep when the monitored signal no longer looks like voice.
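The two control policies described above can be contrasted in a short Python sketch. Everything here is hypothetical and for illustration only: the `Signal` classes (including the `UNKNOWN` class for frames that neither detector recognizes), the function names, and the assumption that detected voice re-arms the application under the conventional policy. None of it is taken from the text.

```python
from enum import Enum
from typing import Iterable, List


class Signal(Enum):
    VOICE = "voice"
    DATA = "data"
    UNKNOWN = "unknown"  # assumed third class: neither detector fired


def conventional_policy(frames: Iterable[Signal]) -> List[bool]:
    """Enabled at turn-on; disabled when data is detected."""
    enabled = True
    states = []
    for signal in frames:
        if signal is Signal.DATA:
            enabled = False
        elif signal is Signal.VOICE:
            enabled = True  # assumption: detected voice re-arms the application
        states.append(enabled)
    return states


def reversed_policy(frames: Iterable[Signal]) -> List[bool]:
    """Disabled at turn-on; enabled only while the signal looks like voice."""
    enabled = False
    states = []
    for signal in frames:
        enabled = signal is Signal.VOICE
        states.append(enabled)
    return states


frames = [Signal.UNKNOWN, Signal.VOICE, Signal.UNKNOWN,
          Signal.DATA, Signal.UNKNOWN, Signal.VOICE]
conventional_states = conventional_policy(frames)
reversed_states = reversed_policy(frames)
```

In this sketch the two policies differ only on frames that are not positively classified: the conventional policy keeps processing them until data is detected, whereas the reversed policy leaves them untouched until voice is detected.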
In Chapters 2 and 9, I discussed studies that illustrate a considerable weakness of the GSM half-rate codec in the face of noise. I also brought to light study results showing that an effective NR algorithm may be able to remedy some of the HR weakness and lift the voice-quality performance back to par for signals with a relatively poor SNR (see Figure 9.13).
As new wireless codecs are introduced to the marketplace, many service providers wonder whether VQ systems remain effective in enhancing quality when higher compression ratios are used in an attempt to reduce spectrum requirements and augment air capacity.
In this chapter, I introduce a procedure for testing the hypothesis that a new codec with a higher compression ratio offers an inferior (or equivalent) voice-quality performance in comparison to an existing codec, specifically under noisy conditions.
If the hypothesis proves to be true, then there is a need to test a second hypothesis: that VQS can lift the performance level of the new codec close to (or beyond) that of the existing codec (under noisy conditions and without VQS).
Procedure
The procedure outlined here involves two phases.
Phase 1 (see Figure 15.1)
(1) Send pre-recorded speech signals from point B without noise. Capture and record speech signals (pre-recorded) at point A.
(2) Send pre-recorded speech signals from point E with noise. Capture and record speech signals (pre-recorded) at point D.
(3) Send pre-recorded speech signals from point B without noise. Capture and record speech signals (pre-recorded) at point A while sending speech signals from point A (double talk).
In survey after survey, potential and actual users of wireless communications indicated that voice quality topped their reasons for selecting a specific service provider. While providers have been well aware of this key component powering their offering, they have not always been certain as to the specific methodology, resolution elements, equipment type, architecture, trade-offs, and rate of return on their particular investment that elevate the perceived voice-quality performance in their network.
It is only natural that voice quality in wireless networks has become a key differentiator among the competing service vendors. Network operators, network infrastructure planners, sales representatives of equipment vendors, their technical and sales support staff, and students of telecommunications seek information and knowledge continually that may help them understand the components of high-fidelity communicated sound.
Throughout the 1990s applications involving voice-quality enhancements, and specifically echo cancelation, have induced fresh inventions, new technology, and startling innovations in the area of enhanced voice performance. The initial echo canceler (EC) product implementations existed for about a decade before a diverse array of voice-quality enhancement realizations emerged to meet the evolving needs of digital wireless communications applications.
Early EC implementations were limited to very long distance (e.g., international) circuit-switched voice and fax applications where echo was perceived (in voice conversations) due to delays associated with signal propagation. The EC application soon expanded beyond strictly very-long-distance applications as further signal processing and dynamic routing along the communications path added delay to end-to-end voice transport.
Part I reviews the major voice codecs, their history, and their relative perceived quality. Voice-coding architectures are the building blocks of transmitted voice. They are the core that shapes the characteristics and quality of transmitted speech. Nevertheless, they are treated in this book only as background to the main subject, which deals with impairments due to transmission architecture and environment, and their corresponding remedies that immunize and repair any potential or actual spoil. Since the effectiveness of the various remedies depends on that underlying coding, it is essential that these designs be understood so that remedies can be fine tuned and customized to suit the particular characteristics of the underlying voice architecture.
Chapter 5 is devoted to the subject of noise reduction. Noise reduction is the most complicated feature among the voice-quality-assurance class of applications. It also requires a higher-level understanding of mathematics. This discussion, however, replaces numerous mathematical expressions with intuition, ordinary analogies, and logical reasoning, supplemented by graphical and audio illustrations.
The analysis gets underway with the definition of noise, a definition consistent with the principles and characterization employed by a typical noise-reduction algorithm. It then introduces and explains the mathematical concept of time and frequency domains and the transformation process between the two. Once the reader is armed with the understanding of time- and frequency-domain representations, the analysis proceeds to a discussion of the noise-estimation process. The presentation then moves ahead to examine the suppression algorithm, which employs the noise-estimation results in its frequency-band attenuation procedures. The next segment contains a presentation covering the final algorithmic steps, which involve scaling and inverse transformation from frequency to time domains.
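The sequence outlined above (transform to the frequency domain, estimate the noise, attenuate frequency bands, transform back) can be illustrated with a small Python sketch. It uses plain magnitude spectral subtraction with a gain floor, which is a generic textbook technique chosen here for illustration and not necessarily the algorithm the chapter describes. The function names, the `floor` parameter, and the assumption that noise-only frames are available are all hypothetical, and the scaling step is omitted.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Naive discrete Fourier transform: time domain to frequency domain."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec: Sequence[complex]) -> List[float]:
    """Inverse transform back to the time domain (real part only)."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def estimate_noise(noise_frames: Sequence[Sequence[float]]) -> List[float]:
    """Average magnitude per frequency bin over frames assumed to be noise only."""
    spectra = [dft(frame) for frame in noise_frames]
    n = len(spectra[0])
    return [sum(abs(s[k]) for s in spectra) / len(spectra) for k in range(n)]


def suppress(frame: Sequence[float], noise_mag: Sequence[float],
             floor: float = 0.1) -> List[float]:
    """Attenuate each bin by an amount that depends on the estimated noise."""
    attenuated = []
    for x_k, n_k in zip(dft(frame), noise_mag):
        mag = abs(x_k)
        # Spectral-subtraction gain, limited below by the (assumed) floor.
        gain = max(floor, 1.0 - n_k / mag) if mag > 0 else floor
        attenuated.append(gain * x_k)
    return idft(attenuated)
```

A usage example would pass one or more noise-only frames to `estimate_noise` and then call `suppress` on each noisy frame of the same length. The naive transform costs O(n²) per frame, so a practical implementation would use a fast Fourier transform with overlapping windowed frames.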
The next section in Chapter 5 reflects on key potential side effects associated with noise-reduction algorithms including treatment of non-voice signals. It points to key trade-offs and adverse-feature interactions that may occur in various GSM and CDMA networks – a subject that is covered much more thoroughly in Part V – Managing the network. The final section offers an examination of the network topology and placement of the noise-reduction application within it.
Noise in wireless networks
Background acoustic noise is a major voice-quality irritant that is, unfortunately, abundant in wireless communications.
Chapter 7 reviews the optional placements of the VQS functions relative to the mobile-switching center and the base-station controller, since placement impacts voice performance, applications, deployment cost, and data-detection algorithms. The first section of this chapter covers wireless-network architectures that provide comprehensive signal-processing coverage for mobile-to-mobile call applications. The topic of economics and architectural trade-offs associated with voice-enhancement systems is also addressed. The second part of the chapter presents an analysis of the techniques employed by a voice-quality system when coping with data communications without interfering with or blocking their error-free transmission. The analysis includes descriptions of data-detection algorithms based on bit-pattern recognition. The scope encompasses circuit-switched and high-speed circuit-switched data (CSD and HSCSD respectively) services. Finally, the third section describes tandem-free operation (TFO), its potential impact on speech transmission and data communication, and potential features and architectures.
The chapter characterizes two major themes and their joint interplay: (1) mobile-to-mobile network architectures with voice-quality enhancements, and (2) mobile data communications. It elaborates on various applications, technical challenges, and potential solutions. It is also intended to impart a sharper awareness of where technology is heading, and what constitutes winning features in the race to provide products that deliver superior voice quality in the wireless-communications arena.
The surge in data communications has spilled its fervor into wireless applications. Demand for higher-speed data access and the remarkable growth of internet applications have fueled the growth of this industry.
This chapter portrays the 2G and 3G network topologies and their impact on VQA feasibility and architecture. It provides an evolutionary examination of the process leading from the 2G to the 3G wireless architecture, and it presents a parallel progression of placement and applicability of the VQS that supports the evolving infrastructure.
3G promotions and promises
Anyone following the latest developments in wireless telecommunications has certainly noticed the hype concerning the 3G wireless era. “It sets in motion high-speed web-cruising via your cellphone,” the exuberant technologists exclaim. “It lets you watch a movie on your cellphone,” the ecstatic entrepreneurs predict, waiting for you to part your lips in awe. “It represents an amazing political transformation,” the sociological gurus gasp. “Imagine – a single standard, with the GSM and TDMA (IS-54) advocates surrendering to a competing religion – CDMA,” the 3G zealots chant in unison. Although these are exciting and enticing statements, they represent rumour and, at best, are actually only half-truths.
Considering the complexities, it may take quite some time beyond launching the early pieces of 3G networks for wireless-data rates to match existing wireline digital subscriber line (DSL) and cable-modem speeds. Presently, the 3G theoretical speed limits are severely constrained by implementation intricacies.
The 3G IMT-2000, UMTS, and cdma2000 capabilities are founded on the basis of foresight and planned service capabilities that require wide-band virtual channels, which enable full-motion video transmission and very high-speed data-transfer options.