A few weeks back our friends at the Xperia™ news blog posted a piece
about Xperia™ T being HD Voice certified (in fact, the upcoming Xperia™ V
is also HD Voice certified). Now, we will dive deeper into the HD Voice
technology. This is a technology that enables you to get superior voice
quality during calls, thanks to speech codec, acoustic design and signal
processing enhancements. Read more about how HD Voice works after the
jump!
The HD Voice technology is primarily based on the 3GPP specifications for wideband speech, which can be implemented in both phones and networks. Before a phone can carry the GSMA HD Voice logotype, it must pass a substantial number of acoustic performance tests.
The term HD Voice actually has three different meanings:
Wideband speech on cellular, based on acoustic test cases defined by, for example, the GCF (Global Certification Forum). Most Sony Xperia™ smartphones have supported this for a few years now.
Additional operator-specific (“carrier-specific”) acoustic requirements. Many Sony Xperia™ smartphones are endorsed by major operators.
Additional acoustic test cases defined by GSMA. Compliance is necessary to use the GSMA HD Voice logotype. Xperia™ T and Xperia™ V are already certified in this way.
As you can see, Sony has support across all three categories. The GSMA specification for HD Voice is actually quite new and was developed with the aim of reducing fragmentation, so that a single definition would prevail on the market. In this article, we will focus on what is needed to use the GSMA HD Voice logotype.
Now, to implement HD Voice during phone calls, three main improvements are used. The first is a speech codec called Adaptive Multi-Rate – Wideband (AMR-WB), which improves on the codec used for “normal” voice calls. Together with improvements to the acoustic design and the signal processing, this is what delivers HD Voice quality. Each of these three improvements is described in more detail below.
Doubling the audio bandwidth with the AMR-WB codec
The heart of the technology behind HD Voice is “wideband speech”, which is enabled by the AMR-WB speech codec. This codec doubles the audio bandwidth compared to traditional telephony (~7 kHz instead of ~3.5 kHz), adding the treble we need to avoid the muffled sound we associate with old-school telephony.
This also makes it easier for us to understand speech. One classic example is the “s” and “f” sounds, which sound very much the same in normal voice calls. With HD Voice, you can hear a clear difference between them. The increased quality and fidelity of the voice also provide a new sense of intimacy, where you feel closer to the person you are talking to.
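To put rough numbers on the bandwidth claim above, here is a minimal Python sketch. It assumes the commonly quoted passbands of about 300–3400 Hz for traditional narrowband telephony and 50–7000 Hz for wideband speech; these band edges and all names in the code are illustrative assumptions, not values taken from the GSMA test cases.

```python
# Approximate audio passbands in Hz (assumed values, for illustration only).
NARROWBAND_HZ = (300, 3400)   # traditional telephony
WIDEBAND_HZ = (50, 7000)      # wideband speech, as carried by AMR-WB


def bandwidth_hz(band):
    """Width of a passband given as a (low, high) pair in Hz."""
    low, high = band
    return high - low


def passes(freq_hz, band):
    """True if a frequency component falls inside the passband."""
    low, high = band
    return low <= freq_hz <= high


# How much wider is the wideband passband?
ratio = bandwidth_hz(WIDEBAND_HZ) / bandwidth_hz(NARROWBAND_HZ)
print(f"Wideband is about {ratio:.1f} times as wide as narrowband")

# Which example components survive in each case?
for freq_hz in (200, 1000, 5000):
    print(freq_hz, "Hz:",
          "narrowband" if passes(freq_hz, NARROWBAND_HZ) else "-",
          "wideband" if passes(freq_hz, WIDEBAND_HZ) else "-")
```

With these assumed band edges the wideband passband comes out a little over twice as wide, and a 5 kHz component is kept in the wideband case but lost in the narrowband one.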
Improved acoustic design
In addition to support for the AMR-WB codec, one important improvement in HD Voice compared to normal voice calls is that the acoustic design is done differently, in order to utilise the greater bandwidth that the codec offers. This means that components like microphones and loudspeakers have to be of the right type and quality, and they need to be integrated into the phone in an optimal way.
Redesigned signal processing algorithms
The audio processing algorithms that run behind the scenes are also redesigned to work at a higher sampling rate than in normal voice calls (the sampling rate for HD Voice is at least 16 kHz, compared to the usual 8 kHz).
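Why the sampling rate matters can be shown with a small Python sketch. A sampled signal can only represent frequencies up to half its sampling rate (the Nyquist limit), so 8 kHz sampling tops out at 4 kHz while 16 kHz sampling reaches 8 kHz. The function names and the 5 kHz and 3 kHz test tones below are chosen purely for illustration.

```python
import math


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def sample_cosine(freq_hz, sample_rate_hz, count):
    """Sample a unit-amplitude cosine tone at the given rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]


def indistinguishable(freq_a_hz, freq_b_hz, sample_rate_hz, count=64):
    """True if two tones produce the same samples at this rate."""
    a = sample_cosine(freq_a_hz, sample_rate_hz, count)
    b = sample_cosine(freq_b_hz, sample_rate_hz, count)
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b))


print(nyquist_hz(8000), nyquist_hz(16000))

# At 8 kHz a 5 kHz tone lies above the Nyquist limit and aliases onto 3 kHz.
print(indistinguishable(5000, 3000, 8000))

# At 16 kHz both tones are below the limit and stay distinct.
print(indistinguishable(5000, 3000, 16000))
```

In other words, an algorithm running at 8 kHz cannot see the upper part of the wideband signal at all, which is why the processing chain has to be redesigned for the higher rate rather than just retuned.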
One of these signal processing algorithms is the noise suppressor. A GSMA HD Voice compliant phone ensures that when you are talking on the phone, the person you are talking to hears your voice naturally while the level of the background noise is reduced. In fact, the metrics used predict how a number of people would judge the sound in various environments, for example in a car, in traffic or at a restaurant, both in terms of speech distortion and how intrusive the noise is.
While it may seem attractive to completely remove the background noise from the signal, that is actually not the goal. What we want to do is to reduce the noise to a comfortable level, but still maintain a natural feeling and convey an overall context to the conversation. To get this result, the Xperia™ T and most other Sony Xperia™ smartphones include a dual-microphone noise suppressor that successfully suppresses real-world noises. This works both for fairly static noise, such as in a car, and for very dynamic noise environments, like a crowded restaurant.
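The idea of reducing noise to a comfortable level instead of removing it can be sketched in Python as a gain with a floor. This is a deliberately simplified, single-channel illustration and not the dual-microphone suppressor used in Xperia™ phones; the `speech_probability` input, the 12 dB limit and all function names are assumptions made up for the example.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude gain."""
    return 10 ** (db / 20)


def suppress_frame(samples, speech_probability, max_attenuation_db=12.0):
    """Attenuate one audio frame, but never by more than the limit.

    speech_probability is a hypothetical estimate between 0.0 (noise only)
    and 1.0 (speech certainly present), assumed to come from elsewhere.
    """
    p = min(1.0, max(0.0, speech_probability))
    floor = db_to_gain(-max_attenuation_db)   # lowest gain ever applied
    gain = 1.0 - (1.0 - floor) * (1.0 - p)    # floor at p=0, 1.0 at p=1
    return [s * gain for s in samples]


noise_only = suppress_frame([0.5, -0.5], speech_probability=0.0)
speech = suppress_frame([0.5, -0.5], speech_probability=1.0)
print(noise_only)  # quieter than the input, but not silent
print(speech)      # passed through unchanged
```

Because the gain never drops below the floor, a noise-only frame is turned down but still audible, which keeps some of the context of the conversation instead of gating it to silence.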
What do you need to be able to try HD Voice?
To experience HD Voice during a phone call, your phone and the phone on the other end both need to support HD Voice. In addition, the operator needs to support it in the network.
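The three conditions above can be sketched in Python as a small codec selection: the call uses wideband only if both phones and the network all offer it. Real call setup is considerably more involved; the function names, the `AMR-NB` label for the narrowband codec and the fallback behaviour are assumptions used for illustration.

```python
# Codecs in order of preference, best first (illustrative labels).
CODEC_PREFERENCE = ["AMR-WB", "AMR-NB"]


def select_codec(caller_codecs, callee_codecs, network_codecs):
    """Pick the best codec that both phones and the network all support."""
    common = set(caller_codecs) & set(callee_codecs) & set(network_codecs)
    for codec in CODEC_PREFERENCE:
        if codec in common:
            return codec
    return None  # no codec in common


def is_hd_voice_call(caller_codecs, callee_codecs, network_codecs):
    """True only when the whole chain ends up on the wideband codec."""
    return select_codec(caller_codecs, callee_codecs, network_codecs) == "AMR-WB"


hd_phone = ["AMR-WB", "AMR-NB"]
older_phone = ["AMR-NB"]
hd_network = ["AMR-WB", "AMR-NB"]
older_network = ["AMR-NB"]

print(is_hd_voice_call(hd_phone, hd_phone, hd_network))      # all three support it
print(is_hd_voice_call(hd_phone, older_phone, hd_network))   # other phone does not
print(is_hd_voice_call(hd_phone, hd_phone, older_network))   # network does not
```

Only the first call qualifies; in the other two cases this sketch falls back to the narrowband codec, so the call still works but without HD Voice.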
Several operators have been rolling out HD Voice support in their networks for a few years now, and more and more networks are actually being enabled. For you and your friends to enjoy the HD Voice experience, please check what your operator supports or plans to support in the near future!