Monday, September 12, 2011

How the Migration to IP Improves Voice Quality

Back in the early years of Voice over IP, the quality was not great in comparison to TDM systems. Because IP networks did not have enough bandwidth or quality of service, voice had to be heavily compressed before it could be sent over the IP network. TDM solutions, by contrast, did not compress voice, and since there was a physical connection between the TDM system and the TDM phone, they did not need to do much packetization either. The result was better voice quality on TDM phones than on IP phones, until the advent of fast IP LANs and wideband audio codecs in the 2000s entirely changed the balance.

Most VoIP phones shipping today have some form of HD Voice support. Polycom has been shipping HD voice for 10 years, starting with 7 kHz voice, then moving to 14 kHz audio in 2003 and 20+ kHz audio in 2006. While the voice industry as a whole is only now moving to 7 kHz voice, Polycom has moved well beyond that, supporting 14 kHz and even 20+ kHz audio (with the Siren 22 and G.719 codecs). It is not just about "voice" anymore but rather about "audio": the technology has gone beyond speech transmission and allows for high-quality music and mixed content.
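To put the kHz figures above in context, the sketch below relates audio bandwidth to the sampling rate a codec needs. It assumes only the standard Nyquist criterion (sampling rate of at least twice the audio bandwidth). The "typical" sampling rates and the tier names are commonly cited values added for illustration, not figures taken from this post.

```python
def min_sampling_rate_khz(audio_bandwidth_khz):
    """Nyquist criterion: sample at no less than twice the audio bandwidth."""
    return 2 * audio_bandwidth_khz


# Tier name -> (audio bandwidth in kHz, typical sampling rate in kHz).
# The typical rates are illustrative assumptions, not values from the post.
TIERS = {
    "7 kHz wideband": (7, 16),
    "14 kHz super-wideband": (14, 32),
    "20 kHz full-band": (20, 48),
}

for name, (bandwidth, typical_rate) in TIERS.items():
    floor = min_sampling_rate_khz(bandwidth)
    print(f"{name}: needs at least {floor} kHz, typically sampled at {typical_rate} kHz")
```

Each typical rate sits somewhat above the Nyquist floor, which leaves headroom for the filters that limit the signal to the intended band.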

HD Voice is not only about better-quality codecs. The acoustics of the handset have to be improved, and microphones and speakers have to be modified to capture and play higher-quality voice and audio. Echo cancellation and other algorithms have to be adjusted to support the wider frequency band. The challenges around transmitting high-quality audio are described in a joint white paper by Polycom and the Manhattan School of Music. The paper highlights our focus on audio quality and demonstrates our capability to meet the requirements of the most demanding users: musicians. The technology developed for this high-end application trickles down to room and personal telepresence systems and telephones, effectively spreading across the entire Polycom portfolio.

So how does a communication system capture the value of the high-quality audio now available in VoIP phones? The key is migration to a distributed architecture that routes voice streams without transcoding them back to the TDM format (G.711 codec). If audio is delivered between two communication partners on the system without transcoding, the quality stays at the highest level both endpoints support (assuming that both partners have high-quality VoIP phones). The matter gets a little more complicated with multipoint calls because most voice conferencing servers embedded in enterprise voice systems support only G.711. If a video conference server such as Polycom RMX is part of a Unified Communication solution, the unused video ports on this server can be configured to support audio (up to the highest audio quality of 20+ kHz). Audio requires far less processing power than video; therefore, one video port becomes 40 audio ports, and that is enough scalability for an enterprise deployment. Long-term, however, wideband audio will gradually be supported on all conference servers in enterprise systems, starting with the 7 kHz G.722 wideband codec, which is widely supported in newer IP phones.
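The two ideas in this paragraph can be sketched in a few lines: picking the widest-band codec that both endpoints share (so no transcoding to G.711 is needed), and converting unused video ports into audio ports at the 40:1 ratio mentioned above. This is a minimal illustration, not how any particular call server negotiates codecs. The function names, the codec labels, and the approximate bandwidth figures are assumptions made for the example.

```python
# Approximate audio bandwidth in kHz per codec (illustrative values only).
AUDIO_BANDWIDTH_KHZ = {
    "G.711": 3.4,
    "G.722": 7.0,
    "G.722.1C": 14.0,
    "G.719": 20.0,
    "Siren22": 22.0,
}


def best_common_codec(caller_codecs, callee_codecs):
    """Return the widest-band codec both endpoints support, or None."""
    common = set(caller_codecs) & set(callee_codecs)
    if not common:
        return None
    # Rank by audio bandwidth; break ties by name so the result is stable.
    return max(common, key=lambda codec: (AUDIO_BANDWIDTH_KHZ[codec], codec))


def audio_ports(unused_video_ports, audio_ports_per_video_port=40):
    """Audio capacity gained by reconfiguring unused video ports."""
    return unused_video_ports * audio_ports_per_video_port


hd_phone = ["G.711", "G.722", "G.719"]
older_ip_phone = ["G.711", "G.722"]
legacy_bridge = ["G.711"]

print(best_common_codec(hd_phone, older_ip_phone))  # wideband survives
print(best_common_codec(hd_phone, legacy_bridge))   # falls back to G.711
print(audio_ports(5))                               # 5 video ports -> 200 audio ports
```

The second call shows the multipoint problem described above: as soon as a G.711-only conferencing server is in the path, the call drops to narrowband regardless of what the phones support.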

Once the multipoint problem is solved, the only remaining problem is connectivity to other systems across service provider networks. Most voice systems today still use a TDM connection (such as T1 or PRI) to connect to service providers. This TDM connection takes the voice quality down to G.711 due to physical limitations. Newer systems, however, support the so-called SIP trunking standard (the specification is managed by the SIP Forum), which connects the enterprise voice system to a service provider using an IP connection and a virtual trunk with SIP signaling. This virtual trunk does not impose any physical limitations on the voice streams (it is just IP packets crossing the network); therefore, any voice quality can be supported, as long as both the enterprise and the SP systems can handle it. SIP trunks enable wideband voice to travel among enterprise communications systems around the world without any transcoding or quality loss.
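The effect of the trunk type can be modeled as a simple "weakest link" calculation: the audio bandwidth a listener hears is the minimum across every hop in the path. The sketch below is a simplified model under that assumption. The 3.4 kHz figure for G.711 is an approximate value added for illustration, and the function name is hypothetical.

```python
import math

# Approximate audio bandwidth of G.711 in kHz (illustrative assumption).
G711_BANDWIDTH_KHZ = 3.4


def end_to_end_bandwidth_khz(hop_limits_khz):
    """The narrowest hop determines the audio bandwidth of the whole call."""
    return min(hop_limits_khz)


# Two enterprise systems with 7 kHz phones, linked by a TDM trunk (T1/PRI).
path_over_tdm = [7.0, G711_BANDWIDTH_KHZ, 7.0]

# Same phones, linked by a SIP trunk, which adds no limit of its own.
path_over_sip = [7.0, math.inf, 7.0]

print(end_to_end_bandwidth_khz(path_over_tdm))  # limited by the TDM trunk
print(end_to_end_bandwidth_khz(path_over_sip))  # limited only by the phones
```

Modeling the SIP trunk as `math.inf` reflects the point made above: the trunk itself carries plain IP packets, so the limit comes from the enterprise and SP systems at either end.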

Will wideband voice make its way to wireless handsets? The latest generation of wireless handsets, for example the Polycom Spectralink 8400, already supports wideband voice (7 kHz, G.722, G.722.1), and as long as the voice stream can reach the destination in its original form, the receiver enjoys the clarity and improved intelligibility of wideband voice communication. The challenges related to multipoint conferencing and SP trunking apply equally to wireless phones, as they are treated like any other phone in the IP communication system environment.

In conclusion, voice technology has made amazing progress over the past decade. The work of researchers and engineers is now finally finding its way into enterprise communication solutions that provide better quality, reduce misunderstandings and fatigue, and in general make human interactions over distances more natural and effortless.