Wednesday, March 18, 2009

Business to Business Telepresence: The Compatibility Issue

John Bartlett wrote a short article about compatibility issues with B2B telepresence, and I believe this topic deserves deeper analysis.

The article mentions Quality of Service as an issue for all types of telepresence systems that connect over the public Internet. I have some experience with the technology – I have been using a Polycom HDX 4000 personal telepresence system in my home office for a while and can connect to the Polycom network via a video border proxy that resides in the demilitarized zone (DMZ) of the closest Polycom office (San Jose, California). I can also place calls to other telepresence systems connected to the public Internet; for example, a colleague of mine has one in Atlanta, Georgia.

The bad news is that the network performance is not predictable: sometimes I can connect at 1 megabit per second and enjoy high definition quality, and sometimes I only get 512 kilobits per second, which results in standard definition quality. On really bad days, I get a mere 384 kilobits per second – luckily, the latest generation of video systems can deliver SD quality even over such low bandwidth. Internet distances are not like geographical distances, and I frequently experience high bandwidth (and high quality calls) to the other US coast (that would be the East Coast) or even internationally while getting lower bandwidth and quality on local calls. It generally depends on Internet usage at the time of the call.

The good news is that my video system tolerates changes in the network conditions, and gracefully adjusts the quality (frame rate and resolution) to optimize the user experience. A video call that starts at 1 megabit per second (high definition) automatically becomes a standard definition call when the bandwidth falls to 768 kilobits per second, and goes back to HD when the available bandwidth increases.
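The adjustment described above can be sketched as a simple threshold rule. This is only an illustration built on the figures quoted in this post (HD at about 1 megabit per second, SD down to 384 kilobits per second); the names `select_quality` and `quality_timeline` are hypothetical, and a real endpoint also adapts frame rate and resolution in finer steps.

```python
# Thresholds are assumptions taken from the numbers in this post,
# not from any vendor's actual adaptation algorithm.
HD_MIN_KBPS = 1000  # about 1 megabit per second gives HD
SD_MIN_KBPS = 384   # lowest rate mentioned that still gives SD


def select_quality(available_kbps: int) -> str:
    """Map the currently available bandwidth to a video quality level."""
    if available_kbps >= HD_MIN_KBPS:
        return "HD"
    if available_kbps >= SD_MIN_KBPS:
        return "SD"
    return "below SD"


def quality_timeline(bandwidth_samples_kbps):
    """Re-evaluate the quality for every bandwidth sample taken during a call."""
    return [select_quality(kbps) for kbps in bandwidth_samples_kbps]
```

For a call whose bandwidth drops from 1000 to 768 kilobits per second and later recovers, the timeline goes from HD to SD and back to HD without the call being dropped.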

There are ways to deliver QOS over the Internet by using two video border proxies (VBPs) back to back. I do not have one in my home office just because I like to have fewer boxes and cables and keep the place tidy, but if you connect two offices, e.g. a main and a branch office of a company, you have enough networking closets to hide such equipment, and using a VBP is recommended. The VBP assigns classes to the different types of traffic (voice, video, and data) and makes sure the real-time traffic (voice and video) gets the appropriate priority through the Internet. I have some diagrams about that in the teleworking white paper, which is almost ready and will soon be published. Stay tuned!
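To make the idea of traffic classes concrete, here is a minimal sketch of a strict-priority scheduler. It is not how a VBP is implemented; the class name `PriorityScheduler` and the exact ordering (voice before video before data) are assumptions chosen for illustration. The only point taken from the text is that real-time traffic leaves the queue before data traffic.

```python
import heapq
from itertools import count

# Assumed ordering: lower number means higher priority.
PRIORITY = {"voice": 0, "video": 1, "data": 2}


class PriorityScheduler:
    """Strict-priority queue: real-time packets are always sent first."""

    def __init__(self):
        self._heap = []
        self._sequence = count()  # keeps arrival order within one class

    def enqueue(self, traffic_type, packet):
        entry = (PRIORITY[traffic_type], next(self._sequence), traffic_type, packet)
        heapq.heappush(self._heap, entry)

    def dequeue(self):
        _, _, traffic_type, packet = heapq.heappop(self._heap)
        return traffic_type, packet

    def __len__(self):
        return len(self._heap)
```

If a data packet arrives first, followed by a video packet and then a voice packet, the scheduler still sends voice, then video, then data.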

John Bartlett also highlights the problem of signaling protocol incompatibilities and points out that Cisco and HP are destroying the balance created by standards (H.323, SIP) and interoperability in the video industry (Polycom, Tandberg, etc.). I think that HP Halo is less of a problem because it is just a single product/service that HP offers in the communication space. In my humble opinion, the desire to communicate with everybody else and the inherent disadvantages of using gateways will eventually drive HP to the standards camp. I am, however, concerned about Cisco because they have consistently tried to move customers away from standards and towards proprietary implementations, i.e. Call Manager. Starting with the acquisition of Selsius Systems in 1998, they have been expanding their Call Manager ecosystem to include proprietary telephones and recently proprietary video endpoints that can only talk to Call Manager and to nothing else. I have already discussed the issue with proprietary implementations in my posting about collaboration tools for education: they create an island in the communications market that cannot communicate with the rest. If you ask Cisco about standards compliance, they will tell you that they have gateways between Call Manager and H.323 or between Call Manager and SIP. They have been talking about their ‘interoperability’ (sometimes they use the word ‘compatibility’ but it does not really matter) since 1998. Nothing really came out of that. The gateways in question remain a demo for customers who need reassurance before investing in Cisco. Once customers install Call Manager and the deployment reaches critical mass, they have no other choice but to buy Cisco proprietary gear all the way. I have always wondered how companies with dual-supplier purchasing policies or, by the same token, the federal government handle that.
After 10+ years of Call Manager, does anyone out there still believe that Cisco is interested in interoperability with anyone but itself?

Anyway … John Bartlett then writes about the issues that only apply to multi-screen telepresence systems (mismatches of color temperature, audio, screen layout, image ratio, and eye contact lines). These issues are real and I have to say there is no good solution for them. Standardization bodies such as IETF and ITU-T are great at defining networking protocols but have no experience standardizing camera and microphone positions, which is necessary to achieve true interoperability across multi-screen telepresence systems. There was a very heated discussion about that exact topic at the last IMTC meeting, and the issue is now getting visibility in the standards community.

I would be interested to get feedback from my fellow bloggers and followers on the issue of telepresence compatibility.

Monday, March 9, 2009

Summary of International SIP Conference, Paris, January 2009

I finally found time to post a summary of this year's (10th) International SIP Conference. I presented at the event last year and this year, and can say that there was a huge difference between the two events. The discussions last year were mostly around peer-to-peer architectures based on SIP. This year’s event was more of a reality check of what SIP accomplished and what promises remained unfulfilled. The SIP Conference reinforced my impression from other industry events in 2008 that standards in any communication field are under attack and that proprietary implementations are gaining momentum. This could be explained by the wave of additional applications that emerged in the last few years and that were not around when current standards were designed. And while it is possible to implement new applications using existing standards such as SIP, many developers and companies make the choice to develop proprietary technologies that better fit their particular application. Unfortunately, this approach, while optimizing application performance, leads to non-interoperable islands.

Back to the SIP Conference … The audience included representatives from the SIP research and development communities in Europe, North America, and Asia-Pacific. Two of the SIP creators, Henry Sinnreich and Henning Schulzrinne, presented, and I also enjoyed the keynote by Venky Krishnaswamy from Avaya and the presentation from my friend Ingvar Aaberg from Paradial.

Venky kicked off the event with a keynote about SIP history and present status. SIP was meant to be the protocol of convergence but got wide adoption in Voice over IP while having mixed success in non-VoIP applications such as instant messaging and presence. Venky focused on the changes in the way people communicate today and on the many alternatives to voice communication: SMS, chat, social networking, blogs, etc. Most interest today is therefore in using SIP for advanced services, video and collaboration, and Web2.0 apps. The bottom line is that SIP is now one of many protocols and has to coexist with all other standard and proprietary protocols out there.

Henning focused on the need for better interoperability across SIP implementations. Today, interoperability is the vendors’ responsibility, but interoperability problems hurt the entire SIP community. SIPit interoperability test events are great for new SIP devices in development but do not scale to cover all SIP devices and their frequent software updates. Henning argued that an online SIP interoperability test tool is required to automate the test process. He suggested starting with tests of simple functions such as the registration call flow, codec negotiation, and signaling delay measurement, and then expanding the tests to more complex functions, e.g. security.

Henry now works for Adobe and is trying to persuade their application guys (Flash, AIR, and Connect) to use SIP for collaboration applications. Since Adobe limits software size to assure fast downloads, the SIP client must have a small footprint, and Henry’s message to the Conference was about the need for a simplified SIP specification. There are currently about 140 SIP RFCs (Request for Comments, or RFC, is what standards are called in the IETF). While this complexity is business as usual for telecom vendors, it seems to be too much for software companies such as Adobe, Google and Yahoo. Henry suggested focusing on the endpoint / user agent functionality - since only the endpoint knows what the user wants to do - and combining the ten most important RFCs into a base SIP specification: draft-sinnreich-sip-tools.txt. Henry is also very interested in cloud computing, which allows reducing the number of servers.

Thinking back, the complexity discussion comes up every time a new set of companies enters the communications market. I lived through two waves of complexity discussions. The first one was when VoIP emerged and new VoIP companies criticized legacy PBX vendors for the complexity of their protocols and the hundreds of obscure features that they support in PBXs. Later, there was a complexity discussion – if not outright war – between companies pioneering SIP and companies using H.323. At the time, SIP was just a couple of RFCs and H.323 was this big binder including several specifications and all sorts of annexes. So the SIP proponents called for simplicity, and argued that SIP had to replace H.323 and make everything simpler. Now that SIP has reached 140 RFCs, the argument comes from the proprietary camp that SIP is too complex. I think it is important to put these things into perspective. Nevertheless, I really hope that Henry’s effort succeeds in the IETF, and I am looking forward to meeting him at the 74th IETF Meeting in San Francisco, March 22-27.

Ingvar talked about the tradeoff between SIP accessibility and security. The ICE firewall traversal standard is emerging but there are still interoperability issues. ICE does dynamic end-to-end probing and deploys STUN and TURN when necessary. What are the ICE alternatives? A VPN creates a private network, is complex to deploy, and, since all traffic is relayed, consumes a lot of bandwidth. Session Border Controllers have connectivity and QOS issues. HTTP tunneling has serious QOS issues. So I guess there are no real alternatives to ICE, then.
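One way to see how ICE uses STUN and TURN only when necessary is to look at how it ranks candidate addresses before probing them. The sketch below uses the priority formula and the recommended type preference values from the ICE specification (published as RFC 5245); the function names and the default arguments are my own choices for illustration. Host candidates are local addresses, server-reflexive candidates are discovered via STUN, and relayed candidates are allocated on a TURN server.

```python
# Recommended type preferences from the ICE specification (RFC 5245).
TYPE_PREFERENCE = {
    "host": 126,   # local address, direct path
    "prflx": 110,  # peer reflexive, learned during connectivity checks
    "srflx": 100,  # server reflexive, learned via STUN
    "relay": 0,    # relayed through a TURN server, last resort
}


def candidate_priority(candidate_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    type_preference = TYPE_PREFERENCE[candidate_type]
    return (type_preference << 24) + (local_preference << 8) + (256 - component_id)


def order_candidates(candidate_types):
    """Sort candidate types so the most preferred one is tried first."""
    return sorted(candidate_types, key=candidate_priority, reverse=True)
```

With these values a direct host path always ranks above a STUN-derived address, and a TURN relay is only preferred when nothing else is available, which keeps relayed (bandwidth-hungry) paths to a minimum.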

My presentation ‘SIP as the Glue for Visual Communications in UC’ was about applications and key characteristics of visual communications and the need to integrate them with VoIP, IM and presence applications. I focused on the integration of Polycom video systems in Microsoft, IBM, Alcatel and Nortel environments.

Thursday, March 5, 2009

Collaboration tools for education

Last week, a discussion about collaboration tools for education started in the Megaconference distribution list, and I felt compelled to jump in. Here is a summary of the key points I made - this is for the people who are not on the Megaconference distribution list.

Hundreds of collaboration tools are flooding the market, and usually offer some form of video support. Unfortunately, new market entrants do not seem to care about standards - they support neither H.323 nor the alternative SIP standard, both of which are very well established in the communication and collaboration world. Usually when new entrants introduce proprietary products, they justify it with 'immaturity of standards' and 'time to market', i.e. 'if I had waited for a standard to come out, I would have been late'. These excuses do not work in this particular space because both H.323 and SIP have been around for quite a while and give developers options to implement simpler or more complex collaboration scenarios. H.323 and SIP are also international standards - H.323 is defined by ITU, SIP by IETF - and are therefore supported around the world. The sole purpose of introducing proprietary products is therefore creating islands in the communication and collaboration world. This is a bad idea for any type of organization but is especially bad for education which thrives on exchanging ideas across countries and continents.

There are several options for collaboration software that is standards-compliant. RadVision has a product called Scopia Desktop; the soft client is downloaded from RadVision’s Scopia conference server (also called video bridge or MCU) and all video and content sharing goes through the Scopia conference server. While the soft client is free, organizations have to buy more of the Scopia conference servers to support desktop video deployments. All calls, including point-to-point calls, go through the server. Point-to-point calls in the IP world usually go directly between client A and client B to assure the highest possible quality. I have been thinking quite a lot about RadVision’s approach, and the only explanation for it I can come up with is that they just do not know how to sell software and feel comfortable selling more hardware (Scopia servers) to make up for the free clients. Anyway, the business model is very strange, and I really do not want my point-to-point calls to go through any conference server, create traffic loops in the network, and decrease my video quality. (Later in 2009, RadVision released Scopia V7 which supports direct point-to-point calls between clients without going through the Scopia conference server.)

As already discussed in my previous blog posting, Vidyo is trying something new with their SVC technology (the SVC technology itself is not new; it has been around for many years but nobody implemented it). Although SVC is now an annex to the H.264 standard, this does not change the fact that SVC is not compatible with standard H.323 networks (neither is it compatible with SIP). Therefore, connecting a Vidyo/SVC system to an H.323 or SIP network requires media gateways, and media gateways do two things: introduce delay and decrease video quality.

Polycom CMA Desktop and Tandberg Movi are standards-compliant collaboration products which differ from RadVision's because they allow endpoints to communicate directly, rather than always calling through a conference server. Neither product requires media gateways between desktop video and video conferencing (telepresence) rooms. Both vendors charge for soft client licenses.

In terms of technology, though, Tandberg and Polycom went in different directions. Tandberg decided to base Movi on SIP and leave its video conferencing equipment on H.323. The result is signaling gateways (VCS), and the issue with any signaling gateway is scalability. The scalability impact is not as bad as with media gateways, but supporting two protocols simultaneously still decreases scalability, maybe by a factor of 10, e.g. if you can scale to 50,000 users on a single-protocol server, you will scale to 5,000 if the server supports two protocols and has to translate between them. I wrote about this issue on page 3 of my white paper 'Scalable Infrastructure for Distributed Video' (see link in the section ‘White papers and articles’ below).

The Polycom approach is to stick to the H.323 protocol, so that CMA Desktop clients can connect to H.323 endpoints (new and old), to H.323 conference servers (new and old), and support features like H.239 content sharing, continuous presence, etc. Due to its single protocol architecture, CMA can scale and provide the highest audio and video quality between video endpoints and CMA Desktop soft clients.

Finally, a common problem for education is that collaboration tools like the ones discussed above do not run on Mac OS and only support Windows. I participate in a lot of events in education, and the number of Mac users seems to have exploded. However, the Mac is still not getting much penetration in corporations and government (also big video users), and vendors have a tough time gauging the importance of the requirement. Thanks to distribution lists like Megaconference, we get feedback from education organizations and improve product capabilities.

Scalable video conference servers

A lot of the discussion in the video industry these days is around video conference servers (also known as bridges and MCUs). With the advances of video communication technology and the deployment of HD video the load on video conference servers is growing because they have to process more bits per second to support HD video calls. In addition, desktop video deployments rapidly increase the size of video networks.

Fundamentally, there are three ways to make video conference servers more scalable. The first one is to build a large server, use carrier-grade architecture and try to squeeze as much computing power as possible into a large chassis. This is the approach Tandberg is taking with the MSE 8000. The benefit of such an approach is that it is easy to explain to resellers, integrators and customers: ‘you had a small server, now you are running out of resources, so get a big one’. The disadvantage is that the server becomes an extremely critical single point of failure; if the server is down, or its part of the IP network is down, the entire video service is impacted. There is also the cost aspect - buying such a large server takes a considerable chunk of money – but I am looking at it from a network design perspective and can only say that it is impossible to find an optimum location for such a server in the network. Enterprise, government, education and health networks are all so distributed these days that placing the server in any one location leads to inefficient use of the network bandwidth and decreased quality for participants from other locations.

The second approach to scalability is to build a conference server sufficient for mid-sized video deployments and create a new architecture that allows you to combine many such conference servers into one pool of conferencing resources – to meet the needs of large organizations. You can increase this pool by adding conference servers and decrease it by removing them. The management server that manages all resources reroutes video calls to the most appropriate resource in the entire network. You can make the selection algorithm as sophisticated as you want, e.g. the algorithm may select the conference server that is closest to the majority of participants, or select the server that has the horsepower to support the quality that the participants require for that particular call. The benefit of this architecture is that you can spread the conference servers across your networks – thus avoiding bottlenecks and congestion - and still manage all servers as one giant virtual conference server. This is Polycom’s architecture: the conference server is RMX 2000; the resource management server is DMA 7000. Networking experts understand the high reliability and survivability of this approach – distributed computing and load balancing have been the preferred way to achieve scalability of applications for a long time. The real challenge with this approach is to educate traditional video equipment resellers and integrators who look at the conference server as ‘the bridge’ (i.e. one box) and not as a service that can and must be distributed across the network to scale.
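The selection logic described above could look roughly like the following sketch. It is not the DMA 7000 algorithm; the `ConferenceServer` fields, the site names and the `select_server` function are hypothetical and only illustrate the two criteria mentioned in the text: enough free capacity for the call, and proximity to the majority of participants.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ConferenceServer:
    name: str
    site: str        # where the server sits in the network
    free_ports: int  # remaining capacity, one port per participant (assumption)


def select_server(servers, participant_sites):
    """Pick a server with enough capacity that is local to the most participants.

    Ties are broken in favor of the server with more free ports.
    Returns None when no server in the pool can host the call.
    """
    ports_needed = len(participant_sites)
    eligible = [s for s in servers if s.free_ports >= ports_needed]
    if not eligible:
        return None
    participants_per_site = Counter(participant_sites)
    return max(eligible, key=lambda s: (participants_per_site[s.site], s.free_ports))
```

Adding a server to the pool is then just appending another `ConferenceServer` to the list, and a server that is removed or out of capacity simply stops being eligible, so the rest of the pool keeps serving calls.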

The third approach is to completely change the distribution of computing power between endpoints and conference servers, i.e. move more of the computing to the video endpoints (requires more powerful/expensive hardware for endpoints) and reduce the computing power in the conference server. This is the approach which the startup Vidyo is taking. Simplifying the conference server is a great idea but it remains to be seen if this benefit can outweigh the need for more performance in the endpoints. More importantly, this approach is incompatible with the installed base of video equipment and requires signaling and media gateways for interoperability. Gateways – and especially media gateways – introduce delays, decrease video and audio quality, and add substantial cost to the solution.

You can find more details about the scalability mechanisms discussed in this posting, as well as diagrams explaining the configurations, in the white paper ‘Scalable Infrastructure for Distributed Video’ (see link in the ‘White Papers and Articles’ section below on this blog). I will also address this subject in my presentation ‘Visual Communication – Believe the hype and prepare for the impact’ at InfoComm (see link in the ‘Speaking Engagements’ section below).