Chapter 8. Protocols for VoIP

The Internet is a telephone system that’s gotten uppity.

Clifford Stoll

The telecommunications industry has more than 100 years of history behind it, and Asterisk integrates most, if not all, of the major technologies the industry has used over that time. To make the most of Asterisk, you need not be an expert in every area, but understanding the differences between the various codecs and protocols will give you a greater appreciation and understanding of the system as a whole.

This chapter explains Voice over IP and what makes VoIP networks different from the traditional circuit-switched voice networks that were the topic of the last chapter. We will explore the need for VoIP protocols, outlining the history and potential future of each. We’ll also look at security considerations and these protocols’ abilities to work within topologies such as Network Address Translation (NAT). The following VoIP protocols will be discussed (some more briefly than others):

  • IAX

  • SIP

  • H.323

  • MGCP

  • Skinny/SCCP

  • UNISTIM

Codecs are the means by which analog voice can be converted to a digital signal and carried across the Internet. Bandwidth at any location is finite, and the number of simultaneous conversations any particular connection can carry is directly related to the type of codec implemented. In this chapter, we’ll also explore the differences between the following codecs with regard to bandwidth requirements (compression level) and quality:

  • G.711

  • G.726

  • G.729A

  • GSM

  • iLBC

  • Speex

  • MP3
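The relationship between codec choice and the number of simultaneous conversations a link can carry can be sketched with some simple arithmetic. The example below is an illustration only, under stated assumptions: it uses the nominal bit rates of four of the codecs listed above, assumes 20-millisecond packets, and counts only IPv4, UDP, and RTP headers (no link-layer framing). The function names are hypothetical and are not part of Asterisk. iLBC, Speex, and MP3 are left out because their bit rates depend on the mode or settings in use.

```python
# Nominal codec bit rates in kbit/s (payload only, before packet overhead).
CODEC_KBPS = {
    "G.711": 64.0,
    "G.726": 32.0,   # G.726 also defines 16, 24 and 40 kbit/s modes
    "G.729A": 8.0,
    "GSM": 13.0,
}

# Assumption: IPv4 (20) + UDP (8) + RTP (12) headers, no link-layer framing.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def per_call_kbps(codec, packet_ms=20):
    """One-direction IP bandwidth for a single call, in kbit/s."""
    payload_bytes = CODEC_KBPS[codec] * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8 * packets_per_second / 1000


def max_calls(link_kbps, codec, packet_ms=20):
    """How many simultaneous calls fit in one direction of a link."""
    return int(link_kbps // per_call_kbps(codec, packet_ms))


if __name__ == "__main__":
    for name in CODEC_KBPS:
        print(name, per_call_kbps(name), "kbit/s per call,",
              max_calls(1536, name), "calls on a 1,536 kbit/s link")
```

Under these assumptions, the fixed 40 bytes of headers per packet matter far more for a low-bit-rate codec than for G.711, which is why the per-call figure does not shrink in proportion to the codec's bit rate.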

We will then conclude the chapter with a discussion of how voice traffic can be routed reliably, what causes echo and how to deal with it, and how Asterisk controls the authentication of inbound and outbound calls.

The Need for VoIP Protocols

The basic premise of VoIP is the packetization[101] of audio streams for transport over Internet Protocol-based networks. The challenges to accomplishing this relate to the manner in which humans communicate. Not only must the signal arrive in essentially the same form in which it was transmitted, but it must do so in less than 150 milliseconds. If packets are lost or delayed, the quality of the communication degrades, and two people will have difficulty carrying on a conversation.
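Packetization itself can be shown in a few lines. The following is a minimal sketch, not how any particular VoIP stack implements it: it assumes 8 kHz audio with one byte per sample (as G.711 produces) and 20-millisecond packets, and the function name and dictionary keys are hypothetical. Each packet carries a sequence number and a timestamp counted in samples so that the far end can put the stream back in order.

```python
SAMPLE_RATE_HZ = 8000   # assumed: 8 kHz telephone-quality sampling
BYTES_PER_SAMPLE = 1    # assumed: one byte per sample, as in G.711
FRAME_MS = 20           # assumed packet duration


def packetize(audio: bytes, frame_ms: int = FRAME_MS):
    """Split a continuous audio byte stream into numbered packets."""
    samples_per_frame = SAMPLE_RATE_HZ * frame_ms // 1000
    frame_bytes = samples_per_frame * BYTES_PER_SAMPLE
    packets = []
    for seq, start in enumerate(range(0, len(audio), frame_bytes)):
        packets.append({
            "seq": seq,                            # position in the stream
            "timestamp": seq * samples_per_frame,  # counted in samples
            "payload": audio[start:start + frame_bytes],
        })
    return packets


if __name__ == "__main__":
    one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)  # one second of zero-valued samples
    chunks = packetize(one_second)
    print(len(chunks), "packets of", len(chunks[0]["payload"]), "bytes")
```

With these assumed values, one second of audio becomes 50 packets of 160 bytes each, and each of those packets travels across the network independently of the others.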

The transport protocols that collectively are called “the Internet” were not originally designed with real-time streaming of media in mind. Endpoints were expected to resolve missing packets by waiting longer for them to arrive, requesting retransmission, or, in some cases, considering the information to be gone for good and simply carrying on without it. In a typical voice conversation, these mechanisms will not serve. Our conversations do not adapt well to the loss of letters or words, nor to any appreciable delay between transmittal and receipt.
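To see why waiting or retransmitting does not help a voice stream, consider a receiver that must play each packet within a fixed delay budget. The sketch below is a deliberately simplified illustration, assuming the 150-millisecond figure mentioned earlier as the budget and a hypothetical tuple layout for packets. It is not a real jitter buffer: any packet that arrives too late is treated exactly like one that never arrived.

```python
def play_out(packets, frame_count, budget_ms=150):
    """Return one entry per frame: the payload, or None where it was lost or late.

    Each packet is a (seq, sent_ms, arrived_ms, payload) tuple (hypothetical layout).
    """
    slots = [None] * frame_count
    for seq, sent_ms, arrived_ms, payload in packets:
        on_time = arrived_ms - sent_ms < budget_ms
        if on_time and 0 <= seq < frame_count:
            slots[seq] = payload   # arrival order does not matter, only the deadline
    return slots


if __name__ == "__main__":
    received = [
        (0, 0, 40, b"hel"),
        (2, 40, 70, b"wor"),    # arrives before packet 1, still usable
        (1, 20, 400, b"lo "),   # 380 ms in transit: too late to play
    ]
    print(play_out(received, frame_count=3))
```

In this example the second frame comes out as a gap even though its packet eventually arrived, which is the sense in which a delayed packet is as bad as a lost one for a live conversation.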

The traditional PSTN was designed specifically for the purpose of voice transmission, and it is perfectly suited to the task from a technical standpoint. From a flexibility standpoint, however, its flaws are obvious even to people with a very limited understanding of the technology. VoIP holds the promise of incorporating voice communications into all of the other protocols we carry on our networks, but due to the special demands of a voice conversation, special skills are needed to design, build, and maintain these networks.

The problem with packet-based voice transmission stems from the fact that the way in which we speak is totally incompatible with the way in which IP transports data. Speaking and listening consist of the relaying of a stream of audio, whereas the Internet protocols are designed to chop everything up, encapsulate the bits of information into thousands of packages, and then deliver each package in whatever way possible to the far end. Clearly, some way of dealing with this is required.



[101] This word hasn’t quite made it into the dictionary, but it is a term that is becoming more and more common. It refers to the process of chopping a steady stream of information into discrete chunks (or packets), suitable for delivery independently of one another.
