Next-Gen VoIP Services and Applications Using SIP and Java

This Guide has been sponsored by

Guide Series


Table of Contents

Abstract

Introduction

Architecture Models

Technology Enablers for Next Generation Voice Services and Applications

Next Generation IP Voice Services and Applications

Summary

Glossary

Appendix A: Session Initiation Protocol (SIP) Concepts and Operation

Editorial Writing Team

ATG’s Technology Guides and White Papers are produced according to a structured methodology and proven process. Our editorial writing team has years of experience in IT and communications technologies, and is highly conversant in today’s emerging technologies.

The Guide format and main text of this Guide are the property of The Applied Technologies Group, Inc. and are made available upon these terms and conditions. The Applied Technologies Group reserves all rights herein. Reproduction in whole or in part of the main text is only permitted with the written consent of The Applied Technologies Group. The main text shall be treated at all times as a proprietary document for internal use only. The main text may not be duplicated in any way, except in the form of brief excerpts or quotations for the purpose of review. In addition, the information contained herein may not be duplicated in other books, databases or any other medium. Making copies of this Guide, or any portion for any purpose other than your own, is a violation of United States Copyright Laws. The information contained in this Guide is believed to be reliable but cannot be guaranteed to be complete or correct. Any case studies or glossaries contained in this Guide or any Guide are excluded from this copyright.

Copyright © 2001 by The Applied Technologies Group, Inc. 209 West Central Street, Suite 301, Natick, MA 01760 Tel: (508) 651-1155, Fax: (508) 651-1171

E-mail: info@techguide.com, Web site: http://www.techguide.com

Visit our Web site, techguide.com, to read, download, and print all the Technology Guides in this series.


caller ID, etc.), cannot provide the types of features that are needed by a contemporary business in the age of e-commerce. The traditional business telephony solutions are complicated, for both the service administrators and the users. Because of the daunting complexity of PBX and CLASS/Centrex user-interfaces, users typically know and use only a fraction of the total feature set.

Now imagine telephony services in the context of the current business need. The users would still like to use a phone for making and receiving calls and playing voice-mail messages. However, they would also like to have the phone appliance integrated with a browser-based PC for managing phone books and seamlessly interfacing with other applications, such as customer relationship management (CRM), sales force automation (SFA), supply chain management (SCM), time accounting, etc. In other words, perform tasks most suitable for the PC on the PC and those most suitable for the telephone using a phone appliance and have the two devices seamlessly integrated.

Today’s telephone just cannot deal with this new business imperative.

In contrast, the Internet and Web-based communications have revolutionized the business environment and user personal life-styles by their inexpensive, standards-based innovations. We already have data, multimedia, video, and music applications on the Internet. The Internet is already serving as the underpinning of critical business and IT solutions. Just in the last few years alone the Internet and the Web have generated more innovations than traditional telephony has produced in its entire history. The next frontier for the Web is to apply the same degree of innovation to telephony.

Most market surveys have verified that IP telephony is already supplementing traditional telephony and it is expected that the IP telephony architecture will ultimately replace the traditional telephony model.

Abstract

This Technology Guide explains the unique benefits of using the Web architectural model with SIP and Java as the enabling technologies for next generation IP voice services and applications.

Using the Web as a reference model for rapid innovation, the Guide contrasts the limitations of circuit-switched telephony and first generation VoIP architectures with the Web model. It summarizes limitations of centralized-processing models such as traditional telephony, MGCP, and Megaco as compared to peer-to-peer models such as SIP and H.323.

This Technology Guide explains in more detail the unique benefits of using SIP for call control and Java for making phones intelligent. SIP is compared with H.323 in terms of innovation, scalability, simplicity, ease of deployment, and standardization. The guide also includes an explanation of SIP concepts and operation. A description of Java features supporting new voice-services and applications is also included.

The Guide concludes with examples of new voice-services and applications made possible exclusively by SIP and Java.

Introduction


Both models have all of their intelligence in a centralized switch or server, which performs all of the telephony functions such as call setup, call forwarding, conference calling, etc. All requests, responses, and state changes must be processed by the central switch/server with the end-station being a dumb terminal.

The following are the salient characteristics of the traditional telephony environment:

Archaic, Host-to-Dumb Terminal Architecture:

Voice service architecture has not changed for generations. Today, PBX and Centrex services are delivered using switches that contain all application intelligence — just as mainframes and minicomputers did for IBM 3270 or VT100 terminals in old computer systems.

Dumb Terminal (The Telephone): Voice service delivery assumes a dumb terminal, which in telephony parlance is the telephone. The end-

Figure 1B: First-generation IP telephony architectures (labels: "call manager", IP Centrex, Softswitch "gatekeeper", LAN PBX)

This Technology Guide explains the architecture of the new IP telephony model using Session Initiation Protocol (SIP) and Java. The Guide also demonstrates the power of SIP and Java in terms of scalability, ease of use, and innovative services and applications.

Architecture Models

Circuit-Switched and First-Generation IP Telephony Architectures

The traditional telephony architecture is based on a centralized processing model. First-generation IP telephony architecture uses the Media Gateway Control Protocol (MGCP), Megaco, or vendor-proprietary protocols such as Cisco’s Skinny Client Control Protocol (SCCP), which are also centralized architectures similar to traditional telephony.

Figure 1A: Traditional circuit-switched telephony architectures (labels: Centrex, CLASS 5 switch)

Web Architecture

The Web represents the most successful application architecture in history. The Web features many intelligent servers located everywhere on the network and an intelligent, browser-based client device (a PC or a low cost Internet appliance). It is the client device, not the server, that both initiates and controls all communications with the server. When a user simply clicks on an icon to access an application, the browser pulls content in the form of HTML and applications (Java, JavaScript, Flash, ActiveX, etc.) from the server and runs them on the PC.

There is a complete disaggregation of services in the Web model. Not only do the services come from different servers, they may be provided by different and multiple service providers. Some of the examples (shown in figure 2) include Yahoo for news; Amazon for shopping; MSN for instant messaging; ASP services (such as Corio) for customer relationship management (CRM), sales force automation (SFA), enterprise resource planning (ERP); and MP3.com for music. An enterprise can outsource as few or as many services as suits its business model.

Key characteristics of the Web architecture include:

• Intelligent end devices (clients)

• Distributed, intelligent servers (no central switch or server for services)

• An open architecture leading to innovation, rapid application development, and lower costs

user interface for these services on the dumb telephone requires non-intuitive flash sequences and star codes. No options exist for making telephony features easier to use and increasing user productivity.

Hardware-Specific Software: The voice features reside in software that is usually hardware-specific and/or proprietary. This environment requires highly specialized software engineers who are expensive and hard to find. Even simple software modifications require extensive regression testing of feature interaction.

• Limited Next-Generation Platforms: Next-generation voice service platforms still fall short of business needs. Most first-generation IP telephony systems, for both service providers and enterprises, do exploit IP for transport and some feature a Java or XML software environment. However, this “open” environment is not easily made extensible by anyone other than the vendor or possibly a service provider; certainly not the enterprise or an independent software vendor with a great idea. These systems, consequently, still perpetuate the same 1960s host-terminal architecture with a dumb telephone as the endpoint:

• The IP PBX is a host computer with all the smarts driving dumb IP phones.


voice-world are solely defined and developed by PBX and CLASS switch manufacturers, just as mainframe applications were defined by the vendors.

The PBX and CLASS switch vendors, their ideas, their bureaucratic practices, and their business motivations have held innovation in the voice-world hostage. Voice features reside in software on the switch that is hardware-specific and vendor-specific. It is a proprietary environment that is not openly extensible. Even modest new functions require the onerous regression testing of feature interaction.

The centralized, closed-software environment offers no way for enterprises to add their own innovations or enhancements to telephony features, let alone individual users or software developers with really good ideas. Some features are impossible to implement because of the dumb-telephone as the endpoint. Consequently, innovation is and will remain dead, especially when compared to the revolutions on the Web.

Web

Innovation on the Web occurs at the edges of the network, where anyone, business or individual, can create Web sites that are immediately open for other users to interact with. On the Web, in contrast to traditional telephony, a new page or “feature” can be created in a few minutes. More importantly, the Web page can be conceived, created, delivered, and personalized by anyone: Yahoo, eBay, GE, a company, an individual, their kids, or their grandparents. Several million Web sites are in existence today, up from a few thousand in 1993. These sites satisfy everyone’s personal and business needs for news, buying, entertainment, chat, sports, sex, etc. regardless of gender, race, religion, ethnic background, industry and occupation.

Amazon.com would not have happened if the world needed to rely on the data communications

Comparing the Architectures

The Web has revolutionized the world of business. It has enabled a whole new business paradigm in the form of e-business, portals, e-tailers, and collaborative applications. The Web has enabled businesses to reach business partners and customers worldwide with a click of the mouse. Telephony services must change dramatically to become a functional member of this business revolution. However, given their limitations, it is virtually impossible for the current telephony architectures to satisfy emerging requirements.

Innovation

Traditional and First-Generation IP Telephony

The telephone was invented more than 125 years ago. Since then it has enabled people to talk and do only a handful of other things, like use voice mail. All of the features and services in the

Figure 2: Web application architecture (labels: Intelligent servers: CRM/SFA, MP3.com, doubleclick.com, Virtualcart.com, MSN Instant Messenger, amazon.com, yahoo.com; Intelligent clients: MP3, Java, Flash)

browser’s graphical user interface means that users do not have to memorize features as in the world of telephony. The use of any Web site is an intuitive discovery process, performed simply by pointing and clicking at images and words.

Scalability and Capacity

Traditional and First Generation IP Telephony

In the telephony world, big centralized boxes have all the smarts. Whenever the telephone, the “terminal” in the parlance of telephone equipment vendors, sends a flash sequence or * code, it’s the PBX or CLASS switch that figures out what it means. The PBX or the switch also must actively manage each and every call. Consequently, it just does not scale. Support for just one more user may end up requiring a hugely expensive replacement or addition.

Web

A Web site, however, can support millions of users. Scalability is achieved partly through the connectionless nature of IP and by adding more and bigger servers to the Web site. It is also achieved by exploiting an intelligent endpoint: the browser-based PC. In fact, it’s the browser software that interprets Web objects and puts a Web page together.

For example, in accessing a typical e-commerce site, it’s the browser, not a server, that:

• Retrieves and displays the source HTML page and embedded product images individually

• Retrieves and runs a Java applet, JavaScript, Flash, ActiveX, or other application components

• Retrieves and displays a dynamic advertisement from DoubleClick.com

• Retrieves shopping cart services from ShoppingCart.com

vendors such as Alcatel, Cisco, Lucent, or Nortel to invent the “service” and add the features to a router or a switch.

Ease of Use

Traditional and First-Generation IP Telephony

For most telephone users, cryptic impossible-to-remember flash sequences and * codes are the interface to thousands of PBX and CLASS features. For the fortunate few with block character displays, even IBM 3270 and VT100 terminals appear attractive.

Users don’t know what voice features exist and, if they do, they do not know how to use them. While most voice service platforms such as PBX and CLASS switches offer hundreds or thousands of features (300-400 features in a typical PBX, 3000-4000 in a CLASS 5 switch), most users typically know no more than a few: transfer, hold, last number redial. In research conducted by WorldCom, 9 out of 10 executives could not even transfer a call without resorting to the “help scream”: “Do I dial ‘flash’ first and then the number, or the other way around?” Trying to set up just a 3-party conference call over a PBX is an even bigger nightmare. It’s no wonder that the assisted conference calling businesses of AT&T, Sprint and WorldCom are so big and profitable. For many, the most difficult part of changing jobs is learning a new phone system. “What do I dial to get an outside line?” Consequently, for the vast majority, ignorance is bliss, yet very expensive in user productivity.

Web

On the Web, millions of sites with billions, perhaps trillions, of pages can be easily navigated by pointing and clicking at pictures or words displayed on an intelligent, browser-based PC.


An enterprise has the option of providing PBX services locally through a premises-based system, or of outsourcing them to a network-based service. The outsourced service not only eliminates capital costs but may actually provide richer services than those available from a PBX.

The figure also shows some illustrative services such as unified messaging, presence messaging, instant messaging, and CRM integration, all of which can be provided by separate service providers offering best-of-breed solutions for an enterprise’s or even an individual user’s specific requirements.

PCs and other phones are simply resources on the network that provide services to users. In this model, the PC may provide services for the phone such as integration with the desktop applications or the phone may provide services for the PC such as causing the phone to ring and automating conference calls in Microsoft Outlook.

Figure 3: Web architecture for next-generation voice services and applications (labels: Intelligent servers: Audio, Auctions, IP PBX, PSTN gateways, CRM/SFA, Presence & IM, Unified Messaging, Hosted PBX service; Intelligent clients: Java, HTML, MP3, phone-to-phone data & app exchange, PC app integration)

• Stores cookies to identify users and maintain states

• Encrypts credit card numbers

Manageability

Traditional and First-Generation IP Telephony

An expert — the equivalent of the proverbial rocket scientist — must perform all maintenance and management tasks for the PBX or the switch. Tools for managing moves/adds/changes tend to be horrendous and, consequently, administrators learn only the basic coping skills. This makes it extremely costly to administer the switch. According to some estimates it can cost as much as $300-$500 per PBX move/add/change. For a Centrex line, it can take weeks for a change to be implemented by the telephone company.

Web

Self-service by users is the normal operative model here — for registration, buying things, personalizing info, etc.

Every office device, including printers, copiers, and now intelligent IP phones, has a built-in Web server that enables remote configuration over the net via a browser interface.

Every office device and home appliance is becoming more intelligent and capable of running automated diagnostics, reporting the findings, and ordering replacements before service is disrupted.

Exploiting the Web Architecture for Next Generation Voice Services and Applications

Phone Intelligence Technology

An ability to support small footprint applications is the key for incorporating intelligence in phones. A powerful yet easy to use programming language used widely for Web-enabling Internet appliances is required. In addition to rich functionality for traditional Web applications, features developed specifically for telephony and security are mandatory. Lastly, the language must already be used by hundreds of thousands of programmers worldwide in order for innovation to happen rapidly.

Extensible, Scalable Call Control Protocol

A call control protocol is used for call-related functions such as setting up, monitoring, and terminating calls. However, in the new IP telephony model, the call control protocol must differ from traditional telephony and the first generation IP telephony protocols. For maximum scalability, the new call control protocol must support peer-to-peer communications whereby two or more phones can set up and communicate directly without requiring anything more than location services from a call control server. In addition, the protocol must allow the peer-to-peer exchange of applications and data in addition to voice communications.
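The limited role of the call control server described above can be sketched in Java. This is a minimal illustration under stated assumptions, not part of any SIP stack: the class name `LocationService`, its methods, and the example addresses are hypothetical. A phone registers where it can currently be reached, and a caller asks the service only for that contact address before signaling the peer directly.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a location service that only maps a user's public
// address to the contact address where the phone can currently be reached.
// Call setup and media would then flow directly between the two phones.
final class LocationService {
    private final Map<String, String> bindings = new ConcurrentHashMap<>();

    // Record (or replace) the current contact address for a user.
    void register(String publicAddress, String contact) {
        bindings.put(publicAddress, contact);
    }

    // Remove the binding, for example when the phone goes offline.
    void unregister(String publicAddress) {
        bindings.remove(publicAddress);
    }

    // Look up where a user can be reached right now, if anywhere.
    Optional<String> locate(String publicAddress) {
        return Optional.ofNullable(bindings.get(publicAddress));
    }

    public static void main(String[] args) {
        LocationService service = new LocationService();
        service.register("sip:alice@abcorp.com", "sip:alice@192.0.2.10:5060");
        // The caller uses the returned contact to reach the peer directly.
        System.out.println(service.locate("sip:alice@abcorp.com").orElse("not registered"));
    }
}
```

Because the server holds nothing but these bindings, it does not need to track the state of any call, which is the property the scalability argument relies on.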

The call control protocol must support a wide range of environments — from home-office to the largest enterprise and from the smallest to the largest services provider. Thus, the protocol must be highly scalable as well as cost effective in a diverse range of configurations. Since it is not possible to predict all future applications of IP telephony, the protocol must also be extensible in order to accommodate unforeseen requirements.

Technology Enablers for Next Generation Voice Services and Applications

Clearly, while the model in figure 3 is quite pedestrian in the Web world, it is quite revolutionary in the context of traditional telephony. The components needed to implement this model for telephony are as follows:

Intelligent Servers

These are distributed resources that interact with intelligent clients (PCs and phones). In terms of hardware and software, these servers are standard Unix, Linux, and Microsoft Windows platforms. Compared to traditional PBXs, these servers offer choices of multiple vendors and competitive pricing with an open applications development environment.

Intelligent Phones

These phones should provide much more than incoming call ringing. In order to maintain their independence from a central switch, they must also provide local capabilities such as call hold, transfer, forwarding, redial, caller ID, multi-party conferencing, and many other traditional telephony features.

The intelligent phones should be thin-client computing devices that can interoperate with PCs and servers on the network. These devices must support dynamic loading and management of applications such as Java applets. For ease of use, they should incorporate functions such as


H.323, the older of the protocols, was originally designed for video conferencing over the LAN. Since then it has been adapted to support voice and video over the WAN as well. SIP, however, was designed from the beginning for multimedia sessions and conferences over the WAN. Because of these differences in their design objectives, SIP offers numerous compelling advantages in the areas of extensibility, scalability, and ease of deployment over H.323.

Today there are more products available supporting H.323 than SIP. However, since its introduction, SIP is rapidly becoming the preferred protocol. A January 2001 survey of Voice over IP vendors in Network World found that while 75% of the vendors offered products based on one of the four H.323 versions, an approximately equal number of them were already planning to offer SIP-based products by June 2001. However, the more telling statistic was that less than 25% of the vendors were planning to upgrade their products from H.323 Version 2 to Version 3 and even fewer to Version 4, the latest version of H.323. According to the same survey, most vendors expected H.323 to become a legacy protocol. In contrast, the list of vendors supporting or planning to support SIP is growing rapidly. Service providers embracing SIP include WorldCom, Level 3, Net2Phone, Telia, Webley, Ibasis, LipStream, and TalkingNets as of March 2001 with many more anticipated.

The reasons for the rapid ascendancy of SIP become obvious when we compare it with H.323 in the areas of innovation, scalability, ease of deployment, manageability, and the standardization process. Appendix A provides additional details on SIP concepts, definitions, and operation.

SIP (Session Initiation Protocol): The Call Control Protocol

SIP introduces the benefits of the Web architecture to IP telephony. It provides a powerful, extensible, scalable, and easy-to-deploy protocol for call control and media exchange.

Several standards are available for building IP telephony solutions. These include the Session Initiation Protocol (SIP) from the IETF; ITU-T H.323, an ITU-T umbrella standard; Media Gateway Control Protocol (MGCP) from IETF; Media Gateway Control (Megaco), a joint protocol by IETF and ITU-T; and proprietary protocols such as Cisco’s Skinny Client Control Protocol (SCCP). A high-level comparison of these protocols is included in table 1.

Table 1: IP telephony standards

| | SIP | H.323 | MGCP | Megaco | Proprietary |
|---|---|---|---|---|---|
| Architectural model | Peer-to-peer | Peer-to-peer | Master/slave | Master/slave | Master/slave |
| Media types | Voice, video, data | Voice, video, limited data | Voice | Voice, video | Voice |
| Network scope | Intra, Extra, and Internet | Intra, Extra, and Internet | Intranet only | Intranet only | Intranet only |
| Extensibility | High | Low | Medium | Medium | Low |
| Scalability | High | Medium | Low | Low | Low |
| Ease of deployment | High | Low | Medium | Medium | Medium |
| Standardization | IETF | ITU-T | IETF | IETF and ITU-T | None |

Why SIP


protocols within H.323. These include Registration, Admission and Status (RAS), Q.931 for call control, and H.245 for transmission of non-telephony signals on the line. As shown in the tables, SIP has a total of 5 methods (commands) and 8 responses and H.323 has 21 commands/messages across the three protocols. SIP can be implemented as a stateless protocol and does not need to maintain any call states, which further increases scalability of SIP. SIP also shows a substantially higher efficiency than H.323 during call set-up by using approximately 50% fewer messages. Figures 4 and 5 show call set-up messages for SIP and H.323, respectively. While H.323 requires a total of 13 message exchanges, SIP requires only 7 exchanges.

SIP Methods and Response Codes

Table 2: SIP methods

| SIP method | Description |
|---|---|
| INVITE | User or service is being invited to participate in a session. |
| ACK | Client has received a final response to an INVITE request. |
| OPTIONS | Server being queried about capabilities. |
| BYE | User agent client indicates to server to release the call. |
| CANCEL | Cancels a pending request. |
| REGISTER | Client registers address with a SIP server. |

Table 3: SIP response codes

| SIP response code | Meaning |
|---|---|
| 1xx | Informational: Request received, continuing to process request. |
| 2xx | Success: Action successfully received, understood and accepted. |
| 3xx | Redirection: Further action required to complete request. |
| 4xx | Client Error: Request contains bad syntax or cannot be executed at server. |
| 5xx | Server Error: Server failed to execute an apparently valid request. |
| 6xx | Global Failure: Request cannot be executed at any server. |
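The response classes in Table 3 are determined by the first digit of the status code, which a short Java sketch can make concrete. The class and method names below are hypothetical and chosen only for illustration; the class labels are taken from Table 3.

```java
// Hypothetical helper: maps a SIP status code to the response class
// listed in Table 3, using the first digit of the three-digit code.
final class SipResponseClass {
    private static final String[] NAMES = {
        "Informational", "Success", "Redirection",
        "Client Error", "Server Error", "Global Failure"
    };

    static String classify(int statusCode) {
        if (statusCode < 100 || statusCode > 699) {
            throw new IllegalArgumentException("Not a SIP status code: " + statusCode);
        }
        return NAMES[statusCode / 100 - 1];
    }

    // 1xx responses are provisional; every other class is a final response.
    static boolean isFinal(int statusCode) {
        return !classify(statusCode).equals("Informational");
    }

    public static void main(String[] args) {
        int[] samples = {180, 200, 404};
        for (int code : samples) {
            System.out.println(code + " is " + classify(code));
        }
    }
}
```

A client only needs this one-digit test to decide how to react to a code it has never seen before, which is part of what makes the scheme easy to extend.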

Innovation

SIP enables new services and applications not possible with H.323 (or other IP telephony protocols) and easily empowers service providers, application developers, and enterprises to create unique, differentiated services and applications. For example, SIP uses a simple text-based encapsulation (based on the Internet standard MIME) which enables it to transmit data and application programs with the voice call, making it easy to send business cards, photos, and/or MP3 encoded information during a call.
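To make the text-based encapsulation concrete, the following Java sketch assembles a simplified INVITE that carries a MIME-typed body. It is an illustration under stated assumptions rather than a compliant implementation: the class name, the Call-ID value, the addresses, and the `text/x-vcard` content type are all hypothetical choices, and several headers a real request needs (for example Via and Contact) are omitted for brevity.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: builds a simplified, text-based SIP INVITE that carries
// a MIME-typed body. Headers a real request would also need are left out.
final class SipInviteSketch {
    static String buildInvite(String from, String to, String callId,
                              String contentType, String body) {
        // Content-Length counts the bytes of the body, not its characters.
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        StringBuilder sb = new StringBuilder();
        sb.append("INVITE ").append(to).append(" SIP/2.0\r\n");
        sb.append("From: <").append(from).append(">\r\n");
        sb.append("To: <").append(to).append(">\r\n");
        sb.append("Call-ID: ").append(callId).append("\r\n");
        sb.append("CSeq: 1 INVITE\r\n");
        sb.append("Content-Type: ").append(contentType).append("\r\n");
        sb.append("Content-Length: ").append(length).append("\r\n");
        sb.append("\r\n"); // a blank line separates the headers from the body
        sb.append(body);
        return sb.toString();
    }

    public static void main(String[] args) {
        // A business card is used here purely as an example payload.
        String card = "BEGIN:VCARD\r\nFN:Alice Example\r\nEND:VCARD\r\n";
        System.out.print(buildInvite("sip:alice@abcorp.com", "sip:bob@abcorp.com",
                "a84b4c76e66710", "text/x-vcard", card));
    }
}
```

Because the message is plain text with a typed body, swapping the business card for a photo or another payload changes only the content type and the body, not the surrounding protocol.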

SIP also supports third-party call control through simple applications that modify SIP messages and enable functions such as sending office calls to a home phone after 5:00 PM or forwarding video calls to a PC. Lastly, SIP envisions the need to accommodate extensions (new protocol headers, methods, bodies, and parameters) to implement new and innovative applications. By design, not all products are required to support these extensions, just the endpoints (servers or phones) that want to use them.
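The after-hours example above amounts to a small routing rule, sketched here in Java. The class name, the method, and the 9:00 AM opening time are assumptions made for illustration; only the 5:00 PM cutoff comes from the text. A real application would apply the chosen target by rewriting the destination of the SIP request, which is not shown.

```java
import java.time.LocalTime;

// Hypothetical sketch of a time-of-day forwarding rule: a small application
// that picks the target address for an incoming call.
final class AfterHoursForwarding {
    private static final LocalTime OFFICE_OPEN = LocalTime.of(9, 0);   // assumed opening time
    private static final LocalTime OFFICE_CLOSE = LocalTime.of(17, 0); // 5:00 PM, from the text

    // Returns the office address during office hours, the home address otherwise.
    static String chooseTarget(LocalTime now, String officeUri, String homeUri) {
        boolean officeHours = !now.isBefore(OFFICE_OPEN) && now.isBefore(OFFICE_CLOSE);
        return officeHours ? officeUri : homeUri;
    }

    public static void main(String[] args) {
        String office = "sip:alice@abcorp.com";
        String home = "sip:alice@home.example";
        System.out.println(chooseTarget(LocalTime.of(10, 15), office, home));
        System.out.println(chooseTarget(LocalTime.of(18, 30), office, home));
    }
}
```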

Scalability

Being peer-to-peer protocols, both SIP and H.323 eliminate the need for central servers to control everything. Peer-to-peer protocols reduce costs of network and server infrastructure equipment necessary to support a user population of a given size.

Table 6: H.323/H.245 commands and responses

| H.245 Command/Message | Function |
|---|---|
| Master-Slave Determination | Determines which terminal is the master and which is the slave. Possible replies: Acknowledge, Reject, Release (in case of a time out). |
| Terminal Capability Set | Contains information about a terminal’s capability to transmit and receive multimedia streams. Possible replies: Acknowledge, Reject, Release. |
| Open Logical Channel | Opens a logical channel for transport of audiovisual and data information. Possible replies: Acknowledge, Reject, Confirm. |
| Close Logical Channel | Closes a logical channel between two endpoints. Possible replies: Acknowledge. |
| Request Mode | Used by a receive terminal to request particular modes of transmission from a transmit terminal. General mode types include VideoMode, AudioMode, DataMode, and EncryptionMode. Possible replies: Acknowledge, Reject, Release. |
| Send Terminal Capability Set | Commands the far-end terminal to indicate its transmit and receive capabilities by sending one or more Terminal Capability Sets. |
| End Session Command | Indicates the end of the H.245 session. After transmission, the terminal will not send any more H.245 messages. |

Ease of Deployment

Deploying and supporting SIP is similar to HTTP. It uses standard protocols and functions, which already exist in the current IP networks and are well understood by system administrators and technical support personnel. SIP has the following HTTP characteristics:

Standard Internet addressing: SIP uses a standard Internet (URL-style) addressing format for both names and addresses, e.g., sip:username@abcorp.com or sip:1.781.938.5306@abcorp.com

Clear text protocol: SIP uses clear text for its protocol encapsulation, unlike H.323, which uses binary encoding. This makes SIP easier to diagnose and troubleshoot.

H.323 Commands/Messages

Table 4: H.323 RAS commands and responses

| RAS Command/Message | Function |
|---|---|
| RegistrationRequest (RRQ) | Request from a terminal or gateway to register with a gatekeeper. Gatekeeper either confirms or rejects (RCF or RRJ). |
| AdmissionRequest (ARQ) | Request for access to packet network from terminal to gatekeeper. Gatekeeper either confirms or rejects (ACF or ARJ). |
| BandwidthRequest (BRQ) | Request for changed bandwidth allocation, from terminal to gatekeeper. Gatekeeper either confirms or rejects (BCF or BRJ). |
| DisengageRequest (DRQ) | If sent from endpoint to gatekeeper, DRQ informs gatekeeper that endpoint is being dropped; if sent from gatekeeper to endpoint, DRQ forces the call to be dropped. Gatekeeper either confirms or rejects (DCF or DRJ). If DRQ sent by gatekeeper, endpoint must reply with DCF. |
| InfoRequest (IRQ) | Request for status information from gatekeeper to terminal. |
| InfoRequestResponse (IRR) | Response to IRQ. May be sent unsolicited by terminal to gatekeeper at predetermined intervals. |
| RAS Timers and Request in Progress (RIP) | Recommended default timeout values for response to RAS messages and subsequent retry counts if response is not received. |

Table 5: H.323/Q.931 commands and responses

| Q.931 Command/Message | Function |
|---|---|
| Alerting | Called user has been alerted (“phone is ringing”). Sent by called user. |
| Call Proceeding | Requested call establishment has been initiated and no more call establishment information will be accepted. Sent by called user. |
| Connect | Acceptance of call by called entity. Sent from called entity to calling entity. |
| Setup | Indicates a calling H.323 entity’s desire to set up a connection to the called entity. |
| Release Complete | Indicates release of call if H.225.0 (Q.931) call signaling channel is open. Afterwards, call reference value can be reused. Sent by a terminal. |
| Status | Responds to an unknown call signaling message or to a Status Inquiry message. Provides call state information. |

Standardization

The ITU-T, organized under the auspices of the United Nations, defines traditional telephony and H.323 standards. It is a slow moving body with a highly political process. Participation in ITU-T activities is limited to paid members. Most of

Figure 5: H.323 Call set-up sequence (13 numbered exchanges among Endpoint 1, Gatekeeper, and Endpoint 2 using RAS, Q.931, and H.245: Admission Request/Confirm, Setup, Call Proceeding, Alerting, Connect, Terminal Capability Set, Master/Slave Determination, Open Logical Channel, Media (RTP), Close Logical Channel, End Session Command, Release Complete, Disengage Request/Confirm)

Simple error messages: SIP uses familiar HTTP-style numeric response codes, grouped by prefix (1xx, 2xx, etc.).

Leverages other Internet protocols: SIP uses other familiar Internet protocols such as MIME and Session Description Protocol (SDP), again eliminating the need for new technical training or expertise.
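The numeric prefixes of SIP responses can be read as response classes, much as in HTTP. The following is a minimal sketch of that grouping, not part of any SIP library; the class name `SipResponseClass` and its method are hypothetical names chosen for illustration.

```java
/** Illustrative helper (hypothetical name): maps a SIP status code to its response class. */
final class SipResponseClass {
    private SipResponseClass() {}

    /** Returns the class of a status code based on its first digit (1xx to 6xx). */
    static String of(int statusCode) {
        switch (statusCode / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            case 6: return "Global Failure";
            default:
                throw new IllegalArgumentException("Not a SIP status code: " + statusCode);
        }
    }

    public static void main(String[] args) {
        // Sample codes: 100 Trying, 200 OK, 302 Moved Temporarily, 404 Not Found.
        int[] samples = {100, 200, 302, 404};
        for (int code : samples) {
            System.out.println(code + " -> " + of(code));
        }
    }
}
```

A phone application could use such a grouping to decide, for instance, whether to keep waiting (1xx), start the media session (2xx), or retry at a new address (3xx).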

Figure 4: SIP Operation in Proxy Mode (diagram labels: Site 1, Endpoint 1@Site 1, Site 2, Location)


ITU-T documents are written in very dense language, which makes it virtually impossible for the uninitiated to fathom their intent. Most ITU-T standards tend to be very complex. For example, the H.323 specification with its co-requisite protocols runs some 700 pages, compared to about 150 pages for SIP. The ITU-T specifications are not freely available and have to be purchased. As of February 2001, you could not even buy the H.323 specifications from the ITU-T bookstore because the ITU-T still had not made them available for purchase.

In contrast, the Internet standardization process is geared toward rapid innovation. It has an open and democratic process which draws architects from industry, academia, and government, as well as individuals who are experts in specific technology areas. All Internet specifications are available for free to anyone and can simply be downloaded from the Internet. Lastly, Internet standardization is rooted in “proof of concept”, i.e., a prototype implementation must exist for a standard to achieve approved status. The standards documents often include sample code to illustrate the standard. Additionally, the actual code implementing a prototype is almost always available on the Internet for free download and use.

Java: the Applications Engine

A key element of the proposed architecture for the next-generation IP voice services and applications is an intelligent phone. Java is the ideal application engine technology for intelligent phones. Java has already proven itself as one of the most innovative technologies fueling Internet innovation, and Java applications are at the core of contemporary Web pages.

can run on minimalist appliances. Simple Java applets can be developed in anywhere from a few minutes to a few hours. Key features of Java include:

Network Orientation

Java applications, called applets, run on thin clients. Java applets are network-aware and can open and access objects across the Internet via URLs. The Remote Method Invocation (RMI) feature of Java allows the building of distributed applications. RMI-based applications can connect to other Java applications as well as legacy applications.

Java Naming and Directory Interface (JNDI) provides a unified interface to multiple heterogeneous naming and directory services including LDAP directories. JNDI enables seamless connectivity to these services. Developers can build powerful and portable directory-enabled Java applications using this industry-standard interface.

Java Database Connectivity (JDBC) is an application programming interface (API) that provides cross-DBMS connectivity to a wide range of SQL databases. Using JDBC, an application can establish connectivity with nearly any enterprise or service provider database from a Java-enabled phone.

Java also features specifications and supporting products that can automate the process of distributing new versions of applications over the network. These include Java Management Extensions (JMX), the specification, and Java Dynamic Management Kit (JDMK), Sun’s product which implements this specification.

Powerful APIs for Telephony and Speech Applications

Java has two APIs specially designed for telephony and speech applications:

Java Telephony API (JTAPI) defines interfaces to access the following functional areas: call control, telephone physical device control,


media services, and telephony administrative services. JTAPI functions can be used with both wired and wireless phones, and its core functions can be extended to build applications such as call logging and tracking, auto-dialing, screen-based telephone applications, call routing applications, automated attendants, interactive voice response (IVR) systems, call management centers, voicemail, etc.

Java Speech API (JSAPI) allows developers to incorporate speech technology into the user interface of their Java applets and applications. This API specifies a cross-platform interface to support command and control recognizers, dictation systems, and speech synthesizers.

Security

Java has a built-in security framework or “sandbox” that can protect basic phone operation like making and receiving calls from rogue or misbehaving applets. Java enables the construction of virus-free, tamper-free appliances like phones. It also incorporates authentication techniques based on public-key encryption. Java’s security features also allow enterprises to control access to resources via policy-based permissions.

Support for a Wide Variety of Devices and User Interfaces

Java applets can run on virtually any platform due to their platform independence. A Java applet can be written once and run on virtually any operating system including cell phone OS, HP UX, IBM AIX, Palm OS, Sun Solaris, VxWorks, Microsoft Windows, and various other varieties of Unix and Linux systems. To enable a Java application to execute anywhere on the network, the Java

processor that is running a Java runtime environment. Consequently, a Java applet written for an IP phone appliance can run without modification on a PC-based softphone supporting Java.

Ease of Development

Sun makes developing applications quick and easy with the tools in its Java Development Kit. In addition, Java is supported by numerous tools, components, and applications that are available from many vendors. In fact, many are available for free on the Internet. These tools include application and user interface (UI) components, authoring and workflow tools, and integrated development environments. A wide variety of Java training options, ranging from classroom to web-based, is also available. Lastly, due to Java’s tremendous popularity, Java software engineers are readily available on a permanent or contract basis to assist in development.

Next Generation IP Voice Services and Applications

SIP and Java also enable a whole new generation of applications which are impossible with other telephony architectures. These applications can generally be divided into three categories:

• Personal productivity applications

• Occupation specific and industry specific applications

• Web-telephony integration (WTI) applications

Listed below are a few examples of each.


Personal Productivity Applications

Electronic business cards: send an enriched electronic virtual business card (vCard), including photo and audio file, automatically with every call as caller ID information (or selectively during the middle of a call). This information can be added into any personal contact database such as Microsoft Outlook, a corporate CRM, or a Supply Chain Management (SCM) database with the push of a button.

Presence and instant messaging: use an instant messenger service to determine when geographically distributed colleagues are available for a quick conference call with a customer. Simply click or automatically “camp on” your “buddy list” to create the conference call.

Call filters: have every call from that very important customer ring at every phone: business phone, cell phone, home phone, vacation phone, etc. The call is completed to the first device on which the user picks up.

Phone book: use multiple phone books (corporate, personal, Internet, etc.) on the phone and simply point to an entry to make the call. The phone books can be synchronized with the data on a PC or any server.

Personalized music on-hold: play personalized announcements or music from a favorite MP3 recording or Internet radio station while callers are on hold.

Automated conference calling: create conference call appointments in Microsoft Outlook. The application would automatically set up the conference call at the specified time.

Distinctive rings: play unique rings from any sound file based on caller ID or personal directory information. Separate rings could be set up for a boss, spouse, kids, or anyone else.

Industry and Occupation-Specific Applications

Telecommuters: get all office telephony functionality at home, including extension dialing, call transfer, intranet intercom, call billing, etc.

Consultants: start the “clock” automatically for time accounting or billing when picking up the phone or dialing the number of a client using caller ID or contact database information.

Sales reps: integrate voice and data information collected during a call with sales force automation applications such as ACT or Goldmine, or an ASP like sales.com.

Public relations: click-to-dial personalized and up-to-date press, analyst and vendor contact lists, and track and report time on the phone by client using a public relations ASP like mediamap.com.

Web-Telephony Integration (WTI) Applications

Auction site for purchasing agents of electronic components: create a live audio auction for excess DRAM inventory and use the “heat” of a real-time event to pump up prices and the auctioneer’s commission. Use Java applets on the


phone to manage the bidding process and to track who “raised a hand” to bid first, etc.

Virtual call center ASP: support the integrated voice and data requirements of call center agents working from their homes.

Summary

The Web has revolutionized the world of business. Traditional telephony, however, cannot fulfill the needs of the emergent e-business model. The traditional telephony model is constrained by an inflexible and inefficient architecture based on centralized processing and the dumb terminal. This environment inhibits innovation, is nearly impossible to use, and simply perpetuates the old, cumbersome, and limited-functionality services.

IP telephony needs to embrace the Web architectural model in order to achieve rapid and cost-effective innovation. Old definitions of “enhanced” services and features do not come anywhere near even the simplest applications made possible by technologies such as SIP and Java.

SIP, coupled with Java, can bring the same revolutionary innovations and mindset to the world of IP telephony that the Web has brought to IT and the data world.


API: Application Programming Interface, a set of programming functions and calls supported by a language or a software product. APIs are used by software developers to develop programs in a specific language or to enhance or extend the capabilities of a product.

ASN.1: Abstract Syntax Notation 1, an object-oriented language used by various architectures such as OSI, ITU-T, and SNMP to define objects including data structures.

ASP: Application Services Provider, a service provider that provides applications over a network with a usage-based fee.

CLASS: Custom Local Area Signaling Services, services such as caller ID and ring back provided by a telephone company. Devices in the telephone central office that provide such services are called CLASS switches.

CPU: Central Processing Unit, the arithmetic and logic unit in a computer. Examples include the Intel Pentium family, the AMD Athlon, and the IBM RISC processors.

CRM: Customer Relationship Management software, used with applications such as ACT or Goldmine to keep track of customer contacts and sales information.

H.323: An ITU-T specification for multimedia conferences over IP for LAN-attached stations. It is a peer-to-peer protocol as opposed to MGCP and Megaco, which require central control.

HTTP: Hyper Text Transfer Protocol, used for encoding

IVR: Interactive Voice Response, a system used for generating voice prompts and menus and for accepting and processing user responses.

JNDI: Java Naming and Directory Interface, an extension to Java that provides a unified interface to multiple naming and directory services.

JSAPI: Java Speech API, an extension to Java that provides functions for controlling dictation systems and speech synthesizers.

JTAPI: Java Telephony API, an extension to Java that provides telephony functions such as call control.

Megaco: Media Gateway Control, a VoIP protocol jointly developed by ITU-T and IETF. It uses softswitches and gatekeepers for central control of calls and conferences.

MGCP: Media Gateway Control Protocol, a VoIP protocol developed by the IETF. It uses softswitches and gatekeepers for central control of calls and conferences.

MIME: Multipurpose Internet Mail Extensions, an Internet standard used for encapsulating e-mail messages in clear text.

PBX: Private Branch Exchange, a customer premises-based telephone switch for intra-campus and outside telephone calls.

PSTN: Public Switched Telephone Network, a general reference to telephone networks using circuit switching and time division multiplexing.

Q.931: An ITU-T call control protocol for ISDN, also used in H.323. It defines procedures for setting up and clearing calls.

RAS: Registration, Admission, and Status, a component of H.323, defines procedures whereby users can register themselves with a gatekeeper as a preliminary step to setting up a call.

RMI: Remote Method Invocation, a component part of Java, allows building of distributed applications that can connect to other Java applications as well as legacy applications.

RTCP: RTP Control Protocol, control protocol for RTP that allows multimedia session partners to monitor the quality of their sessions.

RTP: Real-time Transport Protocol, an IP standard for encapsulating multimedia streams for transmission over IP networks. It includes information such as packet timestamps to help implement quality of service for a session.

SCCP: Skinny Client Control Protocol, a Cisco proprietary protocol for voice over IP that uses central control with gatekeeper-like functions.

SCM: Supply Chain Management, used in reference to application programs used for managing purchases and suppliers.

SDP: Session Description Protocol, an IETF standard to advertise multimedia conferences. SDP is intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation.

SIP: Session Initiation Protocol, IETF standard for peer-to-peer multimedia sessions and IP telephony. An alternative to the ITU-T H.323 protocol.

VoIP: Voice over IP, a general reference to several technologies and protocols that allow voice telephony implementation over IP networks. Examples of components and technologies that enable VoIP include codecs, IP PBXs, softswitches, gateways, H.323, SIP, MGCP, and Megaco.


Session Initiation Protocol (SIP) Concepts and Operation

SIP is an Internet protocol defined in Request for Comments 2543 (RFC 2543). SIP is not just for voice communications; it supports data and multimedia in its core specification.

In TCP/IP terminology, as shown in figure 6, SIP is an application-level protocol that runs over UDP but may also use TCP. SIP is based on existing and well-understood Internet protocols and extends them to support IP telephony.

Figure 6: SIP and other Internet Protocols (layered diagram: Gopher, Kerb, SMTP, Telnet, FTP, SIP, SNMP, and RPC over TCP and UDP, over IP)

SIP Concepts

Session

A SIP session is a multimedia session consisting of a set of multimedia senders and receivers and the data streams flowing from senders to receivers. The session is the basic building block in SIP. All calls and conferences are established by setting up sessions among users.

Conference

A conference is a multimedia session, identified by a common session description. A conference can have zero or more members and includes the cases of a multicast conference, a full-mesh conference and a two-party “phone call”, as well as combinations of these. Any number of calls can be used to create a conference.

Call

A call consists of all participants in a conference invited by a common source. A SIP call is identified by a globally unique call-ID.

SIP Components

User Agent Clients and Servers

A user agent is a program that runs on a SIP device (e.g., the phone). It contains a client function and a server function.

The user agent client (UAC) is a program that initiates SIP requests, for example to start a call. A UAC is also known as the calling user agent.

A user agent server (UAS) is a program that receives SIP requests such as an incoming call and sends back responses to those requests. A UAS is also known as the called user agent.

Figure 7: SIP clients and servers (diagram: two user agents, each with a User Agent Client and a User Agent Server, and the SIP servers: Proxy, Redirect, Location, and Registrar)
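The two roles of a SIP user agent, the client function that initiates requests and the server function that answers them, together with the globally unique call-ID that identifies a call, can be sketched in a few lines of Java. This is a simplified model under stated assumptions, not a SIP stack: the class `UserAgentSketch`, its `Request` type, the use of a random UUID for the call-ID, and the example host names are all illustrative choices.

```java
import java.util.UUID;

/** Illustrative model (hypothetical names) of a user agent with client and server functions. */
final class UserAgentSketch {
    /** Minimal request model: only the method name and the call-ID. */
    static final class Request {
        final String method;
        final String callId;

        Request(String method, String callId) {
            this.method = method;
            this.callId = callId;
        }
    }

    private final String host;

    UserAgentSketch(String host) {
        this.host = host;
    }

    /** Client function (UAC): start a call with a fresh call-ID of the form id@host. */
    Request initiateCall() {
        return new Request("INVITE", UUID.randomUUID() + "@" + host);
    }

    /** Server function (UAS): receive a request and answer with a status code. */
    int receive(Request request) {
        // 200 = OK for an INVITE; 501 = Not Implemented for anything this sketch ignores.
        return "INVITE".equals(request.method) ? 200 : 501;
    }

    public static void main(String[] args) {
        UserAgentSketch caller = new UserAgentSketch("site1.example.com");
        UserAgentSketch callee = new UserAgentSketch("site2.example.com");
        Request invite = caller.initiateCall();
        System.out.println(invite.method + " " + invite.callId + " -> " + callee.receive(invite));
    }
}
```

Because every user agent contains both functions, the same object can act as caller in one call and callee in the next.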


SIP Servers

Location Server

A location server is used to obtain information about a callee’s possible location. A location is the IP address of the domain where a user is located. To locate a user, the name of the user is sent to the location server and the location server returns zero or more locations (IP addresses or domains) where a callee may be found. If the caller already knows the IP address of the destination server, the caller can directly contact the callee’s UAS.

Proxy Server

A proxy server is an intermediary program that acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally by a proxy server or forwarded, possibly after translation, to other servers. A proxy interprets and, if necessary, rewrites a request message before forwarding it.

Redirect Server

A redirect server is a server that accepts a SIP request, maps the address into zero or more new addresses and returns these addresses to the client. Unlike a proxy server, it does not initiate its own SIP requests. Unlike a user agent server, it does not accept calls.

Registrar

A registrar is a server that accepts REGISTER requests. A client uses the REGISTER request to let a proxy or redirect server know the location where the client can be reached. It provides a means whereby users can register their locations with a SIP server dynamically. As users move to different locations, they can register their new locations with the local location server.

rwhois, LDAP, multicast-based protocols or operating-system dependent mechanisms to actively determine the end system where a user might be reachable.

SIP Addressing

SIP uses traditional Internet names as addresses, which consist of a user name and a domain name. This is important because it means that the existing Internet naming, addressing, and routing services can process SIP addresses without modification. Examples of SIP addresses include:

SIP:user01@bigcorp.com

SIP:user@25.16.10.8

SIP:1-212-555-1212@business.com

These addresses are similar to HTTP URL addresses except that they start with SIP instead of HTTP. The first example shows a user being identified via a typical e-mail address. The second example shows an address where the IP address of the destination is known. The last example shows how we could use a phone number-like address under SIP.

The major advantages of this addressing scheme are:

• It invents no new directory structure and can be processed by existing IP servers.

• Users can use familiar e-mail or URL addresses to make phone calls and have one less thing to remember: the phone number.

Domain Name Services (DNS)

DNS is a standard Internet service that converts the domain part of a name such as user01@bigcorp.com into an IP address, e.g., 172.30.10.20, which can be used for finding user locations and routing calls. Because SIP uses standard IP naming and addressing, we are able to use existing, standard DNS services for SIP without any modification.
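SIP addresses of the form SIP:user@host, where the host is a domain name or an IP address, can be split into their two parts with ordinary string handling. The sketch below is a deliberately simplified parser for that basic form only; the class name `SipAddress` is hypothetical, and optional URL parts such as ports or parameters are assumed to be absent.

```java
/** Illustrative parser (hypothetical name) for addresses of the form SIP:user@host. */
final class SipAddress {
    final String user;
    final String host;

    private SipAddress(String user, String host) {
        this.user = user;
        this.host = host;
    }

    /** Parses the basic form only; the scheme prefix is matched case-insensitively. */
    static SipAddress parse(String text) {
        if (!text.regionMatches(true, 0, "sip:", 0, 4)) {
            throw new IllegalArgumentException("Missing SIP scheme: " + text);
        }
        String rest = text.substring(4);
        int at = rest.lastIndexOf('@');
        // Both a user part and a host part are required.
        if (at <= 0 || at == rest.length() - 1) {
            throw new IllegalArgumentException("Expected user@host: " + text);
        }
        return new SipAddress(rest.substring(0, at), rest.substring(at + 1));
    }

    @Override
    public String toString() {
        return "sip:" + user + "@" + host;
    }

    public static void main(String[] args) {
        String[] samples = {
            "SIP:user01@bigcorp.com",
            "SIP:user@25.16.10.8",
            "SIP:1-212-555-1212@business.com"
        };
        for (String sample : samples) {
            SipAddress address = SipAddress.parse(sample);
            System.out.println("user=" + address.user + " host=" + address.host);
        }
    }
}
```

The host part is what a DNS lookup would resolve; the user part is what a location server or registrar would look up within that domain.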


SIP Messages

SIP messages include SIP methods and responses to the methods. These are listed in tables 5 and 6.

SIP Message Encapsulation: MIME

Multipurpose Internet Mail Extensions (MIME) is the Internet standard for describing different types of content on the Internet, including video and image types. It is already used by HTTP for composing Web pages and by e-mail systems for encoding e-mail messages. SIP uses this well-established standard for encoding information, eliminating the need for inventing a new technique for encoding voice and multimedia over the Internet.

SIP Call Setup

SIP is inherently capable of carrying voice, video, and multimedia calls. In the examples below, the setup flows remain the same irrespective of the type of the call. The scenarios illustrate call setup where a caller knows the name but not the IP address of a callee, necessitating the use of a SIP server. If the caller knew the IP address of the callee, the caller would not need services from the SIP servers. With a callee’s destination IP address known, the caller’s user agent client only needs to select the protocol (UDP by default), port (5060 by default) and IP address of the SIP user agent server to which the INVITE request should be sent.

When the callee sends a response to the INVITE request agreeing to participate in the call, the caller sends an ACK to confirm the callee’s response.

Call Setup Using a Proxy Server

To initiate a SIP call, a caller first locates the appropriate proxy server and then sends a SIP invitation request to the proxy server. The location of the proxy server is locally configured on the user station. The proxy server can also be discovered automatically by the caller using a variety of mechanisms such as DHCP options, DNS SRV and others. Instead of directly sending the call to the intended callee, the proxy server may redirect the SIP request or trigger a chain of new SIP requests to other proxies or location servers. Figure 4 shows the detailed flow for SIP call setup using a proxy server, described below:

1. Endpoint1@Site1 sends an INVITE request for Endpoint2@Site2 to the proxy server.

2. The proxy server contacts the location service for Endpoint2.

3. The proxy server receives a more precise location for Endpoint2 as Client2@Site2 from the location server.

4. The proxy server issues an INVITE request to the address(es) returned by the location service. The INVITE request carries a Call-ID.

(Upon receiving the INVITE request, the called user agent alerts the user by generating a phone ring.)

5. The called user agent returns a 100 Trying response indicating that it is processing the INVITE request.

6. The called user agent returns a 200 OK response to indicate successful processing of the INVITE request.

7. The calling user agent sends an ACK to complete the handshake. The call is now in place.

Call Setup Using Redirect Server

Again we assume that the IP address of the callee is not known to the caller’s user agent, thereby necessitating the services of the local SIP server, a redirect server in this case. The key difference compared to the proxy server is that the redirect server cannot initiate an INVITE request.

The flow of requests and responses for figure 8 is as follows:

1. Endpoint1@Site1 sends an INVITE request to the redirect server for Endpoint2@Site2.

2. The redirect server contacts the location server for location information about Endpoint2.

3. The location server returns information that this client can be found at Site3.

4. The redirect server forwards precise location information to the calling user agent using a 302 Moved Temporarily message with Contact: Client2@Site3.

5. The calling user agent acknowledges the information with an ACK.

6. The calling user agent sends an INVITE request directly to the called user agent.

7. The called user agent returns a 100 Trying response indicating that it is processing the request.

8. The called user agent returns a 200 OK response to indicate successful processing of the INVITE request.

9. The calling user agent sends an ACK to complete the handshake. The call is now in place.
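The difference between proxy and redirect operation can be sketched in Java: a proxy forwards the INVITE itself and relays the answer, while a redirect server only hands the new contact back to the caller. This is a simplified model under stated assumptions, not a SIP implementation: the class `SipServerSketch`, its method names, the in-memory map standing in for the location server, and the plain-string responses are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative model (hypothetical names) contrasting proxy mode with redirect mode. */
final class SipServerSketch {
    // Stand-in for the location server: registered address -> current contact.
    private final Map<String, String> locations = new HashMap<>();

    /** What a REGISTER request achieves: record where a user can currently be reached. */
    void register(String address, String contact) {
        locations.put(address, contact);
    }

    /** Proxy mode: issue the INVITE on the caller's behalf and relay the callee's response. */
    String handleAsProxy(String callee, Function<String, String> calledUserAgent) {
        String contact = locations.get(callee);
        if (contact == null) {
            return "404 Not Found";
        }
        return calledUserAgent.apply("INVITE " + contact);
    }

    /** Redirect mode: never issue an INVITE; return the contact so the caller can retry. */
    String handleAsRedirect(String callee) {
        String contact = locations.get(callee);
        if (contact == null) {
            return "404 Not Found";
        }
        return "302 Moved Temporarily Contact: " + contact;
    }

    public static void main(String[] args) {
        SipServerSketch server = new SipServerSketch();
        server.register("Endpoint2@Site2", "Client2@Site3");

        // The called user agent is modeled as a function from request to final response.
        Function<String, String> calledUserAgent = request -> "200 OK";

        System.out.println(server.handleAsProxy("Endpoint2@Site2", calledUserAgent));
        System.out.println(server.handleAsRedirect("Endpoint2@Site2"));
    }
}
```

In redirect mode the caller must send a second INVITE to the returned contact, which is why the redirect flow has more steps at the calling user agent than the proxy flow.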

Figure 8: SIP Operation in Redirect Mode (diagram: Endpoint 1@Site 1 sends INVITE Endpoint 2@Site 2 to the redirect server at Site 2, which consults the location server and returns 302 Moved Temporarily with Contact: Client 2@Site 3; Endpoint 1 sends ACK, then INVITE Client 2@Site 3, and receives 100 Trying and 200 OK)
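Because SIP is text-based and reuses MIME to label message content, an INVITE carrying an SDP session description can be assembled as plain text. The sketch below only builds such a message and does not send it. It is a simplified illustration, not a complete SIP message: the class `InviteMessageSketch`, the example addresses, and the SDP values are assumptions, and a real message would carry further headers such as Via.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative builder (hypothetical name) for a simplified INVITE with an SDP body. */
final class InviteMessageSketch {
    /** Default SIP port, used when a caller sends the request directly to a known IP address. */
    static final int DEFAULT_SIP_PORT = 5060;

    private InviteMessageSketch() {}

    static String buildInvite(String caller, String callee, String callId, String sdpBody) {
        // Content-Length counts the bytes of the body, not the characters.
        int length = sdpBody.getBytes(StandardCharsets.UTF_8).length;
        return "INVITE sip:" + callee + " SIP/2.0\r\n"
            + "From: sip:" + caller + "\r\n"
            + "To: sip:" + callee + "\r\n"
            + "Call-ID: " + callId + "\r\n"
            + "CSeq: 1 INVITE\r\n"
            + "Content-Type: application/sdp\r\n"   // MIME type describing the body
            + "Content-Length: " + length + "\r\n"
            + "\r\n"                                 // blank line separates headers from body
            + sdpBody;
    }

    public static void main(String[] args) {
        // Minimal SDP description of one audio stream; all values are made up for illustration.
        String sdp = "v=0\r\n"
            + "o=user01 0 0 IN IP4 192.0.2.10\r\n"
            + "s=Call\r\n"
            + "c=IN IP4 192.0.2.10\r\n"
            + "t=0 0\r\n"
            + "m=audio 49170 RTP/AVP 0\r\n";
        System.out.print(buildInvite("user01@bigcorp.com", "user02@business.com",
            "1234@bigcorp.com", sdp));
    }
}
```

The same header-plus-body layout is used by HTTP and e-mail, which is why existing MIME knowledge carries over to SIP.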


Pingtel xpressa, the world’s first Java-based IP phone, does just

about anything a clever Java programmer could dream up.

To see what your Java colleagues have taught our phone to do

already, go to www.pingtel.com/payphone now and check out our

App Dev Zone.

A good idea of your own and who knows?

You just might get rich. Or famous. Real fast.

For Java Developers,


IT professionals in making informed business decisions about specific aspects of technology development and strategic deployment.

The Technology Guide Series® offers a broad array of titles, each presenting objective information and practical guidance in a non-biased, “easy-to-understand” style and tone. Our editorial writing team has many years of experience in IT and communications technologies, and is highly conversant in today’s emerging technologies. The Technology Guide Series and techguide.com are supported by a consortium of leading technology providers. The Sponsor has lent its support to produce and publish this Guide.

This Guide, as well as the entire Technology Guide Series, is made available to view and print at no charge by visiting techguide.com.

produced and published by

Over 100 Technology Guides in the following categories:

Network Management

Internet

Enterprise Solutions Network Technology

Software Applications

Security
