1. Field of the Invention
This invention relates to the field of telephony. In particular, the invention relates to technologies for using voice over Internet Protocol (VoIP) solutions in a number of configurations to increase flexibility and reliability of call handling systems.
2. Description of the Related Art
As
Some inefficiencies result from the preceding configuration. For example, in order to readily support “tromboning” (connections between an incoming caller and one or more parties on an outbound call), the two calls need to be handled by the same server 116. Features like conference calls have similar dependencies. Accordingly, the telephone network 104 must be programmed to distribute the voice calls across the PRIs within the DS3 to leave sufficient capacity for outbound calling purposes. Further, physical proximity between the telephone gateway 107 and the phone application platform 110 is effectively enforced by the need for the servers supporting the phone application platform 110 to be in sufficient proximity to allow termination of circuit switched calls on those servers.
The prior approaches to providing voice activated services have been focused on the circuit switched orientation of the telephone network. Prior packet switched approaches for handling voice communications have been characterized by an end-to-end philosophy of call placement. Accordingly, what is needed is a better configuration for handling receipt and transmission of audio from and to the telephone network 104 that provides increased flexibility while maintaining compatibility with the existing telephone network 104 by leveraging VoIP standards to provide new services and functions.
An approach to abstracting the circuit switched nature of the public switched telephone network (PSTN) by using VoIP to provide voice actuated services is disclosed. By carrying a telephone call using VoIP technology for a short distance (frequently within a server room) significant benefits to call handling and capacity management can be obtained. Specifically, a PSTN-to-IP gateway is used to receive (and place) calls over the PSTN and route those calls internally to servers over an IP network in a packet switched format. A number of computer systems can receive and handle the calls in the IP format, including: translating the packets into an audio format suitable for speech recognition and creating suitable packets from computer sound files for transmission back over the PSTN.
In some embodiments, a proxy server is used to balance call load amongst a pool of server computers handling the phone calls as they are passed off from the gateway in IP form. This may also be used to reduce the need to reserve capacity on specific server computers based on circuit capacity. For example, in the prior art configuration each telephony server readily supported only a fixed number of circuits due to the physical connectivity properties. Thus, if a single PRI (23 usable phone lines in North America) were connected to a server, then to easily support outgoing calls (tromboning), it would be necessary to reserve capacity on that PRI. In contrast, with a packet switched abstraction, the server does not have to be concerned with which PRI, DS3, etc., is handling the incoming and outgoing legs of the call session since the capacity limit is solely based on total packet network bandwidth and processor capability on the server (both of which are more flexible than circuit capacity). Similarly, advanced calling features such as conference calling that would have previously required reservation of a large number of ports on a single telephony card can be handled more elegantly.
It should be noted that this approach is not necessarily cost reducing, e.g. the cost of the telephony gateway 107 and phone application platform 110 will not necessarily be reduced. Rather, and perhaps counter-intuitively, costs may go up since the PSTN-to-IP gateway can be rather expensive, especially if purchased in redundant pairs. Further, expensive network switches and routers to support several thousand uncompressed packet format data streams will be necessary as well. In contrast, most VoIP installations make use of (heavy) compression and expect only best effort delivery of packets. The need to perform high quality speech recognition makes such compression (as well as an unreliable network) undesirable.
Additionally, this approach runs counter to the general trend in VoIP telephony of establishing many points of presence (POPs) throughout the nation to avoid long distance charges. Rather, this approach leverages the PSTN for what it is good at, long haul transmission of voice data at a fixed quality of service, and then makes use of VoIP to abstract those details. Telephone carriers who feel comfortable delivering calls directly in VoIP formats may be permitted to terminate their calls as such as well; however, that is not necessary.
A. Introduction
The invention will be described in greater detail as follows. First, a number of definitions useful to understanding the invention are presented. Then, the hardware and software architecture for localized voice over Internet Protocol (VoIP) usage will be considered. Finally, the processes and features of the environment are presented in greater detail.
B. Definitions
1. Telephone Identifying Information
For the purposes of this application, the term telephone identifying information will be used to refer to ANI information, CID information, and/or some other technique for automatically identifying the source of a call and/or other call setup information. For example, telephone identifying information may include a dialed number identification service (DNIS). Similarly, CID information may include text data including the subscriber's name and/or address, e.g. “Jane Doe”. Other examples of telephone identifying information might include the type of calling phone, e.g. cellular, pay phone, and/or hospital phone.
Additionally, the telephone identifying information may include wireless carrier specific identifying information, e.g. the current location of the wireless phone, etc. Also, signaling system seven (SS7) information may be included in the telephone identifying information.
2. User Profile
A user profile is a collection of information about a particular user. The user profile typically includes collections of different information of relevance to the user, e.g., account number, name, contact information, user-id, default preferences, and the like. Notably, the user profile contains a combination of explicitly made selections and implicitly made selections.
Explicitly made selections in the user profile stem from requests by the user to the system. For example, the user might add business news to the main topic list. Typically, explicit selections come in the form of a voice, or touch-tone command, to save a particular location, e.g. “Remember this”, “Bookmark it”, “shortcut this”, pound (#) key touch-tone, etc., or through adjustments to the user profile made through the web interface using a computer.
Additionally, the user profile provides a useful mechanism for associating telephone identifying information with a single user, or entity. For example, Jane Doe may have a home phone, a work phone, a cell phone, and/or some other telephones. Suitable telephone identifying information for each of those phones can be associated in a single profile for Jane. This allows the system to provide uniformity of customization to a single user, irrespective of where they are calling from.
In contrast, implicit selections come about through the conduct and behavior of the user. For example, if the user repeatedly asks for the weather in Palo Alto, Calif., the system may automatically provide the Palo Alto weather report without further prompting. In other embodiments, the user may be prompted to confirm the system's implicit choice, e.g. the system might prompt the user “Would you like me to include Palo Alto in the standard weather report from now on?”
Additionally, the system may allow the user to customize the system to meet her/his needs better. For example, the user may be allowed to control the verbosity of prompts, the dialect used, and/or other settings for the system. These customizations can be made either explicitly or implicitly. For example if the user is providing commands before most prompts are finished, the system could recognize that a less verbose set of prompts is needed and implicitly set the user's prompting preference to briefer prompts.
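The association of several telephones with a single user profile, described in this section, can be illustrated with a minimal sketch. The class names, field names, and phone numbers below are hypothetical and chosen only for illustration; the specification does not prescribe any particular data structure, and the calling number stands in for whatever telephone identifying information is available.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical profile record; the fields are illustrative only."""
    user_id: str
    name: str
    preferences: dict = field(default_factory=dict)


class ProfileDirectory:
    """Maps telephone identifying information (here, a calling number) to one profile."""

    def __init__(self):
        self._by_number = {}

    def associate(self, calling_number, profile):
        # Several numbers may point at the same profile object.
        self._by_number[calling_number] = profile

    def lookup(self, calling_number):
        # Returns None when the number is not associated with any profile.
        return self._by_number.get(calling_number)


# Made-up numbers standing in for Jane's home, work, and cell phones.
jane = UserProfile(user_id="u-1", name="Jane Doe", preferences={"weather": "Palo Alto"})
directory = ProfileDirectory()
for number in ("+16505550100", "+16505550101", "+16505550102"):
    directory.associate(number, jane)
```

Because every number resolves to the same profile object, a preference saved during a call from one phone is visible on a later call from another.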
3. Topics and Content
A topic is any collection of similar content. Topics may be arranged hierarchically as well. For example, a topic might be business news, while subtopics might include stock quotes, market report, and analyst reports. Within a topic different types of content are available. For example, in the stock quotes subtopic, the content might include stock quotes. The distinction between topics and the content within the topics is primarily one of degree in that each topic, or subtopic, will usually contain several pieces of content.
4. Demographic and Psychographic Profiles
Both demographic profiles and psychographic profiles contain information relating to a user. Demographic profiles typically include factual information, e.g. age, gender, marital status, income, etc. Psychographic profiles typically include information about behaviors, e.g. fun loving, analytical, compassionate, fast reader, slow reader, etc. As used in this application, the term demographic profile will be used to refer to both demographic and psychographic profiles.
C. VoIP Configuration
Unlike in the prior art system, there is a clean separation between the telephone gateway 107 implementation and the phone application platform 110 implementation. This promotes modularity and improves functionality. The telephone gateway 107 is supported by one or more media gateways 302. A media gateway is a term for products such as Cisco AS5300 from Cisco Corporation, San Jose, Calif., GSX 9000 from Sonus Networks, Inc., Westford, Mass., and MultiVoice MAX TNT from Lucent Technologies, Murray Hill, N.J. More generally the media gateway 302 is a device for routing circuit switched telephone network calls to a packet switched network (and vice-versa.) Some media gateways may be capable of handling several thousand calls simultaneously. Further, as appropriate, redundant media gateways can be configured to interoperate appropriately with the telephone network 104.
Importantly, to the left of the media gateway 302 in
Before discussing call completion, consider the implementation of the phone application platform 110. A number of computers, servers 306A-Z, can be provided together with a session initiation protocol (SIP) proxy 304. The servers 306A-Z can be comprised of one or more computers, typically of a server, or rack mount variety. According to one embodiment, a Network Engine server from Network Engines, Inc., Canton, Mass., is used for the servers 306A-Z because it is a compact, 1 rack unit (1U) high, yet powerful computer system.
Through the use of one or more (proposed) standard Internet Engineering Task Force (IETF) protocols such as SIP (RFC 2543), the SIP proxy 304 can relay information from the media gateway 302 to the servers 306A-Z about incoming calls and allow them to handle the sessions. The term “proxy” is used to describe the SIP proxy 304; however, such use is not in strict conformance with the definition in RFC 2543. Rather, the SIP proxy 304 may be in the terms of RFC 2543 a “proxy”, a “proxy server”, a “redirect server”, a “server”, and/or some other type of device and/or program for balancing distribution of SIP requests (incoming calls) across the servers 306A-Z.
The call handling flow according to the implementation in
Next, at step 402, a SIP request is generated (see RFC 2543 generally for format) by the media gateway 302 to the SIP proxy 304. The SIP request can include suitable telephone identifying information, e.g. dialed number, calling party number, ANI, etc. The SIP proxy 304 will then redirect, proxy, forward, and/or otherwise cause the request to be passed to one of the servers 306A-Z for acknowledgement and handling. Criteria for distribution amongst the servers may include: the telephone identifying information (e.g. some servers are reserved for certain calling (or called) parties); server load (e.g. evenly distribute workload across the different servers relative to their capacity to handle calls); online/offline status of individual servers; network monitoring showing faults with one or more servers; and/or other criteria selected by the operator of the phone application platform 110.
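As a rough illustration of the SIP request of step 402, the sketch below assembles a bodiless INVITE carrying the dialed number and the calling party number. The function name, host names, and phone numbers are hypothetical, and the header set is reduced to the general header fields of RFC 2543; a request from a real media gateway would normally also carry a session description for the audio stream and possibly additional headers.

```python
def build_invite(dialed_number, calling_number, gateway_host, proxy_host, call_id, cseq=1):
    """Return a simplified SIP INVITE as text, using CRLF line endings.

    All arguments are illustrative; no message body (session description) is included.
    """
    lines = [
        f"INVITE sip:{dialed_number}@{proxy_host} SIP/2.0",
        f"Via: SIP/2.0/UDP {gateway_host}",
        f"From: <sip:{calling_number}@{gateway_host}>",
        f"To: <sip:{dialed_number}@{proxy_host}>",
        f"Call-ID: {call_id}@{gateway_host}",
        f"CSeq: {cseq} INVITE",
        "Content-Length: 0",
    ]
    # A blank line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


request = build_invite(
    dialed_number="18005550100",
    calling_number="16505550101",
    gateway_host="gateway.example.com",
    proxy_host="proxy.example.com",
    call_id="a1b2c3",
)
```

The dialed number appears in the request line and the To header, and the calling party number in the From header, so a proxy can apply distribution criteria without inspecting the audio stream.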
For example, according to one embodiment, in order to test a new hardware and/or software configuration of a particular server (e.g. the server 306Z), a predetermined percentage of calls might be routed to that server. Similarly, if better servers become available and are added to the existing pool, calls could be distributed evenly based on weighted capacity. In such a configuration, a server that could handle 100 simultaneous calls and an earlier server that could handle only 50 would be considered equally loaded when their ratios of current calls to capacity are equal, e.g. 5 calls on the older server and 10 on the newer server are equivalent: 5/50=1/10=10/100.
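The weighted-capacity comparison can be sketched as follows. This is only one possible selection rule, written under the assumption that each server reports its current call count and that capacity is a fixed positive number of simultaneous calls; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Server:
    """Hypothetical view of one server as seen by the SIP proxy."""
    name: str
    capacity: int          # maximum simultaneous calls (assumed fixed and positive)
    active_calls: int = 0
    online: bool = True

    @property
    def load(self):
        # Ratio of current calls to capacity, kept exact to avoid rounding ties.
        return Fraction(self.active_calls, self.capacity)


def pick_server(servers):
    """Return the online server with the lowest load ratio, or None if none can take a call."""
    candidates = [s for s in servers if s.online and s.active_calls < s.capacity]
    if not candidates:
        return None
    # On a tie, min() keeps the first candidate in list order.
    return min(candidates, key=lambda s: s.load)


older = Server("306A", capacity=50, active_calls=5)
newer = Server("306Z", capacity=100, active_calls=10)
```

With these values both servers report a load of 1/10, matching the example above; one additional call on the older server makes the newer server the preferred target.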
Note that this sort of flexible load balancing is not readily possible with the prior art configuration of
In some embodiments, the functionality of the SIP proxy 304 can be subsumed in whole or in part into the media gateway 302. The ability to do this will depend in large part on the monitoring and routing capabilities of the particular media gateway 302.
Next, at step 404, the SIP request is acknowledged by the selected server 306A-Z. At that point, the data (e.g. voice channel, or stream) flows between the server, the media gateway, and the telephone network 104. The data portion can be sent using one or more standard International Telecommunication Union (ITU) and/or IETF protocols, e.g. RTSP, RTP, Q.931, etc.
In one embodiment, compression of the stream is intentionally disabled between the media gateway 302 and the servers 306A-Z. Typically, VoIP data transmissions use (heavy) compression to reduce bandwidth demands; however, such compression could severely reduce the quality of speech recognition results and thus is not used. While the lack of compression would be undesirable in many other VoIP environments due to high bandwidth consumption for thousands of VoIP streams, the operator of the phone application platform need only provide high bandwidth between the media gateway 302 and the servers 306 (frequently only a short distance, e.g. within a server room, etc.).
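To give a sense of scale for the uncompressed case, the following back-of-the-envelope sketch estimates per-stream and aggregate bandwidth. The figures are assumptions and do not come from the specification: 64 kbit/s PCM audio (8,000 one-byte samples per second, as on a PSTN voice channel), 20 ms of audio per packet, 40 bytes of IP, UDP, and RTP headers per packet, and no link-layer overhead.

```python
def stream_kbps(payload_bytes=160, header_bytes=40, packets_per_second=50):
    """Bandwidth of one audio stream in one direction, in kbit/s.

    Defaults are assumptions: 160 bytes is 20 ms of 8 kHz, 8-bit audio,
    and 40 bytes covers IP (20), UDP (8), and RTP (12) headers.
    """
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


def aggregate_mbps(calls, per_stream_kbps):
    """Total one-way bandwidth for a number of simultaneous calls, in Mbit/s."""
    return calls * per_stream_kbps / 1000


per_stream = stream_kbps()                 # 80.0 kbit/s under the assumptions above
total = aggregate_mbps(2000, per_stream)   # 160.0 Mbit/s for 2,000 calls, one direction
```

Under these assumptions, a couple of thousand simultaneous calls need on the order of a few hundred megabits per second counting both directions, which is practical over a short, dedicated network but costly over long distances.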
Lastly, at step 406, the servers communicate with the media gateway using SIP requests to control handling of the session (call). Unlike the servers with telephony cards 116A-Z of
As an example, if the initial caller to the phone application platform 110 requests an outbound call transfer (e.g. place a call to a third party), one or more SIP requests could be generated by the servers 306A-Z to the media gateway 302 (possibly via the SIP proxy 304) to cause the initiation of the call. For example, to contact a restaurant, the server could request a call placement to the phone number of the restaurant be added to the in progress session between the initial caller and the server. The media gateway 302 and/or the SIP proxy 304 could respond to this request by (ultimately) opening circuit switched connections back over the telephone network 104 to the restaurant. Notice, importantly, that there is no longer a need to reserve circuits on any particular line or interface.
Thus, despite only using the VoIP technologies in the last “100 meters” or so, e.g. within a server room, some significant functionality becomes available that also serves to increase flexibility: easier multi-party features and elimination of reserved circuit capacity. In one embodiment, VoIP can be viewed as providing an abstraction layer to the circuit switched network.
In U.S. patent application Ser. No. 09/426,102, entitled “Method and Apparatus for Content Personalization Over a Telephone Interface”, having inventors Hadi Partovi, et. al., a functional decomposition of a phone application platform substantially similar to the instant phone application platform 110 is presented. According to that functional model, the servers 306A-Z could provide a subset of the identified functions such as call management, execution, evaluation, data connectivity, and/or streaming. The specific functions provided by the servers 306A-Z will depend on their processing power, capacity, and number. For example, in the prior art arrangement of
In one embodiment, the SIP proxy 304 distributes load evenly across the servers 306A-Z and monitors their load through one or more communication channels, e.g. periodic queries to the servers 306A-Z. If the number of calls at a given time exceeds a predetermined threshold, one or more messages may be generated by the SIP proxy 304 (or one of the servers 306A-Z) to instruct the media gateway 302. The message might indicate that no more calls should be taken, e.g. busy the line. Or, when the servers 306A-Z are handling calls from multiple legal entities, the message might more specifically stop the acceptance of calls for one legal entity (e.g. by dialed phone number) in accordance with one or more limits (e.g. contracts, fairness (everyone has to have capacity for at least X calls), etc.). Responsive to such a message, the media gateway 302 may send one or more messages over the PSTN, e.g. using signaling system 7 (SS7) or such other protocols as may be available. As a result, calls to a first number, +1 (800) 555-TELL, might be able to proceed while calls to +1 (800) PAR-TNER might receive a busy signal or some other network status message, e.g. “All circuits are busy”.
The above type of differentiated and targeted service control is not readily possible in the circuit switched configuration of
In the case where the connectivity between the media gateway 302 and the telephone network 104 does not easily support low level communication to allow the media gateway 302 to control the behavior of the telephone network 104, the media gateway 302 can send SIP requests to a special destination, e.g. an extra server of substantially the same type as the servers 306A-Z to cause a message to be played and then terminate the call. In other embodiments, if the media gateway 302 supports the capability, it can generate and play back a busy message for specific numbers at specific times.
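The per-entity call limits discussed above can be sketched as a simple admission check keyed by dialed number. This is an illustrative model only: the class name, the limits, and the numbers are hypothetical, and the sketch decides only whether a call is admitted, leaving out how the busy indication is signaled to the media gateway or the telephone network.

```python
class AdmissionController:
    """Hypothetical admission check combining a platform-wide cap with per-number limits."""

    def __init__(self, total_capacity, per_number_limits):
        self.total_capacity = total_capacity
        self.per_number_limits = dict(per_number_limits)
        self.active = {}  # dialed number -> calls currently in progress

    def try_admit(self, dialed_number):
        """Return True and count the call if both limits allow it, else False."""
        in_use = sum(self.active.values())
        current = self.active.get(dialed_number, 0)
        # Numbers without an explicit limit are bounded only by total capacity.
        limit = self.per_number_limits.get(dialed_number, self.total_capacity)
        if in_use >= self.total_capacity or current >= limit:
            return False
        self.active[dialed_number] = current + 1
        return True

    def release(self, dialed_number):
        """Record that a call to the given number has ended."""
        if self.active.get(dialed_number, 0) > 0:
            self.active[dialed_number] -= 1


# Made-up numbers: the second entity is limited to one simultaneous call.
controller = AdmissionController(total_capacity=3, per_number_limits={"18005550111": 1})
```

In this model a second simultaneous call to the limited number is refused while calls to other numbers continue to be admitted until the platform-wide cap is reached.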
Returning to the prior art arrangement of
Additional protocols may be used in conjunction with SIP to further support the VoIP arrangement disclosed. For example, the PINT protocol of RFC 2848 may be used to communicate out from the phone application platform 110 to the circuit switched telephone network 104 for one or more purposes, e.g. for outbound call notification.
D. Automated Configuration Management
According to some embodiments of the invention, one or more additional computers can be coupled in communication with the phone application platform 110, e.g. configuration server 310 (shown as part of phone application platform 110). The configuration server 310 is designed to allow easy setup of the servers 306A-Z, the SIP proxy 304, and/or other computers providing the phone application platform. Configuration server 310 typically includes host descriptions (i.e., the software configuration that is mapped to each respective server 306A-Z) and a service map (i.e., information that identifies how the set of servers 306A-Z are assigned in order to maintain an operational phone platform 110).
The configuration server 310 can leverage existing protocols that are available within the respective computers to offer these features. As a result, given a unique identifier for a machine such as a hardware Ethernet address, aka media access control (MAC) address, a processor serial number, a stored value (e.g. hostname and/or Internet protocol (IP) address), and/or some other unique identifier, machines can be automatically configured with the necessary software.
This process is referred to as “blasting” or “jumpstarting” and is different from, but complementary to, network booting and dynamic host configuration protocol (DHCP). More specifically, the blasting process creates a working system image on the blasted computer together with all appropriate software.
For example, if the server 306A were being re-purposed from performing speech recognition to handle telephony, an entry on the configuration server 310 for the server 306A could be modified to indicate the new machine purpose. Then using a net boot (or floppy boot) the machine could load an image from the configuration server 310 that causes the machine to be configured to behave in the new purpose. For example, the hard drive might be re-partitioned, a new operating system loaded (Windows(TM) NT to Solaris(TM) or FreeBSD), software removed or installed (SIP server and audio providers installed while speech recognition packages removed), etc.
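A minimal sketch of the lookup performed by the configuration server might resemble the following. The table contents, role names, package names, and MAC address are hypothetical, and the in-memory dictionaries stand in for whatever database or table a deployment actually uses; the sketch covers only the mapping from a unique identifier to a configuration, not the imaging process itself.

```python
# Hypothetical host descriptions: the software configuration for each machine purpose.
HOST_DESCRIPTIONS = {
    "speech": {"os": "Solaris", "install": ["speech-recognition"], "remove": []},
    "telephony": {
        "os": "FreeBSD",
        "install": ["sip-server", "audio-provider"],
        "remove": ["speech-recognition"],
    },
}

# Hypothetical service map: unique identifier (here a MAC address) -> assigned purpose.
SERVICE_MAP = {"00:a0:c9:00:00:01": "speech"}


def configuration_for(mac_address):
    """Return the configuration a machine should load at its next network boot."""
    role = SERVICE_MAP[mac_address.lower()]
    return {"role": role, **HOST_DESCRIPTIONS[role]}


def repurpose(mac_address, new_role):
    """Change a machine's entry; the machine picks up the change when it next boots."""
    if new_role not in HOST_DESCRIPTIONS:
        raise ValueError(f"unknown role: {new_role}")
    SERVICE_MAP[mac_address.lower()] = new_role
```

Calling `repurpose("00:a0:c9:00:00:01", "telephony")` changes only the table entry; the machine itself is untouched until it boots and requests its configuration.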
The bottom line: minimal (or no) human intervention once the machine's entry in the configuration server 310 is updated, hence the respective configurations of servers 306A-Z are effectively “slaved” to the corresponding entries in configuration server 310. Deployment of configuration server 310 provides a number of other benefits, inter alia: (i) automated software (re)configuration and updates for extant or replacement servers 306A-Z; (ii) automated management, assignment, re-assignment, and control of system resources via configuration server 310; and (iii) automated system monitoring, inventory tracking, auditing, and alarming (in the event of errors or failures). According to one embodiment of the invention, the configuration server 310 includes appropriate images of operating systems, software, and/or configuration files for the full range of computers used by the phone application platform 110. Additionally, a database (or table) showing correspondences between a unique identifier for each computer and configuration options
E. Conclusion
By abstracting the circuit switched nature of the broader telephone network in the last 100 or so meters, e.g. within a server room, surprising benefits can result as described above. Further, these benefits outweigh the sometimes higher costs of such an arrangement due to the need for expensive equipment (e.g. media gateways) and high bandwidth packet based routing and switching fabrics between the media gateways and the servers.
Accordingly, a method and apparatus for using voice over Internet Protocol (VoIP) technologies in a localized fashion has been described. The approach allows improved capacity and flexibility in providing voice activated services. Further, the approach has several natural extensions such as internally routing calls in VoIP format to remote servers, e.g. for overflow to a remote data center from the location of the servers 306A-Z. Similarly, if costs for using the packet switched network are sufficiently cheaper than the circuit switched telephone network 104, some outbound calls could be placed using outbound calling through a VoIP carrier (e.g. by directing the media gateway 302 to route outbound calls using VoIP to a VoIP gateway belonging to a telecommunications carrier or one belonging to the operator of the phone application platform 110).
In some embodiments, phone application platform 110 and the development platform web server 108 can be hardware based, software based, or a combination of the two. In some embodiments, phone application platform 110 is comprised of one or more computer programs that are included in one or more computer usable media such as CD-ROMs, floppy disks, or other media. In some embodiments, audio providers, SIP servers, SIP clients, SIP proxies, and/or some other type of SIP program, are included in one or more computer usable media.
Some embodiments of the invention are included in an electromagnetic wave form. The electromagnetic waveform comprises information such as audio providers, SIP servers, SIP clients, SIP proxies, and/or some other type of SIP program. The electromagnetic waveform may include the programs accessed over a network.
The foregoing description of various embodiments of the invention has been presented for purposes of illustration and description. It is not intended to limit the invention to the precise forms disclosed. Many modifications and equivalent arrangements will be apparent.
This application relates to, incorporates by reference, and claims priority from U.S. Provisional Application No. 60/219,911, entitled “Method and Apparatus for Efficient Voice Activated Services Accessible over Telephone Interface,” filed 21 Jul. 2000, having inventors Mark Verber, et al.
Number | Date | Country |
---|---|---|
60219911 | Jul 2000 | US |