BACKGROUND OF THE INVENTION
1. Field of the Invention
The field of the invention is data processing, or, more specifically, methods, systems, and products for voicemail searching.
2. Description Of Related Art
Busy professionals today rely heavily upon the capabilities of voicemail systems, which have become pervasive throughout both professional and personal messaging channels. It is not at all uncommon that a business professional may receive dozens of voicemail messages in a single day. Often, throughout the day, that individual may check messages as opportunity arises and save those messages which need to be reviewed again or acted upon later. As a result of this scenario repeating over days and weeks, it can become quite cumbersome to sift through the numerous saved messages which might be present in the user's message queues at any given time. It is also difficult for the voicemail system user to prioritize the order in which he or she hears messages, as standard systems prioritize strictly by “urgent” and “standard” messages, as specified at the point of call origin. Unfortunately, these caller-defined values often will not correspond to the listener's priorities for message playback. There is therefore an ongoing need for improved methods of voicemail searching.
Methods, systems, and products for voicemail searching are disclosed as including storing, in association with voicemail messages, caller voiceprints of callers who leave voicemail messages for voicemail users in a voicemail system; storing caller speech tags in association with the voiceprints; identifying, in dependence upon caller voiceprints, callers who leave new voicemail messages; receiving, from a particular voicemail user, search keywords entered as speech and converted to text through automated speech recognition; and selecting, in dependence upon the search keywords and the caller speech tags, one or more selected voicemail messages from a multiplicity of voicemail messages for the particular voicemail user.
In some embodiments, storing caller voiceprints includes prompting callers for predefined greetings for voiceprints. In other embodiments, storing caller voiceprints includes extracting voiceprints from voicemail. In typical embodiments, storing caller speech tags is carried out by prompting voicemail users to enter caller speech tags for the voiceprints. Prompting voicemail users to enter caller speech tags often includes accepting spoken caller speech tags from voicemail users and converting the spoken caller speech tags to text.
Another method for voicemail searching is disclosed as including storing, in association with voicemail messages, caller identification data that identifies callers who leave voicemail messages for voicemail users in a voicemail system; identifying, in dependence upon the caller identification data, callers who leave new voicemail messages; receiving search keywords from a particular voicemail user; and selecting, in dependence upon the search keywords and the caller identification data, one or more selected voicemail messages from a multiplicity of voicemail messages for the particular voicemail user. A further method for voicemail searching is disclosed as including storing, in association with voicemail messages, message text converted from the voicemail messages; receiving, from a particular voicemail user, search keywords entered as speech and converted to text through automated speech recognition; and selecting, in dependence upon the search keywords and the message text, one or more selected voicemail messages from a multiplicity of voicemail messages for the particular voicemail user.
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular descriptions of exemplary embodiments of the invention as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts of exemplary embodiments of the invention.
Exemplary embodiments are described generally in this specification in terms of methods for voicemail searching. Persons skilled in the art, however, will recognize that any computer system that includes suitable programming means for operating in accordance with the disclosed methods also falls well within the scope of the present invention. Suitable programming means include any means for directing a computer system to execute the steps of the method of the invention. Suitable programming means include, for example, systems comprised of processing units and arithmetic-logic circuits connected to computer memory. Such systems generally have the capability of storing in computer memory programmed steps of methods according to exemplary embodiments for execution by a processing unit. Generally in such systems, computer memory is implemented in many ways as will occur to those of skill in the art, including magnetic media, optical media, and electronic circuits configured to store data and program instructions.
Further, embodiments may be implemented as a computer program product for use with any suitable data processing system. Embodiments of a computer program product may be implemented as a diskette, CD ROM, EEPROM (‘flash’) card, or other magnetic or optical recording media for storage of machine-readable information as will occur to those of skill in the art. Persons skilled in the art will immediately recognize that any computer system having suitable programming means will be capable of executing the steps of methods according to exemplary embodiments as included in a computer program product. Moreover, persons skilled in the art will recognize immediately that, although many of the exemplary embodiments described in this specification are oriented to software installed on computer hardware, alternative embodiments implemented as firmware or other computing machinery are nevertheless well within the scope of the present invention.
Exemplary methods, systems, and products for voicemail searching now are described with reference to the drawings, beginning with
Methods of voicemail searching according to embodiments of the present invention typically include receiving, from a particular voicemail user, search keywords entered as speech and converted to text through automated speech recognition. When such a user provides search keywords for searching for one or more voicemail messages, typical embodiments include selecting for the user's review one or more selected voicemail messages from among all of the voicemail messages recorded for that particular voicemail user. Such a search is carried out by searching for the search keywords among caller speech tags that were previously stored as data elements associated with the voicemail messages in the voicemail system.
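By way of illustration only, and not limitation, the selection step described above may be sketched in Python as follows. The record layout, the function name select_messages, and the sample data are assumptions introduced for this sketch and do not appear in the specification; the search keywords are assumed to have already been converted from speech to text.

```python
from dataclasses import dataclass, field


@dataclass
class VoicemailRecord:
    """One stored voicemail message plus the speech tags indexed to it (hypothetical layout)."""
    message_id: int
    user_id: str
    speech_tags: list = field(default_factory=list)


def select_messages(records, user_id, keywords):
    """Return the given user's messages whose speech tags contain any search keyword."""
    wanted = {k.lower() for k in keywords}
    selected = []
    for rec in records:
        if rec.user_id != user_id:
            continue  # search only among this user's own messages
        tag_words = {w.lower() for tag in rec.speech_tags for w in tag.split()}
        if wanted & tag_words:
            selected.append(rec)
    return selected


# Hypothetical sample data: two messages for one user, one for another.
records = [
    VoicemailRecord(1, "alice", ["John Doe"]),
    VoicemailRecord(2, "alice", ["Mary Smith"]),
    VoicemailRecord(3, "bob", ["John Doe"]),
]
hits = select_messages(records, "alice", ["john"])
print([r.message_id for r in hits])
```

In this sketch a message is selected when any keyword matches any word of any tag, compared without regard to case; other matching policies are equally possible.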
The exemplary architecture of
In the present example, network 238 may comprise a private network, intranet, or a public Internet Protocol network, such as, for example, the Internet. PSTN 102 is connected for data communications to network 238. Available data communications include both voice and data signals coupled to network 238 through one or more gateways (not shown). Each gateway acts as a switch between PSTN 102 and network 238 that may compress signals, convert signals into the message form of the Internet Protocol, SIP, or other protocol packets, and route packets through network 238 to a destination server. SIP in particular is a signaling protocol for Internet conferencing, telephony, presence, events notification and instant messaging. The gateways 124 may include Parlay gateways and SS7 gateways. Internet servers, such as telco application server 116, may include protocol agents that are enabled to interact with multiple protocols encapsulated in Internet Protocol packets including, for example, SS7, Parlay, and SIP.
“SS7” is the Common Channel Signaling System No. 7, a global standard for telecommunications defined by the International Telecommunication Union (“ITU”). The SS7 standard defines the procedures and protocol by which network elements in the PSTN exchange information over a digital signaling network to effect wireless and wireline call setup, routing, and control. SS7 messages are exchanged between network elements over bidirectional channels called ‘signaling links.’ Signaling occurs ‘out-of-band’ on dedicated channels rather than in-band on voice channels. SS7 network signaling points are uniquely identified by a numeric point code. Signaling points in SS7 networks include Service Switching Points (“SSPs”), Signal Transfer Points (“STPs”), and Service Control Points (“SCPs”). “Parlay” refers to an open-systems API for telco applications developed by the Parlay Group, an industry consortium that includes IBM, Microsoft, British Telecom, Nortel Networks, Siemens, AT&T, Cisco, Lucent, Ericsson, and others. “SIP” stands for Session Initiation Protocol, a signaling protocol for Internet conferencing, telephony, presence, events notification and instant messaging. SIP supports call setup, routing, caller identification, and other features between endpoints in an Internet Protocol domain. Telco application server 116 is an example of a server system external to PSTN 102 that may be accessed by PSTN 102 over network 238. In particular, telco application server 116 includes multiple telco specific service applications 118, 120, 122 for providing services to calls transferred to a server external to PSTN 102. Examples of telco specific services that may be provisioned through an external telco application server such as server 116 include a caller ID server 118, a call forwarding server 120, a voicemail server 122, and others as will occur to those of skill in the art.
Calls may be transferred from PSTN 102 to telco application server 116 to receive at least one service after which the calls are transferred back to PSTN 102. Such services may also be provided to calls from within PSTN 102. Providing such services from a third party location such as telco application server 116 is advantageous, however, because adding services and information to PSTN 102 is time consuming and costly when compared with the time and cost of adding the services through telco application server 116.
Telco application server 116, or other servers as will occur to those of skill in the art, in addition to telco related services, may also provide messaging services, financial services, database management services, and others as will occur to those of skill in the art. Such services may be accessed by subscribers and other users in the HyperText Transfer Protocol (“HTTP”) via network 238. Telco application server 116 may also support subscriber profiles as well as services for managing and updating subscriber profiles.
A caller may be identified by one of the telephony devices 114, by the PSTN 102 itself, or by telco application server 116. By identifying a caller as such, rather than merely identifying a device from which a call is made, an enhanced specialization of services to subscribers may be performed, particularly in the use of voicemail searching according to embodiments of the present invention.
A voicemail service 122 of telco application server 116 may include identification of a caller for a particular voicemail message. Such a service may require that callers provide voiceprints when leaving voicemail messages. Alternatively, the service may extract voiceprints from voicemail messages. Stored voiceprints may then be compared against subsequent voicemail messages to identify a caller who leaves a new voicemail message.
A PSTN 102 typically includes multiple central office switches 108 that originate and terminate calls. Central office switches 108 query service control points (“SCPs”) 104 to determine how to route calls. SCPs 104 send responses to central office switches containing routing numbers associated with a dialed number for a call. SCPs 104 may be general purpose computers storing databases of call processing information. While in the present example, SCPs 104 are depicted locally within PSTN 102, in other embodiments, SCPs 104 may be part of an extended network accessible to PSTN 102 via a network.
One of the functions performed by SCPs 104 is processing calls to and from various subscribers. For example, an SCP may store in a subscriber profile or a user profile a record of services purchased by a subscriber or user, such as a voicemail service. When a call is made to the subscriber or user, the SCP may provide a record of the voicemail service to support a request for a caller to provide a voiceprint.
In particular, network traffic between signaling points may be routed via a packet switch called a signal transfer point (“STP”) 110. STP 110 routes each incoming message to an outgoing signaling link based on routing information. The signaling network may typically utilize an SS7 network implementing SS7 protocol.
Central office switches 108 may also send voice and signaling messages to intelligent peripherals (“IPs”) 106 via voice trunks and signaling channels. IP 106 provides enhanced announcements, enhanced digit collection, and enhanced speech recognition capabilities.
In typical embodiments of the present invention, a caller is identified according to voice recognition. Voice recognition is preferably performed by first identifying a caller by matching a voiceprint with a portion of a voicemail message. Voiceprints may be stored on and provisioned from local IPs 106, remote IPs accessed across a network, telephony devices 114, a telco application server 116, a voicemail server 122, or other repositories for voiceprints as will occur to those of skill in the art. In alternate embodiments, a caller may be identified according to caller identification information such as a telephone number or a caller's name provided by a caller ID service.
Telephony devices 114 may include, for example, wireless devices, pervasive devices equipped with telephony features, a network computer, a facsimile, a modem, PDAs, wireless telephones, other handheld wireless devices, and other devices enabled for network communication as will occur to those of skill in the art. Caller voice recognition functionality may advantageously be included in any telephony device 114.
Telephony devices are connected for communications to PSTN 102 via wireline, wireless, optical, ISDN, and other communication links. Connections to telephony devices 114 typically provide digital transport for two-way voice grade type telephone communications and a channel transporting signaling data messages in both directions between telephony devices 114 and PSTN 102. In addition to telephony devices 114, advanced telephone systems, such as call centers 112, may be connected for communications to PSTN 102 via wireline, wireless, optical, ISDN and other communication links. Call centers 112 may include PBX systems, hold queue systems, private network systems, and other systems that are implemented to handle distribution of calls to multiple representatives or agents.
In a typical PSTN 102, one central office switch 108 serves each exchange or area served by the NXX digits of an NXX-XXXX (seven digit) telephone number or the three digits following the area code digits (the Numbering Plan Area code or “NPA”) in a ten-digit telephone number. A service provider owning a central office switch also assigns a telephone number to each line connected to each of central office switches 108. The assigned telephone number includes the area code (NPA) and exchange code (NXX) for the serving central office and four unique digits (XXXX).
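The numbering structure described above may be illustrated, purely as a sketch, by a small Python helper that splits a seven-digit or ten-digit telephone number into its NPA, NXX, and XXXX parts. The function name split_telephone_number and the sample number are assumptions made for this example and do not appear in the specification.

```python
def split_telephone_number(number):
    """Split a 7- or 10-digit number into area code (NPA), exchange (NXX), and line (XXXX)."""
    # Keep only the digits so that punctuation such as "(512) 555-0142" is tolerated.
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) == 10:
        return {"npa": digits[:3], "nxx": digits[3:6], "line": digits[6:]}
    if len(digits) == 7:
        # A seven-digit number carries no area code.
        return {"npa": None, "nxx": digits[:3], "line": digits[3:]}
    raise ValueError("expected a 7- or 10-digit telephone number")


print(split_telephone_number("(512) 555-0142"))
```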
Central office switches 108 in such PSTNs typically utilize office equipment (“OE”) numbers to identify specific equipment, such as physical links or circuit connections. For example, a subscriber's line might terminate on a pair of terminals on a main distribution frame of a central office switch 108. The switch identifies the terminals, and therefore a particular line, by an OE number assigned to that terminal pair. A service provider may assign different telephone numbers to the one line at the same or different times. For example, a local carrier may change the telephone number because a subscriber sells a house and a new subscriber moves in and receives a new number. The OE number for the terminals and thus the line itself, however, remains the same.
On a normal call, a central office switch will detect an off-hook condition on a line and provide a dial tone. The switch identifies the line by the OE number. The central office switch retrieves subscriber or user profile information corresponding to the OE number and off-hook line. The central office switch then receives the dialed digits from the off-hook line terminal and routes the call. The central office switch may route the call over trunks and possibly through one or more central office switches to the central office switch that serves the callee's station or line. The switch terminating a call to a destination will also utilize profile information relating to the destination, for example, to forward the call if appropriate, to apply distinctive ringing, and to provide other services oriented to the callee.
The computer 106 of
The example computer 106 of
The example computer of
Exemplary methods and systems for voicemail searching are further explained with reference to
The method of
Caller voiceprints may be acquired for storage by prompting (252 on
The exemplary data structures of
The caller records 208 in the exemplary structures of
The caller records 208 are related many-to-many 236 to the user profile records 202. The relationship 236 is not literal, of course, because the user profile records 202 in this example contain no callerID fields 210, and the caller records 208 contain no userid fields 204. The relationship instead is implemented by using the voicemail search records 212 as a linking table between the user profiles 202 and the caller records 208, thereby implementing a many-to-many relationship in which one user may have voicemail messages from many callers and one caller may leave voicemail messages for many users. Each voicemail search record 212 represents one voicemail message from one caller for one user. This is represented in the exemplary data structures by the one-to-one relationship 244 between the voicemail search records 212 and the voicemail messages 228, the one-to-one relationship being implemented by use of messageId 206 as a foreign key.
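The linking-table arrangement just described may be sketched, as one possible illustration only, with an in-memory SQLite database driven from Python. The field names userid, callerID, and messageId follow the description above; the table names, the speech_tag and audio columns, the helper functions, and the sample rows are assumptions introduced for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_profile (userid TEXT PRIMARY KEY);
    CREATE TABLE caller (callerID TEXT PRIMARY KEY, speech_tag TEXT);
    CREATE TABLE voicemail_message (messageId INTEGER PRIMARY KEY, audio BLOB);
    -- Linking table: one row per message, from one caller, for one user.
    -- UNIQUE on messageId keeps the relationship to voicemail_message one-to-one.
    CREATE TABLE voicemail_search (
        messageId INTEGER UNIQUE REFERENCES voicemail_message(messageId),
        userid    TEXT REFERENCES user_profile(userid),
        callerID  TEXT REFERENCES caller(callerID)
    );
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO user_profile VALUES (?)", [("alice",), ("bob",)])
conn.executemany("INSERT INTO caller VALUES (?, ?)",
                 [("c1", "John Doe"), ("c2", "Mary Smith")])
conn.executemany("INSERT INTO voicemail_message (messageId) VALUES (?)",
                 [(1,), (2,), (3,)])
conn.executemany("INSERT INTO voicemail_search VALUES (?, ?, ?)",
                 [(1, "alice", "c1"), (2, "alice", "c2"), (3, "bob", "c1")])


def callers_for_user(userid):
    """All callers who have left messages for one user."""
    rows = conn.execute(
        "SELECT DISTINCT callerID FROM voicemail_search "
        "WHERE userid = ? ORDER BY callerID", (userid,))
    return [r[0] for r in rows]


def users_for_caller(caller_id):
    """All users for whom one caller has left messages."""
    rows = conn.execute(
        "SELECT DISTINCT userid FROM voicemail_search "
        "WHERE callerID = ? ORDER BY userid", (caller_id,))
    return [r[0] for r in rows]


print(callers_for_user("alice"), users_for_caller("c1"))
```

Neither the user table nor the caller table refers to the other directly; both directions of the many-to-many relationship are answered through the linking table alone.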
The exemplary data structure of
As an aid to identifying a particular caller, the method of
The method of
In terms of the exemplary data structures of
It is typical usage for a user to contact the voicemail system and request a search for one or more of the user's voicemail messages. The method of
Searching among text speech tags is advantageously carried out with search keywords encoded also as text. The method of
Advantageously, in typical embodiments of voicemail searching according to the present invention, also illustrated by reference to the example data structures of
Methods, systems, and products for voicemail searching with speech tags associated with voiceprints are further explained through the following use case: Voice samples are taken from participating callers and are stored as voiceprints in association with a user's profile along with an associated user-generated speech tag. More particularly: A caller enters a user's voicemail system. The caller enters the voicemail system because, for example, the callee user's line is busy or the callee user does not answer the telephone. The caller selects a new option to “work with voice commands” and then selects the submenu “register voice signature.” The outside caller is prompted to provide a standard greeting such as “Hello, this is John Doe.” A voiceprint is recorded and stored with a marker indicating that user action is required. In the example data structures of
The callee user enters the voicemail system to check his or her messages. The user is prompted by the voicemail system: “You have new voice signatures, press 8 to work with markers or press 1 to continue.” The user presses the 8 key and enters a “work with speech commands” module in the voicemail system. The user selects a submenu option to “work with new voice signatures.” The voicemail system plays back for the user the marked voiceprint, “Hello, this is John Doe.” The user selects a submenu option to “create a speech tag” for this signature. The user speaks a speech tag for this voiceprint, such as, for example, “John Doe.” The voicemail system converts the speech tag to text, stores and indexes it in association with the voiceprint and the user's profile data.
In an alternative implementation, the registration of voice commands is transparent to the outside caller. In this case, the association of the voiceprint with the particular caller, for indexing of voicemail, is accomplished by the user, where the user (and not the caller) is tasked with associating the caller voice tags obtained by the system with a particular caller.
When a call is received, the voicemail system will attempt to match the caller's voice with existing voiceprints. If a match is found, a new voicemail message is indexed to the associated speech tag. Consider the following new voicemail message, for example:
In the case where a speech tag has already been created for caller John, the phone mail system would index this incoming call to the associated speech tag, which in many cases is the caller's name, “John Doe.” This speech tag would then be used in searching for voicemail messages from John.
If no match is found, that is, John Doe has not previously recorded a voiceprint, the voicemail system may record a sample of the caller's voiceprint, preferably extracted from the new voicemail message, of sufficient length to be useful in identifying the caller, thereby probably capturing the caller's name and the caller's usual method of greeting, and would store it as a new voiceprint. When a user then accesses the voicemail system to listen to messages, the user would be presented with the new voiceprint and provided the opportunity to assign a speech tag as described above. If the listener assigns a speech tag, it is associated with and indexed to the new voiceprint.
Continuing the use case: A new caller leaves a message, and the voicemail system attempts to recognize the caller's voice. The voicemail system then takes action in dependence upon whether it can find a match for the new caller's voice in an existing voiceprint: if it can, then the new voicemail message is indexed to speech tags for the caller; otherwise, the voicemail system records and marks a new voiceprint.
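The match-or-enroll decision summarized above may be sketched in Python as follows. This is an illustration under stated assumptions only: the specification does not say how voiceprints are represented or compared, so this sketch stands in a short list of numeric features for a voiceprint and uses cosine similarity against a hypothetical threshold as the comparison. The class name, field names, and threshold value are likewise assumptions.

```python
import math

MATCH_THRESHOLD = 0.9  # hypothetical similarity cutoff, not taken from the specification


def cosine_similarity(a, b):
    """Similarity of two feature vectors; a stand-in for real voiceprint comparison."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VoicemailIndex:
    def __init__(self):
        self.voiceprints = []  # each: {"features", "speech_tag", "needs_user_action"}
        self.messages = []     # each: {"message_id", "speech_tag"}

    def handle_new_message(self, message_id, features):
        """Index the message to a matching voiceprint's tag, or enroll a new voiceprint.

        Returns True when an existing voiceprint matched, False otherwise.
        """
        best = max(self.voiceprints,
                   key=lambda vp: cosine_similarity(vp["features"], features),
                   default=None)
        if best is not None and cosine_similarity(best["features"], features) >= MATCH_THRESHOLD:
            self.messages.append({"message_id": message_id,
                                  "speech_tag": best["speech_tag"]})
            return True
        # No match: record a new voiceprint and mark it so that the user is
        # prompted to assign a speech tag at the next log-in.
        self.voiceprints.append({"features": list(features),
                                 "speech_tag": None,
                                 "needs_user_action": True})
        self.messages.append({"message_id": message_id, "speech_tag": None})
        return False


index = VoicemailIndex()
index.voiceprints.append({"features": [1.0, 0.0, 0.2],
                          "speech_tag": "John Doe",
                          "needs_user_action": False})
print(index.handle_new_message(1, [0.98, 0.05, 0.21]))  # close to the stored voiceprint
print(index.handle_new_message(2, [0.0, 1.0, 0.0]))     # unlike any stored voiceprint
```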
The callee user later calls in to the voicemail system to hear new (or old) messages. After the system greeting, the user chooses to “search messages through speech commands.” The user provides a speech tag (a name or other search keyword) for the voicemail system to use in searching for messages, new, old, or both. The voicemail system provides the user with message information from messages found by the search keywords or returns the user to the primary voicemail menu if no matches are found. The user is returned to the legacy top level voicemail menu for additional actions.
According to a further advantage of the present invention, voicemail searching may be carried out on the basis of caller identification data in addition to, or instead of, speech tags.
The method of
As mentioned above, it is typical usage for a user to contact the voicemail system and request a search for one or more of the user's voicemail messages. The method of
Methods, systems, and products for voicemail searching with voice recognition and caller identification data are further explained through the following use case in which a user establishes a caller description or caller record for an expected caller. More particularly: A user enters a voicemail system and selects a menu option for “work with speech tags.” The user selects submenu “add new caller record.” The user selects further submenu “add caller identification data.” Using speech, keypad, or keyboard, the user enters a new caller name and phone numbers to associate with this caller, work number, mobile number, and so on. The user creates one or more speech tags to associate with the newly created caller record.
Later, the caller represented by the new caller record leaves a message, and the voicemail system identifies the caller via the stored caller identification data. The voicemail system takes appropriate action on a new message from the caller, such as marking it searchable by speech commands. In the case of a new voicemail message from a caller for whom no caller record or caller identification data has been established, the voicemail system, not being able to identify such a new caller in the absence of a caller record, may mark a new voicemail message as a candidate for user action and then prompt the user at next log-in to enter caller identification data for the new caller.
The callee user calls in to the voicemail system to hear new (or old) messages. After the system greeting, the user chooses to search messages through speech commands. The user provides a name or other search keyword to the voicemail system to search messages. The user is provided with message information meeting the given search keywords or is returned to the legacy phone mail menu if no matches are found.
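The use case above may be sketched in Python as follows, by way of example only. The sketch assumes that the caller identification data available for a new message is the calling telephone number; the class CallerDirectory, the functions file_message and search, and all sample names and numbers are hypothetical and are introduced solely for illustration.

```python
def normalize(number):
    """Reduce a telephone number to its digits so that formatting differences do not matter."""
    return "".join(ch for ch in number if ch.isdigit())


class CallerDirectory:
    """Caller records created in advance by the user (hypothetical structure)."""

    def __init__(self):
        self.by_number = {}

    def add_caller_record(self, name, numbers, speech_tags):
        record = {"name": name, "speech_tags": [t.lower() for t in speech_tags]}
        for n in numbers:  # work number, mobile number, and so on
            self.by_number[normalize(n)] = record

    def identify(self, calling_number):
        return self.by_number.get(normalize(calling_number))


def file_message(directory, inbox, message_id, calling_number):
    """Store a new message; flag it for user action when the caller is unknown."""
    record = directory.identify(calling_number)
    inbox.append({"message_id": message_id,
                  "caller": record,
                  "needs_user_action": record is None})


def search(inbox, keyword):
    """Return ids of messages whose caller name or speech tags match the keyword."""
    kw = keyword.lower()
    return [m["message_id"] for m in inbox
            if m["caller"] is not None
            and (kw in m["caller"]["name"].lower().split()
                 or kw in m["caller"]["speech_tags"])]


directory = CallerDirectory()
directory.add_caller_record("John Doe", ["512-555-0142", "(512) 555-0199"], ["boss"])
inbox = []
file_message(directory, inbox, 1, "5125550142")  # known caller, work number
file_message(directory, inbox, 2, "5125550100")  # unknown caller
print(search(inbox, "boss"), [m["message_id"] for m in inbox if m["needs_user_action"]])
```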
In addition to searches on the basis of speech tags and caller identification data, exemplary embodiments of the present invention also advantageously may support voicemail searching on the basis of text converted from voicemail messages.
The method of
The method of
Methods, systems, and products for voicemail searching with speech recognition and converted message text are further explained through the following use case: A caller leaves a voicemail message. The voicemail system converts the voicemail message to text, applies filter rules, and stores the message text.
Search rules or filter rules may be included in a profile based on specific text search keywords. A more particular example is: A user logs on to the mail system. The user selects a menu option “work with speech commands.” The user selects submenu “create/edit text conversion rules.” The user specifies, via speech or keypad entries, words to be included or excluded from speech to text conversion. The user saves choices and exits menu.
The user subsequently calls in to the voicemail system to review messages. After the system greeting, the user chooses to “search messages through speech commands.” The user provides a name or other search keywords to the voicemail system to search messages. For example, when prompted the user may say “meeting and John,” where the word “and” is preferably removed via the filter rules. So the result is a search of all messages having the words “meeting” and “john.” The user's search keywords are converted to text and compared to stored message text converted from voicemail messages. The user is provided with message information meeting the search keywords or is returned to the primary voicemail menu if no matches are found.
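The “meeting and John” example above may be sketched in Python as follows. The sketch is illustrative only and assumes that speech-to-text conversion has already taken place, so that both the stored messages and the spoken query arrive as text. The excluded-word list, the function names, and the sample messages are assumptions introduced for this example.

```python
EXCLUDED_WORDS = {"and", "or", "the"}  # hypothetical user-defined filter rules


def apply_filter_rules(text, excluded=EXCLUDED_WORDS):
    """Lower-case the text, strip punctuation, and drop words excluded by the filter rules."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    return [w for w in words if w and w not in excluded]


def search_message_text(stored, spoken_query):
    """Return ids of messages whose stored text contains every remaining search keyword."""
    keywords = set(apply_filter_rules(spoken_query))
    if not keywords:
        return []  # nothing left to search for after filtering
    return [mid for mid, words in stored.items() if keywords <= set(words)]


# Message text as it might be stored after conversion and filtering (sample data).
stored = {
    1: apply_filter_rules("Hi, this is John. Can we move the meeting to Friday?"),
    2: apply_filter_rules("John here, call me back."),
    3: apply_filter_rules("The budget meeting is cancelled."),
}
print(search_message_text(stored, "meeting and John"))
```

Requiring every remaining keyword to appear in a message mirrors the example in the text, in which the search covers messages having both “meeting” and “john”; an any-keyword policy would be an equally plausible variant.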
It will be understood from the foregoing description that modifications and changes may be made in various embodiments of the present invention without departing from its true spirit. The descriptions in this specification are for purposes of illustration only and are not to be construed in a limiting sense. The scope of the present invention is limited only by the language of the following claims.