1. Field of the Invention
This invention relates generally to software design tools and, more particularly, to a user interface that aids a developer of an interactive voice response (IVR) system in creating a call steering application to associate user intent with the user's responses to open-ended questions.
2. Description of the Background Art
In many interactive voice response (IVR) systems, a user inputs one of a predefined set of responses in order to be routed to a destination (e.g., “for customer service, press or say 1;” “for technical support, press or say 2;” etc.). A call steering application within an IVR system is different in that it routes a caller to his or her intended destination based on receiving responses to open-ended questions (e.g., “What is the purpose of your call?”). Call steering applications are difficult to implement because there are many possible responses to an open-ended question. Even among users that have the same objective or intent, there can be many different responses to a question. For example, an IVR system for a banking institution may ask its customers what they would like assistance with. A customer could respond with “New account,” “Checking account,” or “Open account” with the intention of being transferred to the same destination. A developer of a call steering application must associate each of the various likely responses with a semantic meaning. Given the large number of possible responses, this is usually a complex and time-consuming process requiring the expertise of specialists and various scripts.
Therefore, there is a need for a call steering application development tool that enables businesses to more easily and efficiently develop and create a call steering application. Specifically, there is a need for a user interface that enables a user to easily tag response data with a semantic meaning so that statistical models can be trained to better understand the responses to the open-ended questions.
The present invention is directed to a system and method for providing a developer of an interactive response system with an easy-to-use interface for training the natural language grammar of a steering application to appropriately steer the user's session in response to an open-ended question. More specifically, the system and method enable the developer of the steering application to associate semantic tags with user responses, which may be in the form of either voice or written inputs.
User responses to an open-ended steering question posed by an interactive response system are obtained. The user responses are then automatically clustered into groups, where each group is a set of sentences that are semantically related. Preliminary semantic tags are then automatically assigned to each of the groups. A user interface is provided that enables a user to validate the content of the groups to ensure that all sentences within a group have the same semantic meaning and to view and edit the preliminary semantic tags associated with the groups.
The user interface includes a groups view that displays a list of the groups and corresponding semantic tags for each group. The groups view enables a user to edit the preliminary semantic tags associated with each of the groups. The user interface also includes a sentence view that displays, for a selected group in the groups view, a list of unique sentences associated with the selected group. In the sentence view, a user is able to verify whether or not a sentence belongs to the group selected in the groups view. Finally, the user interface includes a related-groups view that displays, for a selected group in the groups view or a selected sentence in the sentence view, a plurality of groups most closely related to the selected group or sentence.
The user interface enables a user to move a displayed sentence in one group to a different group, such as, for example, by dragging the sentence from the sentence view to one of the plurality of most closely related groups in the related-groups view or to one of the plurality of groups in the groups view. In one embodiment, the user is also able to move a displayed sentence to a group-later holding area or to a rejected-sentence holding area.
Other functionality includes the ability to merge groups, create a new group, apply an inconsistency checker to detect inconsistencies in the tagged data, and apply semantic clustering to new or unverified sentences, where applying semantic clustering distributes the new or unverified sentences into groups based at least in part on the grouping of the previously verified sentences.
In a preferred embodiment, user responses are clustered and tagged in phases. In this embodiment, a subset of the obtained user responses is selected, and this subset is clustered into groups, where each group is a set of sentences that are semantically related. Preliminary semantic tags are automatically assigned to each of the groups, and the groups are then displayed in the above-described user interface to enable a user to verify and edit the content of the groups and the semantic tags associated with the groups. This process is repeated one or more times with another subset of user responses until all the user responses have been grouped and tagged. In one embodiment, the “most valuable” user responses are selected for grouping and tagging first. The “most valuable” may be the most common responses or those responses with certain key words or phrases.
Each iteration of clustering and tagging may use data from previously grouped and tagged responses to increase the accuracy of the clustering and preliminary tagging steps. In other words, the system may be able to better automatically group and tag responses with each iteration.
Examples of interactive response systems include interactive voice response (IVR) systems, as well as customer service systems that can communicate with a user via a text or written interface.
Once user utterances have been obtained, the IVR system then obtains transcriptions of the caller utterances (step 120). This may be performed manually by individuals listening to and transcribing each recorded utterance, automatically through use of an audio transcription system, or partially manually and partially automatically. In a preferred embodiment, the number of manual transcriptions is minimized in accordance with the audio clustering method described in the patent application titled “Sample Clustering to Reduce Manual Transcriptions in Speech Recognition System” filed as U.S. patent application Ser. No. 12/974,638 on Dec. 21, 2010 and herein incorporated by reference in its entirety.
After obtaining transcriptions of caller utterances, the transcribed caller utterances are then automatically grouped into groups or “clusters” based on their semantic meaning (step 130). In the above example, “Open a checking account,” “New account,” “Get another account,” would all be grouped together as they have the same semantic meaning. Further details of this process are disclosed in the patent application titled “Training Call Routing Applications by Reusing Semantically-Labeled Data Collected for Prior Applications” filed as U.S. patent application Ser. No. 12/894,752 on Sep. 30, 2010 and herein incorporated by reference in its entirety.
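By way of illustration only, the grouping of step 130 can be sketched with a simple keyword-overlap heuristic. This is not the method of the incorporated application Ser. No. 12/894,752; the function names, the stopword list, and the shared-word rule below are assumptions chosen to keep the sketch short.

```python
# Illustrative stand-in for semantic clustering: a sentence joins the first
# group with which it shares at least one content word. STOPWORDS and both
# function names are hypothetical and not taken from the specification.
STOPWORDS = {"a", "an", "the", "my", "i", "to"}


def content_words(sentence):
    """Lower-case the sentence and drop stopwords."""
    return {w for w in sentence.lower().split() if w not in STOPWORDS}


def cluster_by_shared_words(sentences):
    """Greedily group sentences that share a content word."""
    groups = []  # each entry: {"words": set of words, "sentences": list}
    for sentence in sentences:
        words = content_words(sentence)
        for group in groups:
            if words & group["words"]:
                group["sentences"].append(sentence)
                group["words"] |= words
                break
        else:
            # No existing group shares a word, so start a new group.
            groups.append({"words": set(words), "sentences": [sentence]})
    return [group["sentences"] for group in groups]


transcriptions = [
    "Open a checking account",
    "New account",
    "Get another account",
    "Lost my card",
    "Report a stolen card",
]
clusters = cluster_by_shared_words(transcriptions)
```

In this sketch the three account-related utterances fall into one group and the two card-related utterances into another. A production system would rely on a statistical notion of semantic similarity rather than literal word overlap.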
Preliminary semantic tags are then automatically assigned to each of the groups (step 140). In a preferred embodiment, a semantic tag could comprise an “action” tag and an “object” tag. In the above example, the group including the user utterances “Open a checking account,” “New account,” and “Get another account” may be assigned the action “Open” and the object “Account.” As a person skilled in the art would understand, other types of tags may be assigned within the scope of this invention. In certain embodiments, the IVR system and/or the call steering application may assign preliminary tags by connecting to a database of semantically-labeled data gathered from prior applications, which correlates user utterances and assigned tags. In some cases, this database would be updated regularly, such as, for example, by receiving feedback from other call steering applications that connect with it. The IVR system and/or the call steering application may also assign tags based on any available data determined through previous iterations of the application being developed (e.g., based on information learned from previously-tagged data for the application).
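As a minimal sketch of step 140, a preliminary action/object tag for a group could be chosen by majority vote over the tags that previously labeled data associates with the group's sentences. The in-memory dictionary standing in for the database of semantically-labeled data, its contents, and the function name are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the database of semantically-labeled data:
# utterance -> (action, object).
PRIOR_LABELS = {
    "open a checking account": ("Open", "Account"),
    "new account": ("Open", "Account"),
    "close my account": ("Close", "Account"),
}


def preliminary_tag(group_sentences, prior_labels=PRIOR_LABELS):
    """Return the most common (action, object) pair among the group's
    sentences that appear in the prior data, or None if none appear."""
    votes = Counter(
        prior_labels[s.lower()]
        for s in group_sentences
        if s.lower() in prior_labels
    )
    if not votes:
        return None  # leave the tag empty for the developer to fill in
    return votes.most_common(1)[0][0]


group = ["Open a checking account", "New account", "Get another account"]
tag = preliminary_tag(group)  # ("Open", "Account")
```

A group with no match in the prior data is left untagged here, which corresponds to the developer setting the tag manually in the groups view.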
A user interface is then provided to the developer to enable the developer to validate the content of groups to ensure that all sentences within a group have the same semantic meaning and to add or edit semantic tags associated with the groups (step 150). In a preferred embodiment, the user interface displays three sections: a groups view, a sentence view, and a related-groups view.
The groups view shows a list of the groups and the corresponding semantic tags. The developer is able to set or edit the values of the semantic tags in the groups view (step 160). By clustering sentences into groups, the interface reduces the amount of data the developer is required to process, as the developer tags the whole group rather than tagging sentences individually. As will be discussed in further detail later, the user interface enables a developer to fine-tune group membership in order to create a semantically consistent set of sentences. This fine-tuning includes, for example, the developer being able to create new groups, merge groups, or remove groups.
The sentence view shows, for a selected group in the groups view, a list of unique sentences associated with the selected group. The user is able to verify whether or not a sentence belongs in the selected group (step 170). The user is also able to move sentences, such as to another group, to a group-later holding area or to a rejected-sentence holding area. In one embodiment, the developer can move a sentence by dragging the sentence to one of the related groups in the related-groups view.
The related-groups view shows, for a selected group or sentence, a plurality of groups most closely related to the selected group or sentence (step 180). This helps to guide the developer to select an appropriate group for a sentence by ordering and displaying the most likely options.
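One possible way to order the groups shown in the related-groups view is to score every group against the selected sentence and keep the highest-scoring ones. The sketch below uses Jaccard word overlap as the score; this measure, the function names, and the sample data are assumptions, since the description does not specify how relatedness is computed.

```python
def tokenize(text):
    """Split a sentence into a set of lower-case words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_groups(selected_sentence, groups, top_n=3):
    """Return the names of the top_n groups most similar to the sentence.

    groups maps a group name to the list of sentences in that group.
    """
    target = tokenize(selected_sentence)
    scored = []
    for name, sentences in groups.items():
        vocabulary = set().union(*(tokenize(s) for s in sentences))
        scored.append((jaccard(target, vocabulary), name))
    # Stable sort: groups with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_n]]


groups = {
    "Open/Account": ["open a checking account", "new account"],
    "Close/Account": ["close my account"],
    "Report/Card": ["lost my card", "report a stolen card"],
}
suggestions = related_groups("close checking account", groups, top_n=2)
```

For the sample data, the sentence shares the most vocabulary with the close-account group, followed by the open-account group, so those two would be offered first.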
In one embodiment, after a developer has validated group content and reviewed and edited (as necessary) the semantic tags for each group, the Call Steering Tagging User Interface Module 550 (see
The groups view 205 displays a description that is pre-filled with a sample sentence 220 for each group formed in step 130 of
Additionally, in certain embodiments, the groups view may contain a button that connects the developer to a rejected-sentences holding area 250 and another button that connects the developer to a group-later holding area 255. These two features will be discussed in further detail in the sentence view section below. There may additionally be an add group button 260, which would create a new group for the developer to populate with sentences.
The groups view may also include an initiate semantic clustering button 265. For example, as new sentences are introduced into the system, the initiate semantic clustering button 265 may change color to notify the developer that more data has been introduced. When the developer presses the initiate semantic clustering button 265, any new sentences added to the system are distributed into groups based on the group membership of previously verified sentences. Any unverified sentences may likewise be redistributed into other groups on the same basis, without breaking or merging already verified groups. In other words, as group membership is fine-tuned by a user, the system can learn from the information and use it to distribute or redistribute any new or unverified sentences. This allows the system to refresh based on the most up-to-date information.
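The redistribution behavior can be sketched as follows, under the assumption that each sentence carries a verified flag and that similarity is measured by word overlap with the verified members of each group. The data layout, the "Ungrouped" fallback, and the function names are hypothetical.

```python
def words(sentence):
    return set(sentence.lower().split())


def redistribute(groups, new_sentences):
    """Place new and unverified sentences using verified membership only.

    groups maps a group name to a list of (sentence, verified) pairs.
    Verified sentences never move, so verified groups are not broken up.
    """
    vocab = {}
    result = {}
    pending = [(s, None) for s in new_sentences]
    for name, members in groups.items():
        verified = [s for s, ok in members if ok]
        vocab[name] = set().union(*(words(s) for s in verified))
        result[name] = [(s, True) for s in verified]
        pending += [(s, name) for s, ok in members if not ok]
    for sentence, current in pending:
        scores = {n: len(words(sentence) & v) for n, v in vocab.items()}
        best = max(scores, key=scores.get, default=None)
        if best is not None and scores[best] > 0:
            target = best
        else:
            # No overlap with any verified vocabulary: keep an unverified
            # sentence where it was, and park a new one in "Ungrouped".
            target = current or "Ungrouped"
        result.setdefault(target, []).append((sentence, False))
    return result


groups = {
    "Open/Account": [("open a checking account", True),
                     ("cancel my card", False)],
    "Cancel/Card": [("cancel credit card", True)],
}
updated = redistribute(groups, ["open savings account", "what time is it"])
```

Here the unverified sentence "cancel my card" moves to the card group, the new account sentence joins the account group, and the unrelated new sentence is set aside, while both verified sentences stay where the developer put them.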
The sentence view displays a list of sentences 270 corresponding to the selected group in the groups view 205. For example, in
When a developer clicks the arrow 285 in the next column, the corresponding sentence is moved to the group-later holding area, accessible through pressing the group-later button 255. In certain embodiments, when the number of sentences in the group-later holding area reaches a certain threshold, the application will remind the developer to review those sentences and to move them into other groups.
When a developer clicks the cross 290 in the far right column, the corresponding sentence is moved to the possible-rejects holding area, accessible through pressing the possible-rejects button 250. The developer may also move a sentence into another group by dragging it to the group. For example, the developer may drag the sentence to one of the groups in the groups view or to one of the closely related groups as shown in the related-groups section. The developer may also create a new group for a particular sentence by right-clicking on a selected sentence and choosing to add a new group through a pop-up menu.
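The sentence moves described above (to another group, to the group-later holding area, or to the rejects holding area) can be modeled with a small data structure. The class name, its fields, and the default reminder threshold are assumptions for illustration; the description does not state a threshold value.

```python
from dataclasses import dataclass, field


@dataclass
class TaggingWorkspace:
    """Hypothetical model of the groups and the two holding areas."""
    groups: dict = field(default_factory=dict)      # name -> list of sentences
    group_later: list = field(default_factory=list)
    rejected: list = field(default_factory=list)
    reminder_threshold: int = 3                     # assumed value

    def move(self, sentence, source, target):
        """Drag a sentence from one group to another (creating it if new)."""
        self.groups[source].remove(sentence)
        self.groups.setdefault(target, []).append(sentence)

    def defer(self, sentence, source):
        """Send a sentence to the group-later area.

        Returns True when the area has reached the reminder threshold,
        signaling that the developer should be prompted to review it.
        """
        self.groups[source].remove(sentence)
        self.group_later.append(sentence)
        return len(self.group_later) >= self.reminder_threshold

    def reject(self, sentence, source):
        """Send a sentence to the rejects holding area."""
        self.groups[source].remove(sentence)
        self.rejected.append(sentence)


ws = TaggingWorkspace(groups={"Open/Account": ["new account", "uh hello"]})
ws.move("new account", "Open/Account", "Create/Account")
ws.reject("uh hello", "Open/Account")
```

Returning a flag from `defer` is one simple way to let the interface decide when to show the reminder; an event or callback would serve equally well.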
With both the groups view and the sentence view, the developer is given the option to filter 295 the groups or sentences for certain keywords. By limiting the number of groups or sentences in the groups or sentence view, the developer may more efficiently organize sentences into appropriate groups. The developer may also filter the groups based on a column header, such as, for example, the “All Actions” 225 or “All Objects” 230 column in the groups view. The developer can also sort the remaining columns to further optimize organization.
In the illustrated embodiment, the related groups view 215 may be displayed in either a star diagram (
In one embodiment, the user interface enables a user to merge groups.
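A merge operation of this kind might be sketched as below. The rule that the merged group keeps the target group's semantic tag, and the removal of duplicate sentences, are assumptions; the function name and data layout are likewise illustrative.

```python
def merge_groups(groups, tags, source, target):
    """Merge the source group into the target group.

    groups maps a group name to a list of sentences; tags maps a group
    name to its (action, object) tag. New dictionaries are returned and
    the inputs are left unchanged.
    """
    merged = list(groups[target])
    for sentence in groups[source]:
        if sentence not in merged:  # keep the sentence list unique
            merged.append(sentence)
    new_groups = {k: v for k, v in groups.items() if k != source}
    new_groups[target] = merged
    # Assumption: the surviving group keeps the target's tag.
    new_tags = {k: v for k, v in tags.items() if k != source}
    return new_groups, new_tags


groups = {
    "g1": ["open a checking account", "new account"],
    "g2": ["new account", "get another account"],
}
tags = {"g1": ("Open", "Account"), "g2": ("Get", "Account")}
merged_groups, merged_tags = merge_groups(groups, tags, source="g2", target="g1")
```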
An Automatic Transcription Module 520 transcribes utterances recorded by IVR System 510. An Audio Clustering Module 530 performs audio clustering to reduce the amount of manual verification needed for the transcriptions. Specifically, utterances that sound identical are grouped together by the Audio Clustering Module 530. Consequently, a user is able to verify the transcription of a representative sentence in each cluster, as opposed to having to verify the transcription of every utterance. In a preferred embodiment, the Audio Clustering Module 530 operates in accordance with the audio clustering method described in the U.S. patent application Ser. No. 12/974,638, which was previously incorporated by reference herein. A Manual Transcription Interface 535 enables a user to verify and correct automatic transcriptions as needed.
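The saving from audio clustering can be illustrated with a short sketch in which a transcription verified for one representative per cluster is applied to every utterance in that cluster. The identifiers and data layout are hypothetical, and the clustering itself (performed as described in application Ser. No. 12/974,638) is taken as given.

```python
def apply_verified_transcriptions(clusters, verified):
    """Copy each cluster's verified transcription to all of its utterances.

    clusters maps a cluster id to a list of utterance ids; verified maps a
    cluster id to the transcription the user confirmed for that cluster's
    representative utterance.
    """
    transcriptions = {}
    for cluster_id, utterance_ids in clusters.items():
        for utterance_id in utterance_ids:
            transcriptions[utterance_id] = verified[cluster_id]
    return transcriptions


clusters = {"c1": ["u1", "u4", "u7"], "c2": ["u2"]}
verified = {"c1": "new account", "c2": "lost my card"}
transcriptions = apply_verified_transcriptions(clusters, verified)
# Two manual verifications cover four recorded utterances.
```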
The Semantic Clustering Module 540 clusters the transcribed utterances into groups by semantic meaning. In the preferred embodiment, the Semantic Clustering Module 540 operates in accordance with the semantic clustering method described in U.S. patent application Ser. No. 12/894,752, which was previously incorporated by reference herein.
The Tagging User Interface Module 550 generates a user interface as discussed with respect to
In certain embodiments, the Tagging User Interface Module 550 is configured to operate in “phases” in order to ensure that the automatic groupings and assignments of semantic tags are of higher quality and to increase overall efficiency in developing the application. In “Phased Tagging,” a subset of the user responses is initially chosen. The user responses in the subset are then automatically grouped within the subset into groups of semantically related sentences. Preliminary semantic tags are automatically assigned to each group, and each group is displayed in the above-described user interface to enable a user to verify and edit the content of the groups and the semantic tags associated with the group.
The subset can be a random set of user responses or chosen based on particular criteria, such as, for example, user responses containing the most common sentences, user responses containing key words or phrases, or any other criteria. Because the subset contains fewer overall sentences, each group also contains fewer sentences and the developer will be able to verify the sentences in the groups and validate the semantic tags more efficiently. In certain embodiments, each subset is presented to the developer in individual sentences rather than being grouped into sets of sentences that are semantically related.
Once the initial subset has been verified, another subset of user responses is chosen. Responses in the next subset are clustered into groups of semantically related sentences and preliminary semantic tags are automatically assigned to each group, but for this subset and all future subsets, data from the previously validated and tagged groups are incorporated into the semantic clustering and tagging to increase accuracy. This process of selecting subsets, grouping within the subset, assigning preliminary semantic tags, and providing a user interface is repeated iteratively until all user responses have been processed. In certain embodiments, the next phase of the iterative process is initiated automatically. In other embodiments, the next phase of the iterative process is initiated when the developer presses a button to initiate semantic clustering for the next subset of responses.
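The phased process can be sketched as a loop over subsets, here selected by taking the most common responses first. The batch size, the function names, and the `review` callback (which stands in for clustering, preliminary tagging, and developer validation of one subset) are assumptions made for illustration.

```python
from collections import Counter


def phased_batches(responses, batch_size):
    """Yield subsets of unique responses, most frequent first."""
    ranked = [sentence for sentence, _ in Counter(responses).most_common()]
    for start in range(0, len(ranked), batch_size):
        yield ranked[start:start + batch_size]


def run_phases(responses, batch_size, review):
    """Process every subset, feeding earlier results into later phases.

    review(batch, previously_verified) is a hypothetical callback that
    returns a mapping of sentence -> tag for the batch it was given.
    """
    verified = {}
    for batch in phased_batches(responses, batch_size):
        # A copy of the earlier results is passed so that each phase can
        # draw on previously validated data without mutating it.
        verified.update(review(batch, dict(verified)))
    return verified


def review_stub(batch, previously_verified):
    # Placeholder for the real pipeline: tag each sentence by its last word.
    return {sentence: sentence.split()[-1] for sentence in batch}


responses = (["new account"] * 3 + ["lost card"] * 2
             + ["open account", "close account"])
tagged = run_phases(responses, batch_size=2, review=review_stub)
```

With this sample data, the first phase covers the two most frequent responses and the second phase covers the remaining two, at which point the second phase has the first phase's verified results available to it.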
While the foregoing discussion has been framed in terms of an interactive voice response (IVR) system, the inventive concept can be applied to any interactive response system within the scope of the present application. For example, certain embodiments of the present invention may include systems and methods for developing a steering application to direct written or textual inputs (rather than audio inputs), such as, for example, text messaging, emails, online chat system communications, internet or web based communications, or other forms of customer service systems. Such interactive response systems based on written or textual data may not require certain steps described above that are more specific to an interactive voice response system, such as, for example, transcription of the user utterances, but would be similar in all other respects.
As will be understood by those familiar with the art, the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. Accordingly, the above disclosure of the present invention is intended to be illustrative and not limiting of the invention.