The present disclosure relates generally to the field of audio processing. More particularly, the present disclosure relates to analysis of audio generated by a microphone.
This background section is provided for the purpose of generally describing the context of the disclosure. Work of the presently named inventor(s), to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
Currently, most audio communication systems have a locally controlled mute function that prevents the remote party from hearing the local audio. When the mute function is active, audio generated by the microphone is not transmitted to the remote party.
In call centers, there are several reasons an agent may mute his microphone. The agent may be coughing or sneezing, and does not want the remote party to hear. The agent may be having difficulty handling a call, and so is asking questions of his co-workers. Or the agent may be doing things not related to his work.
In each of these examples, the behavior of the agent may indicate a problem. An ill agent may spread illness to others in the call center. An agent asking questions of his co-workers may need more training, or may have competency issues. Or an agent may not be providing the work desired.
Currently, these problems are generally detected by a supervisor observing the agents directly. This process costs time and resources that could be directed to more productive endeavors. An agent may be observed remotely by monitoring his calls, but such monitoring fails while the mute function is active.
In general, in one aspect, an embodiment features an apparatus comprising: a microphone configured to produce audio; a mute control configured to select a microphone open selection or a microphone muted selection; a processor configured to identify the audio produced during the microphone open selection as primary audio, and to identify the audio produced during the microphone muted selection as secondary audio; and a transceiver configured to transmit the primary audio and the secondary audio.
Embodiments of the apparatus can include one or more of the following features. In some embodiments, the transceiver is further configured to transmit the primary audio over a first link, and to transmit the secondary audio over a second link. In some embodiments, the first link is an audio link; and the second link is a data link. In some embodiments, the first link is a Bluetooth Synchronous Connection Oriented (SCO) link; and the second link is a Bluetooth Asynchronous Connection-Less (ACL) link. In some embodiments, the transceiver comprises: a first transceiver configured to transmit the primary audio according to a first protocol; and a second transceiver configured to transmit the secondary audio according to a second protocol. Some embodiments comprise a memory configured to store the secondary audio prior to the transceiver transmitting the secondary audio. In some embodiments, the processor is further configured to packetize the primary audio and the secondary audio, and to mark at least one of (i) packets of the primary audio and (ii) packets of the secondary audio. Some embodiments comprise a headset.
In general, in one aspect, an embodiment features a method comprising: producing audio responsive to sound; determining a selection of a mute control configured to select a microphone open selection or a microphone muted selection; identifying the audio produced during the microphone open selection as primary audio; identifying the audio produced during the microphone muted selection as secondary audio; and transmitting the primary audio and the secondary audio.
Embodiments of the method can include one or more of the following features. Some embodiments comprise transmitting the primary audio over a first link; and transmitting the secondary audio over a second link. Some embodiments comprise transmitting the primary audio according to a first protocol; and transmitting the secondary audio according to a second protocol. Some embodiments comprise packetizing the primary audio and the secondary audio; and marking at least one of (i) packets of the primary audio and (ii) packets of the secondary audio.
In general, in one aspect, an embodiment features an apparatus comprising: a receiver configured to receive audio produced by a headset, wherein the headset has a mute control configured to select a microphone open selection or a microphone muted selection, and wherein the audio includes primary audio and secondary audio, wherein the primary audio is generated by a microphone of the headset during the microphone open selection, and wherein the secondary audio is generated by the microphone of the headset during the microphone muted selection; and a switch configured to pass the primary audio to a communications channel, and to pass the secondary audio to an analytics engine.
Embodiments of the apparatus can include one or more of the following features. In some embodiments, the switch is further configured to pass the primary audio to the analytics engine. Some embodiments comprise the analytics engine. In some embodiments, the receiver is further configured to receive the primary audio over a first link, and to receive the secondary audio over a second link. In some embodiments, the first link is an audio link; and the second link is a data link. In some embodiments, the first link is a Bluetooth Synchronous Connection Oriented (SCO) link; and the second link is a Bluetooth Asynchronous Connection-Less (ACL) link. In some embodiments, the receiver comprises: a first receiver configured to receive the primary audio according to a first protocol; and a second receiver configured to receive the secondary audio according to a second protocol. In some embodiments, the audio comprises packets of the primary audio and packets of the secondary audio; at least one of (i) the packets of the primary audio and (ii) the packets of the secondary audio include marks; and the switch is further configured to distinguish (i) the packets of the primary audio from (ii) the packets of the secondary audio based on the marks.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.
The leading digit(s) of each reference numeral used in this specification indicates the number of the drawing in which the reference numeral first appears.
Embodiments of the present disclosure provide collection of muted audio for analysis and the like. In the described embodiments, sound received by a microphone while the microphone is muted (that is, the mute function is active) is collected and analyzed. Sound received by the microphone while not muted (that is, while the mute function is not active) may be analyzed as well. Audio collected while the microphone is not muted is referred to herein as “primary audio.” Audio collected while the microphone is muted is referred to herein as “secondary audio.” In the described embodiments, various techniques are employed to distinguish the primary audio from the secondary audio. In some embodiments, packets of the primary audio and/or secondary audio may be marked, for example by setting flags in headers of the packets. In other embodiments, the primary audio and secondary audio may be transmitted over different links, using different protocols, and the like. Other features are contemplated as well.
Embodiments of the present disclosure are described in terms of an agent wearing a wireless headset in a call center. However, the techniques described herein are applicable to any audio device having a microphone, and in any environment.
Referring to
The mute control 110 may select either a microphone open selection or a microphone muted selection. The mute control 110 may be user-operable, automatic, or both. A user-operable mute control 110 may be implemented as a button, slide switch, or the like. An automatic mute control 110 may automatically select the microphone open selection when donned, and may automatically select the microphone muted selection when doffed.
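By way of illustration only, a mute control that is both user-operable and automatic may be sketched as follows. The class `MuteControl`, its method names, and the initial muted state are assumptions made for this example and are not part of the described embodiments.

```python
from enum import Enum


class Selection(Enum):
    OPEN = "microphone open"
    MUTED = "microphone muted"


class MuteControl:
    """Hypothetical mute control combining a button with don/doff sensing."""

    def __init__(self) -> None:
        # Assumption: the headset starts doffed, and therefore muted.
        self.selection = Selection.MUTED

    def on_button_press(self) -> None:
        # User-operable path: each press toggles the selection.
        if self.selection is Selection.OPEN:
            self.selection = Selection.MUTED
        else:
            self.selection = Selection.OPEN

    def on_donned(self) -> None:
        # Automatic path: donning selects microphone open.
        self.selection = Selection.OPEN

    def on_doffed(self) -> None:
        # Automatic path: doffing selects microphone muted.
        self.selection = Selection.MUTED


control = MuteControl()
control.on_donned()        # agent puts the headset on
control.on_button_press()  # agent presses the mute button
print(control.selection.value)  # microphone muted
```

In this sketch the most recent event determines the selection; other arbitration rules between the button and the don/doff sensor are equally possible.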
The processor 112 may include an analog-to-digital converter, a digital signal processor, a packetizer, and the like. The wireless channel 106 may be a Bluetooth channel, a Digital Enhanced Cordless Telecommunications (DECT) channel, a Wi-Fi channel, or the like. The audio channel 120 may be any audio channel suitable for passing packets of primary audio to a remote party. The secondary audio may be routed directly to the host 104, or via another device such as a smart phone or computer.
Referring to
The processor 112 may identify the audio produced during the microphone open selection as primary audio, and may identify the audio produced during the microphone muted selection as secondary audio. In the present embodiment, at 206, the processor 112 may identify the audio by marking some or all of the packets in the audio stream. The processor 112 may mark the packets in accordance with the mute signal 128. The processor 112 may mark the packets of the digital audio when the mute signal 128 indicates the microphone muted selection, when the mute signal 128 indicates the microphone open selection, or both. The processor 112 may mark the packets, for example, by setting or clearing a flag in the header of each packet, or in the header of a packet that indicates a transition between blocks of secondary and primary audio, and the like. The processor 112 may also insert control packets to indicate transitions between blocks of secondary and primary audio, and the like. At 208, the transceiver 114 of the headset 102 may transmit a signal representing the packets over the wireless channel 106.
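The per-packet marking described above may be sketched as follows. This is a minimal illustration under stated assumptions: the one-byte header, the `SECONDARY_FLAG` bit, and the `Packet` and `mark_packets` names are hypothetical, and the disclosure does not prescribe a packet format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

SECONDARY_FLAG = 0x01  # hypothetical header bit; set means secondary audio


@dataclass
class Packet:
    header: int     # assumed one-byte header
    payload: bytes  # one frame of digital audio


def mark_packets(frames: Iterable[Tuple[bytes, bool]]) -> List[Packet]:
    """Packetize (frame, mute_active) pairs, flagging frames captured while muted."""
    packets = []
    for payload, mute_active in frames:
        # The flag follows the mute signal at the time the frame was captured.
        header = SECONDARY_FLAG if mute_active else 0
        packets.append(Packet(header=header, payload=payload))
    return packets


# Two frames captured with the microphone open, then one captured while muted.
stream = mark_packets([(b"hello", False), (b"world", False), (b"cough", True)])
print([p.header for p in stream])  # [0, 0, 1]
```

Marking every packet keeps each packet self-describing; the alternative of marking only transitions would require the receiver to track state between packets.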
At 210, the transceiver 116 of the host 104 may receive the signal representing the packets over the wireless channel 106. At 212, the switch 118 routes the packets according to the marks in the packets. In particular, the switch 118 routes the packets of primary audio to the audio channel 120, and routes the packets of secondary audio to the analytics engine 122 for analysis. In some embodiments, the switch 118 may also route some or all of the packets of primary audio to the analytics engine 122 for analysis.
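The routing performed by the switch may be sketched as follows. The `(header, payload)` tuple layout, the `SECONDARY_FLAG` bit, and the `route` function are assumptions made for this example; lists stand in for the audio channel and the analytics engine.

```python
from typing import Iterable, List, Tuple

SECONDARY_FLAG = 0x01  # hypothetical header bit; set means secondary audio

Packet = Tuple[int, bytes]  # (header, payload), an assumed layout


def route(
    packets: Iterable[Packet], analyze_primary: bool = False
) -> Tuple[List[bytes], List[bytes]]:
    """Return (payloads for the audio channel, payloads for the analytics engine)."""
    to_audio_channel: List[bytes] = []
    to_analytics: List[bytes] = []
    for header, payload in packets:
        if header & SECONDARY_FLAG:
            # Secondary audio is never passed to the remote party.
            to_analytics.append(payload)
        else:
            to_audio_channel.append(payload)
            if analyze_primary:
                # Optionally copy primary audio to the analytics engine as well.
                to_analytics.append(payload)
    return to_audio_channel, to_analytics


audio, analytics = route([(0, b"hello"), (1, b"cough"), (0, b"bye")])
print(audio)      # [b'hello', b'bye']
print(analytics)  # [b'cough']
```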
Referring to
The mute control 310 may select either a microphone open selection or a microphone muted selection. The mute control 310 may be user-operable, automatic, or both. A user-operable mute control 310 may be implemented as a button, slide switch, or the like. An automatic mute control 310 may automatically select the microphone open selection when donned, and may automatically select the microphone muted selection when doffed.
The processor 312 may include an analog-to-digital converter, a digital signal processor, a packetizer, and the like. The wireless channel 306 may be a Bluetooth channel, a Digital Enhanced Cordless Telecommunications (DECT) channel, a Wi-Fi channel, or the like. The audio channel 320 may be any audio channel suitable for passing packets of primary audio to a remote party. The secondary audio may be routed directly to the host 304, or via another device such as a smart phone or computer.
Referring to
The processor 312 may identify the audio produced during the microphone open selection as primary audio, and may identify the audio produced during the microphone muted selection as secondary audio. In the present embodiment, the processor 312 may identify the audio by routing the primary audio to one link, and routing the secondary audio to another link. At 406, the processor 312 may route the packets of digital audio among multiple communication links in accordance with the mute signal 328. For example, the processor 312 may route the packets of primary audio to an audio link, and may route the packets of secondary audio to a data link. The audio link may be a Bluetooth Synchronous Connection Oriented (SCO) link. The data link may be a Bluetooth Asynchronous Connection-Less (ACL) link. However, other wireless protocols and links may be used.
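Link selection in accordance with the mute signal may be sketched as follows. The `Link` enumeration and the function names are hypothetical, and the per-link lists merely stand in for real SCO and ACL connections; no Bluetooth stack is invoked.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Link(Enum):
    AUDIO = "audio link, e.g. Bluetooth SCO"
    DATA = "data link, e.g. Bluetooth ACL"


def select_link(mute_active: bool) -> Link:
    """Pick the link for a packet from the state of the mute signal."""
    return Link.DATA if mute_active else Link.AUDIO


def route_by_link(frames: Iterable[Tuple[bytes, bool]]) -> Dict[Link, List[bytes]]:
    """Queue each (frame, mute_active) pair on the link chosen for it."""
    queues: Dict[Link, List[bytes]] = {Link.AUDIO: [], Link.DATA: []}
    for payload, mute_active in frames:
        queues[select_link(mute_active)].append(payload)
    return queues


queues = route_by_link([(b"hello", False), (b"aside", True)])
print(queues[Link.DATA])  # [b'aside']
```

Because the link itself identifies the audio, the packets in this sketch carry no marks.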
At 408, the memory 324 may store the packets of the secondary audio before transmission to the host 304. In such embodiments, the data link need not be open continuously. At 410, the transceiver 314 of the headset 302 transmits one or more signals representing the packets over the wireless channel 306.
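Storing secondary audio so that the data link need not be open continuously may be sketched as a simple store-and-forward buffer. The `SecondaryAudioBuffer` class, its fixed capacity, and the policy of discarding the oldest packet when full are assumptions made for this example.

```python
from collections import deque
from typing import Callable, Deque


class SecondaryAudioBuffer:
    """Hypothetical in-memory store for secondary audio awaiting transmission."""

    def __init__(self, capacity: int) -> None:
        # Assumption: once full, the oldest packet is discarded.
        self._packets: Deque[bytes] = deque(maxlen=capacity)

    def store(self, packet: bytes) -> None:
        self._packets.append(packet)

    def flush(self, send: Callable[[bytes], None]) -> int:
        """Send buffered packets in order through `send`; return how many were sent."""
        count = 0
        while self._packets:
            send(self._packets.popleft())
            count += 1
        return count


memory = SecondaryAudioBuffer(capacity=2)
for packet in (b"p1", b"p2", b"p3"):
    memory.store(packet)

sent = []  # stands in for the data link once it is opened
print(memory.flush(sent.append), sent)  # 2 [b'p2', b'p3']
```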
At 412, the transceiver 316 of the host 304 may receive the signal representing the packets over the wireless channel 306. At 414, the transceiver 316 may pass the packets according to the communication links. In particular, the transceiver 316 may route the packets of primary audio to the audio channel 320, and may route the packets of secondary audio to the analytics engine 322 for analysis. In some embodiments, the transceiver 316 may also route some or all of the packets of primary audio to the analytics engine 322 for analysis.
Referring to
The mute control 510 may select either a microphone open selection or a microphone muted selection. The mute control 510 may be user-operable, automatic, or both. A user-operable mute control 510 may be implemented as a button, slide switch, or the like. An automatic mute control 510 may automatically select the microphone open selection when donned, and may automatically select the microphone muted selection when doffed.
The processor 512 may include an analog-to-digital converter, a digital signal processor, a packetizer, and the like. The wireless channels 506 and 546 may employ different wireless protocols, such as Bluetooth and Wi-Fi, respectively. However, any protocol may be used, such as Digital Enhanced Cordless Telecommunications (DECT), or the like. The audio channel 520 may be any audio channel suitable for passing the packets of primary audio to a remote party. The secondary audio may be routed directly to the host 504, or via another device such as a smart phone or computer.
Referring to
The processor 512 may identify the audio produced during the microphone open selection as primary audio, and may identify the audio produced during the microphone muted selection as secondary audio. In the present embodiment, the processor 512 may identify the audio by routing the primary audio to one transceiver, and routing the secondary audio to another transceiver. At 606, the processor 512 may route the packets of digital audio among multiple transceivers 514, 534 in accordance with the mute signal 528. For example, the processor 512 may route the packets of primary audio to one transceiver 514, and may route the packets of secondary audio to another transceiver 534.
At 608, the memory 524 may store the packets of the secondary audio before transmission to the host 504. In such embodiments, the data link need not be open continuously. At 610, the transceivers 514, 534 of the headset 502 transmit signals representing the packets over the respective wireless channels 506, 546.
At 612, the transceivers 516, 536 of the host 504 may receive the signals representing the packets over the respective wireless channels 506, 546. At 614, the transceiver 516 may pass the packets of primary audio to the audio channel 520, and the transceiver 536 may pass the packets of secondary audio to the analytics engine 522 for analysis. In some embodiments, the transceiver 516 may also route some or all of the packets of primary audio to the analytics engine 522 for analysis.
The analytics engines 122, 322, 522 described above may perform any sort of analysis on the secondary audio. The analytics engines 122, 322, 522 may identify coughs and sneezes in the secondary audio, keeping metrics as a potential indicator of illness of individual agents and groups of agents. The analytics engines 122, 322, 522 may detect questions, for example based on intonation, voice recognition, and the like, keeping metrics as a possible indicator of need for training of individual agents or groups of agents. The analytics engines 122, 322, 522 may monitor an agent's speech with mute on or off and make decisions on content, keeping metrics as indicators of time spent on work communications and personal communications. In all cases, a supervisor may be alerted when a metric threshold is exceeded, making it unnecessary for a supervisor to personally monitor calls or observe agents.
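The metric keeping and threshold alerting described above may be sketched as follows. The `MetricTracker` class, the event label `"cough"`, the agent identifier, and the threshold value are all hypothetical; detection of the events themselves in the audio is assumed to occur elsewhere and is not shown.

```python
from collections import Counter
from typing import Dict


class MetricTracker:
    """Hypothetical per-agent event counter with supervisor alert thresholds."""

    def __init__(self, thresholds: Dict[str, int]) -> None:
        self.thresholds = thresholds      # event label -> allowed count
        self.counts: Counter = Counter()  # keyed by (agent, event)

    def record(self, agent: str, event: str) -> bool:
        """Count one detected event; return True if the threshold is exceeded."""
        self.counts[(agent, event)] += 1
        limit = self.thresholds.get(event)
        # Events without a configured threshold never raise an alert.
        return limit is not None and self.counts[(agent, event)] > limit


tracker = MetricTracker({"cough": 2})
alerts = [tracker.record("agent-7", "cough") for _ in range(3)]
print(alerts)  # [False, False, True]
```

A deployed system would likely reset or window the counts over time; this sketch keeps cumulative totals for brevity.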
Various embodiments of the present disclosure can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations thereof. Embodiments of the present disclosure can be implemented in a computer program product tangibly embodied in a computer-readable storage device for execution by a programmable processor. The described processes can be performed by a programmable processor executing a program of instructions to perform functions by operating on input data and generating output. Embodiments of the present disclosure can be implemented in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, processors receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer includes one or more mass storage devices for storing data files. Such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; optical disks; and solid-state disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks.
Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits). As used herein, the term “module” may refer to any of the above implementations.
A number of implementations have been described. Nevertheless, various modifications may be made without departing from the scope of the disclosure. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country |
---|---|---|
20160073185 A1 | Mar 2016 | US |