Directional microphones are well known in the art. Such microphones are capable of converting sound received by the microphone into electrical signals, while being more sensitive to sounds from the direction in which they are pointed than from other directions. This allows the directional microphone to be used to pick up sound primarily from specific sources or locations, to the exclusion of sounds from other directions, depending on the direction the microphone is pointed. Examples of such microphones include a shotgun microphone or a parabolic microphone.
In other types of directional microphone systems, arrays of microphones are arranged, and a directionally sensitive effect can be achieved by processing the outputs of the collective array to arrive at a signal representative of the sound received from the desired direction. In some applications, such microphones may be used to ‘map’ the sound from a general direction, and isolate which specific direction certain sounds are coming from.
However, for each of these types of directional microphones, thoughtful and deliberate input or handling is required by an operator to determine which direction to point or direct the microphone. Embodiments of the invention provide solutions to these and other problems.
In one embodiment, a system for converting sound to electrical signals is provided. The system may include a gaze tracking device and a microphone. The gaze tracking device may determine a gaze direction of a user. The microphone may be more sensitive in a selected direction than at least one other direction and alter the selected direction based at least in part on the gaze direction determined by the gaze tracking device.
In another embodiment, a method for converting sound to electrical signals is provided. The method may include determining a gaze direction of a user. The method may also include altering a direction in which a microphone is directed based at least in part on the gaze direction.
In another embodiment, a non-transitory machine readable medium having instructions stored thereon is provided. The instructions are executable by a processor to at least receive, from a gaze tracking device, data representing a gaze direction of a user. The instructions may also be executable to cause a direction in which a microphone is most sensitive to be altered based at least in part on the data representing the gaze direction.
The present invention is described in conjunction with the appended figures:
In the appended figures, similar components and/or features may have the same numerical reference label. Further, various components of the same type may be distinguished by following the reference label by a letter that distinguishes among the similar components and/or features. If only the first numerical reference label is used in the specification, the description is applicable to any one of the similar components and/or features having the same first numerical reference label irrespective of the letter suffix.
The ensuing description provides exemplary embodiments only, and is not intended to limit the scope, applicability or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing one or more exemplary embodiments. It is understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the invention as set forth in the appended claims. Any detail present in one discussed embodiment may or may not be present in other versions of that embodiment or other embodiments discussed herein.
Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other elements in the invention may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.
Also, it is noted that individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed, but could have additional steps not discussed or included in a figure. Furthermore, not all operations in any particularly described process may occur in all embodiments. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.
The term “machine-readable medium” includes, but is not limited to, portable or fixed storage devices, optical storage devices, wireless channels and various other mediums capable of storing, containing or carrying instruction(s) and/or data. A code segment or machine-executable instructions may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.
Furthermore, embodiments of the invention may be implemented, at least in part, either manually or automatically. Manual or automatic implementations may be executed, or at least assisted, through the use of machines, hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine readable medium. A processor(s) may perform the necessary tasks.
In one embodiment, a system for converting sound to electrical signals is provided. The system may include a gaze tracking device and a microphone. The gaze tracking device may determine a gaze direction of a user. The microphone may be more sensitive in a selected direction than at least one other direction and alter the selected direction based at least in part on the gaze direction determined by the gaze tracking device.
The gaze tracking device may be any device which is able to detect the direction of the gaze of a user's eyes. Gaze tracking devices and methods, sometimes referred to as gaze detection systems and methods, include, for example, products produced by and available from Tobii Technology AB, which operate by using infrared illumination and an image sensor to detect reflection from the eye of a user. An example of such a gaze detection system is described in U.S. Pat. No. 7,572,008, which is hereby incorporated by reference, for all purposes, as if fully set forth herein. Other alternative gaze detection systems may also be employed by the invention, regardless of the technology behind the gaze detection system. The gaze tracking device may employ its own processor or the processor of another device to interpret and process data received. When a gaze tracking device is referred to herein, both possible methods of processing data are referred to.
The microphone may be any device which converts sounds into an electrical signal. In some embodiments, the microphone may include a plurality of microphones which are oriented in different directions. In some embodiments, one or more of the microphones may be a unidirectional microphone which is more sensitive in one direction than other directions.
In some embodiments, the system may also include an output device. Merely by way of example, the output device may be a speaker, a cochlear implant, or a recording device. In these or other embodiments, the gaze tracking device, the microphone, and/or the output device may be coupled with a frame configured to be worn on a user's head. In some embodiments, this frame may be an eyeglass frame worn by the user, either with or without corrective, assistive, and/or shaded lenses.
The manner in which the gaze direction is determined, and how the microphone is configured to respond, may vary in different embodiments. In some embodiments, the gaze tracking device may determine a point in space representative of the gaze direction, and the selected direction of the microphone may be altered such that the selected direction intersects the point in space. In other embodiments, the direction of the microphone may be altered to be parallel with the gaze direction.
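Merely by way of example, the two alternatives described above (directing the microphone so that the selected direction intersects a gaze point, or so that it is parallel with the gaze direction) could be sketched in Python as follows. The function names, the use of three-dimensional Cartesian coordinates, and the shared coordinate frame for the microphone and the gaze tracking device are assumptions made for illustration only and are not taken from any particular embodiment.

```python
import math


def direction_to_point(mic_position, gaze_point):
    """Unit vector from the microphone position toward the gaze point.

    Hypothetical helper: both arguments are (x, y, z) tuples expressed
    in the same coordinate frame (an assumption of this sketch).
    """
    delta = [g - m for g, m in zip(gaze_point, mic_position)]
    norm = math.sqrt(sum(c * c for c in delta))
    if norm == 0.0:
        raise ValueError("gaze point coincides with the microphone position")
    return tuple(c / norm for c in delta)


def parallel_direction(gaze_direction):
    """Unit vector parallel with the gaze direction."""
    norm = math.sqrt(sum(c * c for c in gaze_direction))
    if norm == 0.0:
        raise ValueError("gaze direction has zero length")
    return tuple(c / norm for c in gaze_direction)
```

The first function may be more suitable where the microphone is offset from the user's eyes and the gaze point is relatively near, while the second function ignores that offset, which may be acceptable for distant sound sources.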
While in some embodiments altering the direction of the microphone may include physically reorienting the microphone or microphones, in other embodiments, where a plurality of microphones are employed, unidirectional or otherwise, the outputs of the plurality of microphones may be processed to generate a direction-sensitive output. In one embodiment, where at least two microphones are utilized, processing the outputs may include adding delay to a signal received from a first unidirectional microphone to create a delayed signal, and then summing the delayed signal with a signal received from a second unidirectional microphone. This may result in an output which is more sensitive to sounds received in the direction of the first microphone. A processor in communication with the gaze tracking device and the microphone, which has access to information indicative of the physical location and/or orientation of the microphones, may perform the necessary processing after receiving information from the gaze tracking device.
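Merely by way of example, the delay-and-sum processing described above could be sketched in Python as follows. The function name, the representation of each signal as a list of samples, and the expression of the delay as a whole number of samples are assumptions made for illustration only.

```python
def delay_and_sum(first_signal, second_signal, delay_samples):
    """Delay the first signal, then sum it with the second signal.

    Hypothetical helper: signals are lists of samples taken at the same
    sample rate, and delay_samples is a non-negative integer.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    # Prepend silence to create the delayed signal.
    delayed = [0.0] * delay_samples + list(first_signal)
    # Sum sample by sample over the overlapping portion.
    length = min(len(delayed), len(second_signal))
    return [delayed[i] + second_signal[i] for i in range(length)]


# A pulse reaches the first microphone at sample 2 and the second at sample 5.
first = [0, 0, 1, 0, 0, 0, 0, 0]
second = [0, 0, 0, 0, 0, 1, 0, 0]
aligned = delay_and_sum(first, second, 3)    # pulses coincide and reinforce
unaligned = delay_and_sum(first, second, 0)  # pulses remain separate
```

In this sketch, a delay matching the arrival-time difference between the two microphones causes the pulses to add coherently, whereas sounds arriving with a different time difference do not reinforce one another to the same degree.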
In some embodiments, additional features may be found on the system which allow a user to control the characteristics of the system. For example, a user input device, perhaps a switch or other control, could be provided to allow the user to selectively enable and disable altering of the selected direction. In another example, a user input device could be provided to allow the user to adjust the sensitivity of the microphone in the selected direction compared to other directions. On/off switches for either directional or ambient sound collection/reproduction could also be provided.
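Merely by way of example, the user controls described above could be modeled in Python as follows. The class name, field names, and the simple linear mixing of a directional signal with an ambient signal are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class UserControls:
    """Hypothetical container for the user-adjustable settings."""
    steering_enabled: bool = True   # enable or disable altering of the selected direction
    directional_on: bool = True     # on/off switch for directional sound
    ambient_on: bool = True         # on/off switch for ambient sound
    directional_gain: float = 1.0   # sensitivity in the selected direction relative to ambient


def select_direction(controls, current_direction, gaze_direction):
    """Follow the gaze only while steering is enabled."""
    return gaze_direction if controls.steering_enabled else current_direction


def mix(controls, directional, ambient):
    """Combine directional and ambient samples according to the controls."""
    d_gain = controls.directional_gain if controls.directional_on else 0.0
    a_gain = 1.0 if controls.ambient_on else 0.0
    return [d_gain * d + a_gain * a for d, a in zip(directional, ambient)]
```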
In another embodiment, a system is provided whereby a user may direct their speech, or other sounds made, to a specific person based on where they are gazing. In one embodiment, the system may include a gaze tracking device and at least one microphone. One or both of these components, as well as other components of the embodiment may be disposed on or within a frame, possibly worn by the user. In one example, the frame could be a set of eyeglass frames worn by the user. The gaze tracking device may be any device which is able to detect the direction of the gaze of the user's eyes. The microphone may be configured to receive sound or speech from the user.
In operation, the user's speech is picked up by the microphone while the user's gaze direction is monitored. A processor can determine if a listening device, for example a wireless headset or earphone (e.g., a Bluetooth™ headset), is being used by an individual in the direction of the user's gaze (or in some range around such direction). If so, the processor can direct transmission of the sound and/or speech from the user to the listening device.
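Merely by way of example, determining whether a listening device lies in the direction of the user's gaze, or in some range around that direction, could be sketched in Python as follows. The function names, the availability of a position for the listening device, and the default tolerance of 10 degrees are assumptions made for illustration only; how the position of a listening device is obtained is not addressed by this sketch.

```python
import math


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def device_in_gaze(user_position, gaze_direction, device_position,
                   tolerance_deg=10.0):
    """True if the device lies within tolerance_deg of the gaze direction."""
    to_device = tuple(d - u for d, u in zip(device_position, user_position))
    return angle_between_deg(gaze_direction, to_device) <= tolerance_deg
```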
Determining whether a listening device is within the direction of the user's gaze may occur prior to or during the user's emitting the sound or speech. For example, the processor may continually monitor the user's gaze direction and determine if a listening device is within the user's gaze direction, and thus be ready to transmit to such devices upon the user emitting sound or speech. Wireless communication directionally limited to the direction of the user's gaze may be used to initiate and continue communication between the system and the listening device. In other embodiments, the processor may only attempt to identify a listening device within the gaze direction of the user after sound or speech is emitted by the user. Depending on the embodiment, sounds or speech of the user may continue to be transmitted to the identified listening device for some predefined time after the user's gaze has changed directions, or may cease immediately after the user is no longer gazing in the direction of the identified listening device.
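Merely by way of example, the two behaviors described above (continuing transmission for a predefined time after the gaze changes, or ceasing immediately) could be sketched in Python as follows. The class name, the method name, and the use of caller-supplied timestamps in seconds are assumptions made for illustration only.

```python
class GazeGatedLink:
    """Hypothetical gate deciding whether to keep transmitting to a device."""

    def __init__(self, hold_seconds=0.0):
        # hold_seconds == 0.0 models ceasing immediately when the gaze leaves.
        self.hold_seconds = hold_seconds
        self._last_seen = None  # time the device was last within the gaze

    def update(self, device_in_gaze, now):
        """Return True if transmission should be active at time `now`."""
        if device_in_gaze:
            self._last_seen = now
        if self._last_seen is None:
            return False
        return (now - self._last_seen) <= self.hold_seconds
```

A short hold time may avoid interrupting transmission when the user briefly glances away, at the cost of transmitting for a moment after the user has deliberately looked elsewhere.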
Additionally, the listening devices which are eligible to receive transmissions from the system may be known ahead of time. While any number of potential listening devices may be within the gaze direction of the user, only those in a predefined list of eligible listening devices may be allowed to receive the communication. This may be achieved either by a wireless handshake procedure between the system and the target listening device being necessary at the initiation of communications, or via encryption of the communications such that only eligible listening devices are able to decrypt such communications.
In this manner, a user may communicate with a specific person they are gazing at, and only or primarily that person, regardless of their proximity to the person. This may be helpful where the person is some distance from the user, where the user's vocal capabilities are impaired (e.g., low speech volume), or where the user does not wish others to hear the sound or speech they produce (or at least not hear it as loudly).
Turning now to
In this embodiment, gaze tracking device 110 includes eye tracking devices 110a,b, and dedicated processor 110c. Dedicated processor 110c analyzes data received from eye tracking devices 110a,b, and determines the user's gaze direction therefrom. In other embodiments, processor 130 may be used in lieu of dedicated processor 110c.
Microphones 120 may be located in multiple locations of eyeglass frames 100. In this embodiment, microphone 120a is directed to the left side of the user, and microphone 120b is directed to the right side of the user. While in some embodiments microphones 120 may be electro-mechanical, and capable of being physically reoriented in response to a change in gaze direction as discussed herein, for the sake of further discussion, it will be assumed that microphones 120 are stationary, and that processor 130 processes the outputs thereof to deliver a directional output.
An output device 140 is shown in block form, and represents any number of potential output devices for the system. Merely by way of example, output device 140 may include a speaker, for example one found in an earphone or headset; a cochlear implant of the user; or a recording device such as an MP3 device, mobile phone, tablet, computer, etc. Output provided by processor 130, or directly from microphones 120, could be provided to output device 140 during use of the system. While output device 140 is shown here in physical communication with processor 130, in other embodiments, output device 140 may be in wireless communication with processor 130 or other elements of the system.
During use, a user employs camera 205 to observe a scene, and the scene is reproduced on display screen 250. In some embodiments, the scene may be recorded and stored on mobile device 201. While the user views display screen 250, gaze tracking device 210 determines where the user's gaze point is located on display screen 250. In some embodiments, a marker 260, shown here as a dotted-‘X’, may be displayed on the screen to reflect the determined gaze point. Mobile device 201 may then process signals received by microphones 220, as discussed herein, to create a directionally sensitive signal representative of sounds coming from the direction corresponding to the determined user's gaze.
Thus, in the example shown, the user's gaze is located on the speaker to the right in the scene viewed from the camera. Gaze tracking device 210 would determine this, and processor 230 would cause display screen 250 to reproduce marker 260 at the location of the user's gaze. Microphones 220 would receive sound from the scene, and processor 230 would then process signals from microphones 220 to create a signal which is more directionally sensitive toward the right speaker (more specifically, toward the location in the scene which corresponds to the user's gaze point on display screen 250). This directionally sensitive sound can be reproduced immediately by a speaker or other output device on mobile device 201, or stored, perhaps with the accompanying video, for later use.
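Merely by way of example, converting a gaze point on the display screen into a direction in the scene viewed by the camera could be sketched in Python as follows. The function name, the pinhole camera model, the assumption that the display screen shows the full camera frame, and the assumption that the horizontal field of view of the camera is known are all made for illustration only.

```python
import math


def gaze_pixel_to_angles(x, y, width, height, horizontal_fov_deg):
    """Return (azimuth, elevation) in degrees for a gaze point in pixels.

    Hypothetical helper: (x, y) is measured from the top-left corner of
    the displayed frame; positive azimuth is to the right of the camera
    axis and positive elevation is above it.
    """
    # Focal length in pixels implied by the horizontal field of view.
    focal = (width / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    dx = x - width / 2.0
    dy = height / 2.0 - y  # screen y grows downward
    azimuth = math.degrees(math.atan2(dx, focal))
    elevation = math.degrees(math.atan2(dy, math.hypot(dx, focal)))
    return azimuth, elevation
```

The resulting angles could then be supplied to whatever processing is used to make the microphone output directionally sensitive toward the corresponding location in the scene.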
At block 320, the direction of microphones 120, 220 is altered in response to changes in the gaze direction of the user. Depending on the embodiment, microphones 120, 220 may be physically reoriented at block 322, or outputs from multiple stationary microphones 120, 220 may be processed to deliver directional sensitivity at block 324. When multiple stationary microphones 120, 220 are used, at block 326, delay may be added to a signal received from a microphone 120, 220 directed relatively more toward the gaze direction than another microphone 120, 220. At block 328, the delayed signal may be summed with a signal received from a microphone 120, 220 directed relatively less toward the gaze direction to create a directionally sensitive signal. Those of skill in the art will recognize various other algorithms which may be used to generate a directionally sensitive output from multiple microphones 120, 220 when the direction from which sound is desired (i.e., the gaze direction) is known. At block 330, the new signal is outputted.
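Merely by way of example, blocks 326 and 328 could be sketched in Python for two stationary microphones as follows. The function names, the far-field (plane wave) approximation, the placement of the microphones on a left-right axis, the measurement of azimuth from straight ahead with positive values to the right, and the nominal speed of sound of 343 m/s are assumptions made for illustration only.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at room temperature


def steering_delay_samples(azimuth_deg, mic_spacing_m, sample_rate_hz):
    """Whole-sample delay that aligns a plane wave arriving from azimuth_deg."""
    seconds = (mic_spacing_m * math.sin(math.radians(abs(azimuth_deg)))
               / SPEED_OF_SOUND_M_S)
    return round(seconds * sample_rate_hz)


def steer(left, right, azimuth_deg, mic_spacing_m, sample_rate_hz):
    """Delay the microphone nearer the gaze direction, then sum (blocks 326, 328)."""
    delay = steering_delay_samples(azimuth_deg, mic_spacing_m, sample_rate_hz)
    # The microphone directed more toward the gaze direction hears the sound first.
    toward, away = (right, left) if azimuth_deg >= 0 else (left, right)
    delayed = [0.0] * delay + list(toward)
    length = min(len(delayed), len(away))
    return [delayed[i] + away[i] for i in range(length)]
```

Rounding to a whole number of samples limits the angular resolution of this sketch; finer steering would require fractional-delay filtering, which is outside the scope of the example.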
The computer system 500 is shown comprising hardware elements that may be electrically coupled via a bus 590. The hardware elements may include one or more central processing units 510, one or more input devices 520 (e.g., a mouse, a keyboard, etc.), and one or more output devices 530 (e.g., a display device, a printer, etc.). The computer system 500 may also include one or more storage devices 540. By way of example, storage device(s) 540 may be disk drives, optical storage devices, or solid-state storage devices such as a random access memory (“RAM”) and/or a read-only memory (“ROM”), which can be programmable, flash-updateable and/or the like.
The computer system 500 may additionally include a computer-readable storage media reader 550, a communications system 560 (e.g., a modem, a network card (wireless or wired), an infra-red communication device, Bluetooth™ device, cellular communication device, etc.), and working memory 580, which may include RAM and ROM devices as described above. In some embodiments, the computer system 500 may also include a processing acceleration unit 570, which can include a digital signal processor, a special-purpose processor and/or the like.
The computer-readable storage media reader 550 can further be connected to a computer-readable storage medium, together (and, optionally, in combination with storage device(s) 540) comprehensively representing remote, local, fixed, and/or removable storage devices plus storage media for temporarily and/or more permanently containing computer-readable information. The communications system 560 may permit data to be exchanged with a network, system, computer and/or other component described above.
The computer system 500 may also comprise software elements, shown as being currently located within a working memory 580, including an operating system 584 and/or other code 588. It should be appreciated that alternate embodiments of a computer system 500 may have numerous variations from that described above. For example, customized hardware might also be used and/or particular elements might be implemented in hardware, software (including portable software, such as applets), or both. Furthermore, connection to other computing devices such as network input/output and data acquisition devices may also occur.
Software of computer system 500 may include code 588 for implementing any or all of the functions of the various elements of the architecture as described herein. For example, software, stored on and/or executed by a computer system such as system 500, can provide the functions of gaze tracking device 110, microphones 120, processor 130, output device 140, and/or other components of the invention such as those discussed above. Methods implementable by software on some of these components have been discussed above in more detail.
The invention has now been described in detail for the purposes of clarity and understanding. However, it will be appreciated that certain changes and modifications may be practiced within the scope of the appended claims.
This application is a continuation of U.S. Non-Provisional patent application Ser. No. 15/453,661, filed Mar. 8, 2017, which is a continuation of U.S. Non-Provisional patent application Ser. No. 14/452,178, filed on Aug. 5, 2014, which claims the benefit of U.S. Provisional Patent Application 61/873,154, filed on Sep. 3, 2013, the entire contents of which are hereby incorporated by reference, for all purposes, as if fully set forth herein.
Number | Name | Date | Kind |
---|---|---|---|
3676641 | Olson | Jul 1972 | A |
5070883 | Kasahara | Dec 1991 | A |
6110271 | Skaggs et al. | Aug 2000 | A |
6204974 | Spitzer | Mar 2001 | B1 |
6320610 | Van Sant et al. | Nov 2001 | B1 |
6353422 | Perlman | Mar 2002 | B1 |
6577329 | Flickner et al. | Jun 2003 | B1 |
6592223 | Stern et al. | Jul 2003 | B1 |
7306337 | Ji et al. | Dec 2007 | B2 |
7380938 | Chmielewski, Jr. et al. | Jun 2008 | B2 |
7549743 | Huxlin et al. | Jun 2009 | B2 |
7561143 | Milekic | Jul 2009 | B1 |
7572008 | Elvesjo et al. | Aug 2009 | B2 |
7806525 | Howell et al. | Oct 2010 | B2 |
8066375 | Skogö et al. | Nov 2011 | B2 |
8235529 | Raffle et al. | Aug 2012 | B1 |
8292433 | Vertegaal | Oct 2012 | B2 |
8390533 | Yamamoto | Mar 2013 | B2 |
8472120 | Border et al. | Jun 2013 | B2 |
8487838 | Lewis et al. | Jul 2013 | B2 |
8488246 | Border et al. | Jul 2013 | B2 |
8500271 | Howell et al. | Aug 2013 | B2 |
8964298 | Haddick | Feb 2015 | B2 |
9041787 | Andersson et al. | May 2015 | B2 |
9196239 | Taylor | Nov 2015 | B1 |
9474131 | Ahn et al. | Oct 2016 | B2 |
9596391 | Henderek et al. | Mar 2017 | B2 |
9665172 | Engwall et al. | May 2017 | B2 |
9697649 | Shepard | Jul 2017 | B1 |
9710058 | Gustafsson et al. | Jul 2017 | B2 |
10116846 | Henderek | Oct 2018 | B2 |
20020105482 | Lemelson et al. | Aug 2002 | A1 |
20030118217 | Kondo et al. | Jun 2003 | A1 |
20040100567 | Miller | May 2004 | A1 |
20050175218 | Vertegaal et al. | Aug 2005 | A1 |
20060044461 | Popescu-Stanesti et al. | Mar 2006 | A1 |
20070030442 | Howell et al. | Feb 2007 | A1 |
20070121066 | Nashner | May 2007 | A1 |
20080190609 | Robb et al. | Aug 2008 | A1 |
20080255271 | Raymond | Oct 2008 | A1 |
20080278682 | Huxlin et al. | Nov 2008 | A1 |
20090044482 | Tooman et al. | Feb 2009 | A1 |
20090086165 | Beymer | Apr 2009 | A1 |
20090122161 | Bolkhovitinov | May 2009 | A1 |
20090175555 | Mahowald | Jul 2009 | A1 |
20090189974 | Deering | Jul 2009 | A1 |
20100002071 | Ahiska | Jan 2010 | A1 |
20100007601 | Lashina et al. | Jan 2010 | A1 |
20100045571 | Yamamoto | Feb 2010 | A1 |
20100066975 | Rehnstrom | Mar 2010 | A1 |
20100074460 | Marzetta | Mar 2010 | A1 |
20100094501 | Kwok | Apr 2010 | A1 |
20100110368 | Chaum | May 2010 | A1 |
20100198104 | Schubert et al. | Aug 2010 | A1 |
20100271587 | Pavlopoulos | Oct 2010 | A1 |
20100328444 | Blixt et al. | Dec 2010 | A1 |
20110007277 | Solomon | Jan 2011 | A1 |
20110037606 | Boise | Feb 2011 | A1 |
20110069277 | Blixt et al. | Mar 2011 | A1 |
20110096941 | Marzetta | Apr 2011 | A1 |
20110140994 | Noma | Jun 2011 | A1 |
20110211056 | Publicover et al. | Sep 2011 | A1 |
20110221656 | Haddick | Sep 2011 | A1 |
20110279666 | Strömbom et al. | Nov 2011 | A1 |
20120035934 | Cunningham | Feb 2012 | A1 |
20120075168 | Osterhout | Mar 2012 | A1 |
20120113209 | Ritchey | May 2012 | A1 |
20120163606 | Eronen | Jun 2012 | A1 |
20120194419 | Osterhout et al. | Aug 2012 | A1 |
20120230547 | Durnell et al. | Sep 2012 | A1 |
20120235883 | Border et al. | Sep 2012 | A1 |
20120235900 | Border et al. | Sep 2012 | A1 |
20120288139 | Singhar | Nov 2012 | A1 |
20130021373 | Vaught et al. | Jan 2013 | A1 |
20130028443 | Pance | Jan 2013 | A1 |
20130044042 | Olsson et al. | Feb 2013 | A1 |
20130050070 | Lewis et al. | Feb 2013 | A1 |
20130050642 | Lewis et al. | Feb 2013 | A1 |
20130083003 | Perez et al. | Apr 2013 | A1 |
20130083009 | Geisner et al. | Apr 2013 | A1 |
20130114043 | Balan et al. | May 2013 | A1 |
20130114850 | Publicover et al. | May 2013 | A1 |
20130121515 | Hooley | May 2013 | A1 |
20130127980 | Haddick et al. | May 2013 | A1 |
20130163089 | Bohn | Jul 2013 | A1 |
20130169683 | Perez et al. | Jul 2013 | A1 |
20130201080 | Evans et al. | Aug 2013 | A1 |
20130257709 | Raffle et al. | Oct 2013 | A1 |
20130286178 | Lewis et al. | Oct 2013 | A1 |
20130300648 | Kim | Nov 2013 | A1 |
20130314303 | Osterhout et al. | Nov 2013 | A1 |
20130325463 | Greenspan et al. | Dec 2013 | A1 |
20130326364 | Latta et al. | Dec 2013 | A1 |
20140002442 | Lamb et al. | Jan 2014 | A1 |
20140002718 | Spielberg | Jan 2014 | A1 |
20140125585 | Song | May 2014 | A1 |
20140129207 | Bailey | May 2014 | A1 |
20140154651 | Stack | Jun 2014 | A1 |
20140160001 | Kinnebrew et al. | Jun 2014 | A1 |
20140191927 | Cho | Jul 2014 | A1 |
20140267420 | Schowengerdt | Sep 2014 | A1 |
20150006278 | Di Censo | Jan 2015 | A1 |
20150055808 | Vennstrom | Feb 2015 | A1 |
20150058812 | Lindh | Feb 2015 | A1 |
20150061995 | Gustafsson et al. | Mar 2015 | A1 |
20150061996 | Gustafsson et al. | Mar 2015 | A1 |
20150062322 | Gustafsson et al. | Mar 2015 | A1 |
20150062323 | Gustafsson et al. | Mar 2015 | A1 |
20150063603 | Henderek et al. | Mar 2015 | A1 |
20150067516 | Park | Mar 2015 | A1 |
20150309315 | Schowengerdt | Oct 2015 | A1 |
20150319342 | Schowengerdt | Nov 2015 | A1 |
20150331485 | Wilairat et al. | Nov 2015 | A1 |
20150338915 | Publicover | Nov 2015 | A1 |
20160026847 | Vugdelija | Jan 2016 | A1 |
20160142830 | Hu | May 2016 | A1 |
20160170603 | Bastien | Jun 2016 | A1 |
20160178904 | Deleeuw et al. | Jun 2016 | A1 |
20160328016 | Andersson et al. | Nov 2016 | A1 |
20160373645 | Lin et al. | Dec 2016 | A1 |
20170017299 | Biedert et al. | Jan 2017 | A1 |
20170364198 | Yoganandan | Dec 2017 | A1 |
20180020137 | Engwall et al. | Jan 2018 | A1 |
20180232050 | Ofek | Aug 2018 | A1 |
Number | Date | Country |
---|---|---|
101796450 | Aug 2010 | CN |
103091843 | May 2013 | CN |
105682539 | Jun 2016 | CN |
105960193 | Sep 2016 | CN |
2164295 | Mar 2010 | EP |
2731049 | May 2014 | EP |
3041400 | Jul 2016 | EP |
3041401 | Jul 2016 | EP |
2281838 | Mar 1995 | GB |
20160111018 | Sep 2016 | KR |
20160111019 | Sep 2016 | KR |
2009129222 | Oct 2009 | WO |
2010085977 | Aug 2010 | WO |
2013067230 | May 2013 | WO |
2013117727 | May 2013 | WO |
2014109498 | Jul 2014 | WO |
2015034560 | Mar 2015 | WO |
2015034561 | Mar 2015 | WO |
2018064141 | Apr 2018 | WO |
Entry |
---|
Ebisawa, “Improved Video-Based Eye-Gaze Detection Method”, IEEE Transactions on Instrumentation and Measurement, vol. 47, No. 4, Aug. 1998, pp. 948-955. |
Tian et al., “Dynamic visual acuity during transient and sinusoidal yaw rotation in normal and unilaterally vestibulopathic humans”, Experimental Brain Research, vol. 137, No. 1, Mar. 1, 2001, pp. 12-25. |
Kumar et al., “Electrooculogram-Based Virtual Reality Game Control Using Blink Detection and Gaze Calibration”, 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE, Sep. 21, 2016, pp. 2358-2362. |
Number | Date | Country | |
---|---|---|---|
20190327399 A1 | Oct 2019 | US |
Number | Date | Country | |
---|---|---|---|
61873154 | Sep 2013 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 15453661 | Mar 2017 | US |
Child | 16167874 | US | |
Parent | 14452178 | Aug 2014 | US |
Child | 15453661 | US |