Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet as filed with the present application are hereby incorporated by reference under 37 CFR 1.57.
Audio spatialization is of interest to many headphone users, such as gamers (where is my opponent?), audiophiles (where is the cello?), and pilots (where is ground control?). Location cues can be rendered through conventional headphones to signal, for example, the location of an opponent's footsteps in a video game. The normal human array of two ears, the complex shape of the pinnae, and the computational capacities of the rest of the auditory system provide sophisticated tools for sound localization.
These tools include head related transfer function (HRTF), which describes how a given sound wave input (parameterized as frequency and source location) is filtered by the diffraction and reflection properties of the head, pinna, and torso, before the sound reaches the transduction machinery of the eardrum and inner ear; interaural time difference (ITD) (when one ear is closer to the source of the sound waves than the other, the sound will arrive at the closer ear sooner than it will at the ear that is farther from the sound source); and interaural level difference (ILD) (because sound pressure falls with distance, the closer ear will receive a stronger signal than the more distant ear). Together these cues permit humans and other animals to quickly localize sounds in the real world that can indicate danger and other significant situations. However, in the artificial environment of reproduced sound, and particularly sound reproduced through headphones, localization can be more challenging.
Presenting additional information through taction can provide another means for enhancing the perception of sound location.
Apparatus and methods for audio-tactile spatialization of sound and perception of bass are disclosed. The apparatus and methods of the present invention provide quiet, compact, robust hardware that can accurately produce a wide range of tactile frequencies at a perceptually constant intensity. For greater expressiveness, some apparatus for moving the skin in multiple axes are also disclosed. Signal processing methods are presented to enhance the user's experience of audio spatialization. The methods transform audio signals into directional tactile cues matched to the time resolution of the skin, and which exploit directional tactile illusions.
In some embodiments, apparatus for generating tactile directional cues to a user via electromagnetically actuated motion is provided. The apparatus includes a first ear cup configured to be located proximate to a first one of the user's ears and a second ear cup configured to be located proximate to a second one of the user's ears. Each ear cup includes a vibration module that produces motion in a plane substantially parallel to the sagittal plane of a user's head and a cushion in physical contact with the vibration module. The vibration module of each ear cup is independently addressable, and electrical signals delivered simultaneously to each vibration module produce independent vibration profiles in each vibration module. When applied to the user's skin the independent vibration profiles produce a directionally indicative tactile sensation. In some embodiments, each ear cup can include two or more independently addressable vibration modules to provide finer directionally indicative tactile sensations. In further embodiments, electrical signals delivered to each vibration module are offset from each other in time, preferably by at least 20 ms. In still further embodiments, the electrical signals may accelerate at least one of the vibration modules more quickly when the waveform is moving in one direction and more slowly when the waveform is moving in the opposite direction.
In some embodiments, an apparatus is provided that includes electro-acoustic drivers for reproducing audio waveforms as sound and tactors for generating electromagnetically actuated motion. The apparatus further includes one or more ear cups or frames. Each ear cup or frame locates the electro-acoustic driver proximate to an ear canal of a user and locates the tactors in direct or indirect contact with the user's skin. Each tactor is capable of generating motion along at least one axis, and two or more tactors are located proximate to the same side of said user's head. Preferably, each tactor is independently addressable and generates motion in a plane parallel to the user's sagittal plane. In some embodiments, the ear cups or frames locate one or more tactors in an anterior direction relative to the user's ear and one or more vibration modules in a posterior direction relative to the user's ear. In these and other embodiments, the ear cups or frames locate one or more tactors in a superior direction relative to the user's ear and one or more vibration modules in an inferior direction relative to the user's ear.
In some embodiments, a vibration module is provided that generates electromagnetically actuated motion along a first axis and a second axis, where the first and second axes lie in substantially the same plane. The vibration module includes a first conductive coil and a second conductive coil, where said first coil is configured to generate a magnetic field that is oriented substantially orthogonal to the orientation of the magnetic field generated by said second coil. The vibration module also includes a pair of magnets aligned with the magnetic field generated with said first conductive coil and a pair of magnets aligned with the magnetic field generated with said second conductive coil. Still further, the vibration module includes a moveable member formed from at least the magnets or said conductive coils, a suspension that guides said moveable member with respect to the other of said magnets or said conductive coils, and at least a damping member in communication with said moveable member. At least one of said tactors may be driven independently of at least one other of said tactors located proximate to the same side of said user's head.
In some embodiments, methods and systems are provided for electronic tuning of tactile transducer parameters for improved performance in both frequency and time domains.
In some embodiments, methods and systems are provided for the use of accelerometers to provide closed-loop control of tactile transducers.
In some embodiments, methods and systems are provided for the use of microphones to provide closed-loop control of tactile transducers.
In some embodiments, methods and systems are provided for the use of tactile transducers to enhance noise cancellation in devices such as noise cancelling headphones.
In some embodiments, methods and systems are provided for the use of finite impulse response filtering to improve fidelity of tactile output of tactile transducers.
In some embodiments, methods and systems are provided for techniques for matching the dynamic range of tactile acoustic transducers.
In some embodiments, methods and systems are provided for minimizing high-frequency output from tactile transducers with soft saturation filters.
In some embodiments, methods and systems are provided for devices including wearable tactile transducers that do not block the ambient sound field.
In some embodiments, methods and systems are provided for selectably turning off acoustic output in a tactile transducer-enabled headset.
In some embodiments, methods and systems are provided for improving manufacturability of tactile transducers employing fluid damping.
In some embodiments, methods and systems are provided for using tactile transducers to enhance brain wave entrainment.
In some embodiments, methods and systems are provided for including tactile transducers for in-ear headphones.
In some embodiments, methods and systems are provided for employing controlled lighting to enhance visibility of the movement of a tactile transducer.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features and advantages will be apparent from the description and drawings.
For a fuller understanding of the inventive embodiments, reference is made to the following description taken in connection with the accompanying drawings in which:
Frequencies below about 200 Hz are perceived both by sound and touch, a fact that is familiar to anyone who has “felt the beat” of strong dance music in the chest, or rested their hand on a piano. Thus, the tactile sense has much to offer a listener when proper apparatus and signals are provided. Adding sound-derived tactile stimulation, appropriately processed, can improve the sense of sound location. Adding tactile stimulation (“taction”) is also of interest to those who enjoy loud music, as it can provide a listener with enhanced intensity at a reduced acoustic volume, thereby sparing their hearing from damage.
A number of advantages can be achieved by enhancing the directional cues already present in sound with taction. Some embodiments of the present invention are directed to delivering a Tactile InterAural Level Difference (TILD). The enhancement offered by the subject invention may be understood with a simple example: an observer witnessing another person walking on a resonant floor, as illustrated in
A pictorial representation of the tactile enhancements provided by embodiments of the present invention is shown in
Transmission of a signal conveying spatial information via relative amplitude differences using taction can be accomplished with two tactors—one on each side of the head. Tactors could also be used to convey more complex signals. For example, if the ear cup of a headphone, or a portion thereof, could alternatively push forward, or backward, or upward or downward, then a great deal more information could be communicated, including the direction of movement of an object such as the opponent's foot in the air. By this metaphor and others, one can imagine how an appropriately expressive headphone could naturally augment the cues of spatial audio.
Studies with low-fidelity actuators playing tones on the skin of the head and torso have shown that tactile cues can speed reaction time over audio alone, and can help users discriminate direction (J. B. F. van Erp and B. P. Self. RTO-TR-HFM-122-Tactile Displays for Orientation, Navigation and Communication in Air, Sea and Land Environments. NATO Science and Technology Organization, 2008).
Accordingly, the inventor undertook measurements of reaction times in a left/right discrimination task to see if low-frequency vibrations derived from audio could provide similar benefits when displayed to skin contacting the cushions of headphones. The headphones produced damped, electromagnetically-actuated motion in the sagittal plane, as disclosed previously in application Ser. No. 14/864,278, now issued as U.S. Pat. No. 9,430,921, the disclosure of which is incorporated by reference herein in its entirety. Improvement in median response time for the three subjects in the test was 60 milliseconds, indicating that the added tactile signal enabled users to respond to a left or right stimulus more quickly.
In another preliminary study conducted by the inventor and a colleague, the effect of audio-derived tactile stimulation on a user's preferred listening level was investigated, to see if adding skin vibration would lower a user's preferred acoustic volume (Schweitzer, H. C. & Biggs, J. (2014). Prospects for hearing protection with multi-sensory headphone innovations. Presentation to the Annual meeting of the American Academy of Audiology. Orlando Fla.). On average, the 5 subjects in the inventor's study lowered their preferred acoustic volume 4 dB when skin vibration was added. This volume reduction was non-trivial in terms of hearing preservation, since NIOSH hearing safety guidelines show a 4 dB reduction is equivalent to cutting sound exposure time by more than half. Thus, taction may provide a long-term hearing protection benefit.
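The "more than half" figure can be checked with a short calculation. The sketch below assumes the NIOSH recommended exposure limit of 85 dBA for 8 hours with a 3 dB exchange rate (allowable time halves for each 3 dB increase); the function names are illustrative and are not taken from the study cited above.

```python
def allowed_hours(level_dba, criterion_dba=85.0, criterion_hours=8.0, exchange_db=3.0):
    """Allowable daily exposure time at a given level under a 3 dB exchange rate."""
    return criterion_hours / 2 ** ((level_dba - criterion_dba) / exchange_db)


def relative_dose_rate(reduction_db, exchange_db=3.0):
    """Fraction of the original dose accumulated per hour after lowering the level."""
    return 2 ** (-reduction_db / exchange_db)


if __name__ == "__main__":
    # Lowering the level by 4 dB cuts the dose rate to about 40% of its
    # original value, i.e. the same as listening for less than half the time.
    print(round(relative_dose_rate(4.0), 3))
    # Example: allowable time at a hypothetical 94 dBA versus 90 dBA.
    print(allowed_hours(94.0), round(allowed_hours(90.0), 2))
```

Under these assumptions a 4 dB reduction lengthens the allowable exposure time by a factor of about 2.5, which is consistent with the statement that it is equivalent to cutting exposure time by more than half.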
The perceptual enhancement of directional cues described above can be applied in a number of additional contexts. For example, many hearing impaired people rely heavily on visual input, but the human visual field is limited to, at best, roughly a half sphere; events outside that range may be undetected by the profoundly hearing impaired. In general, such people are likely to be at least as sensitive to taction as are fully hearing people. It would be very helpful to those with hearing impairments to have a means by which sound-generating events occurring outside a person's field of view could be conveyed via tactors. Such information could be coded via TILD, for example, so as to cue the wearer as to the direction of the source of the signal. Thus, if a hearing-impaired person is crossing a street and does not see an approaching automobile, that person would receive tactile cueing indicating that a horn is honking nearby. But without directional cueing, it is likely to take the wearer precious time to find the source visually. It would be far more useful (and potentially life-saving) to use tactors to convey directional information. While such tactors can be incorporated into a headphone that also conveys information via sound waves, some hearing impaired users might prefer a system that conveys only tactile signals.
In addition to assisting the hearing impaired, this aspect of the subject invention may be used to augment the senses of people with normal hearing when they operate under conditions in which normal hearing is compromised. For example, workers in industrial settings that are very loud (e.g., steel mills and other heavy industries) often wear (and may be required to wear) hearing protection. While earplugs or over-the-ear hearing protectors can preserve hearing against long-term exposure to high sound levels, they also block audible cueing that a worker may very much want to receive, such as the sound of a forklift approaching from behind, or the voice of a co-worker. Taction could provide a means for cueing a worker wearing hearing protection of the location of a sound source that is outside her visual field.
Similarly, taction could provide soldiers with a virtually silent cueing mechanism to inform them of the location of friendly (or unfriendly) actors, and could help firefighters locate each other inside burning buildings. Situational awareness is vital in these and other high-risk situations. Battlefields can be very loud, and hearing loss among soldiers is a serious problem. Hearing protection reduces situational awareness. The problem may be exacerbated when other equipment, such as night vision goggles, reduces the visual field. But a taction-based system could protect hearing while preserving situational awareness. Signal processing could convert relevant audio and other information into specific types of tactile signals. For example, if a 4-member patrol is operating in a low-visibility environment, it would be useful to provide a means by which each soldier could sense the location of each of the other team members.
There are multiple ways of determining the spatial relationship between multiple persons or objects. One such method is described in US Patent application number US20150316383A1 and in WO2012167301A1, both to Ashod Donikian, which uses data from inertial sensors such as accelerometers and gyroscopes commonly found in mobile devices to provide 3D information. The acquisition of position information is outside the scope of the current invention. However, there are likely contexts in which presentation of that spatial information via traditional methods (hand-held displays, heads-up displays or even traditional audio prompts) is impractical or ineffective. The subject invention can relieve information overload from the visual and auditory communications channels, which may both lower the cognitive load of users and provide a shorter “signal path” to the decision-making areas of the brain.
When audio-frequency signal 230 is generated forward and to the right of the person wearing the headset, front right microphone 226 captures strong signal 232, while right rear microphone 228 captures weaker signal 234. The signals from both microphones are transmitted to digital signal processor (“DSP”) 250. DSP 250 may analyze relative loudness, arrival times and other parameters in order to determine the vector of origin for the sound. DSP 250 then generates signal 252 to send to the appropriate tactor or tactors. In this case, signal 252 might be sent solely to tactor 222, or to each tactor (or a subset of all tactors) with amplitudes varying in relation to the relative distance from the vector of origin to the respective tactor.
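One way the relative-loudness analysis might be sketched is shown below. This is a minimal illustration only: it assumes two microphones on one side of the head, uses RMS level as the sole cue, and splits the tactile drive in proportion to those levels as a stand-in for the distance-based weighting described above. The function names are hypothetical and arrival-time analysis is omitted.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def front_rear_weights(front_samples, rear_samples):
    """Split tactile drive between front and rear tactors by microphone level.

    Returns (front_weight, rear_weight), which sum to 1.0 unless both
    microphones are silent, in which case both weights are 0.0.
    """
    front, rear = rms(front_samples), rms(rear_samples)
    total = front + rear
    if total == 0.0:
        return 0.0, 0.0
    return front / total, rear / total


if __name__ == "__main__":
    # Hypothetical capture: the same 100 Hz tone, four times stronger in front.
    n, rate = 480, 48000
    tone = [math.sin(2 * math.pi * 100 * i / rate) for i in range(n)]
    strong = tone
    weak = [0.25 * s for s in tone]
    print(front_rear_weights(strong, weak))  # roughly (0.8, 0.2)
```

A fuller implementation would likely combine level with arrival-time differences and smooth the weights over time, but those choices are outside what the text specifies.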
The signal sent to the tactor must match the frequency response of the tactor and the perceptual range of the skin, even though the original sound received by the microphone might be well outside one or more of those ranges. Thus, the signal generated by DSP 250 may be harmonically related to the original signal (as when the original signal is processed through a divider network). Or it may be unrelated to the source signal, but chosen based on maximum sensitivity of the subject, or on some other basis.
When employing taction in order to enhance the bass response of headphones, it may be important to ensure good matching of the perceived volume level produced by the conventional sound-generating means (one or more transducers that create sound waves in the air between the driver and the eardrum) and the tactors, which produce vibration directly on the skin rather than through the air. Similar problems have been addressed for decades in multi-driver loudspeakers (and more recently, headphones), which may use crossover networks (traditionally comprised of capacitors, inductors and resistors) to send low frequencies to one driver and high frequencies to another. In such systems it is generally necessary to attenuate the output of at least one driver in order to present the desired overall frequency response to the listener.
Presenting a desired overall frequency response is more complex when combining tactors with conventional drivers, in part because the two different drivers present information via two different perceptual channels, which the brain effectively re-assembles into the desired result. Where a calibrated microphone can take a single measurement of a multi-driver speaker system (putting aside issues of positioning, room effects, etc.), a microphone cannot integrate sound pressure levels generated by conventional drivers with the vibrations generated by tactors. As used herein, the terms “conventional driver” and “audio driver” are used interchangeably and encompass a wide range of technologies, including moving coil drivers, electrostatic drivers, balanced armatures, planar drivers and other designs. As used herein, the term “conventional drivers” refers to drivers that produce sound by compressing and rarefying air, thereby creating sound waves detected primarily through hearing.
It may also be the case that there are different target tactile frequency responses for headphones relative to head-mounted displays, and other wearable technology. Finally, there are at least three ways of quantifying the magnitude of the taction effect on a “listener”: acceleration (measured in, for example, meters/second/second); velocity (measured, for example in meters/second); and displacement (measured, for example, in meters). Previous research developed the iso-sensation curves for touch illustrated in
Previous attempts to present audio frequency information via taction have tended to design and measure those systems based upon their characteristics in terms of displacement (i.e., the distance traveled by the tactor when producing vibration) and/or acceleration (the rate of change in its movement). It is likely that these measurements were favored because of the common and inexpensive availability of tools (e.g., Linear Variable Displacement Transducers, accelerometers) that can directly measure those parameters. This prior work, based on measurements of displacement, does not yield subjectively flat frequency response for taction in the range of 20 to 150 Hz.
As shown in
In contrast to this displacement-based description of perceived intensity, loudspeakers have been measured for decades using microphones and related equipment capable of plotting sound pressure levels at various frequencies. Measuring speakers in terms of sound power levels is interchangeable with measuring their velocity (with adjustment for the relative surface area of the drivers), since SPL=Apv, where A is area, p is pressure and v is speaker cone velocity.
Sufficient displacement data is presented in the Verillo et al. paper previously referenced to derive velocity and acceleration iso-sensations in addition to the iso-sensations provided for displacement. This is because for sinusoidal motion the displacement, velocity, acceleration, and frequency are related as in equations 1-3, where A is displacement amplitude, ω is frequency in radian/s, and t is time.
$x = A \sin(\omega t)$ (Eq. 1)
$v = \omega A \cos(\omega t)$ (Eq. 2)
$a = -\omega^{2} A \sin(\omega t)$ (Eq. 3)
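The sinusoidal relations in equations 1-3 imply that the peak velocity is the displacement amplitude multiplied by the angular frequency, and the peak acceleration is the displacement amplitude multiplied by the angular frequency squared. The following minimal sketch applies those relations; the numeric values are hypothetical and are not taken from the referenced iso-sensation data.

```python
import math


def velocity_amplitude(displacement_m, freq_hz):
    """Peak velocity (m/s) of sinusoidal motion with the given peak displacement."""
    return 2 * math.pi * freq_hz * displacement_m


def acceleration_amplitude(displacement_m, freq_hz):
    """Peak acceleration (m/s^2) of sinusoidal motion with the given peak displacement."""
    return (2 * math.pi * freq_hz) ** 2 * displacement_m


def displacement_for_velocity(velocity_m_per_s, freq_hz):
    """Peak displacement (m) needed to hold a target peak velocity at a frequency."""
    return velocity_m_per_s / (2 * math.pi * freq_hz)


if __name__ == "__main__":
    # Hypothetical point: 1 micrometer peak displacement at 100 Hz.
    print(velocity_amplitude(1e-6, 100.0))      # about 6.28e-4 m/s
    print(acceleration_amplitude(1e-6, 100.0))  # about 0.395 m/s^2
    # Holding velocity constant requires displacement to fall as 1/f.
    for f in (20.0, 40.0, 80.0):
        print(f, displacement_for_velocity(1e-3, f))
```

The last loop illustrates why a displacement-based description and a velocity-based description diverge across frequency: constant velocity means halving the displacement for each doubling of frequency.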
Each of those three iso-sensation graphs, limited to the relevant frequency range, is shown in
Actually delivering consistent velocity as a function of frequency with a tactor in a headphone is a complex undertaking. Some of the factors that will affect the velocity presented at the interface between the taction system and the wearer include (1) the mechanical characteristics of the tactor itself, including the inertial mass of the reciprocating portion of the tactor, the characteristics of the spring that provides restorative force to the reciprocating portion of the tactor, and the damping applied to the system; (2) the effective mass of the headphone cup or other tactor housing; (3) the stiffness and damping of the headphone bow or other means by which the tactor is held against the skin; (4) the shear stiffness and damping of the cushions or other compressible material(s) used to couple the tactor to the skin, if any; and (5) the shear stiffness and damping of the scalp around the ear or other location where the tactor is held against the skin.
Ear cup 512 generally contacts the wearer's head via two structures: cushion 516, and the bow 517, which generally connects the left and right ear cups and provides some clamping force while distributing some of the weight of the headphones away from the cushions and to the top of the wearer's head. Some headphones use non-contact bows; these are generally lighter weight headphones. Cushion 516 may be conceptually understood as including both a spring 516.1 and a damper 516.2, which is typically provided in the form of a foam member possessing both properties. Bow 517 may also be cushioned so as to provide characteristics of both a spring 517.1 and a damper 517.2. (If the portion of the bow contacting the wearer's head does not comprise a foam or foam-like cushion, the bow may not exhibit these properties.)
The goal of taction module 502 is to move the wearer's skin 524 relative to the rigid structure underneath: cranium 530. The skin has its own elastic properties, and thus may be viewed as including spring 526 and damper 528.
Because the point of adding taction in the first place is to create the proper amount of movement at the interface 532 of the cushion and the skin, the entire system must be taken into account in order to produce the correct velocity at that point. Thus tuning the behavior of the entire system to deliver constant velocity output at interface 532 for a given level of input is critical. It is impractical at best to change the properties of the skin on the listener, and when adding tactors to an existing system, most of the critical parameters are difficult to significantly change. One of the properties most accessible for the taction designer is the damping 510 within the tactor 502.
A mechanical system capable of producing significant output at frequencies as low as 5 or 10 Hz requires movably suspending a significant mass. In motion, such a mass stores significant kinetic energy, and if appropriate means are not provided to dissipate that energy, such a transducer will exhibit highly under-damped motion at resonance, which is inconsistent with the goal of flat velocity response. In the context of headphones used to listen to music, an under-damped tactile transducer gives “one-note bass,” which greatly reduces the pitch information present in low-frequency music. In other contexts, it may interfere with other forms of signaling associated with different frequencies.
To make the system still more complex, the resonance of the module itself becomes part of the complex resonant system discussed above. There is limited value in providing a module that has a flat frequency response when suspended in free space, if the system response becomes non-flat once it is added to headphones mounted on a human head. Accordingly, an object of the present invention is to provide a method of damping of taction modules specifically adjusted to provide headphone tactors with a flat velocity response when they drive a load like cushioned ear cups shearing skin around a wearer's ears.
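The influence of damping on velocity response can be illustrated with a single mass-spring-damper model. The sketch below is a deliberate simplification: it treats the tactor alone, driven by a sinusoidal force, and ignores the ear cup, cushion, bow, and skin elements described above. All parameter values are hypothetical.

```python
import math


def velocity_per_force(freq_hz, mass_kg, stiffness_n_per_m, damping_ns_per_m):
    """Magnitude of velocity per unit force (m/s per N) for a driven
    mass-spring-damper: |j*w / (k - m*w^2 + j*w*c)|."""
    w = 2 * math.pi * freq_hz
    return abs(1j * w / (stiffness_n_per_m - mass_kg * w * w + 1j * w * damping_ns_per_m))


def resonance_hz(mass_kg, stiffness_n_per_m):
    """Undamped natural frequency in Hz."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2 * math.pi)


def quality_factor(mass_kg, stiffness_n_per_m, damping_ns_per_m):
    """Q of the resonance; a high Q corresponds to under-damped behavior."""
    return math.sqrt(stiffness_n_per_m * mass_kg) / damping_ns_per_m


if __name__ == "__main__":
    m, k = 0.010, 631.65           # hypothetical 10 g mass, about 40 Hz resonance
    f0 = resonance_hz(m, k)
    for c in (0.25, 5.0):          # lightly damped versus heavily damped
        peak = velocity_per_force(f0, m, k, c)
        off = velocity_per_force(2 * f0, m, k, c)
        print(c, round(quality_factor(m, k, c), 2), round(peak / off, 2))
```

In this model the lightly damped case shows a large ratio between the response at resonance and the response one octave above it, which corresponds to the "one-note bass" behavior, while the heavily damped case is much flatter. Whether a given damping value yields a flat response on a wearer's head depends on the coupled system, which this sketch does not model.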
The effect on frequency response of applying damping to the tactors is shown in
A potential consequence of using tactors to provide deep bass is that the action of the tactors is not solely perceived via shear against the skin of the listener: the tactors may also produce audio output which can be perceived via the conventional auditory pathways. Maintaining a desired acoustic frequency response in a headphone when ear cups are vibrated thus requires accounting for the combined audio contribution of the conventional drivers and the tactors. Although moving the ear cups parallel to the side of the head (as disclosed in the present invention and in application Ser. No. 14/864,278, now issued as U.S. Pat. No. 9,430,921, and which is incorporated by reference herein in its entirety) is far quieter than moving them toward and away from the head (as practiced in the prior art), the excess sound generated may not be negligible, and could produce acoustic bass audio of 90 dB or louder all by itself. This output may not be objectionable in and of itself, but may create undesired effects when added to (or subtracted from, depending on phase) the output of the conventional driver. One way to compensate for this excess acoustic bass is to attenuate the acoustic driver when the tactile vibration is already providing the acoustic bass audio.
Accordingly, several methods for accomplishing this attenuation are disclosed. One method is to treat the tactile transducer as a subwoofer, and to use a crossover circuit that attenuates the acoustic driver based on frequency as illustrated in
Preserving phase is a desirable aspect of the hand-off from driver acoustics to tactor acoustics. It may be attained by appropriately matching the order of the high and low-pass filters, as is understood in the art of pure audio crossover circuits. It is also preferable to perform such crossover function with low-level signals (i.e., prior to amplification), because passive high-pass filtering generally requires physically large (and expensive) inductors.
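As one illustration of matched filter orders, a first-order low-pass and a first-order high-pass with the same corner frequency sum back to the original signal with unit magnitude and zero phase shift. The sketch below evaluates those two transfer functions; the 80 Hz corner frequency is a hypothetical value, and the result describes only the electrical signals, not the combined acoustic and tactile output at the wearer's head.

```python
import cmath
import math


def lowpass(freq_hz, crossover_hz):
    """First-order low-pass response (tactor branch): 1 / (1 + s)."""
    s = 1j * freq_hz / crossover_hz
    return 1 / (1 + s)


def highpass(freq_hz, crossover_hz):
    """First-order high-pass response (acoustic driver branch): s / (1 + s)."""
    s = 1j * freq_hz / crossover_hz
    return s / (1 + s)


if __name__ == "__main__":
    fc = 80.0  # hypothetical crossover frequency in Hz
    for f in (20.0, 80.0, 320.0):
        total = lowpass(f, fc) + highpass(f, fc)
        print(
            f,
            round(20 * math.log10(abs(lowpass(f, fc))), 2),   # low branch, dB
            round(20 * math.log10(abs(highpass(f, fc))), 2),  # high branch, dB
            round(abs(total), 6),                             # summed magnitude
            round(math.degrees(cmath.phase(total)), 6),       # summed phase, degrees
        )
```

Higher-order designs trade this exact reconstruction for steeper slopes; the choice between them is a design decision not specified here.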
The approach shown in
It is also possible to reduce or eliminate unwanted effects resulting from overlapping coverage between tactors and conventional drivers by attenuating output of the tactors in the frequency range of concern, either through crossover design or through feedback mechanisms as disclosed above.
As previously discussed, one benefit of the instant invention is the ability to convey complex spatial information using taction. For a number of reasons, it is desirable to address how embodiments of the invention can integrate with current audio standards. Tactile technology that leverages existing audio tools has a better chance of success because sound authoring tools already exist and professionals, like sound designers for games, movies, and virtual environments, are in place to apply them. Accordingly, the present invention contemplates extending existing audio editing tools, so that authors may embed useful tactile content into existing audio streams. The present invention also contemplates the creation of hardware that is capable of extracting that tactile content from conventional audio streams and delivering that content to the user. Accordingly, plugins for audio editors such as Virtual Studio Technology (“VST”) and Audio Units are explicitly contemplated.
VST is a software interface that integrates software audio synthesizer and effect plugins with audio editors and recording systems. VST and similar technologies use digital signal processing to simulate traditional recording studio hardware in software. Audio Units are digital-audio plug-ins provided by Core Audio in Apple's OS X and iOS operating systems. AU are a set of application programming interface (API) services provided by the operating system to generate, process, receive, or otherwise manipulate streams of audio with minimal latency. There are also large existing libraries for the audio APIs of video game engines. It would be desirable to provide a means for delivering spatial cueing that is compatible with existing techniques and protocols for delivering audio content.
On the hardware side, things can be simple when the tactile content aims primarily to reinforce the audio signal. Since the tactile content is generally simultaneous with the higher-frequency audio signal, low-pass filtering can be sufficient to extract it.
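A minimal sketch of such low-pass extraction is a one-pole recursive filter. The 200 Hz cutoff and 48 kHz sample rate below are assumptions chosen for illustration, and the function name is hypothetical; a practical design might use a steeper filter.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Return a low-passed copy of `samples` using a one-pole recursive filter."""
    alpha = 1 - math.exp(-2 * math.pi * cutoff_hz / sample_rate_hz)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # move the state a fraction of the way toward the input
        out.append(y)
    return out


def sine(freq_hz, sample_rate_hz, count):
    """Unit-amplitude test tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(count)]


if __name__ == "__main__":
    rate, n = 48000, 48000
    bass = one_pole_lowpass(sine(50.0, rate, n), 200.0, rate)
    treble = one_pole_lowpass(sine(5000.0, rate, n), 200.0, rate)
    # Compare steady-state peaks: the 50 Hz tone passes nearly unchanged,
    # while the 5 kHz tone is strongly attenuated.
    print(max(abs(v) for v in bass[n // 2:]), max(abs(v) for v in treble[n // 2:]))
```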
As discussed above, if headphones are provided with at least two tactors in each ear cup, it is possible to do more than just enhance audio content with deep bass: if two tactors per side of the head are provided, taction can provide cues about the front-versus-back location of a sound source, in addition to right-left information. For example, an array of four tactors can be provided such that one is located in front of the left ear, the second behind the left ear, the third in front of the right ear, and the fourth behind the right ear. Such an arrangement can be achieved, for example, by placing multiple tactors in segmented headphone cushions, as is discussed more fully below. With such an arrangement, audio-derived tactile vibration may be routed to the tactor closest to the sound source. It should also be noted that the same concept can be used to integrate the third dimension in tactile spatial signaling. That is, if additional tactors are provided and arranged so that some are higher on the user's head and some are lower, it is possible to signal not just front-back information, but also up-down information.
When presenting a sound intended to be localized as coming from behind and to the right of the headphone wearer, such as footfall 1218, a corresponding signal 1220 (represented as a waveform over time) may be sent to right rear tactor 1216, while no signal (represented by a flat line 1222) is sent to right front tactor 1214. Similarly, the left rear tactor (not shown) would receive null signal 1224, and the left front tactor would receive null signal 1226. To present a sound as localized as coming from the right front, tactor 1214 would receive a signal, while the other three would not.
In the simplest case, taction signals would go to only one tactor. However, it is also possible to represent intermediate vectors with weighted signals going to more than one tactor. Thus sending 75% of the signal to the left rear and 25% to the left front would convey that the source was to the left and somewhat to the rear; sending 50% to the left rear and 50% to the right rear would convey that the source was directly behind the user, and so on.
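The weighting described above can be sketched as linear interpolation between the two tactors adjacent to the source direction. The tactor azimuths below (measured clockwise from straight ahead) and the function name are assumptions for illustration; the text does not specify tactor angles or an interpolation law.

```python
# Hypothetical tactor positions, in degrees clockwise from straight ahead.
TACTOR_AZIMUTHS_DEG = {
    "front_right": 45.0,
    "rear_right": 135.0,
    "rear_left": 225.0,
    "front_left": 315.0,
}


def tactor_weights(source_azimuth_deg):
    """Return per-tactor weights (summing to 1.0) for a source direction.

    The signal is shared linearly between the two tactors that bracket
    the source azimuth; all other tactors receive zero.
    """
    names = list(TACTOR_AZIMUTHS_DEG)
    weights = {name: 0.0 for name in names}
    az = source_azimuth_deg % 360.0
    for i, name in enumerate(names):
        nxt = names[(i + 1) % len(names)]
        start = TACTOR_AZIMUTHS_DEG[name]
        span = (TACTOR_AZIMUTHS_DEG[nxt] - start) % 360.0
        offset = (az - start) % 360.0
        if offset < span:
            frac = offset / span
            weights[name] = 1.0 - frac
            weights[nxt] = frac
            break
    return weights


if __name__ == "__main__":
    print(tactor_weights(180.0))   # directly behind: rear tactors share equally
    print(tactor_weights(247.5))   # left and somewhat to the rear: 75% / 25%
    print(tactor_weights(135.0))   # exactly at the right rear tactor
```

With these assumed positions, a source at 180 degrees yields 50% to each rear tactor, and a source at 247.5 degrees yields 75% to the left rear and 25% to the left front, matching the two examples in the preceding paragraph.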
One example of a widely used spatial coding system is Dolby 7.1, which is used in a variety of equipment including sound cards for personal computers and home theater receivers and processors. As shown in
Other surround standards have included Dolby 5.1 and DTS. Those with ordinary skill in the art will appreciate that the techniques discussed in this document may be applied in those and other similar contexts as well.
There have been multiple commercial products that seek to provide the “surround sound” experience using headphones. Many of these involve providing a relatively large number of conventional drivers within each ear cup. The limited real estate inside a headphone cup generally requires that those conventional drivers be smaller than the drivers in typical stereo headphones, which can compromise audio quality. Furthermore, the close proximity of the drivers, and the difficulty of isolating those drivers from each other, makes providing a convincing experience challenging. Providing a method for mapping the information encoded in Dolby 7.1 to stereo headphones provided with four tactors, on the other hand, presents spatial information without compromising audio quality.
One aspect of the subject invention is a means for using multiple tactors to encode multi-channel spatial information using conventional stereo headphones. A simplified conceptual version of this concept is shown in
One drawback to such a simplified approach is that taction is most effective for low frequencies, and tactors are likely to be used with low-pass filtering, so that high frequency content in the surround channels will be filtered out of the taction signal, thereby reducing the surround effect. While tactors alone will not be capable of fully realizing a surround effect, aspects of the subject invention present more sophisticated matrix approaches that can deliver significant surround effects despite these limitations.
One method of mapping the 8 channels of a 7.1-encoded program to a headphone system consisting of two audio drivers and four tactors is shown in
In order to achieve these effects, it is necessary for the full multi-channel signal set to reach the processors performing the steps listed above. Thus the result can be accomplished by providing a separate module that is connected between the signal source and the headphones. The signal source may be a game console, home theater receiver or processor, computer, portable device capable of outputting multi-channel audio, or other compatible device. Alternatively, the processors may be located within the headphones themselves, but that approach requires that the information contained in each channel remain separate when conveyed to the headphones, which requires a more complex cable. As a further alternative, the data may be transmitted wirelessly from the module to the headphones, before or after the summation. An additional alternative is to transmit the audio information to the headphones as an integrated digital signal, with decoding and digital-to-analog conversion taking place in circuitry within the headphones. The particular summing scheme described here is merely an illustrative example, and other relative weight factors and additional audio-to-tactile connections are contemplated by the present invention.
It may be that a movie, game, or song encoded with an existing audio standard such as Dolby 5.1, Dolby 7.1, or DTS already has appropriate low-frequency information in the selected channels that can be presented using tactors. In those cases, routing directional cues to the tactors is more straightforward. Or, it may be that a given recording has routed much of the content to a Low Frequency Effects (LFE) channel. Where low-frequency content has been routed solely or primarily to the LFE channel, the spatial cueing information that may have once existed in those signals cannot be perfectly reconstructed. However, given the nature of most naturally occurring sounds, which tend to comprise both fundamentals and a series of overtones, a strong impulse in the (directionless) LFE channel, for example, is likely to be correlated with a higher-frequency impulse in one or more of the other directional channels. It is therefore possible to assign the LFE signal to one or more tactors based upon analysis of the signals in the other channels, thereby providing a significant approximation of a full 5.1 or 7.1 experience with stereo headphones. A simple way to accomplish this is to route low frequency effects to the tactor associated with the channel having maximum acoustic power in a specific frequency band, such as the range from 80-200 Hz, as illustrated in
Although it is possible to achieve at least some version of the type of processing discussed through analog circuitry, it is significantly simpler to do so in the digital domain. Accordingly, the simplest way to accomplish this processing is prior to conversion of the digital multichannel signals into analog signals. However, it can still be accomplished after D/A conversion; it would then however be necessary to re-convert the signal into the digital domain prior to processing, and then process it through a second D/A converter after processing.
Input channels may include right front 1502, left front 1504, center 1506, right side 1508, left side 1510, right back 1512, left back 1514, and low frequency effects (LFE) channel 1520. Right front 1502 and left front 1504 signals are sent to the conventional drivers 1530 and 1532 (through circuitry that may include D/A converters and amplifiers, not shown) in addition to being sent to the digital signal processor (DSP) 1540. The remaining channels, including all surround channels and the LFE channel, are sent to the DSP 1540.
In an implementation of this approach, DSP 1540 is used to identify from moment to moment which of the seven directional audio channels contains the strongest signal. If, for example, left rear channel 1514 has the strongest signal (as for example, if the sound of an explosion is to be produced at that location), DSP 1540 will direct the signal from LFE channel 1520 to left back tactor 1550. Similar localization based on activity in the directionally specific channels can be used to direct output to right back tactor 1552, left front tactor 1554, or right front tactor 1556.
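By way of non-limiting illustration, the strongest-channel routing described above may be sketched in Python as follows. The sketch assumes each channel is delivered as a short frame of samples that has already been band-limited as desired. The mapping `CHANNEL_TO_TACTORS` (in particular, how the center and side channels are assigned) and the function names are hypothetical and are not taken from the figures.

```python
# Hypothetical channel-to-tactor routing; the assignments for the center and
# side channels are assumptions made for illustration only.
CHANNEL_TO_TACTORS = {
    "right_front": ["right_front"],
    "left_front": ["left_front"],
    "center": ["right_front", "left_front"],
    "right_side": ["right_front", "right_back"],
    "left_side": ["left_front", "left_back"],
    "right_back": ["right_back"],
    "left_back": ["left_back"],
}
TACTORS = ["right_front", "left_front", "right_back", "left_back"]


def mean_square(frame):
    """Average power of one frame of samples."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def route_lfe(directional_frames, lfe_frame):
    """Send the LFE frame to the tactor(s) of the strongest directional channel."""
    strongest = max(directional_frames,
                    key=lambda ch: mean_square(directional_frames[ch]))
    targets = CHANNEL_TO_TACTORS[strongest]
    outputs = {}
    for tactor in TACTORS:
        if tactor in targets:
            outputs[tactor] = list(lfe_frame)
        else:
            outputs[tactor] = [0.0] * len(lfe_frame)  # null signal
    return strongest, outputs
```

Calling `route_lfe` once per frame gives the moment-to-moment behavior described above; a practical implementation would likely smooth the decision over several frames to avoid audible or tactile switching artifacts.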
While some content presents sounds as being delivered purely by a single channel, modern programming sometimes uses multi-channel content in a more sophisticated way in order to present the illusion that sounds are coming from a place between two discrete outputs. For example, a sound that is intended to sound as if it is coming from directly behind the listener may be presented with equal intensity in both the left rear and right rear channels, with no related output in any of the other channels. Such weighting is particularly useful when presenting the illusion of motion, so that sounds move smoothly between channels rather than jumping from one source to another; the weighting adjusts incrementally.
These more sophisticated effects can be produced as well using the subject invention. In some embodiments, the intensity of the signal in multiple input channels could be weighted and the output directed to a combination of tactors in order to approximate the ratios in the directional channels—in essence, multiplying the vector of spatial audio signals by a weighting matrix. Thus, for example, if instantaneous volume levels are 40% of maximum in the front right channel 1502, and 80% of maximum in right side 1508, and zero in the other channels, the taction signal would be divided among right front tactor 1556 and right rear tactor 1552 in order to place the subjective source of the sound reproduced by the tactors at a point between the two, but closer to the front tactor 1556.
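By way of non-limiting illustration, the weighting-matrix operation described above may be sketched in Python as follows. The matrix entries in `WEIGHTS` are hypothetical: front and back channels are assumed to feed a single tactor, while the center and side channels are assumed to be split equally between two tactors. The normalization of the result to unit sum is likewise an assumption.

```python
CHANNELS = ["right_front", "left_front", "center",
            "right_side", "left_side", "right_back", "left_back"]
TACTORS = ["right_front", "left_front", "right_rear", "left_rear"]

# Hypothetical weighting matrix: one row per tactor, one column per channel.
WEIGHTS = [
    [1.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0],  # right front tactor
    [0.0, 1.0, 0.5, 0.0, 0.5, 0.0, 0.0],  # left front tactor
    [0.0, 0.0, 0.0, 0.5, 0.0, 1.0, 0.0],  # right rear tactor
    [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 1.0],  # left rear tactor
]


def tactor_gains(levels):
    """Multiply the vector of channel levels by WEIGHTS and normalize to sum 1."""
    raw = [sum(w * levels.get(ch, 0.0) for w, ch in zip(row, CHANNELS))
           for row in WEIGHTS]
    total = sum(raw)
    if total == 0.0:
        return {t: 0.0 for t in TACTORS}
    return {t: r / total for t, r in zip(TACTORS, raw)}


# Example from the text: 40% in the right front channel, 80% in the right side.
gains = tactor_gains({"right_front": 0.4, "right_side": 0.8})
```

With these assumed weights, the example levels yield roughly two thirds of the taction signal at the right front tactor and one third at the right rear tactor, placing the perceived source between the two but closer to the front.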
One limitation of this approach is that in some contexts (particularly those with multiple uncorrelated events) not all sounds being generated are related to the specific content in the LFE channel. Thus a more sophisticated approach would involve analysis of the signals present in each directional channel. Heuristics can then infer sound direction from the waveforms present in each of those channels. For example, it is likely that the sound of an explosion will (a) produce a specific waveform in the LFE channel, and (b) produce, in one or more directional channels, a signal that is correlated with that LFE signal. Factors indicating such correlation might include the degree to which frequencies in the audio channel are harmonics of the frequencies in the LFE channel. Alternatively, the best audio channel might be the one whose sound power level has the highest correlation with the sound power level in the LFE channel, or other factors might be used. Those correlations may be used to inform the DSP as to which of the tactors should receive the LFE signal at a given moment.
In the case of many computer games, and for gaming platforms such as the Sony PlayStation and Microsoft Xbox, the problem of delivering directional bass signals to the appropriate tactor is simpler. Position information about sound sources is often available within game software, and the signal can be processed to activate the correct tactor.
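By way of non-limiting illustration, the power-level correlation heuristic described above may be sketched in Python as follows. The frame length, the use of the Pearson correlation coefficient, and the function names are assumptions made for this sketch; the harmonic-relationship test mentioned above is not shown.

```python
import math


def power_envelope(signal, frame_len):
    """Mean-square power of consecutive, non-overlapping frames."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_correlated_channel(directional, lfe, frame_len=64):
    """Pick the directional channel whose power envelope best tracks the LFE."""
    lfe_env = power_envelope(lfe, frame_len)
    scores = {ch: pearson(power_envelope(sig, frame_len), lfe_env)
              for ch, sig in directional.items()}
    return max(scores, key=scores.get), scores
```

Unlike a pure loudness comparison, this selection can prefer a quieter channel whose bursts coincide in time with the LFE activity over a louder channel carrying an unrelated event.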
Because game audio requires real-time audio-to-tactile filtering, it is most efficient to do taction processing within game-engine software. This approach does the necessary audio processing within the computer, console or other device, prior to generation of the signals for each channel and subsequent conversion to analog audio, as opposed to the methods previously discussed, in which processing occurs after those steps have already occurred.
Application Programming Interfaces for spatializing sound are standard features of video games and virtual reality simulations. The present invention contemplates extending the capabilities of these code libraries by incorporating the audio-tactile algorithms disclosed herein. The coding conventions now used to process monaural sounds into spatial audio apply in a natural way to the structure of the audio-tactile direction cueing algorithms outlined here. That is, the game or VR engine sends the following data to the spatializing sound function: (1) the position of a sound emitter relative to the listener's head and (2) the digital file of the sound to be spatialized. After processing, the function returns to the game engine, or sends to the sound card, (1) a right and left audio signal to display to the user and, optionally, (2) additional audio signals for additional transducers, such as the multiple speakers of the Dolby 7.1 format.
The algorithms of the present invention are naturally implemented in this established programming structure. For directional tactile cueing, the general process of changing the signal frequencies (spectral filtering), and introducing appropriate time delays is analogous to the processing required for spatial audio.
The output of tactile directional cueing algorithms may be low-frequency modifications to sounds that will be routed to conventional right and left acoustic drivers. These low frequency signals may subsequently be extracted by low pass filtering at a processing component of the tactor driver. Or the signals may be directed to existing signal pathways that are “vestigial” for headphones, such as the multiple channels that remain unused when headphones are plugged into a Dolby 7.1 sound card. These channels may be attached to tactors instead. Or, the output of the algorithms may be directed to entirely new, dedicated tactile channels, by extension of current audio standards.
Another application for the subject invention involves imparting tactile spatial information to a user. A useful metaphor for the tactile spatialization of sound is the concept of “Liquid Sound.” The directed sensation of flowing water is familiar to everyone. It has a vibratory component—the impact of individual droplets—and a directed force component: the net momentum of the water stream. Tactile stimulation that can create a sense of directed force can make natural use of this familiar metaphor to cue the direction of sound.
When a conventional symmetrical waveform is applied to the skin via taction in the form of shear vibration, there is no net directional force, and no directional signaling other than that conveyed by the difference in intensity between multiple tactors. That is, in a system comprising x tactors, if all x tactors receive the same symmetrical waveform, no directional cueing takes place. However, when shear vibration is applied to the skin, and the vibration has an appropriate asymmetric acceleration profile, the perception can be one of both vibration and a net pulling force. See T. Amemiya, H. Ando, T. Maeda, “Virtual Force Display: Direction Guidance using Asymmetric Acceleration via Periodic Translational Motion”, In Proc. of World Haptics Conference 2005, pp. 619-622. This occurs because the human tactile system is not a perfect integrator, and brief, strong accelerations are felt more than longer weak ones.
A visual representation of this effect is shown in
This tactile illusion provides a rich opportunity to convey directional information about sound. It means that a shear tactor located in a left or right ear cup can provide more than just right/left information by virtue of being on or off. It can also provide forward-back information by directing peak accelerations forward or backward. Thus additional directional cues can be derived from fewer tactors.
To do the requisite audio-to-tactile signal processing, it is useful to consider how acceleration pulses that evoke the tactile illusion of directed pulling appear when expressed in terms of velocity and position. This is accomplished by simple integration with respect to time, and it shows that an acceleration pulse is a velocity sawtooth, as illustrated in
Consider a positively directed acceleration pulse 1802 that evokes a sensation of pulling in the positive direction, as shown in the upper left of
In
The graphs shown in
It should be noted that the equivalent graphs for an un-directed low frequency tone (that is, a sine wave) look very different. The accelerations, velocities, and positions are all simply smooth sinusoids.
When tactors in a wearable device such as headphones are tuned to respond to voltage with velocity, then tactile directional cues may be produced by signal processing methods that turn a low-frequency sine wave (simple vibration) into a saw-tooth wave (directed vibration). The steep part of the sawtooth is the needed acceleration burst. When the position of a sound source is known, as in game software or mixing film audio, the position of the sound source is used to set the polarity and steepness of the burst.
One method of turning a non-directed sine wave into a directed sawtooth is to add higher harmonics. Examples of how this processing affects a sine wave signal are shown in
Graph 1910 shows a reference sine wave 1914 identical to sine wave 1904 as well as that waveform processed in order to create polarity and directional cueing, which results in a rough sawtooth wave 1916. (A perfect sawtooth includes all harmonics, and is thus not achievable by a low frequency driver. As a practical matter, adding a few harmonics is currently deemed sufficient and even advantageous.) Rough sawtooth wave 1916 shows a slow rise and a fast fall. It is thus biased in the negative direction, and the person wearing headphone cup 1918 will perceive that the cup is pulling backwards relative to his head, as indicated by arrow 1919.
Graph 1920 shows both the reference sine wave 1924 identical to sine wave 1904 as well as that waveform processed in order to create polarity and directional cueing, which results in a rough sawtooth wave 1926. Rough sawtooth wave 1926 shows a fast rise and a slow fall. It is thus biased in the positive direction, and the person wearing headphone cup 1928 will perceive that the cup is pulling forward relative to his head as indicated by arrow 1929.
Exemplary Matlab code for transforming a non-directed sine wave into a directed one is presented in
2 sin(θ)·cos(θ)=sin(2θ) (Eq. 4)
and that
cos(θ)=d/dθ(sin(θ)), so that for θ=ωt, d/dt(sin(ωt))=ω·cos(ωt).
Thus, differentiation of the input signal, and multiplication of the result with the input signal itself is used to produce the desired harmonics. But it will be clear to one skilled in the art that any number of approaches to “sawtoothing” the sine wave can yield the desired result.
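By way of non-limiting illustration, the differentiate-and-multiply approach described above may be sketched in Python as follows. This is not the exemplary Matlab code referred to above. The sketch assumes a unit-amplitude input whose fundamental frequency is known, so that the time derivative can be divided by the angular frequency; the finite-difference derivative, the `gain` and `polarity` parameters, and the function name are assumptions.

```python
import math


def sharpen_with_second_harmonic(x, dt, fundamental_hz, polarity=1, gain=1.0):
    """Add a second harmonic formed as x * (dx/dt) / omega.

    For x = sin(wt) the added term is polarity * gain * 0.5 * sin(2wt),
    which skews the sine toward a rough sawtooth.
    """
    n = len(x)
    if n < 2:
        return list(x)
    omega = 2.0 * math.pi * fundamental_hz
    y = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        # Central difference in the interior, one-sided at the two ends.
        dxdt = (x[hi] - x[lo]) / ((hi - lo) * dt)
        y.append(x[i] + polarity * gain * x[i] * dxdt / omega)
    return y


# A 15 Hz sine sampled at 2 kHz, biased in the positive direction.
fs = 2000.0
sine = [math.sin(2.0 * math.pi * 15.0 * k / fs) for k in range(2000)]
directed = sharpen_with_second_harmonic(sine, 1.0 / fs, 15.0, polarity=1)
```

With `polarity=1` the result rises quickly and falls slowly; with `polarity=-1` it rises slowly and falls quickly. Because the derivative step amplifies noise, a practical version would likely low-pass filter the input first.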
An example of the effect of such processing on a 15 Hz sine wave is shown in terms of the expected velocity and acceleration of a tactor-enabled headphone driven by that signal in
This harmonics-based filter (which synthesizes higher harmonics based on the frequency of the fundamental) is just one exemplary method for creating the same directed effect. One possible disadvantage of this particular approach is that the velocity calculation step is sensitive to noise. This may in some cases increase distortion. Another exemplary method for adding directionality that does not have those effects is to detect zero-crossings and add a polarizing bump to the signal when appropriately-directed crossings are detected. A graphic representation of this approach is shown in
Audio waveform 2202 is a complex signal. The portion of the signal displayed includes 11 zero crossings. It should be noted that adding a positive pulse at an upward zero crossing produces a smooth, continuous, and positively-directed directional signal, while adding a negative pulse at a downward zero crossing smoothly produces the opposite result. Thus when seeking to produce a directional cue, on average half of the zero crossings will be appropriate to modify and half will not. In the illustrated example, the six negative-to-positive crossings are at 2204, 2206, 2208, 2210, 2212 and 2214. A pulse 2216 of a given duration, tmin, is added when an appropriate zero crossing is detected at 2202. To prevent prematurely re-triggering the pulse, once a first pulse is triggered, additional zero crossings are disregarded until tmin has elapsed. Thus negative-to-positive crossings 2206, 2210 and 2214 do not receive the polarizing bump because they are too temporally proximate to the previous pulses. By this means, a series of directed asymmetric velocity pulses may be added to an audio signal at a frequency approximately equal to (1/tmin). By adding these pulses at zero crossings, audible discontinuities in the signal are avoided.
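By way of non-limiting illustration, the zero-crossing method described above may be sketched in Python as follows. The half-sine bump shape, its amplitude, and the function names are assumptions made for this sketch; in practice the bump would be whatever shape best produces the desired acceleration burst.

```python
import math


def half_sine_bump(num_samples, amplitude):
    """Hypothetical bump shape: one half-cycle of a sine, starting at zero."""
    return [amplitude * math.sin(math.pi * k / num_samples)
            for k in range(num_samples)]


def add_directional_bumps(signal, sample_rate, t_min, amplitude=0.5, polarity=1):
    """Add a polarizing bump at suitably directed zero crossings.

    polarity=+1 triggers on negative-to-positive crossings and adds a
    positive bump; polarity=-1 does the opposite. Crossings that occur
    less than t_min seconds after the last trigger are ignored.
    """
    hold = max(1, int(round(t_min * sample_rate)))
    bump = half_sine_bump(hold, amplitude)
    out = list(signal)
    triggers = []
    next_allowed = 0
    for n in range(1, len(signal)):
        if polarity > 0:
            crossed = signal[n - 1] < 0.0 <= signal[n]
        else:
            crossed = signal[n - 1] > 0.0 >= signal[n]
        if crossed and n >= next_allowed:
            triggers.append(n)
            for k, b in enumerate(bump):
                if n + k < len(out):
                    out[n + k] += polarity * b
            next_allowed = n + hold  # refractory period of t_min
    return out, triggers
```

Because the bump starts at zero and is added exactly at a zero crossing, the output remains continuous, and the list of trigger indices makes it easy to confirm that pulses recur at a rate no faster than 1/tmin.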
This approach has the advantage of simplicity and robustness. If a pulse shape and frequency that best evokes the haptic illusion of directed pulling is determined, for example by deconvolution, it guarantees that exactly this signal is added, and that it is added at approximately the best frequency. It is an approach that prioritizes the directed pulling sensation.
In contemplating the range of processing techniques that may produce the directed pulling sensation, this approach lies at one extreme. It is almost indifferent to the input signal. At the other end is the first algorithm presented, in which the sharpening harmonics are derived entirely from the input signal. In this range, one skilled in the art may imagine a variety of processing techniques, some that conform more closely to the input signal, and others that prioritize production of the directed tactile illusion. These two non-limiting embodiments merely serve to illustrate the range of techniques for processing audio into directed tactile sensation that will occur to one skilled in the art.
Pseudocode and an illustration of the zero-crossing method are provided in
Many other synthesis or filtering methods are possible, and fall within the scope of the present invention. Generally speaking, appropriately-directed acceleration bursts consonant with the existing low-frequency audio (that is, appropriately related both harmonically and temporally) can be generated, where the polarity and sharpness of the bursts indicate the direction and proximity of the sound source.
An advantage of the “bump” method of adding these bursts is that the shape of the bump can be tailored to the step response of the wearable system. That is, the bump can be whatever wave shape best produces the desired acceleration burst on the body. The shape of this wave can be found, for example, by deconvolving the desired velocity profile with the step response of the system.
Despite best design efforts, an inertial tactor cannot be a perfect velocity source. There are several limitations on performance. The rate of velocity change is limited by peak power. The peak velocity is limited by damping. The velocity can go in one direction for only a limited time before the inertial mass hits a travel stop. The overall system of tactor, headphone, and head may be slightly underdamped, and therefore remain in motion after zero velocity is commanded. Furthermore, different users with different skin mechanics will introduce different stiffness and damping into the system, altering the system response. For all these reasons, inertial tactors are an imperfect velocity source.
In the presence of these limitations, the degree to which the system follows a desired velocity trajectory can be improved with signal processing. Deconvolution, for example, may be applied to a target tactor velocity signal, so that the tactor does the best possible job of reproducing it. A full discussion of deconvolution is beyond the scope of this disclosure, but briefly, the steps are these.
First, the deconvolution filter is found with the following steps:
A deconvolution filter that gives good results for most people may be found by testing tactor-equipped headphones on multiple people and averaging their deconvolution filters. Alternately, a user-specific custom filter can be determined by the system automatically upon startup. To do this the system follows steps 1-5 upon startup. To use the deconvolution filter, the following steps are undertaken:
In view of the above, one skilled in the art will understand that applying a deconvolution filter to an input signal can correct, to a limited degree, deficiencies in the ability of a tactor to faithfully reproduce the target velocity signal. Limitations of the deconvolution approach include sensitivity to noise and the introduction of lag. Thus, it is particularly appropriate for offline processing. A good application of deconvolution processing is to determine the voltage signal that best produces a velocity sawtooth that makes an acceleration pulse that evokes the tactile illusion of directed pulling.
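By way of non-limiting illustration, one generic textbook form of deconvolution may be sketched in Python as follows. It is an assumption and not necessarily the procedure contemplated above: the system is modeled as a short finite impulse response `h` (velocity per unit voltage impulse) with a nonzero first sample, and the drive signal is recovered by direct time-domain inversion. This form is exact for noiseless data but, consistent with the limitations noted above, it amplifies measurement noise and can diverge for responses that are not minimum phase.

```python
def convolve(h, u):
    """Discrete convolution of impulse response h with drive signal u."""
    out = [0.0] * (len(h) + len(u) - 1)
    for i, hv in enumerate(h):
        for j, uv in enumerate(u):
            out[i + j] += hv * uv
    return out


def deconvolve(target, h):
    """Find a drive u such that convolve(h, u)[:len(target)] equals target."""
    if not h or h[0] == 0.0:
        raise ValueError("impulse response must start with a nonzero sample")
    u = []
    for n in range(len(target)):
        acc = target[n]
        # Subtract the contribution of earlier drive samples.
        for k in range(1, min(n, len(h) - 1) + 1):
            acc -= h[k] * u[n - k]
        u.append(acc / h[0])
    return u


# Hypothetical impulse response and a sawtooth velocity target.
h = [1.0, 0.5, 0.25]
target = [(n % 10) / 10.0 for n in range(40)]
drive = deconvolve(target, h)
```

Convolving `drive` with `h` reproduces the sawtooth target over its full length, which is the sense in which the drive signal compensates for the modeled response.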
A logical place to implement this kind of directional filtering is in the audio API of a game engine, for real-time processing. For offline work, the directional filtering can be embodied in plug-ins for sound editing software, such as in VST or AU plugins, for example.
As discussed above, asymmetric waveforms can be used to present directional effects with tactors. Additional effects can be presented (or other effects can be made more convincing) by using tactile timing cues for signaling the direction of a sound source. Because the speed of sound in air is fast (~343 m/s), the interaural time differences normally used by our auditory cortex (that is, differences in arrival time of sound waves between our two ears) are short (sub-millisecond). Applying such short delays may be effective in synthesizing locational cues when applied to conventional drivers in headphones. Unfortunately, this time-scale is too short for the tactile system, which is “blurry” by comparison: the human body perceives tactile events less than about 20 milliseconds apart as simultaneous. One may imagine, however, that a time interval perceptible to the tactile system would occur if, for example, one could fall slowly into water, as illustrated in
At time t0, subject 2402 has not yet contacted the water. Tactor-equipped headphones would ordinarily not be called upon to produce any effect related to the impending event. At time t1, the right side of the head of subject 2402 has entered the water. In order to simulate this effect, one or more tactors in right headphone cup 2404 would generate a pulse. At time t2, the left side of the head of subject 2402 has entered the water. In order to simulate this effect, one or more tactors in left headphone cup 2406 would generate a pulse. The delay between the first pulse in right headphone cup 2404 and the second pulse in left headphone cup 2406 could reinforce the illusion of that event. That is, the water line might nudge a closer ear cup first at time t1 and a further ear cup second at time t2. In the absence of all other information, the relative timing of the events would provide some information about the orientation of the water surface relative to the head.
Absent specific preparation as described below, no preloading or other preparatory action is generally taken in the use of tactors or loudspeakers. The maximum force a tactor can generate is defined at least in part by its maximum travel and the maximum speed with which it can cover that distance. Tactors, such as those described herein, are likely to move more or less symmetrically about a resting position. In the simplified case in which a tactor is completely idle until time t1, at which point it is called upon to deliver a single maximal impulse, only half of the total potential travel of the tactor is available.
A method for increasing the capacity of a tactor to convey such an effect would be to use a low-velocity signal to give the tactor a “backswing,” allowing it to reach maximum travel in the “minus” direction immediately before it is asked to deliver a maximal pulse in the positive direction, and vice versa in the opposite case. If the “backswing” is sufficiently slow, it will be imperceptible to the wearer, but this technique will effectively double the power available for a single impulse without requiring a more massive tactor or a more powerful amplifier.
Delivering such an effect requires a preview capability: the ability (preferably in the digital domain) to insert a backswing into the signal stream before the event that is to be modified. Inserting this pre-pulse “backswing” is straightforward when processing sound files offline, such as in the production of sound for movies and music. In real-time spatial audio applications, such as computer gaming and virtual reality, a reasonable approach is to include the “backswing” at the end of the pulse. Although the first pulse in a train of pulses does not get the benefit of a backswing, all subsequent pulses do, and no lag is introduced into the system. With this approach, the backswings will generally (but not always) be correctly oriented, since the direction of a sound emitter in a virtual environment changes slowly with respect to the frequency of the pulses. In situations with multiple sound emitters from multiple directions (requiring oppositely directed backswings), this approach degrades naturally to be no better or worse on average than performance without backswing.
This general idea about timing and directional taction (Liquid Sound) can be extended from the situation of falling slowly into water to perceiving a very slowly moving shock wave in air, as shown in
Accordingly, an aspect of the subject invention is to cue sound direction by processing audio so that amplitude of a tactor farther from a sound source is kept low for an interval long enough for the tactile system to perceive the onset difference (e.g. 50 milliseconds). This adds a tactually perceptible time difference cue to the acoustically perceptible interaural time difference cue that the nervous system already uses to localize sound. Thus in
Another view of this method for delivering spatial cueing is shown in
The delay 2620 between tactile signal 2608 and tactile signal 2616 may be produced in a number of ways. One method is to apply an envelope filter 2620 to tactile signal 2616, with the duration of the “closed” phase of the envelope filter's action timed to equal the desired delay. However, for events of very short duration, this method may eliminate a significant portion of the desired signal. Thus another approach would be to produce the same signal in the delayed channel as in the non-delayed channel, but shifted later in time by the desired delay. This process is best performed in the digital domain, though analog delay lines could also be used.
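By way of non-limiting illustration, the two ways of producing the onset delay described above may be compared in Python as follows. The function names and the example values (a 30 millisecond event, a 50 millisecond delay, a 1 kHz sample rate) are assumptions made for this sketch.

```python
def gate_onset(signal, sample_rate, delay_s):
    """Envelope approach: hold the far-side signal closed for delay_s seconds."""
    closed = int(round(delay_s * sample_rate))
    return [0.0 if i < closed else s for i, s in enumerate(signal)]


def delay_signal(signal, sample_rate, delay_s):
    """Delay-line approach: shift the whole far-side signal later by delay_s."""
    shift = int(round(delay_s * sample_rate))
    return [0.0] * shift + list(signal)


# A short event (30 ms at 1 kHz) followed by silence.
fs = 1000
event = [1.0] * 30 + [0.0] * 170

gated = gate_onset(event, fs, 0.05)      # the whole event falls in the closed phase
delayed = delay_signal(event, fs, 0.05)  # the event is preserved, starting 50 ms later
```

The comparison shows the trade-off noted above: for an event shorter than the delay, gating removes the event from the far-side tactor entirely, whereas the delay line preserves it at the cost of buffering.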
A number of variations on this general approach are contemplated. The signals sent to the left and right tactors may be the same, so that only the time delay distinguishes them. They may be different, so that additional cues are provided by other characteristics such as phase differences, amplitude differences and the like. More nuanced presentation is also possible if more than one tactor is present on each side. Each tactor may also use filtering and/or waveform synthesis in order to provide polarization of one or more of the signals, as described above. These techniques may be combined in order to enhance the effect.
Some previously discussed embodiments assume that the tactor is rigidly mounted to the cup of the headphone. This approach requires that the tactor move at least the entire mass of the headphone cup, and in many cases some portion of the mass of the rest of the headphone system, in order to produce motion at the wearer's skin. This approach is analogous to holding one's hand against the side of a bookshelf loudspeaker: in order to produce sensible vibration, the driver must not merely excite air, as loudspeakers are intended to do; it must shake the entire cabinet, which is considerably more demanding. Thus significant force is required to excite the entire headphone cup, which necessitates a relatively powerful motor and amplifier, as well as a large battery or other power source.
It would be advantageous to provide a method for producing tactive forces without having to excite the relatively large mass of the entire headphone assembly. An additional aspect of the subject invention is therefore the use of tactive cushions movable on actuated plates that are partially decoupled from the headphone cups, so that the cushions efficiently transmit shear to the skin without having to excite the mass of the rest of the headphone assembly.
It is clear that there are some advantages to this approach over vibrating an entire ear cup, as disclosed in application Ser. No. 14/864,278, now issued as U.S. Pat. No. 9,430,921. If only the cushion is moved, as opposed to the entire headphone assembly, the effective moving mass is reduced and less force is required for a given tactive output. Also, everything in the headphone that is not the cushion becomes a reaction mass (analogous to the cabinet of a conventional loudspeaker when producing sound waves), providing a heavier platform for the cushion to push off of, enabling the tactor to provide output at lower frequencies.
The conformable portion of cushion 2712 is rigidly coupled to moveable stage 2714. In a conventional headphone, the cushion would be attached to the cup and/or baffle plate 2706 so as to allow minimal shear motion of the cushion relative to the baffle plate, and to damp whatever motion is permitted. In the subject invention, moveable stage 2714 is permitted to move relative to baffle plate 2706 by suspension 2708, described in greater detail below.
One or more tactors are mounted so as to provide motive force to the moveable stage relative to the baffle plate. This may be accomplished, for example, by attaching magnets 2716 to moveable stage 2714, and electrical coil 2710 to the baffle plate. When current is applied to coil 2710 in the form of a waveform the magnets attached to the stage experience a force in one direction 2720, and coil attached to the baffle plate experiences an equal force in the opposite direction 2722. Where cushion 2712 and moveable stage 2714 together have significantly lower mass than baffle plate 2706 and all of the elements rigidly attached thereto, the primary result will be the desired motion 2730 of the stage 2714 and cushion 2712 (shown in plan view), applies shear traction to the skin of the wearer.
The suspension of a tactor as a movable cushion must meet a daunting array of challenges. In the preferred embodiment, it should be thin, drop-proof, allow multiple degrees of freedom, limit over-travel, and be silent.
A first example of such a suspension is shown in
Such a suspension system may be mounted so that both the first plate 2804 and second plate 2806 are mounted between headphone cup 2830 and the rest of the headphone system, so that both plates are roughly parallel to the sagittal plane of the listener's head 2832, and relative motion 2834 is enabled between the stage 2806 that carries the cushion and headphone cup 2830 along an axis parallel to the sagittal plane.
When tactive force 2836 is applied, plates 2806 and 2804 attempt to move relative to each other, and elastic domes 2802 deform as bosses 2808 move against the domes. When the opposite force is applied 2838, the domes distort in the other direction. Because the domes are elastic, they provide restorative force as well as a measure of damping.
Such a suspension system may require restraining means so that the cushion assembly is generally attached to the cup assembly. One means for restraining the cushion assembly is illustrated in
Components of a second suitable suspension are shown in
One direction a tethered ball bearing allows is axial motion transverse to the orientation of the tether. Thus as illustrated in
Although a tether as shown in
In some embodiments, multiple bearings may be arranged by receiving features in a baffle plate, so as to define a movement plane for a cushion stage. The bearings may be pre-compressed by elastic elements to prevent rattling and to elastically limit lateral travel of the stage. An exploded view of certain components of one side of a pair of headphones with three tethered ball bearings providing bounded relative motion is shown in
Baffle plate 3102 attaches to main headphone structure, including the cups. It also provides locating features for other components, including recesses 3104 for each of the three bearings 3106, as well as tabs 3108 for retaining elastic pre-loading elements 3110. These pre-loading elements, which may be composed of silicone or other elastic material, may both pre-load the bearings in order to minimize noise generated by the bearings, and may also provide means for preventing separation of the overall assembly.
One or more tactors consist of at least a coil 3120 and at least a pair of magnets 3122. One of coil 3120 and magnets 3122 will be fixed relative to baffle plate 3102; the other will be fixed relative to moveable stage 3130, which is in turn attached to cushion 3140. Moveable stage 3130 may also include tabs for attaching elastic pre-loading elements 3110. When an appropriate signal is fed to coil 3120, relative motion between the two assemblies is created, limited by bearings 3106 and/or elastic pre-loading elements 3110.
An alternative embodiment of the tethered ball bearing would include a second tether orthogonal to the first tether and anchored to the plate that the first tether is not tethered to. This implementation would provide both the function of a bearing and the function of holding the two major assemblies together. A variation on this embodiment would use elastic tethers in addition to the elastic balls (potentially molded as a single component) so that the tethers themselves provide sufficient pre-load to address potential noise caused by relative movement of the assemblies.
It is convenient that these suspensions allow translation in two axes and rotation to facilitate additional drivable degrees of freedom. For example,
When current i 3208 flows through each of coils 3204 and 3206, motion relative to the magnets mounted on the opposite component is created. A positive voltage moves the system in one direction; a negative voltage moves it in the other. When two tactors are mounted as shown in
If coil 3206 is driven to produce translational force 3210, and coil 3204 is driven to produce translational force 3212, and forces 3212 and 3210 are aligned, then the resulting action will be a translational force 3214 that is the combined force of the individual tactors (less system losses). However, if both coils are driven so that coil 3204 delivers force 3222 which is 180 degrees from force 3220 generated by coil 3206, the result is not translation but rotational movement 3224 (i.e., torque). In video games and virtual reality simulations, torque may be used, for example, to cue changes to the user's pitch orientation, such as the moment the orientation of a roller-coaster cart changes from uphill to downhill. The magnitude of that torque will depend on both the force of the individual tactors and the radii 3226 and 3228 that define the distance between each tactor and the center of rotation.
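The relationship between aligned forces, opposed forces and torque may be illustrated with a short planar calculation. The positions, force values and function name below are hypothetical and chosen only for the example; each tactor is described by its position relative to the center of rotation and by its force vector.

```python
def net_force_and_torque(tactors):
    """Sum planar forces and torques about the origin (center of rotation).

    tactors: list of ((x_m, y_m), (fx_n, fy_n)) pairs.
    Torque of each force is x * fy - y * fx (positive counter-clockwise).
    """
    fx = sum(force[0] for _, force in tactors)
    fy = sum(force[1] for _, force in tactors)
    torque = sum(pos[0] * force[1] - pos[1] * force[0] for pos, force in tactors)
    return (fx, fy), torque


# Hypothetical layout: two tactors 20 mm above and below the center of rotation.
aligned = [((0.0, 0.02), (0.5, 0.0)), ((0.0, -0.02), (0.5, 0.0))]
opposed = [((0.0, 0.02), (0.5, 0.0)), ((0.0, -0.02), (-0.5, 0.0))]

print(net_force_and_torque(aligned))  # forces add, torques cancel
print(net_force_and_torque(opposed))  # forces cancel, torques add
```

In the opposed case the torque magnitude equals the individual force multiplied by the sum of the two radii, consistent with the dependence on force and radius described above.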
With additional coils, three degrees of freedom may be controlled individually; these may be thought of as (i) front-to-back motion, (ii) up-and-down motion, and (iii) rotation around an axis running between the wearer's ears. However, other orientations are also possible. Two exemplary coil layouts are shown, but many are possible and lie within the scope of the present invention.
While three tactors will be generally less expensive than four tactors as illustrated in
A four-tactor array is illustrated in
Translational motion can be generated in any direction along the x and y axes through various combinations of signals to the four tactors. In an example case, current through tactor coil 3356 generates force 3382 and current through tactor coil 3352 generates force 3380, and those two forces combine to generate the vertical component of force 3384. The horizontal component of force 3384 comes from the net difference of oppositely directed forces 3360 and 3362 produced by the other two coils. If both of these actions take place simultaneously, that is, if tactors 3352 and 3356 generate translational force 3384 while tactors 3350 and 3354 generate rotational force 3364, then the resulting output 3390 includes both a torque and a net force vector. Combining signals in this way permits the creation of force along any vector in the plane defined by the x and y axes, and simultaneous presentation of an arbitrary torque.
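One possible way of combining signals for a four-tactor array is sketched below. The layout is an assumption made for the example and is not taken from the figures: tactors named "top" and "bottom" sit at (0, +r) and (0, -r) and push along x, while tactors named "front" and "back" sit at (+r, 0) and (-r, 0) and push along y. The requested torque is split evenly among the four tactors.

```python
def allocate_tactor_forces(fx, fy, torque, radius_m):
    """Split a requested planar force and torque among four tactors.

    Assumed layout (hypothetical): top/bottom act along x, front/back act along y,
    all at the same radius from the center of rotation.
    """
    share = torque / (4.0 * radius_m)  # force each tactor contributes to the torque
    return {
        "top": fx / 2.0 - share,
        "bottom": fx / 2.0 + share,
        "front": fy / 2.0 + share,
        "back": fy / 2.0 - share,
    }


def net_output(forces, radius_m):
    """Recompute net force and torque from the individual tactor forces."""
    fx = forces["top"] + forces["bottom"]
    fy = forces["front"] + forces["back"]
    torque = radius_m * (-forces["top"] + forces["bottom"]
                         + forces["front"] - forces["back"])
    return fx, fy, torque


forces = allocate_tactor_forces(fx=0.4, fy=0.2, torque=0.004, radius_m=0.02)
print(forces)
print(net_output(forces, 0.02))
```

Recomputing the net output from the allocated forces returns the requested force vector and torque, showing that translation and rotation can be commanded at the same time under the assumed geometry.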
As shown above tactors may be mounted on plates that move separately from the headphone cups. A further embodiment of the invention provides multiple moving segments, to provide additional tactile expressiveness, as shown in
In addition to locating tactors in the cups of the headphone, it is also possible to locate them in other parts of the headphone, such as the bow connecting the cups, which often distributes the weight of the headphones to the top of the head, and thus provides another point of contact. As illustrated in
One of the challenges associated with delivering taction transmitted through headphones is that the signal generator (the tactor) generally does not directly contact the skin of the user: it has to transmit its signal through the cushions used to locate the audio driver relative to the wearer's ear, and to provide comfort and (in most cases) noise isolation. Those cushions tend to consist of a pliable outer material such as leather, vinyl or fabric, and an inner component, which is generally resilient foam, but may also be comprised of liquid, air or other material. Some headphones provide only open-cell foam, and dispense with the separate outer layer. One purpose of the combined inner and outer portions of the cushion is to conform to the complex and irregular topology of the head in the immediate vicinity of the ear (or, in the case of on-ear headphones, the ear itself). A second goal is to absorb sound, both from outside the earphone, in order to provide a level of isolation, and in some cases unwanted reflections from hard surfaces inside the headphone. These goals are generally achieved by configuring the cushion assembly so that it is soft and dissipative; that is, so that it will absorb vibration. This property works at cross-purposes with a tactor, in effect potentially throwing away a significant portion of the energy generated by the tactor before it reaches the listener.
When headphones include tactors as described herein, the headphone cushion may ride on a stage, moving in-plane, with the goal of applying shear taction to the skin. It may be desirable that the displacement of the stage not be consumed by the elastic compliance of the cushion. However, reducing losses by reducing compliance through existing methods is likely to cause sacrifices in the performance of the cushion in other aspects like conformance to the head or ear, sound isolation and comfort.
An aspect of the invention is to improve the performance of the taction system without significant adverse effect on the other aspects of cushion performance. This goal may be achieved by employing an anisotropic material as part of the construction of the headphone cushion; in other words, a material that is stiff in shear, so that it is effective in transmitting the shear force of the tactor(s), but still compliant and comfortable in compression. A full discussion of anisotropic linear elasticity is beyond the scope of this specification, and may be reviewed elsewhere (for example see Piaras Kelly, Solid Mechanics Lecture Notes, Part I: An Introduction to Solid Mechanics, Section 6.3, pg. 157, Anisotropic Elasticity, University of Auckland, 2013). That said, a brief explanation is required in order to be clear about the sort of anisotropic material properties the present invention teaches.
For an isotropic material, the shear modulus (G) and elastic modulus (E) are related by Poisson's ratio (ν), which captures the volumetric compressibility of the material. For an isotropic material the ratio of shear modulus to elastic modulus is:
$$\frac{G}{E} = \frac{1}{2(1+\nu)}$$
The present invention teaches headphone cushions comprised of anisotropic materials, where the ratio of shear modulus to elastic modulus is greater than it would be for an isotropic material. That is, where
$$\frac{G_{xz}}{E_{zz}} > \frac{1}{2(1+\nu)}$$
Since typical foams have Poisson's ratio around 0.3, the present invention teaches the use of materials where the unitless ratio of shear modulus to elastic modulus (Gxz/Ezz) is greater than 0.4. Specifically, where:
$$\frac{G_{xz}}{E_{zz}} > 0.4$$
The cushion material is oriented so that the (softer) z-axis of the material points at the wearer's skin, and the (stiffer) x-axis of the material points parallel to the skin, in the direction shear forces are to be applied to the skin by the cushion. Soft materials are of particular interest. Accordingly, the present invention teaches the use of anisotropic materials with elastic modulus in the range typical of cushioning foams, 10 kPa<E<10 MPa.
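As a worked illustration of the criteria above, the following sketch evaluates the standard isotropic relation G/E = 1/(2(1 + ν)) for a Poisson's ratio of 0.3 and checks a candidate material against the 0.4 ratio and the 10 kPa to 10 MPa modulus range. The function names and the candidate moduli are hypothetical and are not measured values.

```python
def isotropic_shear_ratio(poisson_ratio):
    """Ratio G/E for an isotropic linear elastic material."""
    return 1.0 / (2.0 * (1.0 + poisson_ratio))


def meets_taught_anisotropy(g_xz_pa, e_zz_pa, threshold=0.4):
    """Check Gxz/Ezz > threshold with Ezz inside the cushioning-foam range."""
    modulus_in_range = 10e3 < e_zz_pa < 10e6
    return modulus_in_range and (g_xz_pa / e_zz_pa) > threshold


print(round(isotropic_shear_ratio(0.3), 3))   # isotropic foam baseline
print(meets_taught_anisotropy(60e3, 100e3))   # hypothetical anisotropic cushion
print(meets_taught_anisotropy(38e3, 100e3))   # roughly isotropic foam
```

For a Poisson's ratio of 0.3 the isotropic baseline evaluates to approximately 0.385, which is why a ratio above 0.4 distinguishes the anisotropic materials described here from typical isotropic foams.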
A simplified cross-sectional view of the foam commonly found in headphone cushions is shown in
There has been at least one prior design applied to a headphone cup that may provide some anisotropic stiffness. Kokoon has marketed a design that includes a low-profile cushion support comprised of discrete flexures, as illustrated in
An aspect of the subject invention that overcomes some drawbacks of art shown in
Sectional view 3608 is taken from cushion 3604 through section A-A 3610. A magnified view of the material within the cushion is shown in 3612 in its relaxed state (that is, uncompressed). It illustrates a means for creating an anisotropic compressible material: its fibers (and the airspaces between these fibers) are not randomly shaped or oriented, but instead are elongated along the plane in which motion is to be resisted. Thus when force 3614 is applied transverse to the face of the cushion, material 3612 offers relatively little resistance to deformation 3616 (compression Δz). However, when shear traction 3618 is applied to cushion material 3612, multiple individual fibers such as 3620 are oriented so that they run in a plane relatively parallel to the force applied, and are relatively resistant to tensile deformation, so that overall movement 3622 (lateral displacement Δx) is relatively small. Thus a headphone cushion comprising an anisotropic material will improve the efficiency with which the output of tactors is conveyed to the skin of the person wearing tactor-equipped headphones.
Another aspect of the invention is a tactor capable of both inertial and impact actuation in multiple degrees of freedom. Inertial actuation may be thought of as the generation of vibrations with a tactor over a range of motion in which the relation between input signal and output is relatively linear; that is, an increase in the magnitude of the input signal (generally measured in voltage) results in a proportionate increase in forces generated by the tactor. As a practical matter, a tactor as disclosed previously in application Ser. No. 14/864,278, now issued as U.S. Pat. No. 9,430,921, will perform in inertial mode so long as its displacement does not cause it to make contact with its frame.
When a tactor is driven with enough energy to cause it to make contact with its frame, the tactor is operating as an impact device. In impact mode, additional input force does not materially increase travel. The difference is illustrated in
Single-axis impact tactors are already known from prior art, but can make noise unsuitable for headphones. They generally include a metal moving mass and a frame made of metal or other material, and the transition from inertial to impact mode creates undesirable noises as the mass hits the frame. Metal-to-metal collisions are particularly loud. Accordingly, an aspect of the invention is an inertial/impact tactor suspended by collapsible elastic elements that change spring rate more smoothly than a metal-to-metal collision, thereby minimizing acoustic noise.
The four collapsible elements on the sides orthogonal to the compressing and elongating collapsible elements are free to roll in order to maintain contact with both the frame 3712 and the mass 3710. Collapsible elements 3720 also provide a method for delivering impact taction while suppressing the undesirable noise associated with contact between hard surfaces. Where a tactor without a suspension such as described herein would sharply transition between inertial and impact regions with an audible “click” or other similar noise, collapsible elements 3720 may offer a smoother transition, permitting effective use of impact taction. If a fully relaxed collapsible element is a round cylinder 3730, when force is applied a partially compressed collapsible element will begin to flatten 3732. If sufficient force is applied, the collapsible element will fully collapse 3734. If the collapsible element is made of a material that is itself compressible, such as silicone or similar materials, then additional force may provide slightly greater travel. However, the collapsible elements may also be made of incompressible materials, such as spring steel. In that case it is possible that the collapsible element would reach a point at which no (relevant) force will yield additional travel.
A further advantage of this arrangement of compressible elements is that motion in multiple degrees of freedom is supported.
In this embodiment, elastic elements 3810 are beveled 3820 on both ends so that the elastic element has a short side and a long side. Elastic elements 3810 are oriented in the assembly so that the long side 3822 contacts the frame 3804 and the short side 3824 faces the moveable mass 3802. The reason for the bevel is highlighted in the detail view 3840. It provides the clearance that allows the mass 3802 and retainer plates 3812 and 3814 to move without scraping against cover plates 3806 and 3808.
Mass 3802 moves within frame 3804. Moving mass 3802 is fixed to retainer plates 3812 and 3814, which move with moving mass 3802. Each of the four hooks 3816 on the top retainer plate 3812 inserts into the cylinder of a respective elastic element 3810 from the top; each of the four hooks 3816 on the bottom retainer plate 3814 inserts into the cylinder of a respective elastic element 3810 from the bottom, so that each of the eight cylinders is retained by one hook.
It should be noted that multiple variations on the embodiments described are contemplated. Elastic members may be made of any resilient material, including metals that can function as springs. Elastic members may be shapes other than cylinders, such as leaf springs, coil springs, foam cubes, or other shapes and materials. Tactors and their housing can be shaped in a variety of forms other than squares or rectangles, such as circles, toroids, sections of toroids, etc. More or fewer elastic members may be used to suspend the mass and to elastically limit travel.
Together and separately, these improvements enhance perception of bass and improve spatialization of sound. Benefits for spatial reaction time and hearing health are demonstrated, in addition to numerous other benefits as previously described.
An ideal transducer, whether tactile or acoustic, would have a constant linear transfer function. In other words, the output of the transducer at different frequencies should generally be a simple function only of its input.
In the case of a loudspeaker, for example, the ideal is primarily thought of as flat frequency response over the full range of the speaker.
Both because ideal transducers are rare and expensive at best, and because a driver that delivers excellent performance in one context (e.g., an anechoic chamber) may be seriously flawed in another (e.g., a room with its own resonances, modes, reflections, etc.) a variety of techniques for improving the performance of a transducer have evolved over the years.
The simplest and probably oldest approach is the use of passive components (capacitors, resistors and inductors) to shape the output of a transducer or an array of transducers. For example, a two-way loudspeaker generally includes a few such passive components to send low frequency signals to one driver, and high frequency signals to another. Such a network can be as simple as a single capacitor in series with the high frequency driver and a single inductor in series with the low frequency driver.
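For illustration, the component values of a textbook first-order passive crossover can be computed as sketched below. The sketch assumes purely resistive drivers of equal nominal impedance, a series capacitor feeding the high frequency driver and a series inductor feeding the low frequency driver; the 2 kHz crossover frequency and 8 ohm impedance are hypothetical example values.

```python
import math


def first_order_crossover(crossover_hz, driver_ohms):
    """Series capacitor and series inductor for a first-order two-way network.

    Assumption: each driver behaves as a pure resistance of driver_ohms.
    """
    omega = 2.0 * math.pi * crossover_hz
    capacitor_f = 1.0 / (omega * driver_ohms)  # high-pass corner at crossover_hz
    inductor_h = driver_ohms / omega           # low-pass corner at crossover_hz
    return capacitor_f, inductor_h


cap, ind = first_order_crossover(2000.0, 8.0)
print(f"C = {cap * 1e6:.2f} uF, L = {ind * 1e3:.3f} mH")
```

Real drivers present a frequency-dependent impedance, so values computed this way are only a starting point, which is one reason passive networks permit only relatively coarse correction.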
In order to deliver significant low frequency tactile output through the mass of a complete headphone, a tactile transducer must itself have significant mass. That mass means that the transducer will have at least one significant resonant mode (at least in the absence of techniques to reduce such resonances). In addition to uneven response in the frequency domain, the presence of resonant modes leads to time domain nonlinearities, such as ringing and poor impulse response.
Passive components can also be used in more complex networks to compensate for anomalies in frequency response, to adjust the impedance the overall system presents to the amplifier that drives it, to create higher order crossover slopes, etc. Such techniques are well-known in the art.
One way of compensating for transducer resonance is to mechanically damp the transducer, as described in U.S. Pat. No. 9,430,921. An alternative approach is to electrically damp the output of the transducer.
One approach to electrical damping is to apply attenuation at the frequency of the known resonance of the transducer to the signal prior to transmission to the transducer. Thus, for example, if the tactile transducer has a primary resonance of 10 dB at 50 Hz, a notch filter as described above that provides 10 dB of attenuation at 50 Hz will (assuming that the Q factor of the filter matches the Q factor of the resonance) reduce or eliminate the ringing of the transducer in addition to reducing the frequency response errors as shown in
Given a transducer with an undamped frequency response as shown in
It should be noted that many highly resonant transducers (such as linear resonant actuators) are only capable of producing significant power at their resonant frequencies. Applying notch filtering to these transducers is likely to render them useless, because they have so little output at frequencies other than their mechanical resonance.
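The attenuation-at-resonance approach described above, for the example of a 10 dB resonance at 50 Hz, might be sketched digitally as follows. The sketch uses the peaking-equalizer biquad formulas from the widely circulated Audio EQ Cookbook with a negative gain, which yields a notch of finite depth; the 8 kHz sample rate and the Q of 4 are assumptions made for the example, and in practice the Q would be chosen to match the measured resonance.

```python
import cmath
import math


def peaking_cut_coefficients(f0_hz, gain_db, q, fs_hz):
    """Biquad coefficients for a peaking equalizer (cut when gain_db < 0)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    # Normalize so that the leading denominator coefficient is 1.
    return [x / a[0] for x in b], [x / a[0] for x in a]


def magnitude_db(b, a, freq_hz, fs_hz):
    """Magnitude response of the biquad at freq_hz, in decibels."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / fs_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(num / den))


# Assumed values: 50 Hz resonance, 10 dB cut, Q of 4, 8 kHz sample rate.
b, a = peaking_cut_coefficients(50.0, -10.0, 4.0, 8000.0)
for f in (20.0, 50.0, 200.0):
    print(f"{f:6.1f} Hz: {magnitude_db(b, a, f, 8000.0):6.2f} dB")
```

The filter attenuates by the requested amount at the center frequency and leaves frequencies well away from the resonance essentially unchanged.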
Such passive networks have been used for decades, and can be reasonably cost-effective in many applications. However, they have a number of disadvantages. High-quality passive devices can be bulky and expensive. In general, only relatively coarse corrections can be made with passive networks. The wide production tolerances of many passive components (and of the audio or tactile drivers themselves) can lead to significant mismatches between the anomalies to be corrected and the changes effected by the passive networks. And perhaps most significantly, a purely passive network cannot adapt itself to the actual operating conditions of the device it is connected to. Thus a network/transducer combination that is tuned for flat response on a test bench is likely to have very different performance in a system that includes, for example, the mass and compliance of a complete headphone, and the mass and compliance of a wearer's head.
where
The resulting frequency response from such an IIR filter is shown in
Frequency response correction can be implemented as an open-loop system—that is, with static parameters optimized for the assumed transfer function of the system based on optimization at the time the system is designed. This approach has a number of potential drawbacks. First, the key parameters of the system to be optimized will have tolerance ranges, as will the components used to correct for them. Thus in production, mismatches are likely, reducing the effectiveness of the networks. Second, even if the components are perfectly matched, the performance of the system in practice is the sum of additional factors such as the shape and size of the ears and head of the user. In addition, the goal should be to optimize the combined outputs of the tactile transducer and the acoustic transducer, as experienced by the user. It would thus be desirable to provide a means to measure the combined outputs—both tactile and acoustic—in real or near-real time, and dynamically optimize their combined output.
Altering frequency response in the digital domain offers a number of advantages, particularly in devices that already include digital signal processing capability. Complex filter topologies are possible, unit-to-unit variation (of the filter parameters) is largely eliminated, and the need for expensive and bulky inductors and capacitors may be eliminated. Digital signal processing also permits more sophisticated frequency response shaping, and allows the flexibility to easily permit adjustment of output at multiple frequencies.
Another objective of altering the output of audio drivers is to increase their linearity by comparing the input signal to the driver's output, and using any detected undesired differences to alter the signal sent to the driver. Such systems can compensate for a number of potential limitations. For example, the moving portions of most loudspeakers have a specified maximum range of motion. For a variety of reasons, most drivers are not fully linear through that range of motion. Mechanical suspensions may stiffen at extreme extensions of travel; the relationship between the magnetic field and the coil within it may change; the acoustic loading of an air suspension or the room in which the speaker is operating may change the mechanical impedance of the system at high volume levels, etc.
One method for applying error correction involves sensing the back electromotive force generated when a magnet moves relative to an electric coil (or vice versa). One such system is disclosed in U.S. Pat. No. 4,764,711. In such a system, current-sensing means can be connected to the wires transmitting voltage and current to a motor (such as a loudspeaker). The motion of the coil relative to the magnet modulates the voltage output by the current sensor. This modulation provides a means for sensing error in the fidelity of the output of the motor relative to its input: by monitoring the coil current relative to the amplifier voltage, it is possible to compute the relative velocity of the coil or coils relative to the magnet(s). When the velocity of the motor differs from the commanded velocity, that error can be used to generate an error-correction signal to be transmitted to the motor to improve the match.
A motor converts electrical energy into kinetic energy (and heat), and loudspeakers and tactile transducers are motors. Thus it is possible to use back EMF to improve the accuracy with which an audio or tactile transducer produces an output signal relative to an input signal.
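The computation of coil velocity from amplifier voltage and coil current may be sketched as follows. The sketch assumes the common lumped voice-coil model in which the amplifier voltage equals the resistive drop plus the inductive drop plus a back EMF proportional to velocity; the resistance, inductance and force factor (Bl) are hypothetical parameters that would have to be measured for a given transducer.

```python
def estimate_coil_velocity(v_amp, i_coil, dt_s, r_ohm, l_henry, bl_tm):
    """Estimate coil velocity (m/s) from sampled voltage and current.

    Assumed model: v = R*i + L*di/dt + Bl*velocity, solved for velocity.
    """
    velocities = []
    prev_i = i_coil[0]
    for v, i in zip(v_amp, i_coil):
        di_dt = (i - prev_i) / dt_s
        back_emf = v - r_ohm * i - l_henry * di_dt
        velocities.append(back_emf / bl_tm)
        prev_i = i
    return velocities


def velocity_error(commanded, estimated):
    """Per-sample error available for generating a correction signal."""
    return [c - e for c, e in zip(commanded, estimated)]


# Hypothetical steady case: 0.5 A through 8 ohms with the coil moving at 0.1 m/s.
volts = [8.0 * 0.5 + 2.0 * 0.1] * 4
amps = [0.5] * 4
print(estimate_coil_velocity(volts, amps, 1e-4, r_ohm=8.0, l_henry=1e-3, bl_tm=2.0))
```

The difference between the commanded and estimated velocity is the quantity from which an error-correction signal could be derived.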
However, such an error correction system may have limitations. The error signal read by the correction circuit is the sum of all electromagnetic forces acting on the coil of the driver being measured. Thus if another tactile or acoustic driver is proximate to the tactile driver being monitored, the varying magnetic field generated by the second transducer can induce a voltage in the monitored coil and corrupt the back EMF measurement.
Another limitation of the back-EMF as an error-correction signal is that it is only loosely coupled to the desired control variable. The currently preferred control variable for a tactile transducer is force or motion at the skin/device interface. Back EMF from the coil is not a direct measurement of this variable.
Active correction has been applied to loudspeakers, and offers a number of advantages. One such system was developed by Velodyne, as illustrated in U.S. Pat. No. 4,573,189. It includes an accelerometer attached to the moving portion of the loudspeaker. A comparator circuit receives both the input signal and the output of the accelerometer. The comparator circuit detects nonlinearities between the input and output, and adjusts the signal sent to the driver to correct for them. This approach can significantly reduce distortion.
A perfect tactile transducer would produce a housing velocity proportional to the low-pass-filtered voltage of the acoustic signal. The dynamics of a real tactor/display/body system, however, make the uncorrected output likely to deviate from this ideal response, causing an error. Closed-loop feedback reduces this error. Housing motion can be sensed by the accelerometer, integrated to determine velocity, and this signal is inverted and scaled to produce an error correction signal. The error correction signal is combined with the primary input and fed to the amplifier driving the tactile transducer. Thus the signal from the accelerometer may be used to reduce distortion generated by the tactile transducer.
One aspect of the present invention applies active correction to tactile drivers that may be applied in headphones, virtual reality/augmented reality headsets and other devices.
Earcup housing 4802 contains active and passive components in desired locations relative to the wearer's head 4801. Cushion 4804 surrounds ear 4805, holds the cup against the wearer's head, and transmits tactile forces to the wearer's skin 4806 and subcutaneous tissue 4808, which is in turn anchored to the wearer's skull 4812. Tactile transducer 4814 is mounted to earcup housing 4802 so that when inertial mass 4816 moves transversely to axis 4818, earcup housing 4802 also moves transversely to axis 4818, but with opposite phase.
Mounted within earcup housing 4802 is circuit board 4820. In addition to drive electronics, amplification and other related circuitry, circuit board 4820 includes at least one accelerometer 4822. Accelerometer 4822 is oriented so as to measure acceleration transverse to axis 4818, and provide an electrical signal to compensation circuitry as described further below.
Similarly, one or more accelerometers fixed to the housing of a wearable display is a useful source of sensor feedback for closed loop control.
Just as acoustic power in air is proportional to air velocity, mechanical power in skin is proportional to skin velocity. Accordingly, to transfer power uniformly to the skin across a range of frequencies, velocity is a preferred process variable to control. To achieve this, accelerometer output can be integrated in the digital or analog domain to determine housing velocity, and a closed-loop velocity controller can be implemented to make housing velocity track input. An appropriate signal for the tactile driver is derived by low-pass filtering the acoustic signal bound for the acoustic driver, to extract frequencies below about 200 Hz, which are felt with the skin as well as heard with the ear. For simplicity of illustration, an example of an analog proportional controller based on this error signal is shown in
Input signal 4902 is sent to both acoustic driver 4904 and to buffer 4906. (The signal to the audio driver may be full range, or may pass through a high pass filter to reduce overlap between the tactile and acoustic drivers.) After the buffer, the signal may be passed through a low-pass filter 4908 to optimize the signal for the bandwidth of the tactile transducer. The signal may then be adjusted with gain control 4910. The signal then moves to summing junction 4912, and then to amplifier 4914, and tactile transducer 4916. Optionally, a shaping network 4913 may be included to adjust the frequency response of the tactile transducer. The vibrational effect of transducer 4916 is sensed by accelerometer 4918. The output of accelerometer 4918 is processed by integrator 4920 and then module 4922, which inverts and scales the correction signal so that it can be fed back to amplifier 4914 through summing point 4912.
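The effect of the feedback path described above may be illustrated with a simple discrete-time simulation. The sketch models the housing as a lightly damped mass on a spring with a 50 Hz resonance, driven at resonance by a unit sine input; the mass, damping, one newton per volt drive, and feedback gain are all hypothetical values chosen for the example, and the accelerometer, integrator and inverter-scaler are idealized.

```python
import math


def simulate_peak_velocity(feedback_gain, freq_hz=50.0, duration_s=1.0, dt_s=1e-4):
    """Peak steady-state housing velocity with integrated-accelerometer feedback.

    Assumed plant: mass 0.05 kg, damping 1 N*s/m, spring tuned to 50 Hz,
    amplifier and transducer together producing 1 N per unit of drive.
    """
    mass, damping = 0.05, 1.0
    stiffness = mass * (2.0 * math.pi * 50.0) ** 2
    position = velocity = 0.0
    velocity_estimate = 0.0  # integrator output fed by the accelerometer
    peak = 0.0
    for n in range(int(duration_s / dt_s)):
        t = n * dt_s
        command = math.sin(2.0 * math.pi * freq_hz * t)
        # Summing junction: input plus inverted, scaled velocity signal.
        drive = command - feedback_gain * velocity_estimate
        acceleration = (drive - damping * velocity - stiffness * position) / mass
        velocity += acceleration * dt_s
        position += velocity * dt_s
        velocity_estimate += acceleration * dt_s  # integrate accelerometer output
        if t > duration_s / 2.0:
            peak = max(peak, abs(velocity))
    return peak


print("open loop  :", round(simulate_peak_velocity(0.0), 3), "m/s")
print("closed loop:", round(simulate_peak_velocity(9.0), 3), "m/s")
```

In this idealized model the velocity feedback acts as added damping, so the resonant peak is substantially reduced when the loop is closed.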
One method of implementing such correction is to mount an accelerometer to one of the moving internal components of the tactile transducers. Alternately, the accelerometer can be mounted to the housing of the transducer, or equivalently to a PCB fixed to the housing of a body-contacting device. Accelerometer 4918 can be one of several types well-understood in the relevant arts, such as a piezoelectric MEMS device.
Input signal 5002 is sent to both acoustic driver 5004 and to buffer 5006. (The signal to the audio driver may be full range, or may pass through a high pass filter to reduce overlap between the tactile and acoustic drivers.) After buffer 5006, the signal may be passed through a low-pass filter 5008 to optimize the signal for the bandwidth of the tactile transducer. The signal may then be adjusted with gain control 5010. The signal then moves to summing junction 5012, and then optionally to shaping network 5014. It then is sent to amplifier 5016, and tactile transducer 5018. The vibrational effect of transducer 5018 is sensed by accelerometer 5020.
As in
The accelerometer signal can be further processed to improve its accuracy. Gravity exerts a constant pull (at least when a user is at rest). Sudden movements of the user's head can also generate movement that could reduce the accuracy of the correction signal. It may therefore be useful to filter those effects from the error-correction signal. This can be accomplished with a band-pass filter 5026 tuned to reject gravitational acceleration (for example, f_low_corner < 5 Hz) and impacts (for example, f_high_corner > 300 Hz). Furthermore, both acceleration and velocity signals may be used by the controller, with proportional gain KP and derivative gain KD provided by additional filter 5030 and inverter-scaler 5032. This implementation provides closed-loop proportional-derivative control (PD-control) of velocity. As will be familiar to those skilled in the art, the inclusion of this derivative signal enables reduced overshoot as the system tracks the velocity setpoint.
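A minimal sketch of the band-pass conditioning and the proportional-derivative correction follows. The band-pass filter is built here from two one-pole sections, which is an assumption made for simplicity rather than the filter of the described embodiment; the 4 kHz sample rate, class names and gain values are likewise hypothetical.

```python
import math


class OnePoleLowPass:
    """First-order low-pass section with the given corner frequency."""

    def __init__(self, corner_hz, fs_hz):
        self.alpha = 1.0 - math.exp(-2.0 * math.pi * corner_hz / fs_hz)
        self.state = 0.0

    def step(self, x):
        self.state += self.alpha * (x - self.state)
        return self.state


class BandPass:
    """Removes slow offsets such as gravity and smooths sharp impacts."""

    def __init__(self, low_corner_hz, high_corner_hz, fs_hz):
        self.offset_tracker = OnePoleLowPass(low_corner_hz, fs_hz)
        self.smoother = OnePoleLowPass(high_corner_hz, fs_hz)

    def step(self, x):
        without_offset = x - self.offset_tracker.step(x)  # high-pass action
        return self.smoother.step(without_offset)         # low-pass action


def pd_correction(velocity, acceleration, kp, kd):
    """Inverted, scaled correction: proportional on velocity, derivative on acceleration."""
    return -(kp * velocity + kd * acceleration)


band_pass = BandPass(5.0, 300.0, 4000.0)
for _ in range(8000):  # two seconds of a constant 1 g reading
    filtered = band_pass.step(9.81)
print(f"residual gravity after filtering: {filtered:.2e} m/s^2")
print("correction:", pd_correction(0.2, 3.0, kp=2.0, kd=0.1))
```

Because acceleration is the time derivative of velocity, scaling the filtered accelerometer signal directly supplies the derivative term without a separate differentiator.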
An alternative approach to controlling tactile output is to directly sense the motion of the moving mass of the tactile transducer, for example using a device such as an optical sensor or Hall effect sensor. In this approach, the sensor may be attached to the moving mass, or to the frame or other structure that moves relative to the moving mass when the tactile transducer is active.
As will be familiar to practitioners skilled in the art, the control elements outlined above in analog form can be economically implemented in the digital domain in a microcontroller. The shaping filter, for example, may be stably implemented as a Finite Impulse Response (FIR) filter with tap coefficients chosen to suppress unwanted resonances and bring up under-represented frequencies. The bandpass filters and integrators may be conveniently implemented in IIR (Infinite Impulse Response) biquad form, or in other digital forms suited to the capabilities of a given microcontroller. Anti-aliasing filters before and after the computation may be used to reduce discretization artifacts associated with conversion between digital and analog signals.
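As one possible illustration of the biquad form mentioned above, the following Python sketch implements a single band-pass section. The coefficient formulas follow the widely used Audio EQ Cookbook band-pass (constant 0 dB peak gain) design; the centre frequency, Q, sample rate, and the names `bandpass_biquad_coeffs` and `Biquad` are assumptions chosen for the example and are not specified in the disclosure.

```python
import math


def bandpass_biquad_coeffs(f0_hz, q, fs_hz):
    """Return normalized (b0, b1, b2), (a1, a2) for a band-pass biquad."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


class Biquad:
    """Transposed direct form II biquad, one sample per call."""

    def __init__(self, b, a):
        self.b0, self.b1, self.b2 = b
        self.a1, self.a2 = a
        self.s1 = 0.0
        self.s2 = 0.0

    def step(self, x):
        y = self.b0 * x + self.s1
        self.s1 = self.b1 * x - self.a1 * y + self.s2
        self.s2 = self.b2 * x - self.a2 * y
        return y
```

A section of this kind needs only five multiplications and two state variables per sample, which is one reason the biquad form suits small microcontrollers.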
On the sensor side, another approach is to close the loop using a sensor signal that is related to the force that the body-worn device imposes on the skin. A force sensor may be situated, for example, between the rigid housing of a worn device and a cushion, to provide a measurement of contact force. Thin, commercially-available Force-Sensitive Resistors (FSRs) and force-sensitive capacitors (FSCs) are suited to this purpose. Methods used in sensing contact force of robot grippers can also provide suitable indications of skin contact force. For example, if a hole is provided in the rigid housing and cushion cover so that the cushion foam is exposed to light from an underlying infrared emitter-detector pair, compression of the foam cells could be discerned optically as a change in reflected light. Or, if such a housing port is provided with a deformable dome, deformation of that dome can be tracked optically from reflected light. Or, if the dome is fitted with a small magnet, the deformation may be tracked with Hall effect sensors. By employing multiple sensors at the base of the dome, both shear and normal force may be determined. The sensing point may be brought still closer to the body surface. For example, a flexible force sensor may be situated beneath the cushion cover and on top of the cushion foam or other internal resilient material. Or, if the cushion is comprised of an electrically conductive material, and in direct contact with the skin, then the state of contact of the cushion with the skin may be sensed electrically as a change in capacitance or resistance of the entire electrode, or as changes between segments of that electrode. By these means and others, the contact force at the interface of skin and cushion may be measured and fed back to a tactor velocity controller.
Another alternative approach is to employ one or more microphones to measure acoustic output. An advantage of this approach is that it is capable of addressing not only the nonlinearities of the acoustic transducer, but also unwanted interactions between the tactile transducer and the acoustic transducer. A tactile transducer generates movement of the earcup of the headphone. That movement can change the effective volume of the chamber defined by the earcup and the wearer's head. Those volume changes can generate acoustic signals that are perceptible and that can either cancel or increase the bass signals generated by the acoustic driver. If sufficiently large, these movements can also affect higher frequencies by moving the acoustic driver relative to the eardrum, which Doppler shifts the frequencies produced by the acoustic driver when they are received at the eardrum. Since the velocity of tactile actuation (0.01 m/s typical) is small (less than one part in one thousand) with respect to the velocity of sound in air (~345 m/s), this distortion is not generally a problem. Acoustic effects due to changes in the chamber volume, however, can be significant. To produce low frequencies (e.g. f=10 Hz, ω=2πf≈63 rad/s) at a perceptible tactile intensity (|v|=0.01 m/s) requires housing displacement of amplitude A=|v|/ω≈0.16 mm, or roughly 0.2 mm. This displacement approaches the magnitude of the maximum working throw of a headphone acoustic driver (typically around 1 mm). So, if a component of this displacement vector slightly lifts or lowers the housing away from the ear, or if motion of one side of the chamber is impeded by the wearer's mandible, then this housing displacement can produce a significant change in chamber volume, and therefore a significant fluctuation in pressure.
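The arithmetic in the preceding paragraph can be checked with a few lines of Python. The sketch assumes sinusoidal motion, for which displacement amplitude equals velocity amplitude divided by angular frequency; the function names are illustrative, and the numeric values are those quoted in the text.

```python
import math

SPEED_OF_SOUND_M_S = 345.0  # approximate value quoted in the text


def displacement_amplitude_m(velocity_amplitude_m_s, frequency_hz):
    """Displacement amplitude A = |v| / omega for sinusoidal motion."""
    omega = 2.0 * math.pi * frequency_hz
    return velocity_amplitude_m_s / omega


def doppler_ratio(velocity_amplitude_m_s):
    """Ratio of actuation velocity to the speed of sound."""
    return velocity_amplitude_m_s / SPEED_OF_SOUND_M_S


a = displacement_amplitude_m(0.01, 10.0)
print(f"displacement amplitude: {a * 1e3:.2f} mm")   # prints 0.16 mm
print(f"velocity ratio: {doppler_ratio(0.01):.1e}")  # well below 1e-3
```

At 10 Hz and 0.01 m/s the amplitude is about 0.16 mm, which the text rounds to 0.2 mm, a meaningful fraction of a driver throw of around 1 mm.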
To control the acoustic output to the user in the presence of this disturbance, Active Noise Cancellation (ANC) may be employed. ANC is known in the art, and is generally employed to attenuate external sound. Here, that technology can be used to reduce acoustic distortion generated by the tactile transducer. A microphone is located inside the housing or ear chamber of a headphone. It measures the instantaneous sum of the outputs of both the acoustic and the tactile transducers. The difference between the commanded and measured pressure provides an error signal that can be used to adjust the current to the acoustic driver. Before summation, a delay is introduced to the microphone signal to compensate for the finite speed of sound in the air between the driver and microphone. Appropriately timed feedback of this error signal to an amplifier powering the acoustic driver makes it possible to compensate for the unwanted disturbance of the tactile driver, significantly improving acoustic fidelity.
To the extent tactor-induced changes in chamber pressure are related to skin velocity, excess pressure measured at the ANC microphone can also provide a feedback signal for the tactile controller. In practice, however, the relationship between excess chamber pressure and tactile velocity of the cushion has proved to be complex. This may be because tactile motion of an ear cup housing can involve simultaneous motions parallel to the skin (x), normal to the skin (z), and rocking (θ). Each of these degrees of freedom has a different effect on chamber pressure. Furthermore, it is natural for each degree of housing motion freedom (x, z, θ) to have a different transient response associated with the compliance and inertia of each direction, which will naturally depend on variables including the dynamics of the wearer's body. Despite these difficulties, with boot-up calibration to the user, for example to an impulse response, excess microphone pressure may be used as a tactor feedback signal.
The embodiment disclosed in
An additional application for such a feedback system is to enable the use of tactile transducers with audio or audiovisual head-mounted systems that use means for sensing head position and/or movement.
Virtual reality and augmented reality headsets generally rely on sensors to determine the orientation and relative movement of the head of the person wearing the device. This is important in order to create the illusion of being in an alternate space (in the case of VR) and to present contextual information in the correct location (in the case of AR). The same is true of headphones that seek to create the illusion of a virtual sound field such that when the headphone wearer turns her head, the source of a given sound remains fixed in the virtual environment. To the extent that these devices rely on motional sensors, which may include but may not be limited to accelerometers, the addition of tactile transducers could interfere with the accuracy of the determination of the instantaneous position of the wearer's head.
One solution to this problem is to provide an error correction signal tied to tactile output to the processor that performs the head position calculations. This error correction signal could be predictive or a true form of motional feedback. In the first case, a model of the motional response of the device for a given tactile signal can be provided to the positional algorithm, thereby allowing it to compensate for that predicted tactile-generated motion. In the second case, one or more accelerometers can detect actual tactile-generated motion and thus apply a correction factor to compensate.
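The predictive variant can be sketched as follows. The sketch assumes, purely for illustration, that the motional response of the device to the tactile drive signal is linear and can be modeled by a short impulse response, so that the predicted tactile motion is a convolution that can be subtracted from the motion sensor reading. The function names and the impulse response values used in any test are hypothetical.

```python
def predict_tactile_motion(drive, impulse_response):
    """Convolve the tactile drive signal with a modeled impulse response.

    The output is truncated to the length of the drive signal.
    """
    out = []
    for n in range(len(drive)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * drive[n - k]
        out.append(acc)
    return out


def correct_head_motion(measured, drive, impulse_response):
    """Subtract the predicted tactile-generated motion from the sensor signal."""
    predicted = predict_tactile_motion(drive, impulse_response)
    return [m - p for m, p in zip(measured, predicted)]
```

When the model matches the device, the corrected signal contains only the wearer's own head motion. In the feedback variant, the predicted term would instead come from a measurement of the actual tactile-generated motion.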
It should also be noted that frequency response is not the only aspect of sound reproduction that is important. In order to reproduce sound with maximum fidelity, sounds must not only be produced at the right levels, they must also be reproduced at the right time. One of the complexities associated with passive equalization is that it generally changes temporal signal relationships as well. In a multiple driver system, this can cause complex conditions in which multiple drivers produce the same frequencies. This can result in cancellation at some frequencies and reinforcement at others. When two drivers reproduce the same frequency, the resultant output (measured at a given point, such as the listener's ear) can be equal to the sum of the individual outputs of the two drivers if the signals arrive in phase (zero degrees of phase difference), or the second driver can completely cancel the output of the first driver if the signals arrive precisely out of phase (180 degrees), or the result can fall anywhere between these two extremes if the offset is somewhere between 0 and 180 degrees. These effects can have significant adverse effects on fidelity.
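The dependence of the combined output on phase difference can be illustrated with a short Python sketch. It assumes two sinusoids of the same frequency arriving at a single measurement point and adds them as phasors; the function name is illustrative.

```python
import cmath
import math


def combined_amplitude(a1, a2, phase_deg):
    """Amplitude of the sum of two equal-frequency sinusoids.

    a1 and a2 are the individual amplitudes; phase_deg is the phase
    offset of the second signal relative to the first, in degrees.
    """
    phi = math.radians(phase_deg)
    return abs(a1 + a2 * cmath.exp(1j * phi))


for phase in (0, 90, 180):
    print(phase, round(combined_amplitude(1.0, 1.0, phase), 3))
```

For two equal-amplitude signals, the result ranges from double the individual amplitude at 0 degrees to essentially zero at 180 degrees, with intermediate values in between.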
A major advantage of tactile bass transducers is that tactile drivers use a different neural pathway than audio transducers use, so (with pure tactile transmission) such cancellation/reinforcement effects do not occur. However, in some implementations of tactile transducers in a headphone or VR/AR headset or similar device, the movement of the tactile transducer may also generate audible signals. This audio output may be directly generated by the movement of the transducer, or it may be generated by movement of the earcup and cushion against the wearer's ear, or by some other mechanism. Such interactions can materially alter the overall frequency response of the combined system. Those interactions can be difficult to predict a priori. Differences in the size and shape of the wearer's head, the mechanical properties of the ear cushions, and the components forming the mechanical connection between the two earcups can all affect these interactions.
It would be advantageous for a combined tactile/audio head-mounted system to provide a means to detect and correct for unwanted interactions between tactile and audio driver output.
Another aspect of temporal fidelity that such control circuits can improve is the impulse response of its drivers. An ideal transducer does not store energy: for a speaker or tactile transducer, electrical energy would be immediately converted to motion, and that motion would stop as soon as the electrical input signal stopped. In reality, transducers have mass, and when combined with the suspension systems used to locate the moving portion of the transducer, they become energy storage devices. Worse, the amount of energy stored varies with frequency, and energy that is put into the system at one frequency may be subsequently released at one or more different frequencies.
These effects can be partially mitigated through use of mechanical damping, such as with fluid damping. But with the advent of powerful and inexpensive digital processing circuitry, it is possible to further improve the time domain performance of both audio and tactile transducers.
It would be desirable to provide a means for digitally controlling the output of tactile transducers, and the combined output of tactile and audio transducers in a mixed system, to improve time domain performance.
One method for electronically tailoring the output of the tactile transducer is to apply filtering that compensates not only in the frequency domain but also in the time domain. A finite impulse response (FIR) filter can be used to improve the impulse response of a transducer in several ways. An FIR filter can reduce the tendency of a transducer to ring at its resonance. An FIR filter can also improve impulse response by boosting the first cycle of a given signal while attenuating subsequent cycles. Finally, an FIR filter can correct phase anomalies to reduce unwanted cancellation or reinforcement between the tactile and acoustic drivers.
This feature of the instant invention is illustrated in
The output of the same tactile transducer when it is fed a digitally generated error correction signal may look like 5608. The rise time of the impulse has been improved, and the ringing significantly reduced.
A finite impulse response filter generates an error correction signal to be combined with the input signal 5602. The error correction signal provides additional in-phase energy corresponding with the initial impulse in order to make the initial impulse output more closely resemble the input impulse. The correction signal provides out-of-phase energy to the portion of the waveform following the initial impulse in order to reduce ringing. The resulting output 5608 more closely represents the input.
Finite impulse response filtering can also be used to flatten frequency response and reduce the resonant peak of a tactile transducer, thereby reducing or eliminating the need for mechanical damping. Thus finite impulse response filtering may also improve the efficiency of a tactile transducer relative to mechanical damping, because mechanical damping is likely to reduce output at a wide range of frequencies, while finite impulse response filtering can be applied selectively, only reducing output at certain frequencies.
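The ring-suppression idea can be sketched in Python under a strong simplifying assumption: that the transducer behaves as a discrete-time two-pole resonator with known pole radius `r` and pole angle `theta`. Under that assumption a three-tap FIR filter is an exact inverse, and the second and third taps play the role of the out-of-phase correction signal that is combined with the input. The model, the parameter values, and all function names are hypothetical; the sketch covers only the suppression of ringing, not the boosting of the initial impulse.

```python
import math


def resonator(x, r, theta):
    """Assumed transducer model: two-pole resonator that rings after an impulse."""
    y = []
    y1 = y2 = 0.0
    for xn in x:
        yn = xn + 2.0 * r * math.cos(theta) * y1 - r * r * y2
        y.append(yn)
        y2, y1 = y1, yn
    return y


def correction_taps(r, theta):
    """FIR taps of the correction signal (zero at lag 0, so the input passes unchanged)."""
    return [0.0, -2.0 * r * math.cos(theta), r * r]


def fir(x, taps):
    """Plain FIR convolution, truncated to the length of x."""
    return [sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]


def precompensate(x, r, theta):
    """Combine the input with the FIR-generated correction signal."""
    correction = fir(x, correction_taps(r, theta))
    return [xn + cn for xn, cn in zip(x, correction)]
```

Feeding an impulse through `precompensate` and then through `resonator` returns the impulse with the ringing removed, whereas the uncompensated resonator output continues to oscillate. A real transducer would require a measured response and a longer filter.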
When tactile drivers are combined with acoustic drivers, additional complications are introduced. A tactile driver may have acoustic output of its own, which can interact with the acoustic output of the primary driver. In addition, in a closed back headphone, a chamber is defined by the space enclosed by the headphone cup, cushion, and the wearer's head. If the movement of the tactile transducer modulates the volume of that chamber, then the effect will be similar to movement of a transducer moving at that frequency, with a potentially strong acoustic component added to the tactile output. That acoustic output can interact with the acoustic output of the main driver (and the tactile output of the tactile driver) in undesirable ways. It can cancel acoustic output at some frequencies and augment it at others.
A development in the field of headphones in recent years is the emergence of active noise-canceling headphones. The basic principle these headphones employ is that a sound wave can be cancelled by generating its inverse. Thus in general these headphones use a microphone to detect noise in the environment, and active signal processing and amplification to generate an inverse signal to be produced by the acoustic driver of the headphone, thus substantially reducing the amplitude of the undesired sound at the wearer's ear. In some designs the microphone is located on the outside of the headphone, so that it (ideally) only receives the external sounds the system is supposed to attenuate. In other systems, there may be a microphone located inside the earcup.
One limitation of ANC is that it is most effective for periodic noise. This is inherent in the fact that ANC is always generating output that lags behind its stimulus. For periodic signals, this is not a serious problem, but for transient signals with wave periods on the order of the feedback lag, ANC is of limited value. Because the noise prevalent in one of the most common uses for noise cancelling headphones (air travel) is largely periodic, they can work well in those contexts.
Another significant challenge for conventional noise cancelling headphones is that the cancellation can only operate within a somewhat constrained frequency spectrum. The acoustic drivers in most headphones have limited output at low frequencies. They are therefore limited in their ability to attenuate low frequency noise that can annoy listeners [Leventhall, 2004].
HG Leventhall. Low frequency noise and annoyance. Noise and Health. 6(23):57-72 (2004).
It would therefore be advantageous to provide a means for extending the low frequency capabilities of noise cancelling headphones.
Tactile transducers such as those described above can be capable of significant output at extremely low frequencies. Those transducers can be utilized to cancel external noise at low frequencies that are not easily cancelled by acoustic drivers.
Combining ANC, acoustic transducers and tactile transducers permits an additional benefit. It is also possible to reduce not only the effects of outside noise, but unwanted acoustic noise generated by the tactile transducer. In brief, the presence of a microphone inside the headphone cup makes it possible to detect interaction between the output of the two transducers, and to alter the output of one (or both) of them in order to reduce or even eliminate adverse interactions, as described above, and shown in
Another issue that can arise when combining tactile and acoustic drivers (or using tactile drivers in conjunction with acoustic content generated by other systems) is that the dynamic range commonly used in recorded audio (that is, the range between the very soft and very loud sounds in a recording) is roughly 80 dB, while the comfortable tactile dynamic range appropriate for applications such as headphones and VR headsets is significantly smaller, on the order of 20 dB. That difference makes matching a tactile driver with an acoustic driver challenging. If no mechanism is used to compensate for that difference, at least one of two issues will arise: either quiet acoustic signals will have no tactile component, or there will be no difference in tactile output between moderately loud acoustic sounds and the loudest acoustic sounds (or potentially both will occur if the tactile transducer is set so that its dynamic range corresponds with the middle of the acoustic range).
It would thus be desirable to implement a means for mapping the output of a tactile driver to the relatively wide dynamic range of music and other audio content when perceived acoustically.
An embodiment of the present invention accomplishes this goal by applying a form of dynamic range compression to the signal prior to amplification and transmission to the tactile transducer. For example,
Analog or digital input signal 6002, sent from an audio amplifier, Bluetooth receiver, USB interface IC, or similar, is transmitted both to acoustic driver 6004 and to buffer 6006. From the buffer, the signal is passed to dynamic compressor 6008. It then passes through low-pass filter 6010, non-linear user-adjustable gain control 6012, finite impulse response filter 6014 (if used), amplifier 6016, and finally to tactile driver 6018.
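One simple way to picture the compressor stage is as a static level mapping from the roughly 80 dB acoustic range onto a roughly 20 dB tactile range, that is, a 4:1 ratio. The Python sketch below assumes levels are expressed in dB relative to full scale (0 dB at the top of the range) and omits envelope detection and attack/release behavior, which a practical compressor would need. The function names and default ranges are illustrative.

```python
def tactile_level_db(audio_level_db, input_range_db=80.0, output_range_db=20.0):
    """Map an audio level (dB re full scale) onto the narrower tactile range."""
    ratio = input_range_db / output_range_db
    # Levels outside the assumed input range are clamped to its edges.
    clamped = max(-input_range_db, min(0.0, audio_level_db))
    return clamped / ratio


def tactile_gain(audio_level_db, input_range_db=80.0, output_range_db=20.0):
    """Linear gain that moves a signal at the given level to its mapped level."""
    clamped = max(-input_range_db, min(0.0, audio_level_db))
    out_db = tactile_level_db(clamped, input_range_db, output_range_db)
    return 10.0 ** ((out_db - clamped) / 20.0)
```

With these defaults, an input at -80 dB maps to -20 dB and an input at 0 dB stays at 0 dB, so both the quietest and the loudest passages retain a distinguishable tactile component.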
In many applications, it will be undesirable for a tactile transducer to produce audio-frequency output. If tactile transducers in headphones or devices like VR/AR headsets are sent audio-frequency signals, they may produce audio-frequency output that the user will hear. One situation in which audio-frequency signals may be unintentionally sent to a tactile transducer is when a stage in the electronic circuit feeding signals to the transducer is overloaded, a condition also known as clipping. Such a clipped signal is likely to contain considerable high-frequency energy. Because the ear is exquisitely sensitive to frequencies in the 1 kHz-5 kHz range, special care should be taken to avoid sending clipped signals to the tactile transducers. Clipped signals can also arise from compression. For example, in gaming, impulse sounds like shots often occur after long periods of silence. Clipped signals can also occur if a tactile transducer is intentionally over-driven. Overdriving a tactile transducer may be a useful technique for increasing its peak output in certain circumstances. However, it would be desirable to suppress from its drive signal any frequencies from approximately 200 Hz up into the kilohertz range. Fortunately, this can be accomplished with a soft saturation filter.
Unlike some other approaches to limiting, which operate on a “look-ahead” basis and thus introduce latency, a soft saturation filter has no look-ahead, and is continuous in its first few derivatives. One computationally efficient way of providing the soft saturation function is to apply a cubic roll-off above a given threshold.
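A cubic roll-off of this kind might be sketched as follows. This is one possible construction, not necessarily the one intended: below the threshold the signal passes unchanged, above it a cubic segment bends the curve smoothly until it reaches the limit, and beyond that the output is held at the limit. The output and its first derivative are continuous everywhere. The function name and the default threshold and limit are assumptions.

```python
def soft_saturate(x, threshold=0.5, limit=1.0):
    """Memoryless soft limiter with a cubic knee (no look-ahead, no latency)."""
    # Knee width chosen so the cubic reaches `limit` exactly where its slope is zero.
    knee = 1.5 * (limit - threshold)
    mag = abs(x)
    if mag <= threshold:
        return x
    if mag >= threshold + knee:
        y = limit
    else:
        u = (mag - threshold) / knee
        # Slope is 1 - u**2: unity at the threshold, zero at the end of the knee.
        y = threshold + knee * (u - u ** 3 / 3.0)
    return y if x > 0 else -y
```

Because each output sample depends only on the current input sample, the limiter adds no delay, and the smooth knee generates less high-frequency energy than hard clipping at the same level.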
The soft saturation output limiting shown in
Tactile drivers can enhance the immersiveness of almost any kind of sound reproduction experience. Headphones as described above can add to the experience of watching movies, especially in the home theater context. But modern movie sound tracks tend to provide more than 2 channels of audio—in some cases, many more channels. High-end home theaters, like modern traditional theaters, can present a rich, 3-dimensional soundscape. Conventional stereo headphones cannot reproduce that complex soundfield (unless uncommon techniques like binaural recording are employed). Although many attempts have been made to provide the multichannel experience with headphones, convincing experiences have proven elusive. And an important part of the experience of going to a large movie theater is the large acoustic space and its carefully engineered loudspeaker system.
Movie theaters and their sound systems can be designed to present powerful low-frequency sound effects. But those approaches to delivering deep bass can have drawbacks. Low-frequency systems capable of high pressure levels in such venues can be large and expensive, and require very powerful amplifiers. Perhaps more significantly, it is extremely difficult and expensive to prevent low frequency energy from leaking from one theater to another. It can be distracting and annoying to patrons watching a movie in one theater to hear the low frequency portion of the soundtrack of a different movie playing in the next theater.
It would therefore be advantageous to provide a system that would permit a user to achieve tactile low-bass stimulation while still being exposed to outside sounds, including a multichannel sound field.
Alternatively, tactile transducers may be included in devices that contact other parts of a wearer's body, such as the neck or shoulders or upper back.
When combined with a multi-channel sound system in an environment such as a movie theater, a headset or other device as disclosed can be used to deliver an immersive experience that includes the full range of frequencies down to the infrasonic range, without the bleeding between theaters or other rooms that is common when those frequencies are produced acoustically.
The same approach can also be implemented with a more traditional open back headphone. Because an open back headphone is (more or less) acoustically transparent to the outside environment, the same approach can be employed. As long as the headphone is largely open acoustically, the effect can be similar. However, in order to deliver this functionality, it will be necessary to either (i) transmit only the tactile frequencies to the open-back headset, or (ii) equip the open-back headset with a switch or other means for turning off the acoustic transducers in the headphone. If provided with a personal volume control for acoustic output, such a device would also permit customization of sound levels on a per viewer/listener basis, as opposed to the one-size-fits-all level that normally is found in a movie theater, concert venue, etc.
An additional capability that may be employed with such an open back headphone is to direct specific sounds to the headset in higher frequency ranges. Thus for example, if a character on screen is whispering to another character, that whisper could be directed not to the speakers mounted on the walls of the theater, but to the appropriate transducer in the headset.
As discussed above, another area in which tactile low-frequency drivers can be desirable is gaming. One specialized case which provides a demanding test of headphones is that experienced at elite competitive levels of gaming. E-sports tournaments between top teams can now take place in arenas in front of thousands or even tens of thousands of fans. Team play requires coordination and communication between team members. Gaming headsets generally include microphones to facilitate the required conversations. When those fans are cheering, the ambient noise levels can be high enough to interfere with the ability of team members to hear each other. So when playing before a crowd, elite gamers may resort to wearing both in-ear and over-ear headphones simultaneously in order to maximize isolation from that noise, using the over-the-ear headphones passively only for their noise-isolation.
As discussed above, tactile transducers can improve performance in gaming because the tactile neural pathway provides faster reaction times. It would therefore be advantageous to provide a way to enable tactile stimulation in the two-headphone context.
One aspect of the invention is to include circuitry and components that enable a tournament mode in over-the-ear headphones incorporating tactile transducers. Those components could include hardware or software switching to mute or attenuate the audio driver, but leave the tactile transducers active. This approach can also be applied to an over-the-ear headphone that also includes active noise cancellation. Preferably tournament mode in such a device would mute game sounds, but ANC would remain active.
As discussed above, one method for enhancing the frequency response of a tactile transducer as described is to provide mechanical damping of its primary resonance with a fluid such as a ferrofluid.
One challenge associated with that approach is that the performance characteristics of the transducer can be dependent on the precise amount of fluid deployed. Too much fluid may overdamp the transducer, thereby reducing output unnecessarily; too little fluid may not adequately reduce the resonance.
Accurately dispensing small quantities of fluids (which may be on the order of microliters) can be challenging in a production environment. It would therefore be advantageous to provide a method for reducing the criticality of the quantity of fluid used to damp the transducer.
Providing openings 6412 in plate 6406 reduces the contact area between the mass and the plate. It also provides volume into which unneeded damping fluid can flow without affecting damping.
Numerous researchers have begun to examine the effects of various forms of stimulation intended to affect brain wave activity. Specifically, it is believed that brain waves associated with different states of awareness or relaxation have differing frequencies. For example, the waves associated with relaxation are commonly called alpha waves, and are thought to have a frequency range of 8-12 Hz, while the waves generally associated with alertness, called beta waves, have a frequency range of 12 Hz and higher.
A number of people have experimented with the use of headphones to generate signals based on the frequencies of the brainwaves they seek to encourage. It is believed that by externally generating these signals at the desired frequency, it is possible to help the brain itself to generate waves at the same frequencies, and thereby enter a more relaxed state, in a process called brainwave entrainment.
One of the challenges in generating these waves using the prior art is that the desirable waves are at frequencies that are below the normal range of audio transducers. A further challenge is that even if a traditional acoustic transducer can be made to generate the required frequencies, the human ear is not very sensitive to them. For these reasons, brain wave entrainment has generally been attempted using the phenomenon of interaural beats. Interaural beats take advantage of psychoacoustics to combine a frequency played in one ear with a different frequency played in the other ear to synthesize a third signal that exists only in the listener's brain. If, for example a 200 Hz tone plays in one ear, and a 210 Hz tone plays in the other ear (frequencies easily generated by conventional headphone drivers), the two signals “beat” against each other at 10 Hz, and a 10 Hz signal is perceived by the listener.
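The 10 Hz figure in the example above follows from the difference between the two tone frequencies. The Python sketch below shows only that arithmetic, using the trigonometric identity for the sum of two sinusoids; it assumes, for illustration, that the tones are physically summed, whereas an interaural beat arises within the listener's auditory system. Function names are illustrative.

```python
import math


def beat_frequency_hz(f_left_hz, f_right_hz):
    """Beat rate produced by two tones: the absolute frequency difference."""
    return abs(f_left_hz - f_right_hz)


def summed_tones(f1_hz, f2_hz, t_s):
    """Direct sum of two unit-amplitude sinusoids at time t_s."""
    return (math.sin(2.0 * math.pi * f1_hz * t_s)
            + math.sin(2.0 * math.pi * f2_hz * t_s))


def envelope_carrier_form(f1_hz, f2_hz, t_s):
    """Same sum written as a slow envelope times a carrier at the mean frequency."""
    mean_hz = 0.5 * (f1_hz + f2_hz)
    half_diff_hz = 0.5 * (f2_hz - f1_hz)
    return (2.0 * math.cos(2.0 * math.pi * half_diff_hz * t_s)
            * math.sin(2.0 * math.pi * mean_hz * t_s))


print(beat_frequency_hz(200.0, 210.0))  # prints 10.0
```

The envelope term oscillates at half the difference frequency, and its magnitude peaks twice per cycle, which yields the 10 Hz beat rate. The carrier at 205 Hz remains fully audible, which is the drawback discussed in the following paragraph.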
This is the process that has been employed by previous efforts to generate low-frequency brain waves. A significant drawback to this system is that in order to generate, say a 10 Hz interaural beat signal at a given amplitude, it is necessary to produce the two higher frequency signals that beat against each other. It is not possible with this approach to allow the user to perceive only the 10 Hz signal. The higher frequency signal is likely to be in a region that the human perceptual system is significantly more sensitive to than the desired frequency is, which may make the beating frequencies distracting and at cross purposes in terms of helping the user achieve a more relaxed state.
It would be desirable to provide a means for achieving brainwave entrainment while substantially eliminating or reducing distracting higher frequency signals.
A significant advantage of the subject invention is that it can be employed to directly generate the desired alpha wave frequencies, without also generating very obvious audio-frequency signals.
A key aspect of the entrainment mechanism is that externally producing the frequency that is desired to be reproduced in the form of brain waves will lead the brain to slowly synchronize its internal wave production to equal that of the externally produced wave. There is evidence that an open loop system can eventually achieve this result, at least in some cases.
A closed loop system—that is, a system that measures the brain wave activity of the person wearing the entrainment device, and then generates an entrainment signal that is adapted to the existing brain wave activity—could significantly improve the effectiveness of entrainment.
For example, the frequency, phase and/or intensity of the wearer's EEG in the alpha frequencies could be observed by the system, then tactile output could be ramped up that had frequency, phase and/or intensity matched to the wearer's native EEG, facilitating entrainment. The phase, intensity, and frequency content of the tactile signal could then progressively be moved closer to the target frequencies—the lower frequencies associated with greater relaxation.
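The progressive adjustment described above might be sketched as a bounded step from the measured frequency toward the target on each update. This is a minimal illustration of the frequency component only; phase and intensity matching are omitted, and the function name, the step size, and the 8 Hz target used in the usage note are hypothetical choices rather than values from the disclosure.

```python
def next_stimulus_frequency_hz(measured_hz, target_hz, max_step_hz=0.1):
    """Choose a tactile stimulus frequency close to the measured EEG frequency.

    The stimulus is offset from the measured value by at most max_step_hz
    in the direction of the target, so that it stays near the wearer's
    native rhythm while gradually leading it toward the target.
    """
    delta = target_hz - measured_hz
    step = max(-max_step_hz, min(max_step_hz, delta))
    return measured_hz + step
```

For a wearer whose measured alpha frequency is 10.5 Hz and a target of 8 Hz, the first stimulus would be about 10.4 Hz; as subsequent measurements move lower, the stimulus follows until the target is reached.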
In another non-limiting example, the tactile channel could signal to the user the degree to which they were achieving high alpha output.
Thus it would be desirable to provide a means for adapting brainwave entrainment signals to actual brainwaves of the user of the entrainment device.
EEG sensor and related modules 6620 transmit the brain wave signal to the microprocessor 6622. The microprocessor uses that data to generate appropriate signals to transmit to buffer 6604.
Various methods have been disclosed for providing tactile drivers in an over-the-ear headphone. Another type of headphone that has become very popular is the in-ear headphone, or earphone. This type of device includes at least a portion that is intended to be placed inside the ear canal of the user. In some designs, the entire device is small and light enough that it can be held in place by a resilient foam, rubber or plastic component that makes contact with the ear canal. In other designs there may be a rigid or semi-rigid arm that wraps behind the user's pinnae, in a manner analogous to the arms of a pair of eyeglasses.
As wireless systems such as Bluetooth have been applied to headphones of various form factors, some in-ear headphones have been developed that include a roughly horseshoe-shaped central component that includes one or more batteries, amplification circuitry, and wireless circuitry. The in-ear portions of the headphone are connected to the horseshoe section with short wires, which may be configured so that wires retract into the horseshoe, and the in-ear components are protected by the horseshoe when the system is not in use. In some designs, the horseshoe is intended to fit loosely around the user's neck. In other designs it may be intended to fit over the user's head. The horseshoe section is generally capable of communicating wirelessly with an audio source such as, for example, a smart phone.
In-ear headphones are of course small in size, and present limited space for tactile transducers. It would therefore be desirable to offer a means for including tactile transducers that can be paired with in-ear headphones.
A semi-rigid, resilient band 6702 wraps around a significant portion of the wearer's head, or sits on the wearer's neck or shoulders. The band may provide means to adjust fit for a variety of head shapes and sizes. The band may contain functional components such as wireless communication circuitry, which may use protocols such as Bluetooth, 802.11, or one or more different wireless communications protocols. It may also contain analog and/or digital signal processing circuitry, one or more batteries, amplification, one or more displays, and one or more spaces for storing the attached in-ear audio drivers when the device is not in use. Each of the foregoing aspects of the device is or may be known in the prior art.
By including one or more tactile transducers 6704 and related drive circuitry in the band, and configuring them so that they transmit tactile signals to the wearer's skin, the benefits of tactile signal transmission can be delivered for users of in-ear headphones 6706.
It should be understood that the aspects, features and advantages made apparent from the foregoing are efficiently attained and, since certain changes may be made in the disclosed inventive embodiments without departing from the spirit and scope of the invention, it is intended that all matter contained herein shall be interpreted as illustrative and not in a limiting sense.
It is also to be understood that the following claims are intended to cover all of the generic and specific features of the invention herein described, and all statements of the scope of the invention that, as a matter of language, might be said to fall therebetween.
The systems described herein, or portions thereof, can be implemented as a computer program product or service that includes instructions that are stored on one or more non-transitory machine-readable storage media, and that are executable on one or more processing devices to perform or control the operations described herein. The systems described herein, or portions thereof, can be implemented as an apparatus, method, or electronic system that can include one or more processing devices, parallel processing devices, and memory to store executable instructions to implement various operations.
Number | Date | Country
---|---|---
62762443 | May 2018 | US
62219371 | Sep 2015 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 16501601 | May 2019 | US
Child | 16797411 | | US

Relation | Number | Date | Country
---|---|---|---
Parent | 15268423 | Sep 2016 | US
Child | 16501601 | | US