The present invention is in the technical field of digitally and acoustically controlled audio signal processing systems and methods. More particularly, the present invention is in the technical field of signal processing of acoustic signals to allow for a full sound stage imaging experience by acoustic and electronic manipulation using digital signal processing and applied psychoacoustic principles.
Conventional speaker system designs use various techniques to provide an authentic listening experience of sound recordings. Movie theater and live music performances provide listener-perceived acoustic images and spatial characteristics. These characteristics are desirable in audio recordings and movie sound tracks intended for later playback. A number of factors impact the listening experience of recorded sound. The placement of speakers within the listening environment, the reflection of sound off objects in the environment, the power output to speaker driver components when playing particular individual notes or frequencies, the playback volume, harmonics, frequency cancellation, and other psychoacoustic phenomena all affect the fidelity of the acoustic sound image, spatial perception, and overall sound quality. Audio engineers have used a variety of well-known techniques to address limitations in sound reproduction resulting from a particular speaker design, from environmental factors where speakers are used, and from biological processes in the ear and brain.
Traditionally, large home speakers with separate drivers for high, mid, and low frequencies were used primarily for music listening. These speakers were generally large and required calculated placement in a dedicated listening space. Over time, consumers began using these systems for in-home movie viewing. With the advancement of digital technology, more and more sophisticated digital video and digital sound systems have become available, and home theater systems are now ubiquitous. Consumers expect their home theater systems to authentically recreate the sound experience of traditional theaters and 5.1 and 7.1 speaker systems. Consumers now demand the same listening experience for both music and movie viewing in a small, inconspicuous speaker enclosure. Multi-driver, single speaker systems, known as surround bars, have been developed to satisfy this demand.
Interaural Crosstalk
The brain uses the small difference in arrival time of a sound at each ear to calculate the direction of origin of the sound. For example, if a sound arrives at the listener's right ear before arriving at the left ear, the listener perceives the sound as coming from somewhere to the right. This phenomenon is known as interaural time difference (ITD). Our brain measures and processes those subtle timing differences in a way that allows us to accurately determine where a sound source is located.
Interaural crosstalk (IAC) occurs when two sound sources that are separated in space (for example, a pair of speakers) are intended to replace a single source. In such an arrangement, a sound signal representative of a single sound arrives at each ear from the left speaker and a sound signal arrives at each ear from the right speaker, each with a slight time delay. This is unnatural and one of the flaws of stereo reproduction of recorded sound. It is also IAC that restricts the sound stage for the playback of recorded sound from stereo speakers to the area between the speakers, reducing the sound stage and distorting the sound image. IAC is a fundamental problem not only for stereo surround sound reproduction but for any system with more than one speaker. It is because of IAC that we hear the positions of the speakers in a stereo system and not the natural surround sound or live sound stage experience. The effect of IAC when listening to recorded sound is to tell your brain where the speakers are located while, at the same time, covering up the original recorded sound source location information. Once your brain knows where the loudspeakers are, all of the sounds seem to come only from the loudspeaker locations and the space in between the speakers, reducing the perceived sound stage and your sense of immersion in the performance. This is nothing like what you would hear at the original concert or in a real world environment, and it is one of the major reasons why even the best conventional playback systems still don't quite sound like the real thing.
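As a rough illustration of how small these timing differences are, the following Python sketch estimates the ITD of a distant source using a simple straight-line path-difference model. The ear spacing, the speed of sound, and the function name are assumptions chosen for illustration and are not taken from this description; a real head diffracts sound, so measured values differ.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed: approximate distance between the ears


def estimate_itd_seconds(azimuth_deg: float) -> float:
    """Estimate the interaural time difference for a distant source.

    azimuth_deg is measured from straight ahead, positive to the listener's
    right. A positive result means the sound reaches the right ear first.
    This is a simplified geometric model, not a measured response.
    """
    extra_path_m = EAR_SPACING_M * math.sin(math.radians(azimuth_deg))
    return extra_path_m / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    for angle in (0, 30, 90):
        itd_us = estimate_itd_seconds(angle) * 1e6
        print(f"{angle:>3} degrees: {itd_us:.0f} microseconds")
```

Under these assumptions the largest difference, for a source directly to one side, is roughly half a millisecond.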
Methods of cancelling IAC to create a very wide soundstage are known. One method pairs a driver located about one head width outside a second driver unit that reproduces the normal Left and Right front channel signals. The additional drivers receive an inverted version of the sound signal from the opposite channel. The geometry and spacing of the drive units ensure that this inverted crosstalk cancellation signal arrives at the ear at the same time as the unwanted IAC and acoustically cancels it. The geometry also ensures that proper cancellation will occur regardless of how far apart the two speakers are or how far away the listener sits. The head-width based geometry of the system means that the system functions properly regardless of how far apart the left and right rear channel drivers are located or how far away you are sitting.
Another method used to cancel IAC is through the use of digital signal processing in multi-driver speaker systems. Sound emitted from each driver in the system is manipulated in time, volume, or frequency shift relative to a second or third driver or channel within the system. In this way selected sound waves can be cancelled or enhanced, and the timing of signals can be modified to create a perceptual impression of location in the sound stage. These cancellation and image stabilizing signals are limited to a range of psychoacoustically significant frequencies, mainly in the midrange. The use of a carefully determined frequency range for these signals contributes to the natural sound and highly musical characteristic of the speaker sound, meaning the system delivers a credible surround sound experience over a much broader range of listening locations.
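The driver-pairing method above can be sketched in a few lines of Python. This is a minimal, idealized illustration: the driver labels and the function name are hypothetical, and the check at the end assumes the crosstalk and its inverted copy reach the ear at exactly the same time and level, which in practice depends on the geometry described above.

```python
from typing import Dict, List


def driver_feeds(left: List[float], right: List[float]) -> Dict[str, List[float]]:
    """Return per-driver sample lists for a four-driver arrangement.

    The main drivers carry the normal channel signals. Each outer driver
    carries the polarity-inverted signal of the opposite channel.
    Driver names are hypothetical labels used only in this sketch.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    return {
        "left_main": list(left),
        "left_outer": [-s for s in right],   # inverted right channel
        "right_main": list(right),
        "right_outer": [-s for s in left],   # inverted left channel
    }


if __name__ == "__main__":
    feeds = driver_feeds([0.5, -0.25], [0.1, 0.2])
    # Idealized view of the left ear: crosstalk from the right main driver
    # plus the inverted copy from the left outer driver, assumed to arrive
    # together at equal level.
    residual = [a + b for a, b in zip(feeds["right_main"], feeds["left_outer"])]
    print("residual crosstalk at left ear:", residual)
```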
Each of these signals, including the main left and right rear channel signals, is modified by its own front-to-back transformation filter. For each of the rear signals, a separate front-to-back filter transforms the rear signals such that when they are combined acoustically at the listener's ears the resulting perceived sounds have characteristics associated with a sound originating from behind you rather than in front. The benefit of eliminating (or at least substantially reducing) IAC is that you now hear the original recorded information relating to the locations of the instruments and the acoustics of the concert hall unrestricted by the locations of your playback speakers.
Head Related Transfer Functions
If there is a time delay for sound arriving at your left ear relative to your right ear the sound is perceived as coming from a location to the right of center. The greater the time delay, the further to the right sound is perceived. The smaller the time delay, the closer to the center. Zero time delay means the sound is perceived as originating directly in front of you or directly behind you. This perceived directionality also occurs for sounds located directly to either side of the head. In our example, a time delay would be the same for a sound originating off center from right or left side of the head, either to the front or to the rear.
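The front/rear ambiguity described above can be made concrete with a short Python sketch. It inverts a simple straight-line path-difference model to list the two directions that produce the same time delay. The ear spacing, speed of sound, and function name are assumptions for illustration only.

```python
import math
from typing import Tuple

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature
EAR_SPACING_M = 0.18        # assumed: approximate distance between the ears


def candidate_azimuths_deg(itd_seconds: float) -> Tuple[float, float]:
    """Return the front and rear azimuths consistent with one time delay.

    Positive itd_seconds means the right ear hears the sound first. Angles
    are measured clockwise from straight ahead, so 0 is directly in front
    and 180 is directly behind.
    """
    max_itd = EAR_SPACING_M / SPEED_OF_SOUND_M_S
    # Clamp so rounding error cannot push asin() outside its domain.
    ratio = max(-1.0, min(1.0, itd_seconds / max_itd))
    front = math.degrees(math.asin(ratio))
    rear = 180.0 - front  # mirror image behind the listener, same delay
    return front, rear


if __name__ == "__main__":
    print(candidate_azimuths_deg(0.0))
    print(candidate_azimuths_deg(0.00026))
```

A time delay alone therefore always leaves two candidate directions, one in front and one behind, which is the ambiguity that the frequency-response cues resolve.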
To address the ambiguity, the asymmetry of our ears, head, and torso changes the frequency response of sounds arriving from behind us so that they sound different than they would if they came from the front. This is also, generally, how we determine whether a sound is above us or below us. In fact, for each possible direction of arrival at our ear there is a unique frequency response characteristic or sonic signature based on the shape features of the head, known as the head related transfer function (HRTF). So long as we are somewhat familiar with the sound, such as the voice of someone we know or a door slamming, we can easily and accurately determine whether it is high or low, in front or behind, and which direction it is coming from. U.S. Pat. No. 8,000,485, which is fully incorporated herein by reference, describes a number of the mathematical equations applicable to calculating the perceived location of sound.
It is possible, using well-known digital signal processing techniques, to electronically manipulate or synthesize the correct HRTF adjusted sound signal that provides perceptual cues to make a sound coming from a loudspeaker directly in front of you seem like it's coming from behind you. “Virtual” surround sound is achieved by feeding the surround channels to a pair of front speakers with the correct electronic and digital signal HRTF reformatting, so that they sound like they're behind you rather than in front. A number of devices have used digital signal processing to electronically synthesize the proper audio signals that provide HRTF cues to make two front loudspeakers seem as though they are reproducing the sound of five loudspeakers surrounding the listener. Many digital audio systems include “virtual” surround algorithms to simulate a surround sound experience.
To function properly, all of these systems require speakers with high enough performance capability to preserve the accuracy of the synthesized or digitally enhanced HRTF adjusted signals. It is also required that the speakers and listener be located in exactly the positions that correspond to the synthesized HRTF cues. The HRTF cues are also somewhat different for each person, and those differences can mean that a system that produces a convincing surround sound illusion for one person may barely work for another. To avoid these limitations, it is preferable to use HRTF cues that rely only on those key features of the HRTFs that are common to everyone and have nearly identical sound characteristics over a broad range of sound arrival directions. Many of these key HRTF cue components lie within the same range of frequencies found to be psychoacoustically important for the cancellation of IAC. Sound signal filters containing the key HRTF cue components are combined with crosstalk cancellation signals, binaural image stabilization signals, and time delays to achieve both cancellation of IAC and front-to-back soundstage transformation. This system works much more sympathetically with the way that we hear naturally and offers a more natural surround sound experience over a broader range of seating locations than purely electronic attempts to synthesize a virtual 5.1 system. Additionally, of course, the system works for almost anyone with normal hearing.
Using these techniques provides tremendous flexibility in speaker placement and listener location options and works equally well for almost anyone with normal hearing. In addition, HRTF recognizes that movement of the listener is an important part of the surround sound experience. The HRTF acoustic cues that reach the listener's ears dynamically reinforce the surround sound experience as the listener turns or moves their head.
Audio engineers have used various acoustical engineering techniques, digital signal processing, and applied psychoacoustic sound signal manipulation in speakers to develop sophisticated and authentic sound stages with accurate spatial and acoustic image reproduction of the movie theater experience or live music performance experience. However, in known single speaker surround bar audio applications these methods are independent, mutually exclusive, and dedicated to a single speaker design application. For example, if a speaker is intended for home theater movie viewing, one set of digital signal processing techniques and applied psychoacoustic configurations is applied, and if a music listening application is intended, a different set of configurations is used. The optimal configuration for movie viewing is different from the optimal configuration for music listening. A configuration intended for a movie can create distortions and reduced fidelity if applied to music listening. Reconfiguring an audio surround bar system that has been configured for movie viewing to one intended for music listening requires sophisticated technical understanding and expertise. Reconfiguration requires changes in many of the DSP and other configuration parameters. Reconfiguration becomes even more challenging when using a single speaker surround bar system. The average listener will simply not make the changes in configuration when changing between movie and music modes, and so endures a limited sound experience.
Therefore, a need exists for a simple audio configuration control that allows for ready change between movie viewing configuration and music listening configuration of a surround bar type speaker audio system.
The present invention is a speaker system capable of, and a method for, processing acoustic signals for playback in a single speaker multi-driver surround sound system to allow for full sound stage imaging and a realistic reproduction of the spatial acoustic experience of the listener when listening to the playback of recorded music or when listening to digital movie sound tracks. The sub parameters controlled by the method include Head Related Transfer Functions (HRTFs), Interaural Crosstalk (IAC) Cancellation, Direct/unprocessed signal, and center channel level. The method provides for full immersion of the listener in the available sound stage created by the speaker system regardless of whether the system is used for music listening or home theater. One advantage of the current invention is that it provides a single user adjustable tool that controls how and which spatialization sub parameter configurations are used to create the desired amount of surround effect in the sound stage.
The tool is represented as an on-screen display sound stage audio immersion scale (SSA Immersion Scale). The user accesses the scale using a remote controller and the system on-screen display. The SSA Immersion Scale is presented to the user via the on-screen display when the user enters the configuration mode. The user adjusts the configurations of the various sub parameters using a single scale that runs from ‘−10’ (less immersive) to ‘+10’ (more immersive). The SSA Immersion Scale setting is set by the listener and adjusted for listener taste and listening environment. The amount of processing required to achieve the desired immersion effect varies greatly depending on room acoustics and speaker placement.
In an alternative embodiment, the sound immersion scale is embodied in a smart phone application. Current smart phones incorporate a Bluetooth® wireless function that allows for short range pairing and radio frequency communications of electronic devices. In the smart phone embodiment of the current invention, an application is downloaded to the phone. The application essentially allows the user to replace the television's remote control with the smart phone to control the functionality of the speaker system. The speaker includes a Bluetooth® or other short range radio frequency capability and is paired to the smartphone.
Additionally, the smartphone can serve as the display device for digitally recorded movies or as the digital audio player for music, with both digital movies and music downloaded from the internet. Preferably, headphones are used with the smart phone; the audio signal is provided to the headphones from the smart phone either wirelessly or through an audio cable. As the movie or music is played, the phone app performs the function of configuring the sub parameters to provide maximum sound stage immersion.
The SSA Immersion Scale adjusts each sub parameter by use of an algorithm. The SSA Immersion Scale is correlated to each sub parameter and concurrently shifts emphasis on each sub parameter to a configuration that is preferable for either music listening or movie sound track listening, favoring the most useful tool to maximize the sound immersion experience and adjusting the others to work well in tandem.
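One way a single scale could drive several sub parameters concurrently is sketched below in Python. The description does not disclose the algorithm, so the linear blend, the endpoint values, the units, and all names here are assumptions made only to show the idea of one control moving four settings in tandem.

```python
from typing import NamedTuple

SCALE_MIN = -10  # less immersive end of the scale
SCALE_MAX = 10   # more immersive end of the scale


class SubParameters(NamedTuple):
    hrtf_amount: float       # 0.0 (off) to 1.0 (full HRTF processing)
    iac_cancellation: float  # 0.0 (off) to 1.0 (full crosstalk cancellation)
    direct_level: float      # linear gain of the direct/unprocessed signal
    center_level_db: float   # center channel level in dB


# Hypothetical endpoint configurations, not values from the description.
LEAST_IMMERSIVE = SubParameters(0.0, 0.0, 1.0, 0.0)
MOST_IMMERSIVE = SubParameters(1.0, 1.0, 0.2, -3.0)


def sub_parameters_for(scale: int) -> SubParameters:
    """Map one scale setting to all four sub parameters at once."""
    if not SCALE_MIN <= scale <= SCALE_MAX:
        raise ValueError(f"scale must be between {SCALE_MIN} and {SCALE_MAX}")
    # Position along the scale: 0.0 at the least immersive end, 1.0 at the most.
    t = (scale - SCALE_MIN) / (SCALE_MAX - SCALE_MIN)
    # Blend every field linearly between the two endpoint configurations.
    return SubParameters(*(
        low + (high - low) * t
        for low, high in zip(LEAST_IMMERSIVE, MOST_IMMERSIVE)
    ))


if __name__ == "__main__":
    for setting in (-10, 0, 10):
        print(setting, sub_parameters_for(setting))
```

A non-linear or table-driven mapping would fit the same interface; the point of the sketch is only that the user touches one value while every sub parameter is updated together.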
Referring now to
The method of the current invention is described in more detail in
The head related transfer function processing module 320, the interaural crosstalk cancellation module 330, and the direct sound channel 340 each process signals to the front left 7 and front right 8 drivers. The sound source signal to the center channel trim 350 is processed to adjust the trim level 394.
It will be appreciated by one of ordinary skill in the art that a number of known digital signal processing methods and algorithms are suitable for the signal processing steps. For example, DTS, SRS Labs, Dolby, and others have developed well-known digital signal processing methods and algorithms that are suitable.
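The signal flow just described can be sketched in Python as three parallel paths feeding the front drivers plus a separate center trim. How the paths are actually combined is not stated here, so the simple weighted sum, the placeholder stages, the decibel trim, and every name below are assumptions for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Stereo = Tuple[List[float], List[float]]
Stage = Callable[[Sequence[float], Sequence[float]], Stereo]


def passthrough(left: Sequence[float], right: Sequence[float]) -> Stereo:
    """Placeholder stage; a real module would apply its own filtering."""
    return list(left), list(right)


def mix_front_drivers(
    left: Sequence[float],
    right: Sequence[float],
    center: Sequence[float],
    hrtf_stage: Stage = passthrough,   # stands in for module 320
    iac_stage: Stage = passthrough,    # stands in for module 330
    hrtf_gain: float = 1.0,
    iac_gain: float = 1.0,
    direct_gain: float = 1.0,          # direct sound channel 340
    center_trim_db: float = 0.0,       # center channel trim 350
) -> Tuple[List[float], List[float], List[float]]:
    """Sum the three paths per front driver and trim the center channel."""
    hrtf_l, hrtf_r = hrtf_stage(left, right)
    iac_l, iac_r = iac_stage(left, right)
    front_left = [
        hrtf_gain * h + iac_gain * i + direct_gain * d
        for h, i, d in zip(hrtf_l, iac_l, left)
    ]
    front_right = [
        hrtf_gain * h + iac_gain * i + direct_gain * d
        for h, i, d in zip(hrtf_r, iac_r, right)
    ]
    trim = 10.0 ** (center_trim_db / 20.0)  # convert dB to linear gain
    center_out = [trim * s for s in center]
    return front_left, front_right, center_out


if __name__ == "__main__":
    print(mix_front_drivers([1.0, 0.5], [0.25, -1.0], [0.5],
                            hrtf_gain=0.0, iac_gain=0.0, center_trim_db=-6.0))
```

Passing real filter functions for `hrtf_stage` and `iac_stage` would replace the placeholders without changing the mixing structure.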
While the foregoing written description of the invention enables one of ordinary skill to make and use what is considered presently to be the best mode thereof, those of ordinary skill will understand and appreciate the existence of variations, combinations, and equivalents of the specific embodiment, method, and examples herein. The invention should therefore not be limited by the above described embodiment, method, and examples, but by all embodiments and methods within the scope and spirit of the invention.
This application claims the benefit of and is a continuation of U.S. Provisional Patent Application Ser. No. 61/702,728 filed Sep. 18, 2012.
Number | Date | Country
---|---|---
61702728 | Sep 2012 | US