The present invention relates to a method and a processing system for processing an input signal to produce three-dimensional (3D) audio effects. The processing system may be coupled with a plurality of loudspeakers to form an audio system for producing the 3D audio effects.
3D visual content is readily available, for example, in 3D games, 3D movies and 3D TV broadcast. To create a convincing 3D environment, the viewer of the 3D visual content should preferably be able to experience and feel a certain sense of spaciousness (for example, the spaciousness of a typical forest when the viewer is “in” a virtual forest). Preferably, there should be accompanying 3D audio effects that are matched with the 3D visual content, for example, as the viewer is “walking through” the virtual forest. More preferably, the viewer should be able to experience different depths of the audio content.
3D games usually place the player's avatar in the middle of the action, regardless of whether they are first-person or third-person shooter games. To enhance the realism of the gaming experience, 3D sounds are often used extensively with 3D graphics in 3D games. The audio content in a 3D game generally comprises a soundtrack, which in turn comprises ambience sounds and sound effects embedded with audio (or binaural) cues to enhance the realism of the game. For example, the audio content may comprise ambience sounds of a typical room or forest, which may be used when the player's avatar is in a virtual room or forest, and 3D audio cues reflecting sounds of bullets flying towards the player's avatar. The sound effects in 3D games are usually processed with 3D audio techniques such as DirectSound in Windows, allowing game developers to position the sound effects almost anywhere in a virtual space surrounding the player, hence adding another dimension of realism into the games.
Other than gaming applications, there are many other applications in which it is highly desirable to create an auditory experience which allows the user (or listener) to feel that he or she is indeed in a particular environment. Creating such an immersive experience requires that the audio sounds presented to the user provide a certain level of spaciousness and envelopment. The level of spaciousness refers to the extent of space portrayed to the user and may be expressed as the ratio of direct sound to reflections and reverberation. Spaciousness may be achieved using a two-channel (stereo) or a multi-channel (more than two channels) system, although for a two-channel system, the spaciousness and depth dimension of the audio content are usually constrained by the space between the two conventional loudspeakers used in the system. On the other hand, envelopment (i.e. the sensation of being surrounded by sound) is usually only achievable using a multi-channel system. The level of envelopment is usually dependent on the number of loudspeakers in the system and the spacing between these loudspeakers.
As shown in the above examples, both visual and audio cues play important roles in 3D media such as 3D TV broadcast, 3D games and 3D movies. Unfortunately, due to the limitation of conventional loudspeakers, it remains difficult to achieve immersive sounds for 3D media using current audio systems.
Although setting up surround loudspeakers in a multi-channel system may achieve 3D audio effects, this may be problematic in an environment with limited space. In such an environment, a two-channel system is more attractive but its use is usually at the expense of a smaller sound field. Furthermore, head-related transfer functions (HRTFs) are often required to approximate a desired multi-channel sound using a two-channel system. Without personalized HRTFs, there may be problems such as in-head localization and front-back confusion. In addition, using a two-channel system to approximate a multi-channel sound requires good crosstalk cancellation. This limits the performance of this approach since crosstalk cancellation usually requires a good subtraction of two sound fields and tends to be very sensitive to system variations or errors. Moreover, such an approach is sweet spot dependent. Although it may be possible to overcome these problems (i.e. the sweet spot dependency and the need for crosstalk cancellation) by using headphones, this solution is not without issues. For example, discomfort and fatigue may arise after prolonged use of headphones.
Virtual surround sound systems (VSSS) using 3D sound techniques and conventional loudspeakers to create a virtual audio/sound image (i.e. audio/sound effects) have also been developed. However, there is usually a lack of auditory depth in the audio effects produced using such virtual systems. Furthermore, similar to systems which require the use of HRTFs, VSSS are generally sweet spot dependent.
The present invention aims to provide a new and useful processing system and method for processing an input signal to produce 3D audio effects. The processing system may be integrated with a plurality of loudspeakers to form an audio system for producing the 3D audio effects. It may also be integrated with a device for generating or capturing audio signals.
In general terms, the present invention proposes a processing system configured to transmit a first group of components in the input signal to at least one directional loudspeaker and a second group of components in the input signal to at least one conventional loudspeaker. A conventional loudspeaker is defined in this document as a loudspeaker configured to produce a wide dispersion of sound (by “wide”, it is meant that the angle of dispersion of the sound from a conventional loudspeaker is more than 30 degrees) whereas a directional loudspeaker is defined in this document as a loudspeaker configured to produce a directional sound beam (by “directional”, it is meant that the angle of dispersion of the sound from a directional loudspeaker is less than 30 degrees). Furthermore, the directional loudspeaker is typically a parametric loudspeaker generating a modulated ultrasonic wave, whereas a conventional loudspeaker typically does not generate a modulated ultrasonic beam.
More specifically, a first aspect of the present invention is a processing system for processing an input signal to produce three-dimensional audio effects, the processing system comprising: a cue sending path configured to extract a set of binaural cues from the input signal and further configured to send at least a portion of the extracted set of binaural cues to at least one directional loudspeaker for transmission; and an ambience sending path configured to send at least a part of the input signal comprising ambience sounds to at least one conventional loudspeaker for transmission.
A second aspect of the present invention is a method for processing an input signal to produce three-dimensional audio effects, the method comprising the steps of: extracting a set of binaural cues from the input signal and sending at least a portion of the extracted set of binaural cues to at least one directional loudspeaker for transmission; and sending at least a part of the input signal comprising ambience sounds to at least one conventional loudspeaker for transmission.
The present invention is advantageous as it exploits the directivity of directional loudspeakers and the wide dispersive characteristic of conventional loudspeakers. The dispersive nature of the conventional loudspeakers helps to recreate a certain degree of spaciousness and envelopment, whereas the directional loudspeakers are not only useful for 3D sound projection but can also achieve sharper and more vivid auditory spatial images. The directional loudspeakers are also capable of bringing these auditory images closer to the users. Thus, using at least one directional loudspeaker for transmitting a portion of a set of binaural cues extracted from the input signal and using at least one conventional loudspeaker for transmitting a part of the input signal comprising ambience sounds helps to create a highly-focused sound image comprising vivid auditory images close to the users while still projecting the background audio image to the users.
An embodiment of the invention will now be illustrated for the sake of example only with reference to the following drawings, in which:
The audio system 200 serves to produce 3D audio effects. As shown in
The different components of the audio system 200 will now be described in more detail.
The processing system 201 comprises a cue sending path and an ambience sending path. These paths comprise front-end digital audio processing blocks which serve to pre-process the input signal 202.
The cue sending path comprises a cue extraction module in the form of a binaural cue extraction module 204 and is configured to extract a set of binaural cues from the input signal 202 using this binaural cue extraction module 204. The extracted set of binaural cues may comprise only a single binaural cue and may be used to synthesize audio effects. The cue sending path is further configured to send at least a portion, if not the whole, of the extracted set of binaural cues to at least one directional loudspeaker 214 for transmission. This portion of the extracted set of binaural cues to be sent to the at least one directional loudspeaker 214 may be adjusted using a variable gc as shown in
As shown in
In the direct-through mode, the cue sending path is configured to send the portion of the extracted set of binaural cues directly to the directional loudspeakers 214. This mode is usually used when the configuration of the input signal 202 (and hence, the extracted set of binaural cues) matches the configuration of the directional loudspeakers 214 to be used.
On the other hand, the reconfiguration mode is usually used when the configuration of the input signal 202 does not match the configuration of the directional loudspeakers 214 to be used. The cue sending path comprises a reconfiguration module in the form of an Audio Reconfiguration (AR) module 207. This AR module 207 serves to reconfigure the portion of the extracted set of binaural cues to be sent to the directional loudspeakers 214, so as to match the configuration of the directional loudspeakers 214 to be used. For example, if the number of channels in the portion of the extracted set of binaural cues is not the same as the number of directional loudspeakers 214 to be used for transmitting the binaural cues, the AR module 207 is operable to reconfigure this portion of the extracted set of binaural cues by up-mixing or down-mixing it.
If the input signal 202 comprises a plurality of channels, at least a part of the cue sending path may be configured to process each channel of the input signal 202 independently. For example, the binaural cue extraction module 204 may be configured to extract a group of binaural cues from each channel in the input signal 202. Alternatively, binaural cues may be extracted from only a subset of (i.e. not all) the channels in the input signal 202 whereby a group of binaural cues is extracted from each channel in this subset. The cue sending path may be further configured to send at least a portion of each extracted group of binaural cues to the directional loudspeakers 214 for transmission. The portion of each extracted group of binaural cues to be sent to the directional loudspeakers 214 may be adjusted independently (in one example, this portion may range from zero to one (not inclusive of zero)).
The cue sending path of system 201 further comprises a pre-processing module 208 and an amplification module 210 which serve to modulate and amplify the portion of the extracted set of binaural cues (which may comprise portions of different groups of binaural cues extracted from different channels) before sending it to the directional loudspeakers 214 for transmission. In one example, the pre-processing module 208 is configured to modulate the portion of the extracted set of binaural cues onto an ultrasonic carrier signal using a Modified Amplitude Modulation (MAM) technique. The MAM technique is discussed in more detail below and in PCT Patent Application No. PCT/SG2010/000312, the contents of which are herein incorporated by reference. The portion of the extracted set of binaural cues is then amplified in the amplification module 210 before it is sent to the directional loudspeakers 214 for transmission. Note that different channels of the input signal 202 may also be independently processed through the pre-processing module 208 and the amplification module 210.
The ambience sending path of processing system 201 in
In one example, the conventional loudspeakers 212 comprise surround loudspeakers and non-surround loudspeakers. In this example, the ambience sending path is configured to send at least a portion of the set of binaural cues extracted using the binaural cue extraction module 204 to the surround loudspeakers for transmission. These binaural cues may be distributed accordingly among the surround loudspeakers. In this example, the ambience sending path is further configured to send the part of the input signal 202 comprising ambience sounds to the non-surround loudspeakers for transmission.
In another example, the conventional loudspeakers 212 do not comprise any surround loudspeaker and the ambience sending path is configured to send the part of the input signal 202 comprising ambience sounds to all the conventional loudspeakers 212 for transmission. This part of the input signal 202 may be distributed accordingly among the conventional loudspeakers 212.
If the input signal 202 comprises a plurality of channels, at least a part of the ambience sending path may be configured to process each channel of the input signal 202 independently. For example, the ambience extraction unit 205 may be configured to subtract from each channel in the input signal 202, at least a portion of a group of binaural cues extracted from the channel. Alternatively, this subtraction may be performed for only a subset of (i.e. not all) the channels in the input signal 202. The portion of each group of binaural cues to be subtracted from the respective channel in the input signal 202 may be adjusted independently (in one example, this portion may range from zero to one (inclusive of zero)). Note that if this portion is zero for a particular channel, it implies that the subtraction is not performed for the channel i.e. the whole of this channel is sent to the at least one conventional loudspeaker 212 for transmission.
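The per-channel split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the processing system 201: it assumes the group of binaural cues for a channel has already been extracted as a sample sequence, and the names `split_channel`, `g_c` and `g_a` are hypothetical labels for the adjustable portions sent to the directional loudspeakers and subtracted from the channel, respectively.

```python
from typing import List, Tuple


def split_channel(channel: List[float], cue: List[float],
                  g_c: float, g_a: float) -> Tuple[List[float], List[float]]:
    """Split one channel into a directional feed and a conventional feed.

    g_c: portion of the extracted cues sent to the directional loudspeakers,
         assumed to lie in (0, 1].
    g_a: portion of the extracted cues subtracted from the channel before it
         is sent to the conventional loudspeakers, assumed to lie in [0, 1].
    """
    if len(channel) != len(cue):
        raise ValueError("channel and cue must have the same length")
    if not 0.0 < g_c <= 1.0:
        raise ValueError("g_c must be in (0, 1]")
    if not 0.0 <= g_a <= 1.0:
        raise ValueError("g_a must be in [0, 1]")
    # Cue sending path: scaled copy of the extracted cues.
    directional_feed = [g_c * c for c in cue]
    # Ambience sending path: channel minus the chosen portion of its cues.
    ambience_feed = [x - g_a * c for x, c in zip(channel, cue)]
    return directional_feed, ambience_feed
```

With `g_a = 0.0` the ambience feed equals the original channel, which corresponds to the case in which the subtraction is not performed for that channel.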
To accommodate different user requirements, the ambience sending path in the processing system 201 is also operable in two modes: the reconfiguration mode and the direct-through mode. The choice of which mode to use usually depends on the configuration of the input signal 202 and the configuration of the conventional loudspeakers 212.
In the direct-through mode, the ambience sending path is configured to send the extracted part of the input signal 202 comprising ambience sounds directly to the conventional loudspeakers 212. This mode is usually used when the configuration of the input signal 202 (and hence, the extracted part of the input signal 202 comprising ambience sounds) matches the configuration of the conventional loudspeakers 212 to be used for transmitting the extracted part of the input signal 202, for example, when the number of channels n in the input signal 202 is equal to the number of conventional loudspeakers 212 (i.e. n=m) and all the conventional loudspeakers 212 are used for transmitting the extracted part of the input signal 202.
On the other hand, the reconfiguration mode is usually used when the configuration of the input signal 202 does not match the configuration of the conventional loudspeakers 212 to be used for transmitting the extracted part of the input signal 202 (for example, when m≠n). In the reconfiguration mode, the ambience sending path is operable to reconfigure the extracted part of the input signal 202 comprising ambience sounds to match the configuration of the conventional loudspeakers 212 to be used. The ambience sending path comprises a reconfiguration module in the form of an Audio Reconfiguration (AR) module 206 for this purpose. In other words, the AR module 206 is operable to reconfigure the extracted part of the input signal 202 comprising ambience sounds to match the configuration of the conventional loudspeakers 212 to be used. For example, if m≠n (and all m conventional loudspeakers are to be used for transmitting the extracted part of the input signal 202), the AR module 206 serves to reconfigure the extracted part of the input signal 202 by up-mixing or down-mixing it. More specifically, if the input signal 202 is configured for a 5.1 speaker configuration and the conventional loudspeakers 212 belong to a 7.1 speaker configuration (i.e. (n=6)<(m=8)), the extracted part of the input signal 202 may be up-mixed using the AR module 206. Alternatively, if the input signal 202 is configured for a 5.1 speaker configuration and the conventional loudspeakers 212 belong to a 2.1 speaker configuration (i.e. (n=6)>(m=3)), the extracted part of the input signal 202 may be down-mixed using the AR module 206.
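The choice between the two modes can be sketched as a comparison of n and m. The following is a simplified illustration only: the function name `route_ambience` is hypothetical, all channels are assumed to have the same length, and the averaging and cyclic-repetition rules are placeholder assumptions standing in for whatever up-mixing or down-mixing the AR module 206 actually applies.

```python
from typing import List

Signal = List[List[float]]  # one list of samples per channel


def route_ambience(channels: Signal, m: int) -> Signal:
    """Return m loudspeaker feeds from n input channels.

    n == m: direct-through mode, channels are passed on unchanged.
    n >  m: reconfiguration mode, down-mix by averaging channels that share a feed.
    n <  m: reconfiguration mode, up-mix by repeating channels cyclically.
    """
    n = len(channels)
    if n == 0 or m <= 0:
        raise ValueError("need at least one channel and one loudspeaker")
    if n == m:
        return [list(ch) for ch in channels]
    if n < m:
        return [list(channels[k % n]) for k in range(m)]
    # n > m: channel i contributes to feed i % m (placeholder assignment).
    length = len(channels[0])
    feeds = [[0.0] * length for _ in range(m)]
    counts = [0] * m
    for i, ch in enumerate(channels):
        counts[i % m] += 1
        for t, x in enumerate(ch):
            feeds[i % m][t] += x
    return [[x / counts[k] for x in feed] for k, feed in enumerate(feeds)]
```

For a 5.1 input (n = 6), `route_ambience(channels, 8)` returns eight feeds and `route_ambience(channels, 3)` returns three, mirroring the up-mix and down-mix cases in the text.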
If the conventional loudspeakers 212 comprise surround and non-surround loudspeakers as in one of the examples mentioned above, the AR module 206 may be operable to reconfigure the portion of the set of binaural cues to be sent to the surround loudspeakers to match the configuration of the surround loudspeakers. In this case, the part of the input signal 202 comprising ambience sounds may be reconfigured using the AR module 206 to match the configuration of the non-surround loudspeakers.
As mentioned above,
As mentioned above, each of the directional loudspeakers 214 is configured to transmit a signal comprising modulated and amplified binaural cues. As this signal is radiated into a transmission medium (usually, air), it interacts with the transmission medium and self-demodulates to generate a tight column of audible signal. An audible sound beam is thus generated in the transmission medium through a column of virtual audible sources.
The Berktay far-field model as shown in Equation (1) may be used to approximate the above nonlinear sound propagation through the transmission medium. According to Equation (1), the demodulated signal (or audible difference frequency) pressure p2(t) along the axis of propagation is proportional to the second time-derivative of the square of the envelope of the modulated signal (i.e. the signal comprising the modulated and amplified binaural cues). In Equation (1), β is the coefficient of nonlinearity, P0 is the primary wave pressure, a is the radius of the ultrasonic emitter comprised in the directional loudspeaker 214, ρ0 is the density of the transmission medium, c0 is the small signal sound speed, z is the axial distance from the ultrasonic emitter, α0 is the attenuation coefficient at the source frequency and E(t) is the envelope of the modulated signal.
As shown in Equation (1), the nonlinear sound propagation results in a distortion in the demodulated signal p2(t). This in turn results in a distortion in the audible signal generated.
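The body of Equation (1) is not reproduced in this text. For reference, the commonly cited form of Berktay's far-field solution, written with the symbols defined above, is given below. The retarded time τ is a symbol introduced here, and the expression should be read as an assumed reconstruction rather than as the original equation.

```latex
p_2(t) = \frac{\beta P_0^2 a^2}{16\,\rho_0 c_0^4\, z\, \alpha_0}\,
         \frac{\partial^2}{\partial t^2} E^2(\tau),
\qquad \tau = t - \frac{z}{c_0}
```

As a worked example, taking an envelope of the form E(t) = 1 + m g(t), with g(t) the audio input and m assumed to be a modulation index, gives E²(t) = 1 + 2m g(t) + m² g²(t). The term 2m g(t) carries the wanted audio, while the m² g²(t) term produces harmonic and intermodulation components, which is the distortion referred to above.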
The following is a discussion of some prior attempts to reduce the above-mentioned distortion in the demodulated signal. This is followed by an elaboration of the MAM technique which also serves to reduce the above-mentioned distortion.
As shown in
The difference between $\sqrt{E_1(t)}$ and $E_2(t)$ is then used to train the pre-distortion adaptive filter 508 using the least mean square (LMS) scheme. The coefficients $a_m$ of the adaptive filter 508 are obtained using Equations (2) and (3) as follows, wherein β is an adaptive coefficient.
$$a'_m(t) = -2\left(\sqrt{E_1(t)} - E_2(t)\right) x(t-m) \qquad (2)$$
$$a_m(t+1) = a_m(t) + \beta\, a'_m(t) \qquad (3)$$
The output x′(t) of the adaptive filter 508 is shown in Equation (4) as follows.
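Equations (2) and (3) amount to a per-coefficient update, which can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the signs are taken exactly as written in Equations (2) and (3), and the FIR form used in `fir_output` is an assumption, since the body of Equation (4) is not reproduced in this text.

```python
import math
from typing import List


def lms_update(a: List[float], x_window: List[float],
               e1: float, e2: float, beta: float) -> List[float]:
    """One coefficient update following Equations (2) and (3).

    a:        current coefficients a_m(t), m = 0..M-1
    x_window: input samples x(t - m), m = 0..M-1
    e1, e2:   the values E1(t) and E2(t) compared by the training loop
    beta:     adaptive coefficient
    """
    error = math.sqrt(e1) - e2
    grad = [-2.0 * error * x for x in x_window]        # Equation (2)
    return [am + beta * g for am, g in zip(a, grad)]   # Equation (3)


def fir_output(a: List[float], x_window: List[float]) -> float:
    """Filter output, ASSUMED here to be the FIR sum of a_m(t) * x(t - m)."""
    return sum(am * x for am, x in zip(a, x_window))
```

In use, `lms_update` would be called once per sample with the most recent M input samples, and the returned coefficients fed back in on the next call.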
As shown in
with a second carrier signal cos ω0t to produce a compensation signal, and summing the main signal and the compensation signal to generate the output ĝ(t). Note that the first and second carrier signals are orthogonal to each other and that the pre-distortion term is generated by the signal generator 702 whereby the order of the signal generator 702 represents the order of the pre-distortion term it generates. From Equation (5), it can be seen that as compared to a typical DSBAM scheme which merely generates the main signal (1+mg(t))sin ω0t, the output ĝ(t) comprises an additional orthogonal term
The additional pre-distortion term can help to reduce the distortion in the demodulated signal. This is elaborated below. Denoting f1(t)=1+mg(t) and the output of the signal generator 702 as f2(t), the output ĝ(t) of the MAM technique can be written in the form as shown in Equation (6).
In other words, the envelope of the modulation technique output ĝ(t) is $\sqrt{f_1^2(t)+f_2^2(t)}$. According to Berktay's approximation (Equation (1)), the demodulated signal (or audible difference frequency) pressure p2(t) along the axis of propagation is proportional to the second time-derivative of the square of the envelope of the modulated signal. Substituting $\sqrt{f_1^2(t)+f_2^2(t)}$ into Equation (1), Equation (7) is obtained as follows.
Setting $f_2(t)=\sqrt{1-m^2 g^2(t)}$, Equation (7) can be written as follows:
As shown in Equation (8), by setting $f_2(t)=\sqrt{1-m^2 g^2(t)}$, the demodulated signal becomes proportional to the input signal g(t). In other words, the distortion in the demodulated signal is completely removed. However, this holds only if the directional loudspeaker 214 has infinite bandwidth. As this is not the case with practical loudspeakers, the pre-distortion term $f_2(t)=\sqrt{1-m^2 g^2(t)}$ is approximated using its truncated Taylor series
By adjusting the value of q, the order of the pre-distortion term can be varied.
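The equation bodies referred to in this derivation are not reproduced in the text. Under the definitions given above, with f1(t) = 1 + m g(t) and f2(t) the output of the signal generator 702, the steps can be reconstructed as the following sketch. The phase φ(t) is a symbol introduced here, the constant of proportionality is omitted, and the correspondence to the numbered equations is an assumption based on how they are cited in the text.

```latex
% Quadrature sum rewritten as a single carrier with envelope sqrt(f1^2 + f2^2); cf. Equation (6)
\hat{g}(t) = f_1(t)\sin\omega_0 t + f_2(t)\cos\omega_0 t
           = \sqrt{f_1^2(t) + f_2^2(t)}\,\sin\bigl(\omega_0 t + \varphi(t)\bigr),
\qquad \tan\varphi(t) = \frac{f_2(t)}{f_1(t)}

% Berktay's approximation applied to the squared envelope; cf. Equation (7)
p_2(t) \propto \frac{\partial^2}{\partial t^2}\bigl[f_1^2(t) + f_2^2(t)\bigr]

% Ideal pre-distortion term; cf. Equation (8)
f_2(t) = \sqrt{1 - m^2 g^2(t)}
\;\Rightarrow\;
f_1^2(t) + f_2^2(t) = 1 + 2m\,g(t) + m^2 g^2(t) + 1 - m^2 g^2(t) = 2 + 2m\,g(t),
\qquad
p_2(t) \propto 2m\,\frac{\partial^2 g(t)}{\partial t^2}
```

The m² g²(t) term cancels, so the squared envelope is a linear function of g(t) and the demodulated signal contains no component generated by g²(t).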
In the MAM technique, the amount of reduction in the distortion is dependent on the order of the pre-distortion term. A higher order will achieve a greater amount of reduction in the distortion. However, a higher order pre-distortion term requires a loudspeaker with a larger bandwidth. By using a pre-distortion term with a variable order, the flexibility of the modulation technique is increased and the order of the pre-distortion term may be varied to suit the requirements of the directional loudspeakers 214. For example, a lower order may be used for loudspeakers with smaller bandwidths whereas the order may be scaled up for loudspeakers with larger bandwidths to further reduce the distortion in the audio signal output of the audio system 200.
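The trade-off between the order of the pre-distortion term and the residual distortion can be checked numerically. The sketch below is an illustration, not the MAM implementation: it uses a unit-amplitude sine for g(t), approximates sqrt(1 − m²g²) by the first `terms` terms of its Taylor series in u = m²g², and measures how far the squared envelope f1² + f2² deviates from the ideal 2 + 2m g(t). The parameter `terms` is hypothetical, and its relation to the value q in the text is not assumed.

```python
import math
from typing import List


def sqrt_taylor_coeffs(terms: int) -> List[float]:
    """Coefficients c_k of sqrt(1 - u) = sum_k c_k * u**k, first `terms` terms."""
    coeffs = [1.0]
    for k in range(terms - 1):
        # Binomial-series recurrence: c_{k+1} = -c_k * (1/2 - k) / (k + 1)
        coeffs.append(-coeffs[-1] * (0.5 - k) / (k + 1))
    return coeffs


def max_residual(m: float, terms: int, samples: int = 400) -> float:
    """Largest deviation of the squared envelope from 2 + 2*m*g(t).

    g(t) is a unit-amplitude sine sampled over one period; f2 is the truncated
    Taylor series of sqrt(1 - m^2 g^2). A zero residual means the squared
    envelope is exactly linear in g.
    """
    coeffs = sqrt_taylor_coeffs(terms)
    worst = 0.0
    for n in range(samples):
        g = math.sin(2.0 * math.pi * n / samples)
        u = (m * g) ** 2
        f1 = 1.0 + m * g
        f2 = sum(c * u ** k for k, c in enumerate(coeffs))
        worst = max(worst, abs(f1 ** 2 + f2 ** 2 - (2.0 + 2.0 * m * g)))
    return worst
```

Calling `max_residual(0.5, terms)` for `terms` = 1, 2, 3, 4 gives a strictly decreasing residual. With a single term, f2 is constant and the residual equals m²g², the same signal-dependent error as an envelope of 1 + m g(t) alone.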
The following are a few examples of how binaural cues may be extracted from the input signal 202 using the cue extraction module 204. These binaural cues may contain information to be simulated in the virtual environment, such as the azimuth between the listener and the virtual sound source, the angle of elevation between the listener and the virtual sound source and the distance between the listener and the virtual sound source.
In one example, the binaural cues are extracted by detecting and extracting transient events from the input signal 202. This may be performed in real-time or by post-processing a segment of the input signal 202. Furthermore, the detection and extraction of the transient events may be carried out in the time domain by repeatedly detecting an onset of (for example, an increase in) signal power in the input signal 202.
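A time-domain onset detector of the kind described above can be sketched as follows. This is one simple possibility, not the method of the cue extraction module 204: the function name `detect_onsets`, the frame length, the power ratio and the power floor are all assumed values chosen for illustration.

```python
from typing import List


def detect_onsets(signal: List[float], frame: int = 64,
                  ratio: float = 4.0, floor: float = 1e-6) -> List[int]:
    """Return start indices of frames whose power jumps relative to the previous frame.

    frame: frame length in samples (assumed value)
    ratio: power increase factor required to flag an onset (assumed value)
    floor: minimum reference power, so that tiny fluctuations in
           near-silence are not flagged (assumed value)
    """
    onsets = []
    prev_power = None
    for start in range(0, len(signal) - frame + 1, frame):
        block = signal[start:start + frame]
        power = sum(x * x for x in block) / frame
        if prev_power is not None and power > ratio * max(prev_power, floor):
            onsets.append(start)
        prev_power = power
    return onsets
```

Because each frame is compared only with the one before it, the same loop can run in real time on incoming frames or over a stored segment of the input signal.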
In another example, the binaural cues are extracted by performing a time-frequency transform in which components of the input signal 202 from a left channel, L, components of the input signal 202 from a right channel, R, and a signal M, whereby M = 0.5(L + R), are compared against each other. This method may be used to extract the binaural cues from the input signal 202 even if the input signal 202 is a multi-channel audio signal (i.e. it comprises more than just the left and right channels). This is because the remaining channels in the input signal 202 are usually surround channels comprising mainly ambience sounds with no or very few binaural cues and thus may be ignored. However, more advanced techniques using more than two channels of the input signal 202 may be employed for the cue extraction.
Besides the two examples mentioned above, other techniques may be employed for the extraction of binaural cues from the input signal 202. For example, the binaural cues may be extracted using a short time Fourier Transform as described in reference [1].
The audio system 200 may be implemented using a sub-band approach for an input signal 202 comprising a plurality of frequency bands. In the sub-band approach, at least a part of the cue sending path and/or the ambience sending path is configured to process each frequency band of the input signal 202 independently. For example, the cue extraction module 204 may use a time-frequency transform which can be implemented using a sub-band cue extraction algorithm. If the input signal 202 comprises a plurality of channels, and each channel of the input signal 202 comprises a plurality of frequency bands, at least a part of the cue sending path and/or ambience sending path may be configured to process each frequency band of each channel independently.
Most prior art systems are based on a single-band approach, whereby a single pre-processing method and modulation technique is applied to the entire frequency range of the input signal. However, different ultrasonic emitters comprised in different loudspeakers usually have different frequency responses that are preferably individually addressed in order to achieve an accurate reproduction of directional sound with minimum distortion. Hence, the sub-band approach is advantageous [2] since different loudspeakers may be employed for different frequency bands, with each frequency band processed differently to suit the respective loudspeaker. This helps to optimize the performance of each frequency band and in turn, helps to improve the performance of the audio system 200.
Furthermore, although the MAM technique may be used with both the sub-band and full-band approaches, the advantages of the MAM technique can be better exploited with the sub-band approach. As mentioned above, a higher order pre-distortion term in the MAM technique will achieve a greater amount of reduction in the distortion but will require a loudspeaker with a larger bandwidth (which is generally more expensive). The sub-band approach allows the use of different types of loudspeakers in the same system, thus allowing the use of cheaper loudspeakers with lower bandwidths for frequency bands which are less important. This in turn lowers the cost of the audio system 200.
In addition, using the sub-band approach, the input signal 202 may be down-sampled, thus lowering and varying the speed requirement for processing each frequency band and in turn lowering the speed requirement for processing the entire signal. This mixed-rate processing technique thus removes the need for high-end processors, so that a low-cost digital signal processor can be used instead to implement the processing system 201.
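The combination of band splitting and down-sampling can be illustrated with the simplest two-band case. The sketch below uses a Haar analysis and synthesis pair, which is a technique swapped in here for illustration; the text does not specify which filter bank the processing system 201 uses, and the function names are hypothetical.

```python
from typing import List, Tuple


def haar_analysis(x: List[float]) -> Tuple[List[float], List[float]]:
    """Split x into a low band and a high band, each at half the input sample rate."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    low = [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    return low, high


def haar_synthesis(low: List[float], high: List[float]) -> List[float]:
    """Recombine the two half-rate bands into the full-rate signal."""
    out = []
    for lo, hi in zip(low, high):
        out.extend((lo + hi, lo - hi))
    return out
```

Each band holds half as many samples as the input, so any per-band processing inserted between `haar_analysis` and `haar_synthesis` runs at half the original rate. Applying the split again to one of the bands yields more bands at still lower rates.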
Also, more variations may be made to the processing system 201 using the sub-band approach (for example, the number of frequency bands, the processing technique for each frequency band etc. may be varied), allowing manufacturers of the processing system 201 and the audio system 200 to differentiate their products in terms of pricing and applications.
Integration of Processing System 201 with Different Types of Systems
The processing system 201 may be integrated with different types of systems having different loudspeaker configurations.
In one example, the input signal 202 is selected to have a configuration matching the loudspeaker configuration the processing system 201 is to be integrated with. In this example, the ambience sending path of the processing system 201 is configured to operate in the direct-through mode. In another example, the configuration of the input signal 202 does not match the loudspeaker configuration and the ambience sending path of the processing system 201 is configured to operate in the reconfiguration mode. As mentioned above, in the reconfiguration mode, the AR module 206 is operable to reconfigure the part of the input signal 202 comprising ambience sounds to match the configuration of the conventional loudspeakers 212 to be used for sending this part of the input signal 202. This may be performed without user intervention (for example, by automatically detecting the configuration of the conventional loudspeakers 212) or with slight user intervention via a user interface (e.g. a screen) used to input the configuration of the conventional loudspeakers 212 into the processing system 201. The term “automatic” is used in this document to mean that although human interaction may initiate a process, human interaction is not required while the process is being carried out.
The processing system 201 may further comprise a video tracking module which is configured to track the user's position and/or head movements. In one example, the audio system 200 further comprises a steering mechanism coupled with each of the directional loudspeakers 214 for steering the sound beam from the directional loudspeaker 214. The steering mechanism may comprise mechanical motors, electric motors and/or beam steering circuits and may be configured to cooperate with the video tracking module of the processing system 201 to steer the sound beams from the directional loudspeakers 214 according to the user's position and/or head movements. In one example, a small mechanical motor is built into each of the directional loudspeakers 214 and the directional loudspeakers 214 are rotated to face the user. Due to the highly directional nature of the sound beam from a directional loudspeaker, the sound beams from the loudspeakers 214 are thus directed to the user in this example.
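For the motor-based example, the rotation needed to face a tracked listener reduces to simple geometry. The sketch below is an illustration under stated assumptions: the function name `steering_angles` is hypothetical, and the shared room coordinate frame, its units and the pan/tilt conventions are assumptions rather than properties of the video tracking module or the steering mechanism.

```python
import math
from typing import Tuple


def steering_angles(speaker: Tuple[float, float, float],
                    listener: Tuple[float, float, float]) -> Tuple[float, float]:
    """Return (pan, tilt) in degrees that point a loudspeaker at the listener.

    Coordinates are (x, y, z) in metres in a shared room frame (assumed):
    pan is measured in the x-y plane from the +x axis,
    tilt is measured upward from the x-y plane.
    """
    dx = listener[0] - speaker[0]
    dy = listener[1] - speaker[1]
    dz = listener[2] - speaker[2]
    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return pan, tilt
```

Recomputing the angles whenever the tracked position changes keeps the narrow sound beam on the listener; with several listeners, each directional loudspeaker could be given a different target position.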
The above-mentioned head-tracking feature of the audio system 200 is advantageous as it can present the same audio experience to the user regardless of the user's head movements. Furthermore, using this head-tracking feature, multiple sweet spots may be created to support a multi-listener auditory experience, providing the user with the same or similar audio experience at different locations.
The advantages of the audio system 200 are as follows.
In a multi-channel setup, the degree of audio imaging (mainly the sound effects) and the spaciousness provided by the audio sounds are usually dependent on the directivity (i.e. directional characteristic) of loudspeakers used in the setup.
The audio system 200 employs both directional loudspeakers 214 and conventional loudspeakers 212, and thus is able to exploit both the directivity of directional loudspeakers and the wide dispersive characteristic of conventional loudspeakers. This helps to avoid the auditory spatial imaging issues, as discussed above with reference to
The use of directional loudspeakers 214 in the audio system 200 is particularly advantageous. Transaural audio beam projection using an audio beam system (ABS) employing directional loudspeakers has been shown to be well suited for projecting 3D sound. Furthermore, studies based on several objective measurements and informal listening tests show that directional loudspeakers are not only useful for 3D sound projection but can also bring auditory spatial images closer to the listeners. It has also been shown that auditory spatial images are sharper and more vivid when directional loudspeakers are used. These enhancements in the auditory spatial images are highly desirable in 3D games, and provide gamers with a more immersive gaming experience. The audio system 200 is hence advantageous since it exploits the strengths of directional loudspeakers 214 to enhance the auditory experience in, for example, gaming and entertainment applications.
In particular, the directional loudspeakers 214 in the audio system 200 serve to transmit binaural cues selectively extracted from the audio channels of the input signal 202, whereas the conventional loudspeakers 212 serve to transmit the background audio image (i.e. the ambience sounds). The dispersive nature of the conventional loudspeakers 212 helps to recreate a certain degree of spaciousness and envelopment in the ambience sounds, especially when more channels of the input signal 202 are used. The use of the directional loudspeakers 214 and the conventional loudspeakers 212 in this manner helps to create a highly-focused sound image comprising vivid auditory images close to the users while still projecting the background audio image to the users. In other words, the audio system 200 is able to provide both ambient effects (or surround sound effects) and sound depth reproduction. Thus, the audio system 200 is capable of achieving better auditory depth in, for example, gaming and movie viewing as compared to conventional surround sound systems.
The selective extraction of binaural cues for transmission via the directional loudspeakers 214 is advantageous as compared to prior art systems such as the one disclosed in U.S. Pat. No. 6,229,899 (as illustrated in
Furthermore, in the processing system 201, binaural cues may be subtracted from the input signal 202 to extract the part of the input signal 202 to be sent to the conventional loudspeakers 212 for transmission. This is advantageous as it prevents the resultant audio output from being over-processed due to the over-emphasis of cues (since extracted cues are already transmitted via the directional loudspeakers 214). This advantage applies especially when down-mixing of the part of the input signal to be sent to the conventional loudspeakers 212 is performed.
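As a minimal sketch of the subtraction described above, and assuming that the extracted cue signal is available as a sample sequence time-aligned with the corresponding channel of the input signal 202, the part sent to the conventional loudspeakers 212 may be formed sample by sample. The function name `subtract_cues` and the list-of-samples representation are hypothetical and are used for illustration only.

```python
def subtract_cues(channel, cues):
    """Return the channel with the extracted cue signal removed.

    channel: samples of one channel of the input signal.
    cues: samples of the cue signal extracted from that channel
          (assumed time-aligned and of equal length).
    """
    if len(channel) != len(cues):
        raise ValueError("channel and cue signals must have the same length")
    # The residual carries the background (ambience) content only, so the
    # cues are not emphasized a second time by the conventional loudspeakers.
    return [x - c for x, c in zip(channel, cues)]


if __name__ == "__main__":
    channel = [0.5, 0.25, -0.75]
    cues = [0.5, 0.0, -0.25]
    print(subtract_cues(channel, cues))
```

Under this assumption, the cue signal and the residual sum back to the original channel, so no content is reproduced twice when the two types of loudspeaker play together.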
In addition, the processing system 201 may be integrated with a user's existing surround loudspeaker system without replacing the surround loudspeakers with directional loudspeakers. Furthermore, the processing system 201 is configured such that it can be integrated with almost any loudspeaker configuration. Hence, it is capable of enhancing the audio output of many systems with different loudspeaker configurations (which may comprise stereo channels or multiple channels). Furthermore, as shown in
Furthermore, the processing system 201 employs the MAM technique which helps to overcome the high distortion normally found in the audio output of directional loudspeakers. In addition, the audio system 200 may be implemented using a sub-band approach whose advantages have been discussed above. The audio system 200 may also be implemented using a multi-channel approach whereby each channel of the input signal 202 is configured to be processed independently. Hence, each channel of the input signal 202 can employ a different loudspeaker and/or a different processing technique optimized for the channel.
The audio system 200 is also advantageous as compared to prior art systems such as the virtual surround sound system (VSSS) which uses 3D sound techniques to create a virtual sound image. Using the VSSS often results in a lack of auditory depth. In contrast, the audio system 200 achieves good auditory depth and creates vivid auditory images close to the users, hence adding a new dimension in sound projection that is currently not found in most other commercial systems.
The high-definition graphics in today's gaming platforms have brought a new level of realism to gamers. Due to the above advantages, the audio system 200 is able to enhance the level of realism in these gaming platforms by providing them with surround sound and accurate audio projection. This is crucial in completing the gaming experience. Furthermore, many of the current (and probably, next generation) interactive games, such as the widely popular Wii games, Kinect for XBOX360 and Move controller for Playstation 3, require users to interact with items or characters in the games via body movements. These gaming products are usually designed for a group of gamers (possibly up to four) within close proximity to one another. However, even though these gaming products emphasize the interactive multi-player gaming experience, it is difficult to deliver personalized audio information to each gamer. The audio system 200 can be used to solve this problem as it is capable of delivering personalized cues/sound effects to each gamer via the directional loudspeakers 214. Thus, it can enhance the interactive multi-player gaming experience and allow two or more gamers within close proximity to have a co-operative gaming session without the need for headphones. The gamers are thus able to communicate directly with each other and problems (such as fatigue) related to prolonged usage of headphones may be avoided.
The following summarizes a few key advantages provided by the audio system 200:
1. The sound effects produced by the audio system 200 are closer to the user as compared to many prior art systems. These sound effects are also sharp and highly accurate. Despite this, the audio system 200 is still able to provide sufficient spaciousness and envelopment for ambience sounds through the conventional loudspeakers 212.
2. The audio system 200 removes the need for headphones and thus, is not faced with problems associated with the use of headphones, for example, in-the-head problems and front-back confusion problems.
3. The processing system 201 of the audio system 200 may be integrated with different loudspeaker configurations as it comprises an AR module 206 which is operable to reconfigure its input to match the configuration of the conventional loudspeakers 212.
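One possible form of the reconfiguration mentioned in item 3 is a down-mix from five full-range channels to two. The following sketch assumes a commonly used convention in which the centre and surround channels are attenuated by about 3 dB (a gain of 1/sqrt(2)) before being added to the left and right channels. The function name `downmix_to_stereo`, the channel names and the choice of gain are assumptions made for illustration and are not stated to be the method used by the AR module 206.

```python
import math


def downmix_to_stereo(left, right, centre, surround_left, surround_right):
    """Down-mix five channels of equal length to a (left, right) pair.

    Assumed convention: centre and surround channels are scaled by
    1/sqrt(2) (about -3 dB) before being summed into each output.
    """
    channels = (left, right, centre, surround_left, surround_right)
    if any(len(ch) != len(left) for ch in channels):
        raise ValueError("all channels must have the same length")
    g = 1.0 / math.sqrt(2.0)
    out_left = [l + g * c + g * s
                for l, c, s in zip(left, centre, surround_left)]
    out_right = [r + g * c + g * s
                 for r, c, s in zip(right, centre, surround_right)]
    return out_left, out_right


if __name__ == "__main__":
    # A centre-only sample appears equally in both outputs.
    print(downmix_to_stereo([0.0], [0.0], [1.0], [0.0], [0.0]))
```

Other loudspeaker configurations would call for a different mixing matrix; the sketch only illustrates the general idea of matching the channel count of the input to the loudspeakers available.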
Furthermore, the audio system 200 may be used in a variety of commercial applications. These applications include for example:
The audio system 200 may also be used for making sound systems, consumer electronics and various products in the entertainment industry.
Further variations are possible within the scope of the invention as will be clear to a skilled reader.
For example, although the processing system 201 in
This is a continuation of U.S. application Ser. No. 13/516,898, filed Jun. 18, 2012, now abandoned, which is a 371 of International Application No. PCT/SG2011/000027, filed Jan. 19, 2011, which claims the benefit of U.S. Provisional Application No. 61/296,187, filed Jan. 19, 2010.
Number | Date | Country
---|---|---
61296187 | Jan 2010 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 13516898 | Jun 2012 | US
Child | 15051599 | | US