This application is a 371 of international application of PCT application serial no. PCT/CN2018/120990, filed on Dec. 14, 2018, which claims the priority benefit of China application no. 201811297263.8, filed on Nov. 1, 2018. The entirety of each of the above mentioned patent applications is hereby incorporated by reference herein and made a part of this specification.
The present invention relates to the field of electroacoustic technologies, and more particularly to a method of virtual reproduction for multichannel spatial surround sound in three-dimensional space.
At present, multichannel surround sound technology has been evolving from traditional horizontal surround sound to spatial surround sound, and has been applied to movies, domestic audio and video reproduction, and other fields. In traditional horizontal surround sound, loudspeakers are arranged on a horizontal plane. For example, the domestic 5.1-channel surround sound recommended by the International Telecommunication Union involves five loudspeakers with full audible bandwidth, including left L, centre C, and right R loudspeakers at the front of the horizontal plane, as well as left-surround LS and right-surround RS loudspeakers at the rear sides of the horizontal plane, and an optional subwoofer. Compared with horizontal surround sound, spatial surround sound greatly improves the spatial perceptual performance, but is more complex, requires more loudspeakers in reproduction, and usually utilizes layer-wise loudspeaker configurations. For example, the 9.1-channel spatial surround sound consists of a horizontal-layer and an upper (high)-layer loudspeaker configuration (i.e., nine loudspeakers with full audible bandwidth), as well as an optional subwoofer. The arrangement of the five loudspeakers with full audible bandwidth in the horizontal layer is identical to that of the 5.1-channel surround sound recommended by the International Telecommunication Union. The four loudspeakers with full audible bandwidth in the upper layer are arranged above the left-front, right-front, left-surround and right-surround loudspeakers in the horizontal plane, respectively.
On the other hand, in some practical uses such as a TV set, it is inconvenient to arrange multiple loudspeakers for multichannel spatial surround sound reproduction due to the limitations of room conditions. Therefore, virtual reproduction methods have been used to reduce the number of loudspeakers. The principle of virtual reproduction is to first process the multichannel surround sound signals with head related transfer functions (HRTFs), then mix them down to fewer channel signals, and finally reproduce them with fewer actual loudspeakers, so as to achieve an effect similar to that of multichannel surround sound while simplifying the reproduction system.
For the reproduction of the 5.1-channel horizontal surround sound, foreign countries have developed patented technologies and products (such as SRS, Qsurround, Dolby, etc.) of virtual reproduction with two front loudspeakers, but these have some general defects, especially a narrow listening area and a change in timbre. In the national granted invention patent (patent number: CN1219415C), the problems of the narrow listening area and the change in timbre in the past technology are overcome, and the head related transfer function filters are also simplified. These patented technologies can be used in the reproduction of the horizontal surround sound in the TV set, and implemented by a pair of left and right loudspeakers arranged on two sides of the TV set, or by a bar-shaped loudspeaker system integrating left and right channels (called a “Sound Bar”) that is arranged above (or below) the TV set.
The technology of virtual reproduction for spatial surround sound with two front loudspeakers has been developed internationally. Although a structure of this kind of technology is simple, due to the limitation of a physical principle of the virtual reproduction with two actual loudspeakers, this kind of technology is only able to recreate virtual spatial sound effect in the front-horizontal quadrants, but cannot generate a stable spatial surround sound effect in a vertical direction.
The present invention further provides a method of virtual reproduction for multichannel spatial surround sound in three-dimensional space. In the method, spatial sound signals are converted into four-channel signals by processing with HRTF (Head Related Transfer Function), and reproduced by four actual loudspeakers arranged in the directions of left-front, right-front, left-front-up and right-front-up. In practical application, this four-loudspeaker arrangement can be implemented by a pair of bar-shaped loudspeaker systems arranged above and below a TV set respectively, or a pair of bar-shaped loudspeaker systems vertically arranged on left and right sides of the TV set respectively, or a pair of loudspeakers arranged on the left and right sides of the TV set respectively and one bar-shaped loudspeaker system arranged above the TV set.
The method of virtual reproduction for multichannel spatial surround sound in three-dimensional space according to the present invention includes the following steps:
Further, in the step 10, the sum and difference operation is carried out on the total sum signal ESUM and the total difference signal EDIF of the horizontal layer, and the signals are attenuated by −3 dB, that is, multiplied by 0.7, to respectively obtain the reproduced signals EL1=0.7 (ESUM+EDIF) and ER1=0.7 (ESUM−EDIF) of the actual loudspeakers at left-front and right-front directions, and the signals are fed to the corresponding actual loudspeakers for reproduction.
Further, in the step 11, the sum and difference operation is carried out on the total sum signal E′SUM and the total difference signal E′DIF of the upper layer, and the signals are attenuated by −3 dB, that is, multiplied by 0.7, to respectively obtain the reproduced signals EL2=0.7 (E′SUM+E′DIF) and ER2=0.7 (E′SUM−E′DIF) of the actual loudspeakers at left-front-up and right-front-up directions, and the signals are fed to the corresponding actual loudspeakers for reproduction.
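By way of a non-limiting illustration, the relation between the stated −3 dB attenuation and the coefficient 0.7 can be checked numerically. The sketch below is not part of the source; the function name is hypothetical, and the power argument in the comments is one plausible reading of why −3 dB is used, offered as an assumption.

```python
import math

def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

gain_exact = db_to_gain(-3.0)   # about 0.708
gain_used = 0.7                 # rounded value quoted in the text

# Assumption for illustration: a signal present only in the sum path is
# fed equally to the left and right loudspeakers, so its total power is
# scaled by 2 * gain^2, which stays close to 1 for a gain near 0.707.
power_scale = 2.0 * gain_used ** 2

# Level difference between the rounded and the exact coefficient, in dB.
level_error_db = 20.0 * math.log10(gain_used / gain_exact)
```

Under these assumptions the rounded coefficient differs from an exact −3 dB gain by roughly a tenth of a decibel.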
Further, the filtering with the M/2 virtual reproduction signal processing functions Σ1,2, Σ3,4, . . . ΣM−1,M in the step 6, and the filtering with the M/2 virtual reproduction signal processing functions Δ1,2, Δ3,4 . . . ΔM−1,M in step 7 are carried out according to the virtual reproduction signal processing functions obtained by the following equations:
Where HL (θm, f) and HR (θm, f) are a pair of Head Related Transfer Functions (HRTFs) from virtual loudspeakers in a direction of θm of the horizontal plane to left and right ears, wherein f is a frequency; and α1=α1 (f) and β1=β1(f) are HRTFs from actual loudspeaker at horizontal left-front or right-front to the ipsilateral and contralateral ears, respectively.
Further, the filtering with the M′/2 signal processing functions Σ′1,2, Σ′3,4, . . . Σ′M′−1,M′ in the step 8, and the filtering with the M′/2 signal processing functions Δ′1,2, Δ′3,4, . . . Δ′M′−1,M′ in the step 9 are carried out according to signal processing functions obtained by the following equations:
Where HL(θ′m′, f) and HR(θ′m′, f) are a pair of Head Related Transfer Functions (HRTFs) from virtual loudspeakers in a direction of θ′m′ of the horizontal plane to left and right ears; and α2=α2(f) and β2=β2(f) are HRTFs from the actual loudspeaker at the left-front-up (or right-front-up) direction to the ipsilateral and contralateral ears, respectively.
A principle of the present invention is that: according to a basic theory of virtual reproduction, the left-front and right-front actual loudspeakers arranged on a certain elevation plane may generate multiple virtual loudspeakers on front quadrants of the elevation plane. Four actual loudspeakers are used for virtual reproduction, wherein two loudspeakers are respectively arranged at left-front and right-front directions of the horizontal plane, and the other two loudspeakers are respectively arranged at left-front-up and right-front-up directions of the high elevation plane. Virtual reproduction signal processing may generate virtual loudspeakers on a front quadrant of the horizontal layer and the upper layer for multichannel spatial surround sound, thus generating a surround sound effect in three-dimensional space, including an effect in a vertical direction. In practical application, a pair of horizontal bar-shaped loudspeaker systems arranged above and below a TV set respectively, or a pair of vertical bar-shaped loudspeaker systems arranged on left and right sides of the TV set respectively, or a pair of loudspeakers arranged on the left and right sides of the TV set respectively and one horizontal bar-shaped loudspeaker system arranged above the TV set are all equivalent to a combination of a pair of left-front and right-front actual loudspeakers of the horizontal plane and a pair of left-front-up and right-front-up actual loudspeakers of the high elevation plane, so that the present invention may be implemented.
Compared with the prior art, the present invention has the following advantages and beneficial effects.
Independent original signals of the multichannel spatial surround sound are virtually processed and then reproduced by four actual loudspeakers arranged at the left-front and right-front directions of the horizontal plane and the left-front-up and right-front-up directions of the high elevation plane. Compared with the original loudspeaker arrangement of the multichannel spatial surround sound, the hardware structure is simpler, while the spatial surround sound effect, including the effect in the vertical direction, can still be generated.
The arrangement of the loudspeakers in the present invention is suitable for the TV set and other video reproduction applications.
The present invention is compatible with virtual reproduction of traditional 5.1-channel surround sound by two loudspeakers.
The present invention may be designed as special hardware or general software for sound reproduction in a digital television, a home theater, and the like, and may also be used as hardware or software for sound reproduction in a multimedia computer.
The present invention is further described in detail hereinafter with reference to the accompanying drawings and the embodiments, but the implementation and the scope of protection of the present invention are not limited to this.
According to the method of virtual reproduction for multichannel spatial surround sound in three-dimensional space in the embodiment, loudspeakers are arranged first, and coordinates are selected as an elevation of −90°≤ϕ≤90° and an azimuth of −180°≤θ≤180°, wherein ϕ=−90°, 0°, and 90° respectively represent below, the horizontal plane, and above; and on the horizontal plane, θ=0°, 90°, and 180° respectively represent front, left, and back.
φL1=φR1=0° (1)
For practical use in the TV set, the span angle between the left-front and right-front loudspeakers is smaller than the standard 60° and usually varies between 20° and 30°. Therefore, azimuths of the left-front and right-front loudspeakers are:
θL1=10°˜15° θR1=−10°˜−15°. (2)
The left-front-up and right-front-up loudspeakers are arranged at a position above the horizontal plane, and elevations are:
φL2=φR2=30°±15°. (3)
Moreover, azimuths of the left-front-up and right-front-up loudspeakers are:
θL2=10°˜15° θR2=−10°˜−15°. (4)
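The angle convention and the loudspeaker directions of equations (1) to (4) can be made concrete with a short sketch. The Cartesian axis naming (x to the front, y to the left, z upward), the function name, and the choice of 15° and 30° from within the stated ranges are assumptions for illustration only.

```python
import math

def direction_to_unit_vector(azimuth_deg, elevation_deg):
    """Map (azimuth, elevation) in degrees to a unit vector.

    Assumed axes: x points front, y points left, z points up, so that
    azimuth 0/90/180 is front/left/back and elevation 90 is above.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

# Four actual loudspeakers as (azimuth, elevation); the values are picked
# from the ranges given in equations (1) to (4).
ACTUAL_LOUDSPEAKERS = {
    "L1": (15.0, 0.0),
    "R1": (-15.0, 0.0),
    "L2": (15.0, 30.0),
    "R2": (-15.0, 30.0),
}

positions = {name: direction_to_unit_vector(az, el)
             for name, (az, el) in ACTUAL_LOUDSPEAKERS.items()}
```

With this convention the left and right loudspeakers of each pair are mirror images about the x-z plane, which is the left-right symmetry that the later simplification relies on.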
There are various multichannel spatial surround sound formats, which generally include channel signals and loudspeaker arrangements on two layers (a horizontal layer and an upper layer) as shown in
M non-front and non-back channel signals of the horizontal layer of the multichannel spatial surround sound are processed with virtual reproduction signal processing functions, and undergo a sum and difference operation to obtain a total sum signal $E_{\mathrm{SUM}}=\Sigma_{1,2}(E_1+E_2)+\Sigma_{3,4}(E_3+E_4)+\dots+\Sigma_{M-1,M}(E_{M-1}+E_M)+E_{M+1}+E_{M+2}$ of the horizontal layer and a total difference signal $E_{\mathrm{DIF}}=\Delta_{1,2}(E_1-E_2)+\Delta_{3,4}(E_3-E_4)+\dots+\Delta_{M-1,M}(E_{M-1}-E_M)$ of the horizontal layer, and then mixed with front or back signals (if they exist) of the horizontal layer which are attenuated by −3 dB (multiplied by a coefficient of 0.7), and then the signals are fed to the left-front and right-front actual loudspeakers. According to a condition that binaural sound pressures of virtual reproduction are equal to that of actual reproduction, and a power spectrum of reproduction signals is unchanged, signals reproduced by a pair of left-front and right-front actual loudspeakers are as follows:
wherein the virtual reproduction signal processing functions are given by the following equations:
Where HL (θm, f) and HR (θm, f) are a pair of Head Related Transfer Functions (HRTFs) from virtual loudspeakers in a direction of θm in the horizontal plane to left and right ears, wherein f is a frequency; suppose that the left and right actual loudspeaker of the horizontal plane are left-right symmetric relative to a listener, and α1=α1 (f) and β1=β1(f) are HRTFs (ipsilateral and contralateral HRTFs) from actual loudspeaker at left-front (or right-front) to the ipsilateral and contralateral ears, respectively.
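The binaural pressure-matching condition mentioned above can be sketched generically for one virtual loudspeaker and a symmetric pair of actual loudspeakers. This is the standard two-loudspeaker matching condition only; it omits the unchanged-power-spectrum condition that the text also imposes, so it should not be read as the processing functions of this specification. The function name and all numerical values are hypothetical.

```python
import numpy as np

def pressure_matching_feeds(h_left, h_right, alpha, beta):
    """Per-frequency feeds (g_l, g_r) of a symmetric loudspeaker pair.

    Solves  alpha*g_l + beta*g_r = h_left   (left ear)
            beta*g_l + alpha*g_r = h_right  (right ear)
    where alpha/beta are the ipsilateral/contralateral HRTFs of the
    actual loudspeakers and h_left/h_right are the HRTFs of the
    virtual direction.
    """
    h_left, h_right, alpha, beta = (
        np.asarray(v, dtype=complex) for v in (h_left, h_right, alpha, beta))
    det = alpha ** 2 - beta ** 2
    g_l = (alpha * h_left - beta * h_right) / det
    g_r = (alpha * h_right - beta * h_left) / det
    return g_l, g_r

# Made-up complex responses at two frequency bins (not measured data).
alpha = np.array([1.0 + 0.0j, 0.9 + 0.0j])
beta = np.array([0.5 + 0.0j, 0.0 + 0.3j])
h_l = np.array([0.8 + 0.0j, 0.6 + 0.1j])
h_r = np.array([0.4 + 0.0j, 0.2 + 0.0j])

g_l, g_r = pressure_matching_feeds(h_l, h_r, alpha, beta)
ear_left = alpha * g_l + beta * g_r    # reproduces h_l
ear_right = beta * g_l + alpha * g_r   # reproduces h_r
```

In this generic form the sum of the two feeds equals (h_left + h_right)/(alpha + beta) and their difference equals (h_left − h_right)/(alpha − beta), which is why a sum and difference structure is a natural fit for left-right symmetric arrangements.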
M′ non-front and non-back channel signals of the upper layer of the multichannel spatial surround sound are processed with virtual reproduction signal processing functions, and undergo a sum and difference operation to obtain a total sum signal $E'_{\mathrm{SUM}}=\Sigma'_{1,2}(E'_1+E'_2)+\Sigma'_{3,4}(E'_3+E'_4)+\dots+\Sigma'_{M'-1,M'}(E'_{M'-1}+E'_{M'})+E'_{M'+1}+E'_{M'+2}$ of the upper layer and a total difference signal $E'_{\mathrm{DIF}}=\Delta'_{1,2}(E'_1-E'_2)+\Delta'_{3,4}(E'_3-E'_4)+\dots+\Delta'_{M'-1,M'}(E'_{M'-1}-E'_{M'})$ of the upper layer, and then mixed with front or back signals (if they exist) of the upper layer which are attenuated by −3 dB (multiplied by a coefficient of 0.7), and then the signals are fed to left-front-up and right-front-up actual loudspeakers. According to a condition that binaural sound pressures of virtual reproduction are equal to that of actual reproduction, and a power spectrum of reproduction signals is unchanged, signals reproduced by a pair of left-front-up and right-front-up actual loudspeakers are as follows:
wherein the virtual reproduction signal processing functions are given by the following equations:
Where HL(θ′m′, f) and HR(θ′m′, f) are a pair of Head Related Transfer Functions (HRTFs) from virtual loudspeakers in a direction of θ′m′ in the horizontal plane to left and right ears; and α2=α2(f) and β2=β2(f) are HRTFs (ipsilateral and contralateral HRTFs) from actual loudspeaker at left-front-up (or right-front-up) to the ipsilateral and contralateral ears, respectively.
Generally, the loudspeaker arrangement in the multichannel spatial surround sound is left-right symmetric. Signal processing can be simplified by exploiting this symmetry. M non-front and non-back channel signals of the horizontal layer are numbered according to a rule that an odd number represents a left half-space channel and an even number represents a symmetric right half-space channel. Then the responses of the virtual signal processing function in equation (6) satisfy the following symmetric relationship:
Then, the signal processing of the equation (5) is equivalent to following equation (10):
The sum in equation (10) is over all odd numbers m, and
Σm,m+1=0.707[A1(θm,f)+A1(θm+1,f)]
Δm,m+1=0.707[A1(θm,f)−A1(θm+1,f)] (11)
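Equation (11) can be exercised numerically without knowing the form of A1, which is treated here as a given complex response per frequency bin. The sketch below uses made-up values and hypothetical names; it only checks the algebraic identity that filtering the sum and difference of a channel pair with Σ and Δ and adding the results equals 1.414 times the direct filtering of each channel by its own response.

```python
import numpy as np

def sum_diff_filters(a_m, a_m1):
    """Return (sigma, delta) of equation (11) for one channel pair."""
    a_m = np.asarray(a_m, dtype=complex)
    a_m1 = np.asarray(a_m1, dtype=complex)
    return 0.707 * (a_m + a_m1), 0.707 * (a_m - a_m1)

# Hypothetical responses A1(theta_m, f) and A1(theta_{m+1}, f) at two bins.
a_m = np.array([0.9 + 0.1j, 0.5 - 0.2j])
a_m1 = np.array([0.3 - 0.1j, 0.2 + 0.4j])

# Hypothetical spectra of the left and right channel of the pair.
e_m = np.array([1.0, 2.0])
e_m1 = np.array([-0.5, 0.25])

sigma, delta = sum_diff_filters(a_m, a_m1)

# Sum/difference path: two filters per pair.
shuffled = sigma * (e_m + e_m1) + delta * (e_m - e_m1)

# Direct path: each channel filtered by its own response.
direct = 1.414 * (a_m * e_m + a_m1 * e_m1)
```

The subsequent multiplication by 0.7 in the output stage brings the factor 1.414 back to approximately unity.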
M′ non-front and non-back channel signals of the upper layer are numbered according to a rule that an odd number represents a left half-space channel and an even number represents a symmetric right half-space channel. Then the responses of virtual signal processing function in equation (8) satisfy the following symmetric relationship:
Then, the signal processing of the equation (7) is equivalent to following equation (13):
The sum in equation (13) is over all odd numbers m′, and
Σ′m′,m′+1=0.707[A1(θ′m′,f)+A1(θ′m′+1,f)]
Δ′m′,m′+1=0.707[A1(θ′m′,f)−A1(θ′m′+1,f)] (14)
The virtual signal processing in equation (10) and equation (13) includes (M+M′) filters, which is just half of the original 2(M+M′) filters in equation (5) and equation (7). Therefore, the efficiency of virtual signal processing is improved.
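As a worked instance of this count, the sketch below evaluates both figures for the 9.1-channel case, where the horizontal and upper layers each carry four non-front, non-back channels. The multiply-accumulate estimate assumes direct-form FIR filtering with the 128-point length and 44.1 kHz sampling frequency quoted in the embodiment; the function name is hypothetical.

```python
def filter_counts(m_horizontal: int, m_upper: int):
    """Return (direct, sum_difference) filter counts for both layers."""
    direct = 2 * (m_horizontal + m_upper)     # equations (5) and (7)
    sum_difference = m_horizontal + m_upper   # equations (10) and (13)
    return direct, sum_difference

direct, sum_difference = filter_counts(4, 4)

# Assumed cost model: one multiply-accumulate per tap per output sample.
TAPS = 128
SAMPLE_RATE_HZ = 44100
macs_per_second = sum_difference * TAPS * SAMPLE_RATE_HZ
```

For this case the counts are 16 and 8 filters respectively, so the per-second arithmetic load under the assumed cost model is halved as well.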
Multichannel spatial surround sound (digital) signals decoded and outputted by a Blu-ray disc player or obtained from a digital transmission medium are virtually processed according to the method shown in
Multichannel spatial surround sound (digital) signals decoded and outputted by a Blu-ray disc player or obtained from a digital transmission medium are fed to an amplifier of the home theater. The virtual signal processing in
Multichannel spatial surround sound (digital) signals are read by a Blu-ray drive of the computer, or obtained by passing through a digital transmission medium and decoded, and then the virtual signal processing shown in
The present invention specifically introduces an application of virtual reproduction of 9.1-channel spatial surround sound in the TV set as an embodiment, and the present invention is implemented by a hardware circuit made of a general signal processing chip (DSP). However, the present invention is not limited to the virtual reproduction of the 9.1-channel spatial surround sound, but also includes virtual reproduction of other multichannel spatial surround sounds, such as virtual reproduction of 11.1-channel spatial surround sound and virtual reproduction of 13.1-channel spatial surround sound. The present invention is not limited to the application in the TV set, but also includes other applications, such as the application in the Blu-ray disc player, the application in the home theater, the application in the multimedia computer, and the like. The present invention is not limited to being implemented by the general DSP, but may also be implemented in other ways, such as by being designed as a special integrated circuit chip, or as software to be run on the multimedia computer.
The 9.1-channel surround sound is the simplest spatial surround sound system. The 9.1-channel spatial surround sound includes arrangement of two layers of loudspeakers and nine independent full audible bandwidth channel signals in total. Arrangement directions thereof are shown in
E1=EL, E2=ER, E3=ELS, E4=ERS, E5=EC (15)
The corresponding elevation of each loudspeaker of the horizontal layer is 0°, and the azimuths are respectively:
θ1=θL=30°, θ2=θR=−30°, θ3=θLS=110°, θ4=θRS=−110°, θ5=θC=0° (16)
The upper layer of the 9.1-channel spatial surround sound includes M′=4 left-right symmetric non-front and non-back channel signals in total, namely, left-up E′LH, right-up E′RH, left-up-surround E′LSH, and right-up-surround E′RSH, without front or back channel signals. Numbering is carried out according to a rule that an odd number represents a left half-space channel and an even number represents a symmetric right half-space channel, and a number of each signal is:
E′1=E′LH, E′2=E′RH, E′3=E′LSH, E′4=E′RSH (17)
The corresponding elevation of each loudspeaker of the upper layer is 30°, and the azimuths are respectively:
θ′1=θ′LH=30°, θ′2=θ′RH=−30°, θ′3=θ′LSH=110°, θ′4=θ′RSH=−110° (18)
The virtual signal processing can be implemented by the method of the above equations (10) and (13) with the loudspeaker arrangement parameters of the 9.1-channel surround sound. Since the actual loudspeakers of the front-half space cannot generate virtual sound sources (virtual loudspeakers) of the back-half space, the virtual surround loudspeakers of the horizontal layer and the upper layer are moved forward to the two sides in the signal processing parameters, and the azimuths of these loudspeakers are taken as
θ3=θLS=90°, θ4=θRS=−90°, θ′3=θ′LSH=90°, θ′4=θ′RSH=−90° (19)
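The channel numbering of equations (15) to (18) and the forward shift of equation (19) can be written down as a small table. In the sketch below, clamping every azimuth to the range of −90° to 90° is an illustrative generalization; the text itself only states the shifted values for the four surround channels. All names are hypothetical.

```python
# Nominal azimuths in degrees from equations (16) and (18).
NOMINAL_AZIMUTHS = {
    "L": 30.0, "R": -30.0, "LS": 110.0, "RS": -110.0, "C": 0.0,
    "LH": 30.0, "RH": -30.0, "LSH": 110.0, "RSH": -110.0,
}

def processing_azimuth(nominal_deg: float, limit_deg: float = 90.0) -> float:
    """Clamp an azimuth into the frontal half-space (assumed rule)."""
    return max(-limit_deg, min(limit_deg, nominal_deg))

# Azimuths actually used as signal processing parameters.
PROCESSING_AZIMUTHS = {name: processing_azimuth(az)
                       for name, az in NOMINAL_AZIMUTHS.items()}
```

Under this rule the front channels keep their nominal directions and only the surround channels change, which matches equation (19).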
The subwoofer channel signal is processed in the same way as the central channel signal of the horizontal plane.
The 9.1-channel spatial surround sound (digital) signals decoded and outputted by a Blu-ray disc player or obtained from a digital transmission medium are virtually processed to obtain four-channel signals EL1, ER1, EL2, and ER2, and the signals are then reproduced by a pair of actual bar-shaped loudspeaker systems arranged above and below the TV set respectively. The virtual signal processing is implemented by a hardware circuit made of a general signal processing chip (ADAU1701), which is used as a part of the hardware circuit in the (active) actual bar-shaped loudspeaker system. HRTF data of a KEMAR artificial head obtained by experimental measurement is used for the signal processing, and the sampling frequency is 44.1 kHz. A finite impulse response (FIR) filter is used to implement the virtual signal processing, and the length of the filter is 128 points.
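To give a sense of scale for the stated filter parameters, the following sketch derives the time span and frequency resolution of a 128-point FIR filter at 44.1 kHz and applies such a filter by direct convolution. It is a software illustration only and says nothing about how the hardware circuit realizes the filter; the helper name and the delay-only test filter are hypothetical.

```python
import numpy as np

SAMPLE_RATE_HZ = 44100
TAPS = 128

duration_ms = 1000.0 * TAPS / SAMPLE_RATE_HZ   # time span of the filter
bin_spacing_hz = SAMPLE_RATE_HZ / TAPS         # spacing of a 128-point DFT

def apply_fir(signal, taps):
    """Filter `signal` with FIR `taps`, keeping the input length."""
    signal = np.asarray(signal, dtype=float)
    taps = np.asarray(taps, dtype=float)
    return np.convolve(signal, taps)[: len(signal)]

# Placeholder filter: a pure three-sample delay, not an HRTF-based filter.
delay_taps = np.zeros(TAPS)
delay_taps[3] = 1.0
delayed = apply_fir(np.arange(1.0, 9.0), delay_taps)
```

A 128-point filter therefore spans a little under 3 ms, with a frequency spacing of roughly 345 Hz between the bins of an equally long transform.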
Specific implementation steps are as follows.
In step 1, two bar-shaped loudspeaker systems are respectively arranged above and below the TV set, the elevations of the loudspeakers are 0° and 30° respectively, and the azimuths of the loudspeakers are ±15°;
In step 2, five channel signals of the horizontal layer of the original 9.1-channel spatial surround sound are inputted, including left EL, right ER, left surround ELS, right surround ERS, and front central channel EC;
In step 3, four channel signals of the upper layer of the original 9.1-channel spatial surround sound are inputted, including left-up E′LH, right-up E′RH, left-up-surround E′LSH, and right-up-surround E′RSH;
In step 4, an add and subtract (sum and difference) operation is carried out on every left half-space channel signal of the horizontal layer and a symmetric right half-space channel signal to obtain two sum signals (EL+ER) and (ELS+ERS) of the horizontal layer and two difference signals (EL−ER) and (ELS−ERS) of the horizontal layer;
In step 5, the add and subtract (sum and difference) operation is carried out on each left half-space channel signal of the upper layer and a symmetric right half-space channel signal to obtain two sum signals (E′LH+E′RH) and (E′LSH+E′RSH) of the upper layer and two difference signals (E′LH−E′RH) and (E′LSH−E′RSH) of the upper layer;
In step 6, two sum signals of the horizontal layer are filtered with two virtual reproduction signal processing functions Σ1,2 and Σ3,4, and then summed, and the central channel signal is added to obtain a total sum signal ESUM=Σ1,2 (EL+ER)+Σ3,4 (ELS+ERS)+EC of the horizontal layer.
In step 7, two difference signals of the horizontal layer are filtered with two virtual reproduction signal processing functions Δ1,2 and Δ3,4, and then summed to obtain a total difference signal EDIF=Δ1,2 (EL−ER)+Δ3,4 (ELS−ERS) of the horizontal layer.
In step 8, two sum signals of the upper layer are filtered with two virtual reproduction signal processing functions Σ′1,2 and Σ′3,4, and then summed to obtain a total sum signal E′SUM=Σ′1,2 (E′LH+E′RH)+Σ′3,4 (E′LSH+E′RSH) of the upper layer.
In step 9, two difference signals of the upper layer are filtered with two virtual reproduction signal processing functions Δ′1,2 and Δ′3,4, and then summed to obtain a total difference signal E′DIF=Δ′1,2 (E′LH−E′RH)+Δ′3,4(E′LSH−E′RSH) of the upper layer.
In step 10, the add and subtract (sum and difference) operation is carried out on the total sum signal ESUM and the total difference signal EDIF of the horizontal layer, and the signals are attenuated by −3 dB (multiplied by 0.7) to obtain the reproduction signals EL1=0.7 (ESUM+EDIF) and ER1=0.7 (ESUM−EDIF) of the left-front and right-front actual loudspeakers of the horizontal plane, and the signals are fed to the corresponding actual loudspeakers for reproduction.
In step 11, the add and subtract (sum and difference) operation is carried out on the total sum signal E′SUM and the total difference signal E′DIF of the upper layer, and the signals are attenuated by −3 dB (multiplied by 0.7) to obtain the reproduction signals EL2=0.7 (E′SUM+E′DIF) and ER2=0.7 (E′SUM−E′DIF) of the left-front-up and right-front-up actual loudspeakers, and the signals are fed to the corresponding actual loudspeakers for reproduction.
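Steps 4 to 11 above can be sketched in the time domain for one layer as follows. The sketch assumes the Σ and Δ functions are available as FIR coefficient arrays; here they are replaced by unit impulses purely as placeholders, whereas actual filters would be derived from measured HRTF data. Function and variable names are hypothetical.

```python
import numpy as np

def fir(x, h):
    """Filter x with FIR taps h, keeping the input length."""
    return np.convolve(x, h)[: len(x)]

def virtualize_layer(pairs, filters, center=None, gain=0.7):
    """Map one layer to its (left, right) actual loudspeaker signals.

    pairs   : list of (left_channel, right_channel) sample arrays
    filters : list of (sigma_taps, delta_taps), one per pair
    center  : optional front channel added to the sum path (step 6)
    """
    n = len(pairs[0][0])
    e_sum = np.zeros(n)
    e_dif = np.zeros(n)
    for (left, right), (sigma, delta) in zip(pairs, filters):
        left = np.asarray(left, dtype=float)
        right = np.asarray(right, dtype=float)
        e_sum += fir(left + right, sigma)   # steps 4/5, then 6/8
        e_dif += fir(left - right, delta)   # steps 4/5, then 7/9
    if center is not None:
        e_sum += np.asarray(center, dtype=float)
    # Steps 10/11: sum and difference, then -3 dB.
    return gain * (e_sum + e_dif), gain * (e_sum - e_dif)

# Toy horizontal-layer input: impulses at different sample positions.
e_l, e_r = np.array([1.0, 0, 0, 0]), np.zeros(4)
e_ls, e_rs = np.zeros(4), np.array([0, 1.0, 0, 0])
e_c = np.array([0, 0, 1.0, 0])

unit = np.array([1.0])  # placeholder taps, NOT an HRTF-based filter
e_l1, e_r1 = virtualize_layer([(e_l, e_r), (e_ls, e_rs)],
                              [(unit, unit), (unit, unit)], center=e_c)
```

The upper layer would be processed by a second call with its own channel pairs and filters and without a center channel, yielding the signals for the left-front-up and right-front-up loudspeakers.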
As described above, the present invention can be well implemented.
Since the arrangement of the five channels and loudspeakers of the horizontal layer of the 9.1-channel spatial surround sound is consistent with that of the traditional 5.1-channel horizontal surround sound, the signal processing of the present invention is completely compatible with that of the existing virtual reproduction of 5.1-channel surround sound with two loudspeakers (granted national invention patent, ZL02134416.7).
A subjective evaluation experiment verifies the actual effect of the present invention. The key to evaluating the virtual reproduction of the multichannel spatial surround sound is the effect of the virtual loudspeakers, that is, the perceived direction of each virtual loudspeaker. In the embodiment of the virtual reproduction of the 9.1-channel spatial surround sound of the present invention, the five virtual loudspeakers of the horizontal layer and their signal processing are exactly the same as those of the existing virtual reproduction of 5.1-channel surround sound with two loudspeakers, and the effects should also be the same. Therefore, the subjective evaluation experiment focuses on verifying the localization effect of the four virtual loudspeakers of the upper layer.
The experiment is carried out in a listening room with a reverberation time of 0.15 s. The elevations and the azimuths of the four actual loudspeakers are ϕL1=ϕR1=0° and ϕL2=ϕR2=30° as well as θL1=θL2=15° and θR1=θR2=−15°, and their distance from the head center of the listener is 1.5 m. Original experimental signals include a speech signal (male voice in standard Chinese) and a music signal (orchestral music: a segment of The Blue Danube by Johann Strauss). After the signal processing, signals corresponding to the directions of the four virtual loudspeakers of the upper layer of the 9.1-channel spatial surround sound are generated respectively and reproduced by the actual loudspeakers.
In the experiment, the listener judges the direction of each perceived virtual loudspeaker, and repeats the judgment three times under each reproduction condition. A total of eight subjects participate in the experiment, so that there are 24 judgments under each reproduction condition. Finally, the 24 judgments under each reproduction condition are statistically analyzed. Statistical parameters for measuring the localization effect include: front-back confusion rate, up-down confusion rate, average unsigned azimuth error and standard deviation, and average unsigned elevation error and standard deviation of a virtual source. Results are shown in Table 1.
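The statistical parameters named above can be computed as in the following sketch. The confusion criteria and the use of the sample standard deviation are assumptions, since the text does not define them, and the judgment values are made up for illustration and are not the experimental data. Azimuth wrap-around at ±180° is ignored for simplicity.

```python
import numpy as np

def localization_stats(target_az, target_el, judged_az, judged_el):
    """Summarize perceived directions (degrees) for one target direction."""
    judged_az = np.asarray(judged_az, dtype=float)
    judged_el = np.asarray(judged_el, dtype=float)
    az_err = np.abs(judged_az - target_az)
    el_err = np.abs(judged_el - target_el)
    # Assumed definitions: a front-back confusion is a frontal or lateral
    # target judged in the rear half-space; an up-down confusion is a
    # judged elevation whose sign is opposite to the target elevation.
    front_back = np.logical_and(abs(target_az) <= 90.0,
                                np.abs(judged_az) > 90.0)
    up_down = judged_el * target_el < 0.0
    return {
        "front_back_rate": float(np.mean(front_back)),
        "up_down_rate": float(np.mean(up_down)),
        "az_err_mean": float(np.mean(az_err)),
        "az_err_std": float(np.std(az_err, ddof=1)),
        "el_err_mean": float(np.mean(el_err)),
        "el_err_std": float(np.std(el_err, ddof=1)),
    }

# Hypothetical judgments for a target at azimuth 30, elevation 30.
stats = localization_stats(
    30.0, 30.0,
    judged_az=[25, 35, 30, 40, 20, 30],
    judged_el=[30, 40, 20, 30, 35, 25],
)
```

In the experiment described here each condition would contribute 24 judgments to such a summary rather than the six used in this toy example.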
It can be seen from Table 1 that neither front-back nor up-down confusion occurs in perceiving the virtual sources in reproduction. The average unsigned elevation error is not large, so that localization perception in the vertical direction can be generated. The average unsigned azimuth error for the lateral target azimuths θ=±90° is large, and the azimuth of the actually perceived virtual source is about 60°, which is an inherent defect of virtual processing. Therefore, the virtual source localization experiment verifies the present invention.
Number | Date | Country | Kind |
---|---|---|---|
201811297263.8 | Nov 2018 | CN | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2018/120990 | 12/14/2018 | WO |

Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2020/087678 | 5/7/2020 | WO | A |

Number | Name | Date | Kind |
---|---|---|---|
6795556 | Sibbald | Sep 2004 | B1 |
20170064484 | Borss | Mar 2017 | A1 |
20180192226 | Woelfl | Jul 2018 | A1 |

Number | Date | Country |
---|---|---|
1440217 | Sep 2003 | CN |
1219415 | Sep 2005 | CN |
1953620 | Apr 2014 | CN |
104284291 | Jan 2015 | CN |
105072557 | Nov 2015 | CN |
107105384 | Aug 2017 | CN |
107347173 | Nov 2017 | CN |
206908863 | Jan 2018 | CN |
2006052188 | May 2006 | WO |
WO2006052188 | May 2006 | WO |

Number | Date | Country |
---|---|---|
20210377688 A1 | Dec 2021 | US |