This invention relates to surround audio signal processing systems; more particularly, it relates to audio signal encoding and decoding, which can be used in any digitized and compressed audio signal storage or transmission application, and to rendering for audio playback applications.
When listening to music or watching a video with audio, it is desirable for the audience to experience a high degree of audio envelopment, so that they have a better sensation of the audio and video scene. The sense of audio envelopment includes immersive 3D audio and accurate audio localization. Immersive 3D audio means that the audio system is able to virtualize sound sources at any position in space. Accurate audio localization means that the audio system is able to place the sound sources precisely aligned with the original audio scene, in terms of both direction and distance [1].
The sense of audio envelopment can be provided by a 3D audio system, which uses a large number of loudspeakers. The speakers might be surrounding the audience and be situated at high, mid and low vertical positions.
Three types of input signals and formats are commonly used in 3D audio systems: channel-based input, object-based input and Higher-Order Ambisonics.
Channel-based input is commonly used in today's 2D and 3D audio signal production processes and media (e.g. 22.2, 9.1, 8.1, 7.1, 5.1 etc), where each produced audio signal channel is intended to directly drive a loudspeaker in a designated position.
For object-based input, each produced audio signal channel represents an audio source that is intended to be rendered at a designated spatial position, independent of the number and location of actually available loudspeakers.
For Higher-Order Ambisonics (HOA), each produced audio signal channel is part of an overall description of the entire sound scene, independent of the number and location of actually available loudspeakers.
Among the three formats, the HOA format is a representation of the entire audio scene; it is therefore possible to render the ambisonic signals to any playback setup, including non-standard speaker layouts.
In the prior art, such as the reference model for the MPEG-H 3D audio standardization, for the HOA format, the HOA signal is first reconstructed at the decoder side from the decoded core signals and then rendered to the speaker setup.
Firstly, the input bit stream is de-multiplexed (101) into N bit streams originally created by the AAC-family mono encoders plus the parameters required to recompose the full HOA representation from these bit streams.
In the multi-channel perceptual decoding component (102, 103 and 104), the N bit streams are individually decoded by AAC-family mono decoders to produce N signals.
In the successive spatial decoding component, the actual value range of these signals is first reconstructed by the inverse gain control processing (105). In a next step, the N signals are re-distributed to provide the M predominant signals and the (N−M) HOA coefficient signals representing the more ambient HOA components (105).
The fixed subset of the (N−M) HOA coefficient signals is re-correlated; this means that the decorrelation applied at the HOA encoding stage is reversed (107).
Next, all of the (N−M) HOA coefficient signals are used to create the ambient HOA components (107).
The predominant HOA components are synthesized from the M predominant signals and the corresponding parameters (106).
Finally, the predominant and the ambient HOA components are composed into the desired full HOA representation (108), which is then rendered to a given loudspeaker setup (109).
The detailed process of the predominant sound synthesis, ambiance synthesis, HOA composition and rendering is explained below.
In the Predominant Sound Synthesis (PSS) block (106), the HOA representation of the predominant sound component is computed by either of two methods, referred to as 'directional based' and 'vector based'.
In vector based PSS, the predominant sound is computed from the vector based signals X_VEC(k). The X_VEC(k) signals represent time domain audio signals that have been decoupled from their spatial characteristics. The reconstructed HOA coefficients are computed by multiplying the vector based signals X_VEC(k) with the corresponding transformation vectors (represented by multiple vectors in M_VEC(k)). The M_VEC(k) thus contain the spatial characteristics (such as directionality and width) of the corresponding X_VEC(k) time domain audio signals. The computation is as follows:
C_VEC(k) = (X_VEC(k) (M_VEC(k))^T)^T (1)
where C_VEC(k) is the HOA representation of the predominant sound component, X_VEC(k) are the vector based signals and M_VEC(k) are the corresponding transformation vectors.
In directional based PSS, the HOA coefficients are computed from all direction based predominant sound signals X_PS(k), using the tuple set M_DIR(k). The computation is as follows:
C_DIR(k) = (X_PS(k) (M_DIR(k))^T)^T (2)
where C_DIR(k) is the HOA representation of the direction based predominant sound and X_PS(k) are the direction based predominant sound signals.
In Ambient Synthesis, the ambient HOA component frame C_AMB(k) is obtained as described in reference [2].
Finally, in the HOA Composition the ambient HOA component and the predominant sound HOA component are superposed to provide the decoded HOA frame. If the prediction is not activated for the direction based predominant synthesis, the decoded HOA frame C(k) is computed by
C(k) = C_AMB(k) + C_DIR(k) for direction based synthesis (5)
C(k) = C_AMB(k) + C_VEC(k) for vector based synthesis (6)
If the near field compensation is not applied, the decoded HOA coefficients C(k) are converted to the representation of loudspeaker signals W(k) by multiplication with the rendering matrix D:
W(k) = D C(k) (7)
where W(k) is the frame of loudspeaker signals and D is the rendering matrix.
In order to calculate the complexity of the above process, the following notations are defined: Fs is the sampling frequency, M is the number of predominant sound signals, N is the number of decoded core signals, L is the number of playback loudspeakers, and O_HOA is the HOA order.
The complexity for Predominant Sound Synthesis is:
COM_PSS = Fs × M × (O_HOA + 1)^2 (8)
The complexity for Rendering is:
COM_RENDER = Fs × L × (O_HOA + 1)^2 (9)
The number of HOA coefficients is very large in typical HOA formats; for example, if O_HOA = 4, then the number of HOA coefficients is (4+1)^2 = 25.
And in order to have a better sensation of the 3D audio, the number of playback channels is also very large; for example, the 22.2 setup has a total of 24 speakers.
The sampling frequency for the audio signal is normally 44.1 kHz or 48 kHz.
As an example, the complexity is estimated for the predominant sound synthesis and rendering for M = 4, O_HOA = 4, L = 24 and Fs = 48 kHz:
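Working through equations (8) and (9) with these values gives COM_PSS = 48000 × 4 × (4+1)^2 = 4.8 million operations per second and COM_RENDER = 48000 × 24 × (4+1)^2 = 28.8 million operations per second, a combined total of 33.6 million operations per second.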
From the example, it can be seen that both the synthesis and the rendering processes are very complex, and it is desirable to reduce the complexity.
As shown in the HOA composition process (equations (1) and (2)), predominant sound synthesis is done according to:
C_VEC(k) = (X_VEC(k) (M_VEC(k))^T)^T for vector based synthesis (1)
C_DIR(k) = (X_PS(k) (M_DIR(k))^T)^T for direction based synthesis (2)
Ambient sound synthesis is done as described above, according to reference [2].
Rendering is done according to equation (7):
W(k) = D C(k) (7)
The HOA composition and rendering process can be combined into one channel conversion process, as derived in equations (17) to (19) below.
As an example, the complexity is estimated for the combined composition and rendering for O_HOA = 4, M = 4, N = 8, L = 24 and Fs = 48 kHz:
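Assuming the combined channel conversion amounts to a single multiplication by an L × N matrix per sample (consistent with the combined matrix D M^-1 of equation (19) below), the complexity is approximately Fs × L × N = 48000 × 24 × 8 = 9.216 million operations per second, compared with the 33.6 million operations per second of the separate synthesis and rendering above, roughly a 3.6 times reduction.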
It can be seen from the above example that, by implementing the invented idea, the complexity can be greatly reduced.
In the MPEG-H 3D Audio model, there is a prediction component for some of the input sequences, and near field compensation before rendering under some conditions. This invention is not applied when the prediction component exists or when near field compensation is performed.
In the MPEG-H 3D Audio model, in order to avoid artefacts due to changes of the directions (for direction based synthesis) between successive frames, the computation of the HOA representation from the directional signals is based on the concept of overlap add.
Hence, the HOA representation C_DIR(k) of the active directional signals is computed as the sum of a faded out component and a faded in component.
This poses a problem for the invented method, because the fading in and fading out is done in the HOA domain. To solve this problem, the following ideas are conceived:
The above principle can be applied to the vector based synthesis if the fading in and fading out is done in the HOA domain for vector based synthesis.
If the fading in and fading out is done in the vector domain for vector based synthesis, the invented method can be applied directly.
The following embodiments are merely illustrative of the principles of various inventive steps. It is understood that variations of the details described herein will be apparent to others skilled in the art. Those skilled in the art will be able to modify and adapt this invention without deviating from its spirit.
As the first embodiment of this invention, the invented surround sound decoder comprises: a bitstream de-multiplexer for unpacking a bitstream into spatial parameters and core parameters; a set of core decoders for decoding the core parameters into a set of core signals; a matrix derivation unit for deriving the rendering matrix from the spatial parameters and the layout of the playback speakers; and a renderer for rendering the decoded core signals to playback signals using the rendering matrix.
The bitstream de-multiplexer (200) unpacks the bitstream into spatial parameters and core parameters.
A set of core decoders (201, 202, 203) decodes the core parameters into a set of core signals. The decoder can be any existing or new audio codec, such as MPEG-1 Audio Layer III, AAC, HE-AAC, Dolby AC-3 or the MPEG USAC standard.
A matrix derivation unit (204) computes the rendering matrix from the spatial parameters and the layout of the playback speakers. The rendering matrix may be derived using some or all of the following parameters: the target speaker layout (5.1, 7.1, 10.1 or 22.2 . . . ), the speakers' positions (distance from the sweet spot, horizontal angle and elevation angle), the positions of a spherical modelling (horizontal and elevation angle), the HOA order (1st order (4 HOA coefficients), 2nd order (9 HOA coefficients) or 3rd order (16 HOA coefficients) . . . ) and the HOA decomposition parameters (direction based decomposition, PCA or SVD).
There are technologies available to derive the rendering matrix from the reconstructed input signal to the desired speaker layout, such as VBAP (vector base amplitude panning) [3], DBAP (distance-based amplitude panning) [4], or the method described in the released reference model for MPEG-H 3D audio for the HOA format [2].
As an example, if the input signal is 4th order HOA, it has 25 HOA coefficients covering 25 directions of the spherical space, and the playback speaker setup is the standard 22.2 channel setup. The rendering matrix maps the 25 HOA coefficients to the 24 speaker channels.
If VBAP is used to derive the rendering matrix, VBAP uses a set of 24 unit vectors l_1 . . . l_24 which point at the loudspeakers of the 22.2 speaker setup, and a mesh of triangles is formed between the loudspeakers. Each of the 25 HOA spherical directions p lies in one of the triangles formed by the speakers. The three speakers forming that triangle are chosen as the active speakers, and the spherical direction p can be expressed as a linear combination of those loudspeaker vectors:
p = [l_n1, l_n2, l_n3] [g_n1, g_n2, g_n3]^T (15)
In three dimensions, the three loudspeaker vectors form a vector base. This leads to the solution:
[g_n1, g_n2, g_n3]^T = [l_n1, l_n2, l_n3]^-1 p (16)
The above procedure is repeated for all of the 25 HOA spherical directions; the gain parameters derived for each spherical direction form the rendering matrix D.
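The gain computation of equations (15) and (16) can be sketched in a few lines of numpy. This is a minimal illustration rather than the full MPEG-H rendering-matrix derivation: the triangle mesh is assumed to be given, and the triangle containing p is found by testing which inversion yields non-negative gains.

```python
import numpy as np

def vbap_gains(p, speaker_dirs, triangles):
    """VBAP gains for one unit direction p (shape (3,)).

    speaker_dirs: (L, 3) array of unit vectors pointing at the loudspeakers.
    triangles: iterable of 3-tuples of speaker indices forming the mesh.
    Returns a length-L gain vector with at most three non-zero entries.
    """
    gains = np.zeros(speaker_dirs.shape[0])
    for n1, n2, n3 in triangles:
        base = speaker_dirs[[n1, n2, n3]].T      # columns l_n1, l_n2, l_n3
        g = np.linalg.solve(base, p)             # equation (16)
        if np.all(g >= -1e-9):                   # p lies inside this triangle
            gains[[n1, n2, n3]] = g
            break
    return gains
```

Repeating this for each of the 25 HOA spherical directions and stacking the resulting 24-element gain vectors yields the 24 × 25 rendering matrix D.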
The rendering from HOA coefficients to loudspeaker output can be explained by the equation below:
W(k) = D C′(k) (17)
However, in this invention, the fully reconstructed HOA coefficients are not available. Suppose that the reconstructed HOA coefficients can be derived from the decoded core signals S′(k) according to the equation below:
C′(k) = M^-1 S′(k) (18)
By combining equation (17) and equation (18):
W(k) = D M^-1 S′(k) (19)
so the decoded core signals can be rendered directly with the combined matrix D M^-1, without first reconstructing the HOA coefficients.
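A minimal numpy sketch of the one-step rendering of equation (19), assuming M is square and invertible (for instance a 25 × 25 analysis matrix for 4th order HOA) and that neither prediction nor near field compensation is active; the shapes and variable names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
n_hoa, n_spk, n_smp = 25, 24, 1024          # (O_HOA+1)^2 coeffs, 22.2 speakers, frame length

D = rng.standard_normal((n_spk, n_hoa))     # rendering matrix of equation (17)
M = rng.standard_normal((n_hoa, n_hoa))     # spatial analysis matrix, assumed invertible
S = rng.standard_normal((n_hoa, n_smp))     # decoded core signals S'(k)

# Two-step reference: reconstruct HOA (equation 18), then render (equation 17).
W_two_step = D @ (np.linalg.inv(M) @ S)

# One-step: precompute the combined channel conversion matrix once (equation 19).
D_comb = D @ np.linalg.inv(M)
W_one_step = D_comb @ S

assert np.allclose(W_two_step, W_one_step)  # identical output with one multiply per frame
```

Once D_comb is cached, only one matrix multiplication per frame remains, which is where the complexity reduction comes from.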
Besides the above approach, it is also possible to derive the rendering matrix directly from the decoded core signals and the speaker layout information.
The above procedures and equations are given as examples of how to implement the invention; those skilled in the art will be able to modify and adapt this invention without deviating from its spirit.
Finally, the renderer (205) renders the decoded core signals to playback signals using the rendering matrix.
Effect: In this embodiment, the surround audio signals are reconstructed and rendered to the desired speaker layout in one single step, which improves the efficiency and greatly reduces the complexity.
The invented surround sound decoder comprises: a bitstream de-multiplexer for unpacking a bitstream into predominant sound parameters, ambiance parameters, channel assignment parameters and core parameters; a set of core decoders for decoding the core parameters into a set of core signals; a predominant sound/ambiance switch for assigning the decoded core signals to predominant sound and ambiance according to the channel assignment parameters; a matrix derivation unit for deriving the predominant sound rendering matrix from the predominant sound parameters and the layout of the playback speakers; a matrix derivation unit for deriving the ambiance rendering matrix from the ambiance parameters and the layout of the playback speakers; a predominant sound renderer for rendering the predominant sound to playback signals using the rendering matrix; an ambiance renderer for rendering the ambiance to playback signals using the rendering matrix; and an output signal composition unit for composing the playback signals from the rendered predominant sound and ambient sound.
The bitstream de-multiplexer (300) unpacks the bitstream into predominant sound parameters, ambiance parameters, channel assignment parameters and core parameters.
A set of core decoders (301, 302 and 303) decodes the core parameters into a set of core signals. The decoder can be any existing or new audio codec, such as MPEG-1 Audio Layer III, AAC, HE-AAC, Dolby AC-3 or the MPEG USAC standard.
The predominant sound/ambiance switch (304) assigns the decoded core signals to predominant sound or ambiance according to the channel assignment parameters.
A rendering matrix computation unit (305) computes the rendering matrix from the predominant sound parameters and the layout of the playback speakers. The detailed derivation is omitted in this embodiment; suppose that the rendering matrix derived for the predominant sound is D′.
The predominant sound renderer (306) converts the decoded predominant sound to playback signals using the PS rendering matrix.
W_PS(k) = D′ C_PS(k) (20)
A rendering matrix computation unit (307) computes the rendering matrix from the ambiance parameters and the layout of the playback speakers. The detailed derivation is omitted in this embodiment; suppose that the rendering matrix derived for the ambient sound is D_AMB.
If the ambient sound was transformed to another format or otherwise processed before encoding, the signals may be post-processed to reconstruct the original ambient sound before rendering.
The ambiance renderer (308) converts the decoded ambient sound to playback signals using the ambiance rendering matrix.
W_AMB(k) = D_AMB C_AMB(k) (21)
The output signal composition unit composes the playback signals using the rendered predominant sound and ambient sound.
W(k) = W_PS(k) + W_AMB(k) (22)
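The composition of equations (20) to (22) amounts to two matrix multiplications and a sum; a small numpy sketch with illustrative shapes (the matrices D′ and D_AMB are assumed given by units (305) and (307)):

```python
import numpy as np

rng = np.random.default_rng(1)
n_spk, n_ps, n_amb, n_smp = 24, 4, 4, 1024

D_ps = rng.standard_normal((n_spk, n_ps))    # predominant sound rendering matrix D'
D_amb = rng.standard_normal((n_spk, n_amb))  # ambiance rendering matrix D_AMB
C_ps = rng.standard_normal((n_ps, n_smp))    # decoded predominant sound signals
C_amb = rng.standard_normal((n_amb, n_smp))  # decoded ambiance signals

W = D_ps @ C_ps + D_amb @ C_amb              # equations (20)-(22)
```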
Effect: In this embodiment, the predominant sound signals are reconstructed and rendered to the desired speaker layout in one single step, which improves the efficiency and greatly reduces the complexity.
The invented surround sound decoder comprises: a bitstream de-multiplexer for unpacking a bitstream into spatial parameters and core parameters; a set of core decoders for decoding the core parameters into a set of core signals; a matrix derivation unit for deriving the rendering matrix from the spatial parameters and the layout of the playback speakers; a windowing unit for performing windowing on the previous frame and current frame decoded core signals; a summation unit for summing the windowed previous frame decoded core signals and the windowed current frame decoded core signals to derive the smoothed core signals; and a renderer for rendering the smoothed core signals to playback signals using the rendering matrix.
In order to avoid artefacts across frame boundaries, it is common to apply windowing in audio signal processing.
As shown in the corresponding figure, the reconstructed HOA coefficients with windowing applied to the decoded core signals are:
C′(k) = M^-1 (win_cur S′(k) + win_pre S′(k−1)) (23)
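A sketch of the smoothing in equation (23), assuming complementary linear cross-fade windows (the actual window shape is not fixed by this embodiment); the smoothed core signals are then rendered with the rendering matrix as before:

```python
import numpy as np

def smooth_core_signals(S_cur, S_prev):
    """Overlap-add smoothing of decoded core signals across a frame boundary.

    S_cur, S_prev: (N, n_smp) decoded core signal frames S'(k) and S'(k-1).
    A linear cross-fade is assumed purely for illustration.
    """
    n_smp = S_cur.shape[1]
    win_cur = np.linspace(0.0, 1.0, n_smp)       # fade-in window
    win_pre = 1.0 - win_cur                      # complementary fade-out window
    return win_cur * S_cur + win_pre * S_prev    # windowed sum of equation (23)
```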
Effect: In this embodiment, windowing is applied to avoid artefacts across frame boundaries.
The invented surround sound decoder comprises: a bitstream de-multiplexer for unpacking a bitstream into predominant sound parameters, ambiance parameters, channel assignment parameters and core parameters; a set of core decoders for decoding the core parameters into a set of core signals; a predominant sound/ambiance switch for assigning the decoded core signals to predominant sound and ambiance according to the channel assignment parameters; a matrix derivation unit for deriving the predominant sound rendering matrix from the predominant sound parameters and the layout of the playback speakers; a matrix derivation unit for deriving the ambiance rendering matrix from the ambiance parameters and the layout of the playback speakers; a windowing unit for performing windowing on the previous frame and current frame predominant sound signals; a summation unit for summing the windowed previous frame predominant sound signals and the windowed current frame predominant sound signals to derive the smoothed predominant sound signals; a predominant sound renderer for rendering the smoothed predominant sound to playback signals using the rendering matrix; an ambiance renderer for rendering the ambiance to playback signals using the rendering matrix; and an output signal composition unit for composing the playback signals from the rendered predominant sound and ambient sound.
As shown in the corresponding figure, the windowed predominant sound is rendered as:
W_PS(k) = D′ (win_cur C_PS(k) + win_pre C_PS(k−1)) (25)
Effect: In this embodiment, windowing is applied to ensure a continuous and smooth evolution of the sound field across the frame boundaries.
The fifth embodiment is shown in the corresponding figure.
In order to avoid artefacts across frame boundaries, it is common to apply windowing in audio signal processing.
Suppose that, as in embodiment 1, windowing cannot be applied to the decoded core signals, because the decoded core signals in the previous frame and the current frame have different spatial directions; the windowing then has to be applied to the reconstructed HOA coefficients.
Equation (17) would then be revised as:
W(k) = D (win_cur C′(k) + win_pre C′(k−1)) (26)
As shown in the corresponding figure, the windowing and rendering are performed separately for the current frame and the previous frame decoded core signals, and the rendered results are added.
For the windowing & rendering (606) of the previous frame decoded core signals, the previous frame rendering matrix can be retrieved from the previous frame calculation if it is available/stored. If it is not, the rendering matrix can be computed in the same way as in (604), but using the previous frame spatial parameters and the speaker layout information.
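A sketch of this per-frame windowing and rendering, assuming linear cross-fade windows and that a combined rendering matrix (for example D M^-1 as in equation (19)) is available for both the current and the previous frame; names and shapes are illustrative:

```python
import numpy as np

def window_and_render(S_cur, S_prev, D_comb_cur, D_comb_prev):
    """Window and render each frame with its own combined matrix, then add.

    S_cur, S_prev: (N, n_smp) decoded core signal frames.
    D_comb_cur, D_comb_prev: (L, N) combined rendering matrices derived from
    each frame's own spatial parameters (the previous one retrieved from a
    cache or recomputed, as described above).
    """
    n_smp = S_cur.shape[1]
    win_cur = np.linspace(0.0, 1.0, n_smp)
    win_pre = 1.0 - win_cur
    return D_comb_cur @ (win_cur * S_cur) + D_comb_prev @ (win_pre * S_prev)
```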
Another method is shown in the corresponding figure.
Effect: In this embodiment, windowing is applied to avoid artefacts across frame boundaries.
As shown in the corresponding figure, the invented surround sound decoder comprises:
a matrix derivation unit (705) for deriving the predominant sound rendering matrix for the current frame predominant sound signals from the predominant sound parameters and the layout of the playback speakers; a windowing and rendering unit (706) for performing windowing and rendering on the current frame predominant sound signals; a windowing and rendering unit (707) for performing windowing and rendering on the previous frame predominant sound signals; an addition unit (708) for adding the rendered previous frame predominant sound and the rendered current frame predominant sound to form the rendered predominant sound; a matrix derivation unit (709) for deriving the ambiance rendering matrix from the ambiance parameters and the layout of the playback speakers; an ambiance renderer (710) for rendering the ambiance to playback signals using the rendering matrix; and an output signal composition unit (711) for composing the playback signals from the rendered predominant sound and ambient sound.
Suppose that, as in embodiment 2, windowing cannot be applied to the decoded predominant sound signals, because the predominant sound signals in the previous frame and the current frame have different spatial directions; the windowing then has to be applied to the reconstructed HOA coefficients.
Equation (20) would then be revised as:
W_PS(k) = D′ (win_cur C_PS(k)) + D′_pre (win_pre C_PS(k−1)) (27)
where D′ and D′_pre are the predominant sound rendering matrices of the current and the previous frame, respectively.
As shown in the corresponding figure, the windowing and rendering are performed separately for the current frame and the previous frame predominant sound, and the rendered results are added.
For the PS windowing & rendering (707) of the previous frame predominant sound, the previous frame PS matrix can be retrieved from the previous frame calculation if it is available/stored. If it is not, the PS rendering matrix can be computed in the same way as in (705), but using the previous frame spatial parameters and the speaker layout information.
Another method is shown in the corresponding figure.
Effect: In this embodiment, windowing is applied to ensure a continuous and smooth evolution of the sound field across the frame boundaries.
The invented surround sound decoder comprises: a bitstream de-multiplexer for unpacking a bitstream into a rendering flag, predominant sound parameters, ambiance parameters, channel assignment parameters and core parameters; a set of core decoders for decoding the core parameters into a set of core signals; a predominant sound/ambiance switch for assigning the decoded core signals to predominant sound and ambiance according to the channel assignment parameters; a matrix derivation unit for deriving the predominant sound rendering matrix from the predominant sound parameters and the layout of the playback speakers, utilizing the computation method specified by the rendering flag; a matrix derivation unit for deriving the ambiance rendering matrix from the ambiance parameters and the layout of the playback speakers; a predominant sound renderer for rendering the predominant sound to playback signals using the rendering matrix; an ambiance renderer for rendering the ambiance to playback signals using the rendering matrix; and an output signal composition unit for composing the playback signals from the rendered predominant sound and ambient sound.
In this embodiment, the bitstream carries a rendering flag to indicate whether other data exists in the bitstream which would make the invented idea impractical to implement.
When the bitstream contains only PS parameter data, ambiance parameter data, channel assignment parameter data and core coder data, it is recommended to use the invented idea to achieve low complexity composition and rendering; therefore the rendering flag LC_RENDER_FLAG is set to 1.
When the bitstream contains prediction data or near field compensation data, which make it impractical to use the invented idea, it is recommended to use the conventional decoding, composition and rendering tools; therefore the rendering flag LC_RENDER_FLAG is set to 0.
The bitstream de-multiplexer (901) unpacks the bitstream into LC_RENDER_FLAG and the other parameters.
If LC_RENDER_FLAG is equal to 1, the invented decoder (902) is selected to perform decoding, composition and rendering to achieve low complexity solution.
If LC_RENDER_FLAG is equal to 0, the conventional decoder (903) is selected to perform decoding, composition and rendering.
Effect: In this embodiment, the bitstream incompatibility problem is solved.
In this embodiment, the encoder comprises: a spatial encoder which analyses the input signal and encodes it into the spatial parameters and the N generated signals; a set of core encoders which encode the N generated signals into a set of core parameters; and a bitstream multiplexer which packs the spatial parameters and core parameters into a bitstream.
The invented surround sound decoder comprises: a bitstream de-multiplexer for unpacking a bitstream into spatial parameters and core parameters; a set of core decoders for decoding the core parameters into a set of core signals; a matrix derivation unit for deriving the rendering matrix from the spatial parameters and the layout of the playback speakers; and a renderer for rendering the decoded core signals to playback signals using the rendering matrix.
The spatial encoder (1001) analyses the input signal and encodes it into the spatial parameters and the N generated signals.
The spatial encoding may be based on an analysis of the audio scene, to decide how many sound sources or audio objects are in the input audio scene and thus how to extract and encode them. As an example, it may be determined that Principal Component Analysis (PCA) is used to extract the sound sources or audio objects, and N sound sources are extracted and encoded. During this process, the PCA parameters and the N audio signals are derived. The PCA parameters and the N generated audio signals are encoded and transmitted to the decoder side.
The generated signals may be derived according to the following equation:
S(k) = M C(k) (28)
where C(k) is the input coefficient frame (e.g. HOA coefficients) and M is the analysis matrix derived from the audio scene analysis.
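One possible realization of the analysis behind equation (28) is a PCA of the frame covariance; this is only a sketch of that option, as the embodiment leaves the exact decomposition open:

```python
import numpy as np

def pca_spatial_encode(C):
    """Derive an analysis matrix M and generated signals S = M C by PCA.

    C: (n_coef, n_smp) input coefficient frame, e.g. HOA coefficients.
    The rows of M are eigenvectors of the frame covariance, ordered by
    decreasing eigenvalue, so the leading rows carry the dominant sources.
    """
    cov = C @ C.T / C.shape[1]                   # frame covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # strongest components first
    M = eigvecs[:, order].T                      # orthogonal analysis matrix
    return M, M @ C                              # S(k) = M C(k), equation (28)
```

Because M is orthogonal here, M^-1 = M^T, which keeps the inversion of equation (18) on the decoder side cheap.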
The set of core encoders (1002, 1003 and 1004) encodes the N generated signals into a set of core parameters. The encoder can be any existing or new audio codec, such as MPEG-1 Audio Layer III, AAC, HE-AAC, Dolby AC-3 or the MPEG USAC standard.
The bitstream multiplexer (1005) packs the spatial parameters and core parameters into a bitstream.
The corresponding decoder can be the decoder illustrated in the first embodiment.
In the ninth embodiment of this invention, the encoder comprises: an audio scene analysis and spatial encoder which analyses the input signal and encodes it into a number of predominant sound signals and a number of ambiance sound signals, together with the corresponding predominant sound parameters and ambiance parameters; a channel assignment unit which assigns the core encoders to encode the predominant sound and ambiance sound; a set of core encoders which encode the N channel audio signals, including both the predominant sound and the ambiance sound, into a set of core parameters; and a bitstream multiplexer which packs the predominant sound parameters, ambiance parameters, channel assignment information and core parameters into a bitstream.
The invented surround sound decoder comprises: a bitstream de-multiplexer for unpacking a bitstream into predominant sound parameters, ambiance parameters, channel assignment parameters and core parameters; a set of core decoders for decoding the core parameters into a set of core signals; a predominant sound/ambiance switch for assigning the decoded core signals to predominant sound and ambiance according to the channel assignment parameters; a matrix derivation unit for deriving the predominant sound rendering matrix from the predominant sound parameters and the layout of the playback speakers; a matrix derivation unit for deriving the ambiance rendering matrix from the ambiance parameters and the layout of the playback speakers; a predominant sound renderer for rendering the predominant sound to playback signals using the rendering matrix; an ambiance renderer for rendering the ambiance to playback signals using the rendering matrix; and an output signal composition unit for composing the playback signals from the rendered predominant sound and ambient sound.
The audio scene analysis and spatial encoder (1101) analyses the input signal and encodes it into a number of predominant sound signals and a number of ambiance sound signals, together with the corresponding predominant sound parameters and ambiance parameters.
The audio scene analysis and spatial encoding conducts the analysis of the audio scene, to decide how many sound sources or audio objects are in the input audio scene and thus how to extract and encode them. As an example, it may be determined that Principal Component Analysis (PCA) is used to extract the sound sources or audio objects, and M sound sources are extracted and encoded. During this process, the PCA parameters and the M predominant sound signals are derived. The PCA parameters and the M predominant audio signals are encoded and transmitted to the decoder side.
The generated signals may be derived according to the following equation:
C_PS(k) = M C(k) (29)
The audio scene analysis and spatial encoder may determine that the residual between the input signal and the signal synthesized from the predominant sound signals, which may be called the ambient signal, should also be extracted and encoded. The spatial encoder extracts the ambient signal from the difference between the input signal and the synthesized signal. The synthesis of the predominant sound may be done according to the equation below:
C′(k) = M^-1 C_PS(k) (30)
The ambient signal may be derived according to the equation below:
C_AMB(k) = C(k) − C′(k) (31)
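A sketch of the residual extraction of equations (29) to (31), assuming the orthogonal PCA matrix from the previous sketch and keeping only the leading components as predominant sound, so that the synthesis step stands in for M^-1 applied to the (zero-padded) predominant signals:

```python
import numpy as np

def split_predominant_ambient(C, M, n_ps):
    """Split a frame into predominant and ambient parts, equations (29)-(31).

    C: (n_coef, n_smp) input frame; M: orthogonal analysis matrix;
    n_ps: number of predominant sound signals to extract.
    """
    C_ps = M[:n_ps] @ C                          # equation (29)
    C_syn = M[:n_ps].T @ C_ps                    # equation (30), with M orthogonal
    C_amb = C - C_syn                            # equation (31)
    return C_ps, C_amb
```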
Among all the ambient signals, it is determined which of them should be encoded. The ambient signals may be processed or transformed into other formats so that they can be encoded more efficiently.
The channel assignment unit (1101) assigns the core encoders to encode the predominant sound and ambiance sound. The information about the choice of the ambient HOA coefficient sequences to be transmitted, about their assignment, and about the assignment of the predominant sound signals to the given N channels is transmitted to the decoder side.
The set of core encoders (1102, 1103 and 1104) encodes the M predominant sound signals and the (N−M) ambient signals into a set of core parameters. The encoder can be any existing or new audio codec, such as MPEG-1 Audio Layer III, AAC, HE-AAC, Dolby AC-3 or the MPEG USAC standard.
The bitstream multiplexer (1105) packs the predominant sound parameters, ambiance parameters, channel assignment information and core parameters into a bitstream.
The corresponding decoder can be the decoder illustrated in the second embodiment.
The audio scene analysis and spatial encoder (1201) analyses and encodes the input signal.
The audio scene analysis and spatial encoding conducts the analysis of the audio scene to decide whether the generated parameters are compatible with the invented idea, and reflects the decision by transmitting the LC_RENDER_FLAG.
If all the generated parameters, such as PS parameter data, ambiance parameter data, channel assignment parameter data and core coder data, are compatible with the invented idea, it is recommended to use the invented idea to achieve low complexity composition and rendering on the decoder side; therefore the rendering flag LC_RENDER_FLAG is set to 1.
If not all the generated parameters are compatible with the invented idea, for example when there are prediction data and near field compensation data which make it impractical to use the invented idea on the decoder side, it is recommended to use the conventional decoding, composition and rendering tools; therefore the rendering flag LC_RENDER_FLAG is set to 0.
Effect: In this embodiment, the bitstream incompatibility problem is solved.
This is a continuation application of International Application No. PCT/JP2014/059700, with an international filing date of Mar. 26, 2014, the content of which is incorporated herein by reference.
Number | Name | Date | Kind
---|---|---|---
20120155653 | Jax | Jun 2012 | A1
20120243690 | Engdegard | Sep 2012 | A1
20120259442 | Jin et al. | Oct 2012 | A1
20130010971 | Batke et al. | Jan 2013 | A1
20130132098 | Beack | May 2013 | A1

Number | Date | Country
---|---|---
2008046530 | Apr 2008 | WO
2010013450 | Feb 2010 | WO
2013171083 | Nov 2013 | WO

Entry
---
V. Pulkki, "Virtual Sound Source Positioning Using Vector Base Amplitude Panning", J. Audio Eng. Soc., vol. 45, no. 6, Jun. 1997, pp. 456-466.
T. Lossius et al., "DBAP—Distance-Based Amplitude Panning", International Computer Music Conference (ICMC), Montreal, 2009, pp. 1-4.
International Search Report (ISR) dated Aug. 26, 2014 in International (PCT) Application No. PCT/JP2014/059700.
International Preliminary Report on Patentability (IPROP) dated Jun. 8, 2016 in International (PCT) Application No. PCT/JP2014/059700.

Number | Date | Country
---|---|---
20170011750 A1 | Jan 2017 | US

 | Number | Date | Country
---|---|---|---
Parent | PCT/JP2014/059700 | Mar 2014 | US
Child | 15274415 | | US