The present document contains subject matter related to Japanese Patent Applications JP 2006-324775 and JP 2007-272856 filed in the Japanese Patent Office on Nov. 30, 2006, and Oct. 19, 2007, respectively, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a playback method and apparatus, a program, and a recording medium for decode-processing and playing back coded audio data which is transmitted with stereo process information intermittently multiplexed into coded information of a monaural audio signal.
2. Description of Related Art
Playback apparatuses are known which are supplied with a monaural audio signal and stereo process information, and which generate stereo audio signals by stereo processing the monaural audio signal on the basis of the stereo process information.
A typical stereo process such as above which is based on a monaural audio signal and stereo process information will now be described with reference to the drawings.
In
In the configuration of
Meanwhile, in a discontinuous frame playback, such as a fast-forwarding playback based on a playback by decimating frames (transmission units), or in a playback from an arbitrary frame, multiplexed coded information may drop out in some cases. When coded audio data is supplied from an arbitrary frame (transmission unit) due to such a discontinuous frame playback or the like, the absence of usable stereo process information may occur. For example, when the input starts at a position corresponding to the transmission unit #2 of
In the apparatus of
Accordingly, if the data is supplied at the position corresponding to the transmission unit #2 of
Here, a specific example of a coding system will be described below, by which part of coding information for the stereo process and the like is multiplexed into a monaural audio signal to be transmitted.
Audio data coded by, e.g., an HE AAC (High Efficiency Advanced Audio Coding, International Standard ISO/IEC 14496-3) coding system, particularly, an HE AAC v2 (version 2) coding system, is transmitted with part of coded information required for decoding, multiplexed thereinto. This HE AAC v2 coding system is configured by combining three technologies, i.e., an advanced audio coding (AAC) process, a spectral band replication (SBR) process, and a parametric stereo (PS) process. Coded information for the SBR process and the PS process is transmitted as partially multiplexed.
The AAC process is a coding process in an audio compression algorithm standardized by MPEG (Moving Picture Experts Group) audio. The SBR process is a coding process for band extension which divides an input signal into a plurality of subbands and replicates high frequency bands from the lower frequency bands thereof. The PS process is a coding process for spatial coding using spatial information and the like required for generating stereo signals from a monaural signal.
Coded audio data which is coded by the above-mentioned HE AAC v2 system includes AAC core coded information equivalent to monaural audio data coded by the above-mentioned AAC coding system, the coded information for the above-mentioned SBR process, and the coded information for the above-mentioned PS process. The coded information for the SBR process includes coded information (sbr header) which is multiplexed and intermittently transmitted, and coded information (sbr data) which is always transmitted. For decoding the sbr data (SBR data), the sbr header (SBR header) is required. The content of the sbr header (SBR header) can be changed under a specific rule, and its transmission timing depends on operational practice. The coded information (ps data) for the PS process is transmitted within an extended area of the sbr data (SBR data). Thus, for decoding the ps data (PS data), the sbr header (SBR header) information is likewise required. That is, the sbr header (SBR header) is the necessary stereo process information required for acquiring the ps data (PS data) for the stereo process.
As shown in
Upon arrival of a frame containing multiplexed SBR header SH, the above-mentioned SBR data SD and the PS data contained in its extended area are decoded using this SBR header SH. Then, a “complete” decoding process (including the stereo process) using these SBR data and PS data is performed to generate output stereo audio signals. In the decoding process for the above-mentioned HE AAC v2 coded audio data, the above-mentioned AAC decoding process is performed, and then in the above-mentioned SBR process, band division and generation of high frequency (HF) components are performed, after which stereo signals are generated from the band-divided monaural signals on the basis of spatial information coded in the above-mentioned PS process, and finally output stereo audio signals are generated by a band synthesis process in the SBR process.
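The dependence of the "complete" decoding process on the intermittently transmitted SBR header can be illustrated with a minimal sketch. The frame representation (a dictionary with an `sbr_header` flag), the function name `playback_modes`, and the header interval of four frames are hypothetical and chosen only for illustration; the sketch also ignores the initialization triggered by a changed header.

```python
def playback_modes(frames, start):
    """Return the decoding mode of each frame from index `start` onward.

    A frame is decoded as "mono" (AAC core only) until the first frame
    carrying an SBR header has been seen, and as "full" (SBR and PS
    processes included) from that frame on.
    """
    have_header = False
    modes = []
    for frame in frames[start:]:
        if frame["sbr_header"]:
            have_header = True
        modes.append("full" if have_header else "mono")
    return modes


# Hypothetical stream: an SBR header is multiplexed into every fourth frame.
frames = [{"sbr_header": i % 4 == 0} for i in range(8)]

from_start = playback_modes(frames, 0)      # header is in the first frame
from_middle = playback_modes(frames, 1)     # playback from an arbitrary frame
```

When playback starts at frame 1 in this sketch, three frames are decoded without the SBR and PS processes before the next header arrives at frame 4.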
In the HE AAC v2 coding system, when part of the SBR header SH differs from that contained in a previous frame, an initialization for the SBR process needs to be performed. By the initialization for the SBR process, state variables (delay signals) in QMF analyzers/synthesizers, a hybrid analyzer, and the like, later-described, are initialized. A state variable (delay signal) herein used is intended to mean data (signal) held at a delay element within a filter. In a filtering process, a delay occurs within a period from the input to the output of a signal in accordance with a filtering length, and the state variable means this delay signal.
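The meaning of a state variable (delay signal) can be illustrated with a minimal sketch of a filter that holds past input samples between calls. The class `FirFilter` and its four-tap averaging coefficients are hypothetical and unrelated to the actual QMF or hybrid filters; the sketch only shows that a filter whose delay line has just been initialized to 0 produces different output from one whose delay line is already updated.

```python
class FirFilter:
    """Toy FIR filter; `state` is the delay line (the state variable)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.state = [0.0] * (len(self.coeffs) - 1)  # initialized to 0

    def reset(self):
        """Initialization: every held delay sample is set to 0."""
        self.state = [0.0] * (len(self.coeffs) - 1)

    def process(self, x):
        # Newest sample first, followed by the held past samples.
        taps = [x] + self.state
        y = sum(c * t for c, t in zip(self.coeffs, taps))
        self.state = taps[:-1]  # update the delay line
        return y


coeffs = [0.25, 0.25, 0.25, 0.25]  # hypothetical 4-tap moving average

warm = FirFilter(coeffs)
for _ in range(3):
    warm.process(1.0)              # delay line is now fully updated

cold = FirFilter(coeffs)           # delay line is as initialized

warm_out = [warm.process(1.0) for _ in range(4)]
cold_out = [cold.process(1.0) for _ in range(4)]
```

For the same constant input, the updated filter outputs 1.0 from the first sample, whereas the initialized filter ramps up over as many samples as its delay line is long.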
Monaural audio data acquired by decoding the AAC coded information which is coded by the HE AAC v2 coding system is up-sampled by carrying out QMF analysis and QMF synthesis in the SBR process. For example, the apparatus SBR-processes the monaural audio data obtained by the AAC decoding at a sampling rate of 24 kHz, and outputs audio data whose sampling rate is 48 kHz.
In
If stereo process information is acquired from the above-mentioned PS coded information (PS data), the selector switches 22, 25 are switched for connection to selectable terminals C. Signals from the selectable terminal C of the selector switch 25 are delivered to a hybrid analyzer 27. The hybrid analyzer 27 further band-divides low frequency (LF) signals of the supplied band-divided signals, and supplies resultant signals to a signal de-correlator 29 and a stereo processor 30. The signal de-correlator 29 de-correlates the supplied signals, makes an acoustic adjustment thereon, and supplies resultant signals to the stereo processor 30. The stereo processor 30 generates Lch, Rch stereo signals from the supplied band-divided signals and stereo process information. For the generated Lch, Rch stereo signals, hybrid synthesizers 31, 32 of the respective channels band-synthesize the band-divided signals obtained by the above-mentioned hybrid analyzer 27, and further, QMF synthesizers 33, 34 band-synthesize the band-divided signals obtained by the above-mentioned QMF analyzer 21, to generate Lch, Rch stereo output audio signals. The Lch audio signal from the QMF synthesizer 33 is delivered to a selector switch 36 and an output terminal 37. The Rch audio signal from the QMF synthesizer 34 is delivered to the selector switch 36, where one of this Rch audio signal and the signal from the QMF synthesizer 33 is selected, and the selected signal is delivered to an output terminal 38.
If multiplexed information such as the above-mentioned stereo process information is not transmitted, the selector switches 22, 25, 35, 36 of
In
In step S104, a QMF band division process is performed by, e.g., the above-mentioned QMF analyzer 21. In the following step S105, it is judged whether or not the multiplexed coded information is already decoded, and if YES, the process proceeds to step S106, whereas if NO, the process proceeds to step S113. In step S106, an HF signal generation process is performed using the multiplexed HF generation coded information (already decoded information) by, e.g., the above-mentioned HF generator 23, and then, in the following step S107, it is judged whether or not the PS process is to be performed.
If it is judged YES (the PS process is to be performed) in step S107, control proceeds to step S108, where a hybrid analysis process is performed. Then, in step S109, a stereo signal generation process based on the spatial information is performed, and further in step S110, a hybrid synthesis process is performed. Thereafter, control proceeds to step S111. These processes correspond to, e.g., processing extending from the processing performed by the hybrid analyzer 27 to the processing performed by the hybrid synthesizers 31, 32 of
In step S111, an Lch QMF band synthesis process is performed, and in step S112, an Rch QMF band synthesis process is performed, and resultant audio signals are outputted. Furthermore, in the above-mentioned step S113, the Lch QMF band synthesis process is performed, and in step S114, the monaural signal is replicated, as necessary, to generate stereo signals, and resultant audio signals are outputted. These processes correspond to, e.g., the processing performed by the QMF synthesizers 33, 34 via the selector switches 22, 35, 36 of the above-mentioned
As related-art technologies, Published Translation of International Patent Application (KOHYO) No. 2004-535145 (Patent Reference 1) and Japanese Patent Application Publication (KOKAI) No. JP 2006-085183 (Patent Reference 2) disclose a technology for generating stereo audio signals by stereo-processing a monaural audio signal on the basis of stereo process information, and ISO/IEC 14496-3:2005, Information technology - Coding of audio-visual objects - Part 3: Audio (Non-patent Reference 1) discloses a standard of the above-mentioned HE AAC (High Efficiency Advanced Audio Coding) coding system.
In a playback from an arbitrary frame, e.g., a playback of discontinuous frames such as the above-mentioned frame decimation playback, the internal state variables are initialized, and thereafter, when partially multiplexed coded information such as the stereo process information is supplied, the updating of these state variables is started. Consequently, abnormal sounds occur due to the influence of the filtering delays and the like.
For example, in the configuration of the above-mentioned
Furthermore, in the case of the configuration of the above-mentioned
In the case where frames are played back discontinuously in this way, the state variables (delay signals) of the filters within the playback apparatus and the input audio data coded by the HE AAC v2 coding system result in discontinuity. Thus, the playback apparatus needs to be initialized (including SBR process initialization) to initialize its internal state variables. These state variables (delay signals) within the playback apparatus include state variables of the QMF analyzer 21, QMF synthesizers 33, 34, and hybrid analyzer 27, and these state variables are set to 0 when initialized. Since the SBR coded information/PS coded information cannot be decoded until an SBR header SH is transmitted, the playback apparatus switches the selector switches 22, 35, 36 to their selectable terminals A to allow the monaural audio signal from the AAC core decoder 13 to be up-sampled through processing by the QMF analyzer 21 and the Lch QMF synthesizer 33, to output resultant output audio signals to the stereo left and right channels. When an SBR header SH is transmitted, the SBR coded information and the PS coded information are decoded for the first time after the initialization of the playback apparatus, and the SBR process and the PS process are executed. Since the QMF analyzer 21 and the Lch QMF synthesizer 33 perform their processing for up-sampling even before the SBR header SH is transmitted, their state variables are kept updated. Meanwhile, the state variable of each of the hybrid analyzer 27 and the Rch QMF synthesizer 34 is in an initialized state. This state exerts influence on the downstream processing, thereby causing abnormal sounds in the output audio signals.
In
For avoiding the disadvantage, it is conceivable to constantly monitor multiplexed coded information. In this case, the multiplexed information is transmitted simultaneously with normal coded information. Thus, all the coded information needs to be decoded, and this prevents reduction of the processing volume.
In view of the above circumstances, it is desirable to provide a playback apparatus and method, a program, and a recording medium, all being capable of effectively preventing negative influence (occurrence of abnormal sounds and the like) from being exerted on output audio signals, the negative influence being caused by filtering delays and the like that occur when required coded information is supplied from a state in which internal state variables are as initialized, in a case where a playback is performed from an arbitrary position because multiplexed coded information and information (SBR header and the like) required for decoding are transmitted intermittently.
In one embodiment of the present invention, coded audio data, which is transmitted with necessary stereo process information required for a stereo process intermittently multiplexed into coded information of a monaural audio signal, is decode-processed and played back as follows. If the necessary stereo process information is not supplied, stereo audio signals are output using the monaural audio signal. If the necessary stereo process information is supplied, updating of state variables within filters is started, and the stereo audio signals continue to be output using the monaural audio signal until all the state variables are updated. Once all the state variables within the filters are updated, the stereo process based on stereo process information acquired by means of the necessary stereo process information is performed on the monaural audio signal to generate and output stereo audio signals.
Here, it is preferable to perform the above-mentioned stereo process on band-extended monaural audio signals.
Furthermore, it is preferable to divide the above-mentioned monaural audio signal into at least two subbands by a band division filtering process, up-sample resultant band-divided monaural audio signal by a band synthesis filtering process, and output the stereo audio signals using the monaural audio signal, if the above-mentioned necessary stereo process information is not supplied. If the above-mentioned necessary stereo process information is supplied, it is preferable to process a state variable within a filter for the monaural audio signal as filtering state variables for the stereo audio signals.
Furthermore, the above-mentioned coded audio data has AAC core coded information equivalent to the monaural audio data based on an HE AAC (High Efficiency Advanced Audio Coding) coding system, coded information for an SBR (Spectral Band Replication) process, and coded information for a PS (Parametric Stereo) process. The coded information for the above-mentioned SBR process includes SBR data (sbr data) being coded information which is always transmitted, and an SBR header (sbr header) being coded information which is intermittently transmitted as multiplexed. PS data (ps data) being the coded information for the above-mentioned PS process is transmitted as contained in an extended area of the above-mentioned SBR data. The SBR header is the above-mentioned necessary stereo process information required for decoding the above-mentioned SBR data.
These and other features and aspects of the invention are set forth in detail below with reference to the accompanying drawings in the following detailed description of the embodiments.
A specific embodiment of the present invention will be described below in detail with reference to the accompanying drawings.
A monaural audio signal is supplied to an input terminal 41 and stereo process information is supplied to an input terminal 42, of
In a case where an input signal (a monaural audio signal M and intermittent stereo process information S) such as shown in the above-mentioned
If the usable stereo process information is available in this way, the switch 43X is connected to a selectable terminal B, the switches 61, 62 are connected to selectable terminals C, and the selector switches 53X, 54X are switched for connection to selectable terminals C. Under this condition, the monaural audio signal supplied from the input terminal 41 is band-divided by the band divider 44, and stereo signals are generated by the stereo processor 45 on the basis of the stereo process information. Then, the generated stereo signals are band-synthesized by the band synthesizers 51, 52 of the respective channels, and resultant Lch, Rch stereo audio signals are outputted from the output terminals 55, 56, respectively.
Meanwhile, when coded audio data is supplied from an arbitrary frame (transmission unit) due to a discontinuous frame playback such as a fast-forwarding playback, or the like, the absence of usable stereo process information may occur. For example, when the input starts at the position corresponding to the transmission unit #2 of
Then, when the data at the position corresponding to the transmission unit #5 of the above-mentioned
Namely, in the embodiment of the present invention, when coded audio data, which is transmitted with stereo process information intermittently multiplexed into coded information of a monaural audio signal, is to be decode-processed and played back, if the stereo process information is not supplied, it is arranged to output stereo signals using the monaural audio signal, whereas if the stereo process information is supplied, it is arranged to start updating state variables within filters, and to output the stereo audio signals using the monaural audio signal until all the state variables are updated. Then, if all the state variables within the filters are updated, it is arranged to perform a stereo process based on the stereo process information, on the monaural audio signal to generate and output stereo audio signals.
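The three conditions summarized above can be sketched as a small state machine. The names `Mode` and `run`, the per-frame granularity, and the parameter `warmup_frames` are assumptions made for illustration only; in the embodiment the update period is determined by the filter delays rather than by a frame count.

```python
from enum import Enum


class Mode(Enum):
    MONO = "mono"      # no usable stereo process information yet
    WARMUP = "warmup"  # information supplied; state variables being updated
    STEREO = "stereo"  # all state variables updated


def run(frames_have_info, warmup_frames=1):
    """Return the kind of output produced for each input frame.

    `frames_have_info[i]` is True when frame i carries the stereo
    process information. `warmup_frames` is a hypothetical number of
    frames needed to update all state variables.
    """
    mode = Mode.MONO
    remaining = 0
    outputs = []
    for has_info in frames_have_info:
        if mode is Mode.MONO and has_info:
            mode = Mode.WARMUP          # start updating state variables
            remaining = warmup_frames
        if mode is Mode.WARMUP:
            if remaining == 0:
                mode = Mode.STEREO      # all state variables updated
            else:
                remaining -= 1
        outputs.append("stereo" if mode is Mode.STEREO else "mono-duplicated")
    return outputs


result = run([False, False, True, False, False])
```

In this sketch the frame in which the information first arrives is still output from the monaural signal, and the stereo process takes over only after the warm-up has finished.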
Next, a configuration example of a playback apparatus will be described with reference to
A coded audio stream is supplied, by transmission, to an input terminal 11 of
Furthermore, if the HF generation coded information (SBR data) and the PS coded information (PS data) are contained, an audio signal decoded by an AAC core decoder 13 is outputted at half the sampling rate of the final output audio signals. Thus, by combining a QMF analyzer 21 with QMF synthesizers 33, 34, the audio signal is up-sampled. For example, if an output signal from the AAC core decoder 13 is a signal whose sampling frequency is 24 kHz, the output audio signals from the QMF synthesizers 33, 34 are signals whose sampling frequency is 48 kHz.
The coded audio data from the input terminal 11 is delivered to a bitstream payload deformatter, that is, a payload deformatter 12, which separates it into the AAC core coded information, delivered to the AAC core decoder 13, and the HF generation coded information/PS coded information.
The HF generation coded information/PS coded information is delivered to an SBR processor 20, and then delivered to a Huffman decoder/dequantizer 15 via a bit stream parser, that is, a parser 14 of the SBR processor 20. At the Huffman decoder/dequantizer 15, HF signal generation information, envelope adjustment information, and stereo process information are extracted. The former two items of the extracted information are delivered to an HF generator 23 and an envelope adjuster 24, respectively, whereas the latter one item is delivered to a stereo processor 30 via an Lch replication process judgment section 16. The parser 14 of the SBR processor 20 acquires multiplexed information such as the HF generation coded information and the like from the payload deformatter 12, checks their content, judges whether or not an SBR process initialization is needed, and if so, outputs an initialization control signal from a terminal 14t, so that an SBR process initialization is performed on the relevant sections as later described. Furthermore, the Lch replication process judgment section 16 judges that multiplexed coded information is acquired for the first time after the SBR process initialization, and outputs a judgment output from a terminal 16t, so that the Rch QMF synthesizer 34 performs a later-described process of replicating a state variable (delay signal) of the Lch QMF synthesizer 33.
The AAC core decoder 13 decodes the supplied AAC core coded information, and generates an AAC core monaural audio signal. The decoder 13 delivers the generated monaural audio signal to the QMF analyzer 21 of the SBR processor 20. The QMF analyzer 21 band-divides the monaural audio signal into sixty-four bands, and delivers resultant band-divided signals to a selector switch 22X. If the HF generation coded information (SBR data) is supplied, the selector switch 22X is switched for connection to a selectable terminal B or C, so that the signals from the QMF analyzer 21 are delivered to the HF generator 23. The HF generator 23 generates HF signals, and the envelope adjuster 24 makes an envelope adjustment. The envelope adjuster 24 delivers resultant signals to a hybrid analyzer 27 and a selector switch 35X.
If stereo process information is acquired from the above-mentioned PS coded information (PS data), the selector switch 22X is switched for connection to the selectable terminal C. The hybrid analyzer 27 further band-divides LF signals of the supplied band-divided signals, and supplies resultant further band-divided signals to a signal de-correlator 29 and the stereo processor 30, together with HF ones of the previously band-divided signals. The signal de-correlator 29 de-correlates the supplied signals, makes an acoustic adjustment thereon, and supplies resultant signals to the stereo processor 30. The stereo processor 30 generates Lch, Rch stereo signals from the supplied band-divided signals and the stereo process information. The generated stereo signals of the respective channels are delivered to hybrid synthesizers 31, 32 of the respective channels via switches 17, 18, respectively. The hybrid synthesizers 31, 32 band-synthesize the divided bands obtained by the above-mentioned hybrid analyzer 27. Resultant signals from the hybrid synthesizer 31 are delivered to the QMF synthesizer 33 and a selector switch 19 via the selector switch 35X, whereas resultant signals from the hybrid synthesizer 32 are delivered to the QMF synthesizer 34 via the selector switch 19. The QMF synthesizers 33, 34 of the respective channels band-synthesize the divided bands obtained by the above-mentioned QMF analyzer 21, to generate Lch, Rch stereo output audio signals, respectively. The Lch audio signal from the QMF synthesizer 33 is delivered to a selector switch 36X and an output terminal 37. The Rch audio signal from the QMF synthesizer 34 is delivered to the selector switch 36X, where one of this Rch audio signal and the signal from the QMF synthesizer 33 is selected, and the selected signal is delivered to an output terminal 38.
Here, operations of various sections including switching of the playback apparatus of
When compared with the configuration of the playback apparatus shown in the above-mentioned
A case will be described where coded audio data is supplied from an arbitrary frame (transmission unit) as mentioned above, in the playback apparatus of
Here, in the state in which the usable stereo process information (PS data) cannot be acquired and hence the internal state variables are initialized, the selector switches 22X, 35X, 36X are switched for connection to the selectable terminals A. Under this condition, the QMF analyzer 21 band-divides the monaural audio signal from the AAC core decoder 13, and the Lch QMF synthesizer 33 band-synthesizes the band-divided signals to output identical audio signals from the left and right channels.
Then, when multiplexed coded information is transmitted, the selector switches 22X, 35X, 19, 36X are switched for connection to their selectable terminals B or C. In this case, the terminals B are selected when the coded information contains only band extension coded information, whereas the terminals C are selected when the coded information contains the band extension coded information (HF generation information) and stereo process information.
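The choice among the selectable terminals A, B, and C described so far can be sketched as a simple decision function. The function name `select_terminal` and its boolean parameters are hypothetical; the sketch covers only terminals A, B, and C, and assumes that stereo process information never arrives without band extension coded information.

```python
def select_terminal(has_band_extension, has_stereo_info):
    """Pick the selectable terminal for the selector switches.

    "A": no usable multiplexed coded information (monaural up-sampling only)
    "B": band extension coded information only
    "C": band extension coded information and stereo process information
    """
    if has_band_extension and has_stereo_info:
        return "C"
    if has_band_extension:
        return "B"
    # Assumption: stereo information without band extension is treated
    # the same as no multiplexed information at all.
    return "A"


terminals = [
    select_terminal(False, False),
    select_terminal(True, False),
    select_terminal(True, True),
]
```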
A case will be described below where an SBR header SH being the necessary stereo process information is transmitted, whereby the playback apparatus decodes SBR data SD, and thus acquires stereo process information (PS data). When the coded information (SBR data) for the SBR process and the stereo process information (PS data) are acquired, the apparatus becomes ready to deliver a signal to the Rch QMF synthesizer 34 for the first time. For this reason, when generating output audio signals without considering the state variables (delay signals), the apparatus outputs a state variable initialization signal to the Rch audio signal, thereby causing abnormal sounds. In view of this, in the embodiment of the present invention, the judgment output from the Lch replication process judgment section 16 is used at this timing to replicate the state variable (delay signal) of the Lch QMF synthesizer 33 for the Rch QMF synthesizer 34 in a state variable replication process. Through this operation, a state variable equivalent to the state variable of the Lch QMF synthesizer 33 is set to the Rch QMF synthesizer 34 despite the fact that the playback apparatus was playing back the coded audio data with the selector switches connected to their selectable terminals A until the stereo process information was transmitted. When the above-mentioned replication process is executed, the selector switches 22X, 35X, 19, 36X are switched for connection to selectable terminals F.
Usually, when an irrelevant, arbitrary signal is used as a delay signal during a band synthesis process, unexpected amplification/damping occurs during the band synthesis process, thereby causing abnormal sounds. In a method according to the present embodiment, any frame from which multiplexed coded information is acquired for the first time after an initialization marks a switching point from monaural output to stereo output, so that even if the state variable (delay signal) of the Lch QMF synthesizer 33 is used as a state variable (delay signal) of the Rch QMF synthesizer 34, no abnormal sounds will occur.
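The state variable replication process described above can be sketched as copying the delay line of one filter object into another. The class `SynthFilter`, its four-sample delay line, and the function `replicate_state` are hypothetical stand-ins for the Lch and Rch QMF synthesizers, not their actual structure.

```python
class SynthFilter:
    """Toy synthesis filter that only models its delay line."""

    def __init__(self, length):
        self.delay = [0.0] * length  # state variable, initialized to 0

    def push(self, x):
        # Shift the newest sample in and drop the oldest one.
        self.delay = [x] + self.delay[:-1]


def replicate_state(src, dst):
    """Set dst's state variable to a copy of src's state variable."""
    dst.delay = list(src.delay)  # copy, so later updates stay independent


lch = SynthFilter(4)
rch = SynthFilter(4)

# Monaural playback: only the Lch synthesizer is kept updated.
for sample in [0.1, 0.2, 0.3]:
    lch.push(sample)

# Stereo process information acquired for the first time: replicate.
replicate_state(lch, rch)
```

A copy rather than a shared reference is used so that, once stereo output begins, each channel updates its own delay line independently.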
Further, in the stereo process (PS process), in order to apply spatial coded information, the playback apparatus performs band division by the hybrid analyzer 27, a stereo signal generation process based on the de-correlation result from the signal de-correlator 29 and the transmitted spatial information, and hybrid synthesis. Since the hybrid analyzer 27 requiring a delay also performs its process for the first time after multiplexed coded information is decoded, its state variable (delay signal) at the time when the multiplexed coded information is acquired for the first time after the initialization of a variable within the decoder is as initialized, and this influences de-correlation by the signal de-correlator 29, thereby causing abnormal sounds. Namely, the band-divided signals obtained by the QMF analyzer 21 are supplied to the hybrid analyzer 27, and since the state variable (delay signal) of the hybrid analyzer 27 is as initialized, the downstream processing is not performed correctly.
In view of this, in the present embodiment, in order to eliminate this influence, when the hybrid analyzer 27 performs its process for the first time after an initialization, the playback apparatus performs a process of updating Lch, Rch stereo signal generation coefficients for both the hybrid analyzer 27 and the stereo processor 30 in order to update their delay signals. For output, the switches 35X, 19 are switched to the selectable terminals F, so that signals branched before the hybrid analyzer 27 are outputted to the QMF synthesizers 33, 34 of the respective channels.
Specifically, the stereo signals are disconnected by the switches 17, 18 (the switches 17, 18 are turned off) until the state variable (delay signal) of the hybrid analyzer 27 is fully updated. Instead, the signals delivered via the selectable terminals F of the selector switches 22X, 35X are delivered to the Lch QMF synthesizer 33 and to the Rch QMF synthesizer 34 via the selectable terminal F of the selector switch 19. A resultant signal from the Lch QMF synthesizer 33 is outputted from the output terminal 37, whereas a resultant signal from the Rch QMF synthesizer 34 whose state variable is identical with that of the Lch QMF synthesizer 33 is outputted from the output terminal 38 via the selectable terminal F of the selector switch 36X.
The state variable (delay signal) of the hybrid analyzer 27, as clearly described in Section 8.6.4 of the above-cited Non-Patent Reference 1, has a delay by 6 QMF samples. The process of updating the Lch, Rch stereo signal generation coefficients of the stereo processor 30 is required to be performed, since the coefficients are transmitted as difference information, as described in Section 8.6.4.4 of the above-cited Non-Patent Reference 1.
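The switching from the selectable terminals F to the selectable terminals E once the 6-QMF-sample delay has been filled can be sketched with a simple counter. The class `WarmupGate` and its method names are hypothetical, and the exact sample at which the switch occurs is an assumption of this sketch: a QMF sample is routed via terminal E only if the delay line was already full before that sample was fed.

```python
HYBRID_DELAY_QMF_SAMPLES = 6  # delay of the hybrid analyzer, per the text


class WarmupGate:
    """Counts QMF samples fed to the hybrid analyzer after an initialization."""

    def __init__(self, delay=HYBRID_DELAY_QMF_SAMPLES):
        self.delay = delay
        self.fed = 0

    def reset(self):
        """Called when the state variables are initialized."""
        self.fed = 0

    def route(self):
        """Feed one QMF sample and return the terminal used for it.

        "F": bypass branch taken before the hybrid analyzer (still updating)
        "E": stereo-processed signals (state variable fully updated)
        """
        ready = self.fed >= self.delay
        if not ready:
            self.fed += 1
        return "E" if ready else "F"


gate = WarmupGate()
routing = [gate.route() for _ in range(8)]
```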
When the state variable (delay signal) of the hybrid analyzer 27 is fully updated, the switches 17, 18 are both turned on (connected to selectable terminals E), so that the Lch, Rch stereo signals from the stereo processor 30 are delivered to the hybrid synthesizers 31, 32, respectively. The selector switches 35X, 19, 36X are switched for connection to selectable terminals E, respectively, so that the signals from the hybrid synthesizer 31 are processed at the QMF synthesizer 33, and a resultant signal is outputted from the output terminal 37 as the Lch stereo audio signal, whereas the signals from the hybrid synthesizer 32 are processed at the QMF synthesizer 34, and a resultant signal is outputted from the output terminal 38 as the Rch stereo audio signal. It is noted that the playback apparatus can connect the switches 17, 18 and the selector switches 35X, 19, 36X to their selectable terminals E by updating the state variable of the Rch QMF synthesizer 34 even while updating the state variable of the hybrid analyzer 27, whereby the apparatus can switch these switches without causing abnormal sounds within its processing of a single frame.
In
In step S104, a QMF band division process is performed by, e.g., the above-mentioned QMF analyzer 21. In the following step S105, it is judged whether or not the multiplexed coded information is already decoded. If YES, the process proceeds to step S106, whereas if NO, the process proceeds to step S113. In step S106, an HF signal generation process is performed using multiplexed HF signal generation coded information (already decoded information) by, e.g., the above-mentioned HF generator 23. In the following step S107, it is judged whether or not the PS process is to be performed.
If it is judged YES (the PS process is to be performed) in step S107, the process goes to step S111 after the PS process is performed in step S120, whereas if it is judged NO (the PS process is not to be performed) in step S107, the process proceeds directly to step S111. A specific example of the PS process in step S120 will be described later with reference to
In step S111, an Lch QMF band synthesis process is performed, and in step S112, an Rch QMF band synthesis process is performed. Then, resultant audio signals are outputted. Furthermore, in the above-mentioned step S113, the Lch QMF band synthesis process is performed, and in step S114, a monaural signal is replicated, as necessary, to generate stereo signals. Then, resultant audio signals are outputted. These processes correspond to, e.g., the processing performed by the QMF synthesizers 33, 34 via the selector switches 35X, 36X, and the like of the above-mentioned
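The branching of steps S104 through S114 described above can be sketched as a function that returns the sequence of steps executed for one frame. The function name `sbr_steps` and its boolean parameters are hypothetical; the steps preceding S104 and the internals of step S120 are not modeled.

```python
def sbr_steps(info_decoded, do_ps):
    """Return the step labels executed for one frame, from S104 onward."""
    steps = ["S104"]               # QMF band division
    steps.append("S105")           # multiplexed coded information decoded?
    if not info_decoded:
        # NO in S105: Lch band synthesis, then replicate the monaural signal
        steps += ["S113", "S114"]
        return steps
    steps += ["S106", "S107"]      # HF signal generation, then PS judgment
    if do_ps:
        steps.append("S120")       # PS process
    steps += ["S111", "S112"]      # Lch and Rch QMF band synthesis
    return steps


mono_path = sbr_steps(info_decoded=False, do_ps=False)
sbr_only_path = sbr_steps(info_decoded=True, do_ps=False)
stereo_path = sbr_steps(info_decoded=True, do_ps=True)
```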
In these specific examples shown in
Next,
In step S115, it is judged whether or not the state variable (e.g., the state variable of the QMF synthesizer 34 of
In these specific examples shown in
The Lch, Rch stereo output audio signals shown in
According to the above-described embodiment of the present invention, in decode-processing and playing back coded audio data which is transmitted with part of coded information containing stereo process information multiplexed into a monaural audio signal, it is arranged to initialize internal state variables (delay signals) under a state in which the above-mentioned multiplexed coded information which is usable is not supplied, and to output stereo audio signals using the monaural audio signal. When the above-mentioned multiplexed coded information is supplied in a state in which the above-mentioned internal state variables are initialized, it is arranged to start updating the internal state variables, and to output the stereo audio signals using the monaural audio signal until all the state variables are updated. When all the above-mentioned state variables are updated, it is arranged to perform a signal process including a stereo process based on the above-mentioned multiplexed coded information, on the above-mentioned monaural audio signal to generate and output stereo audio signals.
Namely, in decode-processing and playing back coded audio data which is transmitted with part of coded information containing stereo process information intermittently multiplexed into a monaural audio signal, if the stereo process information is not supplied, it is arranged to output stereo audio signals using the monaural audio signal. If the stereo process information is supplied, it is arranged to start updating internal state variables within filters, and to output the stereo audio signals using the monaural audio signal until all the state variables are updated. If all the state variables within the filters are updated, it is arranged to perform a stereo process based on the stereo process information, on the monaural audio signal to generate and output stereo audio signals.
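The three conditions summarized above behave like a small state machine, which may be sketched as follows. This is an illustration under stated assumptions: the names `Mode`, `PlaybackState`, `slots_to_fill` and `stereo_process` are hypothetical, the number of calls needed to update all state variables is an arbitrary parameter, and `stereo_process` is a placeholder that merely returns two different values rather than an actual PS process.

```python
from enum import Enum


class Mode(Enum):
    MONO_REPLICATED = 1  # no usable stereo process information supplied yet
    UPDATING = 2         # information supplied, state variables still being updated
    STEREO = 3           # all state variables updated, stereo process active


def stereo_process(mono):
    """Placeholder for the stereo process (assumption, not the PS algorithm)."""
    return (mono, 0.5 * mono)


class PlaybackState:
    def __init__(self, slots_to_fill):
        self.slots_to_fill = slots_to_fill  # hypothetical length of the filter delay
        self.filled = 0
        self.mode = Mode.MONO_REPLICATED

    def reset(self):
        """Discontinuous playback: initialize the state variables."""
        self.filled = 0
        self.mode = Mode.MONO_REPLICATED

    def step(self, info_available, mono):
        """Process one input value and return an (Lch, Rch) pair."""
        if self.mode is Mode.MONO_REPLICATED and info_available:
            self.mode = Mode.UPDATING        # start updating the state variables
        if self.mode is Mode.UPDATING:
            self.filled += 1
            if self.filled >= self.slots_to_fill:
                self.mode = Mode.STEREO      # takes effect from the next call
            return (mono, mono)              # keep outputting the replicated mono signal
        if self.mode is Mode.STEREO:
            return stereo_process(mono)
        return (mono, mono)                  # no stereo process information yet
```

In this sketch the output stays a replicated monaural pair throughout the updating period, and the stereo process is applied only from the call after the last state variable has been updated.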
In another embodiment of the present invention, there is provided a coded audio data playback apparatus. The playback apparatus includes decoding means, information acquisition means, audio signal band division means, high frequency information generation means, stereo signal generation means, subband-divided signal synthesis means, and output audio signal generation means. The decoding means decodes coded audio data which is transmitted with part of coded information multiplexed thereinto. The information acquisition means acquires information for generating output audio signals from part of the transmitted coded information even if the part of the multiplexed coded information is not transmitted. The audio signal band division means performs division into at least two subbands to generate band-divided signals. The high frequency information generation means generates high frequency information for the generated band-divided signals when band extension coded information is transmitted. The stereo signal generation means causes subband-divided signal generation means requiring a delay to generate subband-divided signals with regard to the band-divided signal, and generates stereo signals from a monaural signal based on spatial coded information, when the spatial coded information is transmitted. The subband-divided signal synthesis means synthesizes the subband-divided signals into the band-divided signals. The output audio signal generation means causes audio signal synthesis means requiring a delay to synthesize the synthesized band-divided signals to generate output audio signals. In the playback apparatus, in a playback from a discontinuous position (frame), there are provided subband signal generation means, state variable initialization means, playback continuing means, and monaural signal state variable utilization means. The subband signal generation means requires a delay of the coded audio data playback apparatus. 
The state variable initialization means initializes state variables (delay signals) of the audio signal synthesis means. The playback continuing means continues the playback after the above-mentioned initialization. The monaural signal state variable utilization means performs, in decoding the spatial coded information when the multiplexed coded information is transmitted for a first time after the above-mentioned initialization, and in generating the stereo signals from the monaural signal, a process using a state variable (delay signal) for the monaural signal as the state variables (delay signals) of audio signal synthesis means for the generated stereo signals.
Furthermore, there are also provided pseudo-subband-divided signal generation means, replication and output means, division coefficient updating means, and stereo signal generation performing means. The pseudo-subband-divided signal generation means performs, in decoding the spatial coded information when the multiplexed coded information is transmitted for a first time after such an initialization of delay signals of the coded audio data playback apparatus, and in generating the stereo signals from the monaural signal, subband-divided signal generation in a pseudo manner until all state variables (delay signals) of the subband-divided signal generation means are updated. The replication and output means replicates monaural band-divided signals supplied to the subband-divided signal generation means during a period in which the pseudo-subband-divided signal generation means is operating in the pseudo manner, and outputs stereo band-divided signals to the audio signal synthesis means. The division coefficient updating means updates division coefficients to be updated by a difference of the stereo signal generation means for generating the stereo signals from the monaural signal, during the period in which the pseudo-subband-divided signal generation means is operating in the pseudo manner. The stereo signal generation performing means generates the stereo signals from the monaural signal on the basis of the spatial coded information after all the delay signals of the subband-divided signal generation means are updated.
Namely, in performing a normal playback from an arbitrary frame by a decoding process for coded audio data which is transmitted with part of coded information multiplexed thereinto, it is arranged to initialize a delay signal of a decoder, to divide into at least two subbands even if the coded information which is transmitted as multiplexed is absent, and to perform up-sampling by a band synthesis filtering process requiring a delay and to replicate the monaural audio signal, whereby the replicated monaural audio signals can be outputted as stereo audio signals. When the coded information is transmitted for the first time and the spatial coded information thus becomes effective, it is arranged to process a delay signal for an audio signal band synthesis process for the monaural signal as delay signals for audio signal band synthesis processes for the stereo signals, whereby occurrence of abnormal sounds in the output audio signals due to delays in QMF synthesis filtering processes can be prevented.
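The effect of reusing the delay signal of the monaural band synthesis process for the stereo band synthesis processes can be illustrated with a toy filter. The following is only a sketch under assumptions: `DelayFilter` is a hypothetical three-tap FIR filter whose delay line stands in for the state variable of a QMF synthesizer, and the tap values and input samples are arbitrary.

```python
class DelayFilter:
    """Toy FIR filter; its delay line stands in for a synthesis filter state variable."""

    def __init__(self, taps):
        self.taps = list(taps)
        self.state = [0.0] * len(self.taps)  # initialized delay signal

    def process(self, x):
        self.state = [x] + self.state[:-1]   # shift the newest sample into the delay line
        return sum(t * s for t, s in zip(self.taps, self.state))


taps = [0.5, 0.3, 0.2]                       # arbitrary illustrative coefficients

# Monaural period: only the Lch synthesis filter is run.
lch = DelayFilter(taps)
for x in [1.0, 1.0, 1.0]:
    lch.process(x)

# Switching to stereo: one Rch filter reuses the monaural delay signal,
# another starts from an initialized (all-zero) delay signal.
rch_copied = DelayFilter(taps)
rch_copied.state = list(lch.state)
rch_zeroed = DelayFilter(taps)

out_l = lch.process(1.0)
out_copied = rch_copied.process(1.0)
out_zeroed = rch_zeroed.process(1.0)
```

In this sketch `out_copied` equals `out_l`, whereas `out_zeroed` is smaller because the zeroed delay line contributes nothing for the older samples; a level jump of this kind at the switching point corresponds to the abnormal sound that the arrangement is intended to avoid.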
Furthermore, the delay signal updating process and the output signal replication process are performed until all the delay signals of at least the subband division filtering process are updated so that the delay in the subband division filtering process will not affect the output audio signals. Then, after all the delay signals are updated, a normal playback process is performed, whereby occurrence of abnormal sounds in the output audio signals due to the delays caused by the filtering processes can be prevented.
As a result of these arrangements, even in coded audio data requiring a spatial decoding process, which is transmitted with part of coded information multiplexed thereinto, a playback from an arbitrary position can be realized without causing abnormal sounds.
It is noted that the present invention is not limited only to the above-described embodiment, but can, of course, be modified in various ways without departing from the scope and spirit of the present invention. For example, in the above-described embodiment of the present invention, a playback apparatus or a playback method having a hardware configuration has been disclosed. However, the above-described process steps can be realized by software, i.e., by causing a computer using a CPU (Central Processing Unit) to execute a program. Additionally, this computer program can be provided as recorded on a recording medium.
According to the embodiments of the present invention, good stereo audio signals free from occurrence of abnormal sounds can be played back even during a transition from a state in which the necessary stereo process information is not supplied to a state in which the necessary stereo process information is supplied.
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
Number | Date | Country | Kind |
---|---|---|---|
2006-324775 | Nov 2006 | JP | national |
2007-272856 | Oct 2007 | JP | national |