The present invention relates to a digital broadcast transmitting system for transmitting information, such as audio, video, and text, in digital format over a transmission channel including ground waves and satellite waves, and to a digital broadcast transmitting apparatus used for the transmission and a digital broadcast receiving apparatus.
In recent years, digital broadcasting, which transmits information such as audio, video, and text as digital signals over transmission channels including ground waves and satellite waves, has become common.
The scheme proposed in ISO/IEC 13818-1 is well known as a scheme for transmitting digital signals. ISO/IEC 13818-1 specifies control schemes in which the transmitting apparatus side multiplexes and transmits audio, video, and other data separately encoded for respective programs, and the receiving apparatus side receives and reproduces a designated program.
Examples of well known schemes of encoding audio signals include ISO/IEC 13818-7 (MPEG-2 Audio AAC) and its derived scheme AAC+SBR. Examples of well known schemes of encoding video signals include ISO/IEC 13818-2 (MPEG-2 Video) and ISO/IEC 14496-10 (MPEG-4 AVC/H.264).
Each encoded audio signal and video signal is divided at arbitrary positions, and header information including reproduction time information is added, so that packets referred to as packetized elementary stream (PES) packets are constructed. Further, each PES packet is basically divided into units of 184 bytes, header information including an ID for identification referred to as a packet identifier (PID) is added, and the PES is thus reconstructed into packets referred to as transport packets (TSPs). Subsequently, the TSPs are multiplexed together with data packets such as text. At this time, table information referred to as program specific information (PSI), which indicates the relationship between programs and the packets making up the programs, is also multiplexed.
In the PSI, four kinds of tables, including a program association table (PAT) and a program map table (PMT), are specified. In the PAT, the PIDs of the PMTs corresponding to respective programs are described. In each PMT, the PIDs of the packets storing the audio and video signals making up the corresponding program are described. The receiving apparatus can extract only the packets making up a desired program from among the TSPs in which plural programs are multiplexed, by referring to the PAT and the PMT. Note that data packets and PSI are stored in TSPs in a format called a section, not as PES packets.
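For reference, the following is a minimal sketch, not part of the claimed apparatus, of how a receiving apparatus might resolve a selected program to its audio and video PIDs by following the PAT and the PMT. The program numbers and PID values are illustrative only, and the tables are represented as plain dictionaries rather than parsed section data.

```python
# PAT: program number -> PID of the PMT describing that program
pat = {1: 0x0100, 2: 0x0200}

# PMT per program: role -> PID of the elementary stream packets
pmts = {
    0x0100: {"audio": 0x21, "video": 0x22},   # program A
    0x0200: {"audio": 0x31, "video": 0x32},   # program B
}

def pids_for_program(program_number):
    """Return the audio/video PIDs making up the selected program."""
    pmt_pid = pat[program_number]   # step 1: look up the PMT PID in the PAT
    return pmts[pmt_pid]            # step 2: read the elementary PIDs from that PMT

print(pids_for_program(1))          # {'audio': 33, 'video': 34}
```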
The demultiplexing unit 23 first selects a PAT packet from the received TSP sequence (S11 and S12 in
The packet analyzing units 24a and 24b divide each of the audio and video packets into a header information field and a payload field (here referred to as an encoded signal), and extract the respective fields (S4). Then, the audio signal decoding unit 25 and the video signal decoding unit 26 respectively decode the encoded audio and video signals. The audio and video signals obtained through decoding are outputted according to the presentation time stamp (PTS) included in the header information extracted by the packet analyzing units 24a and 24b.
Here, for example, in the case where the user selects the program A, it is possible to obtain PESs made up of encoded audio data making up the program A by extracting, from among the TSPs, only packets which have “0x21” as PID. Further, it is possible to obtain PESs made up of encoded video data making up the program A by extracting, from among the TSPs, only packets which have “0x22” as PID.
By extracting only the data, excluding the header and the like, from the PES packets thus obtained, it is possible to obtain, for example, an MPEG-2 AAC stream (see Patent Reference 1).
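The extraction described above can be sketched as follows. This is a simplified illustration only: adaptation fields, continuity counters, and PES packet boundaries are ignored, so it is not a complete demultiplexer.

```python
TS_PACKET_SIZE = 188

def ts_payloads(ts_bytes, wanted_pid):
    """Yield the 184-byte payload of every TSP carrying the wanted PID."""
    for off in range(0, len(ts_bytes) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = ts_bytes[off:off + TS_PACKET_SIZE]
        if pkt[0] != 0x47:                        # sync byte check
            continue
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]     # 13-bit PID
        if pid == wanted_pid:
            yield pkt[4:]                         # payload after the 4-byte header

def strip_pes_header(pes):
    """Return the elementary-stream bytes of one reassembled PES packet."""
    header_len = 9 + pes[8]                       # byte 8 holds PES_header_data_length
    return pes[header_len:]

if __name__ == "__main__":
    # one synthetic TSP carrying PID 0x21 with an empty payload, for demonstration
    tsp = bytes([0x47, 0x40 | (0x21 >> 8), 0x21 & 0xFF, 0x10]) + bytes(184)
    print(sum(len(p) for p in ts_payloads(tsp, 0x21)))   # 184
```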
In recent years, in order to further improve the compression efficiency of audio signals, schemes have been proposed in which conventional AAC is extended. Such a scheme allows reconstruction of high frequency components and reproduction of multi-channel sound by adding a small amount of additional information to a basic signal, that is, a bitstream whose rate is reduced by band narrowing or by down-mixing into monaural or stereo. For example, there are proposed schemes such as AAC+SBR, in which even a signal narrow-banded to approximately 10 kHz as the basic signal can be reproduced with sound quality equivalent to CD at approximately 48 kbps by adding high frequency information, and MPEG-Surround, in which reproduction of 5.1-channel surround is possible at approximately 96 kbps by adding inter-channel level difference and phase difference information.
Here, supplemental explanation is given of operation of the case where the MPEG-Surround bitstream shown in
However, although there is format compatibility between MPEG-2 AAC and MPEG-Surround, the details of the decoding processing differ; thus, the time required for processing, that is, the processing delay amount, also differs. More specifically, since channel extension processing, which is not required for decoding in AAC, includes filtering on a frame basis, a larger delay (for example, the delay indicated by (b) in
Therefore, in a digital broadcast receiving apparatus which can reproduce MPEG-Surround, unexpected additional delay occurs in reproduction of each frame, causing time lag indicated by (b) in
To address the above problem, a possible method is to synchronize the audio and video signals by delaying the start of reproduction of the video signal in the digital broadcast receiving apparatus by the amount of time indicated by (c) in
However, the channel extension information analyzed by the multi-channel information analyzing unit 254 is described at the end of the bitstream, as shown in
When the received packet in Step S3 is a video packet, the packet analyzing unit 24b analyzes the received packet, and outputs, to the video signal decoding unit 26, the encoded video signal included in the packet and the PTS of the encoded video signal (S4v). The video signal decoding unit 26 decodes the encoded video signal inputted by the packet analyzing unit 24b (S5v). After decoding the encoded video signal, the video signal decoding unit 26 determines whether or not the signal inputted by the multi-channel information analyzing unit 254 indicates MPEG-Surround (S7). As a result of the determination, when the audio signal is MPEG-Surround, the video signal decoding unit 26 corrects the output timing of the video signal by the corresponding amount of time. As a result of the determination in Step S7, when the audio signal is not MPEG-Surround, the video signal decoding unit 26 outputs the video signal at the timing indicated by the PTS (S6v).
Note that the determination of MPEG-Surround in the video signal processing (S7) may be performed before the analysis of the video packet information (S4v) or the decoding of the video signal (S5v), but, at the least, it needs to be performed after the decoding of the audio signal (S5a). Furthermore, in the case where the decoding of the audio signal (S5a) cannot be completed by the time designated by the PTS added to the video packet, output of the video signal (S6v) starts first, which causes a problem that AV synchronization cannot be achieved, or correction is made in the middle of program reproduction, interrupting the output of the video signal.
The present invention has been conceived to solve the above conventional problems, and has an object to provide a digital broadcast transmitting apparatus, a digital broadcast receiving apparatus, and a digital broadcast transmitting/receiving system, in which determination of processing depending on the encoding scheme of the transmitted audio signal can be promptly made by the digital broadcast receiving apparatus.
In order to solve the problems, the digital broadcast transmitting apparatus according to the present invention is a digital broadcast transmitting apparatus which provides multiplex broadcast by encoding and packetizing an audio signal and a video signal that are reproduced in synchronization. The digital broadcast transmitting apparatus includes: an audio stream packet generating unit which converts the audio signal into an encoded audio signal by encoding the audio signal and generates an audio stream packet including the encoded audio signal; a data packet generating unit which generates a data packet which is analyzed by a digital broadcast receiving apparatus before decoding the audio stream packet is started, the data packet including encoding information which is not included in header information of the audio stream packet, and which indicates whether or not decoding of the encoded audio signal includes a processing which causes decoding time of the encoded audio signal to exceed a predetermined decoding time; a video stream packet generating unit which converts the video signal into an encoded video signal by encoding the video signal and generates a video stream packet including the encoded video signal; and a transmitting unit which multiplexes the audio stream packet, the data packet, and the video stream packet so as to generate multiplexed data, and transmits the generated multiplexed data via a broadcast wave.
As described above, according to the digital broadcast transmitting apparatus of the present invention, the data packet includes encoding information which is not included in header information of the audio stream packet, and which indicates whether or not decoding of the encoded audio signal includes a processing which causes decoding time of the encoded audio signal to exceed a predetermined decoding time. The data packet is analyzed by the digital broadcast receiving apparatus before decoding of the audio stream packet is started. Therefore, it is possible for the digital broadcast receiving apparatus to know, before starting decoding of the audio stream packet, whether or not decoding of the encoded audio signal includes any processing which exceeds the predetermined decoding time. As a result, processing for adjusting the synchronization of the audio signal with the video signal can be performed well in advance.
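As a rough illustration of such a data packet, the sketch below packs a one-byte flag (for example, 0 = first encoding mode, 1 = second encoding mode) into a private-section-style packet. The table_id value and the overall layout are assumptions made here for illustration; they are not values defined by the present invention or by any broadcast standard.

```python
import struct

def build_encoding_info_section(uses_second_mode: bool) -> bytes:
    """Pack the encoding information flag into a private-section-style payload."""
    table_id = 0xC0                             # assumed private table_id (illustrative)
    payload = bytes([1 if uses_second_mode else 0])
    header = struct.pack(">BH", table_id, len(payload) & 0x0FFF)
    return header + payload

print(build_encoding_info_section(True).hex())  # c0000101
```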
Note that it may be that the audio stream packet generating unit includes an audio encoding unit which converts the audio signal into the encoded audio signal using one of a first encoding mode and a second encoding mode, the first encoding mode being a mode in which the audio signal is encoded in accordance with MPEG-2 AAC scheme, the second encoding mode being a mode in which the audio signal is encoded in accordance with the MPEG-2 AAC scheme, and is also encoded including auxiliary information for extending a high frequency component or an output channel count of a basic signal obtained in the first encoding mode. Also it may be that the data packet generating unit includes an encoding information generating unit which generates the encoding information indicating which one of the first encoding mode and the second encoding mode has been used by the audio encoding unit in the conversion of the audio signal into the encoded audio signal. According to the present invention, the encoding information describes whether the audio signal has been encoded simply in accordance with MPEG-2 AAC, or whether the high frequency components or output channel count of the basic signal have been extended in addition to encoding in accordance with MPEG-2 AAC. Therefore, it is possible for the digital broadcast receiving apparatus to perform processing for adjusting synchronization of the audio signal with the video signal before starting decoding of the audio stream packet.
Furthermore, it may be that the data packet generating unit generates an independent data packet including only the encoding information as data. With this, it is possible for the digital broadcast receiving apparatus to analyze the encoding information data packet, and the audio and video stream packets at the same time.
Furthermore, it may be that the data packet generating unit generates the data packet for each audio stream packet generated by the audio stream packet generating unit, and when a data packet includes information that is identical to information included in the immediately preceding data packet, the transmitting unit transmits multiplexed data in which that data packet is not multiplexed. Since it is not likely that the encoding information changes continuously within a single program, it is not necessary to multiplex an encoding information packet for each audio packet. As a result, it is possible to improve the transmission efficiency of the multiplexed data.
Further, it may be that the data packet generating unit generates the data packet in a format defined as a section format.
Further, it may be that the data packet generating unit (i) represents, using a descriptor, the encoding information indicating which one of the first encoding mode and the second encoding mode has been used by the audio encoding unit in the conversion of the audio signal into the encoded audio signal; and (ii) generates a packet in which the descriptor is embedded into a descriptor area, the descriptor area being repeated for each of the elementary streams within a program map table (PMT). In the PMT packet, the PIDs indicating the elementary stream packets which store the audio signals making up a program are described. Thus, by embedding a descriptor representing the encoding information into the descriptor area associated with each PID, it is possible to efficiently transmit the encoding information.
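A simplified sketch of such a descriptor, using a hypothetical descriptor_tag value, is shown below. Real PMT construction also involves section headers, reserved bits, and a CRC-32, which are omitted here.

```python
ASSUMED_MPS_DESCRIPTOR_TAG = 0xF0     # hypothetical tag, for illustration only

def mps_descriptor(uses_mps: bool) -> bytes:
    body = bytes([1 if uses_mps else 0])
    return bytes([ASSUMED_MPS_DESCRIPTOR_TAG, len(body)]) + body

def es_info_entry(stream_type: int, pid: int, descriptors: bytes) -> bytes:
    """One entry of the PMT's per-elementary-stream loop (reserved bits set to zero)."""
    return bytes([stream_type,
                  (pid >> 8) & 0x1F, pid & 0xFF,
                  (len(descriptors) >> 8) & 0x0F, len(descriptors) & 0xFF]) + descriptors

entry = es_info_entry(stream_type=0x0F, pid=0x21, descriptors=mps_descriptor(True))
print(entry.hex())                    # 0f00210003f00101
```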
Furthermore, it may be that the data packet generating unit further generates a data packet including encoding information indicating an extended channel count of the basic signal, the extended channel count of the basic signal being the output channel count of the basic signal for the case where the output channel count of the basic signal is extended using the auxiliary information. As described, by transmitting, in the data packet including the encoding information, the channel count for the case where the output channel count is extended using the auxiliary information, it is possible to promptly select a channel extension processing optimal to the reproduction environment.
Further, it may be that the data packet generating unit further generates a data packet including encoding information indicating data length of the basic signal. With this, it is possible to determine whether there is an error in the basic signal; and thus reproducing only the basic signal is possible when there is no error in the basic signal. It is also possible to extend the channel count of the basic signal directly into the channel count of the multi-channel signal, and to reproduce the multi-channel signal.
Further, a digital broadcast receiving apparatus according to the present invention is a digital broadcast receiving apparatus which receives multiplex broadcast in which an audio signal and a video signal are encoded, packetized, and transmitted, the audio signal and the video signal being reproduced in synchronization. The digital broadcast receiving apparatus includes a receiving unit which receives the multiplex broadcast; a separating unit which separates, from multiplexed data, an audio stream packet, a video stream packet, and a data packet, the multiplexed data being received by the receiving unit via the multiplex broadcast, the audio stream packet including an encoded audio signal which is an audio signal that has been encoded, the video stream packet including an encoded video signal which is a video signal that has been encoded, the data packet being other than the audio stream packet and the video stream packet; an analyzing unit which analyzes encoding information from the separated data packet before decoding the audio stream packet is started, the encoding information being information which is not included in header information of the audio stream packet, and which indicates whether or not decoding of the encoded audio signal includes a processing which causes decoding time of the encoded audio signal to exceed a predetermined decoding time; and a decoding unit which adjusts output timings of the audio signal and the video signal by an amount of time that the decoding time of the audio signal exceeds the predetermined decoding time, when the encoding information indicates the decoding of the encoded audio signal includes the processing which causes the decoding time of the encoded audio signal to exceed the predetermined decoding time.
As described, according to the digital broadcast receiving apparatus according to the present invention, it is possible to analyze, before starting decoding of the audio stream packet, data packet which includes encoding information which is not included in header information of the audio stream packet, and which indicates whether or not decoding of the encoded audio signal includes a processing which causes decoding time of the encoded audio signal to exceed a predetermined decoding time. With this, it is possible for the digital broadcast receiving apparatus to know, before starting decoding of the audio stream packet, whether or not decoding of the encoded audio signal includes any processing which exceeds a predetermined decoding time. As a result, processing for adjusting synchronization of the audio signal with the video signal can be performed well in advance.
Note that it may be that the separating unit separates, from the received multiplexed data, the audio stream packet including the encoded audio signal which has been encoded using one of a first encoding mode and a second encoding mode, the first encoding mode being a mode in which the audio signal is encoded in accordance with MPEG-2 AAC scheme, the second encoding mode being a mode in which the audio signal is encoded in accordance with the MPEG-2 AAC scheme, and is also encoded including auxiliary information for extending a high frequency component or an output channel count of a basic signal obtained in the first encoding mode; the analyzing unit analyzes, based on the encoding information, which one of the first encoding mode and the second encoding mode has been used in the encoding of the encoded audio signal included in the separated audio stream packet; and the decoding unit adjusts output timings of the audio signal and the video signal by an amount of time necessary for extending the high frequency component or the output channel count of the basic signal obtained in the first encoding mode, when the analysis result obtained by the analyzing unit indicates that the second encoding mode has been used in the encoding. With this, the encoding information describes whether the audio signal has been converted into the encoded audio signal using the first encoding mode, or using the second encoding mode; and thus, it is possible for the digital broadcast receiving apparatus to perform processing for adjusting synchronization of the audio signal with the video signal before starting decoding of the audio stream packet.
Further, it may be that when the analysis result obtained by the analyzing unit indicates that the second encoding mode has been used in the encoding of the encoded audio signal included in the separated audio stream packet, the decoding unit delays outputting the video signal by a predetermined time compared to the case where the first encoding mode has been used in the encoding. With this, the decoding unit can decode the video signal in a normal way, and adjust the synchronization of the video signal and the audio signal by delaying the output of the video signal obtained through the decoding by the predetermined time. As a result, it is possible to adjust synchronization easily with a lower processing load.
Furthermore, it may be that when the analysis result obtained by the analyzing unit indicates that the second encoding mode has been used in the encoding of the encoded audio signal included in the separated audio stream packet, the decoding unit starts decoding of the encoded audio signal earlier by a predetermined time compared to the case where the first encoding mode has been used in the encoding. With this, it is possible to know, before the decoding unit starts decoding, whether the audio signal has been converted using the first encoding mode or using the second encoding mode; and thus it is possible to adjust the synchronization of the video signal and the audio signal easily by starting decoding of the encoded audio signal earlier by the predetermined time.
Further, it may be that the predetermined time is a delay time that is an additional time required for decoding processing of the encoded audio signal in the second mode compared to decoding processing of the encoded audio signal in the first mode.
Further, it may be that the analyzing unit further analyzes, based on the encoding information, an extended channel count of the basic signal, the extended channel count of the basic signal being the output channel count of the basic signal for the case where the output channel count of the basic signal is extended using the auxiliary information, and when the output channel count of the digital broadcast receiving apparatus is different from the channel count indicated by the encoding information, the decoding unit: (i) extends the channel count of the basic signal directly into the output channel count of the digital broadcast receiving apparatus; and (ii) adjusts output timings of the audio signal and the video signal by an amount of time necessary for extending the output channel count of the basic signal. With this, it is possible for the decoding unit to extend the channel count of the basic signal directly into the output channel count of the digital broadcast receiving apparatus, omitting the double work of first extending the channel count of the basic signal into the channel count identical to that of the original sound using the auxiliary information and then converting it into the output channel count of the digital broadcast receiving apparatus. Therefore, while adjusting the synchronization of the video signal and the audio signal, it is possible to efficiently decode the audio signal in a manner compatible with the equipment of the digital broadcast receiving apparatus.
Further, it may be that the decoding unit includes: a multi-channel estimating unit which estimates channel extension information, using one of channel-extension related information included in the basic signal, and an initial value or a recommended value used for channel count extension from 2-channel of the basic signal into 5.1-channel of a multi-channel signal, the channel extension information being information for extending the channel count of the basic signal to the output channel count of the digital broadcast receiving apparatus. Also it may be that the decoding unit extends the channel count of the basic signal directly into the output channel count of the digital broadcast receiving apparatus, using the channel extension information estimated by the multi-channel estimating unit. With this, it is possible for the decoding unit to omit extending the channel count of the basic signal into the channel count that is identical to that of the original sound using the auxiliary information, and to directly extend the channel count of the basic signal into the output channel count of the digital broadcast receiving apparatus.
Further, it may be that the analyzing unit further analyzes, based on the encoding information, the data length of the basic signal of the encoded audio signal, and the decoding unit: (i) determines whether or not the basic signal has been correctly decoded by comparing the data length of the basic signal obtained by the analyzing unit and the data length of the basic signal obtained through decoding of the encoded audio signal; and (ii) extends, using the channel extension information estimated by the multi-channel estimating unit, the channel count of the basic signal directly into the output channel count of the digital broadcast receiving apparatus, when it is determined that the basic signal has been correctly decoded. With this, when the basic signal has been decoded correctly, it is possible to directly extend the channel count of the basic signal into the output channel count of the digital broadcast receiving apparatus without using the auxiliary information.
Further, it may be that the decoding unit (i) further determines whether or not the channel extension processing using the auxiliary information has been correctly performed, when determined that the basic signal has been correctly decoded; and (ii) outputs only the basic signal without adjusting output timings of the audio signal and the video signal, when determined that an error has occurred in the channel extension processing using the auxiliary information. With this, it is possible to output only the basic signal when the basic signal has been decoded correctly.
According to the present invention, in the digital broadcast receiving apparatus, it is possible to know information specific to the encoding scheme before starting decoding of the encoded audio signal even without analyzing the details of the encoded audio signal up to the end; and thus it is possible to easily perform optimal synchronization control according to the encoding scheme of the audio signal.
Hereinafter, embodiments of the present invention will be described with reference to the drawings. Note that in the embodiments, descriptions are given of an exemplary case of a digital broadcast transmitting system using MPEG-Surround (hereinafter referred to as “MPS”) as an audio encoding scheme.
In the present embodiment, an encoding information packet having a new PID is generated, and the generated encoding information packet, in which information indicating whether or not MPS is used is described, is transmitted.
Each of the audio signals and video signals making up programs is respectively inputted into the audio signal encoding unit 11 and the video signal encoding unit 12, and converted into digital signals. The packetizing units 13a and 13b add header information to the respective converted digital signals and packetize them into PES. Here, the audio signal encoding unit 11 and the packetizing unit 13a are an example of “an audio stream packet generating unit which converts the audio signal into an encoded audio signal by encoding the audio signal and generates an audio stream packet including the encoded audio signal”. The video signal encoding unit 12 and the packetizing unit 13b are an example of “a video stream packet generating unit which converts the video signal into an encoded video signal by encoding the video signal and generates a video stream packet including the encoded video signal”. Furthermore, the audio signal encoding unit 11 is an example of “an audio encoding unit which converts the audio signal into the encoded audio signal using one of a first encoding mode and a second encoding mode, the first encoding mode being a mode in which the audio signal is encoded in accordance with MPEG-2 AAC scheme, the second encoding mode being a mode in which the audio signal is encoded in accordance with the MPEG-2 AAC scheme, and is also encoded including auxiliary information for extending a high frequency component or an output channel count of a basic signal obtained in the first encoding mode”, and of “an encoding information generating unit which generates the encoding information indicating which one of the first encoding mode and the second encoding mode has been used by the audio encoding unit in the conversion of the audio signal into the encoded audio signal”. At the same time, the data signals, the PAT, and the PMT are also inputted into the packetizing unit 13c, and packetized in a section format. Furthermore, information relating to the processing in the audio signal encoding unit 11 is similarly packetized as encoding information by the packetizing unit 13d in a section format. The packetizing unit 13c and the packetizing unit 13d are an example of “a data packet generating unit which generates a data packet which is analyzed by a digital broadcast receiving apparatus before decoding the audio stream packet is started, the data packet including encoding information which is not included in header information of the audio stream packet, and which indicates whether or not decoding of the encoded audio signal includes a processing which causes decoding time of the encoded audio signal to exceed a predetermined decoding time”. Furthermore, the packetizing unit 13d is an example of “the data packet generating unit which generates an independent data packet including only the encoding information as data” and “the data packet generating unit which generates the data packet in a format defined as a section format”. Subsequently, the multiplexing unit 14 time-multiplexes all of the PES and section packets, and the channel encoding/modulating unit 15 performs transmitting processing on the time-multiplexed PES and section packets. Then the PES and section packets are transmitted through the antenna 16. The multiplexing unit 14, the channel encoding/modulating unit 15, and the antenna 16 are an example of “a transmitting unit which multiplexes the audio stream packet, the data packet, and the video stream packet so as to generate multiplexed data, and transmits the generated multiplexed data via a broadcast wave”.
Here, a significant effect can be obtained in optimal operation of a receiving apparatus by selecting, as encoding information, information which cannot be known simply from the format structure of the encoded audio signal outputted by the audio signal encoding unit 11 or information which is not described in the header information of the encoded signal, and by transmitting the selected information to the receiving apparatus. For example, when a flag, indicating whether the encoding scheme used by the audio signal encoding unit 11 is AAC or MPS, is packetized separately in a section format as encoding information, the receiving apparatus can know whether the encoding scheme of the audio signal is AAC or MPS earlier than the start of the decoding of the basic signal. In a conventional method, as shown in
Note that it is also possible to adjust the synchronization timing by reflecting, in the PTS of the PES packet, the output timing that takes the encoding information into account; however, there is a problem of compatibility with apparatuses which support only conventional MPEG-2 AAC.
The demodulating/channel decoding unit 22 performs receiving processing on digital broadcast wave received via the antenna 21, and outputs a multiplexed TSP sequence. The demultiplexing unit 23 selects a PAT packet and PMT packets from the received TSP sequence, and outputs the selected PAT packet and PMT packets to the packet analyzing unit 24c. The packet analyzing unit 24c extracts PAT and PMTs from the PAT packet and the PMT packets inputted by the demultiplexing unit 23, and outputs the extracted PAT and PMTs to the program information analyzing unit 27. The program information analyzing unit 27 extracts program information from the PAT and PMTs inputted by the packet analyzing unit 24c, and presents a user with the detailed information of respective programs in service. The program information analyzing unit 27 informs the demultiplexing unit 23 of the PIDs of the PES packets in which audio signals and video signals making up the desired program are stored, according to the program selected by the user from among the presented detailed information of the programs. As a result, audio, video and data packets making up the program selected by the user are selected. The antenna 21 and the demodulating/channel decoding unit 22 are an example of “a receiving unit which receives the multiplex broadcast”. The demultiplexing unit 23 is an example of “a separating unit which separates, from multiplexed data, an audio stream packet, a video stream packet, and a data packet, the multiplexed data being received by the receiving unit via the multiplex broadcast, the audio stream packet including an encoded audio signal which is an audio signal that has been encoded, the video stream packet including an encoded video signal which is a video signal that has been encoded, the data packet being other than the audio stream packet and the video stream packet”, and “the separating unit which separates, from the received multiplexed data, the audio stream packet including the encoded audio signal which has been encoded using one of a first encoding mode and a second encoding mode, the first encoding mode being a mode in which the audio signal is encoded in accordance with MPEG-2 AAC scheme, the second encoding mode being a mode in which the audio signal is encoded in accordance with the MPEG-2 AAC scheme, and is also encoded including auxiliary information for extending a high frequency component or an output channel count of a basic signal obtained in the first encoding mode”. The packet analyzing unit 24d is an example of “an analyzing unit which analyzes encoding information from the separated data packet before decoding the audio stream packet is started, the encoding information being information which is not included in header information of the audio stream packet, and which indicates whether or not decoding of the encoded audio signal includes a processing which causes decoding time of the encoded audio signal to exceed a predetermined decoding time”, and “the analyzing unit which analyzes, based on the encoding information, which one of the first encoding mode and the second encoding mode has been used in the encoding of the encoded audio signal included in the separated audio stream packet”. 
The audio signal decoding unit 25 and the video signal decoding unit 26 are an example of “a decoding unit which adjusts output timings of the audio signal and the video signal by an amount of time that the decoding time of the audio signal exceeds the predetermined decoding time, when the encoding information indicates the decoding of the encoded audio signal includes the processing which causes the decoding time of the encoded audio signal to exceed the predetermined decoding time” and “the decoding unit which adjusts output timings of the audio signal and the video signal by an amount of time necessary for extending the high frequency component or the output channel count of the basic signal obtained in the first encoding mode, when the analysis result obtained by the analyzing unit indicates that the second encoding mode has been used in the encoding”. Further, the video signal decoding unit 26 is an example of “when the analysis result obtained by the analyzing unit indicates that the second encoding mode has been used in the encoding of the encoded audio signal included in the separated audio stream packet, the decoding unit which delays outputting the video signal by a predetermined time than the case where the first encoding mode has been used in the encoding”. Further, the audio signal decoding unit 25 is an example of “when the analysis result obtained by the analyzing unit indicates that the second encoding mode has been used in the encoding of the encoded audio signal included in the separated audio stream packet, the decoding unit which starts decoding of the encoded audio signal earlier by a predetermined time than the case where the first encoding mode has been used in the encoding”.
Of data packets, in particular, as to a packet relating to encoding information of the audio signal, the encoding information is extracted by the packet analyzing unit 24d, and inputted to the encoding information analyzing unit 28. The encoding information analyzing unit 28 analyzes, for example, whether the audio signal has been encoded in MPEG-2 AAC or MPS, based on the encoding information inputted by the packet analyzing unit 24d, and then outputs the analysis result to the audio signal decoding unit 25 and the video signal decoding unit 26. The analysis of the encoding information is performed, for example, while the packet analyzing unit 24a and the packet analyzing unit 24b are extracting encoded audio and video signals which are substantial data from the audio and video packets making up the program selected by the user. The audio signal decoding unit 25 decodes the encoded audio signal inputted by the packet analyzing unit 24a according to the encoding scheme inputted by the encoding information analyzing unit 28. The video signal decoding unit 26 decodes the encoded video signal inputted by the packet analyzing unit 24b, and adjusts, with respect to the designated PTS, output timing of the decoded video signal according to the encoding information of the audio signal inputted by the encoding information analyzing unit 28. With this, the video signal decoding unit 26 outputs the video signal such that optimal synchronous reproduction of the audio and video signals can be performed.
Note that the description has been given above of the method for adjusting synchronization of the audio and video signals by the video signal decoding unit 26 adjusting the output timing of the video signal; however, the present invention is not limited to the described example. It may be such that when the encoding information indicates that the audio signal has been encoded in MPS, the audio signal decoding unit 25 starts decoding of the audio signal earlier by a predetermined time compared to the case where the audio signal has been encoded in MPEG-2 AAC, and the video signal decoding unit 26 decodes and outputs the video signal in a normal way.
More particularly, respective audio and video packets are divided into header information and encoded signal, and are extracted by the packet analyzing unit 24a and the packet analyzing unit 24b. Then the respective encoded signals are decoded by the audio signal decoding unit 25 and the video signal decoding unit 26.
Hereinafter, operations of the digital broadcast receiving apparatus 2 are described in an exemplary case where the encoding information is a flag indicating whether the encoding scheme of the audio signal is AAC or MPS.
When the received packet is an audio packet (Yes in S3), the packet analyzing unit 24a analyzes the information of the received audio packet, and extracts the encoded audio signal (S4a). Subsequently, the audio signal decoding unit 25 decodes the extracted encoded audio signal (S5a). Note that by this time, in the case where an analysis of whether the encoding scheme of the audio signal is MPEG-2 AAC or MPS has been performed in Step S10, the audio signal decoding unit 25 decodes the encoded audio signal according to the decoding scheme indicated by the encoding information. The audio signal decoding unit 25 outputs the audio signal obtained through the decoding according to PTS (S6a).
When the received packet is a video packet in Step S3 (Yes in S3), the packet analyzing unit 24b analyzes the information of the received video packet, and extracts the encoded video signal (S4v). Subsequently, the video signal decoding unit 26 decodes the extracted encoded video signal (S5v). Note that by this time, in the case where an analysis of whether the encoding scheme of the audio signal is MPEG-2 AAC or MPS has been performed, the video signal decoding unit 26 determines the delay time, from the timing indicated by the PTS, for outputting the decoded video signal, according to whether the inputted encoding scheme is MPEG-2 AAC, MPS, or (AAC+SBR). When the encoding scheme is MPEG-2 AAC, the video signal decoding unit 26 outputs the video signal obtained through the decoding as it is according to the PTS. When the encoding scheme is MPS, the video signal decoding unit 26 outputs the video signal with a large output delay time. Further, when the encoding scheme is (AAC+SBR), the video signal decoding unit 26 outputs the video signal with a small output delay time (S6v). More particularly, when the encoding scheme is MPS, the video signal decoding unit 26 delays outputting the video signal obtained through the decoding, by an amount of time equivalent to a predetermined processing time of MPS with respect to the predetermined timing indicated by the PTS (S6v). The same applies in the case of (AAC+SBR).
As described, by packetizing the encoding information indicating the encoding scheme of the audio signal in a section format and multiplexing it into the TSP sequence, prompt determination of the encoding scheme of the encoded audio signal is possible. Therefore, the packet analyzing unit 24d, which analyzes the packet describing the encoding information, can determine whether or not correction of the output timing of the video signal is necessary before the audio signal decoding unit 25 and the video signal decoding unit 26 start decoding of the encoded audio and video signals, that is, before starting the audio signal decoding processing S5a and the video signal decoding processing S5v. As a result, it is possible for the video signal decoding unit 26 to perform optimization processing, such as delaying the start of decoding of the encoded video signal according to the delay amount in decoding of the encoded audio signal, or adjusting the buffer amount to delay the output of the decoded video signal, regardless of the progress of the decoding of the audio signal S5a, whereas in a conventional method such processing can be performed only after decoding the basic signal of the audio signal.
Note that the above delay time is adjusted by causing the audio signal decoding unit 25, the video signal decoding unit 26, or a memory not shown, to store in advance a value indicating the delay time associated with each encoding scheme, such as n1 seconds for SBR and n2 seconds for MPS. Whether the audio signal decoding unit 25 or the video signal decoding unit 26 stores such delay time can be determined depending on which processing unit adjusts the synchronization. For example, it may be that when the audio signal decoding unit 25 adjusts synchronization by starting decoding early, the audio signal decoding unit 25 stores the delay time, and when the video signal decoding unit 26 adjusts synchronization by delaying the output timing of the decoded video signal, the video signal decoding unit 26 stores the delay time. Further, the delay time changes depending on the processing capacity of the audio signal decoding unit, such as the operation speed of the CPU; and thus, the delay time of the video signal output is defined according to the model of the digital broadcast receiving apparatus. The delay time is an example of “the predetermined time which is a delay time that is an additional time required for decoding processing of the encoded audio signal in the second mode compared to decoding processing of the encoded audio signal in the first mode”.
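The delay-time table described above can be sketched as follows; the numeric delay values are placeholders that would in practice depend on the receiver model, not figures taken from the present description. Both adjustment policies mentioned above (delaying video output, or starting audio decoding earlier) are shown.

```python
OUTPUT_DELAY_SEC = {"AAC": 0.0, "AAC+SBR": 0.010, "MPS": 0.040}   # placeholder values

def video_output_time(pts_sec: float, audio_scheme: str) -> float:
    """Delay the video presentation time by the extra audio decoding delay."""
    return pts_sec + OUTPUT_DELAY_SEC.get(audio_scheme, 0.0)

def audio_decode_start_time(pts_sec: float, audio_scheme: str) -> float:
    """Alternative policy: start audio decoding earlier instead of delaying video."""
    return pts_sec - OUTPUT_DELAY_SEC.get(audio_scheme, 0.0)

print(video_output_time(10.0, "MPS"), audio_decode_start_time(10.0, "MPS"))   # 10.04 9.96
```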
Note that the packetizing unit 13d is an example of “the data packet generating unit which generates the data packet for each audio stream packet generated by the audio stream packet generating unit”. The multiplexing unit 14 and the channel encoding/modulating unit 15 are an example of “when a data packet includes information that is identical to information included in an immediately preceding data packet, the transmitting unit which transmits multiplexed data in which the data packet is not multiplexed”. Since it is not likely that the encoding information changes continuously within a single program, it is not necessary to multiplex an encoding information packet for each audio packet. When the encoding scheme of the encoded signal included in the audio packet is the same as the encoding scheme indicated by the encoding information included in the immediately preceding audio packet, it is possible to omit multiplexing the encoding information packet. For example, as for audio signals making up the same program, it may be that only a single encoding information packet for the program is multiplexed. This improves transmission efficiency.
On the other hand, at the time of start-up of the receiving apparatus or switching of the viewing program, it is desirable to finish the analysis of the encoding information packet (S10) early enough before starting the audio signal decoding (S5a) and the video signal decoding (S5v); thus, when a predetermined interval has elapsed since the immediately preceding encoding information packet was sent out, the encoding information packet having the same information may be sent out again without omitting the multiplexing.
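The multiplexing policy described above, in which the encoding information packet is omitted while its content is unchanged but re-sent after a certain interval, can be sketched as follows; the interval value and the class layout are assumptions for illustration.

```python
class EncodingInfoScheduler:
    def __init__(self, resend_interval_sec=1.0):
        self.resend_interval = resend_interval_sec   # assumed re-send interval
        self.last_info = None
        self.last_sent_at = None

    def should_multiplex(self, info, now_sec):
        """Return True if the encoding information packet should be multiplexed now."""
        changed = info != self.last_info
        stale = (self.last_sent_at is None or
                 now_sec - self.last_sent_at >= self.resend_interval)
        if changed or stale:
            self.last_info, self.last_sent_at = info, now_sec
            return True
        return False

sched = EncodingInfoScheduler()
print([sched.should_multiplex("MPS", t * 0.2) for t in range(8)])
# -> [True, False, False, False, False, True, False, False]
```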
In the first embodiment, the case has been described where encoding information is described in individual packets having a new PID in a section format, and the packet is multiplexed into TSP for transmission to the digital broadcast receiving apparatus. In the second embodiment, a new packet is not generated, but the encoding information is embedded into PMT as a descriptor for transmission.
The digital broadcast transmitting apparatus 151 according to the second embodiment features inclusion of a descriptor updating unit 17 instead of a packetizing unit 13d which packetizes encoding information of audio signals.
Then, the packetizing unit 13c packetizes the PAT and the PMT into which the descriptors indicating the encoding information are inserted. The multiplexing unit 14 multiplexes the PES packets of the encoded audio signal, the PES packets of the encoded video signal, and the PAT and the PMT packets. The multiplexed TSP is transmitted by broadcast wave via the channel encoding/modulating unit 15 and the antenna 16.
As described, according to the digital broadcast transmitting apparatus 151 of the embodiment 2, as shown in
In the present embodiment, the case will be described where encoding information is an extended channel count (of an original sound).
The encoded audio signal extracted by the packet analyzing unit 24a has, as shown in
Here, when the encoded audio signal is an AAC bitstream, the output of the basic signal analyzing unit 252 is a normal stereo signal, and the channel extension information to be analyzed by the multi-channel information analyzing unit 254 does not exist in the bitstream. At this time, using the stereo signal outputted by the basic signal analyzing unit 252, the multi-channel information estimating unit 258 estimates channel extension information. Alternatively, it may be that the channel extension information outputted by the multi-channel information estimating unit 258 is information associated with an initial value or a recommended value of the channel extending unit 256, and estimation which is not correlated with the stereo signal is made. The channel extending unit 256 selects the channel extension information to be used according to the output of the encoding information analyzing unit 28, which allows output of the multi-channel audio signal under delay and output control similar to the case of MPS, even when the received encoded audio signal is a conventional AAC stereo signal. Here, due to the characteristics of the AAC format, the output of the basic signal analyzing unit 252 is not limited to stereo signals, and the output may be a monaural signal or a multi-channel signal of more than 3 channels; thus the effects of the present invention are not limited to stereo AAC.
Furthermore, estimation of channel extension information is not limited to the case where the encoded audio signal is AAC, but as a matter of course, similar effects can also be obtained in the case of MPS. For example, when channel extension into a channel count different from the channel count designated in the bitstream is desired, estimated channel extension information can be used without using the channel extension information in the bitstream. At this time, estimation using the channel extension information in the bitstream allows higher precision estimation. This is because inter-channel level differences and the like can be effective information regardless of the extended channel count.
Further, by transmitting the extended channel count as encoding information, it is possible to select between the channel extension information in the bitstream and the estimated channel extension information more efficiently. Due to the compatible format structures of AAC and MPS, just as whether or not MPS is used cannot be determined until the bitstream is analyzed up to the end, the extended channel count is also information which cannot be known before the analysis by the multi-channel information analyzing unit 254. However, there are many cases where the extended channel count and the reproduction environment do not match; for example, a general in-vehicle loudspeaker system includes four loudspeakers, but the extended multi-channel audio signal is 5.1-channel. More particularly, as explained in
More particularly, note that examples of methods for extending the channel count by using the multi-channel information estimating unit 258 include methods (1), (2), and (3) described below.
(1) Without using the channel extension information provided from the multi-channel information analyzing unit 254, the down-mixed signal outputted from the basic signal analyzing unit 252 is directly extended from, for example, 2-channel into the target channel count, for example, 4-channel.
(2) Using the channel extension information provided from the multi-channel information analyzing unit 254, for example, the 2-channel down-mixed signal is extended into the target channel count, for example, into 4-channel.
(3) With the Enhanced Matrix Mode standardized in MPS, 2-channel is extended into 4-channel. Here, the Enhanced Matrix Mode is a channel extension mode standardized in MPS, and is a method for reconstructing the down-mixed signal into the multi-channel signal using predetermined fixed parameters without using the transmission parameters of MPS.
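A heavily simplified sketch of selecting among methods (1) to (3) is given below. The fixed mixing matrix stands in for the actual channel extension processing and does not implement MPS or the Enhanced Matrix Mode; its coefficients, and the parameter dictionaries, are illustrative assumptions only.

```python
FIXED_2_TO_4 = [                 # rows: FL, FR, RL, RR; columns: L, R (assumed values)
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.0],
    [0.0, 0.5],
]

def extend_2_to_4(stereo_frame, matrix=FIXED_2_TO_4):
    """Upmix one (L, R) sample pair into four channels using a fixed matrix."""
    left, right = stereo_frame
    return [row[0] * left + row[1] * right for row in matrix]

def choose_extension(bitstream_params, estimated_params, target_channels):
    # Prefer the transmitted channel extension information when it matches the
    # reproduction environment; otherwise fall back to the estimated parameters.
    if bitstream_params and bitstream_params.get("channels") == target_channels:
        return bitstream_params
    return estimated_params

print(extend_2_to_4((0.8, -0.2)))   # [0.8, -0.2, 0.4, -0.1]
```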
In the present embodiment, description is given of the case where the bit length of the basic signal of AAC and MPS is transmitted separately from the basic signal, as encoding information.
The header information analyzing unit 251 analyzes header information of the encoded audio signal, outputs the basic signal of the encoded audio signal to the basic signal analyzing unit 252, and outputs the multi-channel extension information of the encoded audio signal to the multi-channel information analyzing unit 254. The basic signal analyzing unit 252 outputs a down-mixed signal which is a basic signal to the error detecting unit 260. The multi-channel information analyzing unit 254 outputs the multi-channel extension information to the error detecting unit 260.
The error detecting unit 260 analyzes the bit length of the basic signal inputted by the basic signal analyzing unit 252, and determines whether the analyzed bit length matches the bit length of the basic signal inputted by the encoding information analyzing unit 28. When they do not match, it can be determined that there is an error in the basic signal. In addition, in the case where an error is detected at the time of channel extension while it is known that there is no error in the basic signal, it can be determined that the error is included in the channel extension information. As described, when there is an error in the channel extension information, it is possible to output the 2-channel signal without channel extension, or to output a signal channel-extended using the channel extension information estimated by the multi-channel information estimating unit 258. The channel extending unit 256, the output buffer 257, the multi-channel information estimating unit 258, and the error detecting unit 260 are an example of “the decoding unit which: (i) determines whether or not the basic signal has been correctly decoded by comparing the data length of the basic signal obtained by the analyzing unit and data length of the basic signal obtained through decoding of the encoded audio signal; and (ii) extends, using the channel extension information estimated by the multi-channel estimating unit, the channel count of the basic signal directly into the output channel count of the digital broadcast receiving apparatus, when determined that the basic signal has been correctly decoded”.
As described earlier, the AAC bitstream and the MPS bitstream have exactly the same structure of the basic signal in order to maintain compatibility. In other words, the MPS bitstream can be reproduced even with only the basic signal, on which the channel extension processing is not performed. In this case, the error detecting unit 260 and the output buffer 257 are an example of “the decoding unit which: (i) further determines whether or not the channel extension processing using the auxiliary information has been correctly performed, when determined that the basic signal has been correctly decoded; and (ii) outputs only the basic signal without adjusting output timings of the audio signal and the video signal, when determined that an error has occurred in the channel extension processing using the auxiliary information”. Therefore, even when the receiving condition becomes worse, reproduction of a program can be continued as long as the basic signal can be decoded without any errors. However, due to the characteristics of the compression scheme of AAC, an error in the bitstream is rarely detected at the position where the error occurs. In the header information of AAC, the frame length of the frame is described, and error detection of AAC is performed by comparing the actual length of each frame with the frame length described in the header and determining whether or not there is any difference between them. In AAC, since the portion following the basic signal field is a padding area which has no meaning, there is not much difference in error resilience between detecting the error immediately after the basic signal and detecting the error at the end of the frame.
In contrast, in the case of MPS, even when an error is included in the basic signal, there are many cases where the error is detected not in the basic signal but in the channel extension information or at the end of the frame. Therefore, even when the error is detected at the time of channel extension, there has conventionally been no way to determine whether or not the error is included in the basic signal. The bit length of only the channel extension information is also described in the channel extension information, but the top position of the channel extension information is not clarified; thus, the bit length of the channel extension information can be used for confirming whether decoding has been correctly performed without any errors, but cannot be used for locating the error. Thus, in a conventional receiving apparatus, no matter whether the basic signal includes an error or not, when an error is detected in the frame, muting has to be performed.
According to the present invention, by transmitting, as encoding information, information which can clarify the bit length of only the basic signal, it is possible for the digital broadcast receiving apparatus having the structure according to the embodiment 4 to easily determine whether decoding of only the basic signal has been correctly performed or not at the time of occurrence of error. As a result, optimal error correction can be performed depending on the error status, and continuation of reproduction of the program is possible without interrupting reproduction by muting.
Here, the information which can clarify the bit length of only the basic signal may of course be the bit length of the basic signal itself, but may also be the bit length of the field compatible with AAC, such as the bit length of (header + basic signal). In addition, describing the bit length of the field after the channel extension information also enables calculation of the bit length of the basic signal by subtracting that bit length from the frame length indicated in the header.
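The error-handling policy built on the transmitted bit length of the basic signal can be sketched as follows; the argument names and the decision structure are assumptions for illustration, and only the comparison logic mirrors the description above.

```python
def handle_frame(decoded_basic_bits, signaled_basic_bits, extension_ok):
    """Decide how to output a frame given the bit-length check and extension status."""
    if decoded_basic_bits != signaled_basic_bits:
        return "mute"                         # error inside the basic signal itself
    if not extension_ok:
        # basic signal intact, extension damaged: output the basic signal only,
        # or extend it using estimated channel extension information
        return "output basic signal only"
    return "output extended multi-channel signal"

print(handle_frame(1024, 1024, extension_ok=False))   # output basic signal only
print(handle_frame(1000, 1024, extension_ok=True))    # mute
```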
As described, according to the digital broadcast transmitting apparatus of the present invention, information on the encoding scheme of the audio signal which is not described in the header information, such as the presence of MPS and the output channel count and bit count of the basic signal, is transmitted separately from the encoded stream of the substantial data. Thus, the digital broadcast receiving apparatus can know, before starting decoding of the encoded audio signal, the additional delay time necessary for decoding the audio signal compared to the case of MPEG-2 AAC, allowing higher precision synchronization with the video signal.
It should be noted that although the present invention has been described based on aforementioned embodiments, the present invention is obviously not limited to such embodiments. The following cases are also included in the present invention.
(1) Each of the aforementioned apparatuses is, specifically, a computer system including a microprocessor, a ROM, a RAM, a hard disk unit, a display unit, a keyboard, a mouse, and the like. A computer program is stored in the RAM or hard disk unit. The respective apparatuses achieve their functions through the microprocessor's operation according to the computer program. Here, the computer program is configured by combining plural instruction codes indicating instructions for the computer in order to achieve predetermined functions.
(2) A part or all of the constituent elements constituting the respective apparatuses may be configured from a single System-LSI (Large-Scale Integration). The System-LSI is a super-multi-function LSI manufactured by integrating constituent units on one chip, and is specifically a computer system configured by including a microprocessor, a ROM, a RAM, and so on. A computer program is stored in the RAM. The System-LSI achieves its function through the microprocessor's operation according to the computer program.
(3) A part or all of the constituent elements constituting the respective apparatuses may be configured as an IC card which is attachable to the respective apparatuses or as a stand-alone module. The IC card or the module is a computer system configured from a microprocessor, a ROM, a RAM, and so on. The IC card or the module may include the aforementioned super-multi-function LSI. The IC card or the module achieves its function through the microprocessor's operation according to the computer program. The IC card or the module may also be implemented to be tamper-resistant.
(4) The present invention may be a previously described method. Further, the present invention may be a computer program causing a computer to realize such a method, and may also be a digital signal including the computer program.
Furthermore, the present invention may also be realized by storing the computer program or the digital signal in a computer readable recording medium such as a flexible disc, a hard disk, a CD-ROM, an MO, a DVD, a DVD-ROM, a DVD-RAM, a BD (Blu-ray Disc), and a semiconductor memory. Furthermore, the present invention may also include the digital signal recorded in these recording media.
Furthermore, the present invention may also be realized by the transmission of the aforementioned computer program or digital signal via a telecommunication line, a wireless or wired communication line, a network represented by the Internet, a data broadcast and so on.
The present invention may also be a computer system including a microprocessor and a memory, in which the memory stores the aforementioned computer program and the microprocessor operates according to the computer program.
Furthermore, by transferring the program or the digital signal by recording onto the aforementioned recording media, or by transferring the program or digital signal via the aforementioned network and the like, execution using another independent computer system is also made possible.
(5) Combination of the above described embodiments and variations is also possible.
The present invention is suitable for a digital broadcast transmitting system for transmitting information, such as audio, video and text, in a digital format, and particularly for a digital broadcast receiving apparatus, such as a digital television, set top box, car navigation system, and mobile one-seg viewer.
Number | Date | Country | Kind |
---|---|---|---|
2007-078496 | Mar 2007 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2008/000659 | 3/19/2008 | WO | 00 | 9/18/2009 |