The present invention relates to a playback device that decodes encoded data and plays back the decoded data. The present invention particularly relates to a synchronous playback device that decodes bit streams of digital-encoded video signals and audio signals and performs synchronous playback of the decoded video signals and audio signals.
In Japan, digital broadcasting is performed in accordance with the ARIB (Association of Radio Industries and Businesses) standard. The ARIB standard is established based on the DVB (Digital Video Broadcasting) standard used in Europe, and covers the MPEG (Moving Picture Experts Group) 2-TS (Transport Stream) system for video and audio broadcasting.
An MPEG2-TS stream is composed of a plurality of TS packets each having a fixed length of 188 bytes. The length of the TS packet is determined to be 188 bytes in consideration of consistency with the length of ATM (Asynchronous Transfer Mode) cells.
Each of the TS packets is composed of a packet header having a fixed length of 4 bytes, an adaptation field having a variable length, and a payload. A PID (packet identifier) and various flags are defined in the packet header. The PID identifies a type of the TS packet.
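The following is a minimal sketch, in Python and for illustration only, of how the 4-byte packet header described above could be parsed. The field layout follows the MPEG2-TS packet header; the function name and returned dictionary are hypothetical.

```python
def parse_ts_header(packet: bytes) -> dict:
    """Parse the 4-byte header of a 188-byte MPEG2-TS packet (illustrative sketch)."""
    assert len(packet) == 188 and packet[0] == 0x47          # 0x47 is the sync word
    pid = ((packet[1] & 0x1F) << 8) | packet[2]              # 13-bit packet identifier
    payload_unit_start = bool(packet[1] & 0x40)              # a PES packet starts in this payload
    adaptation_field_control = (packet[3] >> 4) & 0x03       # payload / adaptation field presence
    return {
        "pid": pid,
        "payload_unit_start": payload_unit_start,
        "has_adaptation_field": adaptation_field_control in (2, 3),
        "has_payload": adaptation_field_control in (1, 3),
        "continuity_counter": packet[3] & 0x0F,
    }
```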
A PES (Packetized Elementary Stream) packet that contains streams such as video streams and audio streams is divided into a plurality of TS packets having the same PID number, and the divided TS packets are transmitted. Video encoding is performed in accordance with the MPEG2 standard or the MPEG4-AVC/H.264 standard used for one-segment broadcasting, for example. Audio encoding is performed in accordance with the AAC (MPEG2 Advanced Audio Coding) standard, for example.
Also, in the same way as a PES packet that contains video streams and audio streams, a PES packet that contains data such as subtitles is divided into a plurality of TS packets, and the divided TS packets are transmitted.
An MPEG2-TS stream contains a PCR (Program Clock Reference) that is transmitted at predetermined time intervals so as to adjust an STC (System Time Clock) that is a reference signal of a decoder.
An adaptation field included in a TS packet is used for transmitting additional information relating to streams. A PCR of 6 bytes is stored in an optional field included in the adaptation field.
An STC counter included in the decoder counts the STC. The STC is corrected based on the PCR.
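As a rough illustration of the PCR/STC relationship described above, the sketch below extracts the 6-byte PCR from the adaptation field and uses it to correct a free-running STC counter. The helper names are hypothetical and error handling is omitted.

```python
def extract_pcr(packet: bytes):
    """Return the PCR in 27 MHz units if the adaptation field carries one, else None."""
    adaptation_field_control = (packet[3] >> 4) & 0x03
    if adaptation_field_control not in (2, 3):
        return None                                   # no adaptation field in this packet
    if packet[4] == 0 or not (packet[5] & 0x10):
        return None                                   # adaptation field empty or PCR_flag not set
    b = packet[6:12]                                  # 6-byte PCR field: 33-bit base + 9-bit extension
    base = (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
    extension = ((b[4] & 0x01) << 8) | b[5]
    return base * 300 + extension                     # base is 90 kHz; the extension refines it to 27 MHz


class ReferenceClock:
    """Free-running STC counter that is re-seeded whenever a PCR arrives (sketch)."""
    def __init__(self) -> None:
        self.stc = 0

    def correct(self, pcr_27mhz: int) -> None:
        self.stc = pcr_27mhz
```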
In order to perform audio-video synchronization of an MPEG2-TS stream, a time stamp is added to each access unit (for example, each video picture and each audio frame). The access unit is converted into a PES packet. A PTS (Presentation Time Stamp) for designating a playback time is added to a header of the PES packet.
The PTS is described in the same method as that of the STC. The STC increases in synchronization with a PCR inside the decoder. The decoder compares the STC with the PTS. When the STC matches the PTS, the decoder starts display processing and decoding processing. As a result, playback in synchronization with the STC is performed.
In this way, the STC that increases in synchronization with the PCR and a PTS added to each of transmitted audio and video PES packets included in an MPEG2-TS stream are compared with each other. When the STC matches a PTS of each of the audio and video PES packets, display processing and decoding processing are started. This realizes audio-video synchronous playback.
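The comparison between the STC and a PTS can be pictured with the following hedged sketch; the tolerance parameter and the returned actions are assumptions made for illustration, not part of the standard.

```python
def sync_decision(pts: int, stc: int, tolerance: int = 0) -> str:
    """Decide how to handle a decoded access unit given its PTS and the current STC.
    Both values are assumed to be in the same 90 kHz time base."""
    if abs(pts - stc) <= tolerance:
        return "present"          # PTS matches the STC: start decoding/display now
    if pts > stc:
        return "wait"             # the access unit is early: hold it until the STC catches up
    return "skip"                 # the access unit is late: drop it to regain synchronization
```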
In a broadcast recording/playback system, an encoded and multiplexed stream provided by a video provider is stored in the decoder. Encoded audio data and encoded video data included in the stored stream are separated from each other. The separated encoded audio data and video data are independently decoded, and the decoded audio signals and video signals are played back in synchronization with each other.
As a method for audio-video synchronization of stored streams, an audio master mode in which audio playback signals are focused on is used (See Patent Document 1 for example). According to the audio master mode, when each of audio frames is output, an STC is updated based on an output time of the audio frame. At this time, a display time of each of video frames is compared with the STC. Based on a result of the comparison, operations for accelerating or delaying a display time of the video frame are performed.
Also, a video master mode in which video playback signals are focused on is used for audio-video synchronization of stored streams (See Patent Document 2 for example).
Here, in order to start audio-video playback of stored streams, a control method for setting a playback start time (hereinafter an “entry point”) to start playback from a desired point (time) is used.
The following describes the above method for starting playback from a desired entry point to perform synchronous playback.
As shown in
When an entry point is externally designated and start of playback of a stream from the entry point is instructed, the synchronization control unit 1305 issues a start-up request to the transmission unit 1301, the system decoder 1302, the video decoding unit 1303, and the audio decoding unit 1304 (Step S1401). Upon receiving the start-up request, the transmission unit 1301, the system decoder 1302, the video decoding unit 1303, and the audio decoding unit 1304 start up.
Next, the synchronization control unit 1305 issues a stream supply request to the transmission unit 1301 (Step S1402). Upon receiving the stream supply request, the transmission unit 1301 transmits the stream including the entry point to the system decoder 1302 in order from the front of the stream. Upon receiving the stream from the transmission unit 1301, the system decoder 1302 starts separating encoded video data and encoded audio data included in the stream from each other, and extracting the separated encoded video and audio data.
Then, the video decoding unit 1303 continues to decode video frames of the encoded video data received from the system decoder 1302 until a video PTS received from the system decoder 1302 matches the entry point within a predetermined threshold range (hereinafter “an entry point acceptable range”) (Step S1403). At this time, the video decoding unit 1303 only decodes video frames, and does not display the decoded video frames yet.
If a video PTS received from the system decoder 1302 matches the entry point within the entry point acceptable range (Step S1403: YES), the synchronization control unit 1305 initializes the STC included therein based on the video PTS (Step S1404).
Then, the synchronization control unit 1305 issues a video frame synchronous display request to the video decoding unit 1303 (Step S1405). Upon receiving the video frame synchronous display request, the video decoding unit 1303 decodes a video frame whose video PTS received from the system decoder 1302 matches the entry point within the entry point acceptable range in Step S1403, and also displays the decoded video frame at the same time. The decoded video frame is displayed for the first time at this stage. After this, under the synchronization control by the synchronization control unit 1305 using the STC and video PTSs received from the system decoder 1302, the video decoding unit 1303 successively decodes video frames of the encoded video data received from the system decoder 1302, and displays the decoded video frames.
Then, the audio decoding unit 1304 continues to decode audio frames of the encoded audio data received from the system decoder 1302 until an audio PTS received from the system decoder 1302 matches the STC within the entry point acceptable range (Step S1406).
If an audio PTS received from the system decoder 1302 matches the STC within the entry point acceptable range (Step S1406: YES), the synchronization control unit 1305 issues an audio frame synchronous display request to the audio decoding unit 1304 (Step S1407).
Upon receiving the audio frame synchronous display request, the audio decoding unit 1304 decodes an audio frame whose audio PTS received from the system decoder 1302 matches the STC within the entry point acceptable range in Step S1406, and also displays the decoded audio frame at the same time. The decoded audio frame is displayed for the first time at this stage. After this, under the synchronization control by the synchronization control unit 1305 using the STC and audio PTSs received from the system decoder 1302, the audio decoding unit 1304 successively decodes audio frames of the encoded audio data received from the system decoder 1302, and displays the decoded audio frames.
Also, Patent Document 2 discloses a technique for detecting a video frame and an audio frame that match a designated entry point with frame-level accuracy, by using the temporal reference, which is display order information defined by the MPEG standard.
Furthermore, Patent Document 3 discloses a synchronous playback device including a timer for error judgment. According to this synchronous playback device, when output preparation of only one of a video frame and an audio frame is completed after a lapse of a predetermined time period, it is possible to output only the frame whose output preparation is complete. Also, even if a received stream includes only one of a video frame and an audio frame that matches an entry point, it is possible to normally start playback.
Patent Document 1: Japanese Patent Application Publication No. H7-50818
Patent Document 2: Japanese Patent Application Publication No. H9-514146
Patent Document 3: Japanese Patent Application Publication No. 2001-346166
A first problem to be solved by the present invention is described with reference to
Next, a second problem to be solved by the present invention is described with reference to
In other words, if only one of an audio start frame and a video start frame is found, only the found start frame is output, at a fixed time that is not influenced by the time periods actually required for searching for the audio start frame and the video start frame. Accordingly, when only one of the audio start frame and the video start frame is to be output, the start of playback is delayed by the longer of the two search time periods.
Furthermore, a third problem to be solved by the present invention is described with reference to
As shown in
Since the video playback device operates in the audio master mode, the audio playback data continues to be played back even at the discontinuity point at which the discontinuity has occurred, and the reference clock, which follows the audio playback data, increases discontinuously as well. In contrast, there is a great difference between a PTS of the video frame 2302 and the reference clock, and this difference is beyond an allowable range for synchronous playback. As a result, asynchronous playback is performed. Suppose that the allowable range is 10, for example. The PTS of the video frame 2302 is 113, and the reference clock is 47. The difference between the PTS of the video frame 2302 and the reference clock is 66, which is beyond the allowable range of 10. However, once playback of the audio playback data passes the discontinuity point between the audio frames 2303 and 2304, the PTS of the video frame 2305 is synchronous with the reference clock. As a result, synchronous playback is performed normally again, and no problem arises.
However, as shown in
If the asynchronous playback continues, the audio playback data and the video playback data are not synchronous with each other. As a result, lip-sync mismatch continues.
In order to address such a case, according to conventional arts, if asynchronous playback continues for a certain time period, playback of the video playback data or the audio playback data is suspended, the status of the video playback device is restored to a playback start status, and then playback is started again. However, if playback is suspended and then started again, the status of the video playback device is initialized, so start-up processing needs to be performed. Furthermore, since data that has already been stored in a buffer or the like is discarded by the initialization, it takes a long time to start the playback again. As a result, a problem arises in which neither video nor audio is output during that long time period.
The present invention is made in view of the above problems. The present invention aims to provide a playback device with which, even in a case where a temporal discontinuity between video display times and/or between audio display times has occurred, at the start of playback or during playback, it is possible to start playback from the most appropriate video frame and audio frame for the case, and to continue the playback without interruption of video signals and audio signals.
In order to solve the above problems, a playback device according to a first aspect of the present invention is a playback device for playing back a data stream that contains frames respectively having frame playback times that are provided with predetermined time intervals, the playback device comprising: an acquisition unit operable to sequentially acquire the frame playback times of the frames; a storage unit operable to store therein the frame playback times acquired by the acquisition unit; a judgment unit operable to judge whether a current frame playback time that has been most recently acquired by the acquisition unit is included in a playback start period including a designated playback start time, and if judging negatively, further judge whether a difference between the current frame playback time and a previous frame playback time that is immediately previous to the current frame playback time is greater than the predetermined time interval; and a playback unit operable, (a) if the judgment unit judges that the current frame playback time is included in the playback start period, to start playback from a frame having the current frame playback time, and (b) if the judgment unit judges that the current frame playback time is not included in the playback start period and further judges that the difference is greater than the predetermined time interval, to start playback from a frame having the current frame playback time.
In order to solve the above problems, a playback device according to a second aspect of the present invention is a playback device for decoding and playing back a data stream that contains video frames and audio frames respectively having video playback times and audio playback times that are provided with predetermined time intervals, the playback device comprising: an acquisition unit operable to sequentially acquire video frame playback times of the video frames and audio frame playback times of the audio frames; a judgment unit operable to judge whether a video frame playback time and an audio frame playback time acquired by the acquisition unit are respectively included in a playback start period including a designated playback start time; and a decoding unit operable, if the judgment unit judges affirmatively, to decode a video frame and an audio frame respectively having the video frame playback time and the audio frame playback time included in the playback start period, and start playback from the decoded video frame and the decoded audio frame, wherein maximum time limits Tv and Ta are set, Tv being required for judging whether a video frame playback time acquired by the acquisition unit is included in the playback start period, and Ta being required for judging whether an audio frame playback time acquired by the acquisition unit is included in the playback start period, and (i) when Tv is greater than Ta, if the acquired audio frame playback time is not included in the playback start period and the video frames have been decoded after a lapse of Ta, the decoding unit plays back the video frames, and (ii) when Ta is greater than Tv, if the acquired video frame playback time is not included in the playback start period and the audio frames have been decoded after a lapse of Tv, the decoding unit plays back the audio frames.
In order to solve the above problems, a playback device according to a third aspect of the present invention is a playback device for playing back a data stream that contains frames respectively having frame playback times provided with predetermined time intervals, in synchronization with a system time clock included therein, the playback device comprising: a judgment unit operable to sequentially judge whether frame playback times of the frames match the system time clock; a playback unit operable, if the judgment unit judges affirmatively, to play back a frame having a frame playback time that matches the system time clock; a control unit operable, if the judgment unit judges negatively, to instruct the playback unit to play back, in asynchronization with the system time clock, a frame having a frame playback time that does not match the system time clock; and a count unit operable, in accordance with an instruction issued by the control unit, to count the number of frames that have been played back by the playback unit in asynchronization with the system time clock, wherein if the judgment unit judges negatively and the count unit counts a predetermined number of frames that have been played back in asynchronization with the system time clock, the control unit instructs the playback unit to forcibly play back, in synchronization with the system time clock, a frame having a frame playback time that does not match the system time clock.
In order to solve the above problems, a portable telephone of the present invention is a portable telephone comprising the playback device according to any one of the first to the third aspects.
According to the playback device according to the first aspect, the following is possible. At start of playback of a stream from which a frame has been lost or in which an error has occurred due to recording under a bad radio wave condition, even if a frame matching an entry point acceptable range (a frame included in a playback start time period) is lost from the stream, it is possible to start playback from the most appropriate frame located near the lost frame.
According to the playback device according to the second aspect, maximum time limits for searching for an audio frame and a video frame matching an entry point acceptable range are separately managed. Accordingly, if only one of the audio frame and the video frame matching the entry point acceptable range is found, it is possible to output the found frame and start playback without being delayed by the longer of the two search time periods.
According to the playback device according to the third aspect, it is possible to reduce an asynchronous status between a PTS of each frame and the reference clock during playback of a stream from which a frame has been lost or in which an error has occurred due to recording under a bad radio wave condition.
Also, in the synchronous playback device according to the first aspect, the data stream may contain program reference times that increase by each predetermined period, and the playback device may further comprise a detection unit operable to (a) sequentially acquire program reference times, (b) judge whether a difference between a current program reference time that has been most recently acquired and a previous program reference time that is immediately previous to the current program reference time is greater than a predetermined threshold value, and (c) if judging affirmatively, detect that a discontinuity has occurred between the current program reference time and the previous program reference time, and store therein the current program reference time, if the difference between the current frame playback time and the previous frame playback time is greater than the predetermined time interval and the detection unit stores therein the current program reference time, the judgment unit may further judge whether a difference between the current program reference time and the current frame playback time is greater than the predetermined period, and if the difference between the current program reference time and the current frame playback time is no more than the predetermined period, the playback unit may not start playback from the frame having the current frame playback time.
With this structure, if a frame matching an entry point acceptable range (a frame included in a playback start time period) is lost from a stream, whether the frame is lost due to a frame bit error or a packet error in the stream is judged. Only if the frame is lost due to the frame bit error, it is possible to start playback from the most appropriate frame located near the lost frame.
Also, the synchronous playback device according to the first aspect may comply with an MPEG (Moving Picture Experts Group) standard, wherein each of the frame playback times may be a PTS (Presentation Time Stamp) defined by the MPEG standard, and each of the program reference times may be a PCR (Program Clock Reference) defined by the MPEG standard.
With this structure, when a playback device defined by the MPEG standard starts playback of a stream from which a frame has been lost or in which an error has occurred, even if a frame matching an entry point acceptable range (a frame included in a playback start time period) is lost from the stream, it is possible to start playback of the stream from the most appropriate frame located near the lost frame.
Also, in the synchronous playback device according to the third aspect, (i) if the judgment unit judges that a frame playback time is greater than the system time clock and the count unit counts the predetermined number of frames that have been played back in asynchronization with the system time clock, the control unit may instruct the playback unit to wait until the frame playback time matches the system time clock and play back a frame having the frame playback time, and (ii) if the judgment unit judges that a frame playback time is less than the system time clock and the count unit counts the predetermined number of frames that have been played back in asynchronization with the system time clock, the control unit may instruct the playback unit not to play back a frame having the frame playback time.
With this structure, if a PTS of each frame is asynchronous with the reference clock during playback of a stream from which a frame has been lost or in which an error has occurred, it is possible to restore an asynchronous status of the playback device into a synchronous status by forcibly accelerating or delaying the frame.
Also, in the synchronous playback device according to the third aspect, if the judgment unit judges that a frame playback time is greater than the system time clock and the count unit counts the predetermined number of frames that have been played back in asynchronization with the system time clock, and further if a difference between the frame playback time and the system time clock is greater than a predetermined difference t1, the control unit may instruct the playback unit to wait for a predetermined period t2 that is less than the predetermined difference t1, and play back a frame having the frame playback time.
With this structure, if a large asynchronization occurs between a PTS of a frame and the reference clock, it is possible to restore the playback device from an asynchronous status to a synchronous status by forcibly accelerating the frames little by little, while minimizing the viewer's discomfort in watching the video as much as possible.
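A minimal sketch of the forced resynchronization behavior described for the third aspect is given below; the counter threshold, t1, t2, and the returned action names are all hypothetical design parameters, not values fixed by the invention.

```python
def forced_resync_action(pts: int, stc: int, async_count: int,
                         max_async_frames: int, t1: int, t2: int) -> str:
    """Return how to handle a frame whose PTS does not match the STC (illustrative sketch)."""
    if async_count < max_async_frames:
        return "play_asynchronously"      # keep playing out of sync while counting such frames
    if pts > stc:
        if pts - stc > t1:
            return "wait_t2_then_play"    # large lead: close the gap gradually by waiting t2 (< t1)
        return "wait_until_match"         # wait until the STC reaches the PTS, then play
    return "do_not_play"                  # PTS is behind the STC: the frame is not played back
```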
Also, in the synchronous playback device according to the third aspect, the frames may include first media frames relating to first media and second media frames relating to second media, the system time clock may be set based on each of first media frame playback times of the first media frames, the judgment unit may separately judge whether each of the first media frame playback times matches the system time clock, and whether each of second media frame playback times of the second media frames matches the system time clock, the count unit may separately count the number of first media frames that have been played back in asynchronization with the system time clock and the number of second media frames that have been played back in asynchronization with the system time clock, and if the judgment unit judges that a first media frame playback time does not match the system time clock and the count unit counts the predetermined number of first media frames that have been played back in asynchronization with the system time clock, the control unit may determine to set the system time clock based on each of the second media frame playback times.
With this structure, during playback of a stream in a master mode corresponding to first media frames (audio frames, for example), if a frame has been lost from the first media frames or an error has occurred in the first media frames, and further if second media frames (video frames, for example) are normal, it is possible to switch from the master mode corresponding to the first media frames to a master mode corresponding to the normal second media frames in which no discontinuity has occurred.
The following describes embodiments of the present invention with reference to the drawings.
As shown in
The transmission unit 101 transmits a multiplexed stream (an MPEG2-TS stream, which is described later) to the demultiplexing unit 102.
The demultiplexing unit 102 extracts a video PTS that is video playback time information and encoded video data from a PES header of a video PES packet (a PES packet which is described later) of the multiplexed stream, and transmits the extracted video PTS and encoded video data to the video decoding unit 104 and the video PTS acquisition unit 105. Also, the demultiplexing unit 102 extracts an audio PTS that is audio playback time information and encoded audio data from a PES header of an audio PES packet of the multiplexed stream, and transmits the extracted audio PTS and encoded audio data to the audio decoding unit 103 and the audio PTS acquisition unit 108.
The audio decoding unit 103 decodes encoded audio data.
The video decoding unit 104 decodes encoded video data.
The audio PTS acquisition unit 108 acquires an audio PTS extracted by the demultiplexing unit 102.
The audio PTS storage unit 109 stores therein an audio PTS acquired by the audio PTS acquisition unit 108.
The audio PTS judgment unit 110 compares an audio PTS currently acquired by the audio PTS acquisition unit 108 with an audio PTS stored in the audio PTS storage unit 109.
The video PTS acquisition unit 105 acquires a video PTS extracted by the demultiplexing unit 102.
The video PTS storage unit 106 stores therein a video PTS acquired by the video PTS acquisition unit 105.
The video PTS judgment unit 107 compares a video PTS currently acquired by the video PTS acquisition unit 105 with a video PTS stored in the video PTS storage unit 106.
The synchronization control unit 111 controls an operation timing of each of the components included in the synchronous playback device 100.
The reference clock unit 112 manages a system time clock (hereinafter an “STC”) that is a reference clock of the system.
Here, the data structure of an MPEG2-TS stream is described with reference to
The TS header 503 is composed of a sync word 505 showing the beginning of the MPEG2-TS packet 502, a data identifier (PID) 506 identifying data included in the MPEG2-TS packet 502, an adaptation field control 507 indicating whether the payload 504 is valid, an adaptation field 508, and other control flags (not shown in the figure). As shown in
Furthermore, the PES header 705 is composed of an optional field 707 and other control flags (not shown in the figure). A PTS 708 that is playback time information of the stream included in the PES packet 701 is stored in the optional field 707. Also, encoded data is stored in the PES packet data 706.
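As a sketch of how the PTS 708 could be read out of the optional field 707 of a PES header, consider the following; it assumes the buffer starts at the packet_start_code_prefix and that only the PTS (not the DTS) is needed. The function name is hypothetical.

```python
def extract_pts(pes: bytes):
    """Return the 33-bit PTS from a PES header, or None if no PTS is present (sketch)."""
    assert pes[0:3] == b"\x00\x00\x01"            # packet_start_code_prefix
    if not (pes[7] & 0x80):                       # PTS_DTS_flags: '10' or '11' means a PTS follows
        return None
    p = pes[9:14]                                 # 5 bytes carrying the 33-bit PTS with marker bits
    return (((p[0] >> 1) & 0x07) << 30) | (p[1] << 22) | ((p[2] >> 1) << 15) \
           | (p[3] << 7) | (p[4] >> 1)
```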
In the present specification, video data is decoded in accordance with the MPEG4-AVC/H.264 standard or the MPEG2/4 standard. Audio data is decoded in accordance with the AAC standard.
Firstly,
When an entry point of video data is externally designated, the synchronization control unit 111 clears a video frame output preparation completion flag to “0”, and starts processing.
Firstly, in Step S201, the synchronization control unit 111 requests the transmission unit 101 to supply a stream. Upon being requested, the transmission unit 101 transmits an MPEG2-TS stream (hereinafter a “multiplexed stream”) including the entry point to the demultiplexing unit 102. The demultiplexing unit 102 stores the multiplexed stream in a buffer included therein. The transmission unit 101 manages overflows of the buffer included in the demultiplexing unit 102. When overflows are likely to occur, the transmission unit 101 stops transmitting a multiplexed stream. Then, when the buffer becomes available, the transmission unit 101 restarts transmitting the multiplexed stream. The buffer is consumed in skip processing of audio frames, decoding processing of audio frames, or decoding processing of video frames.
In step S202, the demultiplexing unit 102 detects a first PES packet. Here, the first PES packet is a PES packet of a transmitted stream that firstly includes a PTS.
In Step S203, the demultiplexing unit 102 extracts encoded video data and a video PTS, and transmits the extracted encoded video data to the video decoding unit 104, and transmits the extracted video PTS to the video PTS acquisition unit 105.
In Step S204, the video PTS acquisition unit 105 transmits, to the synchronization control unit 111 and the video PTS storage unit 106, the video PTS transmitted from the demultiplexing unit 102.
In Step S205, the synchronization control unit 111 compares the entry point that has been externally designated with a video PTS currently acquired by the video PTS acquisition unit 105 (hereinafter a “current video PTS”), and performs video entry point judgment for judging whether the entry point corresponds to a video frame having the current video PTS. If the entry point corresponds to the video frame having the current video PTS, the flow proceeds to Step S206.
If the entry point does not correspond to the video frame having the current video PTS, the flow proceeds to Step S208.
Here, this video entry point judgment is performed based on whether the following condition expression (Expression 1) is satisfied. In the (Expression 1), EPv1 represents the entry point that has been externally designated, Tv1 represents a time period per frame, and PTSnv1 represents the video PTS extracted in Step S203. One time period per video frame is used as Tv1. For example, one time period per frame is 66 ms for 15 fps, and is 33 ms for 30 fps. Note that this condition expression is just an example, and a method for performing the video entry point judgment is not limited to this condition expression.
EPv1−1/2×Tv1<PTSnv1≦EPv1+1/2×Tv1 (Expression 1)
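For illustration, (Expression 1) can be written as the following hedged sketch; all arguments are assumed to share one time base (for example, milliseconds or 90 kHz ticks), and the function name is hypothetical.

```python
def matches_entry_point(pts: float, entry_point: float, frame_period: float) -> bool:
    """(Expression 1): the frame whose PTS lies within half a frame period of the
    designated entry point is judged to correspond to the entry point."""
    return entry_point - frame_period / 2 < pts <= entry_point + frame_period / 2
```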
If the entry point corresponds to the current video PTS (Step S205: YES), the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S206). In accordance with the decoding permission signal transmitted from the synchronization control unit 111, the video decoding unit 104 performs decoding processing of the encoded video data. Decoded video frames are stored in a frame buffer included in the video decoding unit 104.
Then, the synchronization control unit 111 changes the video frame output preparation completion flag to “1” (Step S207).
If the entry point does not correspond to the current video PTS (Step S205: NO), the video PTS judgment unit 107 performs gap judgment for judging whether a gap has occurred between the current video PTS and a video PTS of a previous video frame stored in the video PTS storage unit 106 (hereinafter a “previous video PTS”) (Step S208). Here, this gap judgment is performed based on whether the following condition expression (Expression 2) is satisfied. If the (Expression 2) is satisfied, the flow proceeds to Step S209. If the (Expression 2) is not satisfied, the flow proceeds to Step S210. Regarding a first video frame, a previous video PTS corresponding to a video PTS of the first video frame is not stored in the video PTS storage unit 106. Accordingly, this gap judgment is not performed on the first video frame. In the (Expression 2), Xv1 represents a threshold value that is externally set for performing gap judgment, and is desirably greater than the frame period.
current PTS−previous PTS<0 or current PTS−previous PTS>Xv1 (Expression 2)
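A sketch of the gap judgment of (Expression 2) follows; the threshold corresponds to Xv1 and is assumed to be larger than one frame period.

```python
def pts_gap_detected(current_pts: float, previous_pts: float, threshold: float) -> bool:
    """(Expression 2): a backward jump, or a forward jump larger than the threshold,
    between consecutive PTSs is treated as a gap."""
    delta = current_pts - previous_pts
    return delta < 0 or delta > threshold
```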
If the (Expression 2) is satisfied (Step S208: YES), the video PTS judgment unit 107 judges that the previous video frame having the previous video PTS in which the gap has occurred is fail-safe data (Step S209). Then, the flow proceeds to Step S206.
If the (Expression 2) is not satisfied (Step S208: NO), the video decoding unit 104 performs skip processing and decoding processing of the encoded video data in accordance with an instruction issued by the video PTS judgment unit 107 (Step S210).
Then, the video PTS storage unit 106 sets the current video PTS as a previous video PTS (Step S211). That is, the video PTS storage unit 106 stores the current video PTS as a previous video PTS in order to use the current video PTS for performing gap judgment of a next video frame. Then, the flow returns to Step S203.
Next,
When an entry point of audio data is externally designated, the synchronization control unit 111 clears an audio frame output preparation completion flag to “0”, and starts processing.
Firstly, in Step S301, the synchronization control unit 111 requests the transmission unit 101 to supply a stream. Upon being requested, the transmission unit 101 transmits a multiplexed stream including the entry point to the demultiplexing unit 102. The processing described so far is the same as the processing performed on video data as described above.
In Step S302, the demultiplexing unit 102 detects a first PES packet. Here, the first PES packet is a PES packet of a transmitted stream that firstly includes a PTS.
In Step 303, the demultiplexing unit 102 extracts encoded audio data and an audio PTS, and transmits the extracted encoded audio data to the audio decoding unit 103, and transmits the extracted audio PTS to the audio PTS acquisition unit 108.
In Step S304, the audio PTS acquisition unit 108 transmits, to the synchronization control unit 111 and the audio PTS storage unit 109, the audio PTS transmitted from the demultiplexing unit 102.
In Step S305, the synchronization control unit 111 compares the entry point that has been externally designated with an audio PTS currently acquired by the audio PTS acquisition unit 108 (hereinafter a “current audio PTS”), and performs audio entry point judgment for judging whether the entry point corresponds to an audio frame having the current audio PTS. If the entry point corresponds to an audio frame having the current audio PTS, the flow proceeds to Step S306.
If the entry point does not correspond to the audio frame having the current audio PTS, the flow proceeds to Step S308.
Here, this audio entry point judgment is performed based on whether the following condition expression (Expression 3) is satisfied. In the (Expression 3), EPa1 represents the entry point that has been externally designated, Ta1 represents a time period per frame, and PTSna1 represents the audio PTS extracted in Step S303. One time period per audio frame is used as Ta1. For example, one time period per frame is 42 ms for AAC at 24 kHz, and is 22 ms for AAC at 48 kHz. Note that this condition expression is just an example, and a method for performing the audio entry point judgment is not limited to this expression.
EPa1−1/2×Ta1<PTSna1≦EPa1+1/2×Ta1 (Expression 3)
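The same entry point check sketched above for video applies to audio with the audio frame period; the snippet below only illustrates the substitution, and the variable names are hypothetical.

```python
# AAC at 24 kHz with 1024 samples per frame gives roughly the 42 ms frame period cited above.
audio_frame_period_ms = 1024 / 24000 * 1000                  # ≈ 42.7 ms
if matches_entry_point(current_audio_pts_ms, entry_point_ms, audio_frame_period_ms):
    pass                                                     # Step S306: permit decoding of this audio frame
```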
If the entry point corresponds to the current audio PTS (Step S305: YES), the synchronization control unit 111 transmits a decoding permission signal to the audio decoding unit 103 (Step S306). In accordance with the decoding permission signal transmitted from the synchronization control unit 111, the audio decoding unit 103 performs decoding processing of the encoded audio data. Decoded audio frames are stored in a frame buffer included in the audio decoding unit 103.
Then, the synchronization control unit 111 changes the audio frame output preparation completion flag to “1” (Step S307).
If the entry point does not correspond to the current audio PTS (Step S305: NO), the audio PTS judgment unit 110 performs gap judgment for judging whether a gap has occurred between the current audio PTS and an audio PTS of a previous audio frame stored in the audio PTS storage unit 109 (hereinafter a “previous audio PTS”) (Step S308). Here, this gap judgment is performed based on whether the following condition expression (Expression 4) is satisfied. If the (Expression 4) is satisfied, the flow proceeds to Step S309. If the (Expression 4) is not satisfied, the flow proceeds to Step S310. Regarding a first audio frame, a previous audio PTS corresponding to a PTS of the first audio frame is not stored in the audio PTS storage unit 109. Accordingly, this gap judgment is not performed on the first audio frame. In the (Expression 4), Xa1 represents a threshold value that is externally set for performing gap judgment, and is desirably greater than the frame period.
current PTS−previous PTS<0 or current PTS−previous PTS>Xa1 (Expression 4)
If the (Expression 4) is satisfied (Step S308: YES), the audio PTS judgment unit 110 judges that the previous audio frame having the previous audio PTS in which the gap has occurred is fail-safe data (Step S309). Then, the flow proceeds to Step S306.
If the (Expression 4) is not satisfied (Step S308: NO), the audio decoding unit 103 performs skip processing of the encoded audio data in accordance with an instruction issued by the audio PTS judgment unit 110 (Step S310).
Then, the audio PTS storage unit 109 sets the current audio PTS as a previous audio PTS (Step S311). That is, the audio PTS storage unit 109 stores the current audio PTS as a previous audio PTS in order to use the current audio PTS for performing gap judgment of a next audio frame. Then, the flow returns to Step S303.
The following describes operations of the synchronization control unit 111 for determining a playback start video frame and a playback start audio frame and starting playback, with reference to the flow chart shown in
In Step S401, the synchronization control unit 111 waits until output preparation of both the video frame and the audio frame from which playback is instructed to start is complete. When the output preparation is complete (Step S401: YES), the flow proceeds to Step S402.
If both of the video frame and the audio frame are fail-safe data (Step S402: YES), the flow proceeds to Step S405. Otherwise (Step S402: NO), the flow proceeds to Step S403.
If at least one of the video frame and the audio frame is not fail-safe data (Step S402: NO), and further if only the audio frame is fail-safe data (Step S403: YES), the flow proceeds to Step S408. Otherwise (Step S403: NO), the flow proceeds to Step S404.
If only the video frame is fail-safe data (Step S404: YES), the flow proceeds to Step S406. Otherwise (Step S404: NO), the flow proceeds to Step S405.
If both of the video frame and the audio frame are fail-safe data (Step S402: YES), or if both of the video frame and the audio frame are not fail-safe data (Step S404: NO), the synchronization control unit 111 compares an audio PTS of the audio frame with a video PTS of the video frame (Step S405). If the audio PTS is equal to or less than the video PTS (Step S405: YES), the synchronization control unit 111 determines to start the playback in the audio master mode, and the flow proceeds to Step S406. If the audio PTS is greater than the video PTS (Step S405: NO), the synchronization control unit 111 determines to start playback in the video master mode, and the flow proceeds to Step S408.
If determining to start the playback in the audio master mode (Step S405: YES), the synchronization control unit 111 instructs the audio decoding unit 103 to play back audio frames, and corrects the reference clock unit 112 based on the audio PTS (Step S406). The reference clock unit 112 is corrected based on the audio PTS at the frame period, which is a playback timing of the audio frames.
Then, the synchronization control unit 111 performs synchronization control based on a difference between the video PTS and an STC detected from the reference clock unit 112 (Step S407). If the video PTS matches the STC, the synchronization control unit 111 instructs the video decoding unit 104 to play back the video frames.
If determining to start the playback in the video master mode (Step S405: NO), the synchronization control unit 111 instructs the video decoding unit 104 to play back video frames, and corrects the reference clock unit 112 based on the video PTS (Step S408). The reference clock unit 112 is corrected based on the video PTS at the frame period, which is a playback timing of the video frames.
Then, the synchronization control unit 111 performs synchronization control based on a difference between the audio PTS and an STC detected from the reference clock unit 112 (Step S409). If the audio PTS matches the STC, the synchronization control unit 111 instructs the audio decoding unit 103 to play back the audio frames.
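The decision flow of Steps S402 through S409 can be summarized by the following sketch; the boolean flags and return values are simplifications assumed for illustration.

```python
def choose_master_mode(video_is_failsafe: bool, audio_is_failsafe: bool,
                       video_pts: float, audio_pts: float) -> str:
    """Return "audio" for the audio master mode or "video" for the video master mode."""
    if audio_is_failsafe and not video_is_failsafe:
        return "video"                    # Step S403: YES -> start in the video master mode
    if video_is_failsafe and not audio_is_failsafe:
        return "audio"                    # Step S404: YES -> start in the audio master mode
    # Both fail-safe, or neither: the medium with the earlier PTS becomes the master (Step S405).
    return "audio" if audio_pts <= video_pts else "video"
```

In the audio master mode the reference clock is then corrected from audio PTSs (Step S406), and in the video master mode from video PTSs (Step S408), as described above.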
As has been described above, with the structure of the synchronous playback device 100 according to the first embodiment of the present invention, it is possible to perform the following processing. Even if a discontinuity has occurred due to a frame loss from a stream or a PTS bit error, the synchronous playback device 100 detects a gap between PTSs of audio frames or video frames at a time of starting playback, and stops searching for a lost audio frame or video frame corresponding to an entry point. Then, the synchronous playback device 100 determines the most appropriate frame (fail-safe data) located near the lost audio frame or video frame corresponding to the entry point, and starts playback of the stream from the fail-safe data.
The following describes a second embodiment of the present invention.
The second embodiment differs from the first embodiment in the following point. In the second embodiment, maximum time limits for searching a stream for a video frame and an audio frame corresponding to an entry point are determined beforehand. After a lapse of the respective maximum time limits, the searching times out.
The following describes the second embodiment, focusing on the difference from the first embodiment.
The synchronous playback device 800 includes a transmission unit 801, a demultiplexing unit 802, an audio decoding unit 803, an audio PTS acquisition unit 808, an audio PTS storage unit 809, an audio PTS judgment unit 810, a video decoding unit 804, a video PTS acquisition unit 805, a video PTS storage unit 806, a video PTS judgment unit 807, a synchronization control unit 811, a reference clock unit 812, an audio time management unit 813, and a video time management unit 814.
The transmission unit 801, the demultiplexing unit 802, the audio decoding unit 803, the audio PTS acquisition unit 808, the audio PTS storage unit 809, the audio PTS judgment unit 810, the video decoding unit 804, the video PTS acquisition unit 805, the video PTS storage unit 806, the video PTS judgment unit 807, the synchronization control unit 811, and the reference clock unit 812 respectively have the same functions as the transmission unit 101, the demultiplexing unit 102, the audio decoding unit 103, the audio PTS acquisition unit 108, the audio PTS storage unit 109, the audio PTS judgment unit 110, the video decoding unit 104, the video PTS acquisition unit 105, the video PTS storage unit 106, the video PTS judgment unit 107, the synchronization control unit 111, and the reference clock unit 112 that are included in the synchronous playback device 100 according to the first embodiment.
The audio time management unit 813 transmits an audio time-out signal to the synchronization control unit 811 at a preset time.
The video time management unit 814 transmits a video time-out signal to the synchronization control unit 811 at a preset time.
It is possible to separately determine a preset time for the audio time management unit 813 and the video time management unit 814.
Operations for determining a playback start video frame are the same as those shown in
Operations for determining a playback start audio frame are the same as those shown in
The operations shown in
Here, a time period preset for the video time management unit 814 is represented as Tev. A time period preset for the audio time management unit 813 is represented as Tea. Tev and Tea are each a time period sufficient for searching for a frame corresponding to an entry point.
If the synchronization control unit 811 receives an audio time-out signal from the audio time management unit 813 (Step S901: YES), the flow proceeds to Step S902. Otherwise (Step S901: NO), the flow proceeds to Step S904.
If the video frame output preparation completion flag indicates “1” (Step S902: YES), the flow proceeds to Step S903. Otherwise (Step S902: NO), the processing ends.
If the video frame output preparation completion flag indicates “1” (Step S902: YES), the synchronization control unit 811 instructs the video decoding unit 804 to play back video frames, and corrects the reference clock unit 812 based on the video PTS (Step S903). The reference clock unit 812 is corrected based on the video PTS at the frame period, which is a playback timing of the video frames.
If the synchronization control unit 811 does not receive an audio time-out signal from the audio time management unit 813 (Step S901: NO), and further if the synchronization control unit 811 receives a video time-out signal from the video time management unit 814 (Step S904: YES), the flow proceeds to Step S905. Otherwise (Step S904: NO), the processing ends.
If the audio output preparation completion flag indicates “1” (Step S905: YES), the flow proceeds to Step S906. Otherwise (Step S905: NO), the processing ends.
In Step S906, the synchronization control unit 811 instructs the audio decoding unit 803 to play back audio frames, and corrects the reference clock unit 812 based on the audio PTS. The reference clock unit 812 is corrected based on the audio PTS at the frame period, which is a playback timing of the audio frames.
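A hedged sketch of the time-out handling in Steps S901 through S906 is shown below; start_video and start_audio stand in for "instruct the decoding unit to play back and correct the reference clock", and are hypothetical callbacks.

```python
def handle_search_timeout(audio_timed_out: bool, video_timed_out: bool,
                          video_ready: bool, audio_ready: bool,
                          start_video, start_audio) -> None:
    """If the search for one medium's start frame times out, start with the other alone."""
    if audio_timed_out:                   # Step S901: the audio search exceeded Tea
        if video_ready:                   # Step S902: video frame output preparation complete
            start_video()                 # Step S903: play back video, correct clock from video PTS
    elif video_timed_out:                 # Step S904: the video search exceeded Tev
        if audio_ready:                   # Step S905: audio frame output preparation complete
            start_audio()                 # Step S906: play back audio, correct clock from audio PTS
```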
As has been described above, with the structure of the synchronous playback device 800 according to the second embodiment of the present invention, the maximum time limits for searching for an audio playback start frame and a video playback start frame corresponding to an entry point are managed separately. Accordingly, even if only one of the audio playback start frame and the video playback start frame is found, it is possible to output only the found frame without being delayed by the longer of the two search time periods. For example, suppose that the maximum time limit for searching for the video playback start frame is set to be longer than the maximum time limit for searching for the audio playback start frame. If the video playback start frame is found, it is possible to output the found video playback start frame once the maximum time limit for searching for the audio playback start frame has elapsed.
The following describes a third embodiment of the present invention.
The third embodiment differs from the first embodiment in the following point. In the third embodiment, a PCR (described later) included in a stream is used for searching for a video frame and an audio frame corresponding to an entry point.
The following describes the third embodiment, focusing on the difference from the first embodiment.
The synchronous playback device 1000 includes a transmission unit 1001, a discontinuity detection unit 1002, a demultiplexing unit 1003, an audio decoding unit 1004, an audio PTS acquisition unit 1005, an audio PTS storage unit 1006, an audio PTS judgment unit 1007, a video decoding unit 1008, a video PTS acquisition unit 1009, a video PTS storage unit 1010, a video PTS judgment unit 1011, a synchronization control unit 1012, and a reference clock unit 1013.
The transmission unit 1001, the demultiplexing unit 1003, the audio decoding unit 1004, the audio PTS acquisition unit 1005, the audio PTS storage unit 1006, the audio PTS judgment unit 1007, the video decoding unit 1008, the video PTS acquisition unit 1009, the video PTS storage unit 1010, the video PTS judgment unit 1011, the synchronization control unit 1012, and the reference clock unit 1013 respectively have the same functions as the transmission unit 101, the demultiplexing unit 102, the audio decoding unit 103, the audio PTS acquisition unit 108, the audio PTS storage unit 109, the audio PTS judgment unit 110, the video decoding unit 104, the video PTS acquisition unit 105, the video PTS storage unit 106, the video PTS judgment unit 107, the synchronization control unit 111, and the reference clock unit 112 that are included in the synchronous playback device 100 according to the first embodiment.
The discontinuity detection unit 1002 detects a discontinuity between PCRs of an MPEG2-TS stream. The discontinuity detection unit 1002 compares a PCR currently transmitted (hereinafter a “current PCR”) with a PCR that has been previously transmitted (hereinafter a “previous PCR”), the PCRs being transmitted at predetermined time intervals, and detects a discontinuity based on a difference between the current PCR and the previous PCR. The discontinuity detection unit 1002 notifies the video PTS judgment unit 1011 and the audio PTS judgment unit 1007 of the PCR in which a discontinuity is detected. Although not shown in the figure, the discontinuity detection unit 1002 includes therein a buffer for storing a PCR in which a discontinuity is detected.
Here, the PCRs of an MPEG2-TS stream are described with reference to
As shown in
Firstly,
Firstly, in Step S1101, the synchronization control unit 1012 requests the transmission unit 1001 to supply a stream. Upon being requested, the transmission unit 1001 transmits a multiplexed stream including the entry point to the discontinuity detection unit 1002. The demultiplexing unit 1003 stores, in a buffer included therein, the multiplexed stream transmitted from the discontinuity detection unit 1002. The transmission unit 1001 manages overflows of the buffer included in the demultiplexing unit 1003. When overflows are likely to occur, the transmission unit 1001 stops transmitting a multiplexed stream. Then, when the buffer becomes available, the transmission unit 1001 restarts transmitting the multiplexed stream. The buffer is consumed in skip processing of audio frames, decoding processing of audio frames, or decoding processing of video frames.
In Step S1102, the discontinuity detection unit 1002 searches PCRs of TS packets of the multiplexed stream for a discontinuity, and stores therein information indicating that the discontinuity has occurred. Discontinuity detection judgment for judging whether a discontinuity has occurred is performed based on whether the following condition expression (Expression 5) is satisfied. If the (Expression 5) is satisfied, the flow proceeds to Step S1103. If the (Expression 5) is not satisfied, the flow proceeds to Step S1104. In the (Expression 5), the current PCR represents a PCR of a TS packet that is currently detected. The previous PCR represents a PCR of a TS packet that has been previously detected. Ya represents a threshold value, and is desirably greater than the PCR transmission cycle.
current PCR−previous PCR<0 or current PCR−previous PCR>Ya (Expression 5)
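A sketch of the discontinuity detection judgment of (Expression 5) follows; the threshold corresponds to Ya and is assumed to be larger than the PCR transmission cycle.

```python
def pcr_discontinuity_detected(current_pcr: int, previous_pcr: int, threshold: int) -> bool:
    """(Expression 5): a backward jump, or a forward jump larger than the threshold,
    between consecutive PCRs is treated as a discontinuity."""
    delta = current_pcr - previous_pcr
    return delta < 0 or delta > threshold
```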
If the current PCR and the previous PCR are discontinuous from each other (Step S1102: YES), the discontinuity detection unit 1002 stores therein a PCR which is a discontinuity point between TS packets, and transmits the PCR to the video PTS judgment unit 1011 (Step S1103). Also, the discontinuity detection unit 1002 transmits the multiplexed stream to the demultiplexing unit 1003.
Each time a stream is transmitted from the transmission unit 1001, the above PCR storage and discontinuity judgment in Steps S1102 and S1103 are performed.
If a transmitted PES packet is a first PES packet, the demultiplexing unit 1003 detects the transmitted PES packet as a first PES packet (Step S1104). Here, the first PES packet is a PES packet of a transmitted stream that firstly includes a PTS. If a transmitted PES packet is a PES packet following the first PES packet, the transmitted PES packet is not detected in Step S1104. Then, the flow proceeds to Step S1105.
Then, the demultiplexing unit 1003 extracts encoded video data and a video PTS, and transmits the extracted encoded video data to the video decoding unit 1008, and transmits the extracted video PTS to the video PTS acquisition unit 1009 (Step S1105).
The video PTS acquisition unit 1009 transmits, to the synchronization control unit 1012 and the video PTS storage unit 1010, the video PTS transmitted from the demultiplexing unit 1003 (Step S1106).
The synchronization control unit 1012 compares the entry point which has been externally designated with a current video PTS managed by the video PTS acquisition unit 1009, and performs video entry point judgment for judging whether the entry point corresponds to a video frame having the current video PTS (Step S1107). If the entry point corresponds to the video frame having the current video PTS (Step S1107: YES), the flow proceeds to Step S1108.
If the entry point does not correspond to the video frame having the current video PTS, the flow proceeds to Step S1110.
Here, the video entry point judgment is performed based on whether the following condition expression (Expression 6) is satisfied. If the (Expression 6) is satisfied, the synchronization control unit 1012 judges that the video frame having the video PTS corresponds to the entry point. In the (Expression 6), EPv3 represents an entry point that has been externally designated, Tv3 represents a time period per frame, and PTSnv3 represents the video PTS extracted in Step S1105. One time period per video frame is used as Tv3. For example, one time period per frame is 66 ms for 15 fps, and is 33 ms for 30 fps. Note that this condition expression is just an example, and a method for performing the video entry point judgment is not limited to this condition expression.
EPv3−1/2×Tv3<PTSnv3≦EPv3+1/2×Tv3 (Expression 6)
If the entry point corresponds to the current video PTS (Step S1107: YES), the synchronization control unit 1012 transmits a decoding permission signal to the video decoding unit 1008 (S1108). In accordance with the decoding permission signal transmitted from the synchronization control unit 1012, the video decoding unit 1008 performs decoding processing of the encoded video data. Decoded video frames are stored in a frame buffer included in the video decoding unit 1008.
Then, the synchronization control unit 1012 changes the video frame output preparation completion flag to “1” (Step S1109).
If the entry point does not correspond to the current video PTS (Step S1107: NO), the video PTS judgment unit 1011 performs gap judgment for judging whether a gap has occurred between a current video PTS and a previous video PTS (Step S1110). Here, this gap judgment is performed based on whether the following condition expression (Expression 7) is satisfied. If the (Expression 7) is satisfied, the synchronization control unit 1012 judges that a gap has occurred in the video frame, and the flow proceeds to Step S1111. If the (Expression 7) is not satisfied, the flow proceeds to Step S1113. Regarding a first video frame, a previous video PTS corresponding to a video PTS of the first video frame is not stored in the video PTS storage unit 1010. Accordingly, this gap judgment is not performed on the first video frame. In the (Expression 7), Xv3 represents a threshold value that is externally set for performing the gap judgment, and is desirably greater than the frame period.
current PTS−previous PTS<0 or current PTS−previous PTS>Xv3 (Expression 7)
If the (Expression 7) is satisfied (Step S1110: YES), the video PTS judgment unit 1011 compares the video PTS in which the gap has occurred with a PCR stored in the discontinuity detection unit 1002 in Step S1103 (Step S1111). If the following condition expression (Expression 8) is satisfied (Step S1111: YES), the flow proceeds to Step S1113. If the (Expression 8) is not satisfied (Step S1111: NO), the flow proceeds to Step S1112. In the (Expression 8), Z is desirably a value determined in consideration of the PCR transmission cycle of TS packets and the interleaving between audio frames and video frames. For example, in one-segment broadcasting, the PCR transmission cycle of TS packets is within 400 ms, and the interleaving interval between audio frames and video frames is approximately 1.5 seconds. Accordingly, it is appropriate to set a value corresponding to approximately 2 seconds as Z.
−Z<PCR−current PTS<Z (Expression 8)
If the (Expression 8) is not satisfied (Step S1111: NO), the video PTS judgment unit 1011 judges that the video frame having the video PTS in which the gap has occurred is fail-safe data (Step S1112). Then, the flow proceeds to Step S1108.
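The gap judgment of (Expression 7) and the PCR comparison of (Expression 8) can be summarized by the following sketch. The function names, the millisecond time unit, and the returned labels are illustrative assumptions.

```python
def has_pts_gap(current_pts: float, previous_pts: float, xv3: float) -> bool:
    """(Expression 7): a gap exists if the PTS goes backwards or jumps by more than Xv3."""
    diff = current_pts - previous_pts
    return diff < 0 or diff > xv3


def classify_gap(current_pts: float, stored_pcr: float, z: float) -> str:
    """(Expression 8): -Z < PCR - current PTS < Z.

    If the gapped PTS stays close to the PCR stored at the discontinuity point,
    the gap is attributed to data loss of TS packets and the frame is skipped;
    otherwise the frame is treated as fail-safe data and decoding starts from it.
    """
    if -z < stored_pcr - current_pts < z:
        return "skip"       # discontinuity caused by TS packet loss (Step S1113)
    return "fail_safe"      # discontinuity caused by a PTS bit error (Step S1112)


# Example: Z corresponding to approximately 2 seconds (2000 ms) for one-segment broadcasting.
if has_pts_gap(current_pts=65000.0, previous_pts=5000.0, xv3=500.0):
    print(classify_gap(current_pts=65000.0, stored_pcr=64800.0, z=2000.0))  # "skip"
```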
The video decoding unit 1008 performs skip processing and decoding processing of the encoded video data in accordance with an instruction issued by the video PTS judgment unit 1011 (Step S1113).
Then, the video PTS storage unit 1010 sets the current video PTS as a previous video PTS (Step S1114). That is, the video PTS storage unit 1010 stores the current video PTS as a previous video PTS in order to use the current video PTS for performing gap judgment of a next video frame. Then, the flow returns to Step S1105.
Next, the following describes operations for determining a playback start audio frame.
When an entry point of audio data is externally set, the synchronization control unit 1012 clears an audio frame output preparation completion flag to “0”, and starts processing.
Firstly, in Step S1201, the synchronization control unit 1012 requests the transmission unit 1001 to supply a stream. Upon being requested, the transmission unit 1001 transmits a multiplexed stream including the entry point to the discontinuity detection unit 1002.
In Step S1202, the discontinuity detection unit 1002 searches PCRs of TS packets of the multiplexed stream for a discontinuity, and stores therein information indicating that the discontinuity has occurred. Discontinuity detection judgment for judging whether a discontinuity has occurred is performed based on whether the (Expression 5) is satisfied. If the (Expression 5) is satisfied, the flow proceeds to Step S1203. If the (Expression 5) is not satisfied, the flow proceeds to Step S1204.
If a current PCR and a previous PCR are discontinuous from each other (Step S1202: YES), the discontinuity detection unit 1002 stores therein a PCR which is a discontinuity point between TS packets, and transmits the PCR to the audio PTS judgment unit 1007 (Step S1203). Also, the discontinuity detection unit 1002 transmits the multiplexed stream to the demultiplexing unit 1003.
Each time a stream is transmitted from the transmission unit 1001, the above PCR storage and discontinuity judgment in Steps S1202 and S1203 are performed.
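The discontinuity detection in Steps S1202 and S1203 relies on (Expression 5), which is defined in an earlier part of this description and is not reproduced here. The sketch below therefore assumes a form analogous to (Expression 7), declaring a discontinuity when consecutive PCR values go backwards or jump by more than a threshold; the function names and the threshold are hypothetical.

```python
def pcr_discontinuity(current_pcr: float, previous_pcr: float, threshold: float) -> bool:
    """Assumed check between consecutive PCRs, modeled after (Expression 7)."""
    diff = current_pcr - previous_pcr
    return diff < 0 or diff > threshold


stored_pcrs = []  # PCRs at discontinuity points, kept by the discontinuity detection unit 1002

def on_pcr(current_pcr: float, previous_pcr: float, threshold: float) -> None:
    # Steps S1202 and S1203: if a discontinuity is found, store the PCR at the
    # discontinuity point and hand it to the PTS judgment unit for (Expression 8).
    if pcr_discontinuity(current_pcr, previous_pcr, threshold):
        stored_pcrs.append(current_pcr)
```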
If a transmitted PES packet is a first PES packet, the demultiplexing unit 1003 detects the transmitted PES packet as a first PES packet (Step S1204). Here, the first PES packet is a PES packet of a transmitted stream that firstly includes a PTS. If a transmitted PES packet is a PES packet following the first PES packet, the transmitted PES packet is not detected in Step S1204. Then, the flow proceeds to Step S1205.
Then, the demultiplexing unit 1003 extracts encoded audio data and an audio PTS, and transmits the extracted encoded audio data to the audio decoding unit 1004, and transmits the extracted audio PTS to the audio PTS acquisition unit 1005 (Step S1205).
The audio PTS acquisition unit 1005 transmits, to the synchronization control unit 1012 and the audio PTS storage unit 1006, the audio PTS transmitted from the demultiplexing unit 1003 (Step S1206).
The synchronization control unit 1012 compares the entry point which has been externally designated with a current audio PTS managed by the audio PTS acquisition unit 1005, and performs audio entry point judgment for judging whether the entry point corresponds to an audio frame having the current audio PTS (Step S1207). If the entry point corresponds to the audio frame having the current audio PTS (Step S1207: YES), the flow proceeds to Step S1208.
If the entry point does not correspond to the audio frame having the current audio PTS (Step S1207: NO), the flow proceeds to Step S1210.
Here, the audio entry point judgment is performed based on whether the following condition expression (Expression 9) is satisfied. In the (Expression 9), EPa3 represents an entry point that has been externally designated, Ta3 represents a time period per frame, and PTSna3 represents the audio PTS extracted in Step S1205. One time period per audio frame is used as Ta3. For example, one time period per frame is 42 ms for AAC24 kHz, and is 22 ms for AAC48 kHz. Note that this condition expression is just an example, and a method for performing the audio entry point judgment is not limited to this condition expression.
EPa3−1/2×Ta3<PTSna3≦EPa3+1/2×Ta3 (Expression 9)
If the entry point corresponds to the current audio PTS (Step S1207: YES), the synchronization control unit 1012 transmits a decoding permission signal to the audio decoding unit 1004 (Step S1208). In accordance with the decoding permission signal transmitted from the synchronization control unit 1012, the audio decoding unit 1004 performs decoding processing of the encoded audio data. Decoded audio frames are stored in a buffer included in the audio decoding unit 1004.
Then, the synchronization control unit 1012 changes the audio frame output preparation completion flag to “1” (Step S1209).
If the entry point does not correspond to the current audio PTS (Step S1207: NO), the audio PTS judgment unit 1007 performs gap judgment for judging whether a gap has occurred between a current audio PTS and a previous audio PTS (Step S1210). Here, this gap judgment is performed based on whether the following condition expression (Expression 10) is satisfied. If the (Expression 10) is satisfied, the synchronization control unit 1012 judges that a gap has occurred in the audio frame, and the flow proceeds to Step S1211. If the (Expression 10) is not satisfied, the flow proceeds to Step S1213. Regarding a first audio frame, a previous audio PTS corresponding to an audio PTS of the first audio frame is not stored in the audio PTS storage unit 1006. Accordingly, this gap judgment is not performed on the first audio frame. In the (Expression 10), Xa3 represents a threshold value that is externally set for performing the gap judgment, and is desirably greater than the frame period.
current PTS−previous PTS<0 or current PTS−previous PTS>Xa3 (Expression 10)
If the (Expression 10) is satisfied (Step S1210: YES), the audio PTS judgment unit 1007 compares the audio PTS in which the gap has occurred with a PCR stored in the discontinuity detection unit 1002 in Step S1203 (Step S1211). If the (Expression 8) is satisfied (Step S1211: YES), the flow proceeds to Step S1213. If the (Expression 8) is not satisfied (Step S1211: NO), the flow proceeds to Step S1212.
If the (Expression 8) is not satisfied (Step S1211: NO), the audio PTS judgment unit 1007 judges that the audio frame having the audio PTS in which the gap has occurred is fail-safe data (Step S1212). Then, the flow proceeds to Step S1208. The audio decoding unit 1004 performs skip processing and decoding processing of the encoded audio data in accordance with an instruction issued by the audio PTS judgment unit 1007 (Step S1213).
Then, the audio PTS storage unit 1006 sets the current audio PTS as a previous audio PTS (Step S1214). That is, the audio PTS storage unit 1006 stores the current audio PTS as a previous audio PTS in order to use the current audio PTS for performing gap judgment of a next audio frame. Then, the flow returns to Step S1205.
Note that operations for starting playback after determination of a playback start video frame and a playback start audio frame are the same as the operations described above.
As described above, the synchronous playback device 1000 according to the third embodiment of the present invention detects occurrence of a discontinuity between TS packets using the PCR. Accordingly, it is possible to distinguish a discontinuity caused by data loss of TS packets from a discontinuity caused by PTS bit errors. Since a discontinuity caused by data loss of TS packets may be detected using methods other than the method described in the present specification (for example, using software in the upper layer), a portion corresponding to the lost data is assumed not to be designated as an entry point. According to the third embodiment, if a discontinuity caused by data loss of TS packets is detected, it is possible to continue playback start frame search processing, without suspending the search processing and starting playback from fail-safe data.
The following describes a fourth embodiment of the present invention.
A synchronous playback device according to the fourth embodiment has the same structure as that of the synchronous playback device 100 according to the first embodiment. However, the fourth embodiment differs from the first embodiment in the following point. In the fourth embodiment, processing relating to frame loss is performed during playback of video frames and audio frames. Description of the portions of the structure of the synchronous playback device according to the fourth embodiment that are the same as those of the synchronous playback device 100 is omitted here.
The following describes the points in which the fourth embodiment differs from the first embodiment.
During playback, the demultiplexing unit 102 separates a multiplexed stream transmitted from the transmission unit 101 into video PES data and audio PES data. The demultiplexing unit 102 extracts a video PTS and encoded video data from the video PES data, and then transmits the extracted video PTS and encoded video data to the video decoding unit 104 and the video PTS acquisition unit 105. Also, the demultiplexing unit 102 extracts an audio PTS and encoded audio data from the audio PES data, and then transmits the extracted audio PTS and encoded audio data to the audio decoding unit 103 and the audio PTS acquisition unit 108.
The audio decoding unit 103 includes therein a unique clock, and operates at the frame period. At a head of the frame period, the audio decoding unit 103 acquires encoded audio data corresponding to one audio frame, and performs operations relating to decoding processing in accordance with a control signal transmitted from the synchronization control unit 111. Specifically, if receiving a decoding permission signal from the synchronization control unit 111, the audio decoding unit 103 performs decoding processing, and outputs a result of the decoding processing. If receiving a decoding suspension signal from the synchronization control unit 111, the audio decoding unit 103 suspends decoding processing and outputting a result of the decoding processing. Also, if receiving a frame discard signal from the synchronization control unit 111, the audio decoding unit 103 discards encoded audio data corresponding to a current audio frame, and immediately acquires encoded audio data corresponding to a next audio frame. Then, the audio decoding unit 103 performs processing in accordance with a next signal transmitted from the synchronization control unit 111.
Like the audio decoding unit 103, the video decoding unit 104 includes therein a unique clock, and operates at the frame period. At a head of the frame period, the video decoding unit 104 acquires encoded video data corresponding to one video frame, and performs operations relating to decoding processing in accordance with a control signal transmitted from the synchronization control unit 111. Specifically, if receiving a decoding permission signal from the synchronization control unit 111, the video decoding unit 104 performs decoding processing, and outputs a result of the decoding processing. At the same time, the video decoding unit 104 acquires encoded video data corresponding to a next video frame. If receiving a decoding suspension signal from the synchronization control unit 111, the video decoding unit 104 suspends decoding processing and outputting a result of the decoding processing. Also, if receiving a frame discard signal from the synchronization control unit 111, the video decoding unit 104 decodes encoded video data corresponding to a current video frame, and then acquires encoded video data corresponding to a next video frame. Then, the video decoding unit 104 performs processing in accordance with a next signal transmitted from the synchronization control unit 111.
Here, the video decoding unit 104 performs decoding processing because of the following reason even if the frame discard signal is received.
It is necessary to perform decoding processing of compression-encoded video data such as MPEG compression-encoded video data by referring to previous frames. Accordingly, there is a possibility that if data corresponding to a frame is discarded without being decoded, a frame following the discarded frame cannot be normally decoded.
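The control-signal behavior of the video decoding unit 104 described above can be pictured with the following sketch; the class, method, and signal names are illustrative assumptions. The point it illustrates is that the video decoding unit still decodes a discarded frame, without outputting it, so that later frames referring to it can be decoded.

```python
from enum import Enum


class Control(Enum):
    PERMIT = 1    # decoding permission signal
    SUSPEND = 2   # decoding suspension signal
    DISCARD = 3   # frame discard signal


class VideoDecoderSketch:
    """Illustrative model of one frame period of the video decoding unit 104."""

    def __init__(self, stream):
        self.stream = iter(stream)   # encoded frames, one acquired per frame period
        self.pending = None          # frame acquired at the head of the frame period

    def on_frame_period(self, signal: Control) -> None:
        if self.pending is None:
            self.pending = next(self.stream)          # acquire one frame's worth of data
        if signal is Control.PERMIT:
            self.output(self.decode(self.pending))    # decode and output as usual
            self.pending = None
        elif signal is Control.SUSPEND:
            pass                                      # keep the frame and wait for the next signal
        elif signal is Control.DISCARD:
            self.decode(self.pending)                 # decode but do not output, so that frames
            self.pending = None                       # referencing this one can still be decoded

    def decode(self, frame):
        return frame                                  # placeholder for real decoding

    def output(self, decoded) -> None:
        print("output", decoded)


# Example: permit, suspend (frame held), then permit again.
dec = VideoDecoderSketch(["I0", "P1", "P2"])
for sig in (Control.PERMIT, Control.SUSPEND, Control.PERMIT):
    dec.on_frame_period(sig)   # outputs I0, holds P1, then outputs P1
```

An audio version would differ only in that a discarded frame is not decoded at all, as described above for the audio decoding unit 103.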
The audio PTS acquisition unit 108 acquires an audio PTS transmitted from the demultiplexing unit 102, and transmits the acquired audio PTS to the synchronization control unit 111 at a time when the audio decoding unit 103 acquires encoded audio data.
The video PTS acquisition unit 105 acquires a video PTS transmitted from the demultiplexing unit 102, and transmits the acquired video PTS to the synchronization control unit 111 at a time when the video decoding unit 104 acquires encoded video data.
Upon acquiring the audio PTS from the audio PTS acquisition unit 108, the synchronization control unit 111 detects the reference clock (hereinafter, also referred to as an “STC”) from the reference clock unit 112. The synchronization control unit 111 compares the STC with the audio PTS, and transmits a control signal to the audio decoding unit 103. Also, upon acquiring the video PTS from the video PTS acquisition unit 105, the synchronization control unit 111 detects the STC from the reference clock unit 112. The synchronization control unit 111 compares the STC with the video PTS, and transmits a control signal to the video decoding unit 104.
Also, if judging that a PTS of data corresponding to a current master mode is synchronous with the STC, the synchronization control unit 111 transmits a PTS for correcting the reference clock to the reference clock unit 112.
Furthermore, the synchronization control unit 111 manages whether a current master mode is the audio master mode or the video master mode, and also manages playback statuses respectively indicating whether audio and video are synchronous with the reference clock.
The reference clock unit 112 receives a PTS transmitted from the synchronization control unit 111, and corrects the reference clock based on the received PTS.
The following describes synchronization control processing performed by the synchronization control unit 111.
The synchronization control unit 111 acquires the video PTS transmitted from the video PTS acquisition unit 105 (Step S1601).
The synchronization control unit 111 detects an STC that is a current time from the reference clock unit 112 (Step S1602).
Then, the synchronization control unit 111 judges whether the following condition expression (Expression 11) is satisfied (Step S1603).
|PTS−STC|<t1 (Expression 11)
Here, a threshold value t1 is a fixed value that can be determined for each system or data.
If the (Expression 11) is satisfied (Step S1603: YES), the synchronization control unit 111 judges that the video PTS is synchronous with the STC. The flow proceeds to Step S1604. If the (Expression 11) is not satisfied (Step S1603: NO), the synchronization control unit 111 judges that the video PTS is not synchronous with the STC. The flow proceeds to Step S1609 for performing lost synchronization processing.
Here, there is a case where a plurality of audio frames or video frames are included in one PES packet. In this case, a PTS transmitted from the demultiplexing unit 102 corresponds to the first frame “i” of the PES packet, and the following frames “i+1” and “i+2” of the PES packet have no PTS of their own. In such a case, the video PTS acquisition unit 105 interpolates a video PTS for each video frame having no video PTS, using a parameter relating to the frame rate of the video frames or a predetermined frame rate. Similarly, the audio PTS acquisition unit 108 interpolates an audio PTS for each audio frame having no audio PTS, using a frame rate parameter of the audio frames or a predetermined frame rate.
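The interpolation described above can be written as a one-line calculation; the function name and the example values are illustrative assumptions, and the PTS is assumed to be expressed in the same unit as the frame period.

```python
def interpolate_pts(first_pts: float, frame_index: int, frame_period: float) -> float:
    """PTS of the frame at position frame_index within a PES packet whose header
    carries only the PTS of the first frame (frame_index 0)."""
    return first_pts + frame_index * frame_period


# Example: three video frames in one PES packet at 30 fps (about 33 ms per frame).
print([interpolate_pts(1000.0, i, 33.0) for i in range(3)])  # [1000.0, 1033.0, 1066.0]
```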
The (Expression 11) means the following.
In order to be synchronous with a video frame, it is necessary to start outputting the video frame at the time when the STC matches the PTS. However, reading the STC using software might cause variation in the processing time period. Accordingly, it is difficult to catch exactly the time when the STC matches the PTS. Therefore, if the (Expression 11) is satisfied, the synchronization control unit 111 instructs the video decoding unit 104 to perform decoding processing, allowing a margin (the threshold value t1) in the judgment processing.
If the (Expression 11) is satisfied (Step S1603: YES), the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1604). In accordance with the decoding permission signal, the video decoding unit 104 performs decoding processing, and outputs video playback data that is a result of the decoding processing.
The synchronization control unit 111 detects the current master mode (Step S1605).
If the current master mode is the video master mode (Step S1606: YES), the synchronization control unit 111 transmits, to the reference clock unit 112, a PTS judged to be synchronous (Step S1607). The reference clock unit 112 corrects the reference clock based on the transmitted PTS.
If the current master mode is not the video master mode (Step S1606: NO), the flow proceeds to Step S1608.
Since the PTS is judged to be synchronous with the reference clock, the synchronization control unit 111 clears an asynchronization counter (later described), and changes the video playback status to a synchronous status (Step S1608). Then, the synchronization control processing ends.
If the (Expression 11) is not satisfied (Step S1603: NO), the synchronization control unit 111 judges whether the following condition expression (Expression 12) is satisfied (Step S1609).
0<STC−PTS<t2 (Expression 12)
Here, a threshold value t2 is a fixed value that can be determined for each system or data.
If the (Expression 12) is satisfied (Step S1609: YES), the frame to be decoded is already late relative to its playback time. Accordingly, the video decoding unit 104 discards the frame, and immediately performs processing of a next frame so as to make up for the lost time. That is, the synchronization control unit 111 transmits a frame discard signal to the video decoding unit 104 (Step S1610). Then, the processing ends.
In the fourth embodiment, if a frame to be decoded is late, the synchronization control unit 111 instructs the video decoding unit 104 to discard data corresponding to one frame. However, if a frame to be decoded is greatly later than the STC, the synchronization control unit 111 may instruct the video decoding unit 104 to discard data corresponding to a plurality of frames.
If the (Expression 12) is not satisfied (Step S1609: NO), the flow proceeds to Step S1611.
The synchronization control unit 111 judges whether the following condition expression (Expression 13) is satisfied (Step S1611).
0<(PTS−STC)<t3 (Expression 13)
Here, a threshold value t3 is a fixed value that can be determined for each system or data.
If the (Expression 13) is satisfied (Step S1611: YES), the frame to be decoded is ahead of its playback time. Accordingly, the video decoding unit 104 suspends decoding processing of the frame so as to absorb the advanced time period. That is, the synchronization control unit 111 transmits a decoding suspension signal to the video decoding unit 104 (Step S1612).
After a lapse of a difference time period between the PTS and the STC, the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1613).
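Steps S1603 to S1613 amount to a three-way comparison of the PTS against the STC, with asynchronization processing as the fallback. The following sketch summarizes that decision; the function name, the returned labels, and the example threshold values are illustrative assumptions.

```python
def sync_decision(pts: float, stc: float, t1: float, t2: float, t3: float) -> str:
    """(Expression 11) |PTS - STC| < t1 : in sync, so decode and output.
    (Expression 12) 0 < STC - PTS < t2  : the frame is late, so discard it.
    (Expression 13) 0 < PTS - STC < t3  : the frame is early, so suspend and decode
                                          after the difference has elapsed.
    Otherwise a PTS error or a discontinuity is suspected (asynchronization processing).
    """
    if abs(pts - stc) < t1:
        return "decode"
    if 0 < stc - pts < t2:
        return "discard"
    if 0 < pts - stc < t3:
        return "suspend_then_decode"
    return "asynchronization"


# Example with t1 = 2, t2 = 5, t3 = 10 (arbitrary illustrative values).
for pts in (100.5, 96.0, 107.0, 500.0):
    print(pts, sync_decision(pts, stc=100.0, t1=2.0, t2=5.0, t3=10.0))
```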
If the (Expression 13) is not satisfied (Step S1611: NO), there is a possibility that a PTS error or a discontinuity has occurred. The flow proceeds to Step S1614 for asynchronization processing.
The following is performed in the asynchronization processing.
The synchronization control unit 111 judges whether the following condition expression (Expression 14) relating to the asynchronization counter is satisfied (Step S1614).
value of asynchronization counter<M (Expression 14)
Here, the asynchronization counter counts the number of frames on which asynchronization processing has been performed in a case where none of the judgments performed in Steps S1603, S1609, and S1611 is satisfied. Initially, or if the (Expression 11) in Step S1603 is satisfied, that is, if the synchronization control unit 111 judges that the STC is synchronous with the PTS, the synchronization control unit 111 clears the asynchronization counter to “0”.
In the (Expression 14), a threshold value M is a fixed value that can be determined for each system or data.
If the (Expression 14) is satisfied (Step S1614: YES), the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1615). In accordance with the decoding permission signal, the video decoding unit 104 performs decoding processing and output processing of the video frame.
Then, the synchronization control unit 111 increments the asynchronization counter by 1 (Step S1616).
The synchronization control unit 111 changes the video playback status to an asynchronous status (Step S1629). Then, the synchronization control processing ends.
If the (Expression 14) is not satisfied (Step S1614: NO), the flow proceeds to Step S1617 for forced synchronization processing.
The asynchronization counter is used for the following reason.
If there is a great time difference between the STC and the PTS, it is impossible to judge immediately whether a temporal discontinuity has occurred in one of the audio frames and the video frames or a PTS error has occurred. If the number of continuous frames having such a great time difference is less than a predetermined number of frames (M frames in the above (Expression 14)), the synchronization control unit 111 judges that a PTS error has occurred, and performs asynchronization processing. If the number of continuous frames having such a great time difference is no less than the predetermined number of frames, the synchronization control unit 111 judges that a temporal discontinuity has occurred, and performs forced synchronization processing.
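The role of the asynchronization counter and of the threshold M in (Expression 14) can be shown with a small sketch; the class name, the returned labels, and the example value of M are illustrative assumptions.

```python
class AsyncCounter:
    """Counts consecutive frames whose PTS is far from the STC.

    Fewer than M such frames in a row are treated as PTS errors (asynchronization
    processing: decode anyway and count up). M or more in a row are treated as a
    temporal discontinuity (forced synchronization processing).
    """

    def __init__(self, m: int):
        self.m = m
        self.count = 0

    def on_in_sync(self) -> None:
        self.count = 0                      # Step S1608: clear when (Expression 11) holds

    def on_out_of_sync(self) -> str:
        if self.count < self.m:             # (Expression 14)
            self.count += 1                 # Step S1616
            return "asynchronization"       # assume a PTS error and keep decoding
        return "forced_synchronization"     # assume a temporal discontinuity


counter = AsyncCounter(m=5)
print([counter.on_out_of_sync() for _ in range(6)])
# five times 'asynchronization', then 'forced_synchronization'
```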
The following is performed in the forced synchronization processing.
The synchronization control unit 111 changes the video playback status to a forced synchronous status (Step S1617).
The synchronization control unit 111 detects a current master mode (Step S1618).
The synchronization control unit 111 judges whether the current master mode is the video master mode (Step S1619).
If the current master mode is the video master mode (Step S1619: YES), the synchronization control unit 111 transmits the PTS to the reference clock unit 112 (Step S1620). The reference clock unit 112 corrects the reference clock based on the transmitted PTS. That is, the forced synchronization processing means correction of the reference clock.
The synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1621). Then, the processing ends.
If the current master mode is not the video master mode (Step S1619: NO), the flow proceeds to Step S1622.
The synchronization control unit 111 detects the playback status of a stream corresponding to the current master (Step S1622).
The synchronization control unit 111 judges whether the playback status of the current master is not a synchronous status (Step S1623).
If the playback status of a stream corresponding to the current master is not a synchronous status (Step S1623: YES), the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1624). Then, the processing ends.
Here, the judgment of the playback status of a stream corresponding to the current master is performed for the following reason.
If the playback status of the stream corresponding to the current master is not a synchronous status but an asynchronous status or a forced synchronous status, the master stream itself is not synchronous. In such a case, the reference clock is not corrected based on a PTS relating to the current master mode. Accordingly, it is necessary to wait until the playback status of the stream corresponding to the current master has changed to a synchronous status.
If the playback status of a stream corresponding to the current master is a synchronous status (Step S1623: NO), the synchronization control unit 111 judges whether the following condition expression (Expression 15) is satisfied (Step S1625).
PTS<STC (Expression 15)
If the (Expression 15) is satisfied, that is, if the STC is greater than the PTS (Step S1625: YES), the frame to be decoded is late relative to its playback time. The synchronization control unit 111 transmits a frame discard signal to the video decoding unit 104 (Step S1626). Then, the processing ends.
If the (Expression 15) is not satisfied, that is, if the PTS is greater than the STC (Step S1625: NO), the synchronization control unit 111 transmits a decoding suspension signal to the video decoding unit 104 (Step S1627).
After a lapse of a difference time period between the PTS and the STC, the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1628). Then, the processing ends.
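The forced synchronization processing of Steps S1617 to S1628 branches on the master mode and on the playback status of the master stream. A condensed sketch, with an assumed function name and assumed labels, written here for the video stream:

```python
def forced_sync(pts: float, stc: float, is_video_master: bool, master_in_sync: bool) -> str:
    """Condensed view of Steps S1617 to S1628.

    In the video master mode, the reference clock is corrected with this PTS and the
    frame is decoded. Otherwise, if the master stream is not yet synchronous, the frame
    is simply decoded while waiting for the master to settle. If the master is
    synchronous, (Expression 15) PTS < STC decides between discarding the late frame
    and suspending until the early frame's playback time arrives.
    """
    if is_video_master:
        return "correct_clock_and_decode"         # Steps S1620 and S1621
    if not master_in_sync:
        return "decode"                           # Step S1624
    if pts < stc:                                 # (Expression 15)
        return "discard"                          # Step S1626
    return "suspend_then_decode"                  # Steps S1627 and S1628
```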
The following describes a specific example of the operations shown in the above flow chart.
During playback in which video and audio are in a synchronous status, a discontinuity occurs in the video frame 1802. At this time, a difference between the video frame 1802 and the reference clock is 78, and is greater than 10 which is the threshold value t3 in the (Expression 13). Also, the value of the asynchronization counter is “0”. Therefore, video asynchronization processing is performed.
Then, a discontinuity occurs in the audio frame 1804. Although the current master mode is the audio master mode, synchronization control processing is performed, like the case of video.
A difference between the audio frame 1804 and the reference clock is 66, and is greater than 10 which is the threshold value t3 in the (Expression 13). Also, the value of the asynchronization counter is “0”. Therefore, audio asynchronization processing is performed.
While the audio asynchronization processing is performed, the reference clock is not corrected. Accordingly, the reference clock increases in an asynchronous status.
Then, since the value of the asynchronization counter at an audio frame 1805 is “3”, the (Expression 14) is not satisfied. Accordingly, forced synchronization processing is performed. Since the current master mode is the audio master mode, the reference clock is corrected. As a result, the reference clock is 126.
Then, synchronization processing of a video frame 1806 is performed. As a result, the audio playback status is changed to the synchronous status, and the reference clock is corrected. However, a difference between the video frame 1806 and the reference clock is still 12, and is greater than 10 which is the threshold value t3 in the (Expression 13). Accordingly, the asynchronization processing continues to be performed.
The value of the asynchronization counter at a video frame 1807 is “5”, and the (Expression 14) is not satisfied. Accordingly, forced synchronization processing is performed. A difference between the PTS of the video frame 1807 and the STC is 12. Accordingly, decoding processing of the video frame 1807 is suspended, and a lapse of a time period corresponding to 12 time periods per frame is waited for. While waiting for the lapse of the time period, the video frame 1806 is displayed. After the lapse of the time period corresponding to 12 time periods per frame, the video frame 1807 is decoded, and the decoded video frame 1807 is displayed. Then, the video is played back in synchronization with the reference clock.
As described above, with the structure of the synchronous playback device according to the fourth embodiment of the present invention, even if a temporal discontinuity occurs in either one or both of an audio stream and a video stream included in a multiplexed stream recorded under a bad radio wave condition, it is possible to continue playback of the multiplexed stream. Furthermore, it is possible to play back audio frames and video frames in synchronization with each other after passing through the point at which the discontinuity has occurred.
Moreover, by providing the asynchronization counter, whether to perform asynchronization processing or forced synchronization processing can be judged based on the threshold value. Accordingly, even if a PTS error due to a transmission error, for example, makes it impossible to achieve synchronization, the synchronous playback device does not judge that a discontinuity has occurred, but performs asynchronization processing. As a result, it is possible to continuously play back the audio and the video without interruption.
The following describes a fifth embodiment of the present invention.
A synchronous playback device according to the fifth embodiment has the same structure and the same basic operations as those of the synchronous playback device according to the fourth embodiment. However, the fifth embodiment differs from the fourth embodiment in the following point. In the fifth embodiment, synchronization control processing performed during playback includes modified forced synchronization processing. The following mainly describes the different operations between the fourth embodiment and the fifth embodiment.
The following describes the operations of the synchronous playback device according to the fifth embodiment.
Although synchronization control of decoding processing of encoded video data is described here, it is of course possible to apply this synchronization control to decoding processing of encoded audio data. In such a case, portions represented as “video” and portions represented as “audio” in the following description are interchanged.
The descriptions of the operations in Steps S1701 to S1716 are omitted because they are the same as those of Steps S1601 to S1616 in the fourth embodiment.
The following describes the forced synchronization processing starting from Step S1717 in detail.
Since video is judged to be asynchronous, the synchronization control unit 111 changes the video playback status to an asynchronous status (Step S1717).
The synchronization control unit 111 detects a current master mode (Step S1718).
The synchronization control unit 111 judges whether the current master mode is the video master mode (Step S1719).
If the current master mode is the video master mode (Step S1719: YES), the synchronization control unit 111 transmits a PTS to the reference clock unit 112 (Step S1720). The reference clock unit 112 corrects the reference clock based on the transmitted PTS.
Then, the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1721). Then, the processing ends.
If the current master mode is not the video master mode (Step S1719: NO), the flow proceeds to Step S1722.
The synchronization control unit 111 detects the playback status of a stream corresponding to the current master (Step S1722).
The synchronization control unit 111 judges whether the playback status of a stream corresponding to the current master is not the synchronous status (Step S1723).
If the playback status of a stream corresponding to the current master is not the synchronous status (Step S1723: YES), the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1724). Then, the processing ends.
If the playback status of the stream corresponding to the current master is the synchronous status (Step S1723: NO), the flow proceeds to Step S1725.
The synchronization control unit 111 judges whether the following condition expression (Expression 16) is satisfied (Step S1725).
PTS−STC<t4 (Expression 16)
Here, a threshold value t4 is a fixed value that can be determined for each system or data.
If the (Expression 16) is satisfied (Step S1725: YES), the flow proceeds to Step S1726.
The synchronization control unit 111 judges whether the following condition expression (Expression 17) is satisfied (Step S1726).
PTS<STC (Expression 17)
If the (Expression 17) is satisfied, that is, if the STC is greater than the PTS (Step S1726: YES), the frame to be decoded is late relative to its playback time. Accordingly, the synchronization control unit 111 transmits a frame discard signal to the video decoding unit 104 (Step S1727). Then, the processing ends.
If the (Expression 17) is not satisfied, that is, if the PTS is greater than the STC (Step S1726: NO), the synchronization control unit 111 transmits a decoding suspension signal to the video decoding unit 104 (Step S1728).
After a lapse of a time period corresponding to a difference between the PTS and the STC, the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1729). Then, the processing ends.
If the (Expression 16) is not satisfied (Step S1725: NO), the synchronization control unit 111 clears the asynchronization counter to “0” (Step S1730).
The synchronization control unit 111 transmits a decoding suspension signal to the video decoding unit 104 (Step S1731).
After a lapse of a time period corresponding to t5, the synchronization control unit 111 transmits a decoding permission signal to the video decoding unit 104 (Step S1732).
In the fifth embodiment, if the STC is greater than the PTS in the (Expression 17), that is, if the frame to be decoded is late, the synchronization control unit 111 instructs the video decoding unit 104 to discard data corresponding to one frame. However, the synchronization control unit 111 may instruct the video decoding unit 104 to discard data corresponding to a plurality of frames. Furthermore, instead of discarding data corresponding to the difference between the PTS and the STC, data corresponding to t5 may be discarded.
Moreover, in this forced synchronization processing, even if a difference between the PTS and the STC is no less than t4, the range over which synchronization control is performed in one round is limited to data corresponding to t5. This is because of the following reason.
If the (Expression 14) “value of asynchronization counter<M” is not satisfied (Step S1714: NO), there is a case where a PTS error has occurred due to a transmission error, or a temporal discontinuity in an audio stream or a video stream repeatedly occurs within a short time period. In such a case, if the processing in Step S1728 is performed, the difference between a PTS and the STC might be an enormous value. This might cause waiting for a long time period, or discarding of data corresponding to a long time period. In order to avoid such a situation, the range over which synchronization control is performed is limited to data corresponding to t5 even if the difference between the PTS and the STC is no less than t4.
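The modified branch of Steps S1725 to S1732 can be sketched as follows; the function name and the returned labels are illustrative assumptions, and the numerical values in the usage example match those of the specific example described below.

```python
def forced_sync_capped(pts: float, stc: float, t4: float, t5: float):
    """Modified forced synchronization of the fifth embodiment, reached when the
    current master is another stream and that master is synchronous.

    If the PTS is within t4 of the STC, behave as before: discard a late frame
    ((Expression 17) PTS < STC) or suspend for the PTS-STC difference. If the gap is
    t4 or more, clear the asynchronization counter and suspend for only the fixed
    period t5, so that one round never waits for or discards an enormous span.
    """
    if pts - stc < t4:                            # (Expression 16)
        if pts < stc:                             # (Expression 17)
            return ("discard", 0.0)               # Step S1727
        return ("suspend_then_decode", pts - stc) # Steps S1728 and S1729
    return ("clear_counter_and_suspend", t5)      # Steps S1730 to S1732


# Example: PTS 425, STC 133, t4 = 20, t5 = 10.
print(forced_sync_capped(425.0, 133.0, t4=20.0, t5=10.0))  # ('clear_counter_and_suspend', 10.0)
```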
The following describes a specific example of the operations shown in the above flow chart.
The threshold value t3 in the (Expression 13) is 10 for video and audio. The threshold value M of the asynchronization counter in the (Expression 14) is 5 frames for video, and is 3 frames for audio. Also, the threshold value t4 in the (Expression 16) is 20 for audio and video, and the fixed waiting time period t5 is 10.
During playback in which video and audio are in a synchronous status, a discontinuity occurs in the video frame 1902. At this time, a difference between the video frame 1902 and the reference clock is 78, and is greater than 10 which is the threshold value t3 in the (Expression 13). Also, the value of the asynchronization counter is “0”. Therefore, video asynchronization processing is performed.
Then, a discontinuity occurs in the audio frame 1904. Although the current master mode is the audio master mode, synchronization control processing is performed.
A difference between the audio frame 1904 and the reference clock is 66, and is greater than 10 which is the threshold value t3 in the (Expression 13). Also, the value of the asynchronization counter is “0”. Therefore, audio asynchronization processing is performed.
While the audio asynchronization processing is performed, the reference clock is not corrected. Accordingly, the reference clock is in an asynchronous status.
Then, since the value of the asynchronization counter at an audio frame 1905 is “3”, the (Expression 14) is not satisfied. Accordingly, forced synchronization processing is performed. Since the current master mode is the audio master mode, the reference clock is corrected. As a result, the reference clock is 126.
Then, a discontinuity occurs again in the video frame 1906. The value of the asynchronization counter is “3”, and therefore asynchronization processing is continued.
The value of the asynchronization counter at the video frame 1907 is “5”, and the (Expression 14) is not satisfied. Accordingly, forced synchronization processing is performed. At this time, a PTS of the video frame 1907 is 425, and the STC is 133. A difference between the PTS and the STC is 292, and is greater than 20 which is the threshold value t4 in the (Expression 16). Accordingly, the waiting time period t5 is set to be 10, and decoding processing of the video frame 1907 is suspended, and a lapse of a time period corresponding to 10 time periods per frame is waited for. After the lapse of the time period corresponding to 10 time periods per frame, the video frame 1907 is decoded, and the decoded video frame 1907 is displayed. However, a difference between a PTS of a next video frame 1911 and the STC is 11, and is still greater than 10 which is the threshold value t3 in the (Expression 13). The value of the asynchronization counter is “0”. Asynchronization processing is performed, and the asynchronization counter restarts counting up.
Also, a discontinuity occurs in an audio frame 1909 again. A difference between a PTS of the audio frame 1909 and the STC is 371, and is greater than 10 which is the threshold value t3 in the (Expression 13). Accordingly, asynchronization processing is performed.
The value of the audio asynchronization counter at an audio frame 1910 is “3”, and the (Expression 14) is not satisfied. Accordingly, forced synchronization processing is performed. Since the current master mode is the audio master mode, the reference clock is corrected. As a result, the reference clock is 409.
The value of the video asynchronization counter at a video frame 1912 is “5”, and the (Expression 14) is not satisfied. Accordingly, forced synchronization processing is performed. At this time, a PTS of the video frame 1912 is 450, and the STC is 439. A difference between the PTS and the STC is 11, and is less than 20 which is the threshold value t4 in the (Expression 16). Accordingly, decoding processing of the video frame 1912 is suspended, and a lapse of a time period corresponding to 11 time periods per frame is waited for. After the lapse of the time period corresponding to 11 time periods per frame, the video frame 1912 is decoded, and the decoded video frame 1912 is displayed. Then, the video is played back in synchronization with the reference clock.
As described above, with the structure of the synchronous playback device according to the fifth embodiment of the present invention, even if an error occurs in a stream due to a transmission error, or a temporal discontinuity repeatedly occurs in either one or both of an audio stream and a video stream, or such an error and a discontinuity occur simultaneously, it is possible to continue playback without suspending playback at the point at which the error or discontinuity has occurred, and to restore playback of audio frames and video frames in synchronization with each other within a short time period.
Furthermore, the forced synchronization processing is divided into a plurality of rounds. Accordingly, if forced synchronization processing needs to be performed over a long time period, it is possible to shorten the time period for which the same frame continues to be displayed, and to reduce the user's discomfort in viewing the video while the forced synchronization processing is performed.
The following describes a sixth embodiment of the present invention.
A synchronous playback device according to the sixth embodiment has the same structure and the same basic operations as those of the synchronous playback device according to the fifth embodiment. However, the sixth embodiment differs from the fifth embodiment in the following point. In the sixth embodiment, synchronization control processing performed during playback includes modified asynchronization processing. The following mainly describes the different operations between the fifth embodiment and the sixth embodiment.
The following describes the operations of the synchronous playback device according to the sixth embodiment.
Although synchronization control of decoding processing of encoded video data is described here, it is of course possible to apply this synchronization control to decoding processing of encoded audio data. In such a case, portions represented as “video” and portions represented as “audio” in the following description are interchanged.
The descriptions of the operations in Steps S2001 to S2033 are omitted because they are the same as those of Steps S1701 to S1733 in the fifth embodiment.
The following describes the asynchronization processing from Step S2034 to Step S2039 in detail.
The synchronization control unit 111 judges whether the following condition expression (Expression 18) is satisfied (Step S2034).
value of asynchronization counter>N (Expression 18)
Here, a threshold value N is a fixed value that can be determined for each system or data, and is less than the threshold value M in the (Expression 14).
If the (Expression 18) is not satisfied (Step S2034: NO), the flow proceeds to Step S2014. Then, the synchronization control unit 111 performs processing similar to those in the fifth embodiment.
If the (Expression 18) is satisfied (Step S2034: YES), the flow proceeds to Step S2035.
The synchronization control unit 111 detects the current master mode (Step S2035).
If the current master mode is the video master mode (Step S2036: YES), the flow proceeds to Step S2037. If the current master mode is not the video master mode (Step S2036: NO), the flow proceeds to Step S2015.
The synchronization control unit 111 detects the audio playback status (Step S2037).
If the audio playback status is not the synchronous status (Step S2038: NO), the flow proceeds to Step S2015. If the audio playback status is the synchronous status (Step S2038: YES), the flow proceeds to Step S2039.
The synchronization control unit 111 switches the current master mode from the video master mode to the audio master mode (Step S2039).
If the value of the asynchronization counter exceeds the threshold value N, the master mode switching control is performed for the following reason.
Suppose that a discontinuity has occurred in a master stream (an audio stream, for example) included in a recorded stream, and no discontinuity has occurred in the other stream included in the recorded stream. In such a case, if forced synchronization processing is performed after the occurrence of the discontinuity in the master stream, the reference clock needs to be corrected. In other words, this processing causes occurrence of a discontinuity in the reference clock. As a result, the playback device operates such that the stream that is not the master stream (a video stream in a case where the master stream is the audio stream) is made synchronous with the reference clock. Accordingly, although no discontinuity has occurred in that stream, continuous playback is not performed due to frame discard or waiting caused by the synchronization control processing. In this way, if a discontinuity has occurred in one stream, especially in the stream designated as the master stream, the master mode switching control is performed so as to smoothly play back the other stream in which no discontinuity has occurred.
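The master mode switching control added in the sixth embodiment can be sketched as follows; the function name and the returned labels are illustrative assumptions.

```python
def maybe_switch_master(async_count: int, n: int, is_video_master: bool,
                        audio_in_sync: bool) -> str:
    """Steps S2034 to S2039, written here for the video synchronization control.

    While video asynchronization processing is running, once the asynchronization
    counter exceeds N ((Expression 18)) and video is still the master while audio is
    synchronous, the master role is handed to audio so that the undamaged stream
    keeps driving the reference clock.
    """
    if async_count <= n:                          # (Expression 18) not satisfied
        return "proceed_as_before"                # Step S2014: (Expression 14) judgment follows
    if is_video_master and audio_in_sync:
        return "switch_master_to_audio"           # Step S2039
    return "decode"                               # Step S2015: decoding permission signal
```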
The following describes a specific example of the operations shown in the above flow chart.
The threshold value t3 in the (Expression 13) is 10 for video and audio. The threshold value M of the asynchronization counter in the (Expression 14) is 4 frames for audio. Also, the threshold value N of the master switching in the (Expression 18) is 2 frames for audio.
During playback in which video and audio are in a synchronous status and the current master mode is the audio master mode, a discontinuity occurs in the audio frame 2102. At this time, a difference between the audio frame 2102 and the reference clock is 18, which is greater than 10, the threshold value t3 in the (Expression 13), and the value of the asynchronization counter is “0”. Accordingly, asynchronization processing is performed.
Then, the value of the asynchronization counter in an audio frame 2103 is 3, and the (Expression 18) is satisfied. Accordingly, master switching control is performed to switch the current master from the audio master to the video master. Since a video frame 2106 is synchronous with the reference clock, the video frame 2106 is output in the synchronous status, and the reference clock is corrected.
On the other hand, a difference between an audio frame 2104 and the reference clock is 18, and accordingly the asynchronization processing continues to be performed. Then, the (Expression 14) is not satisfied in an audio frame 2105. Accordingly, forced synchronization processing is performed.
A difference between the audio frame 2105 and the reference clock is 18. Accordingly, decoding processing of the audio frame 2105 is suspended, and a lapse of a time period corresponding to 18 time periods per frame is waited for. While waiting for the lapse of the time period, no audio is output. After the lapse of the time period corresponding to 18 time periods per frame, the audio frame 2105 is decoded, and the decoded audio frame 2105 is output. The audio is in the synchronous status again at an audio frame 2107. Accordingly, master switching control is performed so as to switch the master from the video master back to the audio master and restore the normal status. At this time, video frames are played back continuously with no lost frame.
As described above, with the structure of the synchronous playback device according to the sixth embodiment, if a discontinuity has occurred in either one of an audio stream and a video stream included in a multiplexed stream recorded under a bad radio wave condition, especially if the discontinuity has occurred in the stream that is the master stream, it is possible to perform continuous playback of the other stream that is not the master stream and in which no discontinuity has occurred.
A seventh embodiment describes a portable telephone according to the present invention. A portable telephone 2000 according to the seventh embodiment includes the synchronous playback device 100 according to the first embodiment.
The communication wireless unit 2401 and the baseband unit 2402 perform signal processing for wireless telephone communications.
The television wireless unit 2403 receives digital television broadcast waves.
The television tuner 2405 performs signal processing on digital television broadcast waves received by the television wireless unit 2403.
The application processing unit 2406 includes a communication unit 2413, a synchronous playback device 2414, a main control unit 2415, and a storage unit 2416.
The communication unit 2413 is an interface for performing communications between the application processing unit 2406 and external devices.
The synchronous playback device 2414 has the same functions as the synchronous playback device 100 according to the first embodiment.
The main control unit 2415 is a CPU (Central Processing Unit) for controlling each of the units included in the portable telephone 2000.
The storage unit 2416 is a memory for storing data, and specifically storing digital television broadcast data that has been received by the television wireless unit 2403 and on which signal processing has been performed by the television tuner 2405.
The input/output unit 2407 includes a camera 2408, an LCD (Liquid Crystal Display) 2409, a microphone 2410, a speaker 2411, and a key input unit 2412.
The camera 2408 inputs photographed data.
The LCD 2409 outputs images and videos, and specifically outputs videos to be played back by the synchronous playback device 2414.
The microphone 2410 inputs sounds such as a user's voice for wireless telephone communications.
The speaker 2411 outputs sounds such as sounds to be played back by the synchronous playback device 2414 and a communication partner's voice for wireless telephone communication.
The key input unit 2412 receives a user's input operation.
The following briefly describes the operations of the portable telephone 2000.
The following describes operations for viewing, recording, and playing back digital television broadcasts using the synchronous playback device according to the present invention.
For viewing digital television broadcasts, the television wireless unit 2403 receives a channel selected by the television tuner 2405, and transmits a multiplexed stream (MPEG2-TS stream) to the application processing unit 2406. The application processing unit 2406 transmits the received multiplexed stream to the synchronous playback device 2414. The received multiplexed stream is transmitted to the transmission unit 101 included in the synchronous playback device 2414.
Also, for recording digital television broadcasts, the storage unit 2416 stores therein a multiplexed stream transmitted from the television tuner 2405. Here, the storage unit 2416 may be a memory included in the portable telephone 2000 or a memory removable from the portable telephone 2000.
Furthermore, for playing back the stored multiplexed stream, in accordance with a user's key operation input via the key input unit 2412, the storage unit 2416 transmits the multiplexed stream to the synchronous playback device 2414. In the synchronous playback device 2414, the multiplexed stream is separated into audio data and video data. The separated audio data and video data are decoded so as to be played back.
Note that an example in which the portable telephone 2000 includes the synchronous playback device 100 according to the first embodiment is described here. Instead of this, the portable telephone 2000 may of course include any of the synchronous playback devices according to the second to the sixth embodiments.
Furthermore, an example in which the synchronous playback device of the present invention is applied to a portable telephone is described here. Instead of this, the synchronous playback device is applicable to audio devices and various types of video playback devices.
The synchronous playback device of the present invention is broadly applicable to playback devices for playing back encoded videos. The synchronous playback device is useful in that it is possible to start playback from the most appropriate frame included in a stream in which a frame has been lost due to recording of the stream under a bad radio wave condition, and continue the playback without interruption of video signals and audio signals.
Number | Date | Country | Kind |
---|---|---|---
2006-049672 | Feb 2006 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2007/053512 | 2/26/2007 | WO | 00 | 8/1/2008 |