The present invention relates to the field of digital video playback. More particularly, the invention relates to a method for playing a video stream in reverse, smoothly, without skipping any frames.
Due to the nature of the digital video encoding of the various standards, providing smooth reverse play of a video stream in real time is often difficult.
As shown in
A GOP (Group Of Pictures) is a group of frames that starts with an I frame (in the encoded order). A GOP can typically be accessed independently. The sequence of frames in the encoded stream is such that the encoded reference frames are always placed ahead of the encoded frames that use the reference frames. For example, if the sequence of frames in the encoded stream is IPBB (where the P frame uses the I frame as reference and both B frames use the I frame and the P frame as references), the forward display sequence will be IBBP. This type of data encoding may generally be referred to as temporal compression since this compression exploits the temporal redundancies in addition to spatial redundancies in the data. However, such a compression scheme requires that the data be decoded in the same order it is encoded. Thus, if a user wishes to see the frames displayed in reverse order, so as to back up to a particular section, the process becomes much more difficult.
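The relationship between encoded order and display order in the IPBB example can be illustrated with a short sketch. It assumes the simple reordering model in which B frames are displayed as soon as they are decoded, while each I or P frame is held back until the next I or P frame arrives. The function name `display_order` and the frame labels are hypothetical and serve only this illustration.

```python
def display_order(encoded):
    """Map frame types in encoded order (e.g. "IPBB") to display order.

    Assumed model: a B frame is shown immediately; an I or P frame is
    held until the next I or P frame has been decoded.
    """
    out = []
    held = None  # most recent I/P frame that has not been displayed yet
    for index, frame_type in enumerate(encoded):
        label = f"{frame_type}{index}"  # type plus position in encoded order
        if frame_type == "B":
            out.append(label)
        else:
            if held is not None:
                out.append(held)
            held = label
    if held is not None:
        out.append(held)
    return out


forward = display_order("IPBB")
reverse = list(reversed(forward))
print(forward)  # ['I0', 'B2', 'B3', 'P1']
print(reverse)  # ['P1', 'B3', 'B2', 'I0']
```

The reverse list shows the frame that must be displayed first in reverse play (here the P frame), which the decoder can only produce after decoding the I frame. This is why the frames of a GOP are decoded in forward order before any of them are shown in reverse.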
Most consumer DVD players provide only limited frame display in reverse order. Since an I frame is the only type that contains all data for a complete image, without reference to data from other frames, most consumer DVD players, when set to reverse mode, will play only successive I frames in reverse order. As a result, the consumer sees a stilted “stop action” type of image, rather than a smooth reverse image. Since the number of I frames is only a fraction of the overall number of frames, reverse playback on most machines is usually at a rate of X4, X8, X16 or X32 of normal speed, making it difficult for a user to stop at a particular part of a video program.
A smooth playback of the video stream in reverse order is desirable for a number of reasons. Such a feature would allow the consumer to reverse play the images to a particular frame or section and would also provide a better reverse display, which is less disorienting than the flashing “stop action” types of displays presently used. In addition, the ability to play video in a reverse and forward direction, smoothly, better emulates the actions of tape players and other video equipment, making the use of digitally encoded data more acceptable to professional video users as a data feed source as opposed to decoded data. The ability to play images smoothly in reverse also allows a user to better edit video on a frame-by-frame basis.
The term “smooth” refers to the ability to play back all the images of a video stream in reverse display (play speed, high speed, or slow motion). Digitally encoded video does not lend itself naturally to this feature since digital video encoding exploits temporal redundancies in the forward direction, thereby constraining the order in which images can be decoded. Hence, images within a GOP have to be decoded in the order in which the data is encoded in order to produce a stream of video data.
A closed GOP is a GOP in which all frames use reference frames only from within the GOP. In an open GOP, frames may have reference frames from other GOPs. For example, the open GOP may comprise a B frame that requires a P reference frame from the previous GOP. Open GOPs do not require any additional buffering of the stream data during normal forward-direction play, since the last reference frame of a GOP would have been decoded just before starting to decode the first frame of the subsequent GOP, thereby making the reference frame readily available to decode the subsequent open GOP frames. However, when executing a reverse smooth play operation, the open GOP frames cannot be decoded until the decoder has decoded the reference frames from the subsequent GOP (i.e. the previous GOP in a forward presentation direction).
In the H.264 standard, frames of an open GOP can branch even further by using references from non-adjacent GOPs. Frames within an open GOP in the H.264 standard may use reference frames as far as 16 reference frames away, previous or subsequent. This feature complicates the task of reverse play even further, as the frames may require references from subsequent GOPs (i.e. previous GOPs in a forward presentation direction) which are relatively distant.
U.S. Pat. No. 7,333,714 discloses a method and system to efficiently process MPEG video in order to perform a reverse play. The disclosed method maximizes the use of memory resources when video frame buffers are implemented. The disclosed system comprises a first subsystem feeding a sequence of frames to a second subsystem. The first subsystem defines a set of parameters that is used to determine the one or more feeding sessions provided to the second subsystem. The second subsystem subsequently decodes the one or more feeding sessions using the set of parameters such that the video may be displayed. Nevertheless, the disclosed method deals with video streams containing closed GOPs only.
US 2006/0008248 discloses a method for smooth reverse play in an MPEG-type stream player, while reducing the buffering requirements. A buffering strategy is disclosed for reducing the required number of passes through the video data unit by optimal scheduling of picture decodes. Nevertheless, the described method does not deal with an H.264-type video stream.
It is an object of the present invention to provide a method for playing a video stream, comprising any number of GOPs, in reverse smoothly without skipping any frames.
It is another object of the present invention to provide a method for playing a video stream, comprising closed or open GOPs, in reverse smoothly without skipping any frames.
It is still another object of the present invention to provide a method for playing a video stream, encoded according to the H.264 standard, in reverse smoothly without skipping any frames.
It is still another object of the present invention to provide a method for error resilience while playing a video stream in reverse.
Other objects and advantages of the invention will become apparent as the description proceeds.
The present invention relates to a method for displaying a video stream in reverse smoothly comprising the steps of: (a) determining at least one GOP for reverse display; (b) selecting all the frames of said GOP, a subsequent I frame and if present, the B frames positioned between said subsequent I frame and the next I or P frame, in the encoded order, into a selected group; (c) if present, discarding from said selected group the B frames that are positioned between the primary I frame of said GOP and the next I or P frame, in the encoded order; (d) decoding and storing the remaining frames of said selected group; and (e) loading for display at least one of said decoded frames in a reverse display order.
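Steps (b) and (c) above can be sketched as a selection over frame types listed in encoded order. This is a minimal illustration under stated assumptions, not the claimed implementation: frames are represented only by their type letters, the function name `select_group` is hypothetical, and decoding, storing and display (steps (d) and (e)) are left out.

```python
def select_group(types, gop_start):
    """Return encoded-order indices of the frames to decode for one GOP.

    types     -- list of 'I', 'P' or 'B' in encoded order (assumed input)
    gop_start -- index of the primary I frame of the GOP to reverse-play
    """
    if types[gop_start] != "I":
        raise ValueError("a GOP must start with an I frame")
    n = len(types)

    # Locate the subsequent I frame, i.e. the start of the next GOP.
    next_i = gop_start + 1
    while next_i < n and types[next_i] != "I":
        next_i += 1

    # Step (b): all frames of the GOP ...
    selected = list(range(gop_start, next_i))
    if next_i < n:
        # ... plus the subsequent I frame and the B frames that follow it
        # up to the next I or P frame.
        selected.append(next_i)
        j = next_i + 1
        while j < n and types[j] == "B":
            selected.append(j)
            j += 1

    # Step (c): discard the B frames between the primary I frame and the
    # next I or P frame.
    discarded = set()
    k = gop_start + 1
    while k < next_i and types[k] == "B":
        discarded.add(k)
        k += 1

    return [i for i in selected if i not in discarded]


stream = list("IBBPBBIBBPBB")
print(select_group(stream, 0))  # [0, 3, 4, 5, 6, 7, 8]
```

In the example, frames 1 and 2 are dropped because they follow the primary I frame directly in the encoded order, while frames 6, 7 and 8 (the subsequent I frame and its trailing B frames) are added to the group.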
In one of the embodiments, all the remaining decoded frames are loaded for display.
In one of the embodiments, the remaining decoded frames are loaded for display excluding the primary I frame.
In one of the embodiments, the remaining decoded frames are loaded for display excluding the subsequent I frame.
Preferably, the storing of the frames is done in a buffer that is capable of storing more than 2 average GOPs of decoded frames.
Preferably, part of the buffer is used for loading one GOP while another part of said buffer is used for storing another GOP.
Preferably, the temporally closest decoded frame that precedes the subsequent I frame is used for error resilience in case said subsequent I frame is corrupt.
In one of the embodiments, the video stream is encoded according to any one of the following standards: MPEG-1, MPEG-2 or MPEG-4.
In one of the embodiments, the video stream is encoded according to the H.264 standard.
Preferably, a reference frame of a frame of said selected group, which reference frame is not itself part of the selected group, is substituted by the temporally closest decoded frame of said selected group.
In the drawings:
All referrals hereinafter to “reference” frames are meant to include I type frames, P type frames, or B-reference type frames.
As described, the decoder of the system is oblivious to the reverse display as it is fed in the forward encoded order. In one of the embodiments, the fetching and feeding of frames to the decoder and the organizing of the frames later for reverse display is done in software, while the decoder itself is implemented in hardware, effectively lowering system production costs.
When a request is received for a reverse display of a video data stream longer than a GOP, the system can continue fetching and decoding any number of GOPs by repeating steps 1-5 described in relation to
In one of the embodiments, only some of the decoded frames are eventually displayed. For example, if a request is received for a reverse display of a video data stream shorter than a GOP, the system decodes the GOP as described in steps 1-5 in relation to
In some of the cases, some of the frames may be corrupted and some of their data may be lost, for example, during transmission and reception. Many methods can be used for repairing the integrity of the corrupted frames; however, a simple error resilience method calls for the use of data from other frames for repairing the corrupted frames. For example, if one of the frames has a corrupt block, a corresponding block from a previous frame may be copied and inserted instead of the corrupt block in the frame. The repairing block may be copied from a previous frame, a subsequent frame or any other decoded frame. However, if the corruption occurs in an I frame, finding a corresponding block for repairing may not be so easy. In one of the embodiments, the described method of the invention may be used for error resilience of I frames. In the embodiment where the primary I frame of the GOP is decoded but not displayed, and the I frame of the subsequent GOP is decoded and displayed, if the subsequent I frame is corrupt, the closest decoded frame that precedes this subsequent I frame can be used for error resilience.
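The block-copy repair mentioned above can be sketched as follows. The sketch assumes that decoded frames are held as two-dimensional lists of pixel values and that the position and size of the corrupt block are already known; the function name `conceal_block` and its parameters are hypothetical. The donor may be any decoded frame, for example the closest decoded frame that precedes a corrupt subsequent I frame.

```python
def conceal_block(frame, donor, top, left, size):
    """Return a copy of `frame` whose size-by-size block at (top, left)
    is replaced by the co-located block of `donor`.

    Both frames are assumed to be lists of rows with equal dimensions.
    """
    repaired = [row[:] for row in frame]  # leave the corrupt frame intact
    for r in range(top, top + size):
        repaired[r][left:left + size] = donor[r][left:left + size]
    return repaired


corrupt = [[0] * 4 for _ in range(4)]   # frame with a damaged region
previous = [[7] * 4 for _ in range(4)]  # donor: a decoded neighbouring frame
fixed = conceal_block(corrupt, previous, top=1, left=1, size=2)
for row in fixed:
    print(row)
```

Only the 2x2 block starting at row 1, column 1 is taken from the donor; the rest of the frame is unchanged.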
In the H.264 standard, frames of an open GOP can use reference frames as far as 16 reference frames away, previous or subsequent. Therefore, in one of the embodiments, if a B or P frame requires a reference frame that is not part of the selected group, the temporally closest decoded frame to the required reference, within the selected group, shall be used. For example, if a required reference frame is in the previous GOP (in a forward encoded direction) the first frame of the present GOP shall be used as reference instead.
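The substitution rule can be sketched as a nearest-neighbour lookup over temporal positions. The sketch assumes that each frame is identified by an integer position in display order and that the positions of the decoded frames of the selected group are known; the function name `substitute_reference` is hypothetical.

```python
def substitute_reference(required_pos, decoded_positions):
    """Return the temporal position of the frame to use as reference.

    required_pos      -- display position of the reference the frame asks for
    decoded_positions -- display positions of decoded frames in the selected
                         group (assumed non-empty)
    """
    if required_pos in decoded_positions:
        return required_pos  # the real reference is available
    # Otherwise fall back to the temporally closest decoded frame.
    # Sorting makes ties resolve to the earlier frame.
    return min(sorted(decoded_positions), key=lambda p: abs(p - required_pos))


group = list(range(10, 21))             # selected group covers positions 10..20
print(substitute_reference(7, group))   # 10
print(substitute_reference(15, group))  # 15
```

A reference at position 7 lies in the previous GOP, so the first frame of the present group (position 10) is used instead, as in the example given above.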
While some embodiments of the invention have been described by way of illustration, it will be apparent that the invention can be carried into practice with many modifications, variations and adaptations, and with the use of numerous equivalents or alternative solutions that are within the scope of persons skilled in the art, without departing from the invention or exceeding the scope of claims.