PREDICTIVE VIDEO CODING EMPLOYING VIRTUAL REFERENCE FRAMES GENERATED BY DIRECT MV PROJECTION (DMVP)

Information

  • Patent Application
  • Publication Number
    20230300341
  • Date Filed
    January 20, 2023
  • Date Published
    September 21, 2023
Abstract
Techniques are disclosed for generating virtual reference frames that may be used for prediction of input video frames. The virtual reference frames may be derived from already-coded reference frames and thereby incur reduced signaling overhead. Moreover, signaling of virtual reference frames may be avoided until an encoder selects a virtual reference frame as a prediction reference for a current frame. In this manner, the techniques proposed herein contribute to improved coding efficiency.
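The direct motion vector projection (DMVP) idea named in the title can be illustrated with a small sketch. Assuming a virtual frame positioned between two reference frames at known temporal distances, a motion vector measured between the references is scaled linearly onto the virtual frame's position; the function and variable names below are illustrative assumptions, not taken from the application:

```python
def project_mv(mv, dist_to_ref0, dist_to_ref1):
    """Scale a motion vector measured from ref0 to ref1 onto a
    virtual frame lying between them (linear-motion assumption).

    mv            -- (dx, dy) displacement from ref0 to ref1
    dist_to_ref0  -- temporal distance from virtual frame to ref0
    dist_to_ref1  -- temporal distance from virtual frame to ref1
    Returns (mv_from_ref0, mv_from_ref1): the projected vectors
    locating virtual-frame content in each reference.
    """
    total = dist_to_ref0 + dist_to_ref1
    # Portion of the motion covered between ref0 and the virtual frame.
    mv0 = (mv[0] * dist_to_ref0 / total, mv[1] * dist_to_ref0 / total)
    # Remaining portion, pointing backward from ref1 toward the virtual frame.
    mv1 = (-mv[0] * dist_to_ref1 / total, -mv[1] * dist_to_ref1 / total)
    return mv0, mv1

# A virtual frame midway between the references splits the motion evenly.
fwd, bwd = project_mv((8.0, -4.0), 1, 1)
```

Because both projected vectors are derived from data already at the decoder, no per-block motion needs to be transmitted for the virtual frame itself, which is the signaling saving the abstract refers to.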
Description
Claims
  • 1. A video coding method, comprising: predictively coding input frames; when a coded input frame is designated as a reference frame, decoding the coded data of the reference frame; storing the decoded reference frame data for use as a prediction reference of subsequently-coded input frames; and generating data of a virtual reference frame from a pair of stored reference frames; wherein the predictive coding of an input frame includes a prediction search from among the reference frame data and virtual reference frame data.
  • 2. The method of claim 1, further comprising: when the prediction search selects a virtual reference frame, outputting to a decoder data representing the virtual reference frame and data representing the input frame predictively coded with reference to the virtual reference frame.
  • 3. The method of claim 1, wherein, when no prediction search selects a virtual reference frame, data representing the virtual reference frame is not output to a decoder.
  • 4. The method of claim 1, wherein a first reference frame of the pair has a temporal position on a first side of the temporal position of the virtual reference frame, and a second reference frame of the pair has a temporal position on a second side of the temporal position of the virtual reference frame.
  • 5. The method of claim 1, wherein content of the virtual reference frame is generated by projection of a motion vector extending from the first reference frame of the pair to the second reference frame of the pair.
  • 6. The method of claim 1, wherein content of the virtual reference frame is generated by projection of motion vectors extending from the first reference frame of the pair across a plurality of reference frames to the second reference frame of the pair.
  • 7. The method of claim 1, wherein, when a current pixel block of an input frame is coded predictively with respect to a virtual reference frame, a motion vector of the pixel block is derived with reference to motion vectors of other pixel blocks neighboring the current pixel block that use the same set of reference frames for prediction as the current pixel block.
  • 8. The method of claim 1, further comprising providing data representing the virtual reference frame to a channel, including a temporal interpolated mode identifier indicating a decoder usage of the virtual reference frame.
  • 9. The method of claim 8, wherein the temporal interpolated mode identifier takes one of the following states: a first state indicating that the decoder shall use the virtual reference frame as a reference frame; a second state indicating that the decoder shall output the virtual reference frame for display; and a third state indicating that the decoder shall output the virtual reference frame for display enhanced by additional information supplied by an encoder.
  • 10. The method of claim 1, further comprising providing data identifying a mode of the predictive coding of the input frame.
  • 11. The method of claim 10, wherein the predictive coding mode information takes one of the following states: a No_Skip state indicating that the predictive coding generates block level motion information and residual information of coded input frame content; a Full_Skip state indicating that the predictive coding uses direct motion vector interpolation without use of supplementary coding data; and a Semi_Skip state indicating that the predictive coding uses direct motion vector interpolation and includes supplementary coding data.
  • 12. The method of claim 1, further comprising, when a pixel block of the input frame is predictively coded with reference to the virtual reference frame and motion vectors obtained from the predictive coding are smaller than a threshold value, transmitting coded data of the pixel block with a syntax element identifying the motion vectors as having zero values.
  • 13. The method of claim 1, wherein the prediction search of a pixel block of the input frame is constrained to a predetermined search window about a collocated location of the virtual reference frame.
  • 14. The method of claim 1, wherein content of the virtual reference frame is generated by an optical flow motion vector refinement technique.
  • 15. A computer readable medium having program instructions stored thereon that, when executed by a processing device, cause the processing device to: predictively code input frames; when a coded input frame is designated as a reference frame, decode the coded data of the reference frame; store the decoded reference frame data for use as a prediction reference of subsequently-coded input frames; and generate data of a virtual reference frame from a pair of stored reference frames; wherein the predictive coding of an input frame includes a prediction search from among the reference frame data and virtual reference frame data.
  • 16. An encoding terminal, comprising: a video encoder having an input for source video; a video decoder having an input for coded video from the video encoder; a reference picture buffer to store decoded reference frames output from the video decoder; a virtual reference picture generator having an input for reference frames from the reference picture buffer; and a virtual reference picture buffer having an input for virtual reference frames output by the virtual reference picture generator.
  • 17. The terminal of claim 16, further comprising a predictor having inputs for reference frames from the reference picture buffer and for virtual reference frames from the virtual reference picture buffer.
  • 18. The terminal of claim 16, wherein a first reference frame of the pair has a temporal position on a first side of the temporal position of the virtual reference frame, and a second reference frame of the pair has a temporal position on a second side of the temporal position of the virtual reference frame.
  • 19. The terminal of claim 16, wherein the virtual reference picture generator generates content of the virtual reference frame by projection of a motion vector extending from the first reference frame of the pair to the second reference frame of the pair.
  • 20. The terminal of claim 16, wherein the virtual reference picture generator generates content of the virtual reference frame by projection of motion vectors extending from the first reference frame of the pair across a plurality of reference frames to the second reference frame of the pair.
  • 21. The terminal of claim 16, wherein, when a current pixel block of an input frame is coded predictively with respect to a virtual reference frame, the video encoder derives a motion vector of the pixel block with reference to motion vectors of other pixel blocks neighboring the current pixel block that use the same set of reference frames for prediction as the current pixel block.
  • 22. A video decoding method, comprising: predictively decoding coded frames according to coding parameters provided with the coded frames; when a coded frame is designated as a reference frame, storing the decoded reference frame for use as a prediction reference of subsequently-decoded frames; wherein, when coding parameters identify a coded frame as coded with reference to a virtual reference frame: deriving content of the virtual reference frame from a pair of stored reference frames, and decoding the coded frame using the virtual reference frame as a prediction reference.
  • 23. The method of claim 22, wherein a first reference frame of the pair has a temporal position on a first side of the temporal position of the virtual reference frame, and a second reference frame of the pair has a temporal position on a second side of the temporal position of the virtual reference frame.
  • 24. The method of claim 22, wherein content of the virtual reference frame is generated by projection of a motion vector extending from the first reference frame of the pair to the second reference frame of the pair.
  • 25. The method of claim 22, wherein content of the virtual reference frame is generated by projection of motion vectors extending from the first reference frame of the pair across a plurality of reference frames to the second reference frame of the pair.
  • 26. The method of claim 22, wherein, when a current pixel block of an input frame is coded predictively with respect to a virtual reference frame, a motion vector of the pixel block is derived with reference to motion vectors of other pixel blocks neighboring the current pixel block that use the same set of reference frames for prediction as the current pixel block.
  • 27. A computer readable medium having program instructions stored thereon that, when executed by a processing device, cause the processing device to: predictively decode coded frames according to coding parameters provided with the coded frames; when a coded frame is designated as a reference frame, store the decoded reference frame for use as a prediction reference of subsequently-decoded frames; wherein, when coding parameters identify a coded frame as coded with reference to a virtual reference frame: derive content of the virtual reference frame from a pair of stored reference frames, and decode the coded frame using the virtual reference frame as a prediction reference.
  • 28. A decoding terminal, comprising: a video decoder having an input for coded video; a reference picture buffer to store decoded reference frames output from the video decoder; a virtual reference picture generator having an input for reference frames from the reference picture buffer; and a virtual reference picture buffer having an input for virtual reference frames output by the virtual reference picture generator.
  • 29. The terminal of claim 28, further comprising a predictor having inputs for reference frames from the reference picture buffer and for virtual reference frames from the virtual reference picture buffer.
  • 30. The terminal of claim 28, wherein a first reference frame of the pair has a temporal position on a first side of the temporal position of the virtual reference frame, and a second reference frame of the pair has a temporal position on a second side of the temporal position of the virtual reference frame.
  • 31. The terminal of claim 28, wherein the virtual reference picture generator generates content of the virtual reference frame by projection of a motion vector extending from the first reference frame of the pair to the second reference frame of the pair.
  • 32. The terminal of claim 28, wherein the virtual reference picture generator generates content of the virtual reference frame by projection of motion vectors extending from the first reference frame of the pair across a plurality of reference frames to the second reference frame of the pair.
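Claims 9, 11, and 12 describe signaling states an encoder may emit alongside a virtual reference frame. A minimal sketch of how those syntax elements could be modeled follows; the enum names, numeric values, and threshold function are illustrative assumptions, not the application's actual bitstream syntax:

```python
from enum import Enum

class TemporalInterpolatedMode(Enum):
    """The three states of the temporal interpolated mode identifier (claim 9)."""
    USE_AS_REFERENCE = 0    # decoder uses the virtual frame only as a reference
    OUTPUT_FOR_DISPLAY = 1  # decoder outputs the virtual frame for display as-is
    OUTPUT_ENHANCED = 2     # decoder outputs it enhanced by encoder-supplied data

class SkipMode(Enum):
    """The three predictive coding mode states (claim 11)."""
    NO_SKIP = 0    # block-level motion information and residuals are coded
    FULL_SKIP = 1  # direct MV interpolation, no supplementary coding data
    SEMI_SKIP = 2  # direct MV interpolation plus supplementary coding data

def signal_zero_mv(mv, threshold):
    """Per claim 12: when motion vectors found against a virtual reference
    are smaller than a threshold, signal them as having zero values."""
    return abs(mv[0]) < threshold and abs(mv[1]) < threshold
```

Under Full_Skip, a block carries neither motion nor residual data, so prediction cost reduces to the mode flag itself; the zero-MV syntax element of claim 12 gives a similar saving for near-stationary content.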
Provisional Applications (2)
Number Date Country
63331469 Apr 2022 US
63305111 Jan 2022 US