In 360 video, which is also known as 360 degree video, immersive video, or spherical video, video recordings are taken from every direction (i.e., over 360 degrees) simultaneously using an omnidirectional camera or a collection of cameras or the like. In playback, the viewer may select a viewing direction or viewport for viewing among any of the available directions. In compression/decompression (codec) systems, compression efficiency, video quality, and computational efficiency are important performance criteria. Furthermore, the compression/decompression of 360 video will be an important factor in the dissemination of 360 video and the user experience in the viewing of such 360 video.
Therefore, it may be advantageous to increase the compression efficiency, video quality, and computational efficiency of codec systems for processing 360 video. It is with respect to these and other considerations that the present improvements have been needed.
The material described herein is illustrated by way of example and not by way of limitation in the accompanying figures. For simplicity and clarity of illustration, elements illustrated in the figures are not necessarily drawn to scale. For example, the dimensions of some elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference labels have been repeated among the figures to indicate corresponding or analogous elements. In the figures:
One or more embodiments or implementations are now described with reference to the enclosed figures. While specific configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. Persons skilled in the relevant art will recognize that other configurations and arrangements may be employed without departing from the spirit and scope of the description. It will be apparent to those skilled in the relevant art that techniques and/or arrangements described herein may also be employed in a variety of systems and applications other than those described herein.
While the following description sets forth various implementations that may be manifested in architectures such as system-on-a-chip (SoC) architectures for example, implementation of the techniques and/or arrangements described herein is not restricted to particular architectures and/or computing systems and may be implemented by any architecture and/or computing system for similar purposes. For instance, various architectures employing, for example, multiple integrated circuit (IC) chips and/or packages, and/or various computing devices and/or consumer electronic (CE) devices such as set top boxes, smart phones, etc., may implement the techniques and/or arrangements described herein. Further, while the following description may set forth numerous specific details such as logic implementations, types and interrelationships of system components, logic partitioning/integration choices, etc., claimed subject matter may be practiced without such specific details. In other instances, some material such as, for example, control structures and full software instruction sequences, may not be shown in detail in order not to obscure the material disclosed herein.
The material disclosed herein may be implemented in hardware, firmware, software, or any combination thereof. The material disclosed herein may also be implemented as instructions stored on a machine-readable medium, which may be read and executed by one or more processors. A machine-readable medium may include any medium and/or mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device). For example, a machine-readable medium may include read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.), and others.
References in the specification to “one implementation”, “an implementation”, “an example implementation”, etc., indicate that the implementation described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same implementation. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other implementations whether or not explicitly described herein.
Methods, devices, apparatuses, computing platforms, and articles are described herein related to video coding and, in particular, to deblock filtering 2-dimensional (2D) video frames that are projections from a 360 video space.
The techniques discussed herein provide deblocking for 360 video. For example, deblocking may be enabled along boundaries of 2D video frames and faces of such 2D video frames that are discontiguous in the projected 2D plane when such boundaries are contiguous in the corresponding 360 video (e.g., in the corresponding 360 video sphere). In particular, in some 360 video coding contexts, 2D video frames that are projections from a 360 video space (e.g., projections from 360 video to a 2D plane based on a predetermined format) may be provided to an encoder for encoding into a bitstream such as a standards compliant bitstream. The bitstream may be stored or transmitted or the like and processed by a decoder. The decoder such as a standards compliant decoder may decode the bitstream to reconstruct the 2D video frames (e.g., the projections from the 360 video). The reconstructed 2D video frames may be processed for presentation to a user. For example, a selected viewport may be used to determine a portion or portions of the reconstructed 2D video frames, which may be assembled as needed and provided to a display device for presentation to a user.
In such techniques, the standards compliant codec (encode/decode) techniques may include in-frame deblock filtering for adjacent or neighboring pixels in video frames that cross block (e.g., macroblock, coding unit, etc.) boundaries. However, in projecting from the 360 video space to 2D video frames, some pixels that are neighbors in the 360 video space are presented or formatted as non-neighboring pixels in the 2D video frames. As used herein, the term non-neighboring indicates that pixels are not spatially adjacent (e.g., in a 2D video frame) and that sets of pixels have no neighboring pixels between them (e.g., that no pixel of a first set of pixels spatially neighbors any pixel of a second set of pixels in a 2D video frame). For example, such neighboring pixels in the 360 video space may be on opposite boundaries of the corresponding 2D video frame, on non-adjacent boundaries of face projections within the corresponding 2D video frame, or the like, as is discussed further herein.
In some embodiments, a group of pixels for deblock filtering may be identified within a 2D video frame that is a projection from a 360 video space such that the group of pixels includes a first set of pixels and a second set of pixels that are non-neighboring sets of pixels in the 2D video frame and such that a first individual pixel of the first set of pixels and a second individual pixel of the second set of pixels are neighboring pixels in the 360 video space. The identified group of pixels (e.g., a line of pixels with the first and second sets of pixels on opposite sides of a boundary) is deblock filtered using a low pass filter or the like to determine filtered pixel values. Such techniques may be repeated on a line by line basis for any or all sets of pixels that are non-neighboring in the 2D video frame but neighboring pixels in the 360 video space to generate a 360 video space deblock filtered 2D video frame based on the individual 2D video frame.
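For illustration purposes only, the line-by-line processing just described may be sketched as follows. This Python sketch is not part of any described implementation; the function name, coordinate convention, and tap weights are assumptions chosen for clarity. Each group is given as two coordinate lists ordered so that the two pixels that neighbor in the 360 video space meet in the middle of the assembled line, and a simple low pass filter is applied across that line:

```python
import numpy as np

def deblock_groups(frame, groups, taps=(1, 2, 4, 2, 1)):
    """Apply a simple low pass deblock filter to each group of pixels.

    frame  : 2D numpy array of pixel values (e.g., luma).
    groups : iterable of (first_coords, second_coords); each is a list of
             (row, col) coordinates ordered so that the last pixel of
             first_coords and the first pixel of second_coords are the
             pair that neighbors in the 360 video space.
    """
    kernel = np.asarray(taps, dtype=np.float32)
    kernel /= kernel.sum()
    half = len(kernel) // 2
    out = frame.astype(np.float32).copy()
    for first_coords, second_coords in groups:
        coords = list(first_coords) + list(second_coords)
        line = np.array([frame[r, c] for r, c in coords], dtype=np.float32)
        # Pad with edge values so pixels away from the virtual boundary are
        # not pulled toward zero by the convolution.
        padded = np.pad(line, half, mode="edge")
        filtered = np.convolve(padded, kernel, mode="valid")
        for (r, c), value in zip(coords, filtered):
            out[r, c] = value
    return out
```

The enumeration of such groups for a particular projection format (equirectangular, cube map, compact cube map, or the like) is discussed further below.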
Such pixel selection or matching and deblock filtering techniques may be implemented in any suitable encode, decode, video preprocessing, or video post-processing context. For example, such techniques may be applied within a local encode loop of a video encoder, as pre-processing prior to providing video frames to an encoder, as post decoder processing, or the like, as is discussed further herein. Furthermore, the discussed techniques may be used in any suitable coding context such as in the implementation of H.264/MPEG-4 advanced video coding (AVC) standards based codecs, high efficiency video coding (H.265/HEVC) standards based codecs, proposed video coding (H.266) codecs, Alliance for Open Media (AOM) standards based codecs such as the AV1 standard, MPEG standards based codecs such as the MPEG-4 standard, VP9 standards based codecs, or any other suitable codec or extension or profile thereof. The discussed techniques reduce blocky artifacts of coded video displayed to users and provide an improved 360 video experience.
As shown, in some contexts, coder 103 may receive 2D video frames 112 (e.g., 2D video frames that are projected from a 360 or spherical space) from 360 to 2D projection module 102 and coder 103 may generate a corresponding output bitstream 113. Although illustrated with respect to coder 103 receiving 2D video frames 112 from 360 to 2D projection module 102, coder 103 may receive 2D video frames 112 from any source such as memory, another device, or the like. In such contexts, coder 103 may provide an encoder capability for system 100 (and, in such contexts input bitstream 114 and 2D video frames 115 may not be employed). 360 camera 101 may be any suitable camera or group of cameras that may attain 360 video or spherical video or the like. Furthermore, 360 to 2D projection module 102 may receive 360 video 111 and 360 to 2D projection module 102 may generate 2D video frames 112 using any suitable technique or techniques. For example, 360 to 2D projection module 102 may project 360 video 111 to 2D video frames 112 in any suitable 2D format that represents the projection from 360 video.
Other modules or components of system 100 may also receive 2D video frames 112 or portions thereof as needed. System 100 may provide, for example, video compression and system 100 may be a video encoder implemented via a computer or computing device or the like. For example, system 100 may generate output bitstream 113 that is compatible with a video compression-decompression (codec) standard such as the H.264/MPEG-4 advanced video coding (AVC) standard, the high efficiency video coding (H.265/HEVC) standard, proposed video coding (H.266) standards, the VP8 standard, the VP9 standard, or the like.
In other embodiments, coder 103 may receive an input bitstream 114 corresponding to or representing 2D frames that are projected from a 360 or spherical space and coder 103 may generate corresponding 2D video frames 115 (e.g., such that 2D frames are projected from a 360 or spherical space). Input bitstream 114 may be received from memory, another device, or the like. In such contexts, coder 103 may provide a decoder capability for system 100 (and, in such contexts, 2D video frames 112 and output bitstream 113 may not be employed). In an embodiment, input bitstream 114 may be decoded to 2D video frames 115, which may be displayed to a user via display 108 based on a selected viewport within 2D video frames 115. Display 108 may be any suitable display such as a virtual reality (VR) display, a head mounted VR display, or the like.
Furthermore, although illustrated with all of 360 camera 101, 360 to 2D projection module 102, coder 103, viewport generator 107, and display 108, system 100 may include only some of 360 camera 101, 360 to 2D projection module 102, coder 103, viewport generator 107, and display 108. In an embodiment, system 100 includes 360 camera 101, 360 to 2D projection module 102, and coder 103. In an embodiment, system 100 includes coder 103, viewport generator 107, and display 108. Other combinations of 360 camera 101, 360 to 2D projection module 102, coder 103, viewport generator 107, and display 108 as well as other components may be provided for system 100 depending on the nature of the device in which system 100 is being implemented. System 100 may be implemented via any suitable device such as, for example, a server, a personal computer, a laptop computer, a tablet, a phablet, a smart phone, a digital camera, a gaming console, a wearable device, a display device, an all-in-one device, a two-in-one device, or the like, or via any suitable platform such as a mobile platform or the like. For example, as used herein, a system, device, computer, or computing device may include any such device or platform.
As discussed, coder 103 may receive 2D video frames 112. 2D video frames 112 (as well as 2D video frames 115 and other video frames discussed herein) may include any suitable video data such as pixels or pixel values or data, video sequence, pictures of a video sequence, video frames, video pictures, sequence of video frames, group of pictures, groups of pictures, video data, or the like in any suitable resolution. 2D video frames 112 may be characterized as video, input video data, video data, raw video, or the like. For example, 2D video frames 112 may be video graphics array (VGA), high definition (HD), Full-HD (e.g., 1080p), or 4K resolution video, or the like. Furthermore, 2D video frames 112 may include any number of video frames, sequences of video frames, pictures, groups of pictures, or the like. Techniques discussed herein are discussed with respect to pixels and pixel values of video frames for the sake of clarity of presentation. However, such video frames and/or video data may be characterized as pictures, video pictures, frames, sequences of frames, video sequences, or the like. As used herein, the term pixel or pixel value may include a value representing a pixel of a video frame such as a luminance value for the pixel, a color channel value for the pixel, or the like. In various examples, 2D video frames 112 may be raw video or decoded video. Furthermore, as discussed herein, coder 103 may provide both encode and decode functionality.
As shown, projection face boundary deblocker 104 receives 2D video frames 112 that include projections from a 360 video space. As used herein, the term projected from a 360 video space indicates the format of 2D video frames includes picture or video information corresponding to a 360 space, spherical space, or the like. For example, 360 video may be formatted or projected to a 2D image or video frame plane or the like using known techniques. Analogies to such projections (and their various advantages and disadvantages) may be found in the context of generating 2D maps from the globe. The format of such 2D video frames may include any suitable format such as, for example, an equirectangular (ERF) format, a cube map format, a compact cube map format, or the like.
Pixel selection and matching module 105 may determine, for some or all of 2D video frames 112, groups of pixels for deblock filtering. Pixel selection and matching module 105 may determine such groups of pixels for deblock filtering using any suitable technique or techniques. In an embodiment, pixel selection and matching module 105 may receive an indicator or indicators indicative of a format type of 2D video frames 112 (e.g., equirectangular format, cube map format, compact cube map format, or the like) and pixel selection and matching module 105 may determine groups of pixels for deblock filtering responsive to the format type indicator or indicators. Each such group of pixels for deblock filtering includes a first set of pixels and a second set of pixels such that the first and second sets of pixels are non-neighboring in the 2D video frame but such that they are neighboring in the 360 video space. Furthermore, such first and second sets of pixels are separated by a boundary across which deblock filtering may be applied. The boundary may be provided by a frame boundary of the 2D video frame, a face boundary of a projection portion of the 2D video frame, or the like. For example, the two sets of pixels of a group of pixels may be selected and oriented/aligned for deblock filtering. As shown in
Also as shown, set of pixels 303 includes a pixel 305 (marked with a gray box) along boundary 205 and set of pixels 304 includes a pixel 306 (also marked with a gray box), such that pixels 305 and 306 are neighbors in the 360 video space but not in the 2D video frame. In deblock filtering of group of pixels 302, set of pixels 303 and set of pixels 304 may be aligned (e.g., put in a row or column) such that 360 video space neighboring pixels 305, 306 are set next to one another for deblock filtering. In the following illustrations, such 360 video space neighboring pixels are marked with gray boxes for the sake of clarity of presentation.
As discussed with respect to system 100, group of pixels 302 may be selected by pixel selection and matching module 105, aligned for deblock filtering by pixel selection and matching module 105, and deblock filtered by deblock filter 106 to generate deblock filtered pixels or pixel values or the like. Any selection of groups of pixels and any deblock filtering discussed herein may be performed by pixel selection and matching module 105 and deblock filter 106, respectively. Furthermore, any deblock filtering discussed herein may be performed for a group of pixels across a boundary (e.g., for group of pixels 302 across boundary 205 or the like) using any suitable technique or techniques. For example, group of pixels 302 may include N pixels such that N/2 are from set of pixels 303 and N/2 are from set of pixels 304. In various embodiments, N may be in the range of about 4 to 16 or the like. In other examples, set of pixels 303 and set of pixels 304 may have different numbers of pixels. The deblock filtering may include any filtering such as low pass filtering, weighted filtering with any suitable weights, or the like. In an embodiment, the deblock filtering may be provided in accordance with HEVC deblock filtering for neighboring pixels across block boundaries. In an embodiment, group of pixels 302 includes a single line of pixels that may be deblock filtered.
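As one concrete and deliberately simplified illustration of filtering across the virtual boundary, the following sketch applies a filter patterned after the HEVC normal-mode luma deblocking filter to the pixels nearest the boundary. It is not a standards-compliant implementation: the clipping threshold tc is assumed to be a constant rather than derived from the quantization parameters, and only the nearest pixel on each side is modified.

```python
import numpy as np

def filter_boundary(p, q, tc=4):
    """Filter one line of pixels across a virtual boundary.

    p : pixels of the first set, ordered toward the boundary (p[-1] is p0).
    q : pixels of the second set, ordered away from the boundary (q[0] is q0).
    tc: clipping threshold (assumed constant here for illustration).
    """
    p = np.asarray(p, dtype=np.int32).copy()
    q = np.asarray(q, dtype=np.int32).copy()
    p0, p1 = p[-1], p[-2]
    q0, q1 = q[0], q[1]
    # Delta patterned after the HEVC normal-mode luma filter.
    delta = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4
    if abs(delta) < 10 * tc:
        delta = int(np.clip(delta, -tc, tc))
        p[-1] = np.clip(p0 + delta, 0, 255)
        q[0] = np.clip(q0 - delta, 0, 255)
    return p, q
```

For example, with set of pixels 303 ordered so that pixel 305 is last and set of pixels 304 ordered so that pixel 306 is first, such a filter smooths the two pixels that meet in the 360 video space.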
The same deblock filter may be applied to each selected group of pixels or the deblock filters (e.g., their length, weightings, or the like) may be different. In some embodiments, the selected deblock filter may be based on block sizes, prediction types, transform types, or the like corresponding to the blocks from which the sets of pixels are selected. In an embodiment, when one or both sets of pixels of a group are from a smaller block or blocks (e.g., a block of 8×8 or less), a shorter filter is used (e.g., about 4 pixels) and when one or both sets of pixels of a group are from a larger block or blocks (e.g., a block larger than 8×8), a longer filter is used (e.g., about 8 or more pixels). In an embodiment, when one or both sets of pixels of a group are from an intra predicted block or blocks, a shorter filter is used (e.g., about 4 pixels) and when one or both sets of pixels of a group are from an inter predicted block or blocks, a longer filter is used (e.g., about 8 or more pixels).
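For illustration, such block-size and prediction-type dependent filter selection might be sketched as follows; the 8×8 threshold and the 4- and 8-tap lengths mirror the example values above, while the decision structure itself is an assumption for clarity:

```python
def select_filter_length(block_sizes, prediction_modes):
    """Choose a deblock filter length for a group of pixels based on the
    blocks the two sets of pixels come from.

    block_sizes      : (width, height) of the block for each set of pixels.
    prediction_modes : "intra" or "inter" for each set of pixels.
    Returns the number of filter taps to use.
    """
    small_block = any(w * h <= 8 * 8 for (w, h) in block_sizes)
    any_intra = any(mode == "intra" for mode in prediction_modes)
    # Shorter filter near small or intra-predicted blocks, longer otherwise.
    if small_block or any_intra:
        return 4
    return 8
```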
With reference to
As shown, set of pixels 401 and set of pixels 402 are non-neighboring in 2D video frame 400 (i.e., no pixel of set of pixels 401 is contiguous with or adjacent to any pixel of set of pixels 402 in 2D video frame 400). However, in the 360 video space, a pixel of set of pixels 401 at frame boundary 406 (marked with a gray box within set of pixels 401) is a neighbor of a pixel of set of pixels 402 at frame boundary 407 (marked with a gray box within set of pixels 402). For example, set of pixels 401 may begin at the marked pixel of set of pixels 401 at frame boundary 406 and extend toward an interior of 2D video frame 400 and set of pixels 402 may begin at the marked pixel of set of pixels 402 at frame boundary 407 and extend toward an interior of 2D video frame 400. Furthermore, first and second sets of pixels 401, 402 may be the same distance (d2) from a bottom frame boundary 409 (and a top frame boundary 408). With reference to
Furthermore,
With reference to
Similarly, for any vertical set of pixels adjacent to bottom frame boundary 409 (except for pixels exactly at centerline 405, if any), a corresponding vertical set of pixels also adjacent to bottom frame boundary 409 may be found such that the sets of pixels are non-neighboring in 2D video frame 400 but neighboring in the 360 video space. For example, for a vertical set of pixels adjacent to bottom frame boundary 409 and left of centerline 405 by a distance, x, from centerline 405, the corresponding vertical set of pixels for deblock filtering may be found also adjacent to bottom frame boundary 409 and right of centerline 405 by the distance, x. For such vertical sets of pixels adjacent to bottom frame boundary 409, the pixels of each set adjacent to bottom frame boundary 409 are neighboring in the 360 video space and such pixels may be placed adjacent to one another for deblock filtering in analogy to the x marked pixels of sets of pixels 403, 404.
As discussed, horizontal and vertical groups of any number of pixels may be determined and deblock filtered in 2D video frame 400. In an embodiment, all such available horizontal groups (i.e., all linear single pixel depth horizontal groups having a horizontal pixel set adjacent to left frame boundary 406 and a horizontal pixel set adjacent to right frame boundary 407) and all such available vertical groups (i.e., all linear single pixel depth vertical groups having two vertical pixel sets adjacent to top frame boundary 408 and equidistant from centerline 405, and all linear single pixel depth vertical groups having two vertical pixel sets adjacent to bottom frame boundary 409 and equidistant from centerline 405) are deblock filtered. In another embodiment, a subset of such available horizontal and vertical groups of pixels is deblock filtered. In any event, such deblock filtering may generate, from a 2D video frame, a 360 video space deblock filtered 2D video frame based on the 2D video frame. As used herein, the term 360 video space deblock filtered 2D video frame indicates a 2D video frame that has been deblock filtered for neighbors in the 360 video space. Such a 360 video space deblock filtered 2D video frame may or may not also be deblock filtered within the 2D video frame (e.g., in-frame deblock filtered).
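For illustration, the group enumeration just described for an equirectangular frame (left/right frame boundary pairs for every row, and mirrored top and bottom boundary pairs for every column left of the centerline) might be sketched as follows. The coordinate convention and the default set depth of four pixels are illustrative assumptions:

```python
def equirectangular_groups(width, height, depth=4):
    """Yield (first_coords, second_coords) for an equirectangular frame.

    Coordinates are (row, col). Each list is ordered so that the pixel at
    the frame boundary comes last in first_coords and first in
    second_coords, i.e., the 360 video space neighbors meet in the middle.
    """
    # Horizontal groups: left frame boundary meets right frame boundary.
    for row in range(height):
        first = [(row, col) for col in range(depth - 1, -1, -1)]
        second = [(row, width - 1 - col) for col in range(depth)]
        yield first, second
    # Vertical groups: columns mirrored about the vertical centerline meet
    # along the top boundary and, separately, along the bottom boundary.
    for col in range(width // 2):
        mirror = width - 1 - col
        top_first = [(row, col) for row in range(depth - 1, -1, -1)]
        top_second = [(row, mirror) for row in range(depth)]
        yield top_first, top_second
        bottom_first = [(height - 1 - row, col) for row in range(depth - 1, -1, -1)]
        bottom_second = [(height - 1 - row, mirror) for row in range(depth)]
        yield bottom_first, bottom_second
```

The groups yielded by such an enumeration could then be filtered by a routine such as the one sketched earlier.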
The described pixel selection and deblock filtering techniques for 2D video frames that are projections from a 360 video space may be performed for any format of projection. For example, the 2D video frame may be an equirectangular frame projected from the 360 video space (as discussed with respect to
As shown in
As shown with respect to
As shown, set of pixels 501 and set of pixels 502 are non-neighboring in 2D video frame 500 (i.e., no pixel of set of pixels 501 is contiguous with or adjacent to any pixel of set of pixels 502 in 2D video frame 500). However, in the 360 video space, a pixel of set of pixels 501 at frame boundary 506 (at left boundary of face F, marked with a gray box within set of pixels 501) is a neighbor of a pixel of set of pixels 502 at frame boundary 507 (at right boundary of face A, marked with a gray box within set of pixels 502). For example, set of pixels 501 may begin at the marked pixel of set of pixels 501 at frame boundary 506 and extend toward an interior of 2D video frame 500 (and face F) and set of pixels 502 may begin at the marked pixel of set of pixels 502 at frame boundary 507 and extend toward an interior of 2D video frame 500 (and face A). Furthermore, first and second sets of pixels 501, 502 may be the same distance (d2) from a bottom frame boundary 509 (and a top frame boundary 508).
Furthermore,
With reference to
With reference to
As discussed above and with reference to
As shown with respect to
As shown, set of pixels 801 and set of pixels 802 are non-neighboring in 2D video frame 800 (i.e., no pixel of set of pixels 801 is contiguous with or adjacent to any pixel of set of pixels 802 in 2D video frame 800). However, in the 360 video space, a pixel of set of pixels 801 at frame boundary 806 (at left boundary of face D′, marked with a gray box within set of pixels 801) is a neighbor of a pixel of set of pixels 802 at the bottom boundary of face B (marked with a gray box within set of pixels 802). For example, set of pixels 801 may begin at the marked pixel of set of pixels 801 at frame boundary 806 and extend toward an interior of 2D video frame 800 (and face D′) and set of pixels 802 may begin at the marked pixel of set of pixels 802 at the bottom boundary of face B and extend toward an interior of face B.
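Because a compact cube map places rotated faces next to one another, aligning the two sets of pixels may require traversing one face horizontally and the other vertically. The following sketch, for illustration purposes only, pairs rows along the left boundary of one face with columns along the bottom boundary of another face; the face origins, the assumed 90 degree relative rotation, and the index mirroring are assumptions chosen for clarity rather than a statement of the actual compact cube map layout:

```python
def rotated_face_boundary_groups(face_size, left_face_origin, bottom_face_origin, depth=4):
    """Pair pixel sets across a boundary between two faces whose shared edge
    is a left boundary on one face and a bottom boundary on the other.

    left_face_origin, bottom_face_origin : (row, col) of each face's
    top-left corner within the 2D video frame.
    """
    lr, lc = left_face_origin
    br, bc = bottom_face_origin
    for i in range(face_size):
        # Horizontal set: starts at the left boundary of the first face and
        # extends toward its interior (boundary pixel last).
        first = [(lr + i, lc + k) for k in range(depth - 1, -1, -1)]
        # Vertical set: starts at the bottom boundary of the second face and
        # extends toward its interior (boundary pixel first); the column is
        # mirrored to account for the assumed rotation between the faces.
        col = bc + (face_size - 1 - i)
        second = [(br + face_size - 1 - k, col) for k in range(depth)]
        yield first, second
```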
Furthermore,
With reference to
As discussed, the pixel selection and deblock filtering techniques discussed herein may be used in any suitable 360 video encode, decode, pre-, or post-processing context.
As shown, encoder 1000 may receive 2D video frames 112 (e.g., 2D video frames that are projected from a 360 or spherical space) and encoder 1000 may generate output bitstream 113 as discussed herein. For example, encoder 1000 may divide an individual 2D frame of 2D video frames 112 into blocks of different sizes, which may be predicted either temporally (inter) via motion estimation module 1001 and motion compensation module 1002 or spatially (intra) via intra prediction module 1004. Such a coding decision may be implemented via selection switch 1008. Furthermore, based on the use of intra or inter coding, a difference between source pixels and predicted pixels may be generated via differencer 1007 (e.g., between pixels of previously decoded 360 video space deblock filtered reconstructed frames and pixels of source or original frames). The difference may be converted to the frequency domain (e.g., based on a discrete cosine transform) via transform module 1010 and converted to quantized coefficients via quantization module 1011. Such quantized coefficients along with various control signals may be entropy encoded via entropy encoder module 1014 to generate output bitstream 113, which may be transmitted or transferred or the like to a decoder. Furthermore, as part of a local decode loop, the quantized coefficients may be inverse quantized via inverse quantization module 1012 and inverse transformed via inverse transform module 1013 to generate reconstructed differences or residuals. The reconstructed differences or residuals may be combined with reference blocks via adder 1009 to generate reconstructed blocks, which, as shown, may be provided to intra prediction module 1004 for use in intra prediction. Furthermore, the reconstructed blocks may be in-frame deblocked via deblock filtering module 1006 and assembled to generate reconstructed frames 1021. Reconstructed frames 1021 may be provided to projection face boundary deblocker 104 for deblocking of pixel groups that are non-neighbors in reconstructed frames 1021 and are neighbors in the 360 video space of which reconstructed frames 1021 include projections. As shown, projection face boundary deblocker 104 may generate 360 video space deblock filtered reconstructed frames (filtered frames) 1022 based on reconstructed frames 1021, which may be stored in frame buffer 1005 and provided to motion estimation module 1001 and motion compensation module 1002 for use in inter prediction.
By implementing encoder 1000 and, in particular, projection face boundary deblocker 104 within encoder 1000, improved encoding quality and compression and improved video quality of video represented by output bitstream 113 may be attained. For example, as discussed, projection face boundary deblocker 104 may receive an individual 2D reconstructed video frame of reconstructed video frames 1021 such that the individual 2D reconstructed video frame includes a projection from a 360 video space. Projection face boundary deblocker 104 determines groups of pixels from the individual 2D reconstructed video frame that are non-neighboring in the individual 2D reconstructed video frame and that include pixels that are neighboring in the 360 video space. Such groups of pixels are deblock filtered to generate a filtered frame of 360 video space deblock filtered reconstructed frames 1022. Also as discussed, a portion (e.g., a block or coding unit or the like) of the 360 video space deblock filtered reconstructed frame may be differenced, by differencer 1007, with a corresponding portion of 2D video frames 112 (e.g., a portion of an original 2D video frame) to generate a residual portion. The residual portion may be transformed, by transform module 1010, and quantized, by quantization module 1011, to determine quantized transform coefficients for the residual portion. The quantized transform coefficients may be encoded into output bitstream 113 by entropy encoder module 1014.
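For context, the residual path just described (difference, transform, quantization) and its inverse in the local decode loop can be sketched per block as follows. This is an illustration only: the plain 2D DCT and the uniform quantization step stand in for the codec-specific transform and quantizer, and the module numbers in the comments refer to the modules discussed above.

```python
import numpy as np
from scipy.fft import dctn, idctn

def encode_block(source_block, prediction_block, qstep=16.0):
    """Difference, transform, and quantize one block (cf. differencer 1007,
    transform module 1010, and quantization module 1011)."""
    residual = source_block.astype(np.float32) - prediction_block.astype(np.float32)
    coefficients = dctn(residual, norm="ortho")
    return np.round(coefficients / qstep).astype(np.int32)

def reconstruct_block(quantized, prediction_block, qstep=16.0):
    """Inverse quantize, inverse transform, and add the prediction, as in the
    local decode loop (the decoder follows the same path)."""
    residual = idctn(quantized.astype(np.float32) * qstep, norm="ortho")
    return np.clip(prediction_block + residual, 0, 255)
```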
As shown, decoder 1100 may receive input bitstream 114 (e.g., an input bitstream corresponding to or representing 2D video frames that are projected from a 360 or spherical space) and decoder 1100 may generate 2D video frames 115 (e.g., such that 2D frames are projected from a 360 or spherical space) as discussed herein. For example, entropy decoder module 1114 may entropy decode input bitstream 114 to determine quantized coefficients along with various control signals. The quantized coefficients may be inverse quantized via inverse quantization module 1112 and inverse transformed via inverse transform module 1113 to generate reconstructed differences or residuals. The reconstructed differences or residuals may be combined with reference blocks (from previously decoded frames) via adder 1109 to generate reconstructed blocks, which, as shown, may be provided to intra prediction module 1104 for use in intra prediction. Furthermore, the reconstructed blocks may be in-frame deblocked via deblock filtering module 1106 and assembled to generate reconstructed frames 1121. Reconstructed frames 1121 may be provided to projection face boundary deblocker 104 for deblocking of pixel groups that are non-neighbors in reconstructed frames 1121 and are neighbors in the 360 video space of which reconstructed frames 1121 include projections. As shown, projection face boundary deblocker 104 may generate 2D video frames 115 (e.g., 360 video space deblock filtered reconstructed frames) based on reconstructed frames 1121, which may be stored in frame buffer 1105 and provided to motion compensation module 1102 for use in inter prediction. Furthermore, 2D video frames 115 may be provided for output to a display device or the like for viewing by a user.
By implementing decoder 1100 and, in particular, projection face boundary deblocker 104 within decoder 1100, improved video quality of video represented by 2D video frames 115 may be attained. For example, as discussed, projection face boundary deblocker 104 may receive an individual 2D reconstructed video frame of reconstructed video frames 1121 such that the individual 2D reconstructed video frame includes a projection from a 360 video space. Projection face boundary deblocker 104 determines groups of pixels from the individual 2D reconstructed video frame that are non-neighboring in the individual 2D reconstructed video frame and that include pixels that are neighboring in the 360 video space. Such groups of pixels are deblock filtered to generate a filtered frame of 2D video frames 115 (e.g., a 360 video space deblock filtered reconstructed frame). Also as discussed, input bitstream 114 is decoded by entropy decoder module 1114 to determine quantized transform coefficients for a residual portion (e.g., a block or coding unit or the like) of the reconstructed 2D video frame. The quantized transform coefficients may be inverse quantized, by inverse quantization module 1112, and inverse transformed, by inverse transform module 1113, to generate a residual portion. The residual portion may be added to a corresponding prediction portion by adder 1109 to generate a reconstructed portion. The reconstructed portion may then be in-frame deblock filtered by in-frame deblock filtering (DF) module 1106 and the reconstructed portion and other reconstructed portions may be assembled to generate a reconstructed frame of reconstructed frames 1121.
Projection face boundary deblocker 104 may receive the reconstructed frame of reconstructed frames 1121 such that the individual 2D reconstructed video frame includes a projection from a 360 video space. Projection face boundary deblocker 104 determines groups of pixels from the individual 2D reconstructed video frame that are non-neighboring in the individual 2D reconstructed video frame and that include pixels that are neighboring in the 360 video space. Such groups of pixels are deblock filtered to generate a filtered frame of 2D video frames 115. With reference to
As shown, encoder 1200 may receive 2D video frames 112 (e.g., 2D video frames that are projected from a 360 or spherical space) and encoder 1200 may generate output bitstream 113 as discussed herein. For example, encoder 1200 may include projection face boundary deblocker 104 as a pre-processor or pre-filter for improved encoder efficiency and video quality. In some examples, projection face boundary deblocker 104 may be characterized as a part of encoder 1200 and, in other examples, projection face boundary deblocker 104 may be characterized as a pre-processor or pre-filter prior to encode processing.
For example, projection face boundary deblocker 104 may receive 2D video frames 112 (e.g., 2D video frames that are projected from a 360 or spherical space) and projection face boundary deblocker 104 may generate 360 video space deblock filtered frames (filtered frames) 1221 as discussed herein. Filtered frames 1221 may then be encode processed as discussed with respect to
By implementing encoder 1200 and, in particular, projection face boundary deblocker 104, improved encoding quality and compression and improved video quality of video represented by output bitstream 113 may be attained. For example, as discussed, projection face boundary deblocker 104 may receive an individual 2D video frame of 2D video frames 112 such that the individual 2D video frame includes a projection from a 360 video space. Projection face boundary deblocker 104 determines groups of pixels from the individual 2D video frame that are non-neighboring in the individual 2D video frame and that include pixels that are neighboring in the 360 video space. Such groups of pixels are deblock filtered to generate a filtered frame of 360 video space deblock filtered frames 1221. Also as discussed, a portion (e.g., a block or coding unit or the like) of the 360 video space deblock filtered frame may be differenced, by differencer 1007, with a corresponding portion of a reconstructed video frame (e.g., as reconstructed by the local decode loop and as selected by intra- or inter-prediction) to generate a residual portion. The residual portion may be transformed, by transform module 1010, and quantized, by quantization module 1011, to determine quantized transform coefficients for the residual portion. The quantized transform coefficients may be encoded into output bitstream 113 by entropy encoder module 1014.
As shown, decoder 1300 may receive input bitstream 114 (e.g., an input bitstream corresponding to or representing 2D video frames that are projected from a 360 or spherical space) and decoder 1300 may generate 2D video frames 115 (e.g., such that 2D frames are projected from a 360 or spherical space). For example, decoder 1300 may include projection face boundary deblocker 104 as a post-processor or post-filter for improved video quality. In some examples, projection face boundary deblocker 104 may be characterized as a part of decoder 1300 and, in other examples, projection face boundary deblocker 104 may be characterized as a post-processor or post-filter subsequent to decode processing.
For example, decoder 1300 may receive input bitstream 114, which may be decode processed to generate reconstructed frames 1121 (e.g., reconstructed 2D video frames that are projected from a 360 or spherical space). Such decode processing will not be repeated for the sake of brevity. As shown, reconstructed frames 1121 may be stored in frame buffer 1105 and used for decode of subsequent video frames. Furthermore, projection face boundary deblocker 104 may receive a reconstructed frame of reconstructed frames 1121 such that the individual 2D reconstructed video frame includes a projection from a 360 video space. Projection face boundary deblocker 104 determines groups of pixels from the individual 2D reconstructed video frame that are non-neighboring in the individual 2D reconstructed video frame and that include pixels that are neighboring in the 360 video space. Such groups of pixels are deblock filtered to generate a filtered frame of 2D video frames 115. With reference to
Graphics processor 1501 may include any number and type of graphics processors or processing units that may provide the operations as discussed herein. Such operations may be implemented via software or hardware or a combination thereof. In an embodiment, the illustrated modules of graphics processor 1501 may be implemented via circuitry or the like. For example, graphics processor 1501 may include circuitry dedicated to manipulate video data to generate a compressed bitstream and/or circuitry dedicated to manipulate a compressed bitstream to generate video data and to provide the operations discussed herein. For example, graphics processor 1501 may include an electronic circuit to manipulate and alter memory to accelerate the creation of video frames in a frame buffer and/or to manipulate and alter memory to accelerate the creation of a bitstream based on images or frames of video.
Central processor 1502 may include any number and type of processing units or modules that may provide control and other high level functions for system 1500 and/or provide the operations discussed herein. For example, central processor 1502 may include an electronic circuit to perform the instructions of a computer program by performing basic arithmetic, logical, control, input/output operations, and the like specified by the instructions.
Memory 1503 may be any type of memory such as volatile memory (e.g., Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), etc.) or non-volatile memory (e.g., flash memory, etc.), and so forth. In an embodiment, memory 1503 may be configured to store video data such as pixel values, control parameters, bitstream data, or any other video data discussed herein. In a non-limiting example, memory 1503 may be implemented by cache memory. In an embodiment, coder 103 may be implemented via execution units (EU) of graphics processor 1501. The execution units may include, for example, programmable logic or circuitry such as a logic core or cores that may provide a wide array of programmable logic functions. In an embodiment, coder 103 may be implemented via dedicated hardware such as fixed function circuitry or the like. Fixed function circuitry may include dedicated logic or circuitry and may provide a set of fixed function entry points that may map to the dedicated logic for a fixed purpose or function.
Returning to discussion of
Furthermore, the individual 2D video frame may be received for processing in any suitable context. For example, graphics processor 1501 and/or central processor 1502 may implement any encoder, decoder, pre-processor, post-processor, or the like discussed herein. In an embodiment, the individual 2D video frame is a reconstructed 2D video frame and process 1400 further includes differencing a portion of the face boundary deblock filtered 2D video frame with a portion of an original 2D video frame to generate a residual portion, transforming and quantizing the residual portion to determine quantized transform coefficients for the residual portion, and encoding the quantized transform coefficients into a bitstream. For example, such processing may provide in-loop deblock filtering for encode.
In another embodiment, the individual 2D video frame is a filtered reconstructed 2D video frame and process 1400 further includes decoding a bitstream to determine quantized transform coefficients for a residual portion of the reconstructed 2D video frame, inverse quantizing and inverse transforming the quantized transform coefficients to determine the residual portion, adding the residual portion to a prediction portion to generate a reconstructed portion of a reconstructed 2D video frame, in-frame deblock filtering the reconstructed 2D video frame to generate the filtered reconstructed 2D video frame, determining a portion of the face boundary deblock filtered 2D video frame for display based on a viewport (e.g., a 100 degree field of view), and displaying the portion to a user. For example, such processing may provide deblock filtering for decode and display to a user.
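For illustration of the viewport step mentioned above, the following sketch selects the region of an equirectangular frame covered by a given viewing direction and field of view. It is a simplification: a real viewport generator re-projects to a rectilinear image, whereas this sketch only crops the corresponding longitude/latitude window and wraps across the left/right frame boundary (one place where the boundary deblocking described herein is visible to a viewer).

```python
import numpy as np

def viewport_window(erp_frame, yaw_deg, pitch_deg, hfov_deg=100.0, vfov_deg=60.0):
    """Select an approximate viewport window from an equirectangular frame.

    yaw_deg/pitch_deg give the viewing direction; hfov_deg/vfov_deg the
    field of view. Returns the cropped window (no rectilinear re-projection).
    """
    height, width = erp_frame.shape[:2]
    # Longitude spans 360 degrees over the width, latitude 180 over the height.
    x0 = int(((yaw_deg - hfov_deg / 2.0) % 360.0) / 360.0 * width)
    x1 = int(((yaw_deg + hfov_deg / 2.0) % 360.0) / 360.0 * width)
    y0 = int(np.clip((90.0 - (pitch_deg + vfov_deg / 2.0)) / 180.0 * height, 0, height - 1))
    y1 = int(np.clip((90.0 - (pitch_deg - vfov_deg / 2.0)) / 180.0 * height, 0, height))
    span = (x1 - x0) % width or width
    cols = np.arange(x0, x0 + span) % width  # wrap across the frame boundary
    return erp_frame[y0:y1][:, cols]
```

For example, a call such as viewport_window(frame, yaw_deg=170.0, pitch_deg=0.0) spans the left/right frame boundary, where pixels that are non-neighboring in the 2D frame are displayed side by side.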
In an embodiment, process 1400 further includes differencing a portion of the face boundary deblock filtered 2D video frame with a portion of a reconstructed video frame to generate a residual portion, transforming and quantizing the residual portion to determine quantized transform coefficients for the residual portion, and encoding the quantized transform coefficients into a bitstream. For example, such processing may provide a pre-processing deblock filtering for encode.
Processing may continue at operation 1402, where, from the individual 2D video frame, a group of pixels for deblock filtering may be determined such that the group of pixels includes a first set of pixels and a second set of pixels of the individual 2D video frame such that the first set of pixels and the second set of pixels are non-neighboring sets of pixels in the individual 2D video frame and such that a first individual pixel of the first set of pixels and a second individual pixel of the second set of pixels are neighboring pixels in the 360 video space. For example, pixel selection and matching module 105 of coder 103 as implemented via graphics processor 1501 may determine the group of pixels for deblock filtering. The group of pixels for deblock filtering may be determined using any suitable technique or techniques such as any techniques discussed herein.
In an embodiment, the first set of pixels begins with the first individual pixel at a left boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame and the second set of pixels begins with the second individual pixel at a right boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame. In an embodiment, the individual 2D video frame is an equirectangular frame projected from the 360 video space and the first set of pixels begins with the first individual pixel at a first position of a top boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame, the second set of pixels begins with the second individual pixel at a second position of the top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary are equidistant from a center of the top boundary of the individual 2D video frame.
In an embodiment, the individual 2D video frame is a cube map format frame projected from the 360 video space and the first set of pixels begins with the first individual pixel at a first position of a first face projection and a first blank pixel region boundary of the individual 2D video frame and extends toward an interior of the first face projection, the second set of pixels begins with the second individual pixel at a second position of a second face projection and a second blank pixel region boundary and extends toward an interior of the second face projection, and the first position and the second position are equidistant from an intersection of the first and second blank pixel region boundaries. In an embodiment, the individual 2D video frame is a compact cube map format frame projected from the 360 video space and the first set of pixels begins with the first individual pixel at a first face projection and video frame edge boundary of the individual 2D video frame and extends toward an interior of the first face projection and the second set of pixels begins with the second individual pixel at a second face projection and a third face projection boundary and extends toward an interior of the second face projection.
Any number of groups of pixels may be identified for deblock filtering. In an embodiment, the group of pixels is a single line of pixels and process 1400 further includes determining, from the individual 2D video frame, a second group of pixels for deblock filtering such that the second group of pixels includes a third set of pixels and a fourth set of pixels of the individual 2D video frame, such that the third set of pixels and the fourth set of pixels are non-neighboring pixels in the individual 2D video frame and at least a third individual pixel of the third set of pixels and a fourth individual pixel of the fourth set of pixels are neighboring pixels in the 360 video space. For example, the individual 2D video frame may be an equirectangular frame projected from the 360 video space, the first set of pixels may begin with the first individual pixel at a left boundary of the individual 2D video frame and extend toward an interior of the individual 2D video frame, the second set of pixels may begin with the second individual pixel at a right boundary of the individual 2D video frame and extend toward the interior of the individual 2D video frame, the third set of pixels may begin with the third individual pixel at a first position of a top boundary of the individual 2D video frame and extend toward the interior of the individual 2D video frame, the fourth set of pixels may begin with the fourth individual pixel at a second position of the top boundary of the individual 2D video frame and extend toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary may be equidistant from a center of the top boundary of the individual 2D video frame.
Processing may continue at operation 1403, where the group of pixels including the first and second sets of pixels may be deblock filtered to generate a 360 video space deblock filtered 2D video frame based on the individual 2D video frame. The group of pixels or groups of pixels may be deblock filtered using any suitable technique or techniques. Each group may be deblock filtered using the same filtering techniques or different filtering techniques. In an embodiment, the group of pixels includes a single line of pixels and said deblock filtering the group of pixels includes applying a low pass filter to the group of pixels.
Various components of the systems described herein may be implemented in software, firmware, and/or hardware and/or any combination thereof. For example, various components of system 100 or system 1500 may be provided, at least in part, by hardware of a computing System-on-a-Chip (SoC) such as may be found in a computing system such as, for example, a smart phone. Those skilled in the art may recognize that systems described herein may include additional components that have not been depicted in the corresponding figures. For example, the systems discussed herein may include additional components such as bit stream multiplexer or de-multiplexer modules and the like that have not been depicted in the interest of clarity.
While implementation of the example processes discussed herein may include the undertaking of all operations shown in the order illustrated, the present disclosure is not limited in this regard and, in various examples, implementation of the example processes herein may include only a subset of the operations shown, operations performed in a different order than illustrated, or additional operations.
In addition, any one or more of the operations discussed herein may be undertaken in response to instructions provided by one or more computer program products. Such program products may include signal bearing media providing instructions that, when executed by, for example, a processor, may provide the functionality described herein. The computer program products may be provided in any form of one or more machine-readable media. Thus, for example, a processor including one or more graphics processing unit(s) or processor core(s) may undertake one or more of the blocks of the example processes herein in response to program code and/or instructions or instruction sets conveyed to the processor by one or more machine-readable media. In general, a machine-readable medium may convey software in the form of program code and/or instructions or instruction sets that may cause any of the devices and/or systems described herein to implement the techniques, modules, components, or the like as discussed herein.
As used in any implementation described herein, the term “module” refers to any combination of software logic, firmware logic, hardware logic, and/or circuitry configured to provide the functionality described herein. The software may be embodied as a software package, code and/or instruction set or instructions, and “hardware”, as used in any implementation described herein, may include, for example, singly or in any combination, hardwired circuitry, programmable circuitry, state machine circuitry, fixed function circuitry, execution unit circuitry, and/or firmware that stores instructions executed by programmable circuitry. The modules may, collectively or individually, be embodied as circuitry that forms part of a larger system, for example, an integrated circuit (IC), system on-chip (SoC), and so forth.
In various implementations, system 1600 includes a platform 1602 coupled to a display 1620. Platform 1602 may receive content from a content device such as content services device(s) 1630 or content delivery device(s) 1640 or other similar content sources. A navigation controller 1650 including one or more navigation features may be used to interact with, for example, platform 1602 and/or display 1620. Each of these components is described in greater detail below.
In various implementations, platform 1602 may include any combination of a chipset 1605, processor 1610, memory 1612, antenna 1613, storage 1614, graphics subsystem 1615, applications 1616 and/or radio 1618. Chipset 1605 may provide intercommunication among processor 1610, memory 1612, storage 1614, graphics subsystem 1615, applications 1616 and/or radio 1618. For example, chipset 1605 may include a storage adapter (not depicted) capable of providing intercommunication with storage 1614.
Processor 1610 may be implemented as Complex Instruction Set Computer (CISC) or Reduced Instruction Set Computer (RISC) processors, x86 instruction set compatible processors, multi-core, or any other microprocessor or central processing unit (CPU). In various implementations, processor 1610 may be dual-core processor(s), dual-core mobile processor(s), and so forth.
Memory 1612 may be implemented as a volatile memory device such as, but not limited to, a Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), or Static RAM (SRAM).
Storage 1614 may be implemented as a non-volatile storage device such as, but not limited to, a magnetic disk drive, optical disk drive, tape drive, an internal storage device, an attached storage device, flash memory, battery backed-up SDRAM (synchronous DRAM), and/or a network accessible storage device. In various implementations, storage 1614 may include technology to provide increased storage performance and enhanced protection for valuable digital media when multiple hard drives are included, for example.
Graphics subsystem 1615 may perform processing of images such as still or video for display. Graphics subsystem 1615 may be a graphics processing unit (GPU) or a visual processing unit (VPU), for example. An analog or digital interface may be used to communicatively couple graphics subsystem 1615 and display 1620. For example, the interface may be any of a High-Definition Multimedia Interface, DisplayPort, wireless HDMI, and/or wireless HD compliant techniques. Graphics subsystem 1615 may be integrated into processor 1610 or chipset 1605. In some implementations, graphics subsystem 1615 may be a stand-alone device communicatively coupled to chipset 1605.
The graphics and/or video processing techniques described herein may be implemented in various hardware architectures. For example, graphics and/or video functionality may be integrated within a chipset. Alternatively, a discrete graphics and/or video processor may be used. As still another implementation, the graphics and/or video functions may be provided by a general purpose processor, including a multi-core processor. In further embodiments, the functions may be implemented in a consumer electronics device.
Radio 1618 may include one or more radios capable of transmitting and receiving signals using various suitable wireless communications techniques. Such techniques may involve communications across one or more wireless networks. Example wireless networks include (but are not limited to) wireless local area networks (WLANs), wireless personal area networks (WPANs), wireless metropolitan area network (WMANs), cellular networks, and satellite networks. In communicating across such networks, radio 1618 may operate in accordance with one or more applicable standards in any version.
In various implementations, display 1620 may include any television type monitor or display. Display 1620 may include, for example, a computer display screen, touch screen display, video monitor, television-like device, and/or a television. Display 1620 may be digital and/or analog. In various implementations, display 1620 may be a holographic display. Also, display 1620 may be a transparent surface that may receive a visual projection. Such projections may convey various forms of information, images, and/or objects. For example, such projections may be a visual overlay for a mobile augmented reality (MAR) application. Under the control of one or more software applications 1616, platform 1602 may display user interface 1622 on display 1620.
In various implementations, content services device(s) 1630 may be hosted by any national, international and/or independent service and thus accessible to platform 1602 via the Internet, for example. Content services device(s) 1630 may be coupled to platform 1602 and/or to display 1620. Platform 1602 and/or content services device(s) 1630 may be coupled to a network 1660 to communicate (e.g., send and/or receive) media information to and from network 1660. Content delivery device(s) 1640 also may be coupled to platform 1602 and/or to display 1620.
In various implementations, content services device(s) 1630 may include a cable television box, personal computer, network, telephone, Internet enabled devices or appliance capable of delivering digital information and/or content, and any other similar device capable of uni-directionally or bi-directionally communicating content between content providers and platform 1602 and/or display 1620, via network 1660 or directly. It will be appreciated that the content may be communicated uni-directionally and/or bi-directionally to and from any one of the components in system 1600 and a content provider via network 1660. Examples of content may include any media information including, for example, video, music, medical and gaming information, and so forth.
Content services device(s) 1630 may receive content such as cable television programming including media information, digital information, and/or other content. Examples of content providers may include any cable or satellite television or radio or Internet content providers. The provided examples are not meant to limit implementations in accordance with the present disclosure in any way.
In various implementations, platform 1602 may receive control signals from navigation controller 1650 having one or more navigation features. The navigation features of navigation controller 1650 may be used to interact with user interface 1622, for example. In various embodiments, navigation controller 1650 may be a pointing device that may be a computer hardware component (specifically, a human interface device) that allows a user to input spatial (e.g., continuous and multi-dimensional) data into a computer. Many systems such as graphical user interfaces (GUI), and televisions and monitors allow the user to control and provide data to the computer or television using physical gestures.
Movements of the navigation features of navigation controller 1650 may be replicated on a display (e.g., display 1620) by movements of a pointer, cursor, focus ring, or other visual indicators displayed on the display. For example, under the control of software applications 1616, the navigation features located on navigation controller 1650 may be mapped to virtual navigation features displayed on user interface 1622. In various embodiments, navigation controller 1650 may not be a separate component but may be integrated into platform 1602 and/or display 1620. The present disclosure, however, is not limited to the elements or context shown or described herein.
In various implementations, drivers (not shown) may include technology to enable users to instantly turn on and off platform 1602 like a television with the touch of a button after initial boot-up, when enabled, for example. Program logic may allow platform 1602 to stream content to media adaptors or other content services device(s) 1630 or content delivery device(s) 1640 even when the platform is turned “off.” In addition, chipset 1605 may include hardware and/or software support for 5.1 surround sound audio and/or high definition 7.1 surround sound audio, for example. Drivers may include a graphics driver for integrated graphics platforms. In various embodiments, the graphics driver may include a driver for a peripheral component interconnect (PCI) Express graphics card.
In various implementations, any one or more of the components shown in system 1600 may be integrated. For example, platform 1602 and content services device(s) 1630 may be integrated, or platform 1602 and content delivery device(s) 1640 may be integrated, or platform 1602, content services device(s) 1630, and content delivery device(s) 1640 may be integrated. In various embodiments, platform 1602 and display 1620 may be an integrated unit. Display 1620 and content services device(s) 1630 may be integrated, or display 1620 and content delivery device(s) 1640 may be integrated, for example. These examples are not meant to limit the present disclosure.
In various embodiments, system 1600 may be implemented as a wireless system, a wired system, or a combination of both. When implemented as a wireless system, system 1600 may include components and interfaces suitable for communicating over a wireless shared media, such as one or more antennas, transmitters, receivers, transceivers, amplifiers, filters, control logic, and so forth. An example of wireless shared media may include portions of a wireless spectrum, such as the RF spectrum and so forth. When implemented as a wired system, system 1600 may include components and interfaces suitable for communicating over wired communications media, such as input/output (I/O) adapters, physical connectors to connect the I/O adapter with a corresponding wired communications medium, a network interface card (NIC), disc controller, video controller, audio controller, and the like. Examples of wired communications media may include a wire, cable, metal leads, printed circuit board (PCB), backplane, switch fabric, semiconductor material, twisted-pair wire, co-axial cable, fiber optics, and so forth.
Platform 1602 may establish one or more logical or physical channels to communicate information. The information may include media information and control information. Media information may refer to any data representing content meant for a user. Examples of content may include, for example, data from a voice conversation, videoconference, streaming video, electronic mail (“email”) message, voice mail message, alphanumeric symbols, graphics, image, video, text and so forth. Data from a voice conversation may be, for example, speech information, silence periods, background noise, comfort noise, tones and so forth. Control information may refer to any data representing commands, instructions or control words meant for an automated system. For example, control information may be used to route media information through a system, or instruct a node to process the media information in a predetermined manner. The embodiments, however, are not limited to the elements or context shown or described herein.
As described above, system 1600 may be embodied in varying physical styles or form factors.
Examples of a mobile computing device may include a personal computer (PC), laptop computer, ultra-laptop computer, tablet, touch pad, portable computer, handheld computer, palmtop computer, personal digital assistant (PDA), cellular telephone, combination cellular telephone/PDA, smart device (e.g., smart phone, smart tablet or smart mobile television), mobile internet device (MID), messaging device, data communication device, cameras, and so forth.
Examples of a mobile computing device also may include computers that are arranged to be worn by a person, such as wrist computers, finger computers, ring computers, eyeglass computers, belt-clip computers, arm-band computers, shoe computers, clothing computers, and other wearable computers. In various embodiments, for example, a mobile computing device may be implemented as a smart phone capable of executing computer applications, as well as voice communications and/or data communications. Although some embodiments may be described with a mobile computing device implemented as a smart phone by way of example, it may be appreciated that other embodiments may be implemented using other wireless mobile computing devices as well. The embodiments are not limited in this context.
As shown in
Various embodiments may be implemented using hardware elements, software elements, or a combination of both. Examples of hardware elements may include processors, microprocessors, circuits, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Determining whether an embodiment is implemented using hardware elements and/or software elements may vary in accordance with any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds and other design or performance constraints.
One or more aspects of at least one embodiment may be implemented by representative instructions stored on a machine-readable medium which represents various logic within the processor, which, when read by a machine, causes the machine to fabricate logic to perform the techniques described herein. Such representations, known as IP cores, may be stored on a tangible, machine readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that actually make the logic or processor.
While certain features set forth herein have been described with reference to various implementations, this description is not intended to be construed in a limiting sense. Hence, various modifications of the implementations described herein, as well as other implementations, which are apparent to persons skilled in the art to which the present disclosure pertains, are deemed to lie within the spirit and scope of the present disclosure.
In one or more first embodiments, a computer-implemented method for video coding comprises receiving an individual 2-dimensional (2D) video frame from a video sequence of 2D video frames, such that the individual 2D video frame comprises a projection from a 360 video space, determining, from the individual 2D video frame, a group of pixels for deblock filtering, such that the group of pixels comprises a first set of pixels and a second set of pixels of the individual 2D video frame, the first set of pixels and the second set of pixels are non-neighboring sets of pixels in the individual 2D video frame and at least a first individual pixel of the first set of pixels and a second individual pixel of the second set of pixels comprise neighboring pixels in the 360 video space, and deblock filtering the group of pixels comprising the first and second set of pixels to generate a 360 video space deblock filtered 2D video frame based on the individual 2D video frame.
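By way of illustration and not limitation, the deblock filtering of the first embodiments may be sketched as a routine that gathers two pixel runs that are separated in the individual 2D video frame but meet at a seam in the 360 video space, treats them as one contiguous line, filters that line, and writes the filtered samples back. The following Python sketch is a simplified, assumed realization; the helper name deblock_filter_group and the 3-tap averaging filter are illustrative choices, not a required implementation.

    import numpy as np

    def deblock_filter_group(first_set, second_set):
        # first_set and second_set are 1-D arrays of samples, each ordered from
        # the seam (frame or face boundary) toward the frame interior, so that
        # reversing the first set and appending the second set yields one
        # contiguous line of pixels across the seam in the 360 video space.
        line = np.concatenate([first_set[::-1], second_set]).astype(np.float64)
        # Simple 3-tap low pass [1, 2, 1] / 4 used as a stand-in deblocking filter.
        kernel = np.array([1.0, 2.0, 1.0]) / 4.0
        filtered = np.convolve(line, kernel, mode="same")
        # Keep the outermost samples unchanged (no full filter support there).
        filtered[0] = line[0]
        filtered[-1] = line[-1]
        n = first_set.size
        # Return each set in its original seam-to-interior order.
        return filtered[:n][::-1], filtered[n:]

In this sketch, the first sample of each set is the pixel on the seam, mirroring the claim language in which the first and second individual pixels are the neighboring pixels in the 360 video space.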
Further to the first embodiments, the individual 2D video frame comprises one of an equirectangular frame projected from the 360 video space, a cube map format frame projected from the 360 video space, or a compact cube map format frame projected from the 360 video space.
Further to the first embodiments, the first set of pixels begins with the first individual pixel at a left boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame and the second set of pixels begins with the second individual pixel at a right boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame.
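For example, assuming an equirectangular projection, a run starting at the left boundary of a row and a run starting at the right boundary of the same row wrap around to meet in the 360 video space. A hedged sketch of forming and filtering such a group, reusing the hypothetical deblock_filter_group above, might be:

    def filter_left_right_seam(frame, row, taps=4):
        # frame is assumed to be a 2-D numpy array of luma samples (H x W);
        # taps is the assumed number of samples taken from each side of the seam.
        first_set = frame[row, :taps]            # left boundary, extending inward
        second_set = frame[row, -taps:][::-1]    # right boundary, extending inward
        f_first, f_second = deblock_filter_group(first_set, second_set)
        frame[row, :taps] = f_first
        frame[row, -taps:] = f_second[::-1]
        return frame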
Further to the first embodiments, the individual 2D video frame comprises an equirectangular frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a top boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame, the second set of pixels begins with the second individual pixel at a second position of the top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary are equidistant from a center of the top boundary of the individual 2D video frame.
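Following the claim language, the top-boundary grouping may be sketched as pairing the column at offset d to the left of the top boundary's center with the column at offset d to its right, each run starting at the top boundary and extending downward into the frame. The offset, run length, and reuse of deblock_filter_group below are illustrative assumptions.

    def filter_top_boundary_pair(frame, offset, taps=4):
        # Pair two vertical runs whose starting positions on the top boundary
        # are equidistant (by 'offset' pixels) from the center of that boundary.
        _, w = frame.shape[:2]
        center = w // 2
        x1, x2 = center - offset, center + offset
        first_set = frame[:taps, x1]     # starts at the top boundary, extends down
        second_set = frame[:taps, x2]
        f_first, f_second = deblock_filter_group(first_set, second_set)
        frame[:taps, x1] = f_first
        frame[:taps, x2] = f_second
        return frame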
Further to the first embodiments, the individual 2D video frame comprises a cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a first face projection and a first blank pixel region boundary of the individual 2D video frame and extends toward an interior of the first face projection, the second set of pixels begins with the second individual pixel at a second position of a second face projection and a second blank pixel region boundary and extends toward an interior of the second face projection, and the first position and the second position are equidistant from an intersection of the first and second blank pixel region boundaries.
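As one hedged illustration of the cube map case, assume a layout in which a blank corner region is bounded on one side by the right edge of a first face and on another side by the top edge of a second face, the two boundaries meeting at a corner pixel position (corner_x, corner_y). The sketch below pairs boundary pixels at equal distance from that corner and filters each pair as a group; the particular layout, orientation, and run length are assumptions rather than a prescribed arrangement.

    def filter_cube_corner_seam(frame, corner_x, corner_y, length, taps=4):
        # Assumed layout: the first face sits up and to the left of the corner,
        # so its boundary with the blank region is the column corner_x - 1 for
        # rows above corner_y; the second face sits down and to the right, so
        # its boundary with the blank region is the row corner_y for columns at
        # or beyond corner_x. Positions at equal distance d from the corner
        # along the two boundaries are grouped and filtered together.
        for d in range(length):
            first_set = frame[corner_y - 1 - d, corner_x - taps:corner_x][::-1]
            second_set = frame[corner_y:corner_y + taps, corner_x + d]
            f_first, f_second = deblock_filter_group(first_set, second_set)
            frame[corner_y - 1 - d, corner_x - taps:corner_x] = f_first[::-1]
            frame[corner_y:corner_y + taps, corner_x + d] = f_second
        return frame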
Further to the first embodiments, the individual 2D video frame comprises a compact cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first face projection and video frame edge boundary of the individual 2D video frame and extends toward an interior of the first face projection and the second set of pixels begins with the second individual pixel at a second face projection and a third face projection boundary and extends toward an interior of the second face projection.
Further to the first embodiments, the group of pixels comprises a single line of pixels and said deblock filtering the group of pixels comprises applying a low pass filter to the group of pixels.
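By way of example, the low pass filtering of a single line of pixels may be gated by the size of the discontinuity at the seam so that genuine content edges are not smeared; the 5-tap kernel, threshold value, and update window below are assumed heuristics for illustration only.

    import numpy as np

    def low_pass_deblock_line(line, threshold=32):
        # line is a 1-D numpy array ordered contiguously across the seam,
        # with the seam falling between samples mid - 1 and mid.
        mid = line.size // 2
        if abs(float(line[mid]) - float(line[mid - 1])) > threshold:
            return line  # large step: likely a genuine edge, leave untouched
        kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
        smoothed = np.convolve(line.astype(np.float64), kernel, mode="same")
        out = line.astype(np.float64).copy()
        # Replace only the samples nearest the seam; outer samples are kept.
        out[mid - 2:mid + 2] = smoothed[mid - 2:mid + 2]
        return out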
Further to the first embodiments, the group of pixels comprises a single line of pixels and the method further comprises determining, from the individual 2D video frame, a second group of pixels for deblock filtering, such that the second group of pixels comprises a third set of pixels and a fourth set of pixels of the individual 2D video frame, the third set of pixels and the fourth set of pixels are non-neighboring pixels in the individual 2D video frame and at least a third individual pixel of the third set of pixels and a fourth individual pixel of the fourth set of pixels comprise neighboring pixels in the 360 video space, such that the individual 2D video frame comprises an equirectangular frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a left boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame, the second set of pixels begins with the second individual pixel at a right boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, the third set of pixels begins with the third individual pixel at a first position of a top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, the fourth set of pixels begins with the fourth individual pixel at a second position of the top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary are equidistant from a center of the top boundary of the individual 2D video frame.
Further to the first embodiments, the individual 2D video frame comprises a reconstructed 2D video frame and the method further comprises differencing a portion of the 360 video space deblock filtered 2D video frame with a portion of an original 2D video frame to generate a residual portion, such that the 360 video space deblock filtered 2D video frame is a reference frame with respect to the original 2D video frame, transforming and quantizing the residual portion to determine quantized transform coefficients for the residual portion, and encoding the quantized transform coefficients into a bitstream.
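For illustration of the encode-side use of the 360 video space deblock filtered reference, the following sketch differences a co-located reference block against the original block, applies a 2-D DCT as a stand-in transform, and applies a flat quantizer; the block size, quantization step, and omission of entropy coding are simplifying assumptions.

    import numpy as np

    def dct_matrix(n):
        # Orthonormal DCT-II basis matrix of size n x n.
        k = np.arange(n).reshape(-1, 1)
        i = np.arange(n).reshape(1, -1)
        m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
        m[0, :] = np.sqrt(1.0 / n)
        return m

    def encode_block(ref_block, orig_block, qstep=16):
        # ref_block: co-located square block of the 360 video space deblock
        # filtered reference frame; orig_block: block of the original frame.
        residual = orig_block.astype(np.float64) - ref_block.astype(np.float64)
        d = dct_matrix(residual.shape[0])
        coeffs = d @ residual @ d.T                       # 2-D DCT of the residual
        quantized = np.round(coeffs / qstep).astype(np.int32)
        return quantized                                  # to be entropy coded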
Further to the first embodiments, the individual 2D video frame comprises a filtered reconstructed 2D video frame and the method further comprises decoding a bitstream to determine quantized transform coefficients for a residual portion of a reconstructed 2D video frame, inverse quantizing and inverse transforming the quantized transform coefficients to determine the residual portion, adding the residual portion to a prediction portion to generate a reconstructed portion of the reconstructed 2D video frame, in-frame deblock filtering the reconstructed 2D video frame to generate the filtered reconstructed 2D video frame, determining a portion of the 360 video space deblock filtered 2D video frame for display based on a viewport, and displaying the portion of the reconstructed 2D video frame to a user.
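The decode-side counterpart may be sketched, under the same assumptions, as inverse quantization, an inverse 2-D DCT, and addition of the prediction, followed (outside this sketch) by the in-frame and 360 video space deblock filtering already described, with a simple rectangular crop standing in for viewport rendering; dct_matrix is the hypothetical helper from the encode-side sketch above.

    def decode_block(quantized, pred_block, qstep=16):
        # Inverse quantize, inverse transform, and add the prediction to
        # reconstruct one block of the reconstructed 2D video frame.
        d = dct_matrix(quantized.shape[0])
        coeffs = quantized.astype(np.float64) * qstep     # inverse quantization
        residual = d.T @ coeffs @ d                       # inverse 2-D DCT
        recon = np.clip(pred_block.astype(np.float64) + residual, 0, 255)
        return recon.astype(np.uint8)

    def viewport_crop(filtered_frame, x, y, width, height):
        # Select the portion of the 360 video space deblock filtered frame
        # covered by the viewport; a rectangular crop is a simplification of
        # an actual viewport re-projection.
        return filtered_frame[y:y + height, x:x + width]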
Further to the first embodiments, the method further comprises differencing a portion of the 360 video space deblock filtered 2D video frame with a portion of a reconstructed video frame to generate a residual portion, transforming and quantizing the residual portion to determine quantized transform coefficients for the residual portion, and encoding the quantized transform coefficients into a bitstream.
In one or more second embodiments, a system for video coding comprises a memory to store an individual 2-dimensional (2D) video frame from a video sequence of 2D video frames, such that the individual 2D video frame comprises a projection from a 360 video space and a processor coupled to the memory, the processor to receive the individual 2-dimensional (2D) video frame, to determine, from the individual 2D video frame, a group of pixels for deblock filtering, such that the group of pixels comprises a first set of pixels and a second set of pixels of the individual 2D video frame, the first set of pixels and the second set of pixels are non-neighboring sets of pixels in the individual 2D video frame and at least a first individual pixel of the first set of pixels and a second individual pixel of the second set of pixels comprise neighboring pixels in the 360 video space, and to deblock filter the group of pixels comprising the first and second set of pixels to generate a 360 video space deblock filtered 2D video frame based on the individual 2D video frame.
Further to the second embodiments, the individual 2D video frame comprises one of an equirectangular frame projected from the 360 video space, a cube map format frame projected from the 360 video space, or a compact cube map format frame projected from the 360 video space.
Further to the second embodiments, the first set of pixels begins with the first individual pixel at a left boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame and the second set of pixels begins with the second individual pixel at a right boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame.
Further to the second embodiments, the individual 2D video frame comprises an equirectangular frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a top boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame, the second set of pixels begins with the second individual pixel at a second position of the top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary are equidistant from a center of the top boundary of the individual 2D video frame.
Further to the second embodiments, the individual 2D video frame comprises a cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a first face projection and a first blank pixel region boundary of the individual 2D video frame and extends toward an interior of the first face projection, the second set of pixels begins with the second individual pixel at a second position of a second face projection and a second blank pixel region boundary and extends toward an interior of the second face projection, and the first position and the second position are equidistant from an intersection of the first and second blank pixel region boundaries.
Further to the second embodiments, the individual 2D video frame comprises a compact cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first face projection and video frame edge boundary of the individual 2D video frame and extends toward an interior of the first face projection and the second set of pixels begins with the second individual pixel at a second face projection and a third face projection boundary and extends toward an interior of the second face projection.
Further to the second embodiments, the group of pixels comprises a single line of pixels and the processor to deblock filter the group of pixels comprises the processor to apply a low pass filter to the group of pixels.
Further to the second embodiments, the group of pixels comprises a single line of pixels and the processor is further to determine, from the individual 2D video frame, a second group of pixels for deblock filtering, such that the second group of pixels comprises a third set of pixels and a fourth set of pixels of the individual 2D video frame, the third set of pixels and the fourth set of pixels are non-neighboring pixels in the individual 2D video frame and at least a third individual pixel of the third set of pixels and a fourth individual pixel of the fourth set of pixels comprise neighboring pixels in the 360 video space, such that the individual 2D video frame comprises an equirectangular frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a left boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame, the second set of pixels begins with the second individual pixel at a right boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, the third set of pixels begins with the third individual pixel at a first position of a top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, the fourth set of pixels begins with the fourth individual pixel at a second position of the top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary are equidistant from a center of the top boundary of the individual 2D video frame.
Further to the second embodiments, the individual 2D video frame comprises a reconstructed 2D video frame and the processor is further to difference a portion of the 360 video space deblock filtered 2D video frame with a portion of an original 2D video frame to generate a residual portion, such that the 360 video space deblock filtered 2D video frame is a reference frame with respect to the original 2D video frame, to transform and quantize the residual portion to determine quantized transform coefficients for the residual portion, and to encode the quantized transform coefficients into a bitstream.
Further to the second embodiments, the individual 2D video frame comprises a filtered reconstructed 2D video frame and the processor is further to decode a bitstream to determine quantized transform coefficients for a residual portion of a reconstructed 2D video frame, to inverse quantize and inverse transform the quantized transform coefficients to determine the residual portion, to add the residual portion to a prediction portion to generate a reconstructed portion of the reconstructed 2D video frame, to in-frame deblock filter the reconstructed 2D video frame to generate the filtered reconstructed 2D video frame, to determine a portion of the 360 video space deblock filtered 2D video frame for display based on a viewport, and to display the portion of the reconstructed 2D video frame to a user.
Further to the second embodiments, the processor is further to difference a portion of the 360 video space deblock filtered 2D video frame with a portion of a reconstructed video frame to generate a residual portion, to transform and quantize the residual portion to determine quantized transform coefficients for the residual portion, and to encode the quantized transform coefficients into a bitstream.
In one or more third embodiments, a system comprises means for receiving an individual 2-dimensional (2D) video frame from a video sequence of 2D video frames, such that the individual 2D video frame comprises a projection from a 360 video space, means for determining, from the individual 2D video frame, a group of pixels for deblock filtering, such that the group of pixels comprises a first set of pixels and a second set of pixels of the individual 2D video frame, the first set of pixels and the second set of pixels are non-neighboring sets of pixels in the individual 2D video frame and at least a first individual pixel of the first set of pixels and a second individual pixel of the second set of pixels comprise neighboring pixels in the 360 video space, and means for deblock filtering the group of pixels comprising the first and second set of pixels to generate a 360 video space deblock filtered 2D video frame based on the individual 2D video frame.
Further to the third embodiments, the first set of pixels begins with the first individual pixel at a left boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame and the second set of pixels begins with the second individual pixel at a right boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame.
Further to the third embodiments, the individual 2D video frame comprises an equirectangular frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a top boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame, the second set of pixels begins with the second individual pixel at a second position of the top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary are equidistant from a center of the top boundary of the individual 2D video frame.
Further to the third embodiments, the individual 2D video frame comprises a cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a first face projection and a first blank pixel region boundary of the individual 2D video frame and extends toward an interior of the first face projection, the second set of pixels begins with the second individual pixel at a second position of a second face projection and a second blank pixel region boundary and extends toward an interior of the second face projection, and the first position and the second position are equidistant from an intersection of the first and second blank pixel region boundaries.
Further to the third embodiments, the individual 2D video frame comprises a compact cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first face projection and video frame edge boundary of the individual 2D video frame and extends toward an interior of the first face projection and the second set of pixels begins with the second individual pixel at a second face projection and a third face projection boundary and extends toward an interior of the second face projection.
Further to the third embodiments, the individual 2D video frame comprises a reconstructed 2D video frame and the system further comprises means for differencing a portion of the 360 video space deblock filtered 2D video frame with a portion of an original 2D video frame to generate a residual portion, such that the 360 video space deblock filtered 2D video frame is a reference frame with respect to the original 2D video frame, means for transforming and quantizing the residual portion to determine quantized transform coefficients for the residual portion, and means for encoding the quantized transform coefficients into a bitstream.
Further to the third embodiments, the individual 2D video frame comprises a filtered reconstructed 2D video frame and the system further comprises means for decoding a bitstream to determine quantized transform coefficients for a residual portion of a reconstructed 2D video frame, means for inverse quantizing and inverse transforming the quantized transform coefficients to determine the residual portion, means for adding the residual portion to a prediction portion to generate a reconstructed portion of the reconstructed 2D video frame, means for in-frame deblock filtering the reconstructed 2D video frame to generate the filtered reconstructed 2D video frame, means for determining a portion of the 360 video space deblock filtered 2D video frame for display based on a viewport, and means for displaying the portion of the reconstructed 2D video frame to a user.
In one or more fourth embodiments, at least one machine readable medium comprises a plurality of instructions that, in response to being executed on a computing device, cause the computing device to perform video coding by receiving an individual 2-dimensional (2D) video frame from a video sequence of 2D video frames, such that the individual 2D video frame comprises a projection from a 360 video space, determining, from the individual 2D video frame, a group of pixels for deblock filtering, such that the group of pixels comprises a first set of pixels and a second set of pixels of the individual 2D video frame, the first set of pixels and the second set of pixels are non-neighboring sets of pixels in the individual 2D video frame and at least a first individual pixel of the first set of pixels and a second individual pixel of the second set of pixels comprise neighboring pixels in the 360 video space, and deblock filtering the group of pixels comprising the first and second set of pixels to generate a 360 video space deblock filtered 2D video frame based on the individual 2D video frame.
Further to the fourth embodiments, the first set of pixels begins with the first individual pixel at a left boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame and the second set of pixels begins with the second individual pixel at a right boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame.
Further to the fourth embodiments, the individual 2D video frame comprises an equirectangular frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a top boundary of the individual 2D video frame and extends toward an interior of the individual 2D video frame, the second set of pixels begins with the second individual pixel at a second position of the top boundary of the individual 2D video frame and extends toward the interior of the individual 2D video frame, and the first position and the second position of the top boundary are equidistant from a center of the top boundary of the individual 2D video frame.
Further to the fourth embodiments, the individual 2D video frame comprises a cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first position of a first face projection and a first blank pixel region boundary of the individual 2D video frame and extends toward an interior of the first face projection, the second set of pixels begins with the second individual pixel at a second position of a second face projection and a second blank pixel region boundary and extends toward an interior of the second face projection, and the first position and the second position are equidistant from an intersection of the first and second blank pixel region boundaries.
Further to the fourth embodiments, the individual 2D video frame comprises a compact cube map format frame projected from the 360 video space, the first set of pixels begins with the first individual pixel at a first face projection and video frame edge boundary of the individual 2D video frame and extends toward an interior of the first face projection and the second set of pixels begins with the second individual pixel at a second face projection and a third face projection boundary and extends toward an interior of the second face projection.
Further to the fourth embodiments, the individual 2D video frame comprises a reconstructed 2D video frame and the machine readable medium further comprises a plurality of instructions that, in response to being executed on the computing device, cause the computing device to perform video coding by differencing a portion of the 360 video space deblock filtered 2D video frame with a portion of an original 2D video frame to generate a residual portion, such that the 360 video space deblock filtered 2D video frame is a reference frame with respect to the original 2D video frame, transforming and quantizing the residual portion to determine quantized transform coefficients for the residual portion, and encoding the quantized transform coefficients into a bitstream.
Further to the fourth embodiments, the individual 2D video frame comprises a filtered reconstructed 2D video frame and the machine readable medium further comprises a plurality of instructions that, in response to being executed on the computing device, cause the computing device to perform video coding by decoding a bitstream to determine quantized transform coefficients for a residual portion of a reconstructed 2D video frame, inverse quantizing and inverse transforming the quantized transform coefficients to determine the residual portion, adding the residual portion to a prediction portion to generate a reconstructed portion of the reconstructed 2D video frame, in-frame deblock filtering the reconstructed 2D video frame to generate the filtered reconstructed 2D video frame, determining a portion of the 360 video space deblock filtered 2D video frame for display based on a viewport, and displaying the portion of the reconstructed 2D video frame to a user.
In one or more fifth embodiments, at least one machine readable medium may include a plurality of instructions that in response to being executed on a computing device, causes the computing device to perform a method according to any one of the above embodiments.
In one or more sixth embodiments, an apparatus or system may include means for performing a method according to any one of the above embodiments.
It will be recognized that the embodiments are not limited to those so described, but can be practiced with modification and alteration without departing from the scope of the appended claims. For example, the above embodiments may include a specific combination of features. However, the above embodiments are not limited in this regard and, in various implementations, the above embodiments may include undertaking only a subset of such features, undertaking a different order of such features, undertaking a different combination of such features, and/or undertaking additional features beyond those features explicitly listed. The scope of the embodiments should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.