Processing of equirectangular object data to compensate for distortion by spherical projections

Description

BACKGROUND

The present disclosure relates to coding techniques for omnidirectional and multi-directional images and videos.

Some modern imaging applications capture image data from multiple directions about a camera. Some cameras pivot during image capture, which allows a camera to capture image data across an angular sweep that expands the camera's effective field of view. Some other cameras have multiple imaging systems that capture image data in several different fields of view. In either case, an aggregate image may be created that represents a merger or “stitching” of image data captured from these multiple views.

Many modern coding applications are not designed to process such omnidirectional or multi-directional image content. Such coding applications are designed based on an assumption that image data within an image is “flat” or captured from a single field of view. Thus, the coding applications do not account for image distortions that can arise when processing these omnidirectional or multi-directional images with the distortions contained within them. These distortions can cause ordinary video coders to fail to recognize redundancies in image content, which leads to inefficient coding.

Accordingly, the inventors perceive a need in the art for coding techniques that can process omnidirectional and multi-directional image content and limit distortion.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a system in which embodiments of the present disclosure may be employed.

FIG. 2 is a functional block diagram of a coding system according to an embodiment of the present disclosure.

FIG. 3 illustrates image sources that find use with embodiments of the present disclosure.

FIG. 4 illustrates an exemplary equirectangular projection image captured by multi-directional imaging.

FIG. 5 models distortion effects that may arise in spherical images.

FIG. 6 is a graph illustrating distortion of an exemplary object in an exemplary equirectangular frame.

FIG. 7 illustrates a coding method according to an embodiment of the present disclosure.

FIG. 8 illustrates a coding method according to an embodiment of the present disclosure.

FIG. 9 illustrates transforms that may be applied to reference frame data according to the method of FIG. 8.

FIG. 10 is a functional block diagram of a coding system according to an embodiment of the present disclosure.

FIG. 11 is a functional block diagram of a decoding system according to an embodiment of the present disclosure.

FIG. 12 illustrates a computer system suitable for use with embodiments of the present disclosure.

DETAILED DESCRIPTION

Embodiments of the present disclosure provide techniques for coding spherical image and video. For each pixel block in a frame to be coded, an encoder may transform reference picture data within a search window about a location of the input pixel block based on displacement respectively between the location of the input pixel block and portions of the reference picture within the search window. The encoder may perform a prediction search among the transformed reference picture data to identify a match between the input pixel block and a portion of the transformed reference picture and, when a match is identified, the encoder may code the input pixel block differentially with respect to the matching portion of the transformed reference picture. The transform may counter-act distortions imposed on image content of the reference picture data by the spherical projection format, which aligns the content with image content of the input picture.

FIG. 1 illustrates a system 100 in which embodiments of the present disclosure may be employed. The system 100 may include at least two terminals 110-120 interconnected via a network 130. The first terminal 110 may have an image source that generates multi-directional and omnidirectional video. The terminal 110 also may include coding systems and transmission systems (not shown) to transmit coded representations of the multi-directional video to the second terminal 120, where it may be consumed. For example, the second terminal 120 may display the spherical video on a local display, may execute a video editing program to modify the spherical video, may integrate the spherical video into an application (for example, a virtual reality program), may present the spherical video in a head mounted display (for example, in virtual reality applications), or may store the spherical video for later use.

FIG. 1 illustrates components that are appropriate for unidirectional transmission of spherical video, from the first terminal 110 to the second terminal 120. In some applications, it may be appropriate to provide for bidirectional exchange of video data, in which case the second terminal 120 may include its own image source, video coder and transmitters (not shown), and the first terminal 110 may include its own receiver and display (also not shown). If it is desired to exchange spherical video bidirectionally, then the techniques discussed hereinbelow may be replicated to generate a pair of independent unidirectional exchanges of spherical video. In other applications, it would be permissible to transmit spherical video in one direction (e.g., from the first terminal 110 to the second terminal 120) and transmit “flat” video (e.g., video from a limited field of view) in a reverse direction.

In FIG. 1, the second terminal 120 is illustrated as a computer display but the principles of the present disclosure are not so limited. Embodiments of the present disclosure find application with laptop computers, tablet computers, smart phones, servers, media players, virtual reality head mounted displays, augmented reality displays, hologram displays, and/or dedicated video conferencing equipment. The network 130 represents any number of networks that convey coded video data among the terminals 110-120, including, for example, wireline and/or wireless communication networks. The communication network 130 may exchange data in circuit-switched and/or packet-switched channels. Representative networks include telecommunications networks, local area networks, wide area networks and/or the Internet. For the purposes of the present discussion, the architecture and topology of the network 130 are immaterial to the operation of the present disclosure unless explained hereinbelow.

FIG. 2 is a functional block diagram of a coding system 200 according to an embodiment of the present disclosure. The system 200 may include an image source 210, an image processing system 220, a video coder 230, a video decoder 240, a reference picture store 250, a predictor 260 and, optionally, a pair of spherical transform units 270, 280. The image source 210 may generate image data as a multi-directional image, containing image data of a field of view that extends around a reference point in multiple directions. The image processing system 220 may convert the image data from the image source 210 as needed to fit requirements of the video coder 230. The video coder 230 may generate a coded representation of its input image data, typically by exploiting spatial and/or temporal redundancies in the image data. The video coder 230 may output a coded representation of the input data that consumes less bandwidth than the input data when transmitted and/or stored.

The video decoder 240 may invert coding operations performed by the video coder 230 to obtain a reconstructed picture from the coded video data. Typically, the coding processes applied by the video coder 230 are lossy processes, which cause the reconstructed picture to possess various errors when compared to the original picture. The video decoder 240 may reconstruct select coded pictures, which are designated as “reference pictures,” and store the decoded reference pictures in the reference picture store 250. In the absence of transmission errors, the decoded reference pictures will replicate decoded reference pictures obtained by a decoder (not shown in FIG. 2).

The predictor 260 may select prediction references for new input pictures as they are coded. For each portion of the input picture being coded (called a “pixel block” for convenience), the predictor 260 may select a coding mode and identify a portion of a reference picture that may serve as a prediction reference for the pixel block being coded. The coding mode may be an intra-coding mode, in which case the prediction reference may be drawn from a previously-coded (and decoded) portion of the picture being coded. Alternatively, the coding mode may be an inter-coding mode, in which case the prediction reference may be drawn from another previously-coded and decoded picture.

In an embodiment, the predictor 260 may search for prediction references of pictures being coded by operating on an input picture and a reference picture that have been transformed to a spherical projection representation. The spherical transform units 270, 280 may transform the input picture and the reference picture to the spherical projection representations.

When an appropriate prediction reference is identified, the predictor 260 may furnish the prediction data to the video coder 230. The video coder 230 may code input video data differentially with respect to prediction data furnished by the predictor 260. Typically, prediction operations and the differential coding operate on a pixel block-by-pixel block basis. Prediction residuals, which represent pixel-wise differences between the input pixel blocks and the prediction pixel blocks, may be subject to further coding operations to reduce bandwidth further.

As indicated, the coded video data output by the video coder 230 should consume less bandwidth than the input data when transmitted and/or stored. The coding system 200 may output the coded video data to an output device 290, such as a transmitter (not shown) that may transmit the coded video data across a communication network 130 (FIG. 1) or a storage device (also not shown) such as an electronic-, magnetic- and/or optical storage medium.

FIG. 3 illustrates image sources 310, 340 that find use with embodiments of the present disclosure. A first image source may be a camera 310, shown in FIG. 3(a), that has a single image sensor (not shown) that pivots along an axis. During operation, the camera 310 may capture image content as it pivots along a predetermined angular distance (preferably, a full 360 degrees) and merge the captured image content into a 360° image. The capture operation may yield an equirectangular image 320 having predetermined dimension M×N pixels. The equirectangular picture 320 may represent a multi-directional field of view 320 having been partitioned along a slice 322 that divides a cylindrical field of view into a two dimensional array of data. In the equirectangular picture 320, pixels on either edge 322, 324 of the image 320 represent adjacent image content even though they appear on different edges of the equirectangular picture 320.

Optionally, the equirectangular image 320 may be transformed to a spherical projection. The spherical transform unit 270 may transform pixel data at locations (x,y) within the equirectangular picture 320 to locations (θ, φ) along a spherical projection 330 according to a transform such as:

θ=x+θ₀, and (Eq. 1.)
φ=y+φ₀, where (Eq. 2.)

θ and φ respectively represent the longitude and latitude of a location in the spherical projection 330, θ₀ and φ₀ represent an origin of the spherical projection 330, and x and y represent the horizontal and vertical coordinates of the source data in the equirectangular picture 320.

When applying the transform, the spherical transform unit 270 may transform each pixel location along a predetermined row of the equirectangular picture 320 to have a unique location at an equatorial latitude in the spherical projection 330. In such regions, each location in the spherical projection 330 may be assigned pixel values from corresponding locations of the equirectangular picture 320. At other locations, particularly toward poles of the spherical projection 330, the spherical transform unit 270 may map several source locations from the equirectangular picture 320 to a common location in the spherical projection 330. In such a case, the spherical transform unit 270 may derive pixel values for the locations in the spherical projection 330 from a blending of corresponding pixel values in the equirectangular picture 320 (for example, by averaging pixel values at corresponding locations of the equirectangular picture 320).

FIG. 3(b) illustrates image capture operations of another type of image source, an omnidirectional camera 340. In this embodiment, a camera system 340 may perform a multi-directional capture operation and output a cube map picture 360 having dimensions M×N pixels in which image content is arranged according to a cube map capture 350. The image capture may capture image data in each of a predetermined number of directions (typically, six) which are stitched together according to the cube map layout. In the example illustrated in FIG. 3, six sub-images corresponding to a left view 361, a front view 362, a right view 363, a back view 364, a top view 365 and a bottom view 366 may be captured, stitched and arranged within the multi-directional picture 360 according to “seams” of image content between the respective views. Thus, as illustrated in FIG. 3, pixels from the front image that are adjacent to the pixels from each of the top, the left, the right and the bottom images represent image content that is adjacent respectively to content of the adjoining sub-images. Similarly, pixels from the right and back images that are adjacent to each other represent adjacent image content. Further, content from a terminal edge 368 of the back image is adjacent to content from an opposing terminal edge 369 of the left image. The cube map picture 360 also may have regions 367.1-367.4 that do not belong to any image.

Optionally, the cube map image 360 may be transformed to a spherical projection 330. The spherical transform unit 270 may transform pixel data at locations (x,y) within the cube map picture 360 to locations (θ, φ) along a spherical projection 330 according to transforms derived from each sub-image in the cube map. FIG. 3 illustrates six faces 361-366 of the image capture 360 superimposed over the spherical projection 330 that is to be generated. Each sub-image of the image capture corresponds to a predetermined angular region of a surface of the spherical projection 330. Thus, image data of the front face 362 may be projected to a predetermined portion on the surface of the spherical projection, and image data of the left, right, back, top and bottom sub-images may be projected on corresponding portions of the surface of the spherical projection 330.

In a cube map having square sub-images, that is, height and width of the sub-images 361-366 are equal, each sub-image projects to a 90°×90° region of the projection surface. Thus, each position x,y within a sub-image maps to a θ, φ location on the spherical projection 330 based on a sinusoidal projection function of the form φ=f^k(x, y) and θ=g^k(x, y), where x,y represent displacements from a center of the cube face k (for top, bottom, front, back, left and right faces) and θ, φ represent angular deviations in the sphere.
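The per-face functions f^k and g^k are not spelled out here. The following Python sketch shows one conventional way such a mapping could be written, assuming x and y are normalized to [-1, 1] from the face center with y increasing upward. The face orientations in FACE_DIRECTIONS and the function name cube_to_sphere are illustrative assumptions, not definitions taken from this disclosure.

```python
import math

# Hypothetical face orientations: each entry turns a normalized face position
# (x, y in [-1, 1], measured from the face center, y pointing up) into a 3-D
# direction from the center of the sphere.
FACE_DIRECTIONS = {
    "front":  lambda x, y: (1.0, x, y),
    "right":  lambda x, y: (-x, 1.0, y),
    "back":   lambda x, y: (-1.0, -x, y),
    "left":   lambda x, y: (x, -1.0, y),
    "top":    lambda x, y: (-y, x, 1.0),
    "bottom": lambda x, y: (y, x, -1.0),
}

def cube_to_sphere(face, x, y):
    """Return (theta, phi): longitude and latitude in radians for a face position."""
    vx, vy, vz = FACE_DIRECTIONS[face](x, y)
    theta = math.atan2(vy, vx)                # longitude
    phi = math.atan2(vz, math.hypot(vx, vy))  # latitude
    return theta, phi
```

Under these assumptions, the center of the front face maps to the origin (0, 0) and the shared edge between the front and right faces maps to a longitude of 45°, which agrees with each face spanning a 90°×90° region.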

When applying the transform, some pixel locations in the cube map picture 360 may map to a unique location in the spherical projection 330. In such regions, each location in the spherical projection 330 may be assigned pixel values from corresponding locations of the cube map picture 360. At other locations, particularly toward edges of the respective sub-images, the spherical transform unit 270 may map image data from several source locations in the cube map picture 360 to a common location in the spherical projection 330. In such a case, the spherical transform unit 270 may derive pixel values for the locations in the spherical projection 330 from a blending of corresponding pixel values in the cube map picture 360 (for example, by a weighted averaging of pixel values at corresponding locations of the cube map picture 360).

FIG. 3(c) illustrates image capture operations of another type of image source, a camera 370 having a pair of fish-eye lenses. In this embodiment, each lens system captures data in a different 180° field of view, representing opposed “half shells.” The camera 370 may generate an image 380 from a stitching of images generated from each lens system. Fish eye lenses typically induce distortion based on object location within each half shell field of view. In an embodiment, the multi-directional image 380 may be transformed to a spherical projection 330.

The techniques of the present disclosure find application with other types of image capture techniques. For example, truncated pyramid-, tetrahedral-, octahedral-, dodecahedral- and icosahedral-based image capture techniques may be employed. Images obtained therefrom may be mapped to a spherical projection through analogous techniques.

Image sources need not include cameras. In other embodiments, an image source 210 (FIG. 2) may be a computer application that generates 360° image data. For example, a gaming application may model a virtual world in three dimensions and generate a spherical image based on synthetic content. And, of course, a spherical image may contain both natural content (content generated from a camera) and synthetic content (computer graphics content) that has been merged together by a computer application.

Multi-directional imaging systems typically generate image data that contains spatial distortions of image content. FIG. 4 illustrates an exemplary equirectangular image captured by a multi-directional imaging system. The image illustrates, among other things, two objects Obj1 and Obj2, each of the same size. When captured by a multi-directional imaging system, the objects appear to have different sizes based on their location in the equirectangular image. For example, object Obj1 is located fairly close to central axes 410, 420 and, as a result, exhibits a lower level of distortion than the object Obj2. Even so, edges of the object Obj1 exhibit distortion (curvature of straight lines) to a larger degree than portions of the object that are closer to the horizontal axis 410. Object Obj2 is displaced from the horizontal axis 410 much farther than any portion of the object Obj1 and, as a consequence, exhibits distortions both of the object's height, which is approximately 32% of the height of object Obj1 in the illustration of FIG. 4, and of the curvature of its horizontal image components.

FIG. 5 models distortion effects that may arise in spherical image projections. In two dimensional, “flat” video, lateral motion of an object is captured by a flat image sensor, which causes the size of a moving object to remain consistent. When such image data is projected onto a spherical surface, object motion can cause distortion of image data. Consider the example shown in FIG. 5, where an object 510 having a length l moves from a position at the center of object's motion plane 520 to another position away from the center by a distance y. For discussion purposes, it may be assumed that the object 510 is located at a common distance d from a center of the spherical projection.

Mathematically, the distortion can be modeled as follows:

$\begin{matrix} \tan (a) = \frac{l}{d} & (1) \\ \tan (\Phi) = \frac{y}{d} & (2) \\ \tan (\Phi + b) = \frac{y + l}{d} & (3) \\ b = \tan^{- 1} \left(\frac{y + l}{d}\right) - \Phi = \tan^{- 1} \left(\frac{y + l}{d}\right) - \tan^{- 1} \left(\frac{y}{d}\right) & (4) \end{matrix}$

Thus, when an object moves from the center y₀ of a projection field of view by a distance y, the ratio of the object's length l in the spherical projection may be given as:

$\begin{matrix} \frac{b}{a} = \frac{\tan^{- 1} \left(\frac{y + l}{d}\right) - \tan^{- 1} \left(\frac{y}{d}\right)}{a} = \frac{\tan^{- 1} \left(\frac{y + l}{d}\right) - \tan^{- 1} \left(\frac{y}{d}\right)}{\tan^{- 1} \left(\frac{y_{0} + l}{d}\right) - \tan^{- 1} \left(\frac{y_{0}}{d}\right)} & (5) \end{matrix}$

Stated in simpler terms, the object's apparent length varies based on its displacement from the center of the projection.

FIG. 6 is a graph illustrating distortion of an exemplary object in an exemplary equirectangular frame. Here, the equirectangular image is of size 3,820 pixels by 1,920 pixels. In the spherical projection, each angular unit of the sphere, therefore, may be taken as

$\frac{\pi}{1920}$

radians and the length l is the height of a single pixel, equal to 1. The distance d may be taken as

$\frac{1}{\tan \left(\frac{\pi}{1920}\right)} .$

FIG. 6 illustrates distortion of the length l as y changes from 0 to 960.
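The ratio of Equation (5) can be evaluated numerically for the FIG. 6 scenario. The following Python sketch is illustrative only. It assumes an angular unit of π/1920 radians per pixel and takes d = 1/tan(π/1920), the value Equation (1) implies when l = 1. The names length_ratio, ANGULAR_UNIT, D and L_PIXEL are hypothetical.

```python
import math

def length_ratio(y, l, d, y0=0.0):
    """Ratio b/a of Eq. (5): angular extent at offset y over the extent at y0."""
    b = math.atan((y + l) / d) - math.atan(y / d)
    a = math.atan((y0 + l) / d) - math.atan(y0 / d)
    return b / a

# Assumed FIG. 6 parameters: one pixel subtends pi/1920 radians at the center.
ANGULAR_UNIT = math.pi / 1920
D = 1.0 / math.tan(ANGULAR_UNIT)  # distance d implied by Eq. (1) with l = 1
L_PIXEL = 1.0                     # length l: the height of a single pixel

# Sample the ratio at a few vertical displacements between 0 and 960.
ratios = [length_ratio(y, L_PIXEL, D) for y in range(0, 961, 240)]
```

Under these assumptions the ratio equals 1 at y = 0 and decreases monotonically as y approaches 960, consistent with the statement that apparent length varies with displacement from the center of the projection.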

As illustrated in FIG. 4, the distortions described in FIG. 6 and in Equations (1)-(5) can occur in multiple dimensions simultaneously. Thus, distortions may arise in a vertical direction when an object 410 moves in a vertical direction with respect to the equirectangular source image. Additional distortions may arise in a horizontal direction when an object moves in a horizontal direction with respect to the equirectangular source image. Thus, the equations (1)-(5) above can be applied to lateral movement in a horizontal direction X as:

$\begin{matrix} \tan (a) = \frac{w}{d} & (6) \\ \tan (\Phi) = \frac{x}{d} & (7) \\ \tan (\Phi + b) = \frac{x + w}{d} & (8) \\ b = \tan^{- 1} \left(\frac{x + w}{d}\right) - \Phi = \tan^{- 1} \left(\frac{x + w}{d}\right) - \tan^{- 1} \left(\frac{x}{d}\right) & (9) \end{matrix}$

Thus, when an object moves from the center x₀ of a projection field of view by a distance x, the ratio of the object's width w in the spherical projection may be given as:

$\begin{matrix} \frac{b}{a} = \frac{\tan^{- 1} (\frac{x + w}{d}) - \tan^{- 1} (\frac{x}{d})}{a} = \frac{\tan^{- 1} (\frac{x + w}{d}) - \tan^{- 1} (\frac{x}{d})}{\tan^{- 1} (\frac{x_{0} + w}{d}) - \tan^{- 1} (\frac{x_{0}}{d})} & (10) \end{matrix}$

According to an embodiment of the present disclosure, a terminal may model distortions that are likely to occur in image data when objects are projected to spherical domain representation or equirectangular representation, then use the model to correct data in the spherical-domain or equirectangular representation to counteract the distortions.

At a high level, embodiments of the present disclosure perform transforms on candidate reference frame data to invert distortions that occur in multi-directional images. For example, returning to FIG. 4, if image data of object Obj2 were present in a reference frame, the image data of object Obj2 could serve as an adequate prediction reference of object Obj1 that appears in an input frame to be coded. The two objects have the same image content and, absent distortions that arise from the imaging process, the same size. Embodiments of the present disclosure transform reference picture data according to the relationships identified in Equations (5) and (10) to generate transformed reference picture data that may provide a better fit to image data being coded.

FIG. 7 illustrates a coding method 700 according to an embodiment of the present disclosure. The method 700 may operate on a pixel-block by pixel-block basis to code a new input picture that is to be coded. The method 700 may perform a prediction search (box 710) from a comparison between input pixel block data and reference picture data that is transformed to counter-act imaging distortion. When an appropriate prediction reference is found, the method 700 may code the input pixel block differentially using the transformed reference picture data (the “reference block,” for convenience) as a basis for prediction (box 720). Typically, this differential coding includes a calculation of pixel residuals from a pixel-wise subtraction of prediction block data from the input pixel block data (box 722) and a transformation, quantization and entropy coding of the pixel residuals obtained therefrom (box 724). In this regard, the method 700 may adhere to coding protocols defined by a prevailing coding specification, such as ITU H.265 (also known as “HEVC”), H.264 (also, “AVC”) or a predecessor coding specification. These specifications define protocols for defining pixel blocks, defining search windows for prediction references, and for performing differential coding of pixel blocks with reference to reference blocks. The method 700 also may transform spherical-domain representation of the motion vector to a coder-domain representation, the representation used by the video coding specification (box 726). The method 700 may output the coded pixel residuals, motion vectors and other metadata associated with prediction (typically, coding mode indicators and reference picture IDs) (box 728).

The prediction search (box 710) may include a transform of reference picture data to invert imaging-induced distortion. For each candidate motion vector available in a search window of the prediction search, the method 700 may transform the reference frame based on spatial displacement represented by the motion vector from the input pixel block (box 712). The method 700 may estimate prediction residuals that would be obtained if the candidate motion vector were used (box 714). These computations may be performed by a pixel-wise comparison of the input pixel block and the transformed reference frame that corresponds to the motion vector. Typically, when the comparisons generate pixel residuals of high magnitude and high variance, it indicates lower coding efficiencies than comparisons of other reference blocks that generate pixel residuals having lower magnitude and lower variance. The method 700 also may estimate coding distortions that would arise if the transformed reference block were used (box 716). These computations may be performed by estimating loss of pixel residuals based on quantization parameter levels that are predicted to be applied to the input pixel block. Once estimates have been obtained for all candidate motion vectors under consideration, the method 700 may select the motion vector that minimizes overall coding cost (box 718).
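The per-candidate loop of boxes 712 and 714 can be sketched as follows. This Python example is a minimal illustration, not the disclosed implementation. It assumes frames are lists of pixel rows, uses a sum of absolute differences as the residual estimate, and takes the distortion-inverting transform as a caller-supplied callable (here named transform, a hypothetical parameter). Candidates are assumed to keep the reference block inside the frame.

```python
def sad(block_a, block_b):
    """Sum of absolute pixel differences between two equally sized blocks."""
    return sum(abs(p - q)
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))

def extract_block(frame, top, left, height, width):
    """Cut a height x width block out of a frame stored as a list of rows."""
    return [row[left:left + width] for row in frame[top:top + height]]

def estimate_residuals(input_block, block_pos, reference, candidates, transform):
    """Return {mv: SAD} for each candidate motion vector mv = (dx, dy)."""
    top, left = block_pos
    height, width = len(input_block), len(input_block[0])
    estimates = {}
    for dx, dy in candidates:
        # Box 712: transform the reference frame for this displacement
        # (transform is a hypothetical callable supplied by the caller).
        transformed = transform(reference, (dx, dy))
        candidate = extract_block(transformed, top + dy, left + dx, height, width)
        # Box 714: estimate residuals by pixel-wise comparison.
        estimates[(dx, dy)] = sad(input_block, candidate)
    return estimates
```

With an identity transform, a candidate whose reference block equals the input block yields an estimate of zero, and candidates with larger pixel-wise differences yield larger estimates.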

For example, the coding cost J of an input pixel block with reference to a candidate “reference block” BLK_mv that is generated according to a motion vector mv may be given as:

J=Bits(BLK_mv)+k*DIST(BLK_mv), where (11)

Bits(BLK_mv) represents a number of bits estimated to be required to code the input pixel block with reference to the reference block BLK_mv, DIST(BLK_mv) represents the distortion that would be obtained from coding the input pixel block with reference to the reference block BLK_mv, and k may be an operator-selected scalar to balance contribution of these factors. As explained, the method 700 may be performed to select a motion vector that minimizes the value J.
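Selection of the motion vector that minimizes J in Equation (11) can be sketched in Python as shown below. The estimator callables estimate_bits and estimate_dist stand in for Bits(BLK_mv) and DIST(BLK_mv). They are hypothetical placeholders, since the way an encoder derives those estimates is implementation-specific.

```python
def coding_cost(bits, dist, k):
    """Eq. (11): J = Bits(BLK_mv) + k * DIST(BLK_mv)."""
    return bits + k * dist

def select_motion_vector(candidates, estimate_bits, estimate_dist, k):
    """Return the candidate motion vector with the smallest coding cost J.

    estimate_bits and estimate_dist are hypothetical callables that map a
    motion vector to the estimated bit count and distortion, respectively.
    """
    return min(
        candidates,
        key=lambda mv: coding_cost(estimate_bits(mv), estimate_dist(mv), k),
    )
```

A larger k weights distortion more heavily relative to bit count, so the scalar shifts the selection toward candidates that preserve more image information at the expense of bandwidth.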

In an embodiment, the transforms may be performed to invert the distortions represented by equations (5) and (10).

The embodiment of FIG. 7 involves one transform of reference frame data for each candidate motion vector under consideration. In other embodiments, reference frame preprocessing may be performed, which may conserve processing resources.

FIG. 8 illustrates a coding method 800 according to an embodiment of the present disclosure. The method 800 may operate on a pixel-block by pixel-block basis to code a new input picture that is to be coded. The method 800 may perform a prediction search (box 810) from a comparison between input pixel block data and reference picture data that is transformed to counter-act imaging distortion. When an appropriate prediction reference is found, the method 800 may code the input pixel block differentially using the transformed reference picture data (again, the “reference block,” for convenience) as a basis for prediction (box 820). Typically, this differential coding includes a calculation of pixel residuals from a pixel-wise subtraction of prediction block data from the input pixel block data (box 822) and a transformation, quantization and entropy coding of the pixel residuals obtained therefrom (box 824). In this regard, the method 800 may adhere to coding protocols defined by a prevailing coding specification, such as ITU H.265 (also known as “HEVC”), H.264 (also, “AVC”) or a predecessor coding specification. These specifications define protocols for defining pixel blocks, defining search windows for prediction references, and for performing differential coding of pixel blocks with reference to reference blocks. The method 800 also may transform spherical-domain representation of the motion vector to a coder-domain representation, the representation used by the video coding specification (box 826). The method 800 may output the coded pixel residuals, motion vectors and other metadata associated with prediction (typically, coding mode indicators and reference picture IDs) (box 828).

In an embodiment, the prediction search (box 810) may be performed to balance bandwidth conservation and information losses with processing resource costs. For each candidate motion vector mv, the method 800 first may transform the reference picture in relation to the input pixel block along a vertical direction y (box 811). This transform essentially transforms reference picture data within a search window of the prediction search based on its vertical displacement from the input pixel block being coded. Thereafter, the method 800, for each candidate x value of the search window, may estimate prediction residuals that would arise if the motion vector were used (box 812) and further may estimate the resulting distortion (box 813). Thereafter, the method 800 may transform the reference picture in relation to the input pixel block along a horizontal direction x (box 814). This transform essentially transforms reference picture data within a search window of the prediction search based on its horizontal displacement from the input pixel block being coded. The method 800, for each candidate y value of the search window, may estimate prediction residuals that would arise if the motion vector were used (box 815) and further may estimate the resulting distortion (box 816). Once estimates have been obtained for all candidate motion vectors under consideration, the method 800 may select the motion vector that minimizes overall coding cost (box 818).

As indicated, the transforms performed in boxes 811 and 814 essentially cause a transform that aligns reference image data with the input pixel blocks on a row-basis (box 811) and a column-basis (box 814). Results of these transforms may be re-used for coding of other input pixel blocks that also are aligned with the input pixel blocks on a row-basis or column-basis respectively. In other words, a system employing the method 800 of FIG. 8 may perform a single transform under box 811 to estimate coding cost and distortion for all input pixel blocks in a common row. Further, a system employing the method 800 of FIG. 8 may perform a single transform under box 814 to estimate coding cost and distortion for all input pixel blocks in a common column. Thus, the operation of method 800 is expected to conserve processing resources over operation of the method 700 of FIG. 7.
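The re-use of a row-aligned transform across pixel blocks can be illustrated with simple memoization. The Python sketch below is an assumption-laden illustration: vertical_transform is a hypothetical callable that produces transformed reference data for a given pixel block row, and the wrapper only shows that blocks sharing a row trigger a single transform.

```python
from functools import lru_cache

def make_row_cache(vertical_transform):
    """Wrap a per-row transform so each row index is transformed only once.

    vertical_transform is a hypothetical callable taking a pixel block row
    index and returning transformed reference data for that row (box 811).
    """
    @lru_cache(maxsize=None)
    def transformed_for_row(row_index):
        return vertical_transform(row_index)
    return transformed_for_row
```

The same wrapper could be applied to a column-indexed transform for box 814. In either case the saving grows with the number of pixel blocks that share a row or column.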

FIG. 9 illustrates transforms that may be applied to reference frame data according to the method 800 of FIG. 8. FIG. 9(a) illustrates relationships between an exemplary input pixel block PB_i,j to be coded and reference frame data 900. The input pixel block PB_i,j has a location i,j that defines a search window SW from which a coder may select reference frame data 900 to be used as a basis for prediction of the pixel block PB_i,j. During coding, the method 800 may test candidate motion vectors mv1, mv2, etc. within the search window SW to determine whether an adequate reference block may be found in the reference picture.

FIG. 9(b) illustrates exemplary transforms of the reference frame data that may be performed according to box 811. As illustrated, reference frame data may be transformed based on a vertical displacement between the pixel block PB_i,j being coded and reference frame data. In the example illustrated in FIG. 9, the transformation essentially stretches reference frame content based on the vertical displacement. The degree of stretching increases as displacement from the input pixel block increases. The method may test candidate motion vectors within the stretched reference frame data 910 rather than the source reference frame data 900. As illustrated in FIG. 4, the stretched data of object Obj2 may provide a better source of prediction for object Obj1 than the source data of object Obj2.

In other use cases, image data need not be stretched. For example, during coding of image content of object Obj2 in FIG. 4, a reference frame may contain content of the object at a location corresponding to object Obj1. In this case, image data from the reference frame may be spatially condensed to provide an appropriate prediction match to the object Obj2. Thus, the type of stretching, whether expansion or contraction, may be determined based on the displacement between the pixel block PB_i,j being coded and the reference frame data and also the location of the pixel block PB_i,j being coded.
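The dependence on both displacement and block location can be illustrated with a short sketch. It assumes, purely for illustration, that content in an equirectangular row widens in proportion to 1 / cos(latitude); the functions latitude, horizontal_scale and stretch_type are hypothetical names and do not reproduce the equations of the foregoing embodiments.

```python
import math

def latitude(y, height):
    """Latitude in radians of pixel row y in an equirectangular frame."""
    return (0.5 - (y + 0.5) / height) * math.pi

def horizontal_scale(input_y, ref_y, height):
    """Assumed scale applied to reference row ref_y so that it resembles
    content at input row input_y."""
    return math.cos(latitude(ref_y, height)) / math.cos(latitude(input_y, height))

def stretch_type(input_y, ref_y, height):
    """Classify the transform as expansion, contraction, or none."""
    scale = horizontal_scale(input_y, ref_y, height)
    if scale > 1.0:
        return "expansion"
    if scale < 1.0:
        return "contraction"
    return "none"
```

Under this assumed model, a reference row nearer the equator than the input row is expanded, the reverse case is contracted, and the same ten-row displacement produces a much larger scale factor near a pole than near the equator.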

As illustrated in FIG. 9(b), the method 800 may perform a single transformation of reference frame data 910 that serves for prediction searches of all pixel blocks PB_0,j-PB_max,j in a common row. Thus, method 800 of FIG. 8 is expected to conserve processing resources as compared to the method 700 of FIG. 7.

FIG. 9(c) illustrates exemplary transforms of the reference frame data that may be performed according to box 814. As illustrated, reference frame data 900 may be transformed based on a horizontal displacement between the pixel block PB_i,j being coded and reference frame data. In the example illustrated in FIG. 9(c), the transformation essentially stretches reference frame content based on the horizontal displacement. The degree of stretching increases as displacement from the input pixel block increases. The method may test candidate motion vectors within the stretched reference frame data 920 rather than the source reference frame data 900.

Image data need not be stretched in all cases. As with the example of FIG. 9(b), the type of stretching, whether expansion or contraction, may be determined based on the displacement between the pixel block PB_i,j being coded and the reference frame data and also the location of the pixel block PB_i,j being coded.

As illustrated in FIG. 9(c), the method 800 may perform a single transformation of reference frame data 920 that serves for prediction searches of all pixel blocks PB_i,0-PB_i,max in a common column. Again, method 800 of FIG. 8 is expected to conserve processing resources as compared to the method 700 of FIG. 7.

Further resource conservation may be employed for the methods 700 and/or 800 by predicting whether motion vector-based coding will be performed. For example, based on ambient operating circumstances, it may be estimated that inter prediction will not be used, either for a given frame or for a portion of frame content. In such circumstances, the prediction searches 710 and/or 810 may be omitted. In another embodiment, ambient operating circumstances may indicate that there is a higher likelihood of motion along a row or along a column of input data. Such indications may be derived from motion sensor data provided by a device that provides image data or from frame-to-frame analyses of motion among image content. In such cases, the method 800 may be performed to omit operation of boxes 814-816 for row-based motion or to omit operation of boxes 811-813 for columnar motion. Alternatively, the method 800 may perform transforms along an estimated direction of motion, which need not be aligned to a row or column of image data (for example, a diagonal vector).
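The pass-skipping logic of this paragraph may be sketched as below. The function name passes_to_run and the string labels are hypothetical; the sketch only records which groups of boxes of method 800 would still run for a given estimate.

```python
def passes_to_run(inter_expected, motion_hint=None):
    """Return the groups of boxes of method 800 that should run.

    inter_expected: False when inter prediction is estimated not to be used.
    motion_hint: "row", "column", or None when no direction is favored.
    """
    if not inter_expected:
        # The prediction search 810 is omitted entirely.
        return []
    if motion_hint == "row":
        # Row-based motion: omit boxes 814-816.
        return ["boxes_811_813"]
    if motion_hint == "column":
        # Columnar motion: omit boxes 811-813.
        return ["boxes_814_816"]
    return ["boxes_811_813", "boxes_814_816"]
```

A caller might derive motion_hint from motion sensor data or from frame-to-frame analysis, as the paragraph above suggests; how that estimate is formed is outside the scope of this sketch.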

In other embodiments, a coder may select a sub-set of frame regions on which to perform transforms. For example, a coder may identify regions of content for which transforms are to be applied prior to search and other regions for which transforms need not be applied. Such regions may be selected, for example, based on analysis of frame content to identify objects in frame content that are likely to be regions of interest to viewers (for example, faces, bodies or other predetermined content). Such regions may be selected based on analysis of frame content that identifies foreground content within image data, which may be designated regions of interest. Further, such regions may be selected based on display activity reported by a display device 120 (FIG. 1); for example, if an encoder receives communication from a display 120 that indicates only a portion of the equirectangular image is being rendered on the display 120, the encoder may determine to apply such transforms on the portion being rendered and forgo transform-based search on other regions that are not being rendered. In another embodiment, regions of particularly high motion may be designated for coding without such transforms; typically, coding losses in areas of high motion are not as perceptible to human viewers as coding losses in areas of low motion.
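As one illustration of viewport-based selection, the sketch below lists the pixel blocks that overlap a rendered rectangle reported by a display. The function blocks_in_viewport and the rectangle convention are assumptions made for this example; a viewport that wraps around the horizontal edge of the equirectangular image is not handled.

```python
def blocks_in_viewport(frame_w, frame_h, block, viewport):
    """Return (column, row) indices of pixel blocks overlapping the viewport.

    viewport is (x0, y0, x1, y1) in pixels with exclusive upper bounds.
    """
    x0, y0, x1, y1 = viewport
    selected = []
    for by in range(0, frame_h, block):
        for bx in range(0, frame_w, block):
            overlaps_x = bx < x1 and bx + block > x0
            overlaps_y = by < y1 and by + block > y0
            if overlaps_x and overlaps_y:
                selected.append((bx // block, by // block))
    return selected
```

An encoder following this approach would apply the reference transforms only for the returned blocks and use an ordinary search elsewhere.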

In a further embodiment, transforms may be performed to account for global camera motion. An encoder may receive data from a motion sensor 290 (FIG. 2) or perform image analysis that indicates a camera is moving during image capture. The image processor 220 may perform image transform operations on reference frames to align reference frame data spatially with the frames output by the camera system 210 (FIG. 2) during motion.
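For the special case in which the camera rotates about the vertical axis of the projection (pure yaw), spatial alignment of an equirectangular reference frame reduces to a circular horizontal shift. The sketch below assumes that case only; the function compensate_yaw and its sign convention are hypothetical, and other motions (pitch, roll, translation) would need a fuller spherical rotation.

```python
def compensate_yaw(ref, yaw_degrees):
    """Circularly shift each row of an equirectangular frame by a yaw angle.

    Positive yaw shifts content to the right; the sign is an assumed convention.
    """
    width = len(ref[0])
    shift = round(yaw_degrees / 360.0 * width) % width
    if shift == 0:
        return [list(row) for row in ref]
    return [list(row[-shift:]) + list(row[:-shift]) for row in ref]
```

Because the image spans a full 360 degrees horizontally, content shifted past one edge reappears at the other, so no reference data is lost by the alignment.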

The principles of the present disclosure apply to prediction reference data that is utilized for intra-coding techniques, as well as inter-coding techniques. Where inter-coding exploits temporal redundancy in image data between frames, intra-coding exploits spatial redundancy within a single frame. Thus, an input pixel block may be coded with reference to previously-coded data of the same frame in which the input pixel block resides. Typically, video coders code an input frame on a pixel block-by-pixel block basis in a predetermined order, for example, a raster scan order. Thus, when coding an input pixel block at an intermediate point within a frame, an encoder will have coded image data of other pixel blocks that precede the input pixel block in coding order. Decoded data of the preceding pixel blocks may be available to both the encoder and the decoder at the time the data of the intermediate pixel block is decoded and, thus, the preceding pixel blocks may be used as a prediction reference.

In such embodiments, prediction search operations for intra-coding may be performed between an input pixel block and prediction reference data (the previously coded pixel blocks of the same frame) that has been transformed according to Eqs. (5) and (10) based on the displacement between the input pixel block and candidate prediction blocks within the prediction reference data. Thus, the techniques of the present disclosure also find application for use in intra-coding.
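The set of pixel blocks available as intra prediction references under a raster scan order can be sketched as follows. The function available_intra_references is a hypothetical helper and assumes a plain raster scan with no slices or tiles.

```python
def available_intra_references(bx, by, blocks_per_row):
    """Return (column, row) indices of blocks coded before block (bx, by)
    when blocks are coded in raster scan order."""
    current = by * blocks_per_row + bx
    return [(i % blocks_per_row, i // blocks_per_row) for i in range(current)]
```

Only blocks in this list have decoded data at both the encoder and the decoder, so only they would be candidates for the transformed intra search described above.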

FIG. 10 is a functional block diagram of a coding system 1000 according to an embodiment of the present disclosure. The system 1000 may include a pixel block coder 1010, a pixel block decoder 1020, an in-loop filter system 1030, a reference picture store 1040, a transform unit 1050, a predictor 1060, a controller 1070, and a syntax unit 1080. The pixel block coder and decoder 1010, 1020 and the predictor 1060 may operate iteratively on individual pixel blocks of a picture. The predictor 1060 may predict data for use during coding of a newly-presented input pixel block. The pixel block coder 1010 may code the new pixel block by predictive coding techniques and present coded pixel block data to the syntax unit 1080. The pixel block decoder 1020 may decode the coded pixel block data, generating decoded pixel block data therefrom. The in-loop filter 1030 may perform various filtering operations on a decoded picture that is assembled from the decoded pixel blocks obtained by the pixel block decoder 1020. The filtered picture may be stored in the reference picture store 1040 where it may be used as a source of prediction of a later-received pixel block. The syntax unit 1080 may assemble a data stream from the coded pixel block data which conforms to a governing coding protocol.

The pixel block coder 1010 may include a subtractor 1012, a transform unit 1014, a quantizer 1016, and an entropy coder 1018. The pixel block coder 1010 may accept pixel blocks of input data at the subtractor 1012. The subtractor 1012 may receive predicted pixel blocks from the predictor 1060 and generate an array of pixel residuals therefrom representing a difference between the input pixel block and the predicted pixel block. The transform unit 1014 may apply a transform to the sample data output from the subtractor 1012, to convert data from the pixel domain to a domain of transform coefficients. The quantizer 1016 may perform quantization of transform coefficients output by the transform unit 1014. The quantizer 1016 may be a uniform or a non-uniform quantizer. The entropy coder 1018 may reduce bandwidth of the output of the coefficient quantizer by coding the output, for example, by variable length code words.

The transform unit 1014 may operate in a variety of transform modes as determined by the controller 1070. For example, the transform unit 1014 may apply a discrete cosine transform (DCT), a discrete sine transform (DST), a Walsh-Hadamard transform, a Haar transform, a Daubechies wavelet transform, or the like. In an embodiment, the controller 1070 may select a transform mode M to be applied by the transform unit 1014, may configure the transform unit 1014 accordingly and may signal the transform mode M in the coded video data, either expressly or impliedly.

The quantizer 1016 may operate according to a quantization parameter Q_P that is supplied by the controller 1070. In an embodiment, the quantization parameter Q_P may be applied to the transform coefficients as a multi-value quantization parameter, which may vary, for example, across different coefficient locations within a transform-domain pixel block. Thus, the quantization parameter Q_P may be provided as a quantization parameter array.

The pixel block decoder 1020 may invert coding operations of the pixel block coder 1010. For example, the pixel block decoder 1020 may include a dequantizer 1022, an inverse transform unit 1024, and an adder 1026. The pixel block decoder 1020 may take its input data from an output of the quantizer 1016. Although permissible, the pixel block decoder 1020 need not perform entropy decoding of entropy-coded data since entropy coding is a lossless process. The dequantizer 1022 may invert operations of the quantizer 1016 of the pixel block coder 1010. The dequantizer 1022 may perform uniform or non-uniform de-quantization as specified by the decoded signal Q_P. Similarly, the inverse transform unit 1024 may invert operations of the transform unit 1014. The dequantizer 1022 and the inverse transform unit 1024 may use the same quantization parameters Q_P and transform mode M as their counterparts in the pixel block coder 1010. Quantization operations likely will truncate data in various respects and, therefore, data recovered by the dequantizer 1022 likely will possess coding errors when compared to the data presented to the quantizer 1016 in the pixel block coder 1010.

The adder 1026 may invert operations performed by the subtractor 1012. It may receive the same prediction pixel block from the predictor 1060 that the subtractor 1012 used in generating residual signals. The adder 1026 may add the prediction pixel block to reconstructed residual values output by the inverse transform unit 1024 and may output reconstructed pixel block data.
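The coder and decoder paths described above can be traced on a four-sample block with the sketch below. It is illustrative only: a 4-point Walsh-Hadamard transform stands in for the transform unit 1014, the quantizer divides each coefficient by an entry of an assumed per-coefficient step array, and entropy coding is omitted because it is lossless. The names wht, iwht, encode_block and decode_block are hypothetical.

```python
# Sequency-ordered 4-point Walsh-Hadamard matrix; H times H equals 4 * identity.
H = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]

def wht(values):
    """Forward transform (transform unit 1014)."""
    return [sum(h * v for h, v in zip(row, values)) for row in H]

def iwht(coeffs):
    """Inverse transform (inverse transform unit 1024)."""
    return [sum(h * c for h, c in zip(row, coeffs)) / 4.0 for row in H]

def encode_block(block, prediction, steps):
    """Subtractor 1012, transform unit 1014, quantizer 1016."""
    residual = [b - p for b, p in zip(block, prediction)]
    coeffs = wht(residual)
    return [round(c / s) for c, s in zip(coeffs, steps)]

def decode_block(levels, prediction, steps):
    """Dequantizer 1022, inverse transform unit 1024, adder 1026."""
    coeffs = [level * s for level, s in zip(levels, steps)]
    residual = iwht(coeffs)
    return [p + r for p, r in zip(prediction, residual)]
```

With every step equal to 1 and an integer residual, the round trip reproduces the input block exactly; with coarser steps the reconstructed block differs from the input, which is the coding error that quantization introduces.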

The in-loop filter 1030 may perform various filtering operations on recovered pixel block data. For example, the in-loop filter 1030 may include a deblocking filter 1032 and a sample adaptive offset (“SAO”) filter 1033. The deblocking filter 1032 may filter data at seams between reconstructed pixel blocks to reduce discontinuities between the pixel blocks that arise due to coding. SAO filters may add offsets to pixel values according to an SAO “type,” for example, based on edge direction/shape and/or pixel/color component level. The in-loop filter 1030 may operate according to parameters that are selected by the controller 1070.

The reference picture store 1040 may store filtered pixel data for use in later prediction of other pixel blocks. Different types of prediction data are made available to the predictor 1060 for different prediction modes. For example, for an input pixel block, intra prediction takes a prediction reference from decoded data of the same picture in which the input pixel block is located. Thus, the reference picture store 1040 may store decoded pixel block data of each picture as it is coded. For the same input pixel block, inter prediction may take a prediction reference from previously coded and decoded picture(s) that are designated as reference pictures. Thus, the reference picture store 1040 may store these decoded reference pictures.

The transform unit 1050 may perform transforms of reference picture data as discussed in the foregoing embodiments. Thus, based on displacement between an input pixel block and reference picture data in a search window about the input pixel block, the transform unit 1050 may generate transformed reference picture data. The transform unit 1050 may output the transformed reference picture data to the predictor 1060.

As discussed, the predictor 1060 may supply prediction data to the pixel block coder 1010 for use in generating residuals. The predictor 1060 may include an inter predictor 1062, an intra predictor 1063 and a mode decision unit 1064. The inter predictor 1062 may receive spherically-projected pixel block data representing a new pixel block to be coded and may search spherical projections of reference picture data from store 1040 for pixel block data from reference picture(s) for use in coding the input pixel block. The inter predictor 1062 may support a plurality of prediction modes, such as P mode coding and B mode coding. The inter predictor 1062 may select an inter prediction mode and an identification of candidate prediction reference data that provides a closest match to the input pixel block being coded. The inter predictor 1062 may generate prediction reference metadata, such as motion vectors, to identify which portion(s) of which reference pictures were selected as source(s) of prediction for the input pixel block.

The intra predictor 1063 may support Intra (I) mode coding. The intra predictor 1063 may search spherically-projected pixel block data from the same picture as the pixel block being coded for data that provides a closest match to the spherically-projected input pixel block. The intra predictor 1063 also may generate prediction reference indicators to identify which portion of the picture was selected as a source of prediction for the input pixel block.

The mode decision unit 1064 may select a final coding mode to be applied to the input pixel block. Typically, as described above, the mode decision unit 1064 selects the prediction mode that will achieve the lowest distortion when video is decoded given a target bitrate. Exceptions may arise when coding modes are selected to satisfy other policies to which the coding system 1000 adheres, such as satisfying a particular channel behavior, or supporting random access or data refresh policies. When the mode decision selects the final coding mode, the mode decision unit 1064 may output a non-spherically-projected reference block from the store 1040 to the pixel block coder and decoder 1010, 1020 and may supply to the controller 1070 an identification of the selected prediction mode along with the prediction reference indicators corresponding to the selected mode.
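One conventional way to trade distortion against bitrate is a Lagrangian cost, J = D + lambda * R, minimized over the candidate modes. The sketch below assumes that formulation; the disclosure does not state that the mode decision unit 1064 uses it, and the function select_mode and the candidate tuples are hypothetical.

```python
def select_mode(candidates, lam):
    """Pick the mode with the lowest cost D + lam * R.

    candidates: list of (mode_name, distortion, rate_bits) tuples.
    lam: Lagrange multiplier weighting rate against distortion.
    """
    best = min(candidates, key=lambda c: c[1] + lam * c[2])
    return best[0]
```

A larger multiplier favors modes that spend fewer bits, while a multiplier of zero reduces the choice to lowest distortion alone. Policy-driven exceptions such as random access or data refresh would override this selection.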

The controller 1070 may control overall operation of the coding system 1000. The controller 1070 may select operational parameters for the pixel block coder 1010 and the predictor 1060 based on analyses of input pixel blocks and also external constraints, such as coding bitrate targets and other operational parameters. As is relevant to the present discussion, when it selects quantization parameters Q_P, the use of uniform or non-uniform quantizers, and/or the transform mode M, it may provide those parameters to the syntax unit 1080, which may include data representing those parameters in the data stream of coded video data output by the system 1000.

During operation, the controller 1070 may revise operational parameters of the quantizer 1016 and the transform unit 1014 at different granularities of image data, either on a per pixel block basis or on a larger granularity (for example, per picture, per slice, per largest coding unit (“LCU”) or another region). In an embodiment, the quantization parameters may be revised on a per-pixel basis within a coded picture.

Additionally, as discussed, the controller 1070 may control operation of the in-loop filter 1030 and the predictor 1060. Such control may include, for the predictor 1060, mode selection (lambda, modes to be tested, search windows, distortion strategies, etc.), and, for the in-loop filter 1030, selection of filter parameters, reordering parameters, weighted prediction, etc.

In an embodiment, the predictor 1060 may perform prediction searches using input pixel block data and reference pixel block data in a spherical projection. Operation of such prediction techniques is described in U.S. patent application Ser. No. 15/390,202, filed Dec. 23, 2016 and assigned to the assignee of the present application. In such an embodiment, the coder 1000 may include a spherical transform unit 1090 that transforms input pixel block data to a spherical domain prior to being input to the predictor 1060. The transform unit 1050 may transform reference picture data to the spherical domain (in addition to performing the transforms described hereinabove) prior to being input to the predictor 1060.

FIG. 11 is a functional block diagram of a decoding system 1100 according to an embodiment of the present disclosure. The decoding system 1100 may include a syntax unit 1110, a pixel block decoder 1120, an in-loop filter 1130, a reference picture store 1140, a transform unit 1150, a predictor 1160, and a controller 1170. The syntax unit 1110 may receive a coded video data stream and may parse the coded data into its constituent parts. Data representing coding parameters may be furnished to the controller 1170 while data representing coded residuals (the data output by the pixel block coder 1010 of FIG. 10) may be furnished to the pixel block decoder 1120. The pixel block decoder 1120 may invert coding operations provided by the pixel block coder 1010 (FIG. 10). The in-loop filter 1130 may filter reconstructed pixel block data. The reconstructed pixel block data may be assembled into pictures for display and output from the decoding system 1100 as output video. The pictures also may be stored in the reference picture store 1140 for use in prediction operations. The transform unit 1150 may perform transforms of reference picture data identified by motion vectors contained in the coded pixel block data as described in the foregoing discussion. The predictor 1160 may supply prediction data to the pixel block decoder 1120 as determined by coding data received in the coded video data stream.

The pixel block decoder 1120 may include an entropy decoder 1122, a dequantizer 1124, an inverse transform unit 1126, and an adder 1128. The entropy decoder 1122 may perform entropy decoding to invert processes performed by the entropy coder 1018 (FIG. 10). The dequantizer 1124 may invert operations of the quantizer 1016 of the pixel block coder 1010 (FIG. 10). Similarly, the inverse transform unit 1126 may invert operations of the transform unit 1014 (FIG. 10). They may use the quantization parameters Q_P and transform modes M that are provided in the coded video data stream. Because quantization is likely to truncate data, the data recovered by the dequantizer 1124 likely will possess coding errors when compared to the input data presented to its counterpart quantizer 1016 in the pixel block coder 1010 (FIG. 10).

The adder 1128 may invert operations performed by the subtractor 1012 (FIG. 10). It may receive a prediction pixel block from the predictor 1160 as determined by prediction references in the coded video data stream. The adder 1128 may add the prediction pixel block to reconstructed residual values output by the inverse transform unit 1126 and may output reconstructed pixel block data.

The in-loop filter 1130 may perform various filtering operations on reconstructed pixel block data. As illustrated, the in-loop filter 1130 may include a deblocking filter 1132 and an SAO filter 1134. The deblocking filter 1132 may filter data at seams between reconstructed pixel blocks to reduce discontinuities between the pixel blocks that arise due to coding. SAO filters 1134 may add offset to pixel values according to an SAO type, for example, based on edge direction/shape and/or pixel level. Other types of in-loop filters may also be used in a similar manner. Operation of the deblocking filter 1132 and the SAO filter 1134 ideally would mimic operation of their counterparts in the coding system 1000 (FIG. 10). Thus, in the absence of transmission errors or other abnormalities, the decoded picture obtained from the in-loop filter 1130 of the decoding system 1100 would be the same as the decoded picture obtained from the in-loop filter 1030 of the coding system 1000 (FIG. 10); in this manner, the coding system 1000 and the decoding system 1100 should store a common set of reference pictures in their respective reference picture stores 1040, 1140.

The reference picture store 1140 may store filtered pixel data for use in later prediction of other pixel blocks. The reference picture store 1140 may store decoded pixel block data of each picture as it is decoded for use in intra prediction. The reference picture store 1140 also may store decoded reference pictures.

The transform unit 1150 may perform transforms of reference picture data as discussed in the foregoing embodiments. In a decoder 1100, it is sufficient for the transform unit 1150 to perform transforms of reference picture data identified by motion vectors contained in the coded video data. The motion vector may identify to the decoder 1100 the location within the reference picture from which the encoder 1000 (FIG. 10) derived a reference block. The decoder's transform unit 1150 may perform the same transformation of reference picture data, using the motion vector and based on the displacement between the pixel block being decoded and the reference block, to generate transformed reference block data.
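A decoder-side counterpart may be sketched as below: because the motion vector already names the reference block, only that one block is transformed. The cosine-of-latitude scale factor is again an assumed stand-in for the transforms of the foregoing embodiments, nearest-neighbor sampling is used for brevity, and the function decode_reference_block is a hypothetical name. The sketch assumes the motion vector points inside the reference picture vertically.

```python
import math

def latitude(y, height):
    """Latitude in radians of pixel row y in an equirectangular frame."""
    return (0.5 - (y + 0.5) / height) * math.pi

def decode_reference_block(ref, x, y, size, mv):
    """Fetch and transform the single reference block identified by mv.

    (x, y) is the top-left corner of the block being decoded and mv is
    (dx, dy) in pixels.
    """
    height, width = len(ref), len(ref[0])
    rx, ry = x + mv[0], y + mv[1]
    center = rx + (size - 1) / 2.0
    block = []
    for dy in range(size):
        # Assumed model: scale from the vertical displacement between the
        # decoded row and the reference row.
        factor = (math.cos(latitude(ry + dy, height))
                  / math.cos(latitude(y + dy, height)))
        row = []
        for dx in range(size):
            src = center + (rx + dx - center) / factor
            sx = min(max(int(round(src)), 0), width - 1)
            row.append(ref[ry + dy][sx])
        block.append(row)
    return block
```

When the motion vector has no vertical component the scale factor is 1 and the block is fetched unchanged, which matches the expectation that no correction is needed for content at the same latitude.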

As discussed, the predictor 1160 may supply the transformed reference block data to the pixel block decoder 1120. The predictor 1160 may supply predicted pixel block data as determined by the prediction reference indicators supplied in the coded video data stream.

The controller 1170 may control overall operation of the decoding system 1100. The controller 1170 may set operational parameters for the pixel block decoder 1120 and the predictor 1160 based on parameters received in the coded video data stream. As is relevant to the present discussion, these operational parameters may include quantization parameters Q_P for the dequantizer 1124 and transform modes M for the inverse transform unit 1126. As discussed, the received parameters may be set at various granularities of image data, for example, on a per pixel block basis, a per picture basis, a per slice basis, a per LCU basis, or based on other types of regions defined for the input image.

In practice, encoders and decoders may exchange signaling to identify parameters of the coding operations that are performed. The signaling typically is performed with reference to a coding protocol, such as HEVC, AVC and related protocols, that define syntax elements for communication of such parameters. In an embodiment, the techniques of the foregoing embodiments may be integrated with the HEVC coding protocol by adding a new parameter, called “reference_correction_id,” to a sequence parameter set, such as by:

seq_parameter_set_rbsp( ) {                                 Descriptor
    sps_video_parameter_set_id                              u(4)
    sps_max_sub_layers_minus1                               u(3)
    sps_temporal_id_nesting_flag                            u(1)
    profile_tier_level(1, sps_max_sub_layers_minus1)
    sps_seq_parameter_set_id                                ue(v)
    reference_correction_id                                 u(3)
    chroma_format_idc                                       ue(v)

In an embodiment, the reference_correction_id may take values such as:

reference_correction_id    format
0                          Nothing done
1                          Horizontal
2                          Vertical
3                          Horizontal and vertical
4                          Vertical and horizontal
5                          Transform
6                          Reserved
7                          Reserved

where:

reference_correction_id=0 indicates no special handling is performed;

reference_correction_id=1 indicates only horizontal distortion correction is performed;

reference_correction_id=2 indicates only vertical distortion correction is performed;

reference_correction_id=3 indicates that horizontal distortion correction is performed first, followed by vertical correction for each block in a different row;

reference_correction_id=4 indicates that vertical distortion correction is performed first, followed by horizontal correction for each block in a different column; and

reference_correction_id=5 indicates that block by block transforms are applied for each reference candidate during prediction searches.

Of course, the coding parameters may be signaled according to a different syntax as may be desired.
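Reading the proposed field can be sketched with a small bit reader. The u(n) and ue(v) descriptors are the fixed-length and unsigned Exp-Golomb codes of HEVC; everything else here is an assumption made for illustration: the class BitReader and the function parse_after_profile_tier_level are hypothetical names, parsing starts just after profile_tier_level( ), and emulation prevention bytes are ignored.

```python
class BitReader:
    """Reads bits most-significant-bit first from a bytes object."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def u(self, n):
        """Unsigned integer using n bits."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def ue(self):
        """Unsigned Exp-Golomb code."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

# Values from the table above; 6 and 7 are reserved.
REFERENCE_CORRECTION = {
    0: "Nothing done",
    1: "Horizontal",
    2: "Vertical",
    3: "Horizontal and vertical",
    4: "Vertical and horizontal",
    5: "Transform",
}

def parse_after_profile_tier_level(reader):
    """Parse the three fields that follow profile_tier_level( )."""
    sps_seq_parameter_set_id = reader.ue()
    reference_correction_id = reader.u(3)
    chroma_format_idc = reader.ue()
    fmt = REFERENCE_CORRECTION.get(reference_correction_id, "Reserved")
    return sps_seq_parameter_set_id, fmt, chroma_format_idc
```

For example, the single byte 0xB4 carries the bits 1, 011 and 010 followed by one padding bit, which this sketch reads as a parameter set identifier of 0, the "Horizontal and vertical" format, and a chroma_format_idc of 1.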

The foregoing discussion has described operation of the embodiments of the present disclosure in the context of video coders and decoders. Commonly, these components are provided as electronic devices. Video decoders and/or controllers can be embodied in integrated circuits, such as application specific integrated circuits, field programmable gate arrays and/or digital signal processors. Alternatively, they can be embodied in computer programs that execute on camera devices, personal computers, notebook computers, tablet computers, smartphones or computer servers. Such computer programs typically are stored in physical storage media such as electronic-, magnetic- and/or optically-based storage devices, where they are read to a processor and executed. Decoders commonly are packaged in consumer electronics devices, such as smartphones, tablet computers, gaming systems, DVD players, portable media players and the like; and they also can be packaged in consumer software applications such as video games, media players, media editors, and the like. And, of course, these components may be provided as hybrid systems that distribute functionality across dedicated hardware components and programmed general-purpose processors, as desired.

For example, the techniques described herein may be performed by a central processor of a computer system. FIG. 12 illustrates an exemplary computer system 1200 that may perform such techniques. The computer system 1200 may include a central processor 1210, one or more cameras 1220, a memory 1230, and a transceiver 1240 provided in communication with one another. The camera 1220 may perform image capture and may store captured image data in the memory 1230. Optionally, the device also may include sink components, such as a coder 1250 and a display 1260, as desired.

The central processor 1210 may read and execute various program instructions stored in the memory 1230 that define an operating system 1212 of the system 1200 and various applications 1214.1-1214.N. The program instructions may perform coding mode control according to the techniques described herein. As it executes those program instructions, the central processor 1210 may read, from the memory 1230, image data created either by the camera 1220 or the applications 1214.1-1214.N, which may be coded for transmission. The central processor 1210 may execute a program that operates according to the principles of FIG. 6. Alternatively, the system 1200 may have a dedicated coder 1250 provided as a standalone processing system and/or integrated circuit.

As indicated, the memory 1230 may store program instructions that, when executed, cause the processor to perform the techniques described hereinabove. The memory 1230 may store the program instructions on electrical-, magnetic- and/or optically-based storage media.

The transceiver 1240 may represent a communication system to transmit transmission units and receive acknowledgement messages from a network (not shown). In an embodiment where the central processor 1210 operates a software-based video coder, the transceiver 1240 may place data representing the state of acknowledgement messages in memory 1230 for retrieval by the processor 1210. In an embodiment where the system 1200 has a dedicated coder, the transceiver 1240 may exchange state information with the coder 1250.

The foregoing description has been presented for purposes of illustration and description. It is not exhaustive and does not limit embodiments of the disclosure to the precise forms disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practicing embodiments consistent with the disclosure. Unless described otherwise herein, any of the methods may be practiced in any combination.

Claims

1. A method for coding an input pixel block containing multi-directional image content, comprising: for a spherical domain transform based on a first direction within a reference picture having an equiangular format, transforming content of the reference picture within a search window by a spherical-domain transform of content elements within the search window, wherein each of the content elements is transformed according to a displacement along the first direction between a location of the input pixel block and a location of the content elements within the search window, performing a prediction search along a perpendicular to the first direction upon which the spherical domain transform is based and within the spherical-domain transformed content of the reference picture to identify a match between the input pixel block and a matching portion of the transformed content of the reference picture; and when a match is identified, coding the input pixel block differentially with respect to the matching portion.
2. The method of claim 1, wherein the transforming comprises, for each candidate motion vector in the search window, transforming a reference block that is identified from the reference picture by the candidate motion vector.
3. The method of claim 1, wherein for a row of pixel blocks along the first direction and including the input pixel block, the transforming comprises, transforming, vertically along columns perpendicular to the first direction, content of the reference picture within a search window about the row of pixel blocks, and wherein the transformed content of the reference picture is used for prediction searches of the other pixel blocks in the row.
4. The method of claim 1, wherein for a column of pixel blocks including the input pixel block, the transforming comprises, transforming, horizontally along rows, content of the reference picture within a search window about the column of pixel blocks, and wherein the transformed content of the reference picture is used for prediction searches of the other pixel blocks in the column.
5. The method of claim 1, wherein the transforming comprises, transforming content of the reference picture within a search window along a direction of motion identified for a frame that includes the input pixel block.
6. The method of claim 1, wherein the coding is intra-coding and the reference picture includes decoded data of previously-coded data of a same frame in which the input pixel block is located.
7. The method of claim 1, wherein the coding is inter-coding and the reference picture includes decoded data of another frame that was coded prior to coding of a frame in which the input pixel block is located.
8. The method of claim 1, wherein the multi-directional image content is generated by a multi-view camera having fish eye lenses.
9. The method of claim 1, wherein the multi-directional image content is generated by an omnidirectional camera.
10. The method of claim 1, wherein the multi-directional image content is generated by a computer application.
11. The method of claim 1, wherein the coding comprises: calculating prediction residuals representing differences between pixels of the input pixel block and the matching portion of the transformed reference picture, transforming the prediction residuals to transform coefficients, quantizing the transform coefficients, and entropy coding the quantized coefficients.
12. The method of claim 1, further comprising transmitting with coded data of the input pixel block, a parameter identifying a type of transform performed on the reference picture.
13. The method of claim 1, further comprising: coding a plurality of input pixel blocks by, respectively: estimating a prediction mode to be applied to each respective pixel block, and when the estimated prediction mode is an inter-coding mode, performing the transforming, prediction search and coding for the respective pixel block, and when the estimated prediction mode is an intra-coding mode, omitting the transforming, prediction search and coding for the respective pixel block.
14. The method of claim 1, further comprising: estimating global motion of a frame to which the input pixel block belongs, wherein the transforming comprises aligning the reference picture spatially with respect to the input pixel block's frame.
15. The method of claim 1 wherein content elements at different displacements from the input pixel block are subject to different transforms.
16. The method of claim 1 wherein: the search window is larger than the input pixel block along the first direction, and the prediction search in the search window includes searching along both the first direction and along the direction perpendicular to the first direction.
17. A non-transitory computer readable storage medium having stored thereon program instructions that, when executed by a processing device, cause the device to: for a spherical domain transform based on a first direction within a reference picture having an equiangular format, transforming content of the reference picture within a search window by a spherical-domain transform of content elements within the search window, wherein each of the content elements is transformed according to a displacement along the first direction between a location of the input pixel block and a location of the content elements within the search window, performing a prediction search along a perpendicular to the first direction upon which the spherical domain transform is based and within the spherical-domain transformed content of the reference picture to identify a match between the input pixel block and a matching portion of the transformed content of the reference picture; and when a match is identified, coding the input pixel block differentially with respect to the matching portion.
18. The medium of claim 17, wherein the transform comprises, for each candidate motion vector in the search window, transforming a reference block that is identified from the reference picture by the candidate motion vector.
19. The medium of claim 17, wherein for a row of pixel blocks including the input pixel block, the transform comprises, transforming, vertically along columns, content of the reference picture within a search window about the row of pixel blocks, and wherein the transformed content of the reference picture is used for prediction searches of the other pixel blocks in the row.
20. The medium of claim 17, wherein for a column of pixel blocks including the input pixel block, the transform comprises, transforming, horizontally along rows, content of the reference picture within a search window about the column of pixel blocks, and wherein the transformed content of the reference picture is used for prediction searches of the other pixel blocks in the column.
21. The medium of claim 17, wherein the multi-directional image content is generated by a multi-directional camera having fish eye lenses.
22. The medium of claim 17, wherein the multi-directional image content is generated by an omnidirectional camera.
23. The medium of claim 17, wherein the multi-directional image content is generated by a computer application.
24. The medium of claim 17, wherein the coding comprises: calculating prediction residuals representing differences between pixels of the input pixel block and the matching portion of the transformed reference picture, transforming the prediction residuals to transform coefficients, quantizing the transform coefficients, and entropy coding the quantized coefficients.
25. The medium of claim 17, wherein the program instructions cause the device to transmit with coded data of the input pixel block, a parameter identifying a type of transform performed on the reference picture.
26. A video coder, comprising: a pixel block coder, a pixel block decoder having an input coupled to an output of the pixel block coder, a reference picture store to store reference pictures in an equiangular format based on a first direction from pixel blocks output from the pixel block decoder, a transform unit transforming reference picture content from the reference picture store within a search window about a location of an input pixel block, wherein the transforming is by a spherical-domain transform based on the first direction of the equiangular format and which transforms content elements within the search window, wherein each of the content elements is transformed according to a displacement along the first direction of the equiangular format between the location of the input pixel block and a location of the content element within the search window, and a motion predictor for predicting the input pixel block by searching along a perpendicular to the first direction upon which the spherical-domain transform is based within the spherical-domain transformed content of the reference picture.
27. The coder of claim 26, wherein, for each candidate motion vector in the search window, the transform unit transforms a reference block that is identified from the reference picture by the candidate motion vector.
28. The coder of claim 26, wherein for a row of pixel blocks including the input pixel block, the transform unit transforms, vertically along columns, content of the reference picture within a search window about the row of pixel blocks, and wherein the motion predictor uses the transformed content of the reference picture for prediction searches of the other pixel blocks in the row.
29. The coder of claim 26, wherein for a column of pixel blocks including the input pixel block, the transform unit transforms, horizontally along rows, content of the reference picture within a search window about the column of pixel blocks, and wherein the motion predictor uses the transformed content of the reference picture for prediction searches of the other pixel blocks in the column.
30. A method of decoding a coded pixel block, comprising: from a reference picture having an equiangular format based on a first direction, transforming a reference block, identified by a motion vector provided in data of the coded pixel block, the transforming is by a spherical-domain transform based on the first direction of the equirectangular format, wherein the reference block is transformed based on a displacement along the first direction between a location of the coded pixel block and a location of the reference block and wherein the motion vector indicates the result of a search along a perpendicular to the first direction upon which the spherical-domain transform is based, decoding the input pixel block differentially with respect to the transformed reference block using other data of the coded pixel block.
31. The method of claim 30, wherein the transforming is performed according to a type of transform identified in the other data of the coded pixel block.
32. A method for coding an input pixel block containing multi-directional image content, comprising: transforming content of the input pixel block from an equirectangular source domain to a spherical domain by a spherical transform based on a first axis of the equirectangular source domain; for a plurality of stored reference pictures, transforming content of the reference picture(s) within a search window located about a location of the input pixel block from the source domain to the spherical domain, wherein the transforming is based on a location of the input pixel block and based on a distance along the first axis of content elements within the search window from the location of the input pixel block, performing a prediction search along a second axis perpendicular to the first axis of the equirectangular source domain to identify a match between the transformed spherical domain input pixel block and a matching portion of one of the transformed spherical domain reference pictures, and coding the source domain input pixel block differentially with respect to matching content from a source domain reference picture corresponding to the one transformed spherical domain reference picture, the matching content identified by the prediction search conducted in the spherical domain.
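
As a non-authoritative illustration of the coding flow recited in claim 1, the following Python sketch transforms a search window of a reference picture column by column, searches the transformed content along the perpendicular direction, and forms the differential residual. Several elements are assumptions made only for illustration and are not taken from the claims: the first direction is assumed to be horizontal, the distortion model is a simple cosine rescaling controlled by a hypothetical `rad_per_px` parameter, the match criterion is a sum of absolute differences (SAD), and the names `transform_window`, `search_perpendicular` and `residual` are invented here.

```python
import math


def transform_window(window, block_x, rad_per_px):
    """Rescale each column vertically according to its horizontal
    displacement from block_x. The cosine model is an assumption made
    for illustration, not the transform defined by the claims."""
    rows, cols = len(window), len(window[0])
    center = (rows - 1) / 2.0
    out = [[0.0] * cols for _ in range(rows)]
    for x in range(cols):
        # Displacement along the (assumed horizontal) first direction.
        scale = math.cos((x - block_x) * rad_per_px)
        for y in range(rows):
            # Inverse mapping with linear interpolation between two rows.
            src = center + (y - center) * scale
            y0 = int(math.floor(src))
            y1 = min(y0 + 1, rows - 1)
            frac = src - y0
            out[y][x] = (1 - frac) * window[y0][x] + frac * window[y1][x]
    return out


def search_perpendicular(block, transformed, x0):
    """Search vertical offsets only (perpendicular to the first direction)
    and return (offset, SAD) of the best candidate."""
    bh, bw = len(block), len(block[0])
    best = None
    for dy in range(len(transformed) - bh + 1):
        sad = sum(abs(block[r][c] - transformed[dy + r][x0 + c])
                  for r in range(bh) for c in range(bw))
        if best is None or sad < best[1]:
            best = (dy, sad)
    return best


def residual(block, transformed, x0, dy):
    """Differences between the input block and the matching portion."""
    return [[block[r][c] - transformed[dy + r][x0 + c]
             for c in range(len(block[0]))]
            for r in range(len(block))]
```

In this sketch a caller would pass the search window as a list of pixel rows, call `transform_window` once, and reuse the transformed content for `search_perpendicular`. The residual returned by `residual` is what a coder of the kind recited in claim 11 would then transform, quantize and entropy code. A practical coder would typically accept a match only when the SAD falls below some threshold, which is omitted here.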

Related Publications (1)

	Number	Date	Country
	20180234700 A1	Aug 2018	US
