The present disclosure relates to predictive coding of depth maps for three-dimensional imaging systems.
In three-dimensional imaging systems, a popular representation of a scene is MultiView-plus-Depth (MVD), in which information about the three-dimensional structure of a scene is stored in the form of depth maps. Based on depth maps and corresponding videos (so-called texture views), a virtual view can be synthesized, typically between the source views. The use of MVD leads to a simple and straightforward approach to view synthesis, which is to employ Depth-Image-Based Rendering (DIBR).
View synthesis is expected to be performed by user-side devices, such as 3D displays, mobile phones, etc. Therefore, there is a need for fast, computationally efficient view synthesis algorithms, which desirably could be seamlessly implemented in hardware.
A publication “Platelet-based coding of depth maps for the transmission of multiview images” (by Y. Morvan, P. H. N. de With, and D. Farin, in Proceedings of SPIE: Stereoscopic Displays and Applications, 2006, vol. 6055) discloses an algorithm that models depth maps using piecewise-linear functions (platelets). To adapt to varying scene detail, a quadtree decomposition is employed that divides the image into blocks of variable size, each block being approximated by one platelet. In order to preserve sharp object boundaries, the support area of each platelet is adapted to the object boundary. The subdivision of the quadtree and the selection of the platelet type are optimized such that a global rate-distortion trade-off is realized.
A publication “Fast View Synthesis using platelet-based depth representation” (by K. Wegner, O. Stankiewicz, M. Domanski, in 21st International Conference on Systems, Signals and Image Processing, IWSSIP 2014, Dubrovnik, Croatia, May 2014) discloses an approach to speeding up the view synthesis process that exploits depth modeling: the depth data is divided into blocks of various sizes and modelled as planes. In such a case, only the 4 corners of each block need to be transformed during view synthesis, instead of transforming every pixel in the block. This way, depth data is adaptively simplified.
There is a need to further improve the existing methods for predictive coding of depth maps, in particular in order to reduce the amount of transmitted data and improve the compression ratio.
There is presented herein a method for predictive encoding of a depth map, comprising the steps of: receiving a depth map; dividing the depth map into blocks; performing quad-tree decomposition of the depth map into sub-blocks; approximating each of the sub-blocks by a plane, the plane being associated with three plane points (p0, p1, p2). The method further comprises: determining an order of coding of the sub-blocks; while encoding consecutive sub-blocks: calculating predictors (p̂0, p̂1, p̂2) for the plane points (p0, p1, p2) of the currently-encoded sub-block (C); calculating differences (Δp0, Δp1, Δp2) between the actual values of the plane points (p0, p1, p2) and the values of the corresponding predictors (p̂0, p̂1, p̂2); providing the parameters of the plane for the currently-encoded sub-block (C) in the form of the differences (Δp0, Δp1, Δp2); and providing an encoded depth map in the form of a set of planes defined by parameters.
The method may comprise calculating predictors (p̂0, p̂1, p̂2) for the plane points (p0, p1, p2) of the currently-encoded sub-block (C) on the basis of the values of the plane points (pR0, pR1, pR2) for planes corresponding to earlier-encoded sub-blocks.
The earlier-encoded sub-blocks can be adjacent to the currently encoded sub-block.
The index R may be such that R ∈ {LT, L, LB, T, RT}.
The method may comprise specifying predictors (p̂0, p̂1, p̂2) for the plane points (p0, p1, p2) of the currently-encoded sub-block (C) on the basis of a predefined value.
The order of coding of the sub-blocks can be selected from the group comprising: z-order, raster scan order, diagonal scan order, zigzag scan order.
The sub-blocks may have a rectangular shape with a size of N1×N2, wherein at least one of N1, N2 is higher than 1.
The sub-blocks may have a square shape.
There is also disclosed a computing device program product for depth-image-based rendering, the computing device program product comprising: a non-transitory computer readable medium; first programmatic instructions for receiving a depth map; second programmatic instructions for dividing the depth map into blocks; third programmatic instructions for performing quad-tree decomposition of the depth map into sub-blocks; fourth programmatic instructions for approximating each of the sub-blocks by a plane, the plane being associated with three plane points (p0, p1, p2); fifth programmatic instructions for determining an order of coding of the sub-blocks; sixth programmatic instructions for, while encoding consecutive sub-blocks: calculating predictors (p̂0, p̂1, p̂2) for the plane points (p0, p1, p2) of the currently-encoded sub-block (C); calculating differences (Δp0, Δp1, Δp2) between the actual values of the plane points (p0, p1, p2) and the values of the corresponding predictors (p̂0, p̂1, p̂2); providing the parameters of the plane for the currently-encoded sub-block (C) in the form of the differences (Δp0, Δp1, Δp2); and seventh programmatic instructions for providing an encoded depth map in the form of a set of planes defined by parameters.
There is also disclosed a system for predictive encoding of a depth map, the system comprising: a data bus communicatively coupling components of the system; a memory for storing data; a controller configured to perform the steps of: receiving a depth map; dividing the depth map into blocks; performing quad-tree decomposition of the depth map into sub-blocks; approximating each of the sub-blocks by a plane, the plane being associated with three plane points (p0, p1, p2); determining an order of coding of the sub-blocks; while encoding consecutive sub-blocks: calculating predictors (p̂0, p̂1, p̂2) for the plane points (p0, p1, p2) of the currently-encoded sub-block (C); calculating differences (Δp0, Δp1, Δp2) between the actual values of the plane points (p0, p1, p2) and the values of the corresponding predictors (p̂0, p̂1, p̂2); providing the parameters of the plane for the currently-encoded sub-block (C) in the form of the differences (Δp0, Δp1, Δp2); and providing an encoded depth map in the form of a set of planes defined by parameters.
The present method is shown by means of exemplary embodiment on a drawing, in which:
Some portions of the detailed description which follows are presented in terms of data processing procedures, steps or other symbolic representations of operations on data bits that can be performed on computer memory. Therefore, a computer executes such logical steps thus requiring physical manipulations of physical quantities.
Usually these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. For reasons of common usage, these signals are referred to as bits, packets, messages, values, elements, symbols, characters, terms, numbers, or the like.
Additionally, all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Terms such as “processing” or “creating” or “transferring” or “executing” or “determining” or “detecting” or “obtaining” or “selecting” or “calculating” or “generating” or the like, refer to the action and processes of a computer system that manipulates and transforms data represented as physical (electronic) quantities within the computer's registers and memories into other data similarly represented as physical quantities within the memories or registers or other such information storage.
A computer-readable (storage) medium, such as referred to herein, typically may be non-transitory and/or comprise a non-transitory device. In this context, a non-transitory storage medium may include a device that may be tangible, meaning that the device has a concrete physical form, although the device may change its physical state. Thus, for example, non-transitory refers to a device remaining tangible despite a change in state.
As utilized herein, the term “example” means serving as a non-limiting example, instance, or illustration. As utilized herein, the terms “for example” and “e.g.” introduce a list of one or more non-limiting examples, instances, or illustrations.
The present disclosure is related to representation of depth data in a simplified manner, wherein instead of dense, regularly sampled pixels of a depth map, rectangular blocks of depth pixels are modelled with flat planes, as described in the “Fast View Synthesis using platelet-based depth representation” publication mentioned in the background section. The flat planes are described by four corners; therefore, the complexity of the view synthesis process is significantly reduced: instead of a pixel-by-pixel transformation, a plane-model-based transformation can be used.
The method is performed according to the flow diagram of
The depth map received in step 101 is divided, in step 102, into non-overlapping blocks having a size of M×M. Each M×M block of the depth map is adaptively divided in a quadtree decomposition process, in step 103, to sub-blocks having a size N×N in the range from M×M to 2×2. In alternative embodiments, the depth map can be divided into non-square blocks, for example rectangular blocks having a size of M1×M2. Moreover, in alternative embodiments the blocks can be divided into non-square sub-blocks, for example rectangular sub-blocks having a size of N1×N2, wherein at least one of N1, N2 is higher than 1. For the sake of simplicity and clarity only, the presented embodiment is related to square blocks and sub-blocks.
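The adaptive division of steps 102 and 103 can be illustrated with a minimal sketch. The splitting criterion is an assumption made here for illustration only: a block is split while the spread (maximum minus minimum) of its depth values exceeds a hypothetical threshold `max_range`. The function name `quadtree_split` and its parameters are likewise hypothetical and do not come from the disclosure.

```python
def quadtree_split(depth, x0, y0, size, max_range, min_size=2):
    """Return a list of (x0, y0, size) square sub-blocks covering one block.

    Assumption: a block is split into four quadrants while the spread of its
    depth values exceeds max_range; the disclosure does not fix the criterion.
    """
    values = [depth[y][x]
              for y in range(y0, y0 + size)
              for x in range(x0, x0 + size)]
    if size <= min_size or max(values) - min(values) <= max_range:
        return [(x0, y0, size)]
    half = size // 2
    blocks = []
    # Visit quadrants top-left, top-right, bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            blocks += quadtree_split(depth, x0 + dx, y0 + dy, half,
                                     max_range, min_size)
    return blocks


# An 8x8 block (M = 8) that is flat except for one pixel in the
# bottom-right quadrant.
block = [[10] * 8 for _ in range(8)]
block[7][7] = 90
leaves = quadtree_split(block, 0, 0, 8, max_range=5)
```

In this example the three flat quadrants remain 4×4 sub-blocks, while the quadrant containing the outlier is divided down to the 2×2 limit.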
The depth map in each sub-block is approximated in step 104 by a plane, as shown in
Each plane, such as shown in
d(x,y)=α·x+β·y+γ
wherein d(x,y) is the value of the depth map at point x, y of the particular sub-block having a size of N×N. The values x and y are from 0 to N−1.
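As an illustration of the approximation of step 104, the following sketch fits the plane equation to an N×N sub-block. The use of a least-squares fit is an assumption, since the disclosure does not specify how the plane is obtained; the names `fit_plane` and `plane_value` are hypothetical.

```python
def fit_plane(block):
    """Least-squares fit of d(x, y) = alpha*x + beta*y + gamma to an N x N block.

    Assumption: least squares is used; the disclosure does not specify the fit.
    On a full regular grid the centred x and y coordinates are uncorrelated,
    so the two slopes can be computed independently. Requires N >= 2.
    """
    n = len(block)
    mean_xy = (n - 1) / 2.0                      # mean of 0..N-1
    mean_d = sum(map(sum, block)) / (n * n)
    # Sum of squared centred coordinates over the whole N x N grid.
    denom = n * sum((i - mean_xy) ** 2 for i in range(n))
    alpha = sum((x - mean_xy) * block[y][x]
                for y in range(n) for x in range(n)) / denom
    beta = sum((y - mean_xy) * block[y][x]
               for y in range(n) for x in range(n)) / denom
    gamma = mean_d - alpha * mean_xy - beta * mean_xy
    return alpha, beta, gamma


def plane_value(alpha, beta, gamma, x, y):
    """Evaluate the plane equation at sub-block coordinates (x, y)."""
    return alpha * x + beta * y + gamma


# A 4x4 sub-block that is exactly planar: d = 2*x + 3*y + 5.
sub_block = [[2 * x + 3 * y + 5 for x in range(4)] for y in range(4)]
alpha, beta, gamma = fit_plane(sub_block)
```

For a sub-block that is exactly planar, the fit recovers the generating parameters; for other sub-blocks it yields the plane with the smallest squared error.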
A coder that transmits the depth map in the form of a set of planes must transmit, for each block, a tree defining the decomposition of the depth map block into sub-blocks. For each sub-block, the parameters α, β, γ defining the plane must be transmitted.
As shown in
This allows the values of the depth map at the points p0, p1, p2 to be transmitted instead of the parameters α, β, γ.
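The equivalence between the two parameterizations can be sketched as follows. The positions of the three points are an assumption made for this illustration: p0 at (0, 0), p1 at (N−1, 0) and p2 at (0, N−1) of the sub-block. The function names are hypothetical.

```python
def params_to_points(alpha, beta, gamma, n):
    """Depth values of the plane at three corners of an N x N sub-block.

    Assumption: p0 at (0, 0), p1 at (N-1, 0), p2 at (0, N-1).
    """
    p0 = gamma
    p1 = alpha * (n - 1) + gamma
    p2 = beta * (n - 1) + gamma
    return p0, p1, p2


def points_to_params(p0, p1, p2, n):
    """Inverse mapping: recover alpha, beta, gamma from the three points."""
    alpha = (p1 - p0) / (n - 1)
    beta = (p2 - p0) / (n - 1)
    gamma = p0
    return alpha, beta, gamma


# Plane d = 2*x + 3*y + 5 on a 4x4 sub-block.
points = params_to_points(2, 3, 5, 4)
params = points_to_params(*points, 4)
```

Under this assumption the three points carry exactly the same information as α, β, γ, while being expressed in depth-value units that are directly comparable between neighbouring sub-blocks.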
In one preferred embodiment, the sub-blocks of the coded depth map block can be coded according to the z-order determined in step 105, as shown in
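A z-order traversal of quadtree sub-blocks can be sketched by sorting the sub-blocks by a key obtained from interleaving the bits of their top-left coordinates (a Morton code). This is one common way to obtain a z-order and is given here as an assumption about the implementation; the names `morton_index` and `z_order` are hypothetical.

```python
def morton_index(x, y, bits=16):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def z_order(sub_blocks):
    """Sort (x0, y0, size) sub-blocks by the z-order key of their top-left corner."""
    return sorted(sub_blocks, key=lambda b: morton_index(b[0], b[1]))


# Sub-blocks of an 8x8 block, listed in arbitrary order.
leaves = [(4, 4, 4), (0, 4, 4), (2, 0, 2), (0, 0, 2),
          (2, 2, 2), (0, 2, 2), (4, 0, 4)]
ordered = z_order(leaves)
```

For sub-blocks aligned to a quadtree, this ordering visits the top-left, top-right, bottom-left and bottom-right quadrants recursively, so the left and top neighbours of a sub-block are always coded before it.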
The values p0, p1, p2 for the currently-encoded sub-block can be predicted in step 106 (as predictors p̂0, p̂1, p̂2) on the basis of an extension of the planes defined by the values pR0, pR1, pR2 in the earlier-encoded sub-blocks, wherein R ∈ {LT, L, LB, T, RT}. The prediction can also be made on the basis of a predefined value stored at the coder. Of course, the prediction is not done for the first sub-blocks in the coding order, which have no corresponding earlier-encoded sub-blocks.
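Prediction by plane extension can be sketched as evaluating a neighbouring sub-block's plane outside that sub-block, at the position of a plane point of the current sub-block. The point positions are an assumption made for this illustration (first point at (0, 0), second at (n−1, 0), third at (0, n−1) of each sub-block), and the function name `predict_point` and the example values are hypothetical.

```python
def predict_point(neighbour, dx, dy):
    """Extend a neighbouring sub-block's plane to the offset (dx, dy).

    neighbour is (q0, q1, q2, n): the three plane points and the size of an
    earlier-encoded sub-block. Assumption: q0, q1, q2 are located at
    (0, 0), (n-1, 0) and (0, n-1). The offset (dx, dy) is the position of the
    predicted point relative to the neighbour's top-left pixel.
    """
    q0, q1, q2, n = neighbour
    alpha = (q1 - q0) / (n - 1)
    beta = (q2 - q0) / (n - 1)
    return q0 + alpha * dx + beta * dy


# Left neighbour L: a 4x4 sub-block with the plane d = 2*x + 1*y + 10.
left = (10, 16, 13, 4)
# The current 4x4 sub-block C starts immediately to the right of L, so its
# plane points lie at these offsets in L's coordinate system.
n_c = 4
pred_p0 = predict_point(left, 4, 0)
pred_p1 = predict_point(left, 4 + n_c - 1, 0)
pred_p2 = predict_point(left, 4, n_c - 1)
```

In a practical coder the predictors would typically be rounded in the same way at the encoder and the decoder so that both sides obtain identical values; that detail is omitted from this sketch.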
For example, as shown in
The position of point p0 of the currently-encoded sub-block C with respect to the earlier-encoded sub-block LT is referenced as xLTp
The predictor p̂0 of the value of p0 can be determined as the value of the depth map corresponding to the extension of plane (pLT0, pLT1, pLT2) defined for the earlier-encoded sub-block LT according to the formula:
The predictor p̂2 of the value of p2 can be determined as the value of the depth map corresponding to the extension of plane (pLT0, pLT1, pLT2) defined for the earlier-encoded sub-block LT according to the formula:
The predictor p̂1 of the value of p1 can be determined as the value of the depth map corresponding to the extension of plane (pT0, pT1, pT2) defined for the earlier-encoded sub-block T according to the formula:
In the depth-map coding method according to the disclosure, the encoder transmits in step 108 the parameters of the plane for the currently-encoded sub-block C in the form of the differences Δp0, Δp1, Δp2 between the actual values of p0, p1, p2 and the values of the corresponding predictors p̂0, p̂1, p̂2:
Δp0=p0−p̂0
Δp1=p1−p̂1
Δp2=p2−p̂2
For the first sub-blocks in the coding order, which have no corresponding earlier-encoded sub-blocks, the difference may correspond to the actual value of the plane points p0, p1, p2, as the predictors may be assumed to have a zero value.
At the decoder, the actual plane points p0, p1, p2 of the currently-decoded sub-block are recovered by inverting the above formulas, i.e. by adding the received difference values to the predictors calculated from the earlier-decoded sub-blocks.
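The differential coding and its inversion at the decoder can be illustrated with a short round-trip sketch. The function names and the numeric values are hypothetical, and the predictors are assumed to be integers that are identical at the encoder and the decoder.

```python
def encode_differences(points, predictors):
    """Encoder side: delta_i = p_i - predictor_i for the three plane points."""
    return tuple(p - q for p, q in zip(points, predictors))


def decode_points(differences, predictors):
    """Decoder side: p_i = delta_i + predictor_i."""
    return tuple(d + q for d, q in zip(differences, predictors))


actual = (19, 24, 20)        # plane points of the current sub-block
predicted = (18, 24, 21)     # predictors from earlier-coded sub-blocks
deltas = encode_differences(actual, predicted)
restored = decode_points(deltas, predicted)

# First sub-block in coding order: zero predictors, so the differences
# equal the actual plane points.
first = encode_differences(actual, (0, 0, 0))
```

When the prediction is accurate the differences are small in magnitude, which is what makes them cheaper to transmit than the plane points themselves.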
The present disclosure provides improved encoding efficiency of the depth map. Therefore, the method provides a useful, concrete and tangible result.
According to the present method, certain computer data are processed in a processing device according to
It can be easily recognized, by one skilled in the art, that the aforementioned method for predictive encoding of a depth map may be performed and/or controlled by one or more computer programs. Such computer programs are typically executed by utilizing the computing resources in a computing device. Applications are stored on a non-transitory medium. An example of a non-transitory medium is a non-volatile memory, for example a flash memory, while an example of a volatile memory is RAM. The computer instructions are executed by a processor. These memories are exemplary recording media for storing computer programs comprising computer-executable instructions performing all the steps of the computer-implemented method according to the technical concept presented herein.
While the method presented herein has been depicted, described, and has been defined with reference to particular preferred embodiments, such references and examples of implementation in the foregoing specification do not imply any limitation on the method. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader scope of the technical concept. The presented preferred embodiments are exemplary only, and are not exhaustive of the scope of the technical concept presented herein.
Accordingly, the scope of protection is not limited to the preferred embodiments described in the specification, but is only limited by the claims that follow.
Number | Date | Country | Kind |
---|---|---|---|
PL412833 | Jun 2015 | PL | national |