Some volumetric, augmented reality, and virtual reality applications represent a three-dimensional scene as a series of images at different distances (depth planes) to a viewer of the scene. To render such a scene from a desired viewpoint, each depth plane can be processed in turn and composited with the others to simulate a two-dimensional projection of the three-dimensional scene at the desired viewer position. This two-dimensional projection can then be displayed on a head-mounted device, mobile phone, or other flat screen. By dynamically adjusting the two-dimensional projection based on the position of the viewer, the experience of being in a three-dimensional scene can be simulated.
Decreasing the number of depth planes required to accurately represent a three-dimensional scene is valuable because such a reduction decreases the amount of data that must be processed. In embodiments disclosed herein, the number of depth planes is reduced while ensuring that an accurate simulation can be rendered that meets or just exceeds the ability of the human visual system to perceive depth. Embodiments disclosed herein include a “Depth Perceptual Quantization” function, or DPQ, that relates physical distances in depth (depth planes) to the capabilities of the human visual system, such as visual acuity. Each depth plane calculated by the DPQ differs from an adjacent plane by a constant “just noticeable difference.”
In a first aspect, a method for representing a three-dimensional scene stored as a three-dimensional data set is disclosed. The method includes determining a quantity P depth-plane depths along a first viewing direction relative to a first vantage point. A separation ΔD between each proximal depth D and an adjacent distal depth (D+ΔD) of the P depth-plane depths is a just-noticeable difference determined by (i) the proximal depth D, (ii) a lateral offset Δx, perpendicular to the first viewing direction, and between the first vantage point and a second vantage point, and (iii) a visual angle Δϕ subtended by separation ΔD when viewed from the second vantage point. The method also includes generating, from the three-dimensional data set, a proxy three-dimensional data set that includes P proxy images Ik. Generating the proxy three-dimensional data set is accomplished by, for each depth-plane depth of the P depth-plane depths: generating a proxy image of the P proxy images from at least one cross-sectional image of a plurality of transverse cross-sectional images that (i) constitute the three-dimensional data set and (ii) each represent a respective transverse cross-section of the three-dimensional scene at a respective scene-depth of a plurality of scene-depths.
In a second aspect, an encoder includes a processor and a memory. The memory stores machine readable instructions that when executed by the processor, control the processor to execute the method of the first aspect.
In a third aspect, a display device includes an electronic visual display, a processor, and a memory. The memory stores machine readable instructions that when executed by the processor, control the processor to, for each proxy image Ik of P proxy images, k=0, 1, . . . , (P−1): (i) determine a respective scene-depth Dk of proxy image Ik as a linear function of

[((k/Pd)^(1/m) − c2) / (c1 − c3·(k/Pd)^(1/m))]^(1/n),

where m, n, c1, c2, and c3 are predetermined values and Pd=(P−1), and (ii) display proxy image Ik at scene depth Dk on the electronic visual display.
In a fourth aspect, a method for representing depth-plane data includes, for each of a plurality of two-dimensional images each corresponding to a respective one of a plurality of depths D within a three-dimensional scene: (i) determining a normalized depth D′ from the depth D; (ii) computing a normalized perceptual depth DPQ that equals

[(c1·D′^n + c2) / (c3·D′^n + 1)]^m,

and (iii) representing the normalized perceptual depth DPQ as a binary code value DB, where m, n, c1, c2, and c3 are predetermined values.
Devices and methods disclosed herein determine depth-plane locations based on the limits of spatial acuity (the ability to perceive fine detail). This approach differs from methods that rely on binocular acuity (the ability to perceive a different image in each of two eyes). By leveraging spatial acuity, embodiments disclosed herein ensure accurate representation of high-frequency occlusions that exist when one object is obscured by another from one viewing position but is visible from another.
The depth-plane location methods disclosed herein consider motion parallax, which arises when an observer moves while observing a scene and thereby views it from a different perspective. The change in the image between two different vantage points provides a strong depth cue. Other methods consider only the difference in vantage point between two eyes, typically 6.5 cm. Embodiments herein accommodate, and are designed for, a much longer baseline, such as 28 cm of movement, which results in many more perceptual depth planes.
Each of memory 104 and 164 may be transitory and/or non-transitory and may include one or both of volatile memory (e.g., SRAM, DRAM, computational RAM, other volatile memory, or any combination thereof) and non-volatile memory (e.g., FLASH, ROM, magnetic media, optical media, other non-volatile memory, or any combination thereof). Part or all of memory 104 and 164 may be integrated into processor 102 and 162, respectively.
Three-dimensional data set 150 includes a quantity S transverse cross-sectional images 152, each of which represents a respective transverse cross-section of the three-dimensional scene at a respective scene-depth 154 (0, 1, . . . , S−1). Quantity S is greater than quantity P. Proxy three-dimensional data set 170 includes P proxy images 172 (0, 1, . . . , P−1). For each depth-plane depth 174(k), encoder 168 generates a proxy image 172(k) from at least one transverse cross-sectional image 152. Index k is one of P integers, i.e., an integer between and including zero and (P−1). One of the respective scene-depths 154 of the at least one transverse cross-sectional image 152 is most proximate to depth-plane depth 174(k).
Decoder 132 decodes proxy three-dimensional data set 170 and transmits the decoded data to display 110, which displays it as three-dimensional scene 112. Three-dimensional scene 112 includes P proxy images 172 (0, 1, . . . , P−1), each of which is located at a respective depth-plane depth 174 (0, 1, . . . , P−1) along a direction z and parallel to the x-y plane of a three-dimensional Cartesian coordinate system 118. On coordinate system 118, depth-plane depths 174 are denoted as z0, z1, . . . , zP−1 along the z axis.
Equation (1) can be written in terms of trigonometric functions as:

Δϕ = arctan(Δx/D) − arctan(Δx/(D + ΔD)).    (2)

Solving equation (2) for ΔD yields equation (3), which is an example depth quantization function:

ΔD = tan(Δϕ)·(D² + Δx²) / (Δx − D·tan(Δϕ)).    (3)
To use equation (3), a range of depth planes must be specified. Recommendation ITU-R BT.1845 specifies the closest distance at which the human eye can comfortably focus as Dmin=0.25 m. For Dmax we choose the value at which the denominator of equation (3) reaches zero and ΔD becomes infinite, which occurs at

Dmax = Δx / tan(Δϕ),

which depends on both the choice of baseline distance Δx and the visual acuity Δϕ.
The value of Δx must also be specified. This is the minimum movement that an observer must make to perceive a change in depth between object 221 and object 222. For images that are intended to be viewed on a display, Δx can be computed from the “ideal viewing distance” specified in ITU-R BT.1845 as the point where a width Δw of each pixel matches the visual acuity Δϕ, as shown in
Computing Δx for the closest viewing distance D=Dmin, we calculate Δx=0.28 meters, which results in Dmax=960 m. Larger movements may exceed the just-noticeable difference (JND), but since it is impossible for a single observer to view from both positions simultaneously, they must rely on their working memory to compare the views from both perspectives.
Using equation (3), starting at Dmin and incrementing by ΔD until reaching Dmax, allows us to build up a table of P depth-plane depths 174, where each depth-plane depth 174 differs from the previous one by a perceptual amount. The final depth plane is set to D=Dmax. Hence, proxy three-dimensional data set 170 is a memory-efficient representation, or proxy, for three-dimensional data set 150. The computational resources required for device 100 to display and refresh views of three-dimensional scene 112 as viewer 191 moves along the x′ axis are less for data set 170 than for data set 150.
The number of unique depth planes under the above conditions is P=2890. To show a smooth continuous gradient spanning half of the screen (for example a railway disappearing into the distance from a bottom edge of the screen to a top edge as shown in three-dimensional scene 112) while allowing observer movement Δx=0.28 meters, nearly three thousand unique depth planes may be required.
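To make this construction concrete, the following Python sketch iterates equation (3) from Dmin to Dmax and counts the resulting depth planes. It is a minimal illustration, assuming the reconstructed form of equation (3) given above and the parameter values Dmin=0.25 m, Dmax=960 m, Δx=0.28 m, and Δϕ=1 arcminute; the function name build_depth_planes is hypothetical.

```python
import math

def build_depth_planes(d_min=0.25, d_max=960.0, dx=0.28,
                       dphi=math.radians(1.0 / 60.0)):
    """Build a table of perceptually spaced depth-plane depths (meters).

    Assumes equation (3) in the form
        dD = tan(dphi) * (D**2 + dx**2) / (dx - D * tan(dphi)),
    iterated from d_min; the final depth plane is clamped to d_max.
    """
    t = math.tan(dphi)
    depths = [d_min]
    d = d_min
    while True:
        dd = t * (d * d + dx * dx) / (dx - d * t)  # just-noticeable depth step
        d += dd
        if d >= d_max:
            depths.append(d_max)  # final depth plane set to D = Dmax
            break
        depths.append(d)
    return depths

planes = build_depth_planes()
print(len(planes))  # on the order of 2890 planes for these parameters
```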
It is possible to achieve an invertible functional fit to mapping 510, which maps a plurality of actual depths D to respective depth-plane depths DPQ. The functional form of equation (5) is one such mapping, where depth-plane depths DPQ best fit mapping 510 for properly chosen values of exponent n and coefficients c1, c2, and c3:

DPQ = (c1·D′^n + c2) / (c3·D′^n + 1).    (5)

The right-hand side of equation (5) may have other forms without departing from the scope hereof.
In equation (5), D′ is normalized depth D/Dmax and DPQ is a normalized depth of a corresponding perceptual depth plane. DPQ ranges from 0 to 1. Coefficients c1, c2, and c3 satisfy c3=c1+c2−1 and c2=−c1(Dmin/Dmax)n. In embodiments, values of c2 and c3 are determined such that DPQ (Dmin)=0 and DPQ (Dmax)=1. In an embodiment, Dmax equals 960 meters, c1=2,620,000, and exponent n equals ¾.
A more accurate functional fit may be obtained using the functional form specified in equation (6), which adds an exponent m to the right side of equation (5); that is, equation (5) is the particular instance of equation (6) in which m equals one:

DPQ = [(c1·D′^n + c2) / (c3·D′^n + 1)]^m.    (6)

In embodiments, exponent n=1.
As in equation (5), values of c2 and c3 may be determined such that DPQ (Dmin)=0 and DPQ (Dmax)=1. The relationships among coefficients c1, c2, and c3 are the same as stated above for equation (5). In an embodiment, Dmax equals 960 meters, c1=2,620,000, and exponent n equals 3872/4096, and m=5/4.
Depth-plane depths DPQ of equation (6) are an example of depth-plane depths 174. If the unit of DPQ is not explicitly mentioned, each depth-plane depth DPQ is a normalized depth ranging from zero to one. In other embodiments, each depth-plane depth DPQ has units of length, and ranges from Dmin to Dmax.
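The following Python sketch evaluates the forward mapping of equation (6) using the coefficient relationships stated above (c2=−c1(Dmin/Dmax)^n and c3=c1+c2−1). It assumes the reconstructed form of equation (6) and the example constants c1=2,620,000, n=3872/4096, and m=5/4; the function name dpq_forward is hypothetical.

```python
def dpq_forward(depth_m, d_min=0.25, d_max=960.0,
                c1=2_620_000.0, n=3872 / 4096, m=5 / 4):
    """Map a physical depth (meters) to a normalized perceptual depth in [0, 1].

    Assumes equation (6) in the form
        DPQ = ((c1 * D'**n + c2) / (c3 * D'**n + 1))**m,
    with c2 = -c1 * (Dmin/Dmax)**n and c3 = c1 + c2 - 1, so that
    DPQ(Dmin) = 0 and DPQ(Dmax) = 1.
    """
    c2 = -c1 * (d_min / d_max) ** n
    c3 = c1 + c2 - 1.0
    d_norm = depth_m / d_max          # D' = D / Dmax
    x = d_norm ** n
    return ((c1 * x + c2) / (c3 * x + 1.0)) ** m

print(round(dpq_forward(0.25), 6))    # ~0.0 at Dmin
print(round(dpq_forward(960.0), 6))   # 1.0 at Dmax
```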
Equation (7) is an inverted form of equation (6), and hence is an explicit expression for normalized depth D′=D/Dmax as a function of depth-plane depth DPQ, coefficients c1, c2, and c3, and exponents m and n:

D′ = [(DPQ^(1/m) − c2) / (c1 − c3·DPQ^(1/m))]^(1/n).    (7)
Equation (8) is an indexed version of equation (7), where k/Pd replaces DPQ, D′k replaces D′, and index k ranges from 0 to Pd, where Pd=(P−1). Equation (8) also includes a coefficient μ and an offset β:

D′k = μ·[((k/Pd)^(1/m) − c2) / (c1 − c3·(k/Pd)^(1/m))]^(1/n) + β.    (8)
If the unit of D′k is not explicitly mentioned, β equals zero and μ equals one, such that D′k represents a normalized depth Dk/Dmax. In other embodiments, β and μ have units of length and are chosen such that D′k(k=0) equals Dmin and D′k(k=P−1) equals Dmax, in which case D′k is no longer normalized.
In embodiments, software 130 of device 100 includes machine readable instructions that, when executed by the processor, (i) control the processor to, for each proxy image 172 (0−Pd), determine a respective normalized scene-depth D′k according to equation (8), and (ii) display each proxy image 172 (0−Pd) on the display 110 at a scene depth determined from the normalized scene-depth D′k.
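A decoder-side sketch of equation (8) in Python is shown below. It assumes the reconstructed, indexed inverse form given above, with μ=1 and β=0 for the normalized case, and with μ=Dmax, β=0 chosen so that k=0 maps to Dmin and k=Pd maps to Dmax in the denormalized case; the function name scene_depth_for_plane is hypothetical.

```python
def scene_depth_for_plane(k, p, d_min=0.25, d_max=960.0,
                          c1=2_620_000.0, n=3872 / 4096, m=5 / 4,
                          normalized=True):
    """Return the depth at which proxy image I_k should be displayed.

    Assumes equation (8) in the form
        D'_k = mu * (((k/Pd)**(1/m) - c2) / (c1 - c3*(k/Pd)**(1/m)))**(1/n) + beta,
    with Pd = P - 1, c2 = -c1 * (Dmin/Dmax)**n and c3 = c1 + c2 - 1.
    """
    c2 = -c1 * (d_min / d_max) ** n
    c3 = c1 + c2 - 1.0
    pd = p - 1
    v = (k / pd) ** (1.0 / m)
    d_norm = ((v - c2) / (c1 - c3 * v)) ** (1.0 / n)
    if normalized:
        return d_norm             # mu = 1, beta = 0: D_k / Dmax
    return d_norm * d_max         # mu = Dmax, beta = 0: depth in meters

print(scene_depth_for_plane(0, 2890, normalized=False))     # ~0.25 m (Dmin)
print(scene_depth_for_plane(2889, 2890, normalized=False))  # 960.0 m (Dmax)
```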
Step 720 includes determining a quantity P depth-plane depths along a first viewing direction relative to a first vantage point. A separation ΔD between each proximal depth D and an adjacent distal depth (D+ΔD) of the P depth-plane depths is a just-noticeable difference determined by (i) the proximal depth D, (ii) a lateral offset Δx, perpendicular to the first viewing direction, and between the first vantage point and a second vantage point, and (iii) a visual angle Δϕ subtended by separation ΔD when viewed from the second vantage point. In an example of step 720, encoder 168 determines depth-plane depths 174.
In embodiments, the visual angle Δϕ is one arcminute. In embodiments, each of the P depth-plane depths exceeds a minimum depth D0 and is denoted by Dk, k=0, 1, 2, . . . , (P−1), and determining the P depth-plane depths comprises iteratively determining depth Dk+1=Dk+ΔDk. In such embodiments, separation ΔDk may be equal to

ΔDk = tan(Δϕ)·(Dk² + Δx²) / (Δx − Dk·tan(Δϕ)),

which is an example of equation (3).
In embodiments, method 700 includes step 710, which includes determining lateral offset Δx from the visual angle Δϕ and a predetermined minimum depth-plane depth of the P depth-plane depths. In an example of step 710, software 166 determines lateral offset Δx using equation (4) where D equals depth-plane depth 174(0).
Step 730 includes generating, from the three-dimensional data set, a proxy three-dimensional data set that includes P proxy images Ik. Generating the proxy three-dimensional data set is accomplished by, for each depth-plane depth of the P depth-plane depths: generating a proxy image of the P proxy images from at least one cross-sectional image of a plurality of transverse cross-sectional images that (i) constitute the three-dimensional data set and (ii) each represent a respective transverse cross-section of the three-dimensional scene at a respective scene-depth of a plurality of scene-depths. In embodiments, one of the respective scene-depths of the at least one cross-sectional image is most proximate to the depth-plane depth. In an example of step 730, encoder 168 generates proxy three-dimensional data set 170 from three-dimensional data set 150. Data sets 150 and 170 include transverse cross-sectional images 152 and proxy images 172, respectively, as illustrated in
When the at least one cross-sectional image of step 730 includes multiple cross-sectional images, step 730 may include step 732. Step 732 includes generating the proxy image by averaging the multiple cross-sectional images. The final depth plane may be constructed by averaging the values of all depths beyond Dmax. The first depth plane may be constructed by averaging the values of all depths below Dmin. In an example of step 732, encoder 168 generates each proxy image 172 as an average of two or more transverse cross-sectional images 152.
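The following NumPy sketch illustrates one way steps 730 and 732 could be realized: each transverse cross-sectional image is assigned to its nearest depth-plane depth, and each proxy image is the average of the images assigned to it. The nearest-plane assignment, the array layout, and the function name build_proxy_images are assumptions for illustration, not the implementation of encoder 168 itself.

```python
import numpy as np

def build_proxy_images(slices, slice_depths, plane_depths):
    """Collapse S transverse cross-sectional images into P proxy images.

    `slices` is an (S, H, W) array of cross-sectional images, `slice_depths`
    holds the S scene-depths (meters), and `plane_depths` holds the P
    depth-plane depths (meters).  Each slice is assigned to the nearest
    depth plane, and each proxy image is the average of the slices assigned
    to it; a plane with no assigned slice is left as zeros.
    """
    slice_depths = np.asarray(slice_depths, dtype=np.float64)
    plane_depths = np.asarray(plane_depths, dtype=np.float64)
    # Index of the nearest depth plane for every slice.
    nearest = np.abs(slice_depths[:, None] - plane_depths[None, :]).argmin(axis=1)
    proxies = np.zeros((len(plane_depths),) + slices.shape[1:], dtype=np.float64)
    for k in range(len(plane_depths)):
        assigned = slices[nearest == k]
        if len(assigned):
            proxies[k] = assigned.mean(axis=0)  # average of contributing slices
    return proxies
```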
Step 740 includes, for each proxy image Ik of the P proxy images, k=0, 1, 2, . . . , (P−1), determining a respective scene-depth Dk of proxy image Ik as a linear function of

[((k/Pd)^(1/m) − c2) / (c1 − c3·(k/Pd)^(1/m))]^(1/n),

where m, n, c1, c2, and c3 are predetermined values and Pd=(P−1). In embodiments, each scene-depth Dk equals

μ·[((k/Pd)^(1/m) − c2) / (c1 − c3·(k/Pd)^(1/m))]^(1/n) + β,

as in equation (8).
In an example of step 740, either encoder 168 or decoder 132 determines, for each proxy image 172(k), a respective depth-plane depth 174(k) according to equation (7) where DPQ equals k/Pd and depth-plane depth 174(k) equals scene-depth Dk.
In embodiments, step 740 includes reading the quantities Dmin, Dmax, and P from metadata of the three-dimensional data set. For example, quantities Dmin, Dmax, and P may be stored as metadata of three-dimensional data set 150, which is read by software 166. In embodiments, each of Dmin and Dmax is a 10-bit fixed-point value, with respective values of 0.25 meters and 960 meters if the fixed-point value is zero. In embodiments, P is a 12-bit fixed-point value.
Step 750 includes displaying each proxy image Ik at its respective depth-plane depth. In an example of step 750, device 100 displays at least one proxy image 172(k) at depth-plane depth 174(k), shown within three-dimensional scene 112 as zk. When method 700 includes step 740, each respective depth-plane depth of step 750 equals a respective scene-depth D′k of step 740; for example, depth-plane depth 174(k) equals scene-depth D′k.
In embodiments, steps 720 and 730 are executed by a first device, such as encoding device 160,
Method 800 includes steps 810, 820, and 830, each of which is executed for each of a plurality of two-dimensional images, each corresponding to a respective one of a plurality of depths D within a three-dimensional scene. In embodiments, transverse cross-sectional images 152 constitute the plurality of two-dimensional images and scene-depths 154 constitute the plurality of depths D.
Step 810 includes determining a normalized depth D′ from the depth D. In an example of step 810, software 130 determines a respective normalized depth from each scene-depth 154.
Step 820 includes computing a normalized perceptual depth DPQ according to equation (6). In an example of step 820, software 130 determines a respective depth-plane depth 174 from each scene-depth 154 divided by Dmax. In this example, the depth-plane depths are normalized depths.
Step 830 includes representing the normalized perceptual depth DPQ as a binary code value DB. In an example of step 830, software 130 represents each depth plane-depth 174 as a respective binary code value. In embodiments, the bit depth of the binary code value DB is one of eight, ten, or twelve. Step 830 may also include storing each binary code value on a non-transitory storage media, which may be part of memory 104.
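A small Python sketch of step 830 is shown below: it quantizes a normalized perceptual depth DPQ in [0, 1] to an integer code value DB at a selectable bit depth. Full-range linear quantization and the function names dpq_to_code and code_to_dpq are assumptions for illustration; the text states only that the bit depth may be eight, ten, or twelve.

```python
def dpq_to_code(dpq, bit_depth=10):
    """Represent a normalized perceptual depth DPQ in [0, 1] as an integer
    binary code value DB, using full-range quantization (an assumption)."""
    levels = (1 << bit_depth) - 1
    return int(round(min(max(dpq, 0.0), 1.0) * levels))

def code_to_dpq(code, bit_depth=10):
    """Invert the quantization back to a normalized perceptual depth."""
    return code / ((1 << bit_depth) - 1)

print(dpq_to_code(0.0), dpq_to_code(1.0))  # 0 and 1023 at 10-bit depth
```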
Features described above as well as those claimed below may be combined in various ways without departing from the scope hereof. The following enumerated examples illustrate some possible, non-limiting combinations.
Changes may be made in the above methods and systems without departing from the scope of the present embodiments. It should thus be noted that the matter contained in the above description or shown in the accompanying drawings should be interpreted as illustrative and not in a limiting sense. Herein, and unless otherwise indicated the phrase “in embodiments” is equivalent to the phrase “in certain embodiments,” and does not refer to all embodiments. The following claims are intended to cover all generic and specific features described herein, as well as all statements of the scope of the present method and system, which, as a matter of language, might be said to fall therebetween.
This application claims priority to U.S. Provisional Application No. 63/195,898 and European Patent Application No. 21177381.7, both filed on Jun. 2, 2021, each of which is incorporated by reference in its entirety.
This application was filed as International Application No. PCT/US2022/031915 on Jun. 2, 2022.