The present invention relates generally to 3D displays.
In exemplary implementations of this invention, a display device produces a 3D image. The device comprises a stack of spatially-addressable light attenuating layers.
A controller performs calculations in order to control the device. In these calculations, tensors provide sparse, memory-efficient representations of a light field. The calculations include using weighted nonnegative tensor factorization (NTF) to solve an optimization problem. The tensor calculations may employ multiplicative update rules, and may be sufficiently efficient to achieve interactive refresh rates.
In exemplary implementations of this invention, a tensor display has N spatially addressable light attenuating layers and displays a temporal sequence of M frames. If the tensor display has a uniform backlight, the light field that it emits may be represented by an Nth-order, rank-M tensor. If the tensor display has a directional backlight, the light field that it emits may be represented by a rank-M tensor with an order equal to N+1 (where N does not count any spatially addressable light attenuating layers in the directional backlight itself). The controller performs calculations that involve such a tensor.
In exemplary implementations, the display device is automultiscopic: a viewer can see the 3D effect without wearing glasses. The 3D display appears different from different viewing angles, and thus exhibits both binocular disparity and motion parallax, which are cues for depth perception. Thus, the display device creates the illusion of looking at an actual 3D scene.
In exemplary implementations, the display device includes (1) at least one spatially addressable light attenuating layer, and (2) a backlight. For example, each layer may comprise a liquid crystal display (LCD) layer. Light from the backlight is transmitted through the layer(s). The layer(s) are time-multiplexed: i.e., configured to produce a time-multiplexed image, in which a sequence of frames is shown at a rate equal to or faster than the flicker fusion frequency. This causes a human viewer to perceive a time average of the frames.
In exemplary implementations of this invention, the controller comprises one or more processors. Using weighted NTF, the controller calculates an optimal set of attenuations induced in light at respective pixels in the layers in the stack. If the backlight is directional, the optimal set may also include angles of light rays emitted by respective pixels of the backlight. The controller outputs control signals that cause the display device to produce this optimal set of attenuations (and light ray angles, if applicable) on a per pixel basis. Thus, the controller causes the device to output a light field that optimally approximates the target light field. If a human looks at the outputted light field without special glasses, the human perceives a 3D image.
In exemplary implementations of this invention, the directional backlight is effectively a low resolution light field display. This directional backlight can control the angle of light that it emits, and can be used with an arbitrary number of light attenuator layers (e.g., one or more such layers).
Here are two examples of a tensor display with a directional backlight.
In the first example, a purely angular backlight illuminates a single spatially addressable, light attenuating layer. The light attenuator layer comprises an LCD layer. The purely angular backlight can vary the angle of light that it emits, but does not have spatially addressable light attenuating pixels. Both the light attenuator layer and the purely angular backlight are time-multiplexed.
In the second example, the tensor display includes two spatially addressable light attenuator layers: (1) a high spatial resolution LCD in front, and (2) a secondary LCD layer in back. The secondary LCD layer is part of the directional backlight. Again, both the front LCD and the backlight are time-multiplexed.
In this second example, the directional backlight may comprise, from back to front: (1) an array of cold-cathode fluorescent lamps (CCFLs); (2) one or more lenses; and (3) the secondary LCD layer, positioned at a distance from the lens(es) equal to the focal length of the lens(es). By controlling which of the CCFLs behind the lens(es) are lit, the angle of light transmitted through respective pixels in the secondary LCD can be controlled.
Alternately, other types of light emitters may be used in the directional backlight. For example, the CCFLs may be replaced with other light-emitting devices, including light-emitting diodes (LEDs), including organic light-emitting diodes (OLEDs). Alternately, the stack of light attenuating layers may comprise two or more LCD layers, in addition to any LCD layer that is part of the directional backlight.
In some implementations, the display device is illuminated by a uniform backlight. For example, the stack may comprise two, three or more layers, where each layer is a spatially addressable light attenuator and none of the layers are part of the uniform backlight.
As used herein, a “tensor display” is an automultiscopic display device wherein: (1) the device includes one or more spatially addressable, light attenuating layers; (2) the device includes a controller, which is configured to perform calculations to control the device; and (3) the calculations involve using weighted NTF.
In exemplary implementations of this invention, time-multiplexed tensor displays have many practical advantages. For example, such tensor displays can be brighter than existing automultiscopic displays. Further, such tensor displays may have wider fields of view, greater depths of field and thinner form factors than existing automultiscopic displays. For these reasons, such tensor displays are well suited for producing 3D displays in mobile devices, such as tablets, smartphones and cell phones. More generally, tensor displays may be used in any flat screen display device, including in a monitor for (i) a personal computer (PC), (ii) a laptop computer, or (iii) home theater.
Tensor-based calculations can be used in a wide variety of architectures. Among other things, these calculations can be used to control an arbitrary number of light attenuating layers (e.g., two, three, four or more layers) and, if the backlight is directional, to control angle of light emitted by the backlight.
The description of the present invention in the Summary and Abstract sections hereof is just a summary. It is intended only to give a general introduction to some illustrative implementations of this invention. It does not describe all of the details of this invention. This invention may be implemented in many other ways.
The above Figures illustrate some illustrative implementations of this invention, or provide information that relates to those implementations. However, this invention may be implemented in many other ways. The above Figures do not show all of the details of this invention.
In exemplary implementations, this invention comprises a tensor display, i.e., an automultiscopic display device wherein: (1) the device includes one or more spatially addressable, light attenuating layers; (2) the device includes a controller, which is configured to perform calculations to control the device; and (3) the calculations involve using weighted NTF. Tensor displays can be illuminated by either uniform or directional backlighting (e.g. a low-resolution light field emitter).
In exemplary implementations of this invention, an N-layer, M-frame tensor display is illuminated by a uniform backlight, and the light field that it emits can be represented by an Nth-order, rank-M tensor. The light field tensor is decomposed as a sum of M rank-1 tensors, each corresponding to the outer product of N masks representing the transmittance of each layer for each frame. (If the backlight is directional, then the tensor display emits a light field that can be represented by an (N+1)th-order, rank-M tensor, where N does not count any spatially addressable light attenuating layers in the directional backlight itself.) Using this representation, a unified optimization framework based on nonnegative tensor factorization (NTF) may be applied to a wide variety of tensor display architectures. For example, this NTF optimization framework may be employed for tensor display architectures comprising: (1) multiple, time-multiplexed layers with directional backlighting; (2) multiple, time-multiplexed layers with uniform backlighting; (3) static, multilayer displays with three or more layers and uniform backlighting; (4) two-layer, temporally-multiplexed displays; or (5) a full resolution 2D mode (in which all but one of the layers are rendered transparent).
Advantageously, this NTF optimization framework allows joint multilayer, multiframe light field decompositions. Such decompositions significantly reduce artifacts observed with prior multilayer-only and multiframe-only decompositions.
In a prototype of this invention, a tensor display includes modified LCD panels and a custom integral imaging backlight. In this prototype, an efficient, GPU-based NTF implementation enables interactive applications.
In exemplary implementations, the one or more layers act as a compressive display. Imagery sent to the display is compressed, but in a way that allows the viewer to at least partially decompress it. In a stack with N spatially addressable, light attenuating layers, any light field sent to the display is reduced to N images, one for each of the layers. As these images are presented at a high frame rate, the viewer's eyes integrate the images into a high-rank approximation of the desired light field, thereby at least partially decompressing the imagery. The higher the rank of the approximation that is achieved, the less lossy the compression.
In exemplary implementations of this invention, directional backlighting (used with multiple, time-multiplexed light attenuating layers) achieves a wide field of view, while reducing the need for additional layers and frames, yielding a thin, power-efficient, high-resolution light field display well suited for mobile and home theater applications. For example, in a prototype, a low-resolution lenslet-based directional backlight is used with a high-resolution LCD. In this prototype, a target light field is decomposed into a low-rank tensor approximation, increasing brightness and allowing more views to be generated than available frames. The NTF optimization framework allows arbitrary combinations of directional backlights and multiple light-attenuating display layers.
In exemplary implementations of this invention, multiplicative light attenuation allows synthesized 3D objects to extend outside the enclosure. Furthermore, tensor displays support specularities, occlusions, and global illumination effects, without requiring moving parts.
In exemplary implementations, tensor displays enable trade-offs between image fidelity, resolution, brightness, and display complexity. These tensor displays employ compressive display modes, wherein low-rank tensor approximation efficiently exploits correlations between neighboring views to synthesize an emitted light field with an apparent number of views exceeding the number of frames. In contrast, prior direct display modes assign a single view to each frame, limiting resolution and brightness.
Tensor displays can provide greater depths of field, wider fields of view, and thinner form factors, compared to prior automultiscopic displays.
LCD layers 201, 211, 303, 305, 307 are configured to be able to temporally vary the light attenuation on a per pixel basis. For example, for each frame in a temporal sequence of frames, the controller (225 or 325) can control, for each respective pixel (e.g., 209, 213, 309, 311, 313) in the LCD layers, whether light is more or less attenuated as it is transmitted through the pixel.
In the examples shown in
First, consider 2D light fields and 1D layers (an extension to 4D light fields and 2D layers is covered later).
Static Multilayer Displays: Consider a fixed stack of N light-attenuating layers (i.e., one that does not support temporal variation of the mask patterns). When illuminated by a uniform backlight with unit radiance, the emitted light field $\tilde{l}(x,v)$ is given by the following expression:
$$\tilde{l}(x,v)=\prod_{n=1}^{N} f^{(n)}(\xi_n),\quad\text{for } \xi_n=x+(d_n/d_r)\,v \qquad\text{(Eq. 1)}$$
where $f^{(n)}(\xi_n)\in[0,1]$ is the transmittance at the point $\xi_n$ of layer n, separated a distance $d_n$ from the x-axis.
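The emitted light field of a static stack is the product of the layer transmittances at the points ξn = x + (dn/dr)v where a ray crosses each layer. The following is a minimal sketch of that product for pixelated 1D layers. The function name, the `pitch` parameter, the nearest-pixel lookup, and the choice to treat rays that miss a layer as blocked are illustrative assumptions, not details taken from this description.

```python
def emitted_light_field(layers, depths, d_r, xs, vs, pitch=1.0):
    """Emitted 2D light field of a static stack under a unit-radiance backlight.

    layers[n][i] is the transmittance in [0, 1] of pixel i on layer n,
    depths[n] is the distance d_n of layer n from the x-axis, and the ray
    (x, v) crosses layer n at xi_n = x + (d_n / d_r) * v.
    """
    field = []
    for x in xs:
        row = []
        for v in vs:
            radiance = 1.0  # uniform backlight with unit radiance
            for mask, d_n in zip(layers, depths):
                xi = x + (d_n / d_r) * v
                i = int(round(xi / pitch))  # nearest-pixel lookup (assumption)
                # Rays that miss a layer are treated as blocked (assumption).
                radiance *= mask[i] if 0 <= i < len(mask) else 0.0
            row.append(radiance)
        field.append(row)
    return field


# Two layers, one in front of and one behind the x-axis.
layers = [[1.0, 0.5, 1.0], [1.0, 1.0, 0.25]]
field = emitted_light_field(layers, depths=[-1.0, 1.0], d_r=1.0,
                            xs=[1.0], vs=[-1.0, 0.0, 1.0])
```

In this example the same position x is attenuated differently for each direction v, which is the source of the view-dependent appearance.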
Consider a three-layer configuration, with the transmittances for the rear, middle, and front layers given by $f(\xi_1)$, $g(\xi_2)$, and $h(\xi_3)$, respectively. Equation 1 gives the following expression for the emitted light field.
$$\tilde{l}(x,v)=f(\xi_1)\,g(\xi_2)\,h(\xi_3),\quad\text{for } \xi_n=x+(d_n/d_r)\,v \qquad\text{(Eq. 2)}$$
The emitted light field $\tilde{l}(x,v)$ can be represented as the restriction of the function
$$\tilde{t}(\xi_1,\xi_2,\xi_3)=f(\xi_1)\,g(\xi_2)\,h(\xi_3) \qquad\text{(Eq. 3)}$$
defined in the three-dimensional Euclidean space $\mathbb{R}^3$ spanned by $\{\xi_1,\xi_2,\xi_3\}$, to the two-dimensional subspace defined by the equation $\alpha\xi_1+\beta\xi_2+\gamma\xi_3=0$, with
$$\alpha=d_3-d_2,\quad \beta=d_1-d_3,\quad \gamma=d_2-d_1 \qquad\text{(Eq. 4)}$$
Thus, elements of the emitted light field $\tilde{l}(x,v)$ are restricted to the plane corresponding to Equation 4.
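The plane restriction follows from substituting ξn = x + (dn/dr)v into αξ1 + βξ2 + γξ3: the coefficient of x is α + β + γ = 0, and the coefficient of v/dr is αd1 + βd2 + γd3 = 0. The short check below evaluates that residual numerically for one ray. The function name and the sample values are illustrative assumptions.

```python
def plane_residual(x, v, depths, d_r):
    """Return alpha*xi1 + beta*xi2 + gamma*xi3 for the ray (x, v).

    depths = (d1, d2, d3) are the three layer distances; the result should be
    zero (up to rounding) for every ray, because each ray maps onto the plane.
    """
    d1, d2, d3 = depths
    xi1, xi2, xi3 = (x + (d / d_r) * v for d in depths)
    alpha, beta, gamma = d3 - d2, d1 - d3, d2 - d1
    return alpha * xi1 + beta * xi2 + gamma * xi3


residual = plane_residual(x=1.5, v=-0.75, depths=(-4.0, 0.0, 4.0), d_r=4.0)
```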
For the general case with N>3 layers, the emitted light field $\tilde{l}(x,v)$ can also be represented as the restriction of the function
$$\tilde{t}(\xi_1,\xi_2,\ldots,\xi_N)=\prod_{n=1}^{N} f^{(n)}(\xi_n)$$
defined on $\mathbb{R}^N$, to a plane.
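In practice, each layer has discrete pixels with constant transmittances rather than continuously-varying opacities. Thus, it is desirable to tabulate the transmittance $f_i$ of each pixel i of a layer as a vector ($\mathbf{f}$, $\mathbf{g}$ and $\mathbf{h}$ for the three layers of Equation 3), so that the function $\tilde{t}$ becomes the 3rd-order, rank-1 tensor
$$\tilde{\mathcal{T}}=\mathbf{f}\circ\mathbf{g}\circ\mathbf{h},\quad\text{such that } \tilde{t}_{ijk}=f_i\,g_j\,h_k, \qquad\text{(Eq. 5)}$$
where $\circ$ is the vector outer product.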
Only a subset of tensor elements $\tilde{t}_{ijk}$ correspond to valid light field rays; most tensor elements correspond to “non-physical” rays (i.e., ones that spontaneously change position or direction after passing through a layer). To address this limitation of tensor representation, define a sparse, binary-valued weight tensor $\mathcal{W}$ such that the emitted light field tensor $\tilde{\mathcal{L}}$ is given by the following expression:
$$\tilde{\mathcal{L}}=\mathcal{W}\circledast\tilde{\mathcal{T}}=\mathcal{W}\circledast(\mathbf{f}\circ\mathbf{g}\circ\mathbf{h}) \qquad\text{(Eq. 6)}$$
where $\circledast$ is the Hadamard (elementwise) product.
Non-zero elements of $\mathcal{W}$ are close to the plane defined by Equation 4. Tensors provide sparse, memory-efficient representations for static N-layer displays; only the non-zero elements of $\mathcal{W}$ are stored.
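To illustrate how sparse the set of valid rays is, the sketch below enumerates the index triples (i, j, k) that are actually hit by rays of a three-layer display and compares their count with the size of the full tensor. The sampling grid, the nearest-pixel rounding, and all names are assumptions made for this example.

```python
def valid_ray_indices(num_pixels, depths, d_r, xs, vs, pitch=1.0):
    """Return the set of (i, j, k) triples hit by at least one ray (x, v).

    These triples are the positions where the binary weight tensor is 1;
    every other element of the num_pixels**3 tensor is a non-physical ray.
    """
    support = set()
    for x in xs:
        for v in vs:
            idx = tuple(int(round((x + (d / d_r) * v) / pitch)) for d in depths)
            if all(0 <= i < num_pixels for i in idx):
                support.add(idx)
    return support


support = valid_ray_indices(num_pixels=8, depths=(-1.0, 0.0, 1.0), d_r=1.0,
                            xs=[float(x) for x in range(8)], vs=(-1.0, 0.0, 1.0))
dense_size = 8 ** 3  # elements in the full 3rd-order tensor
```

With these sample values only 20 of the 512 tensor elements correspond to physical rays, and each of them satisfies i - 2j + k = 0, the discrete form of the plane for equally spaced layers.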
Time-Multiplexed Multilayer Displays: static multilayer displays have finite degrees of freedom. Artifacts, resulting from limited depths of field and fields of view, persist in the emitted light field. These artifacts are typically observed as blur. These artifacts may be mitigated by increasing the degrees of freedom.
Increased degrees of freedom may be achieved by rapid temporal modulation, such that the observer perceives the average of an M-frame sequence.
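Generalizing Equation 1, the emitted light field $\tilde{l}(x,v)$ is given by
$$\tilde{l}(x,v)=\frac{1}{M}\sum_{m=1}^{M}\prod_{n=1}^{N} f_m^{(n)}(\xi_n),\quad\text{for } \xi_n=x+(d_n/d_r)\,v \qquad\text{(Eq. 7)}$$
where $f_m^{(n)}(\xi_n)$ is the transmittance at the point $\xi_n$ of layer n during frame m.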
Let columns of the matrix $F^{(n)}=[\mathbf{f}_1^{(n)}\ \mathbf{f}_2^{(n)}\ \cdots\ \mathbf{f}_M^{(n)}]$ define the sequence of M masks displayed on layer n. For a three-layer display, Equation 7 can be represented in discrete coordinates as a 3rd-order, rank-M tensor given by
$$\tilde{\mathcal{T}}=[\![F,G,H]\!]\equiv\sum_{m=1}^{M}\mathbf{f}_m\circ\mathbf{g}_m\circ\mathbf{h}_m \qquad\text{(Eq. 8)}$$
where matrices enclosed by double square brackets correspond to the CP decomposition (canonical polyadic decomposition) of a tensor into a sum of rank-1 tensors. Here the constant averaging factor 1/M is absorbed into the overall scale of the tensor.
The CP decomposition is equivalent to CANDECOMP (canonical decomposition) and PARAFAC (parallel factors), with elements of the tensor given by $\tilde{t}_{ijk}=\sum_{m=1}^{M} f_{im}\,g_{jm}\,h_{km}$. For the general case with N light-attenuating layers and M time-multiplexed frames, the emitted light field can be represented as an Nth-order, rank-M tensor $\tilde{\mathcal{T}}=[\![F^{(1)},F^{(2)},\ldots,F^{(N)}]\!]$.
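As a small illustration of the rank-M representation, the sketch below evaluates one element of a three-layer CP tensor from the mask matrices, and the corresponding time average that an observer would perceive over the M frames. The row-per-pixel, column-per-frame storage and the function names are assumptions made for this example.

```python
def cp_element(F, G, H, i, j, k):
    """Element t_ijk of [[F, G, H]]: the sum over frames m of
    F[i][m] * G[j][m] * H[k][m], where column m holds the masks of frame m."""
    num_frames = len(F[0])
    return sum(F[i][m] * G[j][m] * H[k][m] for m in range(num_frames))


def perceived_radiance(F, G, H, i, j, k):
    """Time average over the M frames for the ray through pixels (i, j, k)."""
    return cp_element(F, G, H, i, j, k) / len(F[0])


# Two pixels per layer, two frames (M = 2).
F = [[1.0, 0.0], [0.5, 1.0]]
G = [[1.0, 1.0], [0.0, 0.5]]
H = [[0.5, 1.0], [1.0, 0.0]]
value = cp_element(F, G, H, 1, 1, 0)
average = perceived_radiance(F, G, H, 0, 0, 1)
```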
Light field synthesis with time-multiplexed, multilayer displays requires decomposing a target light field $l(x,v)$ into an M-frame sequence of N transmittance functions $f_m^{(n)}(\xi_n)$. This can be formulated as the following constrained nonlinear least squares problem:
$$\underset{\{0\le f_m^{(n)}(\xi)\le 1\}}{\arg\min}\ \int_{V}\int_{X}\left(l(x,v)-\tilde{l}(x,v)\right)^2 dx\,dv \qquad\text{(Eq. 9)}$$
where $\tilde{l}(x,v)$ is the emitted light field, given by Equation 7, and X and V denote the intervals $[x_{min},x_{max}]$ and $[v_{min},v_{max}]$.
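The tensor representation discussed above provides an efficient means for solving Equation 9. Using this representation for a three-layer configuration with discrete coordinates, the objective function is expressed as
$$\underset{\{0\le F,G,H\le 1\}}{\arg\min}\ \left\|\mathcal{L}-\mathcal{W}\circledast[\![F,G,H]\!]\right\|^2 \qquad\text{(Eq. 10)}$$
where $\mathcal{L}$ is the target light field tensor, obtained by assigning the target light field $l(x,v)$ to the plane defined by Equation 4, and
$$\|\mathcal{X}\|^2=\sum_{i,j,k}x_{ijk}^2$$
is the squared tensor norm of $\mathcal{X}$.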
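This expression can be solved by applying weighted nonnegative tensor factorization (NTF) and multiplicative update rules. For a three-layer display, these update rules have the following forms:
$$F\leftarrow F\circledast\left((W_{(1)}\circledast L_{(1)})(H\odot G)\right)\diamond\left((W_{(1)}\circledast(F(H\odot G)^{T}))(H\odot G)\right) \qquad\text{(Eq. 11)}$$
$$G\leftarrow G\circledast\left((W_{(2)}\circledast L_{(2)})(H\odot F)\right)\diamond\left((W_{(2)}\circledast(G(H\odot F)^{T}))(H\odot F)\right) \qquad\text{(Eq. 12)}$$
$$H\leftarrow H\circledast\left((W_{(3)}\circledast L_{(3)})(G\odot F)\right)\diamond\left((W_{(3)}\circledast(H(G\odot F)^{T}))(G\odot F)\right) \qquad\text{(Eq. 13)}$$
In these expressions, $\circledast$ is the Hadamard (elementwise) product and $\diamond$ is Hadamard (elementwise) division. Also, in these expressions $\odot$ is the Khatri-Rao product, defined for a pair of matrices $A\in\mathbb{R}^{I\times K}$ and $B\in\mathbb{R}^{J\times K}$, such that
$$A\odot B=[\mathbf{a}_1\otimes\mathbf{b}_1\ \ \mathbf{a}_2\otimes\mathbf{b}_2\ \ \cdots\ \ \mathbf{a}_K\otimes\mathbf{b}_K], \qquad\text{(Eq. 14)}$$
where $\otimes$ is the Kronecker product and $\mathbf{a}_k$ and $\mathbf{b}_k$ denote the kth columns of A and B, respectively.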
These update equations also make use of the tensor matricization (unfolding) operation, defined such that X(n) arranges the mode-n fibers of X to be columns of the resulting matrix.
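The update rules rely on the fact that the mode-1 unfolding of a three-layer CP tensor equals F times the transpose of the Khatri-Rao product of H and G. The sketch below checks that identity on a small example using plain nested lists. The helper names and the column ordering of the unfolding (index j varying fastest) are assumptions chosen to match the common convention; they are not taken from the prototype's code.

```python
def khatri_rao(A, B):
    """Khatri-Rao product: column k is the Kronecker product of column k of A
    (I rows) with column k of B (J rows), giving an (I*J) x K matrix."""
    K = len(A[0])
    return [[A[i][k] * B[j][k] for k in range(K)]
            for i in range(len(A)) for j in range(len(B))]


def cp_tensor(F, G, H):
    """Dense tensor with t[i][j][k] = sum_m F[i][m] * G[j][m] * H[k][m]."""
    M = len(F[0])
    return [[[sum(F[i][m] * G[j][m] * H[k][m] for m in range(M))
              for k in range(len(H))]
             for j in range(len(G))]
            for i in range(len(F))]


def unfold_mode1(T):
    """Mode-1 unfolding: row i, column j + k*J holds T[i][j][k]."""
    J, K = len(T[0]), len(T[0][0])
    return [[row[j][k] for k in range(K) for j in range(J)] for row in T]


def matmul_transposed(A, B):
    """Return A times the transpose of B, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row_a, row_b)) for row_b in B]
            for row_a in A]


F = [[1, 2], [3, 4]]          # 2 pixels, 2 frames
G = [[1, 0], [2, 1], [0, 3]]  # 3 pixels, 2 frames
H = [[1, 1], [2, 0]]          # 2 pixels, 2 frames
lhs = matmul_transposed(F, khatri_rao(H, G))
rhs = unfold_mode1(cp_tensor(F, G, H))
```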
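For the general case with N light-attenuating layers and M frames, Equation 10 has the following form:
$$\underset{\{0\le F^{(n)}\le 1\}}{\arg\min}\ \left\|\mathcal{L}-\mathcal{W}\circledast\tilde{\mathcal{T}}\right\|^2 \qquad\text{(Eq. 15)}$$
where $\tilde{\mathcal{T}}=[\![F^{(1)},F^{(2)},\ldots,F^{(N)}]\!]$.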
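Similarly, the update rules are generalized such that
$$F^{(n)}\leftarrow F^{(n)}\circledast\left((W_{(n)}\circledast L_{(n)})\,F^{\odot n}\right)\diamond\left((W_{(n)}\circledast(F^{(n)}(F^{\odot n})^{T}))\,F^{\odot n}\right) \qquad\text{(Eq. 16)}$$
where $F^{\odot n}$ is defined by the following expression:
$$F^{\odot n}\equiv F^{(N)}\odot\cdots\odot F^{(n+1)}\odot F^{(n-1)}\odot\cdots\odot F^{(1)} \qquad\text{(Eq. 17)}$$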
4D light fields and 2D layers require vectorizing the 2D layer transmittances, giving a similar set of transmittance vectors fm(n). Values are clamped to the feasible range after each iteration of Equation 16.
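The following is a minimal, CPU-only sketch of the multiplicative update of Equation 16, written elementwise so that only the rays with non-zero weight are visited. It is not the prototype's solver: the function names, the random initialization, the small constant `eps` that guards against division by zero, and the dictionary-based sparse storage are all assumptions made for illustration.

```python
import random
from math import prod


def weighted_ntf(target, shape, rank, iters=50, seed=0, eps=1e-12):
    """Weighted NTF by multiplicative updates.

    target maps an index tuple (one pixel index per layer) to the target
    radiance of that ray; only rays with weight 1 appear in the dict.
    Returns one factor per layer, each a list of rows with `rank` entries
    (one column per frame).
    """
    rng = random.Random(seed)
    num_layers = len(shape)
    factors = [[[rng.uniform(0.1, 1.0) for _ in range(rank)]
                for _ in range(size)] for size in shape]

    def frame_terms(idx, skip=None):
        # Per-frame product of the pixels hit by ray idx, optionally
        # leaving out layer `skip` (one row of the Khatri-Rao product).
        return [prod(factors[p][idx[p]][m]
                     for p in range(num_layers) if p != skip)
                for m in range(rank)]

    for _ in range(iters):
        for n in range(num_layers):
            num = [[0.0] * rank for _ in range(shape[n])]
            den = [[0.0] * rank for _ in range(shape[n])]
            for idx, value in target.items():
                approx = sum(frame_terms(idx))   # current emitted value
                others = frame_terms(idx, skip=n)
                for m in range(rank):
                    num[idx[n]][m] += value * others[m]
                    den[idx[n]][m] += approx * others[m]
            for i in range(shape[n]):
                for m in range(rank):
                    updated = factors[n][i][m] * num[i][m] / (den[i][m] + eps)
                    # Clamp to the feasible transmittance range.
                    factors[n][i][m] = min(1.0, max(0.0, updated))
    return factors


def reconstruct(factors, idx):
    """Emitted value for ray idx: sum over frames of the per-layer products."""
    rank = len(factors[0][0])
    return sum(prod(f[i][m] for f, i in zip(factors, idx)) for m in range(rank))
```

A call such as `weighted_ntf(target, (4, 4, 4), rank=2)` returns three mask matrices for a three-layer, two-frame display. The numerator and denominator are accumulated for a whole layer before that layer is overwritten, which mirrors the matrix form of the update.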
According to principles of this invention, tensor representation allows for the decomposition of a target light field into a set of time-multiplexed, light-attenuating layers. The multiplicative update rules allow an efficient, GPU-based implementation that achieves interactive refresh rates with multilayer LCDs.
As shown in the fourth column (517) of
In exemplary implementations of this invention, an alternate approach for achieving wider fields of view is used: replacing conventional uniform backlighting with time-multiplexed directional backlighting.
A directional backlight is equivalent to a low-resolution light field display. Consider a directional backlight that has significantly lower spatial resolution, but equivalent angular resolution and field of view, as compared to the target light field $l(x,v)$. In that case, it is desirable to enhance the spatial resolution by covering a low-resolution light field display with an N-layer stack of light-attenuating layers. Generalizing Equation 7, the light field emitted by such a display architecture is given by the following expression:
$$\tilde{l}(x,v)=\frac{1}{M}\sum_{m=1}^{M} b_m(x,v)\prod_{n=1}^{N} f_m^{(n)}(\xi_n),\quad\text{for } \xi_n=x+(d_n/d_r)\,v \qquad\text{(Eq. 18)}$$
where $b_m(x,v)$ denotes the light field emitted by the backlight during frame m.
Let B denote the discrete backlight light field, such that $b_{as}$ corresponds to pixel s of view a. The backlight light field can be equivalently represented as a vector $\mathbf{b}$, defined as follows.
$$\mathbf{b}=[\mathbf{b}_1^{T}\ \mathbf{b}_2^{T}\ \cdots\ \mathbf{b}_S^{T}]^{T},\quad\text{for } \mathbf{b}_s=[b_{1s}\ b_{2s}\ \cdots\ b_{As}]^{T} \qquad\text{(Eq. 19)}$$
Using this parameterization, Equation 18 can be represented in discrete coordinates as an N+1-order, rank-M tensor , given by
where tensor element
Since Equations 8 and 20 are similar, NTF can also be applied to optimize multilayer displays with directional backlighting.
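With a directional backlight, each frame contributes the backlight radiance for the ray multiplied by the transmittances of the layers that the ray crosses, and the observer perceives the average over the frames. The sketch below evaluates that average for a single ray. Representing the backlight as a Python callable, the nearest-pixel lookup, and all names are assumptions made for this example.

```python
def time_averaged_radiance(frames, depths, d_r, x, v, pitch=1.0):
    """Perceived radiance of the ray (x, v), averaged over the frames.

    Each frame is a pair (backlight, masks): backlight(x, v) returns the
    radiance emitted by the directional backlight for that ray, and masks[n]
    lists the pixel transmittances of layer n during that frame.
    """
    total = 0.0
    for backlight, masks in frames:
        radiance = backlight(x, v)
        for mask, d_n in zip(masks, depths):
            i = int(round((x + (d_n / d_r) * v) / pitch))
            # Rays that miss a layer are treated as blocked (assumption).
            radiance *= mask[i] if 0 <= i < len(mask) else 0.0
        total += radiance
    return total / len(frames)


# One attenuating layer at depth 0, two frames. The backlight lights the
# leftward directions in frame 1 and the remaining directions in frame 2.
frames = [
    (lambda x, v: 1.0 if v < 0 else 0.0, [[0.5, 1.0]]),
    (lambda x, v: 1.0 if v >= 0 else 0.0, [[1.0, 0.25]]),
]
left_view = time_averaged_radiance(frames, [0.0], 1.0, x=1.0, v=-1.0)
right_view = time_averaged_radiance(frames, [0.0], 1.0, x=1.0, v=1.0)
```

In this example the same pixel shows different radiance to the two directions, even though only one attenuating layer is present.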
As shown in
Tensor displays can exploit the additional degrees of freedom arising from multiple layers and frames to achieve high-fidelity light field reconstructions.
The upper portion 605 of
Objects close to the display appear sectioned across layers. For example, an object close to the display may map primarily to the front layer, with residual details assigned to other layers. Similar sectioning behaviors have been observed in the past with multilayer-only decompositions. Unlike these works, however, joint multilayer, multiframe decompositions produce additional time-varying, high-frequency patterns that appear across all layers and resemble content-adaptive parallax barriers.
The bottom portion 607 of
In
Tensor display decompositions exhibit predictable structures, whose arrangement arises from the specific display configuration. Heuristically-defined methods can achieve similar fidelity with reduced computation.
The performance of an automultiscopic display can be quantified by its depth of field: an expression for the maximum spatial frequency $\omega_\xi$ that can be depicted in a plane parallel to the display at a distance $d_o$ from it.
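Taking the 2D Fourier transform of Equation 18 yields the following expression for the emitted light field spectrum $\hat{l}(\omega_x,\omega_v)$:
$$\hat{l}(\omega_x,\omega_v)=\frac{1}{M}\sum_{m=1}^{M}\hat{b}_m(\omega_x,\omega_v)*\bigotimes_{n=1}^{N}\hat{f}_m^{(n)}(\omega_x)\,\delta\!\left(\omega_v-(d_n/d_r)\,\omega_x\right) \qquad\text{(Eq. 21)}$$
where $\omega_x$ and $\omega_v$ are the spatial and angular frequencies, $*$ denotes convolution, and the repeated convolution operator is defined as
$$\bigotimes_{n=1}^{N}\hat{f}^{(n)}\equiv\hat{f}^{(1)}*\hat{f}^{(2)}*\cdots*\hat{f}^{(N)}$$
For uniform backlighting, the backlight spectrum $\hat{b}_m(\omega_x,\omega_v)=\delta(\omega_x,\omega_v)$, the Dirac delta function.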
The spectral support of a tensor display is the region of non-zero values in the emitted light field spectrum, for all possible layer masks and backlight illumination patterns. The spectral support for the light field reflected by a diffuse surface is the line ωv=(do/dr)ωx.
Intersecting this line with the spectral support for a given display provides a geometric construction for the upper bound on the depth of field. For example, the emitted light field spectrum for a parallax barrier or integral imaging display is non-zero only for |ωx|≦1/(2Δx) and |ωv|≦1/(2Δv), where Δx and Δv are the spatial and angular sampling rates, respectively. In practice, the spatial sampling rate Δx is the spacing between barrier slits/pinholes or lenslets.
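The geometric construction yields the following expression for the depth of field:
$$|\omega_\xi|\le\begin{cases}\dfrac{1}{2\Delta x}, & \text{for } |d_o|\le d_r\,(\Delta x/\Delta v)\\[4pt] \dfrac{d_r}{2\,|d_o|\,\Delta v}, & \text{otherwise}\end{cases} \qquad\text{(Eq. 22)}$$
where $\Delta v=(2d_r/A)\tan(\alpha/2)$ with A views and field of view $\alpha$.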
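The geometric construction for a parallax barrier or integral imaging display can be sketched numerically: intersect the line ωv = (do/dr)ωx with the rectangle |ωx| ≤ 1/(2Δx), |ωv| ≤ 1/(2Δv), and take the largest |ωx| that remains inside. Treating that value as the spatial cutoff frequency is an assumption of this sketch, as are the function names and sample values.

```python
import math


def angular_sampling(d_r, num_views, fov_deg):
    """Angular sampling rate: (2 * d_r / A) * tan(alpha / 2)."""
    return (2.0 * d_r / num_views) * math.tan(math.radians(fov_deg) / 2.0)


def max_spatial_frequency(d_o, d_r, dx, dv):
    """Largest |omega_x| on the line omega_v = (d_o / d_r) * omega_x that
    stays inside the box |omega_x| <= 1/(2 dx), |omega_v| <= 1/(2 dv)."""
    limit_x = 1.0 / (2.0 * dx)
    slope = abs(d_o) / d_r
    if slope == 0.0:
        return limit_x  # object on the display plane: spatial limit only
    limit_from_angle = (1.0 / (2.0 * dv)) / slope
    return min(limit_x, limit_from_angle)


dv = angular_sampling(d_r=1.0, num_views=2, fov_deg=90.0)
near = max_spatial_frequency(d_o=0.0, d_r=1.0, dx=1.0, dv=0.5)
far = max_spatial_frequency(d_o=4.0, d_r=1.0, dx=1.0, dv=0.5)
```

Close to the display the bound is set by the spatial sampling alone; farther away it falls off in inverse proportion to the distance.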
The geometric construction provides an upper bound on the depth of field for any tensor display architecture. Consider a two-layer display with uniform backlighting, with the layers separated by a distance Δd and ω0=1/(2p) denoting the maximum spatial frequency for each layer with pixel pitch p. Equation 21 defines the light field spectrum, where d1=−Δd/2 and d2=Δd/2. A diamond-shaped region bounds the spectral support for any two-layer display. The spatial cutoff frequency ωξ
Using the previously described geometric construction, the depth of field for a three-layer display with uniform backlighting and equally-spaced layers is given by
where Equation 21 is again applied to find the spectral support, with d1=−Δd, d2=0, and d3=Δd.
The spectral support for a three-layer display exceeds that of a similar parallax barrier or integral imaging display, leading to increased depth of field.
Incorporating directional backlighting can significantly expand the field of view. The depth of field for a single-layer display using directional backlighting is obtained by a similar geometric construction.
Consider a directional backlight which implements a low-resolution light field display, such that $\hat{b}_m(\omega_x,\omega_v)$ has non-zero support for $|\omega_x|\le 1/(2\Delta x)$ and $|\omega_v|\le 1/(2\Delta v)$. This yields the following depth of field expression:
where ω0 again denotes the spatial cutoff frequency for the layer.
The addition of a single light-attenuating layer significantly increases the spatial resolution for a conventional parallax barrier or integral imaging display, particularly near the display surface. However, far from the display, the depth of field is identical to these conventional automultiscopic displays.
Advantageously, tensor displays can achieve increased depth of field by covering any low-resolution light field display with time-multiplexed, light-attenuating layers. In a prototype of this invention, the optimization program uses continuously-varying layer transmittances. Alternately, the upper bound of the depth of field can be characterized with discrete pixels.
In some implementations of this invention: (a) static and time-multiplexed tensor displays have identical spectral supports (i.e., averaging over an M-frame sequence does not alter the support via Equation 21); yet (b) time multiplexing significantly reduces artifacts. Without being limited by theory, the reduced artifacts may be attributable, at least in part, to additional degrees of freedom allowed with time multiplexing. While the upper bound of the depth of field may be identical, in practice it cannot be achieved with static methods, motivating tensor displays for joint multilayer, multiframe decompositions capable of approaching the upper bound.
An important benefit of tensor displays is to open a design trade space not accessible to conventional automultiscopic displays. Conventional multilayer-only or multiframe-only decompositions require many layers or prohibitively high frame rates, limiting their practicality using current LCD technology. However, with joint multilayer, multiframe decompositions, display designers can explore the interdependence of the number of layers, the number of frames, and the image brightness.
In exemplary implementations, tensor displays using fewer layers and frames achieve higher-fidelity reconstructions than conventional methods, in a manner supported by current LCD technology. Tensor displays can achieve wide fields of view, as required for multiviewer scenarios.
Consider a fixed set of uniformly-spaced viewpoints during optimization. Providing closely-spaced target views sufficiently constrains the decompositions so minimal artifacts are perceived at intermediate viewpoints.
In some implementations, it is desirable to maximize image fidelity (e.g., PSNR) as a function of device complexity (i.e., the number of layers and frames). Increasing the number of frames allows the number of layers to be decreased (for a given PSNR). Image fidelity also depends on the brightness scale $\beta\in[0,1]$ applied to the target light field. Modifying Equation 15 yields the following objective function supporting a trade-off between image brightness and fidelity.
$$\underset{\{0\le F^{(n)}\le 1\}}{\arg\min}\ \left\|\beta\mathcal{L}-\mathcal{W}\circledast\tilde{\mathcal{T}}\right\|^2$$
Decreasing brightness generally yields higher-fidelity reconstructions for the same number of layers and frames.
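The brightness trade-off can be made concrete by scoring an emitted light field against the target scaled by β. The sketch below computes the squared error over the valid rays and a PSNR value. The standard definition PSNR = 10·log10(peak²/MSE) is assumed here; how PSNR was measured for the prototype is not stated in this description, and the names and sample values are illustrative.

```python
import math


def weighted_error(target, emitted, beta=1.0):
    """Sum of squared differences between beta * target and emitted.

    Both arguments map ray indices to radiance; only rays present in
    `target` (the rays with non-zero weight) contribute.
    """
    return sum((beta * value - emitted[idx]) ** 2
               for idx, value in target.items())


def psnr(target, emitted, beta=1.0, peak=1.0):
    """Peak signal-to-noise ratio in dB over the valid rays (assumed form)."""
    mse = weighted_error(target, emitted, beta) / len(target)
    return float("inf") if mse == 0.0 else 10.0 * math.log10(peak ** 2 / mse)


target = {(0, 0): 1.0, (0, 1): 0.5}
emitted = {(0, 0): 0.2, (0, 1): 0.1}   # a dim but faithful reproduction
full_brightness_db = psnr(target, emitted, beta=1.0)
dimmed_error = weighted_error(target, emitted, beta=0.2)
```

In this toy example the emitted values match the target exactly once the target is dimmed to β = 0.2, whereas at full brightness the same emission scores only about 4 dB.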
For example, consider the trade space for multilayer displays with brightness β=0.2, in a prototype of this invention. In this example: (a) static decompositions (i.e., M=1) cannot exceed 30 dB, even with as many as eight layers; (b) to achieve 40 dB with eight layers, two frames are required; (c) there is a trade-off between layer complexity and refresh rate along the 40 dB curve; and (d) using six frames, only three layers are required, with more frames providing marginal benefits. Thus, with tensor displays, high-speed displays may be used to reduce device complexity, minimizing the number of layers to achieve a certain image fidelity.
Adding a directional backlight alters the design trade space. For example, consider a prototype of this invention, in which a directional backlight has 47×29 lenslets. In this directional backlight example: (a) two frames are still required to reach 40 dB using eight layers; but (b) only a single layer is required using eight frames. In this example, the directional backlight effectively reduces the number of required layers by one. This underscores the practical benefits of the tensor display framework, which can employ multilayer decompositions, time-multiplexing, and directional backlighting together.
In other applications of a prototype of this invention: (i) for three layers and a uniform backlight, four frames are required to achieve 40 dB, and with additional frames, brightness can be significantly increased (up to β≈0.6); and (ii) for single layer and a directional backlight, a minimum of eight frames are required to achieve 40 dB. (If the directional backlight itself includes an LCD layer, then a single LCD layer and directional backlight actually comprise two LCD layers).
Conventional automultiscopic displays, including parallax barriers and integral imaging, exhibit a set of periodically-repeating viewing zones. In contrast, recent (prior art) computationally-optimized multilayer and multiframe displays generally (i) exhibit a set of non-repeating viewing zones, and (ii) yield extended depths of field, greater resolution, and increased brightness. However, these (prior art) computationally-optimized multilayer and multiframe displays typically have, for any single viewer, a limited field of view of α≦20°.
Tensor displays can support wider fields of view, while retaining the benefits of computational optimization. In a prototype of this invention, a field of view of α=50°×20° is achieved, for a light field with 9×3 views, using either five layers and uniform backlighting or a single layer and directional backlighting. Prior art multilayer-only and multiframe-only decompositions lack sufficient degrees of freedom to achieve high-PSNR reconstructions for these scenarios.
A prototype of this invention comprises a reconfigurable tensor display capable of implementing two-layer and three-layer architectures with uniform or directional backlighting. The layers are constructed using three modified Viewsonic® VX2268wm 120 Hz LCD panels. The front and rear polarizing films are removed from the front two LCDs, and the stack is interleaved with alternating crossed linear polarizers. Aluminum brackets added to the rear panel allow lenslet arrays to be affixed for operation as a directional backlight. A rectangular lenslet array is approximated using two crossed lenticular sheets, purchased from Micro Lens Technology, Inc. The corrugated surfaces of the sheets are held in direct contact, minimizing astigmatic aberrations. The directional backlight supports varying spatio-angular resolution trade-offs using 10, 15, and 20 lenses per inch (LPI) lenticular sheets. In directional backlighting modes, an additional polarizing film is placed after the lenslet arrays, restoring the linear polarization state before rays impinge on the next LCD in the stack.
This prototype employs offline and online solvers based on Equation 16. Computation is divided between CPUs (central processing units) for the offline solver, and GPUs (graphics processing units) for the online solver. The offline solver is run on an Intel Core® i5 workstation with 10 GB of RAM. The online solver is run on an Intel Core i7 workstation with 6 GB of RAM and an external Nvidia® QuadroPlex 7000 graphics unit containing two Quadro® GPUs and a G-Sync card. This provides four frame-synchronous DVI (Digital Visual Interface) outputs capable of driving the LCDs at 120 Hz.
In this prototype, target light fields are rendered using POV-Ray (Persistence of Vision Raytracer program) or, for interactive applications, using OpenGL (Open Graphics Library). Rendered light fields have a spatial resolution of 840×525 pixels (i.e., half the resolution of LCDs used in this prototype) and an angular resolution of 5×5 views.
This prototype employs nonnegative tensor factorization (NTF) using the multiplicative update rules described above. An offline, Matlab®-based solver is used for simulations. Decomposing a target light field into a six-frame sequence for three layers takes approximately 30 minutes using 50 updates. Color channels are processed independently. An online, GPU-accelerated solver is implemented in OpenGL and Cg (C for Graphics). In some applications, the update rules are cast as additive combinations of the logarithms of the layer transmittances. Using this representation, the update rules are mapped to standard operations of the graphics pipeline, including projective texture mapping, accumulation buffers, floating point framebuffers, and perspective rendering. These operations are not only computationally efficient, but also memory-efficient, as only the non-zero tensor elements need to be stored and processed. For interactive applications, temporal coherence between decompositions may be exploited, seeding each frame with the prior result. Portions of the pseudocode used for the GPU-accelerated solver are set forth in the Pseudocode section below.
In this prototype, separate threads are used to decouple the decomposition from the display routines. Decompositions are evaluated in an asynchronous thread, updating layer patterns as they become available. This ensures that all display layers can be continuously refreshed at 120 Hz, without waiting for updated decompositions. This prototype can achieve up to 10 multiplicative updates per second for as many as 12 frames. Light fields with reduced spatial or angular resolution can be decomposed and displayed at interactive refresh rates, as shown in the supplementary video.
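The decoupling of decomposition from display can be sketched with two threads sharing the most recent layer patterns. This is a minimal illustration, assuming a lock-protected buffer; the class name LayerPatternBuffer, the placeholder pattern strings, and the update and refresh counts are hypothetical and do not come from the prototype.

```python
import threading

class LayerPatternBuffer:
    """Holds the most recently completed set of layer patterns."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._patterns = initial
        self._version = 0

    def publish(self, patterns):
        # Called by the decomposition thread when an update is finished.
        with self._lock:
            self._patterns = patterns
            self._version += 1

    def latest(self):
        # Called by the display loop; never waits for a decomposition.
        with self._lock:
            return self._patterns, self._version

def decomposition_worker(buffer, n_updates):
    for step in range(1, n_updates + 1):
        # Stand-in for one multiplicative update of all layers.
        patterns = [f"layer{k}-update{step}" for k in range(3)]
        buffer.publish(patterns)

def display_loop(buffer, n_refreshes):
    shown_versions = []
    for _ in range(n_refreshes):
        _patterns, version = buffer.latest()
        shown_versions.append(version)  # stand-in for drawing the layers
    return shown_versions

buffer = LayerPatternBuffer(initial=["layer0", "layer1", "layer2"])
worker = threading.Thread(target=decomposition_worker, args=(buffer, 10))
worker.start()
shown = display_loop(buffer, n_refreshes=120)
worker.join()
```

The display loop always draws whatever version is current, so its refresh rate does not depend on how long a decomposition takes.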
In this prototype, the multiplicative updates constitute an alternating least squares solution to the nonlinear tensor factorization problem, employing steepest descent with a fixed step length. While this approach typically exhibits slow convergence using a CPU-based implementation, each update is efficiently computed using the GPU-accelerated implementation. To support interactive applications, temporal coherence between decompositions can be exploited, seeding each frame with the prior decomposition. For static scenes, seeding results in one update per frame. For interactive applications, seeding introduces motion blur. Given sufficient computational resources, blur can be eliminated by using multiple updates per frame.
This prototype is reconfigurable, as noted above. In one configuration, this prototype comprises a three-layer LCD with uniform backlighting.
In this three-layer, uniform backlighting configuration: Acrylic spacers separated each panel by Δd=4.0 cm. The target light field was rendered with a field of view of α=20°×20° and brightness β=0.2 (see Section 4.2
Experiments with the prototype provide insights into practical engineering issues. Accurate mechanical alignment is desirable. Decomposed layers exhibit high-frequency patterns; it is desirable that these patterns be properly aligned. Accurate alignment is ensured by displaying perspective images of a crosshair array on each layer. A camera is placed at the desired viewer position (e.g., directly in front of the display at a distance of 2 m) and the patterns are shifted until alignment is obtained. Radiometric calibration is desirable, including measuring the black levels and gamma values. The former are incorporated as constraints in the update rules, while the latter are addressed by applying gamma correction at runtime. Without being limited by theory, remaining variations in color and intensity may be attributable to differences in the LCD color gamut, color filter cross-talk, moire due to stacking multiple layers, and angular color variation common to high-speed LCDs.
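The two radiometric steps can be sketched as follows. This is an illustrative model only: it assumes a simple power-law panel response with a gamma of 2.2, an 8-bit drive signal, and a hypothetical measured black level of 0.02; none of these values are taken from the prototype.

```python
import numpy as np

def clamp_to_panel_range(transmittance, black_level):
    """Constrain layer values to what the panel can show.

    A panel cannot attenuate below its measured black level, so values
    produced by an update are limited to [black_level, 1].
    """
    return np.clip(transmittance, black_level, 1.0)

def to_drive_values(transmittance, gamma=2.2):
    """Gamma-correct linear transmittances into 8-bit drive values.

    Assumes the panel shows (drive / 255) ** gamma, so the inverse
    power is applied before quantization.
    """
    linear = np.clip(np.asarray(transmittance, dtype=float), 0.0, 1.0)
    encoded = np.power(linear, 1.0 / gamma)
    return np.round(encoded * 255).astype(np.uint8)

layer = np.array([0.0, 0.5, 1.2])
constrained = clamp_to_panel_range(layer, black_level=0.02)
drive = to_drive_values(constrained)
```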
In a second configuration, this prototype comprises a single LCD with a directional backlight. The backlight uses crossed 10 LPI lenticular sheets, yielding a field of view of α=48°×48° and backlight resolution of 187×117 lenslets. The front LCD is separated by Δd=8.5 mm from the middle of the lenticular sheets. Remaining system parameters are identical to the three-layer prototype. The crossed lenticular sheets produce strong absorption along lens boundaries. In a commercial implementation, lenslet arrays can be manufactured with minimal absorption. Alternatively, edge-lit directional backlighting can eliminate this artifact. As shown in
Tensor displays open a large design trade space that was inaccessible using prior automultiscopic displays. With a tensor display framework, designers can maximize image fidelity, brightness, and field of view, depending on the number of layers and maximum refresh rate allowed by the design constraints and display technology, respectively.
In some cases, this invention is implemented with a single LCD with a directional backlight. This design has many advantages. Such displays support a wide field of view with relatively few frames (i.e., as few as three). Thus, provided with 180 Hz LCDs, this design can achieve a thin form factor, wide field of view, bright automultiscopic display with an effective refresh rate of 60 Hz.
In some cases, this invention employs joint multilayer, multiframe decompositions with uniform backlighting. These decompositions can be an effective tool for optimizing multilayer displays with uniform backlighting. Such displays with uniform backlighting can have the added benefit of a tunable field of view. This allows viewing zones to adapt to the location of viewers. (In contrast, in one of the prototypes of this invention, directional backlighting has a fixed field of view).
A prototype of this invention exhibits several limitations inherent to layered architectures, including moire, color-channel crosstalk, interreflections, misalignment, and dimming due to layered color filter arrays. Many of these issues can be resolved with additional optical engineering. Moire, interreflections, and misalignment can be mitigated using holographic diffusers, antireflective coatings, and rigid enclosures, respectively. A direct solution to crosstalk is to alter the transmission profiles of the color filters; however, this approach will further decrease brightness. Instead, field sequential color can be applied (i.e., using a backlight that sequentially strobes each color), albeit by placing additional demands on the refresh rate.
In some implementations of this invention, color filters are not used for each layer. In that case, decompositions are performed assuming monochromatic panels interspersed with a few color filters.
The weight tensor applied in Equation 16 allows decompositions to be tuned to the positions of viewers. Head or eye tracking may be employed, and the weight tensor can be altered to only project automultiscopic imagery aligned to each viewer. Between viewers, the emitted light field can be unconstrained, allowing for higher-fidelity, brighter imagery.
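One way to picture viewer-dependent weighting is a per-view mask that is non-zero only for views near a tracked viewer. The sketch below assumes a 5×5 grid of views spread evenly over a 20°×20° field of view and a hypothetical angular tolerance; the function names and the tolerance are illustrative, and a full weight tensor would apply such a mask to every ray of the corresponding view.

```python
import numpy as np

def view_angles(n_views=5, fov_deg=20.0):
    """Evenly spaced view directions (degrees) across the field of view."""
    return np.linspace(-fov_deg / 2, fov_deg / 2, n_views)

def view_weights(viewers_deg, n_views=5, fov_deg=20.0, tol_deg=3.0):
    """Return an (n_views, n_views) mask: 1 near a viewer, 0 elsewhere.

    viewers_deg is a list of (horizontal, vertical) viewer directions in
    degrees, e.g. as reported by a head tracker (hypothetical input).
    Rows index vertical angle, columns index horizontal angle.
    """
    axis = view_angles(n_views, fov_deg)
    gx, gy = np.meshgrid(axis, axis, indexing='xy')
    weights = np.zeros((n_views, n_views))
    for vx, vy in viewers_deg:
        near = (np.abs(gx - vx) <= tol_deg) & (np.abs(gy - vy) <= tol_deg)
        weights[near] = 1.0
    return weights

# Two hypothetical viewers: one on-axis, one toward a corner of the field.
mask = view_weights([(0.0, 0.0), (10.0, -10.0)])
```

Views with zero weight place no constraint on the decomposition, which is what leaves the light field between viewers unconstrained.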
The image formation model, given by Equation 18, can be generalized. For example, the model can be generalized to apply to time-multiplexed, light-attenuating layers over a uniform light source, with one lenslet array between the first and second layers.
In some implementations of this invention, at least some of the layers comprise both light-attenuating and light-emitting materials. Optionally, refractive elements can be placed at any point (e.g., a Fresnel lens in front of the display to extend the depth of field).
In some implementations of this invention, additional views are used to create accommodation and convergence cues. To support more views, higher-speed displays can be employed. For example, tensor displays can be implemented with digital microshutters (DMS), capable of achieving 1,440 Hz refresh rates, allowing 24 frames with an effective refresh rate of 60 Hz.
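The relationship between panel refresh rate, number of time-multiplexed frames, and effective refresh rate is a simple division, sketched below. The function names and the 60 Hz default target are assumptions for illustration.

```python
def effective_refresh_rate(panel_hz, frames_per_image):
    """Rate at which complete time-multiplexed images are shown."""
    if frames_per_image <= 0:
        raise ValueError("frames_per_image must be positive")
    return panel_hz / frames_per_image

def max_frames(panel_hz, target_hz=60.0):
    """Largest frame count that still meets the target effective rate."""
    return int(panel_hz // target_hz)

dms_rate = effective_refresh_rate(1440, 24)   # digital microshutter example
lcd_rate = effective_refresh_rate(180, 3)     # 180 Hz LCD with three frames
```

Both examples from the text (1,440 Hz with 24 frames, and 180 Hz with three frames) give an effective rate of 60 Hz.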
In a prototype of this invention, least-squares optimization is employed. Alternatively, perceptual error metrics may be used in tensor displays. Advantageously, perceptual error metrics allow further reductions in complexity (i.e., fewer layers and frames). In some cases, perceptual error metrics involve nonlinear objectives and use modified optimization schemes.
Tensor displays can bring together the advantages of multilayer panels, high refresh rates, and directional backlighting. In exemplary implementations of this invention, tensor displays comprise computational displays, wherein the display architecture and encoding algorithm are jointly optimized to maximize optical and computational efficiency.
This invention can be implemented as a single LCD with a directional backlight. This design achieves a wide field of view and large depth of field with a thin form factor using efficient multiplicative updates.
Tensor displays may be implemented in many different ways. Preferably, in tensor displays, multiple light-attenuating optical elements are combined in a way such that each ray in a target light field intersects each optical element at most once. Light-attenuating elements can be arranged in layers comprising any of the following: angularly-invariant spatial light modulators, purely directional modulators, and spatio-angular modulators. A low-resolution light field backlight, for instance, implemented by a lenslet array on top of an LCD, is one type of spatio-angular modulator.
In exemplary implementations of this invention, the tensor space spanned by a tensor display with N optical elements, such as layers, is of dimension N. The light field only occupies a low-dimensional manifold within the tensor space. The shape of the manifold depends on a particular tensor display configuration (e.g., a three-layer display or a dual-layer configuration with an additional directional backlight). A weighted nonnegative tensor decomposition has non-zero values only on the low-dimensional manifold created by the light field in tensor space.
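The sparsity of the light field within the tensor space can be illustrated with a deliberately simplified "flatland" model. The sketch assumes three one-dimensional layers with equal spacing and rays that shift by a whole number of pixels between adjacent layers; this geometry, the function name, and the sizes are assumptions for illustration, not the prototype's calibration.

```python
def ray_index_triples(n_pixels, shifts):
    """Enumerate (rear, middle, front) pixel indices hit by each ray.

    A ray is identified by the pixel x it crosses on the middle layer and
    by its per-layer shift s (one value per view direction). Only these
    index triples correspond to physical rays; all other tensor entries
    carry zero weight and need not be stored.
    """
    triples = set()
    for x in range(n_pixels):
        for s in shifts:
            rear, front = x - s, x + s
            if 0 <= rear < n_pixels and 0 <= front < n_pixels:
                triples.add((rear, x, front))
    return triples

n = 100
rays = ray_index_triples(n, shifts=[-2, -1, 0, 1, 2])
dense_entries = n ** 3
fill_fraction = len(rays) / dense_entries
```

With 100 pixels per layer and five views, fewer than 500 of the one million tensor entries correspond to rays, so the weighted decomposition only needs to touch that small subset.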
In some implementations of this invention, a low-resolution directional backlight combined with a high-resolution layer, such as an LCD, can achieve high image quality by temporally multiplexing only a few frames.
As shown in
As shown in
In a prototype of this invention (with three LCD layers and a uniform backlight), a minimum of 50 iterations is usually needed to ensure high image fidelity, but about 6 to 12 time-multiplexed frames achieve a high image quality even for a challenging scene exhibiting a large depth of field. In this prototype, light fields with uncorrelated views, such as Arabic numerals, can be successfully synthesized using the proposed low-rank tensor factorization.
In a test of this prototype (with three LCD layers and a uniform backlight): (i) low image quality can be achieved due to a large depth of field; (ii) low-rank approximations using 6 and 12 time-multiplexed frames, respectively, each create a visually appealing approximation of the light field; and (iii) higher-rank factorizations do not improve image quality significantly, demonstrating that light field tensors are inherently of low rank.
Specifically,
Here are a few definitions and clarifications. As used herein:
The terms “a” and “an”, when modifying a noun, do not imply that only one of the noun exists. For example, if “a” ball exists, this does not imply that only one ball exists.
A display is “automultiscopic” if it produces a 3D image that can be perceived by a human not wearing glasses or other optical apparatus. The 3D image produced by an automultiscopic display, when viewed by a human not wearing glasses or other optical devices: (i) includes multiple views, the view seen depending on the angle at which the image is viewed; (ii) exhibits binocular disparity; and (iii) exhibits motion parallax in both horizontal and vertical directions.
A “backlight” provides illumination for one or more transmissive components of a display device. The transmissive components are optically closer to a viewer, compared to the backlight, which is optically further from the viewer.
The term “comprise” (and grammatical variations thereof) shall be construed broadly, as if followed by “without limitation”. If A comprises B, then A includes B and may include other things.
A backlight is “directional” if it is configured to output light at an angle, relative to a display surface of the backlight, that can vary at different times. Here are some examples of “directional” backlights: (i) In a simple implementation, a directional backlight may comprise an array of light-emitting devices behind a lens, where the different light-emitting devices are configured to be turned on one at a time, causing light to exit the front of the lens at an angle that varies over time. (ii) In some cases, a directional backlight may include a layer comprising more than one pixel, the angle of light outputted by each pixel in that layer being separately controllable. (iii) In some cases, a directional backlight may output light at multiple angles at the same time. (iv) In some cases, a directional backlight may include a layer comprising a spatial attenuator, the attenuation caused by each pixel in that layer being separately controllable.
The term “e.g.” means including without limitation.
To minimize the “error” between two things is to minimize a measure of (or based on) the difference between the two things. For example, a solution to a least squares problem may minimize the error between an actual and a target light field, in a least squared sense.
The fact that an “example” or multiple examples of something are given does not imply that they are the only instances of that thing. An example (or a group of examples) is merely a non-exhaustive and non-limiting illustration.
Unless the context clearly indicates otherwise: (1) a phrase that includes “a first” thing and “a second” thing does not imply an order of the two things (or that there are only two of the things); and (2) such a phrase is simply a way of identifying the two things, respectively, so that they each can be referred to later with specificity (e.g., by referring to “the first” thing and “the second” thing later). For example, unless the context clearly indicates otherwise, if an equation has a first term and a second term, then the equation may (or may not) have more than two terms, and the first term may occur before or after the second term in the equation. A phrase that includes “a third” thing, a “fourth” thing and so on shall be construed in like manner.
The “flicker fusion frequency”, as used herein, means 30 Hz.
In the context of a display device (and components of the device), “front” is optically closer to a viewer, and “rear” is optically further from the viewer, when the viewer is viewing a display produced by the device during normal operation of the device. The “front” and “rear” of a display device continue to be the front and rear, even when no viewer is present. Similar terms, such as “behind”, shall be construed in like manner.
The terms “horizontal” and “vertical” shall be construed broadly. For example, “horizontal” and “vertical” may refer to two arbitrarily chosen coordinate axes in a Euclidean two dimensional space.
The term “include” (and grammatical variations thereof) shall be construed broadly, as if followed by “without limitation”.
“Intensity” shall be construed broadly to include any measure of or related to intensity, energy or power. For example, the “intensity” of light includes any of the following measures: irradiance, spectral irradiance, radiant energy, radiant flux, spectral power, radiant intensity, spectral intensity, radiance, spectral radiance, radiant exitance, radiant emittance, spectral radiant exitance, spectral radiant emittance, radiosity, radiant exposure and radiant energy density.
As used herein, the “number” of spatially addressable, light attenuating layers in a display device does not count any such layers that are part of a directional backlight that illuminates the display. For example, if the only spatially addressable, light attenuating layers in a display device are (1) a front LCD layer and (2) an LCD layer that is part of a directional backlight, then the “number” of such spatially addressable, light attenuating layers is treated as one.
The term “or” is an inclusive disjunctive. For example, “A or B” is true if A is true, or B is true, or both A and B are true.
“NTF” means nonnegative tensor factorization. The “order” (or “dimension”) of a tensor is the minimum number of indicia needed to uniquely identify a component of the tensor.
A parenthesis is simply to make text easier to read, by indicating a grouping of words. A parenthesis does not mean that the parenthetical material is optional or can be ignored.
Persistence of vision is not a “perception error metric”, as used herein.
To vary something “per pixel” means to vary it at respective pixels. Something may vary “per pixel” even if it varies only at some, but not all, of the pixels in a set of pixels. Similar terms (e.g. “on a per pixel basis”) shall be construed in like manner.
A “pixel” includes the smallest addressable element in a display device. For example, a light-transmitting or light-emitting display device may have pixels.
The “rank” of a tensor is the minimum number of simple tensors whose sum equals the tensor.
A “simple tensor” can be completely factorized into vectors.
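These two definitions can be illustrated with a short NumPy sketch. It builds a simple (rank-1) third-order tensor as the outer product of three vectors and then averages several such tensors, one per frame. The function names and the example vectors are hypothetical; the 1/M averaging reflects the assumption that a viewer perceives the time average of M frames.

```python
import numpy as np

def simple_tensor(a, b, c):
    """Outer product of three vectors: entry (i, j, k) is a[i] * b[j] * c[k]."""
    return np.einsum('i,j,k->ijk', a, b, c)

def time_averaged_field(frames):
    """Average of M simple tensors; its rank is therefore at most M."""
    return sum(simple_tensor(*f) for f in frames) / len(frames)

a = np.array([1.0, 2.0])
b = np.array([1.0, 0.0, 3.0])
c = np.array([2.0, 5.0])
one_frame = simple_tensor(a, b, c)

two_frames = time_averaged_field([
    (a, b, c),
    (np.array([0.0, 1.0]), np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0])),
])
```

Every unfolding of a simple tensor is a rank-1 matrix, which gives a quick numerical check that `one_frame` factorizes completely into vectors.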
A “single-layer” display using directional backlighting can be implemented in multiple ways. For example, a “single-layer” display using directional backlighting may comprise a single LCD layer and a purely angular backlight. Or, for example, a “single-layer” display using directional backlighting may comprise both (1) a front LCD layer and (2) a directional backlight, where the directional backlight itself includes another LCD layer. The phrase “single-layer” is used to indicate that there is only one spatially addressable light attenuating layer, not counting any such layer in the directional backlight. Similar phrases (however worded) regarding a single layer (or single LCD) and a directional backlight shall be construed in like manner.
The term “sparse” shall be construed broadly. For example, in a conventional context, a tensor is “sparse” if a majority of the tensor components are zero. This allows significant data compression, zeros being insignificant data that is not stored. Of course, in an unconventional binary system, a conventional zero may be replaced with another value. More generally, a tensor is “sparse” if it functions in a manner equivalent to a conventional sparse tensor. For example, a tensor is “sparse” if a majority of its components are a value that is treated as insignificant data for compression purposes.
A display layer is “time-multiplexed” if it is configured to display a sequence of frames at a rate equal to or faster than the flicker fusion frequency. For example, a display layer that is configured to display a sequence of frames at a rate of 60 Hz (60 frames per second) is time-multiplexed.
A “tensor display” is an automultiscopic display device wherein: (1) the device includes one or more spatially addressable, light attenuating layers; (2) the device includes a controller, which is configured to perform calculations to control the device; and (3) the calculations involve using weighted NTF.
A backlight is “uniform” if it is not directional.
Here are examples of mathematical notation (e.g., font, capitalization, and math symbols) used herein:
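$\alpha$ is a scalar;
$\mathbf{a}$ is a vector;
$\mathbf{A}$ is a matrix;
$\mathcal{X}$ is a tensor;
$\mathbf{X}_{(i)}$ is the matricization (unfolding) of tensor $\mathcal{X}$ along mode $i$;
$\mathcal{X} \times_i \mathbf{A} = \mathbf{A}\mathbf{X}_{(i)}$ is a tensor-matrix product along mode $i$;
$\mathbf{a} \circ \mathbf{b}$ is a vector outer product;
$\mathbf{A} \circledast \mathbf{B}$ is a Hadamard matrix product (elementwise product);
$\mathbf{A} \diamond \mathbf{B}$ is Hadamard matrix division (elementwise division);
$\mathbf{A} \otimes \mathbf{B}$ is a Kronecker product of two matrices $\mathbf{A}$, $\mathbf{B}$;
$\mathbf{A}^{\otimes} = \mathbf{A}^{(N)} \otimes \cdots \otimes \mathbf{A}^{(1)}$ is a Kronecker product of $N$ matrices $\mathbf{A}^{(N)}, \ldots, \mathbf{A}^{(1)}$;
$\mathbf{A}^{\otimes n} = \mathbf{A}^{(N)} \otimes \cdots \otimes \mathbf{A}^{(n+1)} \otimes \mathbf{A}^{(n-1)} \otimes \cdots \otimes \mathbf{A}^{(1)}$ is a Kronecker product of $N-1$ matrices $\mathbf{A}^{(N)}, \ldots, \mathbf{A}^{(1)}$, skipping $\mathbf{A}^{(n)}$;
$\mathbf{A} \odot \mathbf{B}$ is a Khatri-Rao product of two matrices;
$\mathbf{A}^{\odot} = \mathbf{A}^{(N)} \odot \cdots \odot \mathbf{A}^{(1)}$ is a Khatri-Rao product of $N$ matrices $\mathbf{A}^{(N)}, \ldots, \mathbf{A}^{(1)}$; and
$\mathbf{A}^{\odot n} = \mathbf{A}^{(N)} \odot \cdots \odot \mathbf{A}^{(n+1)} \odot \mathbf{A}^{(n-1)} \odot \cdots \odot \mathbf{A}^{(1)}$ is a Khatri-Rao product of $N-1$ matrices $\mathbf{A}^{(N)}, \ldots, \mathbf{A}^{(1)}$, skipping $\mathbf{A}^{(n)}$.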
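Several of these operations can be checked numerically. The NumPy sketch below implements matricization and the Khatri-Rao product and confirms that, for a third-order tensor built from three factor matrices, each unfolding equals one factor times the transposed Khatri-Rao product of the other two. The helper names and the column ordering of the unfolding (earlier modes varying fastest) are assumptions chosen to make this identity hold; other orderings are also in common use.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Kronecker product of two matrices with equal column count."""
    K, M = C.shape
    J, M2 = B.shape
    if M != M2:
        raise ValueError("matrices must have the same number of columns")
    return np.einsum('km,jm->kjm', C, B).reshape(K * J, M)

def unfold(X, mode):
    """Matricize tensor X along the given mode (remaining modes flattened,
    with earlier modes varying fastest along the columns)."""
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order='F')

rng = np.random.default_rng(0)
A = rng.uniform(size=(4, 3))
B = rng.uniform(size=(5, 3))
C = rng.uniform(size=(6, 3))
X = np.einsum('im,jm,km->ijk', A, B, C)   # sum of three simple tensors

hadamard_product = A * A      # elementwise product
hadamard_division = A / A     # elementwise division
kronecker = np.kron(A, B)     # Kronecker product, shape (20, 9)

mode0_ok = np.allclose(unfold(X, 0), A @ khatri_rao(C, B).T)
mode1_ok = np.allclose(unfold(X, 1), B @ khatri_rao(C, A).T)
mode2_ok = np.allclose(unfold(X, 2), C @ khatri_rao(B, A).T)
```

This identity is what lets a multiplicative update for one layer be written as matrix operations on an unfolded tensor and a Khatri-Rao product of the remaining layers.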
Variations:
This invention may be implemented in many different ways. Here are some non-limiting examples.
This invention may be implemented as a method comprising, in combination: (a) using a backlight to provide light to a display device, which display device includes one or more spatially addressable, light attenuating layers; (b) using the layers to display a temporal sequence of frames; and (c) using one or more processors (i) to perform an optimization calculation to compute, for each respective frame in the sequence and each respective layer in the one or more layers, attenuation of the light at respective pixels of the respective layer; and (ii) to output control signals to control the attenuation; wherein (I) the optimization calculation includes at least one mathematical operation on an Nth-order, rank-M tensor, where M is equal to the number of frames in the sequence, and N is equal to the number of the layers, if the backlight is uniform, and N is equal to the number of the layers plus one, if the backlight is directional, (II) the optimization calculation includes applying a weighted nonnegative tensor factorization, and (III) either (A) the backlight is uniform and the number of the layers is at least three or (B) the backlight is directional and the number of the layers is at least one. Furthermore: (1) the backlight may be directional; (2) the backlight may be directional and the number of the layers may be at least two; (3) the backlight may be uniform and the number of the layers may be at least three; (4) the tensor may be sparse. The display device may be configured to produce an automultiscopic display.
The automultiscopic display may have one or more fields of view; the display device may have a front display surface; each respective field of view, out of the one or more fields of view, may be centered about a viewing axis; and the method may further comprise dynamically varying the viewing axis of each field of view, respectively, including to orientations that are not normal to the front display surface, and tracking gaze or head position of a human user of the display device.
This invention may be implemented as apparatus comprising, in combination: (a) a display device, which display device includes one or more spatially addressable, light attenuating layers, which layers are configured to display a temporal sequence of frames; (b) a backlight, the backlight being configured to provide light to the display device; and (c) one or more processors, the one or more processors being configured (i) to perform an optimization calculation to compute, for each respective frame in the sequence and each respective layer in the one or more layers, attenuation of the light at respective pixels of the respective layer; and (ii) to output control signals to control the attenuation; wherein (I) the backlight is directional and the number of the layers is at least one, (II) the optimization calculation includes at least one mathematical operation on a rank-M tensor, M being equal to the number of frames in the sequence, and (III) the optimization calculation includes applying a weighted nonnegative tensor factorization. The backlight may be directional and the tensor may have an order equal to N+1, where N is the number of the layers. The backlight may comprise a lens and a spatially addressable light modulating layer; the lens may have a focal length; and the light modulating layer may be positioned at a distance from the lens, which distance is equal to the focal length. The backlight may be directional and the number of the layers may be equal to at least two. The display device may be configured to produce an automultiscopic display. The tensor may be sparse. The optimization calculation may optimize based at least in part on perception error metrics. The optimization calculation may calculate a set of per pixel attenuations, which set minimizes error between a light field transmitted from the display device and a light field that would be created by a target 3D scene.
At least one of the layers may comprise both (i) optical elements configured to transmit light and (ii) optical elements configured to emit light.
This invention may be implemented as apparatus comprising, in combination: (a) a display device, which display device includes one or more spatially addressable, light attenuating layers, the layers being configured to display a temporal sequence of frames; (b) a backlight, the backlight being configured to provide light to the display device; and (c) one or more processors, the one or more processors being configured (i) to perform an optimization calculation to compute, for each respective frame in the sequence and each respective layer in the one or more layers, attenuation of light at respective pixels of the respective layer; and (ii) to output control signals to control the attenuation; wherein (I) the backlight is uniform, (II) the optimization calculation includes at least one mathematical operation on an Nth-order, rank-M tensor, M being equal to the number of frames in the sequence and N being equal to the number of the layers, (III) the optimization calculation includes applying a weighted nonnegative tensor factorization, and (IV) the number of the layers is at least three. The display device may be configured to produce an automultiscopic display, which display concurrently has one or more fields of view; the display device may have a front display surface; each respective field of view, out of the one or more fields of view, may be centered about a viewing axis; and (d) the display device may be configured to dynamically vary the viewing axis of each field of view, respectively, including to orientations that are not normal to the front display surface. The apparatus may be configured to track gaze or head position of a human user of the display device. The tensor may be sparse.
It is to be understood that the methods and apparatus that are described above and below are merely illustrative applications of the principles of the invention. Numerous modifications may be made by those skilled in the art without departing from the scope of the invention.
The following pseudocode documents a GPU-based implementation of nonnegative tensor factorization for tensor displays. All underlying operators for this particular application map well to functions of the fixed graphics pipeline. NTF is implemented using OpenGL and a set of Cg shaders.
The pseudocode is designed for a tensor display that consists of L light-attenuating layers, each displaying F frames in rapid succession. An optional, low-resolution directional backlight is also supported. Both the original light field and the backlight are assumed to consist of V different views. As the display of the decomposed layers is a time-critical operation, requiring a frame rate that matches the monitor refresh rate, it is implemented in a different thread than the decomposition, which can be run at a lower frame rate. The separate decomposition thread updates the light field at interactive frame rates and decomposes it into a set of L layers, each with F different time frames, and an additional directional backlight with V views and F time frames.
The pseudocode documents the main display loop (i.e., the algorithm titled NTF—Main Display Routines) for synchronized rendering of temporally-multiplexed layers and the backlight with monitor refresh rates. This implementation assumes that calibrated interlacing masks are available for each view of the light field. These masks are multiplied by the corresponding rendered view and added together to generate an interlaced image to be displayed behind a lenslet array.
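The mask-and-sum step for the backlight image can be sketched in NumPy. The idealized column masks below (each view owns every n-th pixel column) are a hypothetical stand-in for the calibrated interlacing masks, and the function names and image sizes are assumptions for illustration.

```python
import numpy as np

def column_masks(n_views, height, width):
    """Idealized masks: view v owns pixel columns v, v + n_views, ...

    Calibrated masks would be measured for the actual lenslet array;
    these are a simplified placeholder.
    """
    owner = np.arange(width) % n_views
    masks = np.zeros((n_views, height, width))
    for v in range(n_views):
        masks[v][:, owner == v] = 1.0
    return masks

def interlace(views, masks):
    """Multiply each rendered view by its mask and add the results."""
    views = np.asarray(views, dtype=float)
    masks = np.asarray(masks, dtype=float)
    if views.shape != masks.shape:
        raise ValueError("views and masks must have the same shape")
    return np.sum(views * masks, axis=0)

n_views, height, width = 3, 2, 6
# Each hypothetical view is a constant image with value v + 1.
views = np.stack([np.full((height, width), v + 1.0) for v in range(n_views)])
masks = column_masks(n_views, height, width)
interlaced = interlace(views, masks)
```

Because exactly one mask is non-zero at each pixel in this idealized case, every pixel of the interlaced image is taken from a single view.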
Decomposition routines (e.g., the algorithm titled NTF—Content-Updating Thread) are also documented, implementing weighted nonnegative tensor factorization.
This application is a non-provisional of, and claims the benefit of the filing date of, U.S. Provisional Application Ser. No. 61/590,507, filed Jan. 25, 2012, the entire disclosure of which is herein incorporated by reference.
This invention was made with U.S. government support under: (i) grant IIS-1116452, awarded by the National Science Foundation; and (ii) awards 10-DARPA-1102, N66001-10-1-4041, and P.O. 10320917, awarded by the Defense Advanced Research Projects Agency. The government has certain rights in this invention.
Number | Date | Country
---|---|---
61590507 | Jan 2012 | US