The present disclosure relates generally to the field of digital image processing and display, and, more specifically, to the field of superresolution display.
The development of higher-resolution displays is of central importance to the display industry. Leading mobile displays have recently transitioned from pixel densities of less than 50 pixels per cm (ppcm) and now approach 150 ppcm. Similarly, the consumer electronics industry has begun to offer “4K ultra-high definition (UHD)” displays, having a horizontal resolution approaching 4,000 pixels, as the successor to high-definition television (HDTV). Furthermore, 8K UHD standards already exist for enhanced digital cinema. Achieving such high-resolution displays currently hinges on advances that enable spatial light modulators with increased pixel counts.
Beyond these larger market trends, several emerging display technologies necessitate even greater resolutions than the 4K/8K UHD standards will provide. For example, wide-field-of-view head-mounted displays (HMDs), such as the Oculus Rift, incorporate high-pixel-density mobile displays. Such displays approach or exceed the resolution of the human eye when viewed at the distance of a phone or tablet computer. However, they appear pixelated when viewed through magnifying HMD optics, which dramatically expand the field of view. Similarly, glasses-free 3D displays, including parallax barrier and integral imaging designs, require an order of magnitude higher resolution than today's displays. At present, HMDs and glasses-free 3D displays remain niche technologies and are less likely than existing applications to drive the development of higher-resolution displays, hindering their advancement and commercial adoption.
The following briefly reviews the state of the art related to high-resolution display technologies.
Superresolution imaging algorithms have been used to recover a high-resolution image (or video) from low-resolution images (or videos) with varying perspectives. Superresolution imaging requires solving an ill-posed inverse problem: the high-resolution source is unknown. Methods differ based on the prior assumptions made regarding the imaging process. For example, in one approach, camera motion uncertainty is eliminated by using piezoelectric actuators to control sensor displacement.
In one of the superresolution display systems that have been developed, a “wobulation” method is used to double the addressed resolution for front-projection displays incorporating a single high-speed digital micro-mirror device (DMD). A piezoelectrically-actuated mirror displaces the projected image by half a pixel, both horizontally and vertically. Since DMDs can be addressed faster than the critical flicker fusion threshold, two shifted images can be rapidly projected, so that the viewer perceives their additive superposition. As with a jittered camera, the superresolution factor increases as the pixel aperture ratio decreases. The performance is further limited by motion blur introduced during the optical scanning process. More recently, wobulation has been extended to flat panel displays, using an eccentric rotating mass (ERM) vibration motor applied to an LCD.
Similar superresolution display concepts have been developed for digital projectors. Rather than presenting a time-multiplexed sequence of shifted, low-resolution images, projector arrays can be used to display the displaced image set simultaneously. Such “superimposed projection” systems have been demonstrated by multiple research groups. As with all projector arrays, superimposed projection systems require precise radiometric and geometric calibration, as well as temporal synchronization. These issues can be mitigated using a single-projector superresolution method in which multiple offset images are created by an array of lenses within the projector optics. Unlike superimposed projectors, these images must be identical, resulting in limited image quality.
Wobulation and other temporally-multiplexed methods introduce artifacts when used to superresolve videos due to unknown gaze motion. Eye movement alters the desired alignment between subsequent frames, as projected on the retina. If the gaze can be estimated, then superresolution can be achieved along the eye motion trajectory, as reportedly demonstrated.
All of the superresolution displays discussed thus far implement the same core concept: additive (temporal) superposition of shifted low-resolution images. As with image superresolution, such designs benefit from low pixel aperture ratio—diverging from industry trends to increase aperture ratios.
The so-called “optical pixel sharing (OPS)” approach is the first reported to exploit dual-modulation projectors for superresolution by depicting an edge-enhanced image using a two-frame decomposition: the first frame presents a high-resolution, sparse edge image, whereas the second frame presents a low-resolution non-edge image. OPS requires an optical element to be placed between the display layers (e.g., an array of lenses or a randomized refractive surface); correspondingly, existing OPS implementations do not allow thin form factors. OPS also reproduces imagery with decreased brightness and decreased peak signal-to-noise ratio (PSNR).
Dual-modulation displays are routinely applied to achieve high dynamic range (HDR) display. HDR projectors are implemented by modulating the output of a digital projector using large flat panel liquid crystal displays (LCDs). A high dynamic range and high resolution projector system has been reportedly developed, where a three-chip liquid crystal on silicon (LCoS) projector emits a low-resolution chrominance image, which is subsequently projected onto another higher-resolution LCoS chip to achieve luminance modulation.
Displays with two or more spatial light modulators (SLMs) have also been incorporated in glasses-free 3D displays for multi-view imagery. It has reportedly been demonstrated that content-adaptive parallax barriers can be used with dual-layer LCDs to create brighter, higher-resolution 3D displays.
Therefore, it would be advantageous to provide a display mechanism offering a high spatial and/or temporal display resolution beyond the native resolution and/or frame refresh rate of current-generation display panels.
Provided herein are methods and systems for image and video displays with increased spatial resolution using current-generation light-attenuating spatial light modulators (SLMs), including liquid crystal displays (LCDs), digital micro-mirror devices (DMDs), and liquid crystal on silicon (LCoS) displays. Without increasing the addressable pixel count, cascaded displays, in conjunction with pertinent data processing techniques, are employed to serve this end.
More specifically, in some embodiments, two or more SLMs are disposed on top of one another (or in a cascaded manner), subject to a lateral offset of half a pixel or less along each axis. The lateral offset makes each pixel on one layer modulate multiple pixels on another. In this manner, the intensity of each subpixel fragment—defined by the geometric intersection of a pixel on one display layer with one on another layer—can be controlled, thereby increasing the effective display resolution. High-resolution target images are factorized into multi-layer attenuation patterns, demonstrating that cascaded displays may operate as “compressive displays,” utilizing fewer independently-addressable pixels than are apparent in the displayed image.
Similar methods may be adopted to increase the temporal resolution of stacks of two or more SLMs, refreshed in staggered intervals. However, in some other embodiments, temporal multiplexing of factorized imagery may not be involved. As a result, videos can be presented without the appearance of artifacts characteristic of prior methods or the requirement for high-refresh-rate displays.
In contrast with the additive approaches adopted in the prior art, cascaded displays according to the present disclosure create a multiplicative superposition by synthesizing higher spatial frequencies by the (simultaneous) interference of shifted light-attenuating displays with large aperture ratios.
Cascaded displays offer several distinct advantages relative to prior superresolution displays: achieving thin form factors, requiring no moving parts, and using computationally-efficient factorization processes to enable interactive content.
According to one embodiment of the present disclosure, a method of displaying images comprises: (1) accessing original image data representing an image; (2) factorizing the original image data into first image data and second image data; and (3) displaying a representation of the image on a display device at an effective display resolution. The display device comprises a first display layer having a first native resolution and a second display layer having a second native resolution. The first display layer overlays the second display layer. The first image data is rendered for display on the first display layer, and the second image data is rendered for display on the second display layer. The effective display resolution is greater than the first and second native resolutions.
In one embodiment, the display device includes L display layers, where a respective display layer is laterally offset relative to an immediately adjacent display layer by 1/L pixel in two orthogonal directions. A pixel in the respective display layer is modulated using multiple pixels of an underlying display layer in the L display layers. The first and second image data may each correspond to a respective single frame of the image.
The original image data may represent a single frame of pixels of the image, wherein the first image data represents a first plurality of frames of the image, and the second image data represents a second plurality of frames of the image. The first plurality of frames are sequentially rendered on the first display layer, and the second plurality of frames are sequentially rendered on the second display layer. The first plurality of frames and the second plurality of frames can be rendered in synchronization or out of synchronization.
According to another embodiment of the present disclosure, a method of displaying images comprises: (1) accessing first frames representing one frame of an image in a first spatial resolution; (2) accessing second frames representing the one frame of the image in a second spatial resolution; (3) sequentially rendering the first frames for display on a first display layer of a display device; and (4) sequentially rendering the second frames for display on a second display layer of the display device. The first display layer overlays the second display layer with a lateral shift in two perpendicular directions by a fraction of a pixel of the first display layer. An effective display resolution resulting from the sequential renderings is greater than the first spatial resolution and the second spatial resolution.
According to another embodiment of the present disclosure, a display system comprises: a processor; memory; and a plurality of display layers coupled to the processor and the memory, disposed in a cascaded manner, and comprising first and second display layers. The first display layer is offset by a fraction of a pixel with reference to the second display layer in two orthogonal lateral directions. The memory stores instructions that implement a method comprising: (1) accessing first image data representing an image and second image data representing the image; (2) rendering the first image data for display on the first display layer at a first spatial resolution; and (3) rendering the second image data for display on the second display layer at a second spatial resolution. An effective display resolution of the representation of the image is greater than the first spatial resolution and the second spatial resolution.
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.
Embodiments of the present invention will be better understood from a reading of the following detailed description, taken in conjunction with the accompanying drawing figures in which like reference characters designate like elements and in which:
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of embodiments of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be recognized by one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments of the present invention. Although a method may be depicted as a sequence of numbered steps for clarity, the numbering does not necessarily dictate the order of the steps. It should be understood that some of the steps may be skipped, performed in parallel, or performed without the requirement of maintaining a strict order of sequence. The drawings showing embodiments of the invention are semi-diagrammatic and not to scale and, particularly, some of the dimensions are for the clarity of presentation and are shown exaggerated in the drawing Figures. Similarly, although the views in the drawings for the ease of description generally show similar orientations, this depiction in the Figures is arbitrary for the most part. Generally, the invention can be operated in any orientation.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “processing” or “accessing” or “executing” or “storing” or “rendering” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories and other computer readable media into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices. When a component appears in several embodiments, the use of the same reference numeral signifies that the component is the same component as illustrated in the original embodiment.
As used herein, the term “superresolution” (SR) refers to signal-processing techniques designed to enhance the effective spatial resolution of an image or an imaging system to better than that corresponding to the size of the pixel of the original image or image sensor.
Overall, embodiments of the present disclosure create a multiplicative superposition by synthesizing higher spatial and/or temporal frequencies by the simultaneous interference of shifted light-attenuating displays with large aperture ratios. A stack of two or more multiplicative display layers (or spatial light modulator (SLM) layers) are integrated in a display device to synthesize a spatially-superresolved image. Based on an original image or a set of video frames with a target spatial/temporal resolution, a factorization process is performed to derive respective image data for presentation on each display layer.
In one aspect, the display layers in a stack are laterally shifted with respect to each other, resulting in an effective spatial resolution exceeding the native display resolutions of the display layers. High fidelity to a high-resolution original image can be advantageously achieved with or without time-multiplexing attenuation patterns, although the latter offers better performance in terms of reducing the appearance of artifacts. A real-time, graphics processing unit (GPU)-accelerated cascaded display algorithm is also presented that eliminates the need for temporal multiplexing while preserving superresolution image fidelity.
In another aspect, two or more display layers (or SLMs) are refreshed in staggered intervals to synthesize a video with an effective refresh rate exceeding that of each individual display layer, e.g., by a factor equal to the number of layers. Further, optically averaging neighboring pixels can minimize artifacts.
Also provided herein is a comprehensive optimization framework based on non-negative matrix and tensor factorization. Particularly, the weighted rank-1 residue iteration approach can outperform the prior multiplicative update rules.
Modeling Cascaded Dual-Layer Displays
In general, the construction of the cascaded display device may exploit spatial or temporal multiplexing to increase the effective number of addressable pixels. As a result, a decomposition problem needs to be solved to determine the optimal control of the display components to maximize the perceived resolution, subject to physical constraints (e.g., limited dynamic range, restricted color gamut, and prohibition of negative emittances).
In one embodiment, a dual-layer display includes a pair of spatial light modulators (SLMs) placed in direct contact in front of a uniform backlight; each SLM contains a uniform array of pixels with individually-addressable transmissivity at a fixed refresh rate. The layers are disposed with a lateral offset from each other. For example, the layers can be offset from each other by a fraction of a pixel in two orthogonal directions. However, the present disclosure is not limited by the amount, dimension, or directions of the lateral offset.
As a result, this configuration creates a uniform array of subpixel fragments defined by the overlap of pixels on the bottom layer with those on the top. For example, the subpixel fragment S2,1 is defined by the pixel a2 of the bottom layer 110 and the pixel b1 of the top layer. Therefore, there exist four times as many subpixel fragments as pixels on an individual layer, establishing the capacity to quadruple the spatial resolution.
Assume the bottom layer 110 has N pixels and the top layer has M pixels in total. During operation of the display device, K time-multiplexed frames are presented to the viewer at a rate above the critical flicker fusion threshold, such that their temporal average is perceived. Using temporal multiplexing can advantageously increase the degrees of freedom available to reduce image artifacts.
Hereinafter, the emissivity of pixel i in the bottom layer 110, for frame k, is denoted as ai(k), such that 0≤ai(k)≤1. Similarly, bj(k) denotes the transmissivity of pixel j of the top layer, for frame k, such that 0≤bj(k)≤1. The emissivity of each subpixel fragment is represented by si,j, which can be expressed as

si,j=wi,jΣk=1, . . . ,K ai(k)bj(k),  (1)

where wi,j is a factor denoting the overlap of pixel i and pixel j.
This expression (1) implies that dual-layer image formation can be concisely expressed using matrix multiplication:
S=W°(ABT). (2)
where ° denotes the Hadamard (element-wise) matrix product; A is an N×K matrix, whose columns contain bottom layer pixel emissivities during frame k; B is an M×K matrix, whose columns contain the top-layer pixel transmissivities during frame k; W is an N×M sparse weight matrix, containing the pair-wise overlaps; and S is a sparse N×M matrix containing the subpixel fragment emissivities. S can be non-zero only where pixel i and pixel j overlap.
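By way of non-limiting illustration, the image formation model of Equation (2) can be sketched in a few lines of Python. A simplified 1-D geometry is assumed here for brevity; the helper name overlap_weights and all parameter values are illustrative only:

```python
import numpy as np

def overlap_weights(N, M, shift=0.5):
    # Pairwise overlap w_ij of a unit-width bottom pixel i, spanning
    # [i, i + 1), with a top pixel j shifted laterally to span
    # [j + shift, j + shift + 1).
    W = np.zeros((N, M))
    for i in range(N):
        for j in range(M):
            lo = max(i, j + shift)
            hi = min(i + 1, j + shift + 1)
            W[i, j] = max(0.0, hi - lo)
    return W

W = overlap_weights(4, 4)      # each interior pixel overlaps two neighbors by 0.5
rng = np.random.default_rng(0)
A = rng.random((4, 2))         # bottom-layer emissivities, K = 2 frames
B = rng.random((4, 2))         # top-layer transmissivities, K = 2 frames
S = W * (A @ B.T)              # subpixel-fragment emissivities, Equation (2)
assert np.all(S[W == 0] == 0)  # S is non-zero only where pixels overlap
```

The 2-D case is analogous: each bottom pixel then overlaps up to four top pixels, and the overlap areas populate W.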
The image formation model given by Equations (1) and (2) can be applied to various types of spatial light modulators, including panels with differing pixel pitches. Furthermore, relative lateral translations and in-plane rotations of the two layers can be encoded in an appropriate choice of the weight matrix W.
This model can be practically applied to existing flat panel displays (e.g., LCD panels containing color filter arrays and limited pixel aperture ratios) and digital projectors (e.g., those containing LCD, LCoS, or DMD spatial light modulators), and so on.
Cascaded displays according to the present disclosure can provide enhanced spatial resolution by layering spatially-offset, temporally-averaged display panels.
In some embodiments, assuming all layers have identical square pixels, each layer is offset by 1/L pixel with respect to the previous layer. The resultant cascaded display then has L2 (i.e., L squared) times as many subpixel fragments as any individual layer therein.
At 202, the original image frame is decomposed into multiple frame sets through a factorization process, each frame set for a respective display layer. The factorization process can be performed in various suitable manners, including the exemplary computational processes described in greater detail below. Each respective frame set may contain one or more frames (also referred to as “patterns” herein) in a spatial resolution compatible with the corresponding display layer.
At 203, the frame sets derived from 202 are rendered on respective display layers for display. More specifically, with regards to each display layer, the corresponding frame set is rendered sequentially for display. As a collective result, a user can perceive an effective spatial resolution of the display device that exceeds the native resolution of each individual layer. A spatial superresolution is therefore advantageously achieved.
To factorize a target high-resolution image, in some embodiments, the image can be sampled and rearranged as a sparse matrix W°T containing subpixel fragment values analogously to S. Thus, the image is represented by a series of time-multiplexed attenuation pattern pairs (e.g., columns of A and B to be displayed across the two layers).
For example, to display or reconstruct an image on a cascaded dual-layer display at superresolution, the original image data can be factorized into two single patterns, one for each layer. In some other embodiments, temporal multiplexing can be incorporated in the factorization process to derive multiple frames for display during the integration period of the user's eyes. Thus, the multiple frames in each frame set are consecutively rendered for display on a corresponding layer.
In one embodiment, a simple heuristic factorization is utilized and capable of losslessly reconstructing a spatially-superresolved target image using four time-multiplexed attenuation layer pairs (K=4), assuming that both layers have the same pixel structure and the lateral shift is half a pixel along both axes.
As shown, a time-multiplexed sequence of shifted pinhole grids is displayed on the bottom layer (first row, representing frames for Layer 1), together with aliased patterns on the top layer (second row, representing frames for Layer 2). Each bottom-layer pixel illuminates the corners of four top-layer pixels, as shown in row 3. When the four frames are presented at a rate exceeding the flicker fusion threshold, the viewer perceives an image with four times the number of pixels in any layer. Note that the cascaded display may appear dimmer than a conventional display if the backlight brightness remains the same.
As shown in
Although no artifacts are present in the reconstructed images, heuristic factorizations appear with one quarter the brightness of a conventional single-layer display, since each subpixel fragment is only visible during one of four frames.
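The four-frame heuristic described above can be verified in simulation. The following sketch assumes wraparound boundaries and an even per-layer pixel count purely for illustration (these assumptions are not part of the disclosure); it confirms lossless reconstruction of the target at one quarter brightness:

```python
import numpy as np

def heuristic_factorization(N=4, seed=0):
    # Simulate the four-frame heuristic (K = 4) for a dual-layer display
    # with a half-pixel offset along both axes. Wraparound boundaries and
    # an even N are illustrative assumptions.
    rng = np.random.default_rng(seed)
    T = rng.random((2 * N, 2 * N))   # target: one value per subpixel fragment
    up = lambda X: np.kron(X, np.ones((2, 2)))  # pixel grid -> fragment grid
    frames = []
    for p in (0, 1):
        for q in (0, 1):
            # Bottom layer: pinhole grid lighting every other pixel,
            # shifted to parity (p, q) in this frame.
            bottom = np.zeros((N, N))
            bottom[p::2, q::2] = 1.0
            # Top layer: each pixel passes the target value of the fragment
            # it shares with its unique lit bottom-layer neighbor.
            top = np.zeros((N, N))
            for ti in range(N):
                for tj in range(N):
                    bi = ti if ti % 2 == p else (ti + 1) % N
                    bj = tj if tj % 2 == q else (tj + 1) % N
                    fr = 2 * bi + 1 if bi == ti else (2 * bi) % (2 * N)
                    fc = 2 * bj + 1 if bj == tj else (2 * bj) % (2 * N)
                    top[ti, tj] = T[fr, fc]
            # Multiplicative superposition in the fragment domain; the top
            # layer is shifted by one fragment (half a pixel) in both axes.
            frames.append(up(bottom) * np.roll(up(top), (1, 1), axis=(0, 1)))
    return T, np.mean(frames, axis=0)  # temporal average over the 4 frames

T, perceived = heuristic_factorization()
assert np.allclose(perceived, T / 4)   # lossless, at one quarter brightness
```

Each fragment is visible in exactly one of the four frames, which is why the temporal average equals the target divided by four.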
In another embodiment, an optimized compressive factorization process is employed for deriving the frame data for the respective layers. By application of Equation (2), optimal dual-layer factorizations are provided by solving the following constrained least-squares problem:

arg minA,B∥β(W°T)−W°(ABT)∥2, subject to 0≤A≤1 and 0≤B≤1,  (3)

where ≤ is the element-wise matrix inequality operator. Note that the brightness scaling factor 0<β≤1 is required to allow solutions that reduce the luminance of the perceived image, relative to the target image (e.g., as observed with the heuristic four-frame factorization). If the upper bounds on A and B are ignored, then Equation (3) corresponds to weighted non-negative matrix factorization (WNMF). As a result, any weighted NMF algorithm can be applied to achieve spatial superresolution, with the pixel values clamped to the feasible range after each iteration. For example, the following multiplicative update rules can be used:

A←A°((β(W°T)B)⊘((W°(ABT))B)), B←B°((β(W°T)TA)⊘((W°(ABT))TA)),  (4)

where ⊘ (the double-line operator) denotes Hadamard (element-wise) matrix division.
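A minimal Python sketch of these multiplicative update rules, with pixel values clamped to the feasible range after each update, may look as follows (function and parameter names are illustrative assumptions; a small constant guards the element-wise divisions):

```python
import numpy as np

def wnmf_factorize(T, W, K, beta=0.8, iters=200, seed=0):
    # Multiplicative update rules of Equation (4): factorize the weighted,
    # brightness-scaled target beta*(W o T) into W o (A B^T), clamping the
    # pixel values to [0, 1] after each update. Names/defaults are
    # illustrative assumptions.
    rng = np.random.default_rng(seed)
    N, M = T.shape
    A = rng.random((N, K))              # bottom-layer frames (columns)
    B = rng.random((M, K))              # top-layer frames (columns)
    WT = beta * (W * T)
    eps = 1e-12                         # guards the element-wise divisions
    for _ in range(iters):
        S = W * (A @ B.T)               # current reconstruction, Equation (2)
        A = np.clip(A * (WT @ B) / (S @ B + eps), 0.0, 1.0)
        S = W * (A @ B.T)
        B = np.clip(B * (WT.T @ A) / (S.T @ A + eps), 0.0, 1.0)
    return A, B

rng = np.random.default_rng(1)
T = rng.random((8, 8))                  # toy target (subpixel fragments)
W = np.ones_like(T)                     # toy weights: uniform overlaps
A, B = wnmf_factorize(T, W, K=4)
```

The number of columns K plays the role of the factorization rank, i.e., the number of time-multiplexed frames.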
Similar multiplicative update rules can be applied to multi-layer 3D displays. In terms of computational performance, weighted rank-1 residue iterations (WRRI) may be preferred for being robust and efficient. Table 1 presents pseudocode showing an exemplary factorization process for deriving the matrices A and B, which represent the frame data sets for the two display layers, respectively. A and B are calculated iteratively according to a weighted rank-1 residue iteration (WRRI) process. WRRI is specified in Table 1, with xj denoting column j of a matrix X and [xj]+ denoting projection onto the positive orthant, such that element i of [xj]+ is given by max(0, xi,j).
As described above, Equations (2) and (3) cast image formation by dual-layer cascaded displays as a matrix factorization problem, such that the factorization rank equals the number of time-multiplexed frames. Hence, WNMF-based factorization allows trade-offs among reconstruction accuracy, the number of time-multiplexed frames, and the brightness of the reconstructed image.
The partial reconstructions are presented in frames 531, 532, and 533, and the cascaded image 540 is presented as the end result, which is compared with a reconstructed image 550 using a conventional approach and with the target image 510. When the three frames for an individual layer (e.g., 511-513 of Layer 1) are presented at a rate greater than the critical flicker fusion threshold, the viewer perceives a superresolved image 540 with four times the number of pixels. If the backlight brightness remains the same, the cascaded display may appear dimmer than a conventional display using a single display layer. Increasing the brightness scaling factor β can compensate for absorption losses.
As discussed with reference to
In some embodiments, given a cascaded display with L (L>1) layers that are refreshed in a staggered manner, the frame refresh time of a particular layer may lag behind the frame refresh time of a previous layer by a fraction (e.g., 1/L) of the frame refresh cycle.
In general, cascaded displays advantageously can achieve high-quality results in terms of spatial and temporal resolution, even without temporal multiplexing. As discussed above, eliminating temporal multiplexing is equivalent to displaying a rank-1 factorization. WRRI, a variant of alternating least squares for solving NMF as discussed in detail below, is a preferred efficient method for solving this rank-1 factorization, achieving real-time frame rates for high-definition (HD) target frames. This observation is significant for enabling real-time applications. For instance, a GPU-based implementation of fast rank-1 factorization can be used for interactive operation of a cascaded head-mounted display.
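The rank-1 case can be sketched as alternating weighted least squares with projection onto the feasible range, in the spirit of the WRRI solver described above. The rebalancing of the arbitrary scale between the two factors and all names/defaults are illustrative assumptions:

```python
import numpy as np

def rank1_wrri(T, W, beta=1.0, iters=10, seed=0):
    # Rank-1 weighted factorization (K = 1, no temporal multiplexing) by
    # alternating weighted least squares with projection onto [0, 1].
    rng = np.random.default_rng(seed)
    a, b = rng.random(T.shape[0]), rng.random(T.shape[1])
    W2 = W * W                          # squared weights of the objective
    WT = beta * (W2 * T)                # weighted, brightness-scaled target
    eps = 1e-12
    for _ in range(iters):
        # Closed-form weighted least-squares update for each factor.
        a = (WT @ b) / (W2 @ (b * b) + eps)
        b = (WT.T @ a) / (W2.T @ (a * a) + eps)
        # Split the arbitrary scale evenly, then project onto [0, 1].
        m = np.sqrt(b.max() / (a.max() + eps))
        a, b = np.clip(a * m, 0.0, 1.0), np.clip(b / m, 0.0, 1.0)
    return a, b

# A rank-1 target is recovered exactly (up to numerical precision).
rng = np.random.default_rng(3)
u = 0.1 + 0.3 * rng.random(12)
v = 0.1 + 0.3 * rng.random(12)
T = np.outer(u, v)
a, b = rank1_wrri(T, np.ones_like(T))
assert np.linalg.norm(T - np.outer(a, b)) < 1e-8
```

Because each update is a closed-form solve, the per-iteration cost is a handful of matrix-vector products, which is what makes GPU-accelerated real-time operation plausible.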
Cascaded displays according to the present disclosure can also enhance temporal resolution by layering multiple temporally-offset, spatially-averaged displays. Temporally offsetting the multiple display panels of a cascaded display synthesizes a temporal superresolution display. More specifically, the frame refresh time for each layer is offset from that of a previous layer by a fraction of the frame refresh cycle. As a consequence, a viewer of the cascaded display perceives video content displayed at a higher refresh rate than the native refresh rate(s) of the individual layers.
In some embodiments, the multiple layers in the cascaded display are mechanically aligned with respect to pixels and are refreshed in a staggered fashion.
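The staggered refresh schedule can be illustrated with a short sketch (the helper name and parameter values are hypothetical): with L layers of native period P, offset by P/L, some layer refreshes every P/L seconds, giving an effective refresh rate of L times the native rate.

```python
import numpy as np

def staggered_refresh_times(L, period, n_cycles):
    # Refresh instants for L cascaded layers sharing a native refresh
    # period, each layer's refresh lagging the previous layer's by
    # period / L (illustrative sketch).
    return np.sort(np.array([n * period + l * period / L
                             for l in range(L)
                             for n in range(n_cycles)]))

t = staggered_refresh_times(L=2, period=1 / 60, n_cycles=6)
# Consecutive refresh events across the stack are period / L apart,
# so two staggered 60 Hz layers yield an effective 120 Hz cadence.
assert np.allclose(np.diff(t), (1 / 60) / 2)
```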
According to the present disclosure, for spatial superresolution, optional temporal multiplexing generally enhances the reconstruction fidelity. Similarly, for temporal superresolution, spatial averaging reduces reconstruction artifacts by increasing the degrees of freedom afforded by dual-layer displays with staggered refreshes. In some embodiments, spatial averaging is achieved by introducing a diffusing optical element on top of a flat panel cascaded display (e.g., a dual-layer LCD) or by defocusing a projector employing cascaded displays.
Equation (5) is an exemplary objective function to determine optimal factorizations for temporal superresolution:

arg minA,B∥β(W°T)−W°(CP1ABTP2)∥2, subject to 0≤A≤1 and 0≤B≤1.  (5)
Here, A is a length-FN column vector, containing the bottom-layer pixel emissivities, concatenated over F video frames; similarly, B is a length-FM column vector, containing the top-layer pixel transmissivities, concatenated over F video frames. The permutation matrices {P1, P2} reorder the reconstructed subpixel fragments S=ABT such that the first F columns of the product P1ABTP2 contain the length-NM subpixel fragments, corresponding to the superresolved image displayed during the corresponding frame. Spatial averaging is represented as the FN×FN convolution matrix C, which low-pass filters the columns of P1ABTP2.
Once again, W is a sparse weight matrix, containing the pair-wise overlaps across space and time. Finally, W°T denotes the subpixel fragments for the target temporally-superresolved video. In some embodiments, if the goal is to increase the frame rate, not the spatial fidelity, time-multiplexing need not be performed on each target frame over K factorization frames.
Joint spatial and temporal superresolution is directly supported by the objective function presented in Equation (5). The weight matrix W subsumes temporal as well as spatial overlaps. Hence, it is sufficient to set the weight matrix elements accordingly. To solve Equation (5), in some embodiments, the following update rules (6) and (7) are used for implementing temporal superresolution using cascaded dual-layer displays, as described in greater detail in a later section below:

A←A°((P1TCT(β(W°T))P2TB)⊘((P1TCT(W°(CP1ABTP2))P2T)B)),  (6)

B←B°(((P1TCT(β(W°T))P2T)TA)⊘((P1TCT(W°(CP1ABTP2))P2T)TA)).  (7)
For simplicity, these multiplicative update rules are specified for spatiotemporal superresolution. However, the WRRI algorithm can be similarly adapted. More specifically, given an implementation for the update rules of Equation (4), instead of constructing the matrices {C, P1, P2}, a spatial blur is applied to the current estimate ABT between the iterations.
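The spatial blur applied between iterations can be sketched as a small low-pass filter over the columns of the current estimate ABT; the 3-tap kernel and the wraparound boundaries are illustrative assumptions standing in for the convolution matrix C:

```python
import numpy as np

def blur_columns(S, kernel=(0.25, 0.5, 0.25)):
    # Low-pass filter each column of the current estimate S = A B^T,
    # standing in for multiplication by the convolution matrix C.
    # The kernel and wraparound boundaries are illustrative assumptions.
    out = np.zeros_like(S, dtype=float)
    for w, s in zip(kernel, (-1, 0, 1)):
        out += w * np.roll(S, s, axis=0)
    return out
```

Because the kernel sums to one, a spatially uniform estimate passes through unchanged, and total energy per column is preserved.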
In one embodiment, all layers and frames are initialized to uniformly-distributed random values. The entire video is factorized simultaneously. For longer videos, a sliding window of frames can be factorized, constraining the first frames in each window to equal the last frames in the previous window. As demonstrated in
The multiplicative update rules (Equation (4)) and the WRRI method (Algorithm 1 in Table 1) can be implemented in a software program configured for spatial superresolution with dual-layer displays in Matlab or any other suitable programming language. In one embodiment, the program is configured to support arbitrary numbers of frames (i.e., factorization ranks). The fast rank-1 solver can be implemented using CUDA to leverage GPU acceleration (source code is provided in Table 6). All factorizations were performed on a 3.2 GHz Intel Core i7 workstation with 8 GB of RAM and an NVIDIA Quadro K5000. The fast rank-1 solver maintains the native 60 Hz refresh rate, including overhead for rendering scenes and applying post-processing fragment shaders (e.g., in an HMD demonstration).
Data processing and operations of cascaded displays require knowledge of the physical configuration of the display layers and their radiometric characteristics, e.g., to compute the pixel overlaps encoded in W in Equation (2). Misalignment among the display layers can be corrected in a calibration process, for example, by warping the image displayed on the second layer to align with the image displayed on the first layer.
For instance, two photographs are used to estimate this warp. In each photograph, a checkerboard is displayed on one layer, while the remaining layer is set to be fully transparent or fully reflective. Scattered data interpolation estimates the warping function that projects the photographed first-layer checkerboard corners into the coordinate system of the image displayed on the second layer. The second-layer checkerboard (or any other image) is then warped to align with the first-layer checkerboard. In addition, radiometric characteristics are measured by photographing flat-field images; the resulting response curves are inverted such that each display is operated in a linear radiometric fashion. Thus, the geometric and radiometric calibration is used to rectify the captured images and correct vignetting, allowing direct comparison to predicted results.
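The disclosure estimates the warp by scattered data interpolation; as a simpler hedged sketch of the same calibration idea, the following fits a projective (homography) warp to checkerboard-corner correspondences using a direct linear transform. The synthetic correspondences, the function names, and the choice of a projective model are assumptions for illustration only.

```python
import numpy as np

def fit_homography(src, dst):
    """Fit a 3x3 projective warp mapping src -> dst corner points (DLT).

    src, dst: (N, 2) arrays of corresponding checkerboard corners, e.g.,
    photographed first-layer corners and second-layer pixel coordinates.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(rows)
    _, _, Vt = np.linalg.svd(A)          # null vector = homography entries
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]                   # normalize the scale

def warp_points(H, pts):
    """Apply the projective warp to (N, 2) points."""
    p = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return p[:, :2] / p[:, 2:3]

# Synthetic example: a known scale-and-shift between the two layers.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], float)
H_true = np.array([[1.1, 0.0, 0.25], [0.0, 0.9, -0.1], [0.0, 0.0, 1.0]])
dst = warp_points(H_true, src)
H = fit_homography(src, dst)
```

In practice a non-parametric scattered-data warp, as described above, can absorb distortions a single homography cannot.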
A cascaded display device according to the present disclosure can be implemented as a dual-layer LCD screen supporting direct-view and head-mounted display (HMD) devices, a dual-layer LCoS projector, etc. Operating cascaded displays to achieve superresolution advantageously places fewer practical restrictions: no physical gap is required between the layers, enabling thinner form factors, and significantly fewer time-multiplexed frames are necessary to eliminate image artifacts.
The memory 930 stores a cascaded display program 931, which may be an integral part of the driver program for the display assembly 960. The memory 930 also stores the original graphics data 934 and the factorized graphics data 935. The cascaded display program 931 includes a module 932 for temporal factorization computation and a module 933 for spatial factorization computation. Provided with user configurations and the original graphics data 934, the cascaded display program 931 derives the factorized graphics data 935 for display on each display layer 961 and 962, as described in greater detail herein. For example, the temporal factorization module 932 is configured to perform a process according to Equations (5)-(7); and the spatial factorization module 933 is configured to perform a process according to Equations (3) and (4).
A cascaded display device according to the present disclosure can be implemented as an LCD used in a direct-view or head-mounted display (HMD) application. The display device may include a stack of LCD panels, interface boards, a lens attachment (for HMD use), etc. For instance, each panel is operated at its native resolution of 1280×800 pixels and with a 60 Hz refresh rate. However, the present disclosure is not limited by the purposes or applications utilizing a cascaded display, nor by the type of display panels or the configuration or arrangement of the multiple layers in a cascaded display.
In some embodiments, a cascaded display device includes LCD panel(s) and organic light-emitting diode (OLED) panel(s), electroluminescent display panel(s) or any other suitable type of display layer(s), or a combination thereof.
A cascaded LCD display according to the present disclosure supports direct viewing from a distance, as with a mobile phone or tablet computer, and HMD viewing using an appropriate lens attachment.
All spatial superresolution results presented herein were captured using a Canon EOS 7D camera with a 50 mm f/1.8 lens. Temporal superresolution results, included in the supplementary video, use a Point Grey Flea3 camera with a Fujinon 2.8-8 mm varifocal lens. Due to the gap between the LCD modulation layers, the lateral offset will appear to shift depending on viewer location. The calibration procedure described above is used to compensate for the parallax. The display layer patterns are displayed at a lower resolution than the native panel resolution, allowing direct comparison to “ground truth” superresolved images.
In one embodiment, a head-mounted display (HMD) according to the present disclosure additionally includes a lens assembly (e.g., a pair of aspheric magnifying lenses) disposed away from the top LCD by slightly less than their 5.1 cm focal length in order to synthesize a magnified, erect virtual image appearing near “optical infinity.” Head tracking is supported through the use of an inertial measurement unit (IMU). The GPU-accelerated fast WRRI solver can be used to process data for display in the HMD. This implementation is able to maintain the native 60 Hz refresh rate, including the time required to render the OpenGL scene, apply a GLSL fragment shader to warp the imagery to compensate for spherical and chromatic aberrations, and factorize the resulting target image. Unlike direct viewing, an HMD allows a limited range of viewing angles, reducing the influence of viewer parallax and facilitating practical applications of cascaded LCDs.
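The magnifier geometry above can be illustrated with a thin-lens calculation: placing the display slightly inside the focal length yields a distant, magnified virtual image. The 4.8 cm object distance below is a hypothetical value chosen for illustration; the disclosure states only “slightly less than” the 5.1 cm focal length.

```python
# Thin-lens sketch of the HMD magnifier. The display sits just inside
# the focal length, so the image distance is negative (a virtual image
# on the same side as the display, i.e., appearing far from the eye).
f = 5.1            # lens focal length in cm (from the disclosure)
d_obj = 4.8        # assumed display-to-lens distance, just inside f

# Thin-lens equation: 1/d_img = 1/f - 1/d_obj
d_img = 1.0 / (1.0 / f - 1.0 / d_obj)   # negative => virtual image
magnification = -d_img / d_obj          # erect, magnified

print(f"virtual image at {d_img:.1f} cm, magnification {magnification:.1f}x")
```

With these assumed numbers the virtual image forms about 82 cm away at roughly 17× magnification, consistent with an image “appearing near optical infinity” as the object distance approaches the focal length.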
Superresolution by cascaded displays may also be applied in cascaded liquid crystal on silicon (LCoS) projectors, e.g., in compliance with 8K UHD cinematic projection standards. An exemplary LCoS projector includes multiple LCoS microdisplays, interface electronics, a relay lens, a polarizing beam splitter (PBS), an aperture, a projection lens, an illumination engine, etc. These displays were operated at their native resolution of 1024×600 pixels, at a refresh rate of 60 Hz, with an aperture ratio of 95.8% and a reflectivity of 70%. The relay lens is used to achieve dual modulation by projecting the image of the first LCoS onto the second with unit magnification. The PBS cube can be positioned between the relay lens and the second LCoS, replacing the original PBS plate. The dual-modulated image was projected onto a screen surface using projection optics.
The LCoS panels according to the present disclosure can be positioned off-axis to prevent multiple reflections. If the two LCoS panels are perpendicular to, and centered along, the optical axis of the relay lens, then light can be reflected back to the first LCoS from the PBS cube, leading to experimentally-observed aberrations. Laterally shifting the LCoS panels away from the optical axis can reduce or eliminate these artifacts. The aperture is placed in front of the first LCoS to prevent any reflected light, now offset from the optical axis, from continuing to propagate.
Cascaded display techniques disclosed herein can also be applied in cascaded printed films. Printed semi-transparent color films can be reproduced using the patterns provided with the supplementary material. Only single-frame (i.e., rank-1) factorizations need to be presented with static films.
This section presents exemplary embodiments for formulating the WNMF problems for various spatial superresolution applications according to the present disclosure.
Given a non-negative matrix T ∈ ℝ+^(m×n) and a target rank r < min(m, n), the following is to be solved:
Exemplary WNMF algorithms used for solving Equation (S.1) are compared in this disclosure, including weighted multiplicative update rules (herein referred to as “Blondel”), the weighted rank-one residue iteration (WRRI) method, and an alternating least-squares Newton (ALS-Newton) method.
In the example presented in
Table 2 lists the performance achieved when running three iterations with each method for 1576×1050 frames (timings averaged over 10 frames):
The following presents formulation of an exemplary WNMF process for joint spatiotemporal superresolution optimization.
If every pixel value is stacked at every staggered refresh time in a large vector for each layer, the spatio-temporal layer reconstruction is modeled as a weighted rank-1 NMF problem. Assume a non-negative matrix is given as T ∈ ℝ+^(m×n);
the problem is then formulated as the following Equation (S.2)
The vectors a, b contain all layer pixels over all timesteps. The matrices P1, P2 are permutation matrices: P1 permutes the rows of abT, which contains all possible spatial and temporal layer interactions (forward and backward in time), and P2 permutes the columns of this matrix. Together they permute abT so that the resulting matrix contains the stacked image corresponding to a particular time-step in one column. The weight matrix W assigns zero to the large parts of this matrix that correspond to no layer interaction. The matrix C is a potential blur applied to the superresolved image (e.g., a diffuser). A small blur allows an additive spatial coupling of nearby pixels.
After describing the spatiotemporal optimization problem (Equation (S.2)), the next step is to derive matrix factorization update rules. For simplicity, the multiplicative NMF rules (S.3) can be used, including weight adaptation. It will be appreciated that this derivation can be applied straightforwardly to other NMF algorithms. As presented earlier, the NMF rules for Equation (S.1) were
where the double lines denote element-wise division. The generalization of the NMF problem can utilize the following simpler derivation by substituting
A := CP1a, B := (bTP2)T = P2Tb (S.4)
Thus, Equation (S.3) becomes
Line three follows because permutation matrices have the property P^(-1) = P^T.
The last line shows that the update equation can be computed efficiently in parallel. The update for a follows from symmetry
The derivation using Equation (S.4) can be applied analogously to the WRRI update rules.
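The permutation-matrix identity invoked in line three of the derivation can be verified numerically; a minimal sketch (the matrix size and seed are arbitrary):

```python
import numpy as np

# Permutation matrices are orthogonal: P^(-1) = P^T. Build a random
# 6x6 permutation matrix by permuting the rows of the identity and
# verify the property used to simplify the spatiotemporal updates.
rng = np.random.default_rng(1)
P = np.eye(6)[rng.permutation(6)]
assert np.allclose(P @ P.T, np.eye(6))
assert np.allclose(np.linalg.inv(P), P.T)
```

This is why P1 and P2 can be moved across the equality at no cost: applying the transpose undoes the permutation exactly.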
The following embodiment employs an exemplary real-time rank-1 factorization process using an ALS-Newton method. According to the present disclosure, the exemplary ALS-Newton method is optimized for specific superresolution problems, especially for rank-1 factorization.
For rank r=1, a general nonnegative matrix factorization problem from Eq. (S.1) is simplified to:
In an alternating least squares scheme, one solves the biconvex problem from above by alternately solving for one of the two variables a, b while fixing the other one and iterating, as represented in Table 3.
For r = 1, the non-negativity constraints b ∈ ℝ+^n and a ∈ ℝ+^m can be removed in steps 3 and 4. After solving the unconstrained (and hence convex) sub-problem in Table 1, the solution can be projected to a non-negative solution with the same objective function value by flipping the signs of the negative elements (assuming that the previous solution does not violate the constraint either). Thus, an algorithm for the unconstrained rank-1 ALS WNMF process can be derived, as presented in Table 4.
Thus far, a non-convex problem has been formulated as a sequence of convex optimization problems. The “b-step” in Table 4 can be solved using Newton's method, which has quadratic convergence. Accordingly, the gradient and Hessian of f(b) are derived with
where the matrix D(ω) is introduced, which places its subscript on the diagonal. Also introduced is the matrix O(ω), which corresponds to the outer vector product of its subscript with the right-hand side, followed by vectorization. The second line allows removing the Frobenius norm, so the gradient and Hessian of f are easily derived. The gradient is represented as
The operator OT is the same as the outer vector product operation plus a subsequent summation over the rows of the resulting matrix. It therefore suffices to compute the point-wise operation W°abT−W°W°T, take the outer product with a, and sum over the rows of the corresponding matrix, which then yields the gradient with respect to b.
For the Hessian, a diagonal matrix is obtained with
Since the Hessian is a diagonal matrix, the inverse in Newton's method becomes simply a point-wise division. Table 5 shows an exemplary process for the full rank-1 Newton method, which can be used to implement the process shown in Table 4.
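The rank-1 ALS-Newton scheme can be sketched compactly in numpy: because each sub-problem is quadratic with a diagonal Hessian, the Newton step reduces to a point-wise division, and the sign-flip projection restores non-negativity. This is an illustrative reconstruction under the formulation above, not the CUDA implementation of Table 6; the function name, the small separable test problem, and the iteration count are assumptions.

```python
import numpy as np

def als_newton_rank1(T, W, beta=1.0, iters=20, eps=1e-12):
    """Rank-1 weighted factorization minimizing ||W ° (a b^T) - beta W ° T||_F^2.

    Each alternating step is an exact Newton step (the objective is
    quadratic in the free variable), computed as a point-wise division
    of the gradient numerator by the diagonal Hessian.
    """
    m, n = T.shape
    rng = np.random.default_rng(0)
    a, b = rng.random(m), rng.random(n)
    W2 = W * W
    for _ in range(iters):
        # b-step: closed-form Newton update, then sign-flip projection.
        b = beta * (W2 * T).T @ a / (W2.T @ (a * a) + eps)
        b = np.abs(b)
        # a-step follows from symmetry.
        a = beta * (W2 * T) @ b / (W2 @ (b * b) + eps)
        a = np.abs(a)
    return a, b

# Usage sketch: recover an exactly rank-1 (separable) target.
a0 = np.array([1.0, 2.0, 3.0])
b0 = np.array([0.5, 1.0, 1.5, 2.0])
T = np.outer(a0, b0)
W = np.ones_like(T)
a, b = als_newton_rank1(T, W)
```

For a separable target the alternation converges essentially in one full iteration; for general targets it converges to the best weighted rank-1 fit.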
Table 6 shows exemplary real-time CUDA code for rank-1 factorization, which supports three different update rules: Blondel, WRRI, and ALS-Newton. The code includes two kernels: one computes the numerator (or gradient) and denominator (or Hessian) for an update to a considered layer; the other performs the update given those components.
The following embodiment employs an exemplary nonnegative tensor factorization process for multi-layer cascaded displays configured for superresolution.
As discussed above, multi-layer cascaded displays may use a weighted nonnegative tensor factorization (WNTF) in conjunction with multiplicative update rules. The generalized two-layer update rules are given by Equation (4).
A three-layer image formation model can be expressed as
where it is assumed that a bottom layer has I1 pixels, a middle layer has I2 pixels, and a top layer has I3 pixels. As discussed above, K time-multiplexed frames are rendered on the display device at a rate exceeding the critical flicker fusion threshold so that a viewer perceives a superresolved image. The transmissivity of pixel i3 in the top layer, for frame k, is denoted as ci3(k), where 0 ≤ ci3(k) ≤ 1. The weight wi1,i2,i3 denotes the cumulative overlap of pixels i1, i2, and i3.
A tensor representation can be adopted for the image formation model. The canonical decomposition of an order-3, rank-K tensor can be defined as
where the star operator (★) denotes the vector outer product and {xk, yk, zk} represent column k of their respective matrices. Equation (S.11) can be used to concisely express image formation by a three-layer cascaded display:
where is a sparse tensor containing the effective emissivities of the subpixel fragments, W is also a sparse I1×I2×I3 tensor tabulating the cumulative pixel overlaps, and ° denotes the Hadamard (element-wise) product. Observe that {ak, bk, ck} represent the pixel values displayed on their respective layers during frame k (e.g., in lexicographic order). Hence, matrix A equals the concatenation of the frames displayed on the first layer such that A=[a1, a2, . . . , aK](similarly for the other layers).
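The three-layer image formation model can be sketched as a sum of three-way outer products, one per time-multiplexed frame, masked by the overlap tensor W. This is an illustrative reconstruction; the normalization by K (modeling the time average of the K frames) and the small random sizes are assumptions.

```python
import numpy as np

def three_layer_image(A, B, C, W):
    """Perceived image of a three-layer cascaded display (sketch).

    A: (I1, K), B: (I2, K), C: (I3, K) per-frame layer transmissivities
    in [0, 1]; W: (I1, I2, I3) cumulative pixel-overlap tensor. Each
    frame contributes the element-wise product of the three layers; the
    viewer perceives the time average, masked by the overlaps in W.
    """
    K = A.shape[1]
    N = sum(np.einsum('i,j,k->ijk', A[:, f], B[:, f], C[:, f])
            for f in range(K))
    return W * N / K

# Usage sketch with two frames and 4 pixels per layer.
rng = np.random.default_rng(2)
I1 = I2 = I3 = 4
K = 2
A, B, C = (rng.random((n, K)) for n in (I1, I2, I3))
W = np.ones((I1, I2, I3))
img = three_layer_image(A, B, C, W)
```

In practice W is sparse (most pixel triples never overlap geometrically), which is what makes the tensor formulation tractable.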
Given this image formation model, the objective function can be used for optimal three-layer factorizations:
where β is the dimming factor applied to the target subpixel fragment emissivities W ° T. This objective can be minimized by application of the following multiplicative update rules
In the above expressions, ⊙ expresses the Khatri-Rao product:
X⊙Y = [x1★y1, x2★y2, . . . , xK★yK]. (S.18)
X(n) is the unfolding of tensor X, which arranges the node-n fibers of X into sequential matrix columns. Generalization to higher factorization orders can be similarly derived.
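The Khatri-Rao product of Equation (S.18) can be sketched in a few lines; this illustrative helper follows the column-wise Kronecker convention, so each result column is the vectorized outer product of the corresponding input columns.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Khatri-Rao product (Equation (S.18) sketch).

    Column k of the result is the vectorized outer product of column k
    of X with column k of Y, i.e., kron(X[:, k], Y[:, k]).
    """
    assert X.shape[1] == Y.shape[1], "operands need equal column counts"
    return np.einsum('ik,jk->ijk', X, Y).reshape(-1, X.shape[1])

# Usage sketch: a (3, 2) and a (2, 2) matrix give a (6, 2) result.
X = np.arange(6.0).reshape(3, 2)
Y = np.arange(4.0).reshape(2, 2)
Z = khatri_rao(X, Y)
```

Combined with the tensor unfoldings X(n), this product is the workhorse of the multiplicative WNTF update rules described above.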
In this simulated example, the “drift” image was spatially superresolved by a factor of 16 using a stack of four light-attenuating layers, each shifted by ¼ of a pixel, along each axis. The target image, the depiction with a single (low-resolution) display layer, and the reconstruction using a cascaded four-layer display are shown from left to right. It shows that significant upsampling is achieved by the cascaded four-layer display.
In this example, the lateral offset is generalized to maximize the superresolution capability: each layer is progressively shifted by ¼ of a pixel, creating 16 times as many subpixel fragments as pixels on a single layer. Two-frame (i.e., order-4, rank-2) factorizations achieve high superresolution factors, as demonstrated by the fidelity of the inset regions in
In summary, a generalized framework is provided for cascaded displays that encompasses arbitrary numbers of offset pixel layers and time-multiplexed frames. For example, cascaded dual-layer displays provide a means to quadruple spatial resolution with practical display architectures supported by real-time factorization methods (e.g., the cascaded LCD screen and LCoS projector prototypes).
LCD panels primarily achieve color display by the addition of a color filter array (CFA) composed of a periodic array of spectral bandpass filters. Typically, three neighboring columns of individually-addressable subpixels, illuminated by a white backlight, are separately filtered into red, green, and blue wavelength ranges, together representing a single full-color pixel column. At sufficient viewing distances, spatial multiplexing of color channels becomes imperceptible. In some embodiments, it has been observed that cascaded dual-layer LCDs can still double the vertical resolution when vertically-aligned CFAs are present on each layer. However, increasing the horizontal resolution may be problematic without modifying the CFA structure.
Two modifications are presented herein to address the problems: the use of multiple color filters per pixel (on the top-most layer) and the use of cyan-yellow-magenta CFAs. Use of both can result in cascaded dual-layer LCDs that appear as a single LCD with twice the number of color subpixels along each axis.
As each subpixel fragment may depict a different color if it has an independent color filter, cascaded dual-layer LCDs can be constructed using monochromatic panels (e.g., those free of any color filter arrays). Offsetting such displays by half a pixel, both horizontally and vertically, creates four times as many subpixel fragments as pixels on a single layer. To create a spatially-multiplexed color display, a CFA having one color filter per subpixel fragment may be used. This can be achieved by fabricating one panel with a CFA having half the pitch of a conventional panel, such that two vertically-aligned color filters are present at each pixel in the outermost display panel. In this manner, each subpixel fragment, rather than the larger layer pixel, is individually filtered by the single custom CFA.
As an alternative, two LCD panels with identical color filter arrays can be used.
Given a fixed CFA, a single filter can act on each column of pixels. Consider a pair of LCDs with periodic columns of cyan, yellow, and magenta filters, beginning with a cyan column on the left-hand side. The second panel can be positioned with an offset of one-and-a-half pixels to the right and half a pixel up or down (see
For example, in the diagram 1510 showing the first layer with a CFA, the pixels (a1-a3) in the first column are cyan; the pixels (a4-a6) in the second column are yellow; the pixels (a7-a9) in the third column are magenta; and the pixels (a10-a12) in the fourth column are cyan. In the diagram 1520 showing a second light-absorbing display placed in direct contact with the rear display layer with an identical CFA, the pixels (b1-b3) in the first column are magenta; the pixels (b4-b6) in the second column are cyan; the pixels (b7-b9) in the third column are yellow; and the pixels (b10-b12) in the fourth column are magenta.
The diagram 1530 shows that the geometric overlap of offset pixel layers creates an array of subpixel fragments. The spectral overlap of the color filters creates an effective CFA that appears as a traditional red-green-blue filter pattern with twice the pitch of the underlying CFAs. More specifically, the subpixels in columns 1531, 1534 and 1537 are blue, the subpixels in columns 1532 and 1535 are red, and the subpixels in columns 1533 and 1536 are green.
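The effective subpixel colors arise from the element-wise (subtractive) multiplication of the stacked filter transmissions; a minimal sketch, with idealized unit-transmission filters as an assumption (real filter spectra are broader and overlap):

```python
import numpy as np

# Idealized RGB transmission of each subtractive (cyan-yellow-magenta)
# filter: cyan blocks red, yellow blocks blue, magenta blocks green.
FILTERS = {'cyan':    np.array([0.0, 1.0, 1.0]),
           'yellow':  np.array([1.0, 1.0, 0.0]),
           'magenta': np.array([1.0, 0.0, 1.0])}

def effective_filter(front, back):
    """Spectral overlap of two stacked filters: element-wise transmission."""
    return FILTERS[front] * FILTERS[back]

# The three pairings produced by the offset CYM columns yield an
# effective red-green-blue pattern:
assert np.array_equal(effective_filter('cyan', 'magenta'), [0, 0, 1])   # blue
assert np.array_equal(effective_filter('yellow', 'cyan'), [0, 1, 0])    # green
assert np.array_equal(effective_filter('magenta', 'yellow'), [1, 0, 0]) # red
```

Each cyan-yellow-magenta pairing passes exactly one primary, which is why the overlapped layers synthesize an RGB CFA at twice the subpixel density.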
This idea can be extended to other sub-pixel layouts and color filters, such as a 2×2-grid of cyan, yellow, magenta, and white. When offset by a quarter pixel in each dimension, the resolution increases by four times, and the display now has apparent cyan, yellow, magenta, red, green, blue, and white sub-pixels. It will be appreciated that the multi-layer cyan-yellow-magenta CFA described herein is not all-encompassing and is offered as an illustrative example.
As with the 2×2-grid, more general CFA patterns and filter bandpass spectra can be used, following the same basic principle: overlapped CFAs can synthesize arbitrary target CFAs that modulate individual subpixel fragments, while utilizing existing display manufacturing processes that create a single color filter per pixel, per display layer.
In some other embodiments, the utilization of high-speed LCDs may eliminate the need for CFAs. Instead, field-sequential color (FSC) is used, in which monochromatic panels sequentially display each color channel while the backlight color is altered.
In still some other embodiments, the effective CFA could also be achieved simply by manufacturing one of the layers using a red-green-blue CFA with twice the normal pitch, with no CFA placed in the other layer.
With respect to spatial superresolution, solutions of Equation (3) offer a display designer a flexible trade-off between apparent image brightness, spatial resolution, and refresh rate, as captured by the dimming factor β, the resolution of the target image W° T, and the factorization rank K, respectively.
With respect to temporal superresolution, solutions of Equation (5) also offer flexible control between brightness, resolution, and refresh rate. Architectures intended for spatiotemporal superresolution may include an optical blurring element (characterized by the point spread function embedded in the convolution matrix C). In some embodiments, factorizations with 2×2-pixel uniform blur kernels are sufficient to render high-PSNR reconstructions for a variety of target videos, as described in greater detail below. However, in some other embodiments, effective superresolution can be achieved without added blur, and therefore diffusing elements need not be incorporated.
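A 2×2-pixel uniform blur kernel of the kind embedded in C can be sketched as a simple valid-mode average; this is an illustrative model of the diffuser's point spread function, not the disclosed construction of the convolution matrix.

```python
import numpy as np

def uniform_blur_2x2(img):
    """Apply a 2x2-pixel uniform blur (valid mode).

    Each output pixel is the average of a 2x2 neighborhood, modeling a
    small optical diffuser that additively couples adjacent pixels.
    """
    return (img[:-1, :-1] + img[1:, :-1] + img[:-1, 1:] + img[1:, 1:]) / 4.0

# Usage sketch: a 4x4 ramp image blurs to a 3x3 image.
img = np.arange(16.0).reshape(4, 4)
blurred = uniform_blur_2x2(img)
```

Building C amounts to writing this same averaging, row by row, as a sparse matrix acting on the vectorized image.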
Several superresolution techniques according to the prior art are utilized to generate display results, which are compared with those generated from a cascaded display system according to the present disclosure.
According to an additive superresolution display model in the prior art, a set of superimposed, shifted low-resolution images is presented, e.g., through vibrating displays and superimposed projections. It has been assumed that no motion blur is introduced, which would otherwise further degrade image quality for vibrating displays.
An optical pixel sharing (OPS) approach according to the prior art is also used to generate images for comparison purposes. The OPS implementation requires specifying two tuning parameters: the edge threshold and the smoothing coefficient. A two-dimensional grid search was used to optimize these parameters, independently for each target image, to maximize the PSNR or the SSIM index. In practice, ensemble-averaged tuning parameters are used, increasing reconstruction artifacts. In contrast, cascaded displays according to the present disclosure do not require optimizing any such tuning parameters, further advantageously facilitating real-time applications.
The spatial light modulators used in each of these display alternatives may have variable pixel aperture ratios. As observed, limited aperture ratios translate to improved image quality for additive superresolution displays. However, spatial superresolution from additive superpositions is practically hindered due to the engineering challenges associated with limiting aperture ratios—particularly for superimposed projections. Furthermore, industry trends are pushing ever-higher aperture ratios (e.g., LCoS microdisplays and power-efficient LCDs). As a result, a 100% aperture ratio is assumed in all comparisons presented herein.
Several observations can be made from the visual comparisons and PSNR table. Foremost, for these examples, single-frame cascaded display factorizations closely approach or outperform all other methods utilizing two time-multiplexed frames. These PSNR advantages translate to visible reductions in artifacts.
Notice the enhancement relative to a conventional (low-resolution) display (column 1702). Cascaded displays (columns 1706 and 1707) significantly outperform optical pixel sharing (OPS) (columns 1704 and 1705), which relies on a similar dual-modulation architecture containing relay optics. Simulations of additive superresolution (columns 1703 and 1704) also appear to outperform OPS, under the assumption that no motion blur is used in the additive simulations.
Two-frame cascaded display factorizations (column 1707) outperform all other two-frame methods (e.g., column 1703) by a significant margin, and even outperform four-frame additive superresolution. This highlights the benefits of the compressive capabilities enabled by our matrix-factorization-based approach.
The following expands on the PSNR analysis by comparing the modulation transfer functions (MTFs) characterizing each superresolution display alternative, specifying the contrast of spatially-superresolved images as a function of spatial frequency. The MTF of a display can be measured using a variety of test patterns, including natural image sets, spatial frequency chirps, and slanted edges. Here a chirped zone plate pattern is adopted, having the form (1 + cos(cr^2))/2, where r = sqrt(x^2 + y^2), {x, y} ∈ [−π, π], and c controls the maximum spatial frequency.
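The chirped zone plate can be generated directly from the formula above; a minimal sketch (the resolution and the value of c are arbitrary choices for illustration):

```python
import numpy as np

def zone_plate(size=512, c=20.0):
    """Chirped zone plate test pattern (1 + cos(c r^2)) / 2 on [-pi, pi]^2.

    Local spatial frequency grows with radius, so the radius at which
    a display loses contrast reveals its MTF cutoff; c sets the maximum
    spatial frequency reached at the pattern corners.
    """
    x = np.linspace(-np.pi, np.pi, size)
    X, Y = np.meshgrid(x, x)
    r2 = X**2 + Y**2
    return (1.0 + np.cos(c * r2)) / 2.0

pattern = zone_plate(256, c=10.0)
```

Displaying this pattern and photographing the result allows the contrast of each concentric ring, and hence the MTF at the corresponding spatial frequency, to be read off directly.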
MTF analysis confirms the earlier observations made regarding the relative performance of each approach. Furthermore, it reveals that single-frame cascaded displays effectively quadruple the spatial resolution (doubling it along each image dimension)—albeit with artifacts introduced by compression—maintaining greater than 70% contrast for the highest superresolved frequencies.
Three alternatives are compared: additive superresolution displays using either two or four frames, optical pixel sharing (OPS) using two frames, and cascaded displays using one, two, three, and four frames. Additive superresolution uses a single display layer, whereas OPS and cascaded displays employ two display layers. Two versions of OPS are included. In the first version, the edge threshold is optimized and 1/ε = 8 is used for smoothing. In the second version, both the edge threshold and the smoothing parameter 1/ε are optimized. To optimize the parameters for this image set, the average PSNR in the last row of the table is used as the objective function. For the table on the right (in grey), OPS parameters are optimized per image for the best achievable quality.
The data demonstrates that single-frame cascaded displays achieve better quality than two-frame additive superresolution displays, both in terms of PSNR and SSIM. Single-frame cascaded displays achieve roughly the quality of a two-frame OPS display: their average PSNR is slightly less than for the jointly optimized OPS (our improvement to the original OPS paper), but their average SSIM is slightly better than for jointly optimized OPS. Cascaded displays with two or more frames outperform all other methods by significant margins.
MTFs are computed using the slanted edge method. In this case, the MTF is estimated from the profile of the slanted edge. Note the slanted edge MTF of the cascaded display matches the MTF of the target image. OPS reproduces the slanted edge very well, since there is enough pixel intensity in the bright regions that it can redistribute to the edge.
Although certain preferred embodiments and methods have been disclosed herein, it will be apparent from the foregoing disclosure to those skilled in the art that variations and modifications of such embodiments and methods may be made without departing from the spirit and scope of the invention. It is intended that the invention shall be limited only to the extent required by the appended claims and the rules and principles of applicable law.
This application claims priority and benefit to U.S. Provisional Patent Application No. 61/955,057, filed on Mar. 18, 2014, titled “CASCADED DISPLAYS: SPATIOTEMPORAL SUPERRESOLUTION USING OFFSET PIXEL LAYERS,” the entire content of which is incorporated by reference herein for all purposes.