This invention relates to hardware acceleration of signal processing systems for displaying an image using holographic techniques.
We have previously described, in UK Patent Application No. GB0329012.9, filed 15 Dec. 2003, now published as WO2005/059881 (hereby incorporated by reference in its entirety), a method of displaying a holographically generated video image comprising plural video frames, the method comprising providing for each frame period a respective sequential plurality of holograms and displaying the holograms of the plural video frames for viewing the replay field thereof, whereby the noise variance of each frame is perceived as attenuated by averaging across the plurality of holograms.
Broadly speaking, embodiments of the method aim to display an image by projecting light via a spatial light modulator (SLM) onto a screen. The SLM is modulated with holographic data approximating a hologram of the image to be displayed, but this holographic data is chosen in a special way, the displayed image being made up of a plurality of temporal subframes, each generated by modulating the SLM with a respective subframe hologram. These subframes are displayed successively and sufficiently fast that in the eye of a (human) observer the subframes (each of which has the spatial extent of the displayed image) are integrated together to create the desired image for display.
Each of the subframe holograms may itself be relatively noisy, for example as a result of quantising the holographic data into two (binary) or more phases, but temporal averaging amongst the subframes reduces the perceived level of noise. Embodiments of such a system can provide visually high quality displays even though each subframe, were it to be viewed separately, would appear relatively noisy.
A scheme such as this has the advantage of reduced computational requirements compared with schemes which attempt to accurately reproduce a displayed image using a single hologram, and also facilitates the use of a relatively inexpensive SLM.
Here it will be understood that the SLM will, in general, provide phase rather than amplitude modulation, for example a binary device providing relative phase shifts of zero and π (+1 and −1 for a normalised amplitude of unity). In preferred embodiments, however, more than two phase levels are employed, for example four phase modulation (zero, π/2, π, 3π/2), since with only binary modulation the hologram results in a pair of images, one spatially inverted with respect to the other, losing half the available light, whereas with multi-level phase modulation where the number of phase levels is greater than two this second image can be removed. Further details can be found in our earlier application GB0329012.9 (ibid), hereby incorporated by reference in its entirety.
Although embodiments of the method are computationally less intensive than previous holographic display methods it is nonetheless generally desirable to provide a system with reduced cost and/or power consumption and/or increased performance. It is particularly desirable to provide improvements in systems for video use which generally have a requirement for processing data to display each of a succession of image frames within a limited frame period.
According to the present invention there is therefore provided a hardware accelerator for a holographic image display system, the image display system being configured to generate a displayed image using a plurality of holographically generated temporal subframes, said temporal subframes being displayed sequentially in time such that they are perceived as a single reduced-noise image, each said subframe being generated holographically by modulation of a spatial light modulator with holographic data such that replay of a hologram defined by said holographic data defines a said subframe, the hardware accelerator comprising: an input buffer to store image data defining said displayed image; an output buffer to store holographic data for a said subframe; at least one hardware data processing module coupled to said input data buffer and to said output data buffer to process said image data to generate said holographic data for a said subframe; and a controller coupled to said at least one hardware data processing module to control said at least one data processing module to provide holographic data for a plurality of said subframes corresponding to image data for a single said displayed image to said output data buffer.
Preferably a plurality of the hardware data processing modules is included for processing data for a plurality of the subframes in parallel. In preferred embodiments the hardware data processing module comprises a phase modulator coupled to the input data buffer and having a phase modulation data input to modulate phases of pixels of the image in response to an input which preferably comprises at least partially random phase data. This data may be generated on the fly or provided from a non-volatile data store. The phase modulator preferably includes at least one multiplier to multiply pixel data from the input data buffer by input phase modulation data. In a simple embodiment the multiplier simply changes a sign of the input data.
In embodiments an output of the phase modulator is provided to a space-frequency transformation module such as a Fourier transform or inverse Fourier transform module. In the context of the holographic subframe generation procedure described later these two operations are substantially equivalent, effectively differing only by a scale factor. In other embodiments other space-frequency transformations may be employed (generally frequency referring to spatial frequency data derived from spatial position or pixel image data). In some preferred embodiments the space-frequency transformation module comprises a one-dimensional Fourier transformation module with feedback to perform a two-dimensional Fourier transformation of the (spatial distribution of the) phase modulated image data to output holographic subframe data. This simplifies the hardware and enables processing of, for example, first rows then columns (or vice versa).
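By way of illustration only, the row-then-column use of a single one-dimensional transform can be sketched in software as follows. This is a minimal sketch, not the hardware design: the function names are hypothetical, a naive discrete Fourier transform stands in for the hardware transform module, and the direct two-dimensional sum is included only to check that the two passes give the same result.

```python
import cmath


def dft_1d(values):
    """Naive 1D discrete Fourier transform (O(n^2)); stands in for a hardware FFT core."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def transpose(matrix):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(column) for column in zip(*matrix)]


def dft_2d_row_column(image):
    """2D transform built from two passes through the same 1D transform."""
    rows_done = [dft_1d(row) for row in image]                       # pass 1: rows
    columns_done = [dft_1d(col) for col in transpose(rows_done)]     # pass 2: columns (fed back)
    return transpose(columns_done)                                   # back to [v][u] ordering


def dft_2d_direct(image):
    """Direct 2D definition, used here only to check the row-column result."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(
                image[y][x] * cmath.exp(-2j * cmath.pi * (u * x / width + v * y / height))
                for y in range(height)
                for x in range(width)
            )
            for u in range(width)
        ]
        for v in range(height)
    ]
```

Because the same one-dimensional routine serves both passes, only the intermediate (row-transformed) data needs to be stored and fed back, which is the simplification referred to above.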
In preferred embodiments the hardware data processing module also includes a quantiser coupled to the output of the transformation module to quantise the holographic subframe data to provide holographic data for a subframe for the output buffer. The quantiser may quantise into two, four or more (phase) levels. In preferred embodiments the quantiser is configured to quantise real and imaginary components of the holographic subframe data to generate a pair of subframes for the output buffer. Thus in general the output of the space-frequency transformation module comprises a plurality of data points over the complex plane and this may be thresholded (quantised) at a point on the real axis (say zero) to split the complex plane into two halves and hence generate a first set of binary quantised data, and then quantised at a point on the imaginary axis, say 0j, to divide the complex plane into a further two regions (imaginary component greater than 0, imaginary component less than 0). Since the greater the number of subframes the less the overall noise, this provides further benefits.
Preferably one or both of the input and output buffers comprise dual-ported memory.
In some particularly preferred embodiments the holographic image display system comprises a video image display system and the displayed image comprises a video frame.
The invention further provides a holographic image display system including a hardware accelerator as described above.
These and other aspects of the invention will now further be described, by way of example only, with reference to the accompanying figures in which:
FIGS. 15a and 15b show, respectively, a holographic image display system incorporating a hardware accelerator, and a consumer electronic device incorporating the holographic image display system of FIG. 15a.
In an embodiment, the various stages of the hardware accelerator implement the algorithm listed below. The algorithm is a method of generating, for each video frame I=Ixy, sets of N binary-phase holograms h(1) . . . h(N). Statistical analysis of the algorithm has shown that such sets of holograms form replay fields that exhibit mutually independent additive noise.
Step 1 forms N targets Gxy(n) equal to the amplitude of the supplied intensity target Ixy, but with independent identically-distributed (i.i.d.), uniformly-random phase. Step 2 computes the N corresponding full complex Fourier transform holograms guv(n). Steps 3 and 4 compute the real part and imaginary part of the holograms, respectively. Binarisation of each of the real and imaginary parts of the holograms is then performed in step 5: thresholding around the median of muv(n) ensures equal numbers of −1 and 1 points are present in the holograms, achieving DC balance (by definition) and also minimal reconstruction error. In an embodiment, the median value of muv(n) is assumed to be zero. This assumption can be shown to be valid and the effects of making this assumption are minimal with regard to perceived image quality. Further details can be found in the applicant's earlier application (ibid), to which reference may be made.
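The steps just described can be sketched in software as follows. This is a minimal sketch under stated assumptions rather than the applicant's implementation: the function names are hypothetical, the amplitude is taken as the square root of the intensity, a naive discrete Fourier transform stands in for the hardware transform, the median is assumed to be zero as in the embodiment above, and each transform is used to yield two binary holograms (one from the real part, one from the imaginary part).

```python
import cmath
import math
import random


def dft_1d(values):
    """Naive 1D discrete Fourier transform, standing in for a hardware FFT core."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def dft_2d(field):
    """2D transform as a row pass followed by a column pass."""
    rows = [dft_1d(row) for row in field]
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def ospr_subframes(intensity, n_subframes, rng):
    """Return n_subframes binary-phase holograms (entries +1 or -1) for one frame."""
    holograms = []
    while len(holograms) < n_subframes:
        # Step 1: target amplitude sqrt(I) with i.i.d. uniformly random phase.
        target = [
            [math.sqrt(i) * cmath.exp(2j * math.pi * rng.random()) for i in row]
            for row in intensity
        ]
        # Step 2: full complex Fourier transform hologram.
        g = dft_2d(target)
        # Steps 3 to 5: take real and imaginary parts and threshold each at
        # zero (the assumed median). Exact zeros map to -1 in this sketch.
        real_frame = [[1 if v.real > 0 else -1 for v in row] for row in g]
        imag_frame = [[1 if v.imag > 0 else -1 for v in row] for row in g]
        holograms.extend([real_frame, imag_frame])
    return holograms[:n_subframes]
```

For example, `ospr_subframes(frame, 24, random.Random(0))` would return 24 binary holograms for a frame held as a list of rows of non-negative intensities; the seeded generator is used here only to make the sketch repeatable.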
The purpose of the phase-modulation block shown in the embodiment of
The quantisation hardware that is shown in the embodiment of
There are many different ways in which phase-modulation data, as shown in
In another embodiment, pre-calculated phase modulation data is stored in a look-up table and a sequence of address values for the look-up table is produced, such that the phase-data read out from the look-up table is random. In this embodiment, it can be shown that a sufficient condition to ensure randomness is that the number of entries in the look-up table, N, is greater than the value, m, by which the address value increases each time, that m is not an integer factor of N, and that the address values ‘wrap around’ to the start of their range when N is exceeded. In a preferred embodiment, N is a power of 2, e.g. 256, such that address wrap around is obtained without any additional circuitry, and m is an odd number such that it is not a factor of N.
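The address sequence of the preferred embodiment can be sketched as follows. This is an illustrative sketch only: the function name and parameters are hypothetical, and the code checks only that every table entry is visited once before the sequence repeats (which holds when the table size is a power of two and the step is odd); it makes no claim about the statistical quality of the resulting phase data.

```python
def lut_addresses(table_size, step, count, start=0):
    """Generate look-up table addresses that advance by `step` and wrap around.

    table_size is assumed to be a power of two, so the wrap-around is a
    simple bit mask (in hardware, the address register just overflows).
    """
    mask = table_size - 1
    address = start
    addresses = []
    for _ in range(count):
        addresses.append(address)
        address = (address + step) & mask
    return addresses
```

With a table of 256 entries and an odd step such as 37, the first 256 addresses are all distinct. An even step such as 64 would revisit only four entries, which illustrates why an odd step is preferred with a power-of-two table.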
In some implementations of an OSPR-type algorithm the input image is padded with zeros around the edges to create an enlarged image plane prior to performing a holographic transform, for example, so that the transformed image fits the SLM (for more details see co-pending UK patent application no. 0610784.1 filed 2 Jun. 2006, hereby incorporated by reference in its entirety). In such a case when performing an (I)FFT the zeros (more precisely, the zeroed areas) may be omitted to speed up the processing.
Further details of an example embodiment of the system are described below:
In this example, the holograms (OSPR frames) were displayed on an SXGA (1280×1024) reflective binary phase modulating spatial light modulator (SLM) made by CRL Opto. The SLM was driven by CRL Opto's custom interface board, taking either a DVI or a digitised VGA signal. The native signal was a 1280×1024 60 Hz, 8 bits per colour plane signal, yielding a total of 24 bits. This signal was interpreted as 24 individual binary planes which were displayed sequentially on the SLM at a rate of 1440 frames per second.
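The interpretation of a 24-bit video pixel as 24 binary planes can be sketched as follows. The function name is hypothetical and the bit ordering (plane 0 taken from the least significant bit) is an assumption made for illustration; the order actually used by the interface board is not stated here.

```python
VIDEO_FRAME_RATE_HZ = 60   # native signal rate from the text
BITS_PER_PIXEL = 24        # 8 bits for each of three colour planes

# 24 binary planes per video frame at 60 Hz gives the 1440 planes per second quoted above.
BINARY_PLANE_RATE_HZ = VIDEO_FRAME_RATE_HZ * BITS_PER_PIXEL


def bit_planes(pixels_24bit):
    """Split a frame of 24-bit pixel words into 24 binary planes.

    Plane b holds bit b of every pixel (assumption: plane 0 is the LSB).
    """
    return [
        [[(pixel >> bit) & 1 for pixel in row] for row in pixels_24bit]
        for bit in range(BITS_PER_PIXEL)
    ]
```

Seen from the hologram generator, each binary plane is one OSPR subframe, so one video frame carries up to 24 subframes.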
This was well suited to an N=24 implementation of OSPR (although N=16 provides a good projected image). A VGA signal, as described above, was provided from an FPGA development board.
In a constructed embodiment the FPGA development board used to implement the algorithm comprised a Virtex-II (xc2v2000-ff896) Multimedia and Microblaze demonstration board of Xilinx Inc. The Xilinx ISE Foundation software was used to synthesise and implement the design from a Verilog entry. The board was programmed with the Xilinx Parallel Cable IV, and Chipscope Integrated Logic Analyser (ILA) cores were inserted for the process of debugging.
Whether or not a Fast Fourier Transform (FFT) is used, a two-dimensional transform may still be achieved by splitting into rows then columns. Given that the FFT supports streaming, a complete two-dimensional Fourier transform using 1D 1024-point transforms therefore takes 2×1024² clock cycles, plus the latency: i.e. for a clock running at 108 MHz (for reasons described later), a complete 1024×1024 transform takes 19.5 ms, or it can be run at a maximum frequency (with this example hardware) of just over 50 Hz.
In the present application, a shortcut may be taken when one bears in mind that for any binary hologram, a conjugate image is produced.
The shortcut gives a total transform time of 14.6 ms, or about 69 Hz. For an N=24 implementation this yields a maximum frame-rate of 5.72 fps (frames per second), and for N=16 a frame-rate of 8.57 fps, as a single Fourier transform produces two frames. For full frame rate video (at least 25 fps) either more FFT cores may be provided in parallel on the FPGA, or the core can be clocked at a higher speed, or a lower value of N can be employed.
The algorithm calls for a median quantisation process for both the real and the imaginary outputs of the Fourier transform. This helps to ensure the overall DC balancing. Median quantisation, however, generally requires all values to be known before the middle value can be found, so that all values can be quantised to (1, −1) based on which side they are of the median value.
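The timing figures above can be checked with a short calculation. This is a sketch under a stated assumption: the 14.6 ms figure is consistent with 1.5×1024² clock cycles, which is taken here to mean that half of the first-pass one-dimensional transforms are skipped; that cycle count is an inference from the quoted time, not a figure given in the text. Latency is ignored, so the full transform comes out slightly under the quoted 19.5 ms.

```python
CLOCK_HZ = 108e6   # system clock from the text
N_POINTS = 1024    # length of each 1D transform


def cycles_to_ms(cycles, clock_hz=CLOCK_HZ):
    """Convert a clock-cycle count to milliseconds."""
    return 1000.0 * cycles / clock_hz


# Full transform: 1024 row transforms plus 1024 column transforms,
# each streaming 1024 points at one point per clock.
FULL_CYCLES = 2 * N_POINTS * N_POINTS

# Assumption: the shortcut removes half of the first pass, leaving 1.5 * 1024^2 cycles.
REDUCED_CYCLES = 3 * N_POINTS * N_POINTS // 2


def video_frame_rate(transform_ms, n_subframes):
    """Video frames per second when each transform yields two subframes."""
    transforms_per_second = 1000.0 / transform_ms
    return 2.0 * transforms_per_second / n_subframes
```

On these assumptions `cycles_to_ms(REDUCED_CYCLES)` is about 14.6 ms, and `video_frame_rate` gives roughly 5.7 fps for N=24 and 8.6 fps for N=16, in line with the figures quoted.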
To implement this in an FPGA could cause a bottleneck since it would require one pass to find the median of the data before the quantisation stage. Also, all 1024×1024 16-bit real and imaginary values would have to be stored to be compared with the median. This would require an additional 1024×1024×2=2097152 clock cycles, or 19.4 ms if running at 108 MHz. Two work-around possibilities are:
Both of these methods can be very easily pipelined: (1) is easily implemented by simply storing the sign-bit of the output of the FFT; and (2) can be pipelined by storing preferably all the last frame's FFT output values, and sorting them while the current frame is being calculated.
For reasons of simplicity, and because only a limited amount of memory was available on board, method (1) was chosen for the described example embodiment.
This was implemented using a CORDIC (Coordinate Rotation Digital Computer) core. The selected core also had an in-built scale compensator to compensate for the increase in magnitude caused by the CORDIC algorithm. The 8-bit image greyscale magnitudes were simply fed in to the core, along with random numbers generated from an XORshift register. A 16-bit CORDIC core was used for greater precision, with the magnitudes being fed to bits [15-7] (bit 16 is the sign bit, and hence for images in this example it will always be 0).
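The difference between thresholding about the true median and simply keeping the sign bit can be sketched as follows. The function names are hypothetical and the code works on a flat list of real values for clarity; it is intended only to show why the first needs all values in hand while the second can be applied to each value as it leaves the transform.

```python
import statistics


def quantise_about_median(values):
    """Two-pass quantisation: every value must be known before the threshold is."""
    median = statistics.median(values)
    return [1 if value > median else -1 for value in values]


def quantise_sign_bit(values):
    """Single-pass quantisation: the median is assumed to be zero, so only the sign is kept."""
    return [1 if value > 0 else -1 for value in values]
```

When the data really is centred on zero the two give the same result. When it is not (for example, all values positive) the sign-bit version loses DC balance, which is the trade-off accepted in the example embodiment.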
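A software sketch of this stage is given below for illustration. It is not the vendor core: the shift constants of the xorshift generator (13, 17, 5, a commonly used 32-bit variant), the 16 iterations, the floating-point arithmetic and all function names are assumptions. The sketch shows a CORDIC rotation that turns a magnitude and a random phase into real and imaginary parts, with division by the accumulated gain standing in for the scale compensator.

```python
import math

ITERATIONS = 16
# Elementary rotation angles atan(2^-i) and the accumulated CORDIC gain.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))


def xorshift32(state):
    """One step of a 32-bit xorshift generator (assumed shift constants 13, 17, 5)."""
    state ^= (state << 13) & 0xFFFFFFFF
    state ^= state >> 17
    state ^= (state << 5) & 0xFFFFFFFF
    return state


def cordic_rotate(magnitude, angle):
    """Rotate the vector (magnitude, 0) by angle, for |angle| <= pi/2."""
    x, y, z = magnitude, 0.0, angle
    for i in range(ITERATIONS):
        direction = 1.0 if z >= 0 else -1.0
        # Each step is a shift-and-add in hardware; 2^-i is a right shift by i.
        x, y = x - direction * y * 2.0 ** -i, y + direction * x * 2.0 ** -i
        z -= direction * ANGLES[i]
    return x / GAIN, y / GAIN   # scale compensation


def polar_to_rect(magnitude, angle):
    """Extend the rotation to any angle by folding into the range [-pi/2, pi/2]."""
    angle = (angle + math.pi) % (2.0 * math.pi) - math.pi   # wrap to [-pi, pi)
    if angle > math.pi / 2:
        x, y = cordic_rotate(magnitude, angle - math.pi)
        return -x, -y
    if angle < -math.pi / 2:
        x, y = cordic_rotate(magnitude, angle + math.pi)
        return -x, -y
    return cordic_rotate(magnitude, angle)


def random_phase_pixel(magnitude, state):
    """Give one pixel a pseudo-random phase; returns ((re, im), next_state)."""
    state = xorshift32(state)
    angle = 2.0 * math.pi * state / 2 ** 32
    return polar_to_rect(magnitude, angle), state
```

The returned state would be carried from pixel to pixel so that successive pixels receive different phases; a non-zero seed is required, since an all-zero xorshift state never changes.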
In order to store all the data for multiple OSPR frames in the finite amount of memory available, the output from the quantiser was collated. The NtRAMs, whose data width is 32 bits, have the facility to enable the data to be written by individual bytes. The single bits from the quantiser (both real and imaginary) were put in to a one-byte sized shift register. Every four cycles (hence one complete shift through the shift register), this byte-sized shift register was written to memory using a byte-mask. This procedure is shown in
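The collation of quantiser bits into bytes can be sketched as follows. The function name is hypothetical, and the ordering of bits within the byte (earlier cycles in the more significant positions, real bit above imaginary bit) is an assumption chosen for illustration.

```python
def collate_bits(real_bits, imag_bits):
    """Pack one real and one imaginary bit per cycle into bytes, four cycles per byte."""
    packed = []
    shift_register = 0
    for cycle, (re_bit, im_bit) in enumerate(zip(real_bits, imag_bits), start=1):
        # Shift in two bits per cycle; the register is one byte wide.
        shift_register = ((shift_register << 2) | (re_bit << 1) | im_bit) & 0xFF
        if cycle % 4 == 0:
            # One complete shift through the register: write the byte out.
            packed.append(shift_register)
            shift_register = 0
    return packed
```

In the hardware each such byte would be written to one byte lane of a 32-bit memory word using the byte mask, so that four bytes share one address; bits left over from an incomplete group of four cycles are simply dropped by this sketch.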
Two dual-memory frame buffers were implemented in the system. Essentially they were composed of two NtRAMs, one being written to while the other read. A single-bit input to the dual-memory frame buffer configured which NtRAM was being written to, hence giving the ability to be able to switch between the two RAMs.
The output frame buffer was read continuously by the video DAC for the output SXGA signal, while data was written to it by the collated outputs of the FFT.
The input frame buffer had data supplied from the input image FIFO (first-in and first-out) buffer, while data was read in to the phase randomiser.
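The behaviour of such a dual-memory frame buffer can be sketched as follows. The class and method names are hypothetical, and Python lists stand in for the two memories; the single select bit decides which memory receives writes while the other serves reads.

```python
class DualFrameBuffer:
    """Two memories used alternately: one is written while the other is read."""

    def __init__(self, size):
        self._rams = [[0] * size, [0] * size]
        self._write_select = 0   # the single-bit input: which memory is written

    def write(self, address, value):
        """Write to the memory currently selected for writing."""
        self._rams[self._write_select][address] = value

    def read(self, address):
        """Read from the other memory, which holds the last completed frame."""
        return self._rams[1 - self._write_select][address]

    def swap(self):
        """Toggle the select bit, typically once per completed frame."""
        self._write_select ^= 1
```

A reader therefore never sees a partially written frame: data written before a swap only becomes visible to `read` after the swap.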
An Analog Devices ADV7185 (NTSC/PAL video decoder) was used to decode a composite video signal.
In order to configure the device via the I2C bus standard, a simple microprocessor was implemented in the FPGA (KCPSM-II (Constant (K) Coded Programmable State Machine 2, written by Chapman, K. of Xilinx Inc.)). The ADV7185 was configured to give 10-bit luminance data interleaved by the two chrominance channels (YUV data) as a 27 MHz data stream: Cb0Y0Cr0Y1Cb2Y2Cr2Y3Cb4Y4Cr4Y5 . . . (This data stream is a ‘4:2:2’ sampling scheme, where there are only chrominance values for every other luminance value). This data was fed into a line-field decoder in order to find the line timing signals, signals embedded in the data through the use of reserved data words used as timing reference signals (TRS) (see International Telecommunication Union video standards ITU-R BT.656 and ITU-R BT.601).
The data stream was then converted from the 4:2:2 scheme to a 4:4:4 scheme by interpolating between the chrominance values, (this is shown in
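A software sketch of the 4:2:2 to 4:4:4 conversion is given below for illustration. The function name is hypothetical, linear interpolation between neighbouring chrominance samples is assumed, and at the end of a line the last chrominance sample is simply repeated; the stream is assumed to hold active video only, with timing words already removed.

```python
def upsample_422_to_444(stream):
    """Convert a flat Cb0, Y0, Cr0, Y1, Cb2, Y2, Cr2, Y3, ... stream to (Y, Cb, Cr) pixels."""
    cb = stream[0::4]
    luma_even = stream[1::4]
    cr = stream[2::4]
    luma_odd = stream[3::4]
    pixels = []
    for i in range(len(cb)):
        # Even pixels carry their own chrominance samples.
        pixels.append((luma_even[i], cb[i], cr[i]))
        # Odd pixels take the average of the chrominance on either side.
        nxt = min(i + 1, len(cb) - 1)   # repeat the last sample at the line end
        pixels.append((luma_odd[i], (cb[i] + cb[nxt]) // 2, (cr[i] + cr[nxt]) // 2))
    return pixels
```

For a monochrome hologram only the luminance value of each pixel would then be needed, but the sketch keeps all three components.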
The next stage was a de-interlacing stage. The method chosen to de-interlace was ‘Multiple Field Processing’. The two fields (odd and even) were stored together in memory to form a single frame (sometimes referred to as ‘weave’). This was achieved by having an address counter that stored the odd and even fields together. This method of de-interlacing produced the highest resolution output picture, but sometimes had undesirable visual artifacts (double imaging) when the image had significant movement (for example, the image may have changed significantly after the odd field was sent, before the even field is sent). An alternative is to interpolate between the lines of each field.
As the timing of the luminance data was not regular, the data was fed in to a FIFO buffer before being stored in the NtRAM. Another FIFO was placed in parallel with this and was fed with the address of the luminance value being stored, in order to de-interlace the signal.
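The ‘weave’ de-interlacing and the associated address calculation can be sketched as follows. The function names are hypothetical, and the choice of which field supplies the first frame line is an assumption made for illustration.

```python
def weave(odd_field, even_field):
    """Interleave two fields line by line to form one frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.append(odd_line)    # assumption: the odd field supplies frame lines 0, 2, 4, ...
        frame.append(even_line)   # the even field supplies frame lines 1, 3, 5, ...
    return frame


def frame_address(field_is_even, line_in_field, pixel, line_width):
    """Memory address at which a pixel of either field is stored in the woven frame."""
    frame_line = 2 * line_in_field + (1 if field_is_even else 0)
    return frame_line * line_width + pixel
```

In the hardware the second function corresponds to the address counter: each luminance value is queued in one FIFO and its computed address in the other, so that the two fields end up interleaved in memory without a separate copying pass.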
The FPGA supplied the triple video D/A converter (the FMS3815) with three channels of 8-bit data (decoded by the CRL Opto board into 24 sequential binary frames). The CRL Opto display device had a native resolution of 1280×1024. Standard values for the sync timings and borders were chosen for this resolution, and a clock of 108 MHz was used (hence the rest of the system was run at 108 MHz for simplicity). As the data had been collated within the FPGA by the ‘output collater’ module, the data had to be ‘unpacked’ before being sent to the FMS3815.
FIG. 15a shows a holographic image display system incorporating a hardware accelerator 100 as described above. The hardware accelerator 100 has an input 102 to receive image data, for example from a consumer electronic device defining an image to be displayed. The hardware accelerator 100 drives SLM 24 to project a plurality of phase hologram sub-frames which combine to give the impression of displayed image 14 in the replay field (RPF).
In more detail, a laser diode 20 (for example, at 532 nm) provides substantially collimated light 22 to a spatial light modulator 24 such as a pixellated liquid crystal modulator. The SLM 24 phase modulates light 22 with a hologram and the phase modulated light is provided to a demagnifying optical system 26. In the illustrated embodiment, optical system 26 comprises a pair of lenses 28, 30 with respective focal lengths f1, f2, f1<f2, spaced apart at distance f1+f2. Optical system 26 (which is not essential) increases the size of the projected holographic image by diverging the light forming the displayed image, as shown.
Lenses L1 and L2 (with focal lengths f1 and f2 respectively) form the beam-expansion pair. This expands the beam from the light source so that it covers the whole surface of the modulator. The skilled person will understand that depending on the relative size of the beam 22 and SLM 24 this may be omitted. Lens pair L3 and L4 (with focal lengths f3 and f4 respectively) form a demagnification lens pair, in effect a demagnifying telescope. This effectively reduces the pixel size of the modulator, thus increasing the diffraction angle. As a result, the image size increases. The increase in image size is equal to the ratio of f3 to f4, which are the focal lengths of lenses L3 and L4 respectively. The skilled person will understand that other optical arrangements can also be used to achieve this effect. In embodiments a filter may also be included to filter out unwanted parts of the displayed image, for example a bright (zero order) undiffracted spot or a repeated first order or conjugate image, which may appear as an upside down version of the displayed image, depending upon how the hologram for displaying the image is generated. Optionally one or more lenses may be encoded in the hologram, as described in UK patent application GB 0606123.8 filed on 28 Mar. 2006, hereby incorporated by reference in its entirety, allowing the size of the optical system to be reduced.
In a colour system light beams from red, green and blue lasers may be combined and modulated by a common SLM (time multiplexed). Techniques for implementing a colour display are described in more detail in UK patent application GB 0610784.1 filed on 2 Jun. 2006, also incorporated by reference in its entirety.
FIG. 15b shows an example of a consumer electronic device 10 incorporating a hardware projection module 12 as described above to project a displayed image 14.
We have described an embodiment of holographic image display hardware which is configured to implement a procedure in which a two-dimensional image is generated using a plurality of holographically generated temporal subframes, the temporal subframes being displayed sequentially in time such that they are perceived as a single reduced-noise image. We have described an example procedure which we broadly refer to as One Step Phase Retrieval (OSPR). We have, however, also described OSPR-type procedures in which, strictly speaking, in some implementations it could be considered that more than one step is employed. The holographic image display hardware we have described is also suitable for implementing these procedures, examples of which are described in GB0518912.1 filed 16 Sep. 2005 and GB0601481.5 filed on 25 Jan. 2006, both hereby incorporated by reference in their entirety.
Broadly speaking, in the first of the above two patent applications “noise” in one sub-frame is compensated in a subsequent sub-frame so that the number of subframes required for a given image quality can be reduced. More particularly feedback is used so that the noise of each subframe compensates for the cumulative noise from previously displayed subframes. In the second, by calculating the holographic subframe data at a higher resolution than is used to display a subframe, phase-induced errors can be compensated by adjusting the target phase data for pixels of the image to compensate for the errors introduced. Preferably this is performed so that the desirable requirement of a substantially flat spatial spectrum is met.
Applications for the above described holographic image display hardware include, but are not limited to, the following: mobile phone; PDA; laptop; digital camera; digital video camera; games console; in-car cinema; personal navigation systems (in-car or wristwatch GPS); head-up/helmet-mounted displays for automobiles or aviation; watch; personal media player (e.g. MP3 player, personal video player); dashboard mounted display; laser light show box; personal video projector (a “video iPod®”); advertising and signage systems; computer (including desktop); and a remote control unit.
No doubt many other effective alternatives will occur to the skilled person. It will be understood that the invention is not limited to the described embodiments and encompasses modifications apparent to those skilled in the art lying within the spirit and scope of the claims appended hereto.
Number | Date | Country | Kind |
---|---|---|---|
0511962.3 | Jun 2005 | GB | national |
0512905.1 | Jun 2005 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB2006/050152 | 6/13/2006 | WO | 00 | 11/19/2008 |