The invention relates to a method of rendering image data for a multi-view display. In particular the invention relates to a method of rendering image data for a multi-view display by means of a depth dependent spatial filter. The invention further relates to a multi-view display, to a signal rendering system and to computer readable code for implementing the method.
A multi-view display is a display capable of presenting different images to a viewer depending upon the view direction, so that an object in an image may be viewed from different angles. An example of a multi-view display is an auto-stereoscopic display capable of presenting a viewer's left eye with a different image than the right eye. Various multi-view display technologies exist; one such technology is lenticular based. A lenticular display is a parallax 3D display capable of showing multiple images for different horizontal viewing directions. This way, the viewer can experience, e.g., motion parallax and stereoscopic cues.
One problem relating to multi-view displays is that images for different view-directions may overlap, thereby giving rise to ghost images, or cross-talk between images. Another problem is that the number of view-directions may be relatively small, typically eight or nine, which may give rise to aliasing effects in some view-directions.
The published US patent application US 2003/0117489 discloses a three dimensional display and method of reducing crosstalk between left and right eye images of a 3D auto-stereoscopic display. The disclosed method of reducing crosstalk is based on adding a base level of grey to every pixel of both the left and right images so as to raise the background grey level.
The inventor of the present invention has appreciated that an improved method of rendering image data is of benefit, and has in consequence devised the present invention.
The present invention seeks to provide improved means for rendering image data for a multi-view display, and it may be seen as an object of the invention to provide an effective filtering technique that ameliorates the perceived image quality of a viewer, or user, of a multi-view display. Preferably, the invention alleviates, mitigates or eliminates one or more of the above or other disadvantages singly or in any combination.
According to a first aspect of the present invention there is provided a method of rendering image data for a multi-view display, the method comprising the steps of:
In a multi-view display, the image data is typically rendered for proper presentation. The rendering may be needed since the image may be based on 2D image data projected to the viewer in such a way that the viewer perceives a spatial, or 3D, dimension of the image. For each view-direction of an image, a sub-image of the image as seen from that view-direction is generated, and the sub-images are projected into the associated view-direction.
The rendering process typically comprises several operations or steps, e.g. depending upon the input format of the image data, the display apparatus, the type of image data, etc. Image data of a first image is provided in a first step. This first step need not be a first step of the entire rendering process. The first image is typically in a format including image plus depth data, or an associated depth map may be provided with the image data, so that the 3D image data may be determined.
The inventor had the insight that spatial filtering for improving the perceived image quality, especially in terms of crosstalk and aliasing effects, is conventionally performed in the output domain, i.e. at a rendering stage where an input image has already been sampled, at least to some degree, for multi-view display. By spatially filtering the first image signal to provide a second image, and sampling the second image to a plurality of sub-images for multi-view, artefacts such as crosstalk and aliasing effects are dealt with in the input domain on a single image instead of in the output domain on a plurality of images, thereby dealing with artefacts in an efficient way.
While filtering a single image in the input domain, rather than the multitude of images in the output domain, may be less perfect than full filtering of the multitude of images in the output domain, most artefacts may still be avoided or diminished, and a low-cost alternative in terms of processing power and time may thereby be provided.
Further advantages of the invention according to the first aspect include easy implementation in the rendering pipeline of images for multi-view display. The invention may be implemented in a separate pipeline step before the actual multi-view rendering, allowing for a more pipelined parallel implementation.
Furthermore, the method effectively reduces artefacts, such as crosstalk and aliasing artefacts, thereby rendering pre-processing or post-processing to further remove or diminish crosstalk or aliasing artefacts unnecessary.
The optional features as defined in dependent claims 2 and 3 are advantageous since low-pass filtering, high-pass filtering and combinations of the two are well-known band-pass filtering techniques which may be implemented in a variety of ways, thereby ensuring robust and versatile implementation. In the low-pass filtering, frequencies higher than the Nyquist frequency may be removed, whereas the high-pass filter amplifies high frequencies, e.g. the frequencies below the Nyquist frequency.
The optional features as defined in dependent claims 4 and 5 are advantageous since by determining the strength of the spatial filter as a size of the set of image elements of the second image, such as a radius or extent of a distribution filter of the set of image elements, it is ensured that objects near the reference plane are not greatly affected by the spatial filtering, whereas objects further away from the reference plane are affected by the spatial filtering.
The optional features as defined in dependent claims 6 and 7 are advantageous since by updating an image element of the second image with a visibility factor, problems relating to mixing of foreground and background objects may be countered in an effective way. Such problems may arise when a spatially filtered image is rendered for a shifted viewpoint.
The optional features as defined in dependent claim 8 are advantageous since by updating the depth of the image elements of the second image an improved handling of viewpoint changes may be provided. The depth is updated by setting the depth of the image element of the second image to a value between the depth of the image element of the first image and the depth of the image element of the second image. In this way, when an image element of the second image is substantially composed of foreground and only a little of background, the depth may be set to a value substantially towards the depth of the foreground, so that a gradual depth transition softens the depth edge. In an embodiment, the depth value may be set to the maximum of the depth of the image element of the first image and the depth of the image element of the second image.
The optional features as defined in dependent claim 10 are advantageous since by applying the spatial filter so that the image element of the first image and the set of image elements of the second image are aligned along a horizontal line of the first image, effects of the coarse sampling in the view direction and crosstalk may effectively be countered for a multi-view display projecting the different views in a plurality of horizontally oriented directions.
The optional feature as defined in claim 11 is advantageous since the 2.5D video image format is a standard and widely used format.
According to a second aspect of the invention there is provided a multi-view display device comprising:
The display device is a multi-view display device enhanced with the rendering method of the first aspect. It is an advantage of the present invention that the multi-view display device may either be a display device born with the functionality according to the first aspect of the invention, or a display device not born with the functionality according to the first aspect of the invention, but which subsequently is enhanced with the functionality of the present invention.
The input module, the rendering module and the output module may be provided as a signal rendering system according to the third aspect of the invention.
According to a fourth aspect of the invention there is provided computer readable code for implementing the method according to the first aspect.
In general the various aspects of the invention may be combined and coupled in any way possible within the scope of the invention. These and other aspects, features and/or advantages of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
Embodiments of the invention will be described, by way of example only, with reference to the drawings, in which
Each lens covers a number of pixels 4, 5 and projects them out, as illustrated by the number of pixels denoted 7. The viewer sees one subset of pixels 4 with the right eye and another subset of pixels 5 with the left eye. A 3D experience is thereby obtained.
The lenticular lenses are in the illustrated embodiment arranged at a slight angle or slanted with respect to the columns of the pixels, so that their main longitudinal axis is at an angle with respect to the column direction of the display elements. In this configuration the viewer will see the points sampled along a direction 22 of the lens. In a nine-view display nine images, one for each view direction, are concurrently computed and shown on the group of pixels associated with a sub-image. When a pixel is lit, the entire lens above the pixel is illuminated 21—this is shown in FIG. 2B—so that for a specific view direction it is the entire lens above the pixel that is seen emitting the color of that pixel.
The visibility of sub-images from neighboring views from a single viewing direction may cause artefacts such as crosstalk. This is illustrated in
The inventor of the present invention has appreciated that by appropriate spatial filtering, problems relating to crosstalk, ghost imaging and aliasing may be removed or at least diminished. Furthermore, by spatially filtering the input image before the image is rendered for multi-view display, only a single image needs to be filtered (and possibly a depth map in accordance with certain embodiments), thereby providing an efficient way of handling spatial filtering of multi-view image data.
Depth dependent spatial filtering is done to counter crosstalk and/or aliasing effects. However, depth dependent spatial filtering of an input image which is subsequently rendered for different viewpoints may introduce new artefacts in the rendering, such as artefacts where foreground and background objects mix in the rendered images with shifted viewpoint, thereby diminishing the perceived image quality of the 3D image at the different viewpoints.
In order to provide a 3D image with high perceived image quality, the depth dependent filtering of the image may be such that a blurring of the image is consistent with a blur introduced by a camera focused at a particular depth. This is illustrated in
The band-pass filter is typically a low-pass or a high-pass filter. The low-pass filter mitigates problems, typically alias problems, related to sampling the intensity function into a low number of sub-images, such as eight or nine, depending upon the number of views of the display. The high-pass filter mitigates problems relating to crosstalk imposing blur in the view direction. A combination of high-pass filtering and low-pass filtering may be performed to optimize the perceived image quality, or the filters may be applied separately.
Firstly, an image signal representing a first image comprising 3D image data is received or provided. The 3D image data may be represented in any suitable coordinate representation. In a typical coordinate representation, the image is described in terms of a spatial coordinate set referring to a position in the image plane, and a depth of the image in a direction perpendicular to the image plane. It is, however, to be understood that alternative coordinate representations may be envisioned.
The filtering may be input-driven: for each input pixel, also referred to as the source element, the difference in depth between the source element and a reference depth is determined. The reference depth is set to the depth layer in the image which is in focus, or which should remain in focus. The depth difference is then used as a measure for the strength of the spatial filter. The strength of the filter may in an embodiment be the number of pixels affected by the intensity of the source element, i.e. the size of the set of image elements of the second image. The size of the set of image elements may be the radius of a distribution filter, distributing the intensity of the source element to the set of destination elements. In the following, source element and destination element may be referred to as source pixel and destination pixel, respectively.
For areas where the depth values are near the reference depth, the radius of the distribution filter is small, so destination pixels only receive a contribution from the corresponding source pixel. For areas where the depth is much different from the reference value, source pixel intensities are distributed over large areas and they mix, resulting in a blur.
To generate a blur that is consistent with the blur introduced by cameras focused on a particular depth, the distribution function is multiplied by a visibility factor, v, so that Ip:=Ip+f(r)*Iq*v. The visibility factor equals zero when the destination pixel is much closer to the viewpoint than the source pixel, thereby ensuring that background does not blur over foreground. The visibility factor equals one when the source pixel is much closer to the viewpoint than the destination pixel, and has a gradual transition between the two values. The distances between the source pixel, the destination pixel and the viewpoint may be evaluated from the spatial coordinates of the source pixel, the destination pixel and the viewpoint, for example by comparing the distances between the source pixel and the viewpoint and between the destination pixel and the viewpoint to one or more minimum distances so as to determine when the source pixel is much closer than the destination pixel to the viewpoint, and vice versa. The visibility factor has the effect that destination colors have to be normalized according to the summation of weights, since the sum of weights cannot be held constant beforehand.
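The input-driven distribution described above can be sketched for a single image row as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the linear mapping from depth difference to radius (`gain`), the cap `max_radius`, the uniform (box) distribution function and all function names are hypothetical choices made for the example.

```python
def filter_radius(depth, reference_depth, gain=1.0, max_radius=8):
    """Radius in pixels of the distribution filter for one source pixel.

    Assumption: the radius grows linearly with the depth difference.
    """
    return min(max_radius, int(round(gain * abs(depth - reference_depth))))


def box_weights(radius):
    """Uniform distribution function over 2*radius+1 taps, summing to 1."""
    taps = 2 * radius + 1
    return [1.0 / taps] * taps


def distribute_row(intensity, depth, reference_depth, gain=1.0, max_radius=8):
    """Input-driven filtering: each source pixel q spreads its intensity
    over the destination pixels p within its depth-dependent radius."""
    out = [0.0] * len(intensity)
    for q, (i_q, d_q) in enumerate(zip(intensity, depth)):
        radius = filter_radius(d_q, reference_depth, gain, max_radius)
        for k, weight in enumerate(box_weights(radius)):
            p = q + k - radius
            if 0 <= p < len(out):  # contributions past the row ends are dropped
                out[p] += weight * i_q
    return out
```

A pixel at the reference depth gets radius zero and is copied unchanged, while a pixel one depth unit away (with `gain=1.0`) is spread over three destination pixels.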
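The update Ip:=Ip+f(r)*Iq*v and the subsequent normalization by the sum of weights can be sketched as follows for one image row. The sketch rests on several assumptions that are not taken from the text: larger depth values are taken to mean nearer to the viewpoint, the gradual transition of v is modelled as a linear ramp of width `transition`, f(r) is a uniform weight, and all names are hypothetical.

```python
def visibility(depth_src, depth_dst, transition=2.0):
    """Visibility factor v in [0, 1].

    Assumption: larger depth value = nearer to the viewpoint.
    v is 0 when the destination is much nearer than the source,
    1 when the source is much nearer, with a linear ramp in between.
    """
    diff = depth_src - depth_dst
    return max(0.0, min(1.0, 0.5 + diff / (2.0 * transition)))


def blur_row_with_visibility(intensity, depth, reference_depth,
                             transition=2.0, max_radius=8):
    """Accumulate Ip := Ip + f(r) * Iq * v and normalize by the weight sum."""
    n = len(intensity)
    acc = [0.0] * n    # running sum of f(r) * Iq * v per destination pixel
    wsum = [0.0] * n   # running sum of f(r) * v per destination pixel
    for q in range(n):
        radius = min(max_radius, int(round(abs(depth[q] - reference_depth))))
        f = 1.0 / (2 * radius + 1)  # uniform distribution function (assumed)
        for p in range(max(0, q - radius), min(n, q + radius + 1)):
            v = visibility(depth[q], depth[p], transition)
            acc[p] += f * intensity[q] * v
            wsum[p] += f * v
    # Every pixel contributes to itself with v = 0.5, so wsum is positive;
    # the guard only protects against degenerate parameter choices.
    return [a / w if w > 0 else i for a, w, i in zip(acc, wsum, intensity)]
```

With the foreground in focus, blurred background pixels receive v = 0 at foreground destinations, so the foreground keeps its original intensity; with the background in focus, the blurred foreground spreads over the background.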
In the following, embodiments relating to depth-dependent blur are addressed. A depth-dependent spatial filter can nevertheless be used both for depth-dependent blurring (low-pass filtering) and for depth-dependent sharpening (high-pass filtering), which is discussed in connection with
It is seen in
Both in the destination image (
However, the halo-artefacts remain for silhouettes where foreground blurs over background. The de-occlusion that occurs with the object at the lower left corner 84 is not just a repetition of a background color, but of a color which for a large part is made up of foreground color, i.e. a semi-transparency is introduced.
In a situation where additional viewpoints of an image are rendered from image plus depth information, different solutions exist for diminishing or even removing the halo-effects.
In an embodiment, the visibility factor is modified so that source pixels only contribute to destination pixels of similar depth. The result of such a filtering is shown in
In another embodiment, halo-effects are countered by filtering the depth map. The halo-effects 84 as seen in the lower left corner of
A solution, which at least reduces the artefacts considerably, is to also filter the depth map itself, thereby ensuring that the artefacts are not enlarged as much by the rendering.
Any destination pixel to which a foreground color is distributed should also have foreground depth, thereby avoiding that such a pixel will be used in the multi-view rendering to fill in de-occlusion areas. This can be done by applying a depth-dependent morphological filter: when a source pixel is distributed to a destination pixel, the depth of the destination pixel is set to the maximum of the depth of the source pixel and the previous depth of that destination pixel. This naturally follows the visibility criterion: depth information from background objects does not change depth information of foreground objects (which will, for example, keep the depth transitions of the pillar to its background sharp, both in color and in depth). In general, instead of setting the depth of the destination pixel to the maximum value as mentioned above, the depth map may be updated by setting the depth of the destination pixel to a value between the depth of the source pixel and the depth of the destination pixel.
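The maximum-based depth update described above can be sketched for one row of the depth map as follows. This is an illustrative sketch only: the radius rule (depth difference capped at `max_radius`) and the function name are assumptions, and the input-driven loop simply mirrors the distribution of intensities without reproducing it.

```python
def update_depth_row(depth, reference_depth, max_radius=8):
    """Depth-dependent morphological (maximum) filter on one row.

    Each source pixel q is distributed over the same footprint as its
    intensity would be; every destination pixel p it reaches takes
    max(previous depth of p, depth of q).
    """
    n = len(depth)
    out = list(depth)
    for q in range(n):
        # Assumed radius rule: depth difference to the reference, capped.
        radius = min(max_radius, int(round(abs(depth[q] - reference_depth))))
        for p in range(max(0, q - radius), min(n, q + radius + 1)):
            out[p] = max(out[p], depth[q])
    return out
```

Because the maximum is taken, a blurred foreground extends its depth over the neighbouring background, whereas a blurred background never changes the depth of an in-focus foreground.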
In a situation where the image filter blurs foreground over background, the depth map is updated with the foreground depth to extend the foreground object. The result is shown in
Using this filtered depth map, along with the filtered image from
The spatial filtering as discussed in connection with
In horizontal filtering, vertical halo-effects are avoided for shifted viewpoints. An example of a vertical halo-effect that is avoided in this situation may be seen by comparing the top of the pillar 110 on
A high-pass filtering is typically applied in order to pre-compensate blurring of an image introduced later on, e.g. in connection with the multi-view rendering or sampling of the image.
To counter this, the input image can be high-pass filtered, usually resulting in some overshoot before and after the edge, making the edge “higher”. This is drawn schematically in
For high-pass filtering, areas which have a depth similar to the reference depth receive no or only little sharpening. As the difference between the reference depth and the depth of the area increases, the radius, or extent, of the area affected by the sharpening increases, matching the distance between edges in neighboring views.
In an embodiment, the signal including the image data to be presented to the viewer is inputted into an input module, as a first image signal. The depth dependent spatial filtering of the first image to provide a second image is conducted at a rendering module, the rendering module typically being a processor unit. The input module, rendering module and output module need not, but may, be separate entities.
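Depth-dependent sharpening of this kind can be sketched with an unsharp-mask style filter. Unsharp masking is a technique substituted here for illustration and is not stated in the text; the radius rule, the `amount` parameter and the function name are likewise assumptions.

```python
def sharpen_row(intensity, depth, reference_depth, amount=1.0, max_radius=8):
    """Depth-dependent high-pass sketch: boost each pixel by its difference
    from a local mean whose extent grows with the depth difference."""
    n = len(intensity)
    out = []
    for p in range(n):
        # Assumed radius rule: depth difference to the reference, capped.
        radius = min(max_radius, int(round(abs(depth[p] - reference_depth))))
        lo, hi = max(0, p - radius), min(n - 1, p + radius)
        local_mean = sum(intensity[lo:hi + 1]) / (hi - lo + 1)
        out.append(intensity[p] + amount * (intensity[p] - local_mean))
    return out
```

At the reference depth the radius is zero, the local mean equals the pixel itself, and the pixel is left unchanged. Away from the reference depth, a step edge acquires undershoot on its low side and overshoot on its high side.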
The rendering module may also apply additional rendering functions to the image data, e.g. the image data may be properly scaled to the view resolution, colors may be adjusted, etc. The rendering of the image signal may be done separately for different color components and the view-dependent intensity function may be determined for at least one color component of the image, and the band-pass filtering applied to the at least one color component of the image. For example, since in an RGB-signal the green component is the most luminous component, the spatial filtering may in an embodiment only be applied for the green component.
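Restricting the spatial filter to the green component can be sketched as below. The row representation as a list of (R, G, B) tuples and the name `filter_green_only` are assumptions for illustration; `row_filter` stands for any single-channel row filter and is not a function defined by the text.

```python
def filter_green_only(rgb_row, row_filter):
    """Apply a single-channel row filter to the green component only,
    leaving the red and blue components untouched."""
    filtered_green = row_filter([g for _, g, _ in rgb_row])
    return [(r, g_new, b) for (r, _, b), g_new in zip(rgb_row, filtered_green)]


def box3(channel):
    """Hypothetical stand-in filter: 3-tap mean with clamped row ends."""
    n = len(channel)
    return [sum(channel[max(0, i - 1):min(n, i + 2)]) /
            len(channel[max(0, i - 1):min(n, i + 2)]) for i in range(n)]
```

For example, `filter_green_only(row, box3)` smooths only the green values of `row`, which filters one channel instead of three.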
The invention can be implemented in any suitable form including hardware, software, firmware or any combination of these. The invention or some features of the invention can be implemented as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed, the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the invention may be implemented in a single unit, or may be physically and functionally distributed between different units and processors.
Although the present invention has been described in connection with preferred embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims.
In this section, certain specific details of the disclosed embodiment are set forth for purposes of explanation rather than limitation, so as to provide a clear and thorough understanding of the present invention. However, it should be understood readily by those skilled in this art, that the present invention may be practiced in other embodiments which do not conform exactly to the details set forth herein, without departing significantly from the spirit and scope of this disclosure. Further, in this context, and for the purposes of brevity and clarity, detailed descriptions of well-known apparatus, circuits and methodology have been omitted so as to avoid unnecessary detail and possible confusion.
Reference signs are included in the claims; however, the inclusion of the reference signs is only for clarity reasons and should not be construed as limiting the scope of the claims.
Number | Date | Country | Kind |
---|---|---|---|
05111632.5 | Dec 2005 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB06/54456 | 11/27/2006 | WO | 00 | 8/26/2008 |