This application claims priority to Korean Patent Application No. 10-2010-0033266, filed in the Korean Intellectual Property Office on Apr. 12, 2010, and all the benefits accruing therefrom under 35 U.S.C. §119, the content of which in its entirety is herein incorporated by reference.
(a) Field of the Invention
An image converting device and a three-dimensional (“3D”) image display device including the same are provided.
(b) Description of the Related Art
Generally, in the art of 3D image displaying technology, a stereoscopic effect is represented using binocular parallax. Binocular parallax is the most critical factor to allow a person to perceive a stereoscopic effect at close range. That is, different 2D images are respectively seen by a right eye and a left eye, and if the image seen by the left eye (hereinafter referred to as a “left-eye image”) and the image seen by the right eye (hereinafter referred to as a “right-eye image”) are transmitted to the brain, the left-eye image and the right-eye image are combined in the brain such that a 3D image having depth information is perceived by the observer.
The 3D image display devices using binocular parallax in 3D image displays may be categorized into different types, such as those which utilize stereoscopic schemes using glasses such as a shutter glasses scheme and a polarized glasses scheme, and those which utilize autostereoscopic schemes in which a lenticular lens or a parallax barrier is disposed on the display device, without the glasses.
Generally, a multi-view 2D image is required to produce the 3D image; that is, two different 2D images taken from different points of view are required in order to produce a 3D image. However, these schemes may not utilize a single-view 2D image that has been manufactured in the past in order to generate a 3D image; that is, the above schemes may not generate a 3D image using a 2D image taken from only a single point of view. Thus, movies or images which have been previously filmed in only 2D may not easily be converted to 3D because the second point of view needed to create binocular parallax is absent.
Accordingly, research on converting a 2D image into a 3D image, so as to apply content that has been manufactured in the past from a single view point to a next generation display device capable of 3D display, has been actively undertaken. To convert the 2D image into the 3D image, depth information is generated, parallax is generated, and the left-eye image and the right-eye image are generated; however, generating the depth information is technically difficult.
An exemplary embodiment of an image converting device according to the present invention includes: a downscaling unit which downscales a two-dimensional (“2D”) image to generate at least one downscaling image, a feature map generating unit which extracts feature information from the downscaling image to generate a feature map, wherein the feature map includes a plurality of objects, an object segmentation unit which divides the plurality of objects, an object order determining unit which determines a depth order of the plurality of objects, and adds a first weight value to an object having a shallowest depth among the plurality of objects, and a visual attention calculating unit which generates a low-level attention map based on visual attention of the feature map. In one exemplary embodiment, the first weight value may be added to the block saliency of the object having the shallowest depth.
In one exemplary embodiment, the object order determining unit may include an edge extraction unit which extracts edges of the plurality of objects, a block comparing unit which determines the depth order of the plurality of objects based on at least one of a block moment and a block saliency of the edge, and a weighting unit adding the first weight value to the object.
In one exemplary embodiment, the object order determining unit may further include an edge counting unit which counts the number of edges.
In one exemplary embodiment, the block comparing unit may determine whether objects among the plurality of objects are overlapped based on whether the number of edges is even or odd.
In one exemplary embodiment, the object having the deepest depth among the plurality of objects may or may not have a second weight value added thereto, and the second weight value may be less than the first weight value. In one exemplary embodiment, the second weight value may be added to the block saliency of the object having the deepest depth.
In one exemplary embodiment, a plurality of low-level attention maps may be generated, an image combination unit which combines the plurality of low-level attention maps may be further included in the image converting device, and a visual attention map may be generated from the combined plurality of low-level attention maps.
In one exemplary embodiment, the image converting device may further include an image filtering unit which filters the plurality of combined low-level attention maps.
In one exemplary embodiment, the feature map may include a center area and a surrounding area, and the visual attention may be determined based on a difference between a histogram of the center area and a histogram of the surrounding area.
In one exemplary embodiment, the feature map may include a center area and a surrounding area, the surrounding area and the center area may include at least one unit-block, and the visual attention may be determined by a block moment, a block saliency, or both a block moment and a block saliency.
In one exemplary embodiment, the image converting device may further include an image filtering unit which filters the low-level attention map.
In one exemplary embodiment, the image converting device may further include a parallax information generating unit which generates parallax information based on the visual attention map and the 2D image, and a three-dimensional (“3D”) image rendering unit rendering the 3D image based on the parallax information and the 2D image.
An exemplary embodiment of an image converting method according to the present invention includes: downscaling a 2D image to generate at least one downscaling image; extracting feature information from the downscaling image to generate a feature map; dividing a plurality of objects, wherein the feature map includes the plurality of objects; determining a depth order of the plurality of objects; adding a first weight value to the object having the shallowest depth among the plurality of objects; and generating a low-level attention map based on visual attention of the feature map.
In one exemplary embodiment, the image converting method may further include extracting edges of the plurality of objects, and the depth order of the plurality of objects may be determined based on at least one of a block moment or a block saliency at the edges.
In one exemplary embodiment, the image converting method may further include counting the number of edges.
In one exemplary embodiment, the image converting method may further include determining an overlapped object among the plurality of objects based on whether the number of edges is odd or even.
In one exemplary embodiment, the object having the deepest depth among the plurality of objects may or may not have a second weight value added thereto, and the second weight value may be less than the first weight value.
In one exemplary embodiment, a plurality of low-level attention maps may be generated, combining the plurality of low-level attention maps may be further included in the method, and the visual attention map may be generated from the combined plurality of low-level attention maps.
In one exemplary embodiment, the image converting method may further include filtering the plurality of combined low-level attention maps.
In one exemplary embodiment, the downscaling image may be an image wherein the 2D image is downscaled in a horizontal direction, in a vertical direction, or in both a horizontal and vertical direction.
In one exemplary embodiment, a plurality of downscaling images may be generated, and the plurality of downscaling images may be processed in one frame.
In one exemplary embodiment, the image converting method may further include generating parallax information based on the visual attention map and the 2D image, and rendering a 3D image based on the parallax information and the 2D image.
An exemplary embodiment of a 3D image display device according to the present invention includes a display panel including a plurality of pixels, and an image converting device converting a 2D image into a 3D image as described in detail above.
In one exemplary embodiment, the image converting device may include: a downscaling unit which downscales the 2D image to generate at least one downscaling image, a feature map generating unit which extracts feature information from the downscaling image to generate a feature map, wherein the feature map includes a plurality of objects, an object segmentation unit which divides the plurality of objects, an object order determining unit which determines a depth order of the plurality of objects, and adds a first weight value to the object having the shallowest depth among the plurality of objects, and a visual attention calculating unit which generates a low-level attention map based on visual attention of the feature map.
In the exemplary embodiments according to the present invention, the overlapped objects may be divided, the arrangement order between the objects may be made clear, the image quality having the depth information may be improved, the amount of data calculation may be reduced, and an amount of memory resources utilized may be reduced.
The above and other aspects, advantages and features of this disclosure will become more apparent by describing in further detail exemplary embodiments thereof with reference to the accompanying drawings, in which:
The invention now will be described more fully hereinafter with reference to the accompanying drawings, in which embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like reference numerals refer to like elements throughout.
It will be understood that when an element is referred to as being “on” another element, it can be directly on the other element or intervening elements may be present therebetween. In contrast, when an element is referred to as being “directly on” another element, there are no intervening elements present. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, third etc. may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first element, component, region, layer or section discussed below could be termed a second element, component, region, layer or section without departing from the teachings of the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” or “includes” and/or “including” when used in this specification, specify the presence of stated features, regions, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, regions, integers, steps, operations, elements, components, and/or groups thereof.
Furthermore, relative terms, such as “lower” or “bottom” and “upper” or “top,” may be used herein to describe one element's relationship to another element as illustrated in the Figures. It will be understood that relative terms are intended to encompass different orientations of the device in addition to the orientation depicted in the Figures. For example, if the device in one of the figures is turned over, elements described as being on the “lower” side of other elements would then be oriented on “upper” sides of the other elements. The exemplary term “lower” can, therefore, encompass both an orientation of “lower” and “upper,” depending on the particular orientation of the figure. Similarly, if the device in one of the figures is turned over, elements described as “below” or “beneath” other elements would then be oriented “above” the other elements. The exemplary terms “below” or “beneath” can, therefore, encompass both an orientation of above and below.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Exemplary embodiments of the present invention are described herein with reference to cross section illustrations that are schematic illustrations of idealized embodiments of the present invention. As such, variations from the shapes of the illustrations as a result, for example, of manufacturing techniques and/or tolerances, are to be expected. Thus, embodiments of the present invention should not be construed as limited to the particular shapes of regions illustrated herein but are to include deviations in shapes that result, for example, from manufacturing. For example, a region illustrated or described as flat may, typically, have rough and/or nonlinear features. Moreover, sharp angles that are illustrated may be rounded. Thus, the regions illustrated in the figures are schematic in nature and their shapes are not intended to illustrate the precise shape of a region and are not intended to limit the scope of the present invention.
All methods described herein can be performed in a suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”), is intended merely to better illustrate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention as used herein.
Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.
Now, an exemplary embodiment of a three-dimensional (“3D”) image display device according to the present invention will be described with reference to
Here, the 3D image display device may include a stereoscopic image display device using shutter glasses or polarized glasses, and an autostereoscopic image display device using a lenticular lens or a parallax barrier. The stereoscopic image display device includes a display panel including a plurality of pixels.
Here, the image converting device may be embedded in the 3D image display device. Also, the image converting device may be embedded in various pieces of image receiving and replaying equipment such as a broadcasting tuner, a satellite broadcasting reception terminal, a cable television reception converter, a video cassette recorder (“VCR”), a digital video disk (“DVD”) player, a high definition television (“HDTV”) receiver, a Blu-ray disc player, a game console or various other similar devices.
Referring to
The image converting device converts a 2D image into a 3D image. As used herein, the term 2D image means a general 2D image taken from a single view point, and the 3D image means an image including two 2D images, each taken from a different viewpoint, such as a stereo-view. For example, the 3D image may refer to the left eye image, the right eye image, or both, while the left eye image and the right eye image are images that are displayed on a 2D plane. Embodiments include configurations wherein the left eye image and the right eye image may be simultaneously output on the 2D plane (and later separated using some form of filter, e.g., a polarization filter or a color filter), and embodiments wherein the left eye image and the right eye image may be sequentially output on the 2D plane.
The 2D image input to the image converting device is converted into a visual attention map having depth information, and the parallax information generating unit 70 generates the parallax information based on the visual attention map and the input 2D image. Here, the parallax information may be generated for a single pixel of the image or for a pixel group including multiple pixels. The 3D image rendering unit 80 renders the 3D image based on the input 2D image and the generated parallax information. For example, the 3D image rendering unit 80 may render the left eye image and the right eye image based on an original 2D image and the generated parallax information.
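As a concrete illustration of the rendering step described above, the per-row sketch below shifts each pixel horizontally by half its parallax, in opposite directions for the two eyes. The shift-by-half scheme, the function name, and the absence of hole filling are illustrative assumptions for a DIBR-style sketch, not the patented implementation.

```python
def render_stereo_row(row, parallax):
    """Render left- and right-eye rows from one 2D image row and per-pixel
    parallax values. Pixels with larger parallax are shifted farther, in
    opposite directions for the two eyes. Hole filling behind shifted
    pixels is deliberately omitted to keep the sketch short."""
    width = len(row)
    left = row[:]    # start both eye images from the original row
    right = row[:]
    for x, p in enumerate(parallax):
        if p == 0:
            continue
        half = p // 2
        lx = x + half    # left-eye image: shift the pixel to the right
        rx = x - half    # right-eye image: shift the pixel to the left
        if 0 <= lx < width:
            left[lx] = row[x]
        if 0 <= rx < width:
            right[rx] = row[x]
    return left, right

# A flat background (parallax 0) with one near object (parallax 4) in the middle.
row = [10, 10, 10, 99, 10, 10, 10]
parallax = [0, 0, 0, 4, 0, 0, 0]
left, right = render_stereo_row(row, parallax)
```

With the sample inputs, the near pixel lands two positions to the right in the left-eye row and two positions to the left in the right-eye row, which is exactly the binocular parallax the display relies on.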
The term visual attention refers to the tendency of a person's brain and recognition system to concentrate on a particular region of an image, and the concept is employed in various fields. The topic of visual attention has been the subject of much research in the fields of physiology, psychology, neural systems, and computer vision. In addition, visual attention is of particular interest in the areas of computer vision related to object recognition, tracking, and detection.
The visual attention map is an image generated by calculating the visual attention of an observer for the 2D image, and may include information related to the importance of the object in the 2D image. For example, in one exemplary embodiment the visually interesting region may be disposed close to the observer, and the visually non-interesting region may be disposed away from the observer. That is, the visually interesting region may be brightly represented to be disposed close to the observer (i.e., the gray value is large), and the visually non-interesting region may be darkly represented to be disposed away from the observer (i.e., the gray value is small). In an image that includes an object and a background, the object may be bright and the background may be dark, and accordingly, the object may be seen as protruding from the background. In one exemplary embodiment, the sizes of the original 2D image and the visual attention map may each be 960×1080.
Next, a process for generating the visual attention map from the 2D image will be described in detail.
Referring to
Referring to
Referring to
The visual attention calculating unit 30 may execute a low-level attention computation using at least one feature map, and may generate a low-level attention map based on the result of the low-level attention computation. For example, the visual attention calculating unit 30 may use the difference between a histogram of a center area and a histogram of a surrounding area to execute the low-level attention computation.
Referring to
The area setup unit 31 may determine a center area and a surrounding area for at least one feature map, and the surrounding area may enclose the center area. The present exemplary embodiment of an area setup unit 31 may include a unit block setup unit, a center-area setup unit, and a surrounding-area setup unit.
The unit-block setup unit may determine a unit block size and shape, which in the present exemplary embodiment may include a square or rectangular shaped unit-block. For example, in the present exemplary embodiment the unit-block may have a size of 8 (pixels)×8 (pixels). Because the number of combinations of the center area and the surrounding area may increase geometrically with the size of the 2D image, the unit-block may be used to reduce the number of combinations of the center area and the surrounding area. Accordingly, the amount of data calculation may be reduced.
The center-area setup unit may determine the center area to be the size of the unit-block, and the surrounding-area setup unit may determine the surrounding area to be the sum of the plurality of unit-blocks. Referring to
The histogram calculating unit 32 may calculate the difference between the feature information histogram of the center area and the feature information histogram of the surrounding area. In the present exemplary embodiment, the histogram may be one of an intensity histogram or a color histogram. Other feature information may alternatively be used, as described above.
A method for calculating the difference of the histogram will be described in detail with reference to
To use a center-surround histogram, neighboring areas of two types may be defined with respect to the arbitrary pixel of the feature map 410. That is, the center area 411 and the surrounding area 412 may be defined according to the reference pixel. The surrounding area 412 may include the center area 411, and the area of the surrounding area 412 may be larger than the area of the center area 411.
Accordingly, the histograms of the center area and the surrounding area are extracted, and various histogram difference measurement methods may be used to gain the feature value difference 421 of the center area and the surrounding area. Accordingly, the low-level attention map 420 according to the feature value difference 421 of the center area and the surrounding area may be generated.
Various methods to obtain the histogram difference may be used. For example, in one exemplary embodiment a chi square (χ²) method may be used. That is, if the center area is referred to as R, the surrounding area is referred to as Rs, and Ri is referred to as an i-th bin of the histogram, wherein the histogram may include information regarding the luminance, the color, and the texture of the area, then the center-surround histogram is substantially the same as the chi-square difference of the center area histogram and the surrounding area histogram and may be represented by Equation 1 below:
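The equation figure is not reproduced in this text; a conventional form of the chi-square histogram difference, consistent with the description above, is the following reconstruction (it may differ in detail from the exact Equation 1 of the original filing):

```latex
\chi^{2}(R, R_{s}) \;=\; \frac{1}{2}\sum_{i}\frac{\left(R_{i} - R_{s,i}\right)^{2}}{R_{i} + R_{s,i}}
```

where \(R_{i}\) and \(R_{s,i}\) denote the i-th bins of the center-area histogram and the surrounding-area histogram, respectively.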
The attention map generating unit 33 may use the feature information histogram to generate the low-level attention map.
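The center-surround computation described above can be sketched as follows. The bin count, the normalization, and the sample areas are illustrative assumptions; only the chi-square comparison of center and surrounding histograms comes from the description above.

```python
def chi_square(h1, h2):
    """Chi-square difference between two normalized histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def histogram(values, bins=4, vmax=256):
    """Simple intensity histogram over `bins` equal-width bins, normalized to sum to 1."""
    h = [0] * bins
    for v in values:
        h[min(v * bins // vmax, bins - 1)] += 1
    total = sum(h) or 1
    return [c / total for c in h]

# A bright center area inside a dark surrounding area should yield a large
# chi-square difference, i.e. high low-level attention.
center = [200, 210, 220, 230]
surround = [10, 20, 30, 40, 15, 25, 35, 45]
salient = chi_square(histogram(center), histogram(surround))

# Identical distributions should yield zero attention.
uniform = chi_square(histogram(center), histogram(center))
```

With these sample pixel values the salient case evaluates to the maximum difference of 1.0 and the identical case to 0.0, matching the intuition that a region unlike its surroundings attracts attention.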
In one exemplary embodiment the entirety of the center-surround histogram is not used, but instead only a moment of the histogram may be used to execute the low-level attention computation by using at least one feature map. As used herein, the term moment may include at least one of a mean, a variance, a standard deviation, and a skew of the histogram. For example, the mean, the variance, the standard deviation, and the skew may be determined for the luminance values of the plurality of pixels included in one unit-block. Memory resources may be saved by using the moment of the histogram, rather than the entire values thereof.
For example, if the value of the j-th pixel of the i-th block is Pij, the moment of the i-th block may be represented by Equation 2 as follows:
Here, Ei refers to the mean, σi refers to the variance, and si refers to the skew.
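The equation figure for Equation 2 is not reproduced in this text; a common form of such block moments, consistent with the definitions of Ei, σi, and si above, is the following reconstruction (the square root and signed cube root are assumptions):

```latex
E_{i} = \frac{1}{N}\sum_{j=1}^{N} p_{ij},\qquad
\sigma_{i} = \left(\frac{1}{N}\sum_{j=1}^{N}\left(p_{ij}-E_{i}\right)^{2}\right)^{1/2},\qquad
s_{i} = \left(\frac{1}{N}\sum_{j=1}^{N}\left(p_{ij}-E_{i}\right)^{3}\right)^{1/3}
```

where \(N\) is the number of pixels in the i-th block and \(p_{ij}\) is the value of the j-th pixel of that block.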
Also, in this case, a saliency of the predetermined block may be defined by Equation 3 as follows:
Here, the parameter w is a weight value controlling the relative importance between the moments, and its predetermined basic value may be 1. Also, B0, B1, B2, B3, and B4 may be the blocks shown in
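Since the figures for Equations 2 and 3 are not reproduced here, the sketch below implements an Equation 2/Equation 3-style block moment and block saliency computation under stated assumptions: a signed cube-root skew, a single weight w of 1 for all moments, and four neighboring blocks B1 to B4 compared against the center block B0.

```python
def block_moments(pixels):
    """Mean, standard deviation, and skew of one unit-block's pixel values
    (an Equation 2-style moment set; the signed cube root is an assumption)."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = var ** 0.5
    third = sum((p - mean) ** 3 for p in pixels) / n
    skew = abs(third) ** (1 / 3) * (1 if third >= 0 else -1)  # signed cube root
    return mean, std, skew

def block_saliency(center, neighbors, w=1.0):
    """Equation 3-style saliency: weighted absolute moment differences between
    the center block B0 and its neighboring blocks B1..B4 (w defaults to 1)."""
    e0, s0, k0 = block_moments(center)
    total = 0.0
    for nb in neighbors:
        e, s, k = block_moments(nb)
        total += w * (abs(e0 - e) + abs(s0 - s) + abs(k0 - k))
    return total

# A bright uniform center block among dark uniform neighbors is highly salient.
center_block = [200] * 4
neighbor_blocks = [[10] * 4, [10] * 4, [10] * 4, [10] * 4]
sal = block_saliency(center_block, neighbor_blocks)
```

Using only these per-block moments, rather than full histograms, is what allows the memory saving mentioned above: three numbers per block replace an entire histogram.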
Referring to
The object segmentation unit 90 may divide the several objects in one image, including objects which overlap one another, in an image in which the background is included. As shown in
A segmentation algorithm may be used as a method for dividing the objects, and a watershed algorithm may be used as the segmentation algorithm.
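As a simple stand-in for the segmentation step, the sketch below divides a binary mask into connected objects by flood fill. This is connected-component labeling, not the watershed algorithm named above; it only illustrates how an image is divided into separately labeled objects plus a background.

```python
from collections import deque

def label_objects(mask):
    """Divide a binary mask into connected objects by 4-neighbor flood fill.
    Returns a label map (0 = background) and the number of objects found."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                count += 1
                queue = deque([(y, x)])
                labels[y][x] = count
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

# Two separate objects on a background of zeros.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
labels, count = label_objects(mask)
```

A production implementation would typically use a library watershed routine, which additionally separates touching objects along gradient ridges; the labeling interface, however, is the same.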
The object order determining unit 100 may determine the depth order of the objects and the background while scanning in the horizontal direction or the vertical direction. In the present exemplary embodiment, the background may be omitted. That is, it may be determined which of the plurality of objects is disposed close to the observer (where the depth is shallow) and which of the plurality of objects is disposed away from the observer (where the depth is deep). Here, the background may be disposed further away from the observer than the objects. As shown in
The edge extraction unit 110 may extract the outer line, e.g., a boundary or an edge, of the objects included in the image. For example, in one exemplary embodiment in order to extract the outer line of the objects, a high pass filter may be used. If the image of
The edge counting unit 120 may count the number of outer lines (the edges) while scanning in the horizontal direction or in the vertical direction. Referring to
The block comparing unit 130 may determine the depth order of the objects and the background based on a block moment or a block saliency of the blocks near the edges. Also, the block comparing unit 130 may additionally determine whether the objects are overlapped with one another based on whether the number of edges is an even number or an odd number.
After the depth order of the objects and the background is determined, the weighting unit 140 may provide a larger weight value to objects disposed closer to the observer. For example, in one exemplary embodiment wherein the feature information is luminance, the larger gray value may be added to the block saliency of the object disposed close to the observer and the smaller value may or may not be added to the block saliency of the object disposed away from the observer. Accordingly, the division between the object disposed close to the observer and the object disposed away from the observer may be clear, and the image quality having the depth information may be improved. Furthermore, the depth order between the objects is made clear, such that the depth order between the objects may not be exchanged when executing the image filtering. The assignment of the weight value may be determined based on the existence of overlapped objects. That is, when the objects are overlapped, the object that is closest to the observer may be given the larger weight value, and when the objects are not overlapped, the weight value may not be given to any object. Also, regardless of the existence of overlapping of the objects, the larger weight value may be given to the object disposed closer to the observer. In one exemplary embodiment the weight value may be appropriately determined by experimentation, as would be apparent to one of ordinary skill in the art.
As shown in
When the number of edges is odd, as shown in
Also, when the number of edges is even, as shown in
In addition, in the exemplary embodiment in which the number of edges is not counted, after comparing the saliency of the left block and the saliency of the right block with respect to the edges, when the saliency of the right block is larger, the larger weight value may be added to the saliency of the right block, and the weight value may not be added to the saliency of the left block, or the smaller weight value may be added thereto. When the saliency of the right block is smaller, the weight value may not be added to either the saliency of the left block or the saliency of the right block.
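The scan-line edge counting and the saliency comparison at an edge, as described above, can be sketched as follows. The weight magnitude and the rule that the larger-saliency block belongs to the nearer object are illustrative assumptions drawn from the description, not exact values from the original filing.

```python
def count_edges(row):
    """Count edges (value transitions) along one horizontal scan line.
    An odd count suggests overlapped objects; an even count suggests
    non-overlapped objects, per the parity rule described above."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)

def weight_nearer_object(saliency_left, saliency_right, w_near=10.0):
    """At an edge, add the larger weight to the block with the larger
    saliency, taken here to belong to the nearer (shallower) object.
    The weight magnitude w_near is an illustrative assumption."""
    if saliency_right > saliency_left:
        return saliency_left, saliency_right + w_near
    if saliency_left > saliency_right:
        return saliency_left + w_near, saliency_right
    return saliency_left, saliency_right  # equal saliencies: no weight added

# A scan line crossing two non-overlapping objects (labels 1 and 2) on
# background 0 produces four edges, an even number.
row = [0, 0, 1, 1, 0, 2, 2, 0]
edges = count_edges(row)
```

Counting only transitions keeps the scan O(width) per line, and the parity check then distinguishes the overlapped case from the separated case without any extra pass over the image.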
Referring to
Referring to
The low-level attention map generated for at least one downscaling image may be selectively processed by the image filtering unit 60. For example, the filtering method may be a method using a normalization curve, a method using a sigmoid curve, or a method using a bilateral filter, and one or more methods may be sequentially used. In detail, for the bilateral filter, after executing 10×10 decimation, a 5×5×5 low pass filter may be applied, and 10×10 interpolation may then be executed.
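Of the filtering methods listed above, the sigmoid-curve remapping is the simplest to sketch. The midpoint and gain constants below are illustrative assumptions; the document specifies only that a sigmoid curve may be used.

```python
import math

def sigmoid_remap(gray, midpoint=128.0, gain=0.05, vmax=255.0):
    """Remap an attention-map gray value through a sigmoid curve, pushing
    values below the midpoint toward dark and values above it toward
    bright. The midpoint and gain are illustrative, not from the filing."""
    return vmax / (1.0 + math.exp(-gain * (gray - midpoint)))

# Dark stays near 0, the midpoint maps to mid-gray, bright saturates near 255.
remapped = [round(sigmoid_remap(g)) for g in (0, 128, 255)]
```

This kind of contrast stretch sharpens the separation between attended (bright) and unattended (dark) regions before the map is combined and used for parallax generation.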
In one exemplary embodiment, the low-level attention map may be up-scaled by the image expansion unit 50. For example, the up-scaling may use bi-cubic interpolation. Here, in the process of up-scaling the image, the weight value may be added to the image data for each pixel. The image data for each pixel may correspond to the background image. That is, the weight value may not be given to the image data disposed on a lower side in the low-level attention map, or a gradually decreasing weight value may be added to the image data disposed on the lower side of the low-level attention map.
In detail, in an exemplary embodiment wherein the size of the image is 960×540, the weight value added to the image data may be gradually increased as the line number approaches 515 from 0. Next, as the line number approaches 540 from 515, the weight value added to the image data may be gradually decreased from the weight value at the line number 515. When each of two adjacent upper and lower images is weighted in the above described way, an adjacent area of the two images may have dark gray values. Accordingly, although two adjacent images are filtered, each image may have dark gray values at the upper side, and may have gradually brighter gray values in the downward direction to the bottom side from the upper side of each image. Accordingly, the distortion of the gray values in an adjacent area of two images may be prevented, and the image quality may be improved.
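The line-dependent weighting just described, for a 960×540 image with a peak at line 515, might be sketched as below. The linear ramp shape and the endpoint weights are assumptions; the description above fixes only the breakpoints (0, 515, 540) and the increase-then-decrease behavior.

```python
def line_weight(line, peak=515, height=540, w_max=1.0):
    """Per-line weight for a 960x540 low-level attention map: rises from
    line 0 to a maximum at line `peak`, then falls toward the bottom edge
    at line `height`. Linear ramps are an illustrative assumption."""
    if line <= peak:
        return w_max * line / peak
    return w_max * (height - line) / (height - peak)

weights = [line_weight(l) for l in (0, 515, 540)]
```

Tapering the weight back down after line 515 is what keeps the bottom rows of one image dark, so that when two vertically adjacent images are filtered together their shared border does not produce the gray-value distortion described in the following paragraph.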
If the weight value added to the lower portion of each image is gradually increased, that is, the weight value added to each image continuously increases through the entire line number, when two adjacent images are filtered, the distortion of the gray value occurs because an adjacent area of two weighted images has dark gray values and bright gray values. For example, when the rectangular image pyramid is weighted, the lower portion of the upper image of two adjacent images has bright gray values, and the upper portion of the lower image of two adjacent images has dark gray values. Here, the upper image and the lower image are adjacent to each other in the up and down directions of the rectangular image pyramid. As a result of filtering the weighted rectangular image pyramid, the upper portion of the lower image may have brighter gray values than the expected dark gray values. This is because two adjacent images influence each other particularly in the adjacent area of two images when filtering. In other words, when filtering, the weighted lower portion having bright gray values in the upper image influences the weighted upper portion having dark gray values in the lower image.
The image combination unit 40 combines at least one of the images that are expanded by the image expansion unit 50 and have the same size. For example, at least one of the images may be overlapped with another, and then added.
Next, the combined images may be filtered by the image filtering unit 60. As described above, the image filtering unit 60 may sequentially execute one or more filtering methods.
Also, the combined images may be expanded by the image expansion unit 50. For example, when the size of the combined image is 960×540, the combined image may be changed into the image having the size of 960×1080 by the image expansion unit 50.
While this invention has been described in connection with what is presently considered to be practical exemplary embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but, on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.