This invention relates to processing an image.
Pictures displayed on computer screens are made up of a very large number of closely packed dots known as pixels. Although a color or black-and-white image may include many different colors or shades of gray, each individual pixel displays only a single color or a single shade of gray. A computer accesses data to determine how to light up each pixel in an image. For example, the data may be a single number corresponding to a shade of gray, or a collection of numbers that instruct the computer to light a given pixel by mixing different amounts of red, green, and blue.
An image the size of a small computer monitor requires data for nearly half-a-million pixels. The large amount of data needed to describe each pixel in an image can consume a lot of space on a computer hard disk or take a long time to download over a network. Thus, it would be advantageous to develop a technique for reducing the amount of data needed to represent an image.
In one aspect, a method of constructing a two-dimensional image includes receiving information describing a two-dimensional N-gon mesh of vertices and constructing the image by coloring at least some of the N-gons based on the respective vertices of the N-gons.
Introduction
Conceptually, an edge collapse merges the two vertices at the ends of a selected edge into a single vertex, removing the edge and any triangles that share it.
Whether or not the mesh is reduced, expressing a two-dimensional image as a flat mesh permits the use of a variety of techniques used by 3D graphics applications. For example, pixels bounded by a triangle in the mesh can be colored using a 3D graphics technique known as Gouraud shading (also called intensity interpolation shading or color interpolation shading).
Gouraud shading uses the colors of the triangle vertices to color in the triangle area between the vertices. Many standard APIs (Application Program Interfaces), such as DirectX and OpenGL, and popular 3D graphics accelerator cards (e.g., cards available from Matrox™ and Diamond™) offer Gouraud shading functions. Real-time 3D video games use these functions to quickly shade the triangles of three-dimensional triangular meshes that form spaceships and other 3D objects. By creating a flat mesh from an image, software (e.g., a browser) displaying the image can take advantage of the fast hardware-implemented coloring routines offered by these 3D graphics cards even though the image is only two-dimensional. This technique can speed image special effects such as zooming in or out of an image.
Zooming in or out is achieved by multiplying the vertex coordinates (or a copy of the vertex coordinates) by a scalar value. This has the effect of changing the distance between vertex coordinates and thus the size of the triangles. When zooming in, the triangles require more pixels to fill their interiors and edges. Thus, while vertices 102b and 102c were neighboring pixels in the original image, after scaling they are separated by additional pixels that the shading function colors in.
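The scaling step can be illustrated with a short sketch; the function and variable names are illustrative assumptions, since the patent does not provide source code.

```python
# Sketch of zooming a 2-D mesh by scaling its vertex coordinates.
# zoom_mesh is an illustrative name, not part of the patent.

def zoom_mesh(vertices, factor):
    """Return a copy of (x, y) vertex coordinates scaled by `factor`.

    Scaling changes the distance between vertices, so each triangle
    covers more (or fewer) pixels and must be re-filled by the shader.
    """
    return [(x * factor, y * factor) for (x, y) in vertices]

original = [(10, 10), (11, 10), (10, 11)]   # three neighboring pixel vertices
zoomed = zoom_mesh(original, 4.0)            # vertices now four pixels apart
print(zoomed)                                # [(40.0, 40.0), (44.0, 40.0), (40.0, 44.0)]
```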
Many other shading techniques may be used instead of, or in combination with, Gouraud shading. For example, a system may use flat shading or wire-frame shading. In flat shading, each pixel of a triangle is colored using the average color of the triangle vertices. Wire-frame shading is like Gouraud shading except that only the edges are colored, not the interior pixels. This gives the image an interesting computer-like effect.
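The difference between Gouraud and flat shading can be illustrated with a small software sketch. Real implementations would typically call a 3D API such as OpenGL or DirectX rather than shade pixels in software, and the function names below are assumptions for illustration.

```python
# Software sketch of Gouraud vs. flat shading of a single triangle.

def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    d = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / d
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / d
    return wa, wb, 1.0 - wa - wb

def gouraud_color(p, verts, colors):
    """Interpolate the three vertex colors at pixel p (Gouraud shading)."""
    wa, wb, wc = barycentric(p, *verts)
    return tuple(wa * ca + wb * cb + wc * cc
                 for ca, cb, cc in zip(*colors))

def flat_color(colors):
    """Average of the three vertex colors (flat shading)."""
    return tuple(sum(channel) / 3.0 for channel in zip(*colors))

verts = [(0, 0), (10, 0), (0, 10)]
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]   # red, green, blue corners
print(gouraud_color((3, 3), verts, colors))        # smoothly blended interior color
print(flat_color(colors))                          # one averaged color for all pixels
```

Wire-frame shading would apply the same interpolation but only to pixels lying on the triangle edges.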
Image Processing
Image encoding software 110 first constructs a two-dimensional triangular mesh 112 from a source image, with each vertex carrying its coordinates and color (or greyscale value).
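One straightforward way to build such an initial mesh, sketched below, is to make each pixel a vertex and to split each square of four neighboring pixels into two triangles. This is a minimal sketch consistent with the description above, not a verbatim implementation of software 110.

```python
# Sketch: build a dense 2-D triangular mesh from a greyscale image.
# Each pixel becomes a vertex (x, y, grey); each 2x2 pixel square
# becomes two triangles. Names are illustrative only.

def image_to_mesh(pixels):
    """pixels: 2-D list of greyscale values, indexed as pixels[y][x]."""
    h, w = len(pixels), len(pixels[0])
    vertices = [(x, y, pixels[y][x]) for y in range(h) for x in range(w)]
    triangles = []
    for y in range(h - 1):
        for x in range(w - 1):
            i = y * w + x                                # top-left vertex of the square
            triangles.append((i, i + 1, i + w))          # upper-left triangle
            triangles.append((i + 1, i + w + 1, i + w))  # lower-right triangle
    return vertices, triangles

verts, tris = image_to_mesh([[0, 64], [128, 255]])
print(len(verts), len(tris))   # 4 vertices, 2 triangles
```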
Prior to mesh reduction, the software 110 may apply filters to the mesh to identify important edges or to apply special effects that are transmitted as part of the image. The information gained in this pre-processing may guide the edge selection during mesh reduction.
After creating the mesh 112, the software 110 then uses mesh reduction techniques to simplify the mesh. A wide variety of mesh reduction techniques can be used such as edge-collapsing, vertex clustering, and/or vertex decimation. After simplification, the software 110 encodes the mesh for later reconstruction into an image.
The image decoding software 120 decodes the encoded mesh 118 and can use a shading technique 144 to display the image. A user may choose different image special effects 126 such as image scaling (e.g., zooming in and out), image rotation about the x, y, or z axis, or image shearing. The image may also be perturbed using a modulated sine or cosine function to create a wave-like effect. These special effects transform 126 the coordinates of the mesh vertices, for example, by moving them closer together or farther apart. After determining or receiving the new coordinates of the triangle vertices 128, the software 120 can again use Gouraud shading 124 to “color in” each triangle.
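Each of these special effects amounts to a transform of the vertex coordinates followed by re-shading. The sketch below illustrates rotation, shearing, and a sine-based wave perturbation; the function names and parameters are assumptions for illustration.

```python
# Sketch of vertex-coordinate transforms for the special effects mentioned
# above: rotation, shearing, and a sine-based "wave" perturbation.
import math

def rotate(vertices, angle):
    """Rotate (x, y) vertices about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for (x, y) in vertices]

def shear(vertices, kx):
    """Shear vertices horizontally in proportion to their y coordinate."""
    return [(x + kx * y, y) for (x, y) in vertices]

def wave(vertices, amplitude, frequency):
    """Perturb y coordinates with a sine function for a wave-like effect."""
    return [(x, y + amplitude * math.sin(frequency * x)) for (x, y) in vertices]

mesh = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(rotate(mesh, math.pi / 4))
print(wave(mesh, amplitude=2.0, frequency=0.5))
```

After any of these transforms, the triangles are simply re-filled (e.g., by Gouraud shading) using the new vertex positions.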
Edge Collapsing
One type of mesh reduction technique is known as edge-collapsing.
In one embodiment, the process 114 uses the techniques described in the SIGGRAPH 97 article by Michael Garland and Paul S. Heckbert, “Surface Simplification Using Quadric Error Metrics” (Proceedings of SIGGRAPH 97, Computer Graphics Proceedings, Annual Conference Series, pp. 209–216, August 1997, Los Angeles, Calif.; Addison Wesley; edited by Turner Whitted; ISBN 0-89791-896-7). The article describes a technique designed and used for simplifying surface topologies in three-dimensional meshes (e.g., smoothing small jagged three-dimensional spikes or bumps). The technique identifies a set of planes that intersect at a vertex and defines the error of the vertex with respect to this set of planes as the sum of squared distances to the planes. Instead of using this technique to smooth a three-dimensional surface, however, the encoding software can use the error metric to identify edges that can be removed while causing the least perceptible degradation of image color (or greyscale).
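The core of the error metric is the sum of squared distances from a candidate vertex position to a set of planes. The sketch below illustrates that idea only; the full Garland–Heckbert method accumulates the planes into a quadric matrix and solves for the position that minimizes the error, whereas this simplified sketch merely evaluates the edge midpoint.

```python
# Sketch of the quadric error idea: the error of a vertex position is the
# sum of squared distances to a set of planes. Each plane is (a, b, c, d)
# with a*x + b*y + c*z + d = 0 and a^2 + b^2 + c^2 = 1.
# Illustrative code, not the patent's implementation.

def quadric_error(point, planes):
    x, y, z = point
    return sum((a * x + b * y + c * z + d) ** 2 for (a, b, c, d) in planes)

def collapse_cost(v1, v2, planes_v1, planes_v2):
    """Cost of collapsing edge (v1, v2), evaluated at the edge midpoint."""
    mid = tuple((p + q) / 2.0 for p, q in zip(v1, v2))
    return quadric_error(mid, planes_v1 + planes_v2), mid

# Two vertices whose z value is the pixel's grey level (see below), each with
# the plane of a triangle that touches it. Edges with low cost are collapsed first.
cost, new_vertex = collapse_cost((0, 0, 100), (1, 0, 104),
                                 [(0, 0, 1, -100)], [(0, 0, 1, -104)])
print(cost, new_vertex)   # small cost: the edge is nearly flat in grey value
```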
The three-dimensional error-metric technique operates on vertices having x, y, and z coordinates. Since the co-planar vertices of the two-dimensional mesh have only x and y values, the z value of each vertex can be set, in greyscale pictures, to the greyscale value of the vertex.
In a color embodiment, the process 114 transforms the red, green, and blue (RGB) values of each pixel into YUV values in the luminance/chrominance color space. The Y, U, and V values can be computed directly as functions of R, G, and B. In the YUV color space, the Y (luminance) value carries most of the perceptually significant information, dominating the U and V (chrominance) values. Hence, the Y value of each vertex can be fed to the error-metric function as the vertex's z value.
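One common set of conversion weights is the ITU-R BT.601 formula shown below; the patent states only that Y, U, and V are functions of R, G, and B, so other weightings could be used.

```python
# A common RGB -> YUV conversion (ITU-R BT.601 weights).

def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    u = 0.492 * (b - y)                     # blue-difference chrominance
    v = 0.877 * (r - y)                     # red-difference chrominance
    return y, u, v

# The Y value of a vertex can then serve as the vertex's z coordinate when
# the three-dimensional error metric is applied.
print(rgb_to_yuv(200, 50, 50))
```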
In yet another embodiment, the process applies the Quadric Error Metric to the Y, U, and V components of each vertex. The resulting error metrics (three per edge) are weighted and averaged together to form a single error metric, which is fed into an edge-collapsing function.
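The patent does not give specific weights; the sketch below simply combines the three per-edge errors with assumed weights that favor luminance.

```python
# Combine per-channel quadric errors into one edge cost.
# The weights are assumptions for illustration only.

def combined_error(err_y, err_u, err_v, w_y=0.6, w_u=0.2, w_v=0.2):
    return w_y * err_y + w_u * err_u + w_v * err_v

print(combined_error(8.0, 2.0, 1.0))   # single cost fed to the edge-collapsing function
```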
Mesh Encoding
After mesh reduction (if applied), the resulting mesh can be compactly encoded for storage and subsequent decoding.
A wide variety of techniques can be used to encode a mesh.
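One simple, non-progressive encoding, offered purely as an illustration (the patent does not prescribe this format), stores the reduced vertex records followed by a triangle index list in a compact binary layout.

```python
# Illustrative compact encoding of a reduced greyscale mesh: vertex records
# (x, y, grey) followed by triangle vertex indices, packed as binary.
import struct

def encode_mesh(vertices, triangles):
    data = struct.pack("<II", len(vertices), len(triangles))
    for x, y, grey in vertices:
        data += struct.pack("<HHB", x, y, grey)    # 16-bit coordinates, 8-bit grey
    for a, b, c in triangles:
        data += struct.pack("<III", a, b, c)       # 32-bit vertex indices
    return data

def decode_mesh(data):
    nv, nt = struct.unpack_from("<II", data, 0)
    off, vertices, triangles = 8, [], []
    for _ in range(nv):
        vertices.append(struct.unpack_from("<HHB", data, off)); off += 5
    for _ in range(nt):
        triangles.append(struct.unpack_from("<III", data, off)); off += 12
    return vertices, triangles

blob = encode_mesh([(0, 0, 12), (4, 0, 200), (0, 4, 90)], [(0, 1, 2)])
print(len(blob), decode_mesh(blob))
```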
The mesh representation of the image could be encoded using other techniques, including techniques that support progressive mesh construction and rendering. For example, the mesh could be encoded by compressing the actions needed to reconstruct the mesh progressively. Progressive transmission enables a system to display the mesh as it is constructed from arriving information instead of waiting for the complete transmission.
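As one hypothetical example of such a scheme, in the spirit of progressive meshes, a coarse base mesh could be transmitted first, followed by a stream of "vertex split" records that undo the edge collapses in reverse order. The sketch below is schematic only; the record layout and names are assumptions, and a full implementation would also re-attach some of the parent vertex's triangles to the new vertex.

```python
# Schematic of progressive decoding: start from a coarse base mesh and apply
# "vertex split" records (the inverse of edge collapses) as they arrive.

def apply_vertex_split(vertices, triangles, split):
    """split = (parent_index, new_vertex, new_triangles)."""
    parent, new_vertex, new_triangles = split
    vertices.append(new_vertex)        # re-introduce the collapsed vertex
    triangles.extend(new_triangles)    # re-introduce its triangles (simplified)
    return vertices, triangles

base_vertices = [(0, 0, 10), (8, 0, 30), (0, 8, 90)]
base_triangles = [(0, 1, 2)]
splits = [(1, (4, 4, 60), [(1, 3, 2)])]    # records arriving over the network

for s in splits:                           # the display can refresh after each split
    apply_vertex_split(base_vertices, base_triangles, s)
print(len(base_vertices), len(base_triangles))
```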
The techniques described here, however, are not limited to any particular hardware or software configuration; they may find applicability in any computing or processing environment. The techniques may be implemented in hardware or software, or a combination of the two. Preferably, the techniques are implemented in computer programs executing on programmable computers that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and one or more output devices. Program code is applied to data entered using the input device to perform the functions described and to generate output information. The output information is applied to one or more output devices.
Each program is preferably implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
Each such computer program is preferably stored on a storage medium or device (e.g., CD-ROM, hard disk or magnetic diskette) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described in this document. The system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.
Other embodiments are within the scope of the following claims. For example, while described as operating on a two-dimensional image, the image may be a single frame in a series of video frames. In this case, the encoding of the image may be described as changes to the mesh instead of completely re-encoding the mesh for each frame.
This application is a divisional of application Ser. No. 09/429,881, filed Oct. 29, 1999, now U.S. Pat. No. 6,798,411. The entire teachings of the above application are incorporated herein by reference.
Prior Publication Data

Number | Date | Country
---|---|---
20050083329 A1 | Apr 2005 | US
Related U.S. Application Data

Relation | Number | Date | Country
---|---|---|---
Parent | 09429881 | Oct 1999 | US
Child | 10921783 | | US