The invention relates to a method of transforming a computer representation of an N-dimensional first object into a computer model of the first object.
The invention also relates to a compression method of transforming a computer representation of an N-dimensional object into a compression model of the object.
The invention also relates to a method of decompressing a compressed video signal to a computer representation of an N-dimensional object.
The invention also relates to a method of transforming a first cellular space model having a first plurality of cells into a second cellular space model having a second plurality of cells.
The invention also relates to a computer program for performing a method of transforming a computer representation of an N-dimensional first object into a computer model of the first object.
The invention also relates to a computer program for performing a compression method of transforming a computer representation of an N-dimensional object into a compression model of the object.
The invention also relates to a computer program for performing a method of decompressing a compressed video signal to a computer representation of an N-dimensional object.
The invention also relates to an apparatus for transforming a computer representation of an N-dimensional first object into a computer model of the first object, the apparatus comprising
The invention also relates to a video decompression apparatus for decompressing a compressed video signal to a computer representation of an N-dimensional object, the video decompression apparatus comprising:
The invention also relates to a data representation comprising a cellular space for representing a digitized N-dimensional object.
An embodiment of the method is known from the book by M. Ghanbari, "Video Coding: An Introduction to Standard Codecs", The Institution of Electrical Engineers, 1999, ISBN 0 85296 762 4, pp. 46-48.
In this embodiment, the computer representation is a digitized representation of a set of two-dimensional images representing recordings of three-dimensional objects in a space recorded by projection in the image plane of a camera, at consecutive instants. The images consist of a matrix of pixel positions to which grey values are assigned.
In most applications of the known method, such as digital television transmission or recording on a DVD, fixed blocks of pixels are transformed into a computer model in accordance with a pattern that has little adaptivity. An example of an application is a recording on a DVD making use of the MPEG2 standard, in which the computer model comprises, inter alia, discrete cosine transform (DCT) coefficients computed for a priori fixed blocks of pixels. Another application of the known method is a video application in accordance with the MPEG4 standard, which allows more adaptivity. For example, two-dimensional objects can be coded in an object-based manner in MPEG4. The MPEG4 standard also allows a three-dimensional model of a human face, animated with respect to time, to be used as a compression model of a human face in a video sequence. It is a drawback of the current MPEG4 compression systems that there is no satisfactory method of automatically modeling voxel representations of three-dimensional objects in a video cube. A voxel representation represents a three-dimensional object as a set of cubes of elementary dimensions, referred to as voxels. A voxel may be defined as a three-dimensional geometrical position associated with a number which indicates, for example, a grey value of a pixel in a video image. A video cube is a cube of voxels formed by placing a plurality of video images, which succeed each other in time, one behind the other.
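The notions of voxel and video cube defined above can be illustrated with a minimal Python sketch. The names `build_video_cube` and `voxel`, and the nested-list layout indexed as `cube[t][y][x]`, are assumptions made for illustration only and do not appear in the specification.

```python
from typing import List

Frame = List[List[int]]  # rows of grey values, indexed frame[y][x]


def build_video_cube(frames: List[Frame]) -> List[Frame]:
    """Stack equally sized frames in temporal order; the result is indexed cube[t][y][x]."""
    if not frames:
        raise ValueError("at least one frame is required")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must have the same size")
    # Copy the rows so that the cube does not alias the input frames.
    return [[row[:] for row in frame] for frame in frames]


def voxel(cube: List[Frame], x: int, y: int, t: int) -> int:
    """Return the grey value stored at spatial position (x, y) and instant t."""
    return cube[t][y][x]


# Two consecutive 2x2 images in which a bright pixel moves (hypothetical data).
frames = [
    [[0, 0], [0, 9]],  # instant t = 0
    [[0, 9], [0, 0]],  # instant t = 1
]
cube = build_video_cube(frames)
```

In this layout the time axis plays the role of the third dimension, so a moving pixel traces a line through the cube.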
It is, inter alia, a first object of the invention to provide a transformation method for modeling N-dimensional objects by means of a user-friendly computer model.
It is, inter alia, a second object of the invention to provide an efficient method of compressing an N-dimensional object.
It is, inter alia, a third object of the invention to provide a method of decompressing an efficiently compressed video signal.
It is, inter alia, a fourth object of the invention to provide a method of transforming a first cellular space model into a second cellular space model so that transformations of the associated N-dimensional objects can be modeled efficiently.
It is, inter alia, a fifth object of the invention to provide a computer program for performing the transformation method.
It is, inter alia, a sixth object of the invention to provide a computer program for performing the compression method.
It is, inter alia, a seventh object of the invention to provide a computer program for performing the decompression method.
It is, inter alia, an eighth object of the invention to provide an apparatus for performing the transformation method.
It is, inter alia, a ninth object of the invention to provide an apparatus for performing the decompression method.
It is, inter alia, a tenth object of the invention to provide an easily processable data representation for representing an N-dimensional object.
The first object is realized in that the computer model transformation comprises the step of generating a cellular space model having a first cell belonging to a first manifold having a dimension which is equal to N, and a second cell belonging to a second manifold having a lower dimension which is equal to N−1 situated on the border of the first manifold, and an edge between the first cell and the second cell to which an indicator is assigned, which indicates whether the second manifold forms part of the border of the first manifold.
A manifold is the mathematical name for a collection of points having a dimension D. An example of a manifold is a plane. An example of a plane in a video cube is the plane which is built up from the projections in consecutive video images, as picked up by a camera, of the upper side of, for example, a square object. In each individual video image, this projection forms a line and all lines jointly form a plane. The plane formed by the upper sides of an object may of course also be curved. Besides two spatial dimensions associated with a video image and the time dimension, a third spatial dimension may be present in a three-dimensional television application. When a scale dimension is also added, the number of dimensions N is equal to five. Additional dimensions can be added to represent other parameters, for example, parameters computed on the basis of the texture of an object.
A cellular space is a specific instance of a graph. A graph is a mathematical concept and consists of cells and edges. The cellular space will generally be built up in such a way that a cell corresponds to each manifold of the N-dimensional object, starting from the N-dimensional manifold forming the interior of the object and going across all lower-dimensional manifolds on the border, and on the border of the border, down to and including the zero-dimensional manifolds on the border, being points. A specific property of a cellular space is that an edge is added between a first cell corresponding to a first manifold having a dimension D and a second cell corresponding to a second manifold having a dimension which is one lower, D−1, if the second manifold is situated on the border of the first manifold. All lower-dimensional manifolds on the border of a mother object are thus explicitly modeled by means of a cell and an edge in the cellular space model. An example of building up a cellular graph is illustrated with reference to
There are many methods of modeling N-dimensional objects in the computer graphics technique. However, these methods have a metrical nature. An example is an octree in which a three-dimensional object is partitioned into cubes of different dimensions until the smallest cubes approach the irregular outer surface with a given precision. Other models model the surface of an N-dimensional object such as, for example, a triangular mesh or a Gaussian bump model. The cellular space is, however, a topologic representation of the N-dimensional object which allows an indication of the components the object consists of, which components can be subsidiarily modeled, if necessary, by means of a metric model.
It is advantageous when, for a computer representation of a second object, a third cell belonging to a third manifold is added to the cellular space model. When all manifolds of both objects are represented in one single cellular space, their topological relation is conveniently available and can easily be processed. Two adjoining manifolds belonging to a first and a second cell have a common border manifold of a lower dimension belonging to a third cell. The cellular space has a first edge between the third and the first cell and a second edge between the third and the second cell, which model the border relations of the border manifold. Since the border manifold generally forms part of only one manifold, the indicator of one of the edges has the value of “forming part of” and the indicator of the other edge has the value of “not forming part of”. By means of the information about all objects which the cellular space comprises, it is easy to predict, for example, the temporal evolution of an object or to change it in a computer graphics application. The indicator supplies information about which of the two objects, in the three-dimensional space captured by the camera in a video sequence, is positioned furthest back.
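The structure described above, with cells for manifolds, edges for border relations and an indicator on each edge, could be sketched as follows. The class `CellularSpace`, its method names and the cell names are hypothetical and serve only to illustrate the data structure; they are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CellularSpace:
    """Graph whose cells stand for manifolds and whose edges stand for border relations."""
    dimension: Dict[str, int] = field(default_factory=dict)
    # (higher_cell, border_cell) -> indicator, True meaning "forming part of"
    edges: Dict[Tuple[str, str], bool] = field(default_factory=dict)

    def add_cell(self, name: str, dim: int) -> None:
        self.dimension[name] = dim

    def add_edge(self, higher: str, border: str, forms_part_of: bool) -> None:
        # An edge only links a D-dimensional cell to a (D-1)-dimensional border cell.
        if self.dimension[higher] - self.dimension[border] != 1:
            raise ValueError("edge must connect dimensions D and D-1")
        self.edges[(higher, border)] = forms_part_of

    def owner_of(self, border: str) -> List[str]:
        """Cells whose edge to `border` carries the indicator 'forming part of'."""
        return [h for (h, b), part in self.edges.items() if b == border and part]


space = CellularSpace()
space.add_cell("object_1", 3)       # interior of a first space-time object
space.add_cell("object_2", 3)       # interior of an adjoining second object
space.add_cell("shared_border", 2)  # common border manifold, one dimension lower
space.add_edge("object_1", "shared_border", forms_part_of=False)
space.add_edge("object_2", "shared_border", forms_part_of=True)
```

Here `space.owner_of("shared_border")` yields the single object that the common border forms part of, which is the information the indicator is meant to carry.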
In one embodiment, a three-dimensional video cube consisting of two-dimensional images associated with consecutive instants and being placed one after the other is partitioned into a first object and a second object, and the transformation generates a first cell and a third cell, the dimension of the first manifold and the third manifold being at most three.
This embodiment occurs, for example, in a two-dimensional television application. The advantage of the method according to the invention is that geometrical transformations of objects can be more easily modeled in time by means of the cellular space model. All voxels in the video cube are assigned to an object; for example, a first three-dimensional space-time object represents a person who is walking and the second object is the person's environment comprising all other voxels. When a video cube comprises P pictures chosen from a video sequence, the person may occur, for example, in only P−K of these pictures or, alternatively, he may also occur in further pictures outside the chosen video cube. Each object in the video cube is modeled in the same cellular space model.
It is also interesting when the transformation assigns a value to the indicator on the basis of a computation of at least one geometrical property, derived from values of the computer representation. The cellular space model is automatically generated on the basis of a real-life video sequence. All kinds of properties of objects in the video sequence can be measured in order that the indicator can be given the correct value with great certainty by means of one or more of these properties.
In one embodiment using a robust computation of the indicator, the transformation assigns a value to the indicator on the basis of a computation of a change with respect to time of the surface area of a cross-section of the first object with a plane of a two-dimensional image in the video cube at an instant. In fact, when a two-dimensional cross-section of an object in a video sequence appears or disappears behind another object cross-section, the number of pixels associated with the cross-section changes because some pixels of the object are invisible.
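A minimal sketch of this idea, assuming the video cube has already been segmented into a label cube indexed as `labels[t][y][x]`, is given below. The decision rule used here, namely that the object whose cross-section area varies least is taken to own the common border, is an illustrative simplification and not the computation prescribed by the specification; all function names are hypothetical.

```python
from typing import List

LabelFrame = List[List[int]]  # object identifier per pixel, indexed frame[y][x]


def cross_section_areas(labels: List[LabelFrame], object_id: int) -> List[int]:
    """Number of pixels of the object's cross-section in each image of the cube."""
    return [sum(row.count(object_id) for row in frame) for frame in labels]


def area_variation(areas: List[int]) -> int:
    """Accumulated absolute change of the cross-section area between consecutive images."""
    return sum(abs(b - a) for a, b in zip(areas, areas[1:]))


def border_owner(labels: List[LabelFrame], first_id: int, second_id: int) -> int:
    """Guess which object the common border forms part of.

    Assumption: an object that is being occluded loses or gains visible pixels,
    so the object with the smaller area variation is taken to be in front.
    """
    v1 = area_variation(cross_section_areas(labels, first_id))
    v2 = area_variation(cross_section_areas(labels, second_id))
    return first_id if v1 <= v2 else second_id


# Hypothetical 1x4 images: object 1 slides to the right behind the static object 2.
labels = [
    [[1, 1, 2, 0]],
    [[0, 1, 2, 0]],
    [[0, 0, 2, 1]],
]
owner = border_owner(labels, 1, 2)
```

In this toy sequence the area of object 1 drops from two pixels to one while object 2 stays constant, so the border is attributed to object 2.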
The second object is realized in that the transformation makes use of a cellular space model. In addition to a cellular space model, a compression model is also generated.
The compression model comprises metric information, for example, about the precise form of the interior of the first object. The advantage of the method according to the invention is that objects in the video cube are compressed by means of a three-dimensional model, while objects in the prior art of MPEG4 are compressed two-dimensionally by modeling and compressing only two-dimensional cross-sections in different television images. By using a three-dimensional compression model, the achieved compression factor at the same image quality is higher than in two-dimensional compression. Alternatively, at a fixed compression factor, the image quality in three-dimensional compression is higher than in two-dimensional compression. Due to its fixed pattern of partitioning an image into 16×16 pixel blocks and temporal prediction of images, MPEG2 does not completely utilize the three-dimensional character of objects in the video cube. For an efficient compression, the fact that objects are occluded must be explicitly taken into account. Occlusion occurs when a first object moves behind a preceding second object in a three-dimensional space, or when the first object appears from behind the second object.
Patent application WO-A-00/64148 describes a compression method which is based on matching two-dimensional segments. Some techniques described in this application may be useful for obtaining an N-dimensional object from a video cube, required for the method according to the invention. However, the patent application does not explicitly use N-dimensional objects but only two-dimensional projections thereof.
The third object is realized in that the decompression method makes use of a cellular space model. The explicit coding of objects in a cellular space model allows advanced compression and decompression. In fact, during regeneration of N-dimensional objects, it is computed by means of the cellular space model which pixels of objects are visible.
The fourth object is realized in that the first plurality of cells is different from the second plurality of cells. When, for example, a first N-dimensional object is to be compared with a second N-dimensional object, for example, in a search for picture material on the Internet, it will be easy to compare their associated cellular space models. Before associating cells and edges of both cellular space models with each other, it may be convenient to first transform one of the cellular space models. For example, the roof of an object representing a house is flat in the cellular space model of a house as specified in a query, and pointed for a second house in an image on the Internet. The cell representing the flat roof may then be re-used for the first slanting side of the pointed roof, and an extra cell may be added for the second slanting side. Techniques of the same kind are useful for computer graphics applications.
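The roof example can be sketched as a small graph transformation. The representation below (a dictionary of cell dimensions and a set of edges) and the names `split_roof`, `flat_roof`, `roof_side_1` and `roof_side_2` are assumptions for illustration. A fuller transformation would presumably also add the ridge between the two sides as a lower-dimensional cell, which is omitted here.

```python
from typing import Dict, Set, Tuple

Cells = Dict[str, int]        # cell name -> dimension of its manifold
Edges = Set[Tuple[str, str]]  # (higher-dimensional cell, border cell)


def split_roof(
    cells: Cells,
    edges: Edges,
    roof: str = "flat_roof",
    sides: Tuple[str, str] = ("roof_side_1", "roof_side_2"),
) -> Tuple[Cells, Edges]:
    """Return a second model in which `roof` is re-used as one side and a cell is added."""
    first, second = sides

    def rename(name: str) -> str:
        return first if name == roof else name

    # Re-use the existing roof cell for the first slanting side.
    new_cells = {rename(name): dim for name, dim in cells.items()}
    new_edges = {(rename(h), rename(b)) for h, b in edges}
    # Add an extra cell of the same dimension for the second slanting side,
    # bordering the same higher-dimensional cells as the original roof.
    new_cells[second] = cells[roof]
    new_edges |= {(h, second) for h, b in edges if b == roof}
    return new_cells, new_edges


cells = {"house": 3, "flat_roof": 2, "wall": 2}
edges = {("house", "flat_roof"), ("house", "wall")}
new_cells, new_edges = split_roof(cells, edges)
```

The second model has one cell more than the first, which matches the statement that the two pluralities of cells differ.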
The fifth object is realized by providing a code comprising a computer program for the transformation method.
The sixth object is realized by providing a code comprising a computer program for the compression method.
The seventh object is realized by providing a code comprising a computer program for the decompression method.
The eighth object is realized in that the processing means are capable of generating a cellular space model with a first cell belonging to a first manifold having a higher dimension which is equal to N, and a second cell belonging to a second manifold having a lower dimension which is equal to N−1 situated on the border of the first manifold, and an edge between the first cell and the second cell, and are capable of assigning an indicator to the edge, which indicates whether the second manifold forms part of the border of the first manifold.
The ninth object is realized in that the processing means have access to a cellular space model.
The tenth object is realized in that an indicator is assigned to an edge between a first cell and a second cell of the cellular space, which indicator indicates whether the second manifold having a lower dimension forms part of a first manifold having a higher dimension, said first and second manifolds being represented by the first and the second cell, respectively.
The transformation method, the compression method, the decompression method, the apparatus, the video decompression apparatus and the data representation according to the invention will hereinafter be elucidated, by way of example, with reference to the drawings. In these drawings:
In the following Figures, parts corresponding to parts of Figures already described are denoted by the same reference numerals. The reference numerals of corresponding parts of an object and the associated cellular space model only differ by one hundred. Parts shown in broken lines are optional. The methods and apparatuses are described with reference to three or two-dimensional objects in order to elucidate the ideas more clearly. The steps described may be mathematically formulated in an obvious manner for higher dimensions.
The second step of the transformation method of
In formula [1], i is the index of a histogram bin, in which all grey values in a video cube are divided into M bins. C is the number of grey values associated with a bin i in the cubes K1 and K2. The volume V of a cube is used as a normalization constant. When the difference G is small, both cubes belong to one and the same segment in accordance with the segmentation algorithm. Different criteria are described in the literature, each of which can make use of different properties such as voxel grey value, voxel color, or texture measures such as values obtained by Gabor filtering or values from a co-occurrence matrix. The literature also describes different segmentation algorithms which, for example, group small segments into larger segments or, conversely, split up larger segments into smaller ones.
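Formula [1] itself is not reproduced in this text. Purely as an illustration of a criterion of this kind, the sketch below assumes a normalized sum of absolute bin differences, G = (1/V) Σ |C_i(K1) − C_i(K2)| over the M bins; this form, the 8-bit grey range and the function names are assumptions and may differ from the actual formula.

```python
from typing import List


def histogram(grey_values: List[int], bins: int, max_value: int = 255) -> List[int]:
    """Count how many grey values fall into each of `bins` equally wide bins."""
    counts = [0] * bins
    for g in grey_values:
        index = min(g * bins // (max_value + 1), bins - 1)
        counts[index] += 1
    return counts


def histogram_difference(cube_1: List[int], cube_2: List[int], bins: int = 8) -> float:
    """Assumed form of G: summed absolute bin differences, normalized by the volume V."""
    if len(cube_1) != len(cube_2):
        raise ValueError("both cubes must contain the same number of voxels")
    volume = len(cube_1)  # V: number of voxels in one cube
    c1 = histogram(cube_1, bins)
    c2 = histogram(cube_2, bins)
    return sum(abs(a - b) for a, b in zip(c1, c2)) / volume


# Hypothetical flattened cubes of four voxels each.
k1 = [10, 12, 200, 210]
k2 = [11, 13, 205, 220]   # similar grey-value distribution to k1
k3 = [250, 251, 252, 253]  # clearly different distribution

g_similar = histogram_difference(k1, k2)
g_different = histogram_difference(k1, k3)
```

A segmentation algorithm would compare G with a threshold; the choice of threshold and of the number of bins M is left open here.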
The object may already have been modeled in accordance with a given model, for example, an octree. If desired, the octree model may be transformed to a voxel representation during the acquiring step. Alternatively, the cellular space model can be generated on the basis of, for example, a triangular mesh representation.
During the generation step 5 of
A second heuristic is illustrated with reference to
A third heuristic analyzes with which adjoining texture a border moves. This may be effected by means of motion estimation. First, a texture analysis can be performed, for example, by computing Laws parameters or a wavelet or fractal analysis of the texture or an analysis of texture units can be performed. It is further possible to isolate segments having textures of the same type from the images and apply a segment-based motion estimator.
If a second object is present, for example, the second object 204 together with the first object 203 in the video cube 201 of
It is interesting when not only a cellular space model 223 of the voxel representation is generated, the generation step 5 in
During the outputting step 7, the cellular space model and, if applicable, the metric model 222 are output, for example, to a memory 219 or via a data connection. It is interesting when the data of the metric model and the cellular space are used to generate a compression model 228, preferably an object-based compression model. For example, a three-dimensional wavelet model of the objects can be used as a compression model, using techniques which are known from compression technology such as, for example, quantization of wavelet coefficients while taking the characteristics of human vision into account and, for example, Huffman coding.
The advantage of using a cellular space model is that compression and decompression can be performed more efficiently than with a metric model only. This will be illustrated with reference to
If, for example, a soccer ball rolls through an image, the texture modeling may optionally model a rotating texture function of the ball or a static function which translates linearly, in which case the ball will be observed as a sliding instead of a rolling ball at the receiver end. If the texture function varies with time, for example, by changes of illumination, a first option is to make use of very short three-dimensional objects modeling only a small part of the trajectory of an object, for example, through four frames. An alternative option is the use of time-variant texture functions, for example, a polynomial change of the grey value of a pixel in a system of reference axes coupled to the object.
Compression is important for many applications, both for the transport and for the storage of data. Transport is understood to mean, for example, Internet video, third and fourth generation mobile communications, video-on-demand over DSL (Digital Subscriber Line) and digital television. Storage is understood to mean, for example, high-capacity record carriers, such as HDTV on digital discs such as a DVD, professional video servers, personal video recorders based on a hard disk on which, for example, many programs are recorded, though with a low quality, and proprietary compression in all kinds of systems. For low-capacity storage, carriers such as video CD, small discs and solid-state memories are interesting. Video signals may originate from all kinds of sources ranging from satellite television to Internet video. The method may be used at the provider's end, for example, in a television studio, and at an intermediary's end, for example, a cable network company, as well as in the living room.
More dimensions than three can be obtained by constructing a so-called scale space, for example, for each frame. For example, the frame may be filtered with Gaussian filters in which the standard deviation σ of the filter is increased continuously. The standard deviation then forms an extra dimension. Just as a video cube can be formed by putting frames one behind the other with respect to time, as in
Another application of the cellular space model is computer vision. For example, when a robot must plan a motion trajectory in a three-dimensional space with reference to images picked up by a camera, it can make use of the cellular space model so as to define which manifolds in the frames belong to each other, so that it can better compute the three-dimensional structure and placement of objects in the three-dimensional space. Another application is the recreation of a scene from another viewpoint, such as in three-dimensional television or video-on-demand. Furthermore, the cellular space model is also interesting when creating special effects. Another application is the structural decomposition of images on, for example, the Internet. When images having given objects must be found, these objects can be described by means of a cellular space model. A cellular space model is generated, for example, both for a sketch of the searched object made by a user of, for example, an image search program, and for images in a database on the Internet. The use of a cellular space model is also interesting in medical image-processing applications.
Subsequently, a computer representation is generated (step in
When, for example, the second image 703 is being generated, the borders of both the circle 712 and the square 713 are first computed by, for example, projection of their associated three-dimensional object on the plane of the second image 703. Subsequently, the respective texture functions are to be applied so as to color the pixels of the circle and the square. It should be computed whether either the circle or the square is in front. Since the border manifolds form part of the square, the square is in front. It follows that the texture function of the circle in its second position 712 must first be drawn and overwritten by the texture function of the square 713.
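The drawing order described above can be sketched as follows: the object that the common border forms part of is taken to be in front and is therefore drawn last. The functions `drawing_order` and `render`, the pixel-set representation of the projected objects and the constant texture functions are hypothetical illustrations, not part of the described apparatus.

```python
from typing import Callable, Dict, List, Set, Tuple

Pixel = Tuple[int, int]  # (x, y)


def drawing_order(first: str, second: str, border_part_of: str) -> List[str]:
    """The object the shared border forms part of is in front, so it is drawn last."""
    back = second if border_part_of == first else first
    return [back, border_part_of]


def render(
    width: int,
    height: int,
    regions: Dict[str, Set[Pixel]],
    textures: Dict[str, Callable[[int, int], int]],
    back_to_front: List[str],
    background: int = 0,
) -> List[List[int]]:
    """Color each object's pixels with its texture function; later objects overwrite."""
    image = [[background] * width for _ in range(height)]
    for name in back_to_front:
        for x, y in regions[name]:
            if 0 <= x < width and 0 <= y < height:
                image[y][x] = textures[name](x, y)
    return image


# Hypothetical projections on a 3x1 image; the two objects overlap at pixel (1, 0).
regions = {"circle": {(0, 0), (1, 0)}, "square": {(1, 0), (2, 0)}}
textures = {"circle": lambda x, y: 100, "square": lambda x, y: 200}

order = drawing_order("circle", "square", border_part_of="square")
image = render(3, 1, regions, textures, order)
```

Because the square is drawn after the circle, the overlapping pixel receives the square's texture value, which corresponds to the square being in front.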
During the outputting step 105, the video cube is written, for example, into a memory 271, or the consecutive images are sent to, for example, a picture display unit.
Number | Date | Country | Kind |
---|---|---|---|
02077692.8 | Jul 2002 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB03/03034 | 7/2/2003 | WO | | 12/22/2004 |