The present invention generally relates to three dimensional (3D) models. More particularly, it relates to compression and transmission of 3D models in a 3D program.
In practical applications, such as 3D games, virtual chat rooms, digital museums, and CAD, many 3D models consist of a large number of connected components. These multi-connected 3D models usually contain a non-trivial amount of repetitive structures via various transformations, as shown in
Methods to automatically discover the repetitive geometric features in large 3D engineering models have been proposed, such as D. Shikhare, S. Bhakar and S. P. Mudur. Compression of Large 3D Engineering Models using Automatic Discovery of Repeating Geometric Features, 6th International Fall Workshop on Vision, Modeling and Visualization (VMV2001), Nov. 21-23, 2001, Stuttgart, Germany. However, these methods did not provide a complete compression scheme for 3D engineering models. For example, Shikhare et al. did not provide a solution for compressing the transformation information necessary for restoring a connected component. Considering the large number of connected components a 3D engineering model usually has, the transformation information will also consume a large amount of storage if not compressed.
In PCT application WO2010149492 filed on Jun. 9, 2010, entitled Efficient Compression Scheme for Large 3D Engineering Models, an efficient compression algorithm for multi-connected 3D models by taking advantage of discovering repetitive structures in the input models is disclosed. It first discovers in a 3D model the structures or components repeating in various positions, orientations and scaling factors. Then the repetitive structures/components in the 3D model are organized using “pattern-instance” representation. A pattern is the representative geometry of the corresponding repetitive structure. The instances of a repetitive structure correspond to the components belonging to the repetitive structure and are represented by their transformations, i.e. the positions, orientations and possible scaling factors, with respect to the corresponding pattern and the pattern identification.
To restore the original model from the “pattern-instance” representation, the instance components are calculated by
Inst_Comp=Inst_Transf×Pattern, (1)
where Inst_Transf is the transformation matrix transforming the corresponding pattern to the instance component Inst_Comp. The decoder calculates the transformation matrix Inst_Transf by deriving it from the decoded position, orientation and scaling information, such as
Inst_Transf=Func(Pos_Instra,Ori_Instra,Scal_Instra), (2)
where Pos_Instra, Ori_Instra and Scal_Instra are the decoded position, orientation and scaling factor of the instance component to be restored. Thus the instance components can be restored by
Inst_Comp=Func(Pos_Instra,Ori_Instra,Scal_Instra)×Pattern. (3)
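By way of illustration only, Eqns. (1) to (3) can be sketched in Python as follows. The composition order inside `func` (scale, then rotate, then translate), the use of a 4×4 homogeneous matrix, and the helper `rotation_z` are assumptions made for this sketch; the text above does not specify how Func is defined.

```python
import math

def rotation_z(angle):
    """3x3 rotation about the z axis (hypothetical helper for a sample orientation)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def func(pos, ori, scal):
    """Eqn. (2): build a 4x4 homogeneous matrix from position, orientation and scale.

    Assumed order: uniform scale first, then rotation, then translation.
    """
    rows = [[scal * ori[r][c] for c in range(3)] + [pos[r]] for r in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows

def restore_instance(inst_transf, pattern):
    """Eqn. (1): apply Inst_Transf to every vertex of the pattern."""
    restored = []
    for x, y, z in pattern:
        v = (x, y, z, 1.0)
        restored.append(tuple(
            sum(inst_transf[r][k] * v[k] for k in range(4)) for r in range(3)
        ))
    return restored

# Eqn. (3): restore an instance from decoded position, orientation and scale.
pattern = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
inst = restore_instance(func((10.0, 0.0, 0.0), rotation_z(math.pi / 2), 0.5), pattern)
```

In this example the pattern is halved in size, rotated by 90 degrees about the z axis, and shifted by 10 units along the x axis.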
The compression scheme disclosed in WO2010149492 has achieved significant bitrate savings compared to traditional 3D model compression algorithms which do not discover repetitive structures.
The present invention solves the problem of 3D model compression and proposes an algorithm which can discover repetitive structures and further control the decoding error.
This invention is directed to methods and apparatuses for 3D model compression.
According to an aspect of the present invention, there is provided a method for encoding a 3D model, which comprises the steps of identifying a repetitive structure in said 3D model, encoding said repetitive structure in a repetitive structure encoding mode, and verifying each encoded instance component of said encoded repetitive structure.
According to another aspect of the present invention, there is provided a method for decoding a compressed 3D model, which comprises the steps of decoding patterns from a bitstream of the compressed 3D model; decoding instance component information from the bitstream; and restoring instance components using the decoded patterns and the decoded instance component information.
The above features of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings in which:
In the present invention, a solution to compress a 3D model is proposed, by discovering repetitive structures and controlling the decoding error.
In one embodiment, the 3D encoder 200 further comprises a unique mode encoder 240 for re-encoding the instance components output by the instance component verification unit as failing the verification. The unique mode encoder treats each of the components to be encoded as an independent component by individually compressing each component or compressing all such components altogether. An example unique mode encoder is a traditional mesh encoder.
In a different embodiment, the 3D model encoder 200 further comprises a compression mode determining unit (not shown) after the repetitive structure identification unit for determining whether to compress the entire 3D model in a unique encoding mode, that is, encoding the 3D model without exploiting the pattern-instance representation. One example implementation of the unique mode encoding is to use a traditional 3D mesh encoder. One reason for the determining step is to ensure that the 3D model is compressed in the RS encoding mode only when doing so results in bit savings. If there are no bit savings, a unique mode encoding is preferred. According to one embodiment of the present invention, the unique mode encoding for the 3D model is chosen if the number of instance components in the repetitive structures is smaller than a threshold. For example, if the instance components include less than a predetermined ratio, e.g. 50%, of the input vertices, the geometry representation of the entire input 3D model is compressed in the unique encoding mode.
The decoder further comprises a unique component decoder 540 for decoding unique components in the bitstream, if there are any. The decoded unique components are then combined with the restored instance components to generate the decoded 3D model in the component restoring unit 550.
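The vertex-ratio test in the example above can be sketched as follows. The function name, the argument names, and the mode labels "RS" and "unique" are hypothetical and chosen only for illustration; the 50% default follows the example ratio given above.

```python
def choose_encoding_mode(instance_vertex_count, total_vertex_count, min_ratio=0.5):
    """Return "unique" when instance components cover less than min_ratio of
    the input vertices, and "RS" otherwise.

    All names here are illustrative; only the ratio test comes from the text.
    """
    if total_vertex_count <= 0:
        raise ValueError("the input model must contain at least one vertex")
    ratio = instance_vertex_count / total_vertex_count
    return "unique" if ratio < min_ratio else "RS"

mode_small = choose_encoding_mode(400, 1000)   # instances hold 40% of the vertices
mode_large = choose_encoding_mode(900, 1000)   # instances hold 90% of the vertices
```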
The following presents a preferred detailed embodiment of the 3D model compression according to the present invention, which is called Pattern-based 3D Model Compression (PB3DMC) codec. A few highlights on the PB3DMC codec are:
PB3DMC encoder includes the following main steps:
PB3DMC decoder includes the following main steps.
Some of the major steps are explained below:
E.1) Discover Repetitive Structures (602)
In this step, repetitive structures among connected components are identified in the combination of translation, rotation, uniform scaling and reflection transformations. Compared with those schemes which do not consider the reflection transformation, the PB3DMC can discover more repetitive structures and further improve compression ratio. For example, when considering reflection transformations, each of the three eigenvectors found by a PCA analysis of the component is used as the axis for the mirror reflection of the component to examine whether it is similar to other components. An exhaustive search scheme requires 8 comparisons for each component. A more efficient searching scheme is also possible.
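One possible reading of the count of 8 is that each of the three principal axes is either mirrored or left unchanged, which gives 2×2×2 candidates per component. The sketch below enumerates these candidates under the assumption that the vertices are already expressed in the component's PCA frame, so that mirroring about an axis is a sign flip of one coordinate. The function name is hypothetical, and the similarity test applied to each candidate is not shown.

```python
from itertools import product

def reflection_candidates(vertices):
    """Yield (signs, mirrored_vertices) for every sign pattern over three axes.

    Assumes the vertices are given in the component's PCA frame, so a mirror
    about one principal axis negates the matching coordinate.
    """
    for signs in product((1, -1), repeat=3):
        yield signs, [tuple(s * c for s, c in zip(signs, v)) for v in vertices]

candidates = list(reflection_candidates([(1.0, 2.0, 3.0)]))
```

Each candidate would then be compared with the other components of the same cluster.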
The repetitive structure discovery is performed by a pair-wise comparison of connected components. In a preferred embodiment, in order to increase efficiency of the comparison, all components are first clustered by utilizing each component's vertex normal distribution as its feature vector for clustering, as disclosed in PCT application PCT/CN2011/080382 filed on Sep. 29, 2011, entitled “Robust Similarity Comparison of 3D Models,” the teachings of which are herein incorporated by reference in its entirety. Only the components belonging to the same cluster are compared with each other. Two components are aligned first before the comparison. Component alignment involves two steps. First align two components by their positions, orientations and scaling factors. Then they are further aligned using iterated closest points (ICP) algorithm, such as, Rusinkiewicz, S., and Levoy, M. Efficient Variants of the ICP Algorithm, in 3DIM, 145-152, 2001, which includes iterative rotation and translation transformation. Two components are determined to belong to the same repetitive structure if their surface distance is small enough after being aligned with each other. An example method for calculating the surface distance between two components can be found in N. Aspert, D. Santa-Cruz and T. Ebrahimi, MESH: Measuring Error between Surfaces using the Hausdorff distance, in Proceedings of the IEEE International Conference on Multimedia and Expo 2002 (ICME), vol. I, pp. 705-70. The surface distance threshold value can be determined based on a user input Quality Parameter (QP) table, an example of which is shown below:
Repetitive structure discovery generates repetitive structures, which consist of patterns and instance components (or instances), and unique components, which are connected components that do not belong to any repetitive structure. Patterns are the representative geometry of repetitive structures. Instances are the “pattern-instance” representation of the corresponding instance components. In one embodiment, a pattern is not selected as one of the components of the input model; rather it is aligned with the world coordinate system. For example, a pattern can be generated by selecting an instance component of the repetitive structure, moving it to the origin of the world coordinate system, and rotating it so that its eigenvectors are aligned with the world coordinate axes. The reason for such rotation is that, as shown in WO2010149492, compressing 3D models which have been aligned with the world coordinate system helps minimize the visual artifacts caused by vertex position quantization because of the small quantization error. This is particularly beneficial for large flat surfaces. In a different embodiment of generating a pattern, the selected instance component is only shifted to the origin, and no rotation is performed. In this case, the selected instance component would have the least transformation to the pattern, i.e. no rotation in its transformation matrix. Typically, the number of bits allocated to the compressed rotation sub-matrix/part is high. Thus, in this embodiment, no bits are assigned to the rotation sub-matrix/part for the selected instance component, which reduces the encoded bit rate. In a different embodiment, instance components of repetitive structures are clustered and the instance component which is close to the center of the cluster is picked to generate the pattern.
This embodiment leads to small values of the rotation information for most of the instance components in the cluster and thus fewer bits for the rotation information in the compressed bitstream.
The instances can be represented by
As can be seen from Eqn. (1), an instance component can be completely recovered from the instance transformation matrix Inst_Transf and the corresponding pattern, which can be retrieved using the pattern ID. Thus, compressing an instance component is equivalent to compressing the pattern ID and the instance transformation matrix. In this application, “instance” and “instance component” are used interchangeably to refer to the component or its instance representation since they are equivalent.
E.2) Determine Compression Mode (604)
In order to guarantee that the compression ratio is not decreased by introducing repetitive structure discovery into the codec, a decision needs to be made between the compression mode for the 3D model using the “pattern-instance” representation (RS encoding mode) or the original representation (unique mode) after repetitive structures discovery/identification. The general guidelines are:
The first step is to decide whether or not to compress scaling factors. To determine a scaling factor, according to one embodiment, all the instance components of a repetitive structure are shifted to their corresponding mean and rotated to the world coordinate system. A bounding box ([xmin, xmax], [ymin, ymax], [zmin, zmax]) is calculated for each shifted and rotated instance component, where xmin, xmax, ymin, ymax, zmin, zmax are the minimum and maximum coordinates of the instance component along the x, y, and z axes, respectively. In one embodiment, the instance component with the largest bounding box is picked to generate the pattern. For the remaining instance components, the bounding box of each instance component is compared with that of the pattern to determine the scaling factor. The reason for using the instance component with the largest bounding box as the pattern is to make all the scaling factors smaller than or equal to 1. This helps preserve precision after recovering the component. When the scaling factor is smaller than 1, the recovered instance component is a shrunk version of the pattern; the higher digits of the vertex coordinates are relatively accurate after decoding, and thus the recovered component is relatively accurate. On the other hand, if the scaling factor is larger than 1, the recovered instance component is a magnified version of the pattern, in which the lower digits of the vertex coordinates, which are usually not accurate after decoding, are magnified. Thus the recovered instance component may contain a large decoding error. Since a scaling factor affects every entry in a transformation matrix, it will be compressed without loss in a preferred embodiment if it is determined to be compressed. Thus, the compressed scaling factors will cost more bits than other types of instance information.
Compressing scaling factors may decrease the overall compression ratio if only a small number of scaling factors are not equal to 1, which means that few scaling factors need to be compressed and the overhead of compressing them exceeds the bit savings from the compression. Thus the decision can be made as follows
If it is determined not to compress the scaling factors, the instance components whose scaling factor is not equal to 1 will not be treated as the instance components of the repetitive structure. Attempts will be made to regroup these components into repetitive structures with scaling factor 1. After these steps, the components that do not belong to any repetitive structure are regarded as unique components. For example, suppose a repetitive structure contains 10 large book components and 4 small book components. The scaling factors for the 10 large book components are 1, and those for the 4 small book components are 0.75 for 3 of them and 0.5 for one of them. If it is determined that the scaling factors will not be compressed, the 10 large book components may still be treated as instance components of the original repetitive structure and compressed in the “pattern-instance” representation. The three small book components with original scaling factor 0.75 will form a new repetitive structure with scaling factor 1, and the remaining small book component, whose original scaling factor was 0.5, will be treated as a unique component since it does not belong to any repetitive structure.
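The bounding-box comparison described above can be sketched as follows. How the "largest" bounding box is measured and how a uniform scaling factor is derived from two boxes are not specified in the text; this sketch assumes the longest box side is used for both, and it assumes the components have already been shifted and rotated. All function names are hypothetical.

```python
def bounding_box_extent(vertices):
    """Side lengths (dx, dy, dz) of the axis-aligned bounding box."""
    xs, ys, zs = zip(*vertices)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))

def pick_pattern_and_scales(components):
    """Pick the component with the largest box and scale the others against it.

    Assumption: box size is measured by its longest side, and the uniform
    scaling factor is the ratio of longest sides.
    """
    sizes = [max(bounding_box_extent(c)) for c in components]
    pattern_index = max(range(len(components)), key=lambda i: sizes[i])
    if sizes[pattern_index] == 0:
        raise ValueError("degenerate components: all bounding boxes are empty")
    scales = [s / sizes[pattern_index] for s in sizes]
    return pattern_index, scales

components = [
    [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.0, 0.0, 0.0), (1.5, 1.5, 1.5)],
]
pattern_index, scales = pick_pattern_and_scales(components)
```

Because the largest component serves as the pattern, every scaling factor produced this way is at most 1.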
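The book example can be sketched as a grouping by scaling factor. This is a simplification made for illustration: within one original repetitive structure, components sharing a scaling factor are assumed to match each other at scaling factor 1, whereas an actual encoder would confirm this through the similarity comparison. The function name and the rounding precision are hypothetical.

```python
from collections import defaultdict

def regroup_without_scaling(scaling_factors, digits=6):
    """Split one repetitive structure by scaling factor.

    Returns (structures, unique): each structure is a list of component
    indices sharing a scaling factor; singletons are reported as unique.
    """
    groups = defaultdict(list)
    for index, factor in enumerate(scaling_factors):
        groups[round(factor, digits)].append(index)
    structures = [members for members in groups.values() if len(members) >= 2]
    unique = [members[0] for members in groups.values() if len(members) == 1]
    return structures, unique

# 10 large books (scale 1), 3 small books (0.75) and 1 small book (0.5).
book_scales = [1.0] * 10 + [0.75] * 3 + [0.5]
structures, unique = regroup_without_scaling(book_scales)
```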
The final decision of whether to compress the 3D model in the repetitive structure representation, i.e. in the RS encoding mode, is made by the following steps.
E.3) Encode Patterns (608)
There are two options to encode the patterns: separate encoding and group encoding. Compared with group encoding, i.e. encoding all patterns together, encoding patterns separately may cost more bits for the following reasons.
As the instance component verification step requires the decoded patterns and the order of the patterns might be changed in the decoded “pattern” model, after decoding the “pattern” model, the components of the decoded “pattern” model need to be recognized to calculate the new IDs (610) of patterns in the component sequence of the decoded “pattern” model so that they match the original pattern IDs.
E.4) Update Instance Transformation (610)
The instance transformation calculated during repetitive structure discovery is not accurate as it does not consider the decoding error of patterns. With the decoded patterns, the instance transformation can be updated for a better accuracy as follows
Inst_Comp=Inst_Transf×Decoded_Pattern. (4)
E.5) Verify Instances (628)
In order to control the decoding error, the encoder calculates the decoding error of each instance component, and those with a decoding error larger than the user-specified threshold do not pass the instance verification. In one embodiment, the decoding error is calculated as the surface distance between the decoded instance component and the original instance component:
Decoding_Err=Surface_dist(Decoded_Inst_Comp,Ori_Inst_Comp) (5)
where,
Decoded_Inst_Comp=Decoded_Inst_Transf×Decoded_Pattern. (6)
An example method for calculating the surface distance between two components can be found in N. Aspert, D. Santa-Cruz and T. Ebrahimi, MESH: Measuring Error between Surfaces using the Hausdorff distance, in Proceedings of the IEEE International Conference on Multimedia and Expo 2002 (ICME), vol. I, pp. 705-70.
The compressed transformation of those instances passing the instance verification is output to the compressed bitstream according to the data packing mode selected by the user, as will be explained later. The instance components that fail the instance verification will be treated as unique components and compressed together with other unique components.
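The verification of Eqns. (5) and (6) can be sketched as follows. As a stand-in for the surface distance, this sketch uses the symmetric Hausdorff distance between vertex sets, which is coarser than a surface-to-surface measure such as the one in the MESH tool cited above. The function names are hypothetical, and the decoded instance components are assumed to have been computed already per Eqn. (6).

```python
import math

def one_sided_distance(a, b):
    """Largest distance from a vertex of a to its nearest vertex of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def surface_dist(a, b):
    """Symmetric Hausdorff distance between two vertex sets (stand-in metric)."""
    return max(one_sided_distance(a, b), one_sided_distance(b, a))

def verify_instances(decoded, original, threshold):
    """Return (passed, failed) index lists; failures become unique components."""
    passed, failed = [], []
    for index, (dec, ori) in enumerate(zip(decoded, original)):
        error = surface_dist(dec, ori)          # Eqn. (5)
        (passed if error <= threshold else failed).append(index)
    return passed, failed

original = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
decoded = [[(0.0, 0.0, 0.01), (1.0, 0.0, 0.0)], [(0.0, 0.0, 0.5), (1.0, 0.0, 0.0)]]
passed, failed = verify_instances(decoded, original, threshold=0.1)
```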
E.6) Recognize Patterns (610)
As all patterns are compressed together in PB3DMC, the pattern decoder needs to recognize the patterns by separating the decoded “pattern” model into connected components and recovering an order that matches the encoding order of the patterns.
For the instance components of 605, instance transformation with respect to the corresponding pattern and the pattern ID are calculated, or recalculated if it has been calculated during the repetitive structure discovery, in the calculation unit 610. In order to reduce decoding error for the instance component, a decoded pattern 611 instead of the original pattern is used for the instance transformation calculation. The decoded pattern 611 is generated by a 3D mesh decoder 612 decoding the compressed pattern 607. The output of the calculation unit 610 is the instance information 613, which contains the pattern ID and the transformation matrix. The pattern ID needs to be calculated/registered when all the patterns are encoded together as disclosed above in a preferred embodiment of the 3D mesh encoder 608. During the encoding, all the pattern models are put on the same face and encoded using a mesh encoder as a normal 3D model. After the decoding by the 3D mesh decoder 612, the order of the patterns may be changed. In order to find the right pattern for the corresponding instance component, new pattern ID needs to be calculated to map the decoded pattern back to the original pattern. The transformation matrix in 613 can be decomposed into the reflection part, the rotation part, the translation part, and a possible scaling part. As mentioned before, PB3DMC directly compresses the transformation matrix. An example encoding scheme of the instance transformation matrix can be found in PCT application PCT/CN2011/082942, filed on Nov. 25, 2011, entitled Repetitive Structure Discovery based 3D Model Compression, the teachings of which are herein incorporated by reference in its entirety. According to one embodiment in PCT/CN2011/082942, the instance transformation matrix Inst_Transf is decomposed into four parts, a reflection part (Refle), a rotation part (Rotat1 or Rotat2), a translation part (Transl), and a possible scaling part, as shown below:
where, Rotat_Refle can be decomposed into
Rotat_Refle=Refle×[Rotat1]
or
Rotat_Refle=[Rotat2]×Refle.
The reflection part Refle can take the following value:
In a different implementation, Refle can also take the values of
if there is reflection transformation.
In this embodiment, the reflection part can be represented and compressed by a 1-bit flag. The rotation part (Rotat1 or Rotat2) is a 3×3 matrix and is compressed as three Euler angles (alpha, beta, gamma), which are first quantized and then compressed by an entropy codec. The translation part (Transl) is a 3-dimensional column vector, which is first quantized and then compressed by an entropy codec (elementary mode) or an octree-based (OT) encoder 614 (group mode), as will be disclosed later. The scaling part is represented by a uniform scaling factor of the instance and compressed by a lossless compression algorithm for floating point numbers.
It is apparent that in PB3DMC, the compression schemes of these sub-matrices/parts are different, and they also depend on an instance data packing mode specified by the user. Recall that each instance has two parts of data: the pattern ID and the transformation matrix. There are two packing modes for the instance data: an elementary mode and a group mode. In the elementary mode, the entire instance data for the instance components are encoded sequentially, i.e. (PID 1, trans 1), (PID 2, trans 2), . . . , (PID n, trans n), wherein PID x and trans x are the pattern ID and transformation matrix for component x, respectively, and x=1, . . . , n. In the group mode, PIDs for a group of instances are encoded together followed by the encoding of the transformation matrices for that group of instances, i.e. (PID 1, PID 2, . . . , PID n), (reflection 1, reflection 2, . . . , reflection n), (translation 1, translation 2, . . . , translation n), (rotation 1, rotation 2, . . . , rotation n), (scaling 1, scaling 2, . . . , scaling n). The details of the instance data packing mode are disclosed in PCT application PCT/CN2011/076991, filed on Jul. 8, 2011, entitled “Bitstream Syntax for Repetitive Structures, with Position, Orientation and Scale Separate for Each Instance”, the teachings of which are herein incorporated by reference in its entirety.
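The quantization of the Euler angles can be sketched as a uniform scalar quantizer. The angle range of [-π, π] for all three angles, the function names, and the 14-bit depth in the example are assumptions of this sketch; the entropy coding stage that follows quantization is not shown.

```python
import math

def quantize_angle(angle, bits, lo=-math.pi, hi=math.pi):
    """Map an angle in [lo, hi] to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    t = (angle - lo) / (hi - lo)
    return min(levels, max(0, round(t * levels)))

def dequantize_angle(code, bits, lo=-math.pi, hi=math.pi):
    """Inverse mapping used by the decoder (and by the encoder for verification)."""
    levels = (1 << bits) - 1
    return lo + (hi - lo) * code / levels

euler = (0.3, -1.2, 2.5)                      # (alpha, beta, gamma), sample values
codes = [quantize_angle(a, 14) for a in euler]
decoded_euler = [dequantize_angle(c, 14) for c in codes]
```

With this quantizer the reconstruction error of each angle is at most half a quantization step.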
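The two packing orders can be sketched as follows. The dictionary keys and function names are hypothetical, and the sketch shows only the ordering of the data, not the bitstream syntax of PCT/CN2011/076991.

```python
FIELDS = ("reflection", "translation", "rotation", "scaling")

def pack_elementary(instances):
    """Elementary mode: (PID 1, trans 1), (PID 2, trans 2), ..."""
    return [(inst["pid"], tuple(inst[f] for f in FIELDS)) for inst in instances]

def pack_group(instances):
    """Group mode: all PIDs, then all reflections, translations, rotations, scalings."""
    return [tuple(inst[f] for inst in instances) for f in ("pid",) + FIELDS]

instances = [
    {"pid": 0, "reflection": 0, "translation": (1, 2, 3), "rotation": (0.1, 0.2, 0.3), "scaling": 1.0},
    {"pid": 1, "reflection": 1, "translation": (4, 5, 6), "rotation": (0.4, 0.5, 0.6), "scaling": 0.5},
]
elementary = pack_elementary(instances)
group = pack_group(instances)
```

Grouping like fields together places similar values next to each other, which is what allows, for example, the reflection flags to be run-length coded as a block.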
In the PB3DMC codec, the instance translation part is encoded according to the data packing mode chosen by the user. If the group mode is selected, an octree-based (OT) encoder 614 as disclosed in m22771, 98th MPEG meeting proposal, entitled “Bitstream specification for repetitive features detection in 3D mesh coding” is used to encode the instance translation part and to generate the compressed group instance translation information 615. It is to be noted that during the decoding, the OT decoder may change the order of the instance translation parts for different instances, which would cause mismatch between the instance translation parts with other parts, such as rotation parts, and pattern IDs, and lead to decoding errors. To solve the problem, the OT decoder can be run at the encoder side to find out how the order of the instance translation parts for different instances is changed so that the pattern ID and other instance component information can be encoded in the same order in the compressed bitstream for a correct decoding of the instance component at the decoder. In a different embodiment, the instance translation part indices in the original instance order are inputted to the OT encoder 614, along with instance translation information. With such information, the OT encoder 614 can output the new indices of each instance translation parts according to the octree traversal order, i.e. the decoded translation parts order. The advantage of this embodiment is that there is no need to run the OT decoder at the encoder side.
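The second embodiment, in which the encoder reports the new instance order, can be sketched as follows. The traversal order of the OT encoder of m22771 is not reproduced here; as a labeled stand-in, this sketch orders quantized translation cells by Morton (Z-order) code, which corresponds to a depth-first octree traversal with a fixed child order. All function names are hypothetical.

```python
def morton_code(cell, bits):
    """Interleave the bits of an integer (x, y, z) cell index, x bit first."""
    code = 0
    for b in range(bits - 1, -1, -1):
        for axis in range(3):
            code = (code << 1) | ((cell[axis] >> b) & 1)
    return code

def traversal_order(cells, bits):
    """Original instance indices listed in the stand-in traversal order."""
    return sorted(range(len(cells)), key=lambda i: morton_code(cells[i], bits))

def reorder(values, order):
    """Apply the traversal order to any other per-instance data."""
    return [values[i] for i in order]

cells = [(3, 0, 0), (0, 0, 0), (0, 1, 0)]     # quantized translations, 2 bits per axis
pattern_ids = [7, 8, 9]
order = traversal_order(cells, bits=2)
reordered_ids = reorder(pattern_ids, order)
```

The same `order` would be applied to the rotation parts, reflection flags and scaling factors so that all parts of an instance stay aligned in the bitstream.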
If the elementary data packing mode is selected, the instance translation part goes through an n-bit quantization unit 616 to generate the compressed instance translation information 617.
The instance rotation part is encoded by a rotation encoder 618 to generate the compressed instance rotation information 619. The instance scaling factor is compressed without loss by a floating point encoder 620 to generate the compressed instance scaling information 621. An example floating point encoder can be found in Martin Isenburg, Peter Lindstrom, Jack Snoeyink, Lossless Compression of Floating-Point Geometry, Proceedings of CAD'3D, May 2004. Since the scaling factor affects every entry of the transformation matrix, lossless compression of the scaling factor would reduce the decoding error.
The instance reflection part, which is a one-bit flag, will be sent directly to the compressed instance information packing unit 622 to combine with other information. It can be further compressed when possible. For example, if the group packing mode is selected and the instance reflection flags for the instance components are combined together as described earlier, a run-length encoding or other entropy coding can be applied to compress the reflection flags to further reduce bitrate.
In a preferred embodiment of the present invention, in order to obtain a low decoding error, the numbers of bits assigned to the instance rotation parts, instance translation parts and patterns in the compressed bitstream have the following relationship: Bits_rotation≥Bits_translation≥Bits_pattern. For example, when quantizing the instance rotation parts, a higher number of bits, e.g. 14 bits, is assigned than for the quantization of the translation parts, e.g. 13 bits, which in turn is higher than the number of bits for quantizing the vertices of the patterns, e.g. 12 bits. Such a bit assignment can lead to a low decoding error, because an error in the rotation part can introduce a high decoding error, especially for a large component, whereas the translation and pattern errors typically remain the same after the transformation.
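The effect of a rotation error on a large component can be illustrated with a short calculation. The sketch assumes a single rotation angle quantized uniformly over a range of 2π, so that the angular error is at most half a step; a vertex at distance `radius` from the rotation axis then moves by at most the chord subtended by that error. The function name is hypothetical.

```python
import math

def max_rotation_displacement(radius, bits):
    """Worst-case vertex displacement caused by quantizing one rotation angle.

    Assumes uniform quantization over 2*pi; the chord length for an angular
    error e at distance r from the axis is 2 * r * sin(e / 2).
    """
    half_step = (2 * math.pi / ((1 << bits) - 1)) / 2
    return 2 * radius * math.sin(half_step / 2)

small_component = max_rotation_displacement(radius=1.0, bits=14)
large_component = max_rotation_displacement(radius=100.0, bits=14)
coarser_angles = max_rotation_displacement(radius=1.0, bits=12)
```

The displacement grows in proportion to the component size, whereas a translation quantization error shifts every vertex by the same amount regardless of size.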
With the compressed group instance translation information 615 (or the compressed instance translation information 617 depending on the data packing mode selected by the user), the compressed instance rotation information 619, the compressed instance scaling information 621, the instance reflection flag and the corresponding pattern ID, the instance information can be packed in the compressed instance information packing unit 622 to generate the compressed instance information 623. However, in a preferred embodiment, to control the decoding error, the compressed instance components are input into a verification process before sending to the packing unit 622.
The verification is performed by an instance component verification unit 628. The verification unit 628 takes the decoded patterns 611, the decoded instance translation part 625, the decoded instance rotation part 627, the original instance reflection flag from 613 (since it is compressed without loss) and the original instance scaling of 613 (since it is compressed without loss) as input to calculate a decoding error and compare it with a user-specified threshold as described elsewhere in this application. To generate the decoded instance rotation part 627, an orientation decoder 626, which comprises an entropy decoder and a de-quantization unit, is applied to the compressed instance rotation information 619. To generate the decoded instance translation part 625, an n-bit de-quantization unit 624 is applied to the compressed instance translation information 617 regardless of the data packing mode. Note that it may not be necessary to employ the OT encoder and decoder for the verification purpose. Only the quantization within the OT encoder affects the decoding error, and it can be modeled by the quantization of 616. The output of the verification unit 628 is the instance component verification results 629, which contain the verified instance index, for identifying the verified instance components, and the discarded instance components. The discarded instance components will not be compressed in the “pattern-instance” representation; they will be treated as unique components and sent to the 3D mesh encoder 608 for encoding.
For those instance components that pass the verification, their compressed instance information including compressed group instance translation information 615 (or compressed instance translation information 617), compressed instance rotation information 619, compressed instance scaling information 621, the reflection flag and the corresponding pattern ID will be sent to the compressed instance information packing unit 622 to generate the compressed instance information 623 which is further incorporated into the compressed bitstream 603. Note that if the group mode is selected, after verification, the OT encoder 614 will be used to encode the translation parts of all instances that pass the verification and generate the compressed group instance translation information and instance order as the input of the compressed instance information packing unit 622. The compressed instance information packing unit 622 will reorder the other instance information according to the instance order generated by OT encoder 614.
If the elementary mode is selected, a different embodiment of the above process is to use a loop of encoding followed by verification for each instance. If one instance passes the verification, its encoded information can be directly outputted to the compressed information packing unit 622. The advantage of this embodiment is that no buffer is needed.
Due to the verification process, it is possible that none of the instance components of a certain pattern passes the verification, which makes encoding the pattern meaningless. Therefore, it is desirable to encode only patterns that have at least one instance component passing the verification. According to another embodiment of this invention, all patterns which have at least one related instance that passes verification are encoded and then recognized to find out the correct order of the patterns, i.e. the correct pattern ID. The compressed patterns are then output to the compressed stream 603 after all instances have been verified. The pattern IDs of all instances passing verification will be reset accordingly before being output to the compressed stream.
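The pruning of patterns without verified instances can be sketched as follows. In the embodiment above the new pattern IDs result from recognizing the decoded patterns; this sketch instead assigns consecutive IDs to the surviving patterns in their original order, which is an assumption made only to keep the example short. The function name is hypothetical.

```python
def prune_patterns(instance_pattern_ids, verified):
    """Keep patterns with at least one verified instance and renumber them.

    Returns (kept_patterns, kept_instances), where kept_instances pairs each
    verified instance index with its new pattern ID.
    """
    kept_patterns = sorted(
        {pid for pid, ok in zip(instance_pattern_ids, verified) if ok}
    )
    new_id = {old: new for new, old in enumerate(kept_patterns)}
    kept_instances = [
        (index, new_id[pid])
        for index, (pid, ok) in enumerate(zip(instance_pattern_ids, verified))
        if ok
    ]
    return kept_patterns, kept_instances

# Pattern 1 has a single instance, which fails verification, so it is dropped.
kept_patterns, kept_instances = prune_patterns(
    instance_pattern_ids=[0, 0, 1, 2, 2],
    verified=[True, False, False, True, True],
)
```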
The PB3DMC decoder as shown in
Although preferred embodiments of the present invention have been described in detail herein, it is to be understood that this invention is not limited to these embodiments, and that other modifications and variations may be effected by one skilled in the art without departing from the scope of the invention as defined by the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN2012/070877 | 2/3/2012 | WO | 00 | 7/28/2014 |