The present disclosure relates to an information processing apparatus and method, and more particularly, to an information processing apparatus and method capable of suppressing a reduction in quality of attribute information of 3D data.
Conventionally, mesh has been used as 3D data representing an object having a three-dimensional shape. As a mesh compression method, a method of compressing a mesh by extending video-based point cloud compression (VPCC) has been proposed (see, for example, Non-Patent Document 1).
In the VPCC, the coordinates of each point are projected onto any of six surfaces of a bounding box to generate a geometry image, and the geometry image is 2D encoded. The attribute information corresponding to each point is mapped to the same position as the geometry image and 2D encoded as an attribute image.
Meanwhile, in the case of a mesh, attribute information having an arbitrary resolution and an arbitrary shape can be stored in the texture image for each polygon.
However, when the VPCC is extended and such a mesh is encoded as described above, the attribute information on the texture image is rearranged on the attribute image according to the geometry image, and the quality of the attribute information may be reduced.
The present disclosure has been made in view of such a situation, and an object thereof is to suppress a decrease in quality of attribute information of 3D data.
An information processing apparatus according to one aspect of the present technology is an information processing apparatus including: an image generation unit that generates a geometry image and an occupancy image by arranging a plurality of vertices on a two-dimensional plane so as to correspond to a UV map on the basis of position information of the plurality of vertices of a polygon and the UV map indicating a correspondence relationship between the plurality of vertices and a texture of the polygon; and an encoding unit that encodes the UV map, connection information indicating a connection relationship between the plurality of vertices, the geometry image, the occupancy image, and the texture of the polygon.
An information processing method according to one aspect of the present technology is an information processing method including: generating a geometry image and an occupancy image by arranging a plurality of vertices on a two-dimensional plane so as to correspond to a UV map on the basis of position information of the plurality of vertices of a polygon and the UV map indicating a correspondence relationship between the plurality of vertices and a texture of the polygon; and encoding the UV map, connection information indicating a connection relationship between the plurality of vertices, the geometry image, the occupancy image, and the texture of the polygon.
An information processing apparatus according to another aspect of the present technology is an information processing apparatus including: a decoding unit that decodes encoded data and generates a UV map indicating a correspondence relationship between a plurality of vertices of a polygon and a texture of the polygon, connection information indicating a connection relationship between the plurality of vertices, a geometry image in which the plurality of vertices is arranged on a two-dimensional plane so as to correspond to the UV map, an occupancy image corresponding to the geometry image, and a texture image in which the texture is arranged on the two-dimensional plane; and a reconstruction unit that reconstructs position information of the plurality of vertices in a three-dimensional space on the basis of the UV map, the geometry image, and the occupancy image.
An information processing method according to another aspect of the present technology is an information processing method including: decoding encoded data and generating a UV map indicating a correspondence relationship between a plurality of vertices of a polygon and a texture of the polygon, connection information indicating a connection relationship between the plurality of vertices, a geometry image in which the plurality of vertices is arranged on a two-dimensional plane so as to correspond to the UV map, an occupancy image corresponding to the geometry image, and a texture image in which the texture is arranged on the two-dimensional plane; and reconstructing position information of the plurality of vertices in a three-dimensional space on the basis of the UV map, the geometry image, and the occupancy image.
In an information processing apparatus and a method according to one aspect of the present technology, a geometry image and an occupancy image are generated by arranging a plurality of vertices on a two-dimensional plane so as to correspond to a UV map on the basis of position information of the plurality of vertices of a polygon and the UV map indicating a correspondence relationship between the plurality of vertices and a texture of the polygon, and the UV map, connection information indicating a connection relationship between the plurality of vertices, the geometry image, the occupancy image, and the texture of the polygon are encoded.
In an information processing apparatus and a method according to another aspect of the present technology, encoded data is decoded, a UV map indicating a correspondence relationship between a plurality of vertices of a polygon and a texture of the polygon, connection information indicating a connection relationship between the plurality of vertices, a geometry image in which the plurality of vertices is arranged on a two-dimensional plane so as to correspond to the UV map, an occupancy image corresponding to the geometry image, and a texture image in which the texture is arranged on the two-dimensional plane are generated, and position information of the plurality of vertices in a three-dimensional space is reconstructed on the basis of the UV map, the geometry image, and the occupancy image.
Modes for carrying out the present disclosure (hereinafter, referred to as embodiments) are hereinafter described. Note that description will be given in the following order.
The scope disclosed in the present technology includes, in addition to the contents described in the embodiments, contents described in the following Non-Patent Documents and the like known at the time of filing, contents of other documents referred to in the following Non-Patent Documents, and the like.
That is, the contents described in the above-described Non-Patent Documents, the contents of other documents referred to in the above-described Non-Patent Documents, and the like are also basis for determining the support requirement.
Conventionally, there has been 3D data, such as a point cloud, that represents a three-dimensional structure with point position information, attribute information, and the like.
For example, in a case of a point cloud, a three-dimensional structure (three-dimensional shaped object) is expressed as a set of a large number of points. The point cloud includes position information (also referred to as geometry) and attribute information (also referred to as attribute) of each point. The attribute can include any information. For example, color information, reflectance information, normal line information, and the like of each point may be included in the attribute. As described above, the point cloud has a relatively simple data structure, and can express any three-dimensional structure with sufficient accuracy by using a sufficiently large number of points.
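For reference, a point cloud can be modeled as nothing more than a collection of positions with attached attributes. The following minimal Python sketch illustrates that data structure; the class and field names are assumptions introduced only for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Point:
    """One element of a point cloud: a position (geometry) plus attributes."""
    x: float
    y: float
    z: float
    attributes: dict = field(default_factory=dict)  # e.g. color, reflectance, normal

# A point cloud is simply a (typically very large) set of such points.
cloud = [
    Point(0.0, 0.0, 0.0, {"color": (255, 0, 0)}),
    Point(0.1, 0.0, 0.2, {"color": (0, 255, 0), "reflectance": 0.4}),
]
```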
Video-based point cloud compression (VPCC) is one of such point cloud encoding techniques, and encodes point cloud data, which is 3D data representing a three-dimensional structure, using a codec for two-dimensional images.
In the VPCC, the geometry and attribute of a point cloud are each decomposed into small regions (also referred to as patches), and each patch is projected onto a projection plane that is a two-dimensional plane. For example, the geometry and the attribute are projected onto any of the six surfaces of the bounding box containing the object. The geometry and the attribute projected on the projection plane are also referred to as projection images. The patch projected on the projection plane is also referred to as a patch image.
For example, the geometry of a point cloud 1 illustrating an object of a three-dimensional structure illustrated in A of
The attribute of the point cloud 1 is also decomposed into patches 2 similarly to the geometry, and each patch is projected onto the same projection plane as the geometry. That is, a patch image of the attribute having the same size and the same shape as the patch image of the geometry is generated. Each pixel value of the patch image of the attribute indicates an attribute (color, normal vector, reflectance, and the like) of a point at the same position of the patch image of the corresponding geometry.
Then, each patch image thus generated is arranged in a frame image (also referred to as a video frame) of the video sequence. That is, each patch image on the projection plane is arranged on a predetermined two-dimensional plane.
For example, a frame image in which a patch image of geometry is arranged is also referred to as a geometry video frame. Furthermore, this geometry video frame is also referred to as a geometry image, a geometry map, or the like. A geometry image 11 illustrated in C of
In addition, the frame image in which the patch image of the attribute is arranged is also referred to as an attribute video frame. The attribute video frame is also referred to as an attribute image or an attribute map. An attribute image 12 illustrated in D of
Then, these video frames are encoded by an encoding method for a two-dimensional image, such as, for example, advanced video coding (AVC) or high efficiency video coding (HEVC). That is, point cloud data that is 3D data representing a three-dimensional structure can be encoded by using a codec for two-dimensional images. Generally, an encoder of 2D data is more widespread than an encoder of 3D data, and can be realized at low cost. That is, by applying the video-based approach as described above, an increase in cost can be suppressed.
Note that, in the case of such a video-based approach, an occupancy image (also referred to as an occupancy map) can also be used. The occupancy image is map information indicating the presence or absence of the projection image (patch image) for each N×N pixels of the geometry video frame and the attribute video frame. For example, the occupancy image indicates a region (N×N pixels) in which a patch image exists in a geometry image or an attribute image by a value “1”, and a region (N×N pixels) in which a patch image does not exist in the geometry image or the attribute image by a value “0”.
Such an occupancy image is encoded as data different from the geometry image and the attribute image and transmitted to the decoding side. Since the decoder can determine whether or not a region is a region where a patch exists by referring to this occupancy map, it is possible to suppress the influence of noise and the like caused by encoding and decoding, and to reconstruct the point cloud more accurately. For example, even if a depth value changes due to encoding and decoding, the decoder can ignore the depth value of a region where no patch image exists (that is, not process it as position information of the 3D data) by referring to the occupancy map.
For example, an occupancy image 13 as illustrated in E of
It should be noted that, similarly to the geometry video frame, the attribute video frame, and the like, this occupancy image can also be transmitted as a video frame. That is, similarly to the geometry and the attribute, the occupancy image is encoded by an encoding method for two-dimensional images such as AVC or HEVC.
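As an illustration of how such map information can be derived, the following sketch marks each N×N block of the frame as occupied when any of its pixels is covered by a patch image. The use of NumPy, the function name, and the per-pixel patch mask given as input are assumptions for this example.

```python
import numpy as np

def build_occupancy_image(patch_mask: np.ndarray, block: int = 4) -> np.ndarray:
    """Derive a block-wise occupancy image from a per-pixel patch mask.

    patch_mask: H x W array, nonzero where a patch image covers the pixel.
    block:      side length N of the N x N occupancy blocks.
    Returns an (H // block) x (W // block) array of 0/1 values.
    """
    h, w = patch_mask.shape
    occ = np.zeros((h // block, w // block), dtype=np.uint8)
    for by in range(occ.shape[0]):
        for bx in range(occ.shape[1]):
            tile = patch_mask[by * block:(by + 1) * block,
                              bx * block:(bx + 1) * block]
            # A block is marked "1" if any pixel in it belongs to a patch image.
            occ[by, bx] = 1 if np.any(tile) else 0
    return occ
```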
As described above, in the case of the VPCC, the geometry and the attribute of the point cloud are projected onto the same projection plane and are arranged at the same position in the frame image. That is, the geometry and the attribute of each point are associated with each other by their position in the frame image.
Meanwhile, as 3D data representing an object having a three-dimensional structure, for example, a mesh exists in addition to a point cloud. As illustrated in
For example, as illustrated in the lower part of
In the case of the 3D data using the mesh, unlike the case of the VPCC described above, the correspondence between each vertex 21 and the texture 23 is indicated by the UV map 34. Therefore, as in the example of
As a method of compressing 3D data using such a mesh, for example, a method of compressing 3D data using a mesh by extending the above-described VPCC has been proposed in Non-Patent Document 1 and the like.
However, in the case of this method, similarly to the case of the point cloud described above, the vertex information 31 and the like of the mesh are projected on the projection plane as geometry and then arranged in the geometry image. Then, the texture 23 is projected onto the same projection plane as the geometry as an attribute and then arranged in the attribute image so as to correspond to the geometry of the geometry image. That is, the texture 23 of each polygon is arranged at the same position in the attribute image as the position of the polygon corresponding to the texture 23 in the geometry image.
Therefore, for example, when the shape of a patch 41 of geometry is distorted in a geometry image 42 as illustrated in
More specifically, as illustrated in
In addition, as described above, since the positions in the geometry image and the attribute image correspond to each other, the texture needs to be the same size as the geometry, and it is difficult, for example, to enlarge or rotate the texture before arranging it.
Therefore, the quality of the texture (that is, the attribute information) may be reduced.
Therefore, when the VPCC is extended and 3D data using a mesh is encoded, the texture image is not generated to match the geometry image as in the VPCC of a point cloud, but the geometry image is generated to match the texture image. That is, as illustrated in the top row of the table in
For example, in an information processing method, a geometry image and an occupancy image are generated by arranging a plurality of vertices on a two-dimensional plane so as to correspond to a UV map on the basis of position information of the plurality of vertices of a polygon and the UV map indicating a correspondence relationship between the plurality of vertices and a texture of the polygon, and the UV map, connection information indicating a connection relationship between the plurality of vertices, the geometry image, the occupancy image, and the texture of the polygon are encoded.
For example, an information processing apparatus includes: an image generation unit that generates a geometry image and an occupancy image by arranging a plurality of vertices on a two-dimensional plane so as to correspond to a UV map on the basis of position information of the plurality of vertices of a polygon and the UV map indicating a correspondence relationship between the plurality of vertices and a texture of the polygon; and an encoding unit that encodes the UV map, connection information indicating a connection relationship between the plurality of vertices, the geometry image, the occupancy image, and the texture of the polygon.
For example, it is assumed that a texture image 101 illustrated in
By doing so, the texture can be encoded as the texture image (without being converted into the attribute image). That is, the texture can be encoded without distorting its shape. In addition, since it is not necessary to arrange the texture according to the geometry, for example, the texture can be enlarged or rotated to be arranged in the texture image. Therefore, it is possible to suppress a reduction in the quality of the texture (that is, the attribute information) of the 3D data.
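The following sketch illustrates, under simplifying assumptions, how a geometry image can be generated to match the texture image: the depth value of each vertex is written at the texel position that the UV map assigns to that vertex, so the texture itself is left untouched. The single-channel layout, the per-vertex data format, and the function name are assumptions for this example.

```python
import numpy as np

def build_texel_grid_geometry_image(uv_map, depths, height, width):
    """Place each vertex's depth value at the texel position given by the UV map.

    uv_map: list of (u, v) pixel coordinates, one entry per vertex (texel grid).
    depths: list of per-vertex depth values (distance from the projection plane).
    Returns a single-channel geometry image aligned with the texture image.
    """
    geometry = np.zeros((height, width), dtype=np.uint16)
    for (u, v), d in zip(uv_map, depths):
        # The vertex is stored at the same position as its texture sample,
        # so the texture itself never has to be rearranged.
        geometry[v, u] = d
    return geometry
```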
For example, in an information processing method, encoded data is decoded to generate a UV map indicating a correspondence relationship between a plurality of vertices of a polygon and a texture of the polygon, connection information indicating a connection relationship between the plurality of vertices, a geometry image in which the plurality of vertices is arranged on a two-dimensional plane so as to correspond to the UV map, an occupancy image corresponding to the geometry image, and a texture image in which the texture is arranged on the two-dimensional plane, and position information of the plurality of vertices in a three-dimensional space is reconstructed on the basis of the UV map, the geometry image, and the occupancy image.
For example, an information processing apparatus includes: a decoding unit that decodes encoded data and generates a UV map indicating a correspondence relationship between a plurality of vertices of a polygon and a texture of the polygon, connection information indicating a connection relationship between the plurality of vertices, a geometry image in which the plurality of vertices is arranged on a two-dimensional plane so as to correspond to the UV map, an occupancy image corresponding to the geometry image, and a texture image in which the texture is arranged on the two-dimensional plane; and a reconstruction unit that reconstructs position information of the plurality of vertices in a three-dimensional space on the basis of the UV map, the geometry image, and the occupancy image.
For example, the encoded data is decoded to generate (restore) the texture image 101 and the geometry image 102 generated such that the texture (hatched portion in the drawing) of the texture image 101 and the geometry (black portion in the drawing) of the geometry image 102 are located at the same position and have the same shape as each other as illustrated in
In this way, the encoded texture image can be decoded. That is, the texture can be decoded without its shape being distorted. In addition, the texture that has been enlarged or rotated and arranged in the texture image can also be decoded. Therefore, it is possible to suppress a reduction in the quality of the texture (that is, the attribute information) of the 3D data.
Note that the texture is arranged in the texture image without depending on the geometry. That is, in the texture image, the texture is arranged by a grid of texture (also referred to as a texel grid). Therefore, this texture image is also referred to as a texel grid image.
As described above, when the geometry image is generated in accordance with the texture image, the geometry image is also the image of the texel grid. That is, in the geometry image in this case, the geometry is arranged in a texel grid. Note that, in the case of the VPCC of the point cloud, in the geometry image, the geometry is arranged by a grid of geometry (also referred to as a voxel grid). As described above, by applying the present technology, a geometry image different from that in the case of the VPCC of the point cloud is generated.
In the case of generating the geometry image as described above, the vertices of the mesh are divided into patches, and projection is performed for each patch. In the case of generating such a patch, as illustrated in the second row from the top of the table in
Then, information indicating how the patch is generated may be transmitted from the encoding side to the decoding side. For example, information indicating how the patch is generated may be stored in the encoded data of the 3D data. By doing so, even if the encoding side generates the patch by various methods, the decoding side can correctly grasp the method.
As described above, when a single patch or a plurality of patches are set in the small region of the texture, as illustrated in the third row from the top of the table in
For example, a small region 110 of the texture illustrated in
When the projection plane (projection direction) is set, it may be set on the basis of the normal vector of the polygon as illustrated in the fourth row from the top of the table in
As illustrated in the fifth row from the top of the table in
Note that, in deriving the average of such normal vectors, patch division may be performed if necessary. For example, the variance of the normal vectors may also be derived, and patch division may be performed such that the variance becomes smaller than a predetermined threshold.
Note that, instead of referring to the normal vectors of all the polygons in the patch, the projection plane (projection direction) may be set on the basis of the normal vector of the representative polygon as illustrated in the sixth row from the top of the table in
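As a rough illustration of the normal-vector-based setting, the following sketch averages the face normals of the polygons in a patch and selects, among six axis-aligned candidate directions, the one closest to that average. The assumption that the candidate projection planes are axis-aligned, as well as the names used, are introduced only for this example.

```python
import numpy as np

def select_projection_axis(vertices, triangles):
    """Pick a projection direction for a patch from the average polygon normal.

    vertices:  (V, 3) array of vertex positions in the patch.
    triangles: (T, 3) array of vertex indices forming the patch's polygons.
    Returns (axis, sign): axis in {0, 1, 2} and sign in {+1, -1}, i.e. one of
    the six candidate directions (+-x, +-y, +-z).
    """
    normals = []
    for a, b, c in triangles:
        n = np.cross(vertices[b] - vertices[a], vertices[c] - vertices[a])
        norm = np.linalg.norm(n)
        if norm > 0:
            normals.append(n / norm)
    mean_n = np.mean(normals, axis=0)
    axis = int(np.argmax(np.abs(mean_n)))   # dominant component of the mean normal
    sign = 1 if mean_n[axis] >= 0 else -1   # orientation along that axis
    return axis, sign
```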
As illustrated in the seventh row from the top of the table in
For example, as illustrated in the eighth row from the top of the table in
Furthermore, as illustrated in the ninth row from the top of the table in
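The bounding-box-based setting could be sketched as follows: for each of the six faces of the patch's axis-aligned bounding box, the mean square of the depth values that projection onto that face would produce is evaluated, and the face with the smallest value is chosen. The exact error measure and the names used here are assumptions for this example.

```python
import numpy as np

def select_plane_by_depth_mse(vertices):
    """Choose the bounding-box face that minimizes the mean square depth.

    vertices: (V, 3) array of patch vertex positions.
    The depth of a vertex with respect to a face is its distance to that face;
    the face with the smallest mean square depth is used as the projection plane.
    """
    v_min = vertices.min(axis=0)
    v_max = vertices.max(axis=0)
    best = None
    for axis in range(3):
        for face, sign in ((v_min[axis], +1), (v_max[axis], -1)):
            depth = sign * (vertices[:, axis] - face)   # distance to this face
            mse = float(np.mean(depth ** 2))
            if best is None or mse < best[0]:
                best = (mse, axis, sign)
    _, axis, sign = best
    return axis, sign
```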
As illustrated in the tenth row from the top of the table in
In this case, a patch image of the geometry is arranged in the geometry image. The patch image is a patch projected onto the projection plane set as described above. The upper side of
Since such a patch image is arranged in the geometry image as described above, the distance (depth value) from the projection plane to the vertex in the three-dimensional space is stored in the pixel value at the position where each vertex is arranged in the geometry image. For example, this distance (depth value) may be stored as the luminance component of the pixel value at the position where each vertex is arranged in the geometry image.
As illustrated in the eleventh row from the top of the table in
As illustrated in the twelfth row from the top of the table in
In the case of decoding, as illustrated in
By transmitting the conversion information in this manner, the decoder can correctly perform conversion from the texel grid to the voxel grid (as inverse conversion of conversion from the voxel grid to the texel grid).
As illustrated in the thirteenth row from the top of the table in
As described above, by transmitting the offset value (dx, dy), the decoder can more easily perform coordinate conversion.
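A minimal sketch of how such an offset could be derived on the encoding side is shown below: for each vertex, (dx, dy) is the difference between its position on the projection plane (voxel grid) and its position in the texel-grid geometry image given by the UV map. The coordinate conventions and names are assumptions for this example.

```python
def derive_offsets(uv_positions, plane_positions):
    """Derive per-vertex conversion information (dx, dy).

    uv_positions:    (u, v) texel-grid position of each vertex (from the UV map).
    plane_positions: (x, y) position of each vertex on the projection plane
                     (voxel grid), i.e. the two coordinates orthogonal to the
                     projection direction.
    Returns a list of (dx, dy) offsets, one per vertex.
    """
    offsets = []
    for (u, v), (x, y) in zip(uv_positions, plane_positions):
        # Adding (dx, dy) to the texel-grid position recovers the voxel-grid
        # position on the projection plane at the decoder.
        offsets.append((x - u, y - v))
    return offsets
```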
In this case, the reconstruction of the vertex in the three-dimensional space is performed as follows. First, as illustrated in
Next, as illustrated in
Next, as illustrated in
In this manner, each vertex can be reconstructed into a three-dimensional space using the conversion information (offset value) and the depth value.
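The reconstruction described above could be sketched as follows, assuming that the projection plane of the patch is an axis-aligned plane identified by an axis, an orientation, and an origin coordinate; these parameters and the function name are assumptions for this example.

```python
def reconstruct_vertex(uv, offset, depth, axis, sign, plane_origin):
    """Reconstruct one vertex in 3D from texel-grid data.

    uv:           (u, v) position of the vertex in the geometry image (texel grid).
    offset:       (dx, dy) conversion information stored for the vertex.
    depth:        depth value stored in the luminance component.
    axis, sign:   projection direction of the patch (one of +-x, +-y, +-z).
    plane_origin: coordinate of the projection plane along 'axis'.
    """
    # 1. Texel grid -> voxel grid on the projection plane.
    px = uv[0] + offset[0]
    py = uv[1] + offset[1]
    # 2. Projection plane + depth value -> position in three-dimensional space.
    in_plane_axes = [a for a in range(3) if a != axis]
    position = [0.0, 0.0, 0.0]
    position[in_plane_axes[0]] = px
    position[in_plane_axes[1]] = py
    position[axis] = plane_origin + sign * depth
    return tuple(position)
```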
The specification of the conversion information is arbitrary, and is not limited to the above-described offset value. For example, as illustrated in the fourteenth row from the top of the table in
Furthermore, for example, as illustrated in
Furthermore, as illustrated in the fifteenth row from the top of the table in
By applying the scaling value in this manner, the magnitude of the offset value (dx, dy) to be stored can be reduced. Therefore, it is possible to suppress a decrease in encoding efficiency caused by storing the offset value (dx, dy).
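One possible interpretation of this scaling is sketched below: the encoder divides the offsets of a patch by a per-patch scale (sx, sy) before storing them, and the decoder multiplies them back. Whether rounding is applied, and the function names used here, are assumptions for this example.

```python
def scale_offsets(offsets, sx, sy):
    """Encoder side: shrink the offsets of one patch with a per-patch scale."""
    # Smaller stored values keep the color-component signal flatter, which is
    # friendlier to 2D video coding (the rounding in this sketch is lossy).
    return [(round(dx / sx), round(dy / sy)) for dx, dy in offsets]


def unscale_offsets(scaled, sx, sy):
    """Decoder side: recover (approximate) offsets from the scaled values."""
    return [(dx * sx, dy * sy) for dx, dy in scaled]
```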
A storage location of the conversion information is arbitrary. For example, as illustrated in the sixteenth row from the top of the table in
Furthermore, as illustrated in the seventeenth row from the top of the table in
Furthermore, at that time, similarly to the method described above in <Method 1-3-2-2>, a difference value of offset values between adjacent vertices may be derived, and the difference value may be stored as conversion information in a bitstream including encoded data of the geometry image or the texture image.
For example, as in a table illustrated in A of
For example, in B of
Specifications of the depth value and the conversion information to be transmitted are arbitrary. For example, as illustrated in the eighteenth row from the top of the table in
As illustrated in the bottom row of the table of
The various methods described above can be appropriately combined with other methods and applied.
Note that while
As illustrated in
Connectivity 331, vertex information 332, a UV map 333, and a texture 334 are supplied to the encoding device 300 as 3D data using the mesh.
The connectivity 331 is information similar to the connectivity 32 (
The mesh voxelization unit 311 acquires the vertex information 332 supplied to the encoding device 300. The mesh voxelization unit 311 converts coordinates of each vertex included in the acquired vertex information 332 into a voxel grid. The mesh voxelization unit 311 supplies the vertex information 332 of the voxel grid after the conversion to the patch generation unit 312.
The patch generation unit 312 acquires the connectivity 331 and the UV map 333 supplied to the encoding device 300. In addition, the patch generation unit 312 acquires vertex information 332 of the voxel grid supplied from the mesh voxelization unit 311. The patch generation unit 312 generates a patch of geometry on the basis of the information. In addition, the patch generation unit 312 projects a patch of the generated geometry onto the projection plane to generate a patch image. Further, the patch generation unit 312 generates conversion information for converting the coordinates of the vertices from the texel grid to the voxel grid.
The patch generation unit 312 supplies the connectivity 331 and the UV map 333 to the meta information encoding unit 315 as meta information. Furthermore, the patch generation unit 312 supplies the generated patch image and conversion information, and the UV map 333 to the geometry image generation unit 313. Further, the patch generation unit 312 supplies the generated patch image and the UV map 333 to the occupancy image generation unit 314.
The image generation unit 321 performs processing related to generation of an image (frame image). As described above in <Method 1>, the image generation unit 321 generates the geometry image and the occupancy image by arranging the plurality of vertices on the two-dimensional plane so as to correspond to the UV map on the basis of the position information of the plurality of vertices of the polygon and the UV map indicating the correspondence relationship between the plurality of vertices and the texture of the polygon. That is, the geometry image generation unit 313 generates the geometry image by arranging the plurality of vertices on the two-dimensional plane so as to correspond to the UV map on the basis of the position information of the plurality of vertices of the polygon and the UV map indicating the correspondence relationship between the plurality of vertices and the texture of the polygon. The occupancy image generation unit 314 similarly generates an occupancy image.
For example, the geometry image generation unit 313 acquires the patch image, the conversion information, and the UV map 333 supplied from the patch generation unit 312. The geometry image generation unit 313 generates a geometry image of the texel grid by arranging (the vertices included in) the patch images on the two-dimensional plane of the texel grid so as to correspond to the UV map 333. For example, as described above in <Method 1-3>, the geometry image generation unit 313 may arrange and store the depth value of each vertex included in the patch image at the same position in the geometry image of the texel grid as the texture corresponding to the vertex in the texture image on the basis of the acquired UV map 333. For example, the geometry image generation unit 313 may store the depth value of each vertex included in the patch image in the geometry image as a luminance component of the pixel. Furthermore, as described above in <Method 1-3-1>, the geometry image generation unit 313 may store the depth value or the like complemented from the adjacent vertex in the luminance component of the position (pixel) other than the vertex position in the patch of the geometry image.
Furthermore, as described above in <Method 1-3-2>, the geometry image generation unit 313 may store the conversion information in this geometry image. A storage location of the conversion information is arbitrary. For example, as described above in <Method 1-3-2-4>, the geometry image generation unit 313 may store the conversion information corresponding to each vertex as the color component (chrominance component) of the position (pixel) of the vertex indicated by the UV map 333 of the geometry image. In other words, the geometry image generation unit 313 generates the geometry image by arranging the depth value of the patch image in the luminance component at the position of each vertex indicated by the UV map 333 and arranging the conversion information in the color component (chrominance component). The geometry image generation unit 313 supplies the generated geometry image to the 2D encoding unit 316. Note that, as described above in <Method 1-3-2-5>, the conversion information may be encoded as information different from the geometry image. That is, the geometry image generation unit 313 may not store the conversion information in the geometry image. Furthermore, as described above in <Method 1-4>, the geometry image generation unit 313 may store the angle θ and (dx, dy) from the adjacent polygons in the geometry image instead of the depth value and the conversion information.
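As a concrete illustration of this layout, the following sketch writes the depth value into the luminance plane and the offset (dx, dy) into the two chrominance planes of a YUV 4:4:4 geometry image. The 8-bit sample depth and the bias of 128 used to represent signed offsets are assumptions for this example.

```python
import numpy as np

def pack_geometry_image(uv_map, depths, offsets, height, width):
    """Write depth into Y and the offset (dx, dy) into Cb/Cr at each vertex.

    Returns a (3, H, W) array laid out as YUV 4:4:4. Value-range handling
    (bit depth, clipping) is omitted in this sketch.
    """
    yuv = np.zeros((3, height, width), dtype=np.uint8)
    yuv[1:] = 128  # neutral chroma where no vertex is present
    for (u, v), d, (dx, dy) in zip(uv_map, depths, offsets):
        yuv[0, v, u] = np.uint8(d)          # luminance: depth value
        yuv[1, v, u] = np.uint8(128 + dx)   # Cb: x-offset (texel grid -> voxel grid)
        yuv[2, v, u] = np.uint8(128 + dy)   # Cr: y-offset
    return yuv
```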
The occupancy image generation unit 314 acquires the patch image and the UV map 333 supplied from the patch generation unit 312. On the basis of the acquired UV map 333 and patch image, the occupancy image generation unit 314 generates an occupancy image corresponding to the geometry image generated by the geometry image generation unit 313. The occupancy image generation unit 314 supplies the generated occupancy image to the 2D encoding unit 317.
The encoding unit 322 performs processing related to encoding. For example, the encoding unit 322 encodes the UV map 333, the connectivity 331, the geometry image, the occupancy image, and the texture 334. The meta information encoding unit 315 acquires the meta information (including the connectivity 331 and the UV map 333) supplied from the patch generation unit 312. The meta information encoding unit 315 encodes the acquired meta information to generate encoded data of the meta information. The meta information encoding unit 315 supplies the encoded data of the generated meta information to the multiplexing unit 319.
The 2D encoding unit 316 acquires the geometry image supplied from the geometry image generation unit 313. The 2D encoding unit 316 encodes the acquired geometry image by an encoding method for 2D images, and generates encoded data of the geometry image. The 2D encoding unit 316 supplies the encoded data of the generated geometry image to the multiplexing unit 319.
The 2D encoding unit 317 acquires the occupancy image supplied from the occupancy image generation unit 314. The 2D encoding unit 317 encodes the acquired occupancy image by an encoding method for 2D images, and generates encoded data of the occupancy image. The 2D encoding unit 317 supplies the multiplexing unit 319 with the encoded data of the generated occupancy image.
The 2D encoding unit 318 acquires the texture 334 supplied to the encoding device 300. The 2D encoding unit 318 encodes the acquired texture 334 (that is, the texture image) by an encoding method for 2D images, and generates encoded data of the texture image. The 2D encoding unit 318 supplies the encoded data of the generated texture image to the multiplexing unit 319.
The multiplexing unit 319 acquires the encoded data of the meta information supplied from the meta information encoding unit 315, the encoded data of the geometry image supplied from the 2D encoding unit 316, the encoded data of the occupancy image supplied from the 2D encoding unit 317, and the encoded data of the texture image supplied from the 2D encoding unit 318. The multiplexing unit 319 multiplexes the acquired information to generate one bitstream (bitstream of 3D data using mesh). The multiplexing unit 319 outputs the generated bitstream to the outside of the encoding device 300.
Note that these processing units (mesh voxelization unit 311 to multiplexing unit 319) have an arbitrary configuration. For example, each processing unit may be configured by a logic circuit that implements the above-described processing. Furthermore, it is also possible that each processing unit includes a central processing unit (CPU), a read only memory (ROM), a random access memory (RAM) and the like, for example, and executes a program by using them to realize the above-described processes. Of course, it is also possible that each processing unit has both configurations such that some of the above-described processes may be realized by the logic circuit and the others may be realized by execution of the program. The configurations of the processing units may be independent from each other and, for example, some processing units may implement a part of the above-described processing by the logic circuit, some other processing units may implement the above-described processing by executing the program, and still some other processing units may implement the above-described processing by both the logic circuit and the execution of the program.
The patch separation unit 351 acquires the connectivity 331 input to the encoding device 300, the vertex information 332 of the voxel grid supplied from the mesh voxelization unit 311, and the UV map 333 input to the encoding device 300. The patch separation unit 351 separates a vertex group indicated by the acquired vertex information 332 of the voxel grid into patches. That is, the patch separation unit 351 sets a patch for the vertex information 332 and sorts each vertex included in the vertex information 332 for each patch. A method of setting this patch is arbitrary. For example, as described above in <Method 1-1>, a small region of the texture (a region arranged as a cluster in the texture image) may be used as a patch, or the small region may be divided and used as patches. For example, as described above in <Method 1-2>, the patch separation unit 351 may use, as a patch, a region constituted by continuous (mutually adjacent) polygons in the small region. In addition, the patch separation unit 351 may perform enlargement, reduction, rotation, movement, and the like of the small region. The patch separation unit 351 supplies the vertex information 332, the connectivity 331, and the UV map 333 for each patch to the patch projection unit 352 and the offset derivation unit 353.
The patch projection unit 352 acquires the vertex information 332, the connectivity 331, and the UV map 333 for each patch supplied from the patch separation unit 351. The patch projection unit 352 projects the acquired vertex information 332 for each patch onto the projection plane corresponding to the patch, and derives the depth value d (distance from the projection plane to the vertex) of each vertex. A method of setting the projection plane (projection direction) is arbitrary. For example, as described above in <Method 1-2>, the patch projection unit 352 may set the projection plane (projection direction) on the basis of the vertices included in the patch. In addition, as described above in <Method 1-2-1>, the patch projection unit 352 may set the projection plane (projection direction) on the basis of the normal vector of the polygon. For example, the patch projection unit 352 may set the projection plane (projection direction) on the basis of the average of the normal vectors of the polygons as described above in <Method 1-2-1-1>. In addition, as described above in <Method 1-2-1-2>, the patch projection unit 352 may set the projection plane (projection direction) on the basis of the normal vector of the representative polygon in the patch. Further, the patch projection unit 352 may set the projection plane on the basis of the bounding box as described above in <Method 1-2-2>. For example, as described above in <Method 1-2-2-1>, the patch projection unit 352 may use the direction of the shortest side of the bounding box as the normal vector of the projection plane. Furthermore, as described above in <Method 1-2-2-2>, the patch projection unit 352 may calculate the mean square error of the depth value to each surface of the bounding box, and may use the surface with the minimum error as the projection plane. The patch projection unit 352 supplies the derived depth value d of each vertex, the connectivity 331, and the UV map 333 to the patch image generation unit 354.
As described above in <Method 1-3-2>, the offset derivation unit 353 generates conversion information for converting the position information in the geometry image of the texel grid into the position information in the geometry image of the voxel grid. This conversion information may be any information. For example, as described above in <Method 1-3-2-1>, the offset derivation unit 353 may derive the offset (dx, dy) as the conversion information. In that case, for example, the offset derivation unit 353 acquires the vertex information 332, the connectivity 331, and the UV map 333 for each patch supplied from the patch separation unit 351. Then, the offset derivation unit 353 derives an offset (dx, dy) on the basis of the acquired vertex information 332 for each patch and the UV map 333. That is, the offset derivation unit 353 may be regarded as a conversion information generation unit in the present disclosure. The offset derivation unit 353 supplies the derived conversion information (offset (dx, dy)) to the patch image generation unit 354.
Note that the conversion information derived by the offset derivation unit 353 may include a difference value between vertices of the offset (dx, dy), as described above in <Method 1-3-2-2>. Furthermore, as described above in <Method 1-3-2-3>, this conversion information may include a scale value (sx, sy) for each patch.
The patch image generation unit 354 acquires the depth value d of each vertex, the connectivity 331, and the UV map 333 for each patch supplied from the patch projection unit 352. In addition, the patch image generation unit 354 acquires the offset (dx, dy) (conversion information) supplied from the offset derivation unit 353. The patch image generation unit 354 generates a patch image on the basis of the depth value d of each vertex of each patch. In this patch image, at least the depth value d corresponding to each vertex is stored in the luminance component of the pixel at the position of that vertex. A depth value or the like complemented from adjacent vertices may be stored in the luminance component at positions (pixels) other than the vertices in the patch. Furthermore, the offset (dx, dy) corresponding to each vertex may be stored in the color component of the position (pixel) of that vertex.
The patch image generation unit 354 supplies the connectivity 331 and the UV map 333 to the meta information encoding unit 315 as meta information. In addition, the patch image generation unit 354 supplies the patch image, the depth value d of each vertex, the offset (dx, dy), and the UV map 333 to the geometry image generation unit 313. Note that the depth value d and the offset (dx, dy) of each vertex may be included in the patch image. In addition, the patch image generation unit 354 supplies the generated patch image and the UV map 333 to the occupancy image generation unit 314.
An example of a flow of an encoding process executed by the encoding device 300 will be described with reference to a flowchart in
When the encoding process is started, in step S301, the mesh voxelization unit 311 voxelizes the coordinates of each vertex included in the vertex information 332 to voxelize the mesh.
In step S302, the patch separation unit 351 generates a patch using the vertex information 332 and the like converted into the voxel grid, and sets a projection plane on which the generated patch is projected.
In step S303, the patch projection unit 352 derives the distance between each vertex and the projection plane set by the patch separation unit 351 in step S302.
In step S304, the offset derivation unit 353 derives an offset (dx, dy), which is a difference between the position of each vertex on the projection plane set by the patch separation unit 351 in step S302 and the position of each vertex in the geometry image of the texel grid, as the conversion information on the basis of the information such as the vertex information 332 and the UV map 333 converted into the voxel grid. The patch image generation unit 354 generates a patch image on the projection plane set by the patch separation unit 351 in step S302 on the basis of information such as the depth value d and the offset (dx, dy).
In step S305, the geometry image generation unit 313 generates the geometry image on the basis of the patch image generated by the patch image generation unit 354 in step S304 and the UV map 333, and stores the depth value and the conversion information at the vertex position. In addition, the occupancy image generation unit 314 generates an occupancy image corresponding to the geometry image.
In step S306, the 2D encoding unit 316 encodes the geometry image generated by the geometry image generation unit 313 in step S305, and generates encoded data of the geometry image.
In step S307, the 2D encoding unit 318 encodes a texture image that is the texture 334, and generates encoded data of the texture image.
In step S308, the 2D encoding unit 317 encodes the occupancy image generated by the occupancy image generation unit 314 in step S305, and generates encoded data of the occupancy image.
In step S309, the meta information encoding unit 315 encodes the meta information (the connectivity 331, the UV map 333, or the like) to generate encoded data of the meta information.
In step S310, the multiplexing unit 319 multiplexes the encoded data of the geometry image, the encoded data of the occupancy image, the encoded data of the texture image, and the encoded data of the meta information to generate one bitstream. The multiplexing unit 319 outputs the generated bitstream to the outside of the encoding device 300.
When the process of step S310 ends, the encoding process ends.
In such an encoding process, the image generation unit 321 of the encoding device 300 generates a geometry image and an occupancy image by arranging a plurality of vertices on a two-dimensional plane so as to correspond to the UV map 333 on the basis of the position information (vertex information 332) of the plurality of vertices of the polygon and the UV map 333 indicating the correspondence between the plurality of vertices and the texture of the polygon. Furthermore, the encoding unit 322 encodes the UV map 333, the connectivity 331 that is connection information indicating a connection relationship between a plurality of vertices, the geometry image, the occupancy image, and the texture 334 of the polygon.
In this way, the encoding device 300 can suppress the occurrence of distortion of the shape of the texture 334 in the two-dimensional imaging, and thus, can suppress the reduction in the quality of the attribute information of the 3D data.
At that time, the image generation unit 321 may store the depth value from the projection plane on which the plurality of vertices is projected to the vertex as the luminance component of the geometry image.
Furthermore, the offset derivation unit 353 may generate conversion information for converting the positions of the vertices in the geometry image into the positions on the projection plane. In this case, the encoding unit 322 may further encode the generated conversion information.
The offset derivation unit 353 may derive conversion information including difference information (that is, the offset (dx, dy)) indicating a difference between the position of the vertex in the geometry image and the position of the vertex in the projection plane.
The offset derivation unit 353 may derive conversion information including a difference value between vertices of the difference information (offset (dx, dy)).
Furthermore, the image generation unit 321 may store the conversion information in the color component of the geometry image.
The encoding unit 322 may encode the conversion information as data different from the geometry image.
The patch image generation unit 354 may generate a single patch or a plurality of patches using a plurality of vertices corresponding to a small region of a texture arranged on a two-dimensional plane as a continuous region, project each patch on a projection plane, and generate a patch image. Then, the image generation unit 321 may generate the geometry image and the occupancy image by arranging the patch images on a two-dimensional plane so that the plurality of vertices corresponds to the UV map 333.
The patch projection unit 352 may set the projection plane on which the patch is projected on the basis of the normal vector of the polygon corresponding to the patch.
Note that while
As illustrated in
The demultiplexing unit 411 acquires a bitstream input to the decoding device 400. This bitstream is generated, for example, by the encoding device 300 encoding 3D data using mesh.
The demultiplexing unit 411 demultiplexes the bitstream and generates each encoded data included in the bitstream. That is, the demultiplexing unit 411 extracts each encoded data from the bitstream by the demultiplexing. For example, the demultiplexing unit 411 extracts the encoded data of the meta information, and supplies the encoded data to the meta information decoding unit 412. Furthermore, the demultiplexing unit 411 extracts the encoded data of the geometry image and supplies the encoded data to the 2D decoding unit 413.
Furthermore, the demultiplexing unit 411 extracts the encoded data of the occupancy image and supplies the same to the 2D decoding unit 414. Furthermore, the demultiplexing unit 411 extracts the encoded data of the texture image and supplies the encoded data to the 2D decoding unit 415.
The meta information decoding unit 412 decodes the supplied encoded data of the meta information to generate the meta information. The meta information includes connectivity 431 and a UV map 432. The meta information decoding unit 412 outputs the generated connectivity 431 and UV map 432 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh. The meta information decoding unit 412 also supplies the generated UV map 432 to the patch reconstruction unit 417.
The 2D decoding unit 413 decodes the encoded data of the geometry image to generate the geometry image. The 2D decoding unit 413 supplies the generated geometry image to the vertex position derivation unit 416.
The 2D decoding unit 414 decodes the encoded data of the occupancy image to generate an occupancy image. The 2D decoding unit 414 supplies the generated occupancy image to the patch reconstruction unit 417.
The 2D decoding unit 415 decodes the encoded data of the texture image to generate a texture image (texture 434). The 2D decoding unit 415 outputs the generated texture image (texture 434) to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
The vertex position derivation unit 416 obtains the position of each vertex included in the supplied geometry image, and acquires the depth value d and the conversion information (offset (dx, dy)) stored at the position. The vertex position derivation unit 416 derives a vertex position of the voxel grid on the basis of the conversion information. The vertex position derivation unit 416 supplies information indicating the derived vertex position of the voxel grid to the patch reconstruction unit 417.
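A minimal sketch of this derivation is shown below: for each vertex position given by the UV map, the depth value is read from the luminance plane and the offset from the chrominance planes of the decoded geometry image. The YUV 4:4:4 layout and the bias of 128 for signed offsets are assumptions that mirror the encoder-side sketch above.

```python
def read_vertex_samples(geometry_yuv, uv_map):
    """Fetch the depth and conversion information stored for each vertex.

    geometry_yuv: decoded (3, H, W) geometry image (Y = depth, Cb/Cr = offsets).
    uv_map:       per-vertex (u, v) texel-grid positions.
    Returns a list of (depth, dx, dy) tuples, one per vertex.
    """
    samples = []
    for u, v in uv_map:
        d = int(geometry_yuv[0, v, u])
        dx = int(geometry_yuv[1, v, u]) - 128
        dy = int(geometry_yuv[2, v, u]) - 128
        samples.append((d, dx, dy))
    return samples
```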
The patch reconstruction unit 417 reconstructs a patch on the basis of the supplied information indicating the vertex positions of the voxel grid, the occupancy image, the UV map 432, and the like. The patch reconstruction unit 417 supplies the reconstructed patch to the vertex information reconstruction unit 418.
The vertex information reconstruction unit 418 reconstructs each vertex in the three-dimensional space on the basis of the supplied patch, and generates vertex information 433 including position information of each vertex in the three-dimensional space. The vertex information reconstruction unit 418 outputs the generated vertex information 433 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
An example of a flow of a decoding process executed by the decoding device 400 will be described with reference to a flowchart of
When the decoding process is started, in step S401, the demultiplexing unit 411 demultiplexes the bitstream input to the decoding device 400, and extracts the encoded data of the meta information, the encoded data of the geometry image, the encoded data of the occupancy image, and the encoded data of the texture image.
In step S402, the meta information decoding unit 412 decodes the encoded data of the meta information extracted by the demultiplexing unit 411 in step S401, and generates (restores) the meta information. The meta information includes the connectivity 431 and the UV map 432. The meta information decoding unit 412 outputs the generated connectivity 431 and UV map 432 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
In step S403, the 2D decoding unit 414 decodes the encoded data of the occupancy image extracted by the demultiplexing unit 411 in step S401 to generate (restore) the occupancy image.
In step S404, the 2D decoding unit 413 decodes the encoded data of the geometry image extracted by the demultiplexing unit 411 in step S401, and generates (restores) the geometry image. The depth value d is stored in the luminance component at the position where each vertex of the geometry image is arranged, and the conversion information (for example, offset (dx, dy)) is stored in the color component.
In step S405, the vertex position derivation unit 416 derives the vertex position of the voxel grid on the basis of the conversion information (for example, offset (dx, dy)) included in the geometry image generated in step S404.
In step S406, the patch reconstruction unit 417 reconstructs the patch on the basis of the UV map 432 generated by the meta information decoding unit 412 in step S402, the occupancy image generated by the 2D decoding unit 414 in step S403, and the information such as the vertex position of the voxel grid derived by the vertex position derivation unit 416 in step S405.
In step S407, the vertex information reconstruction unit 418 reconstructs the vertex information 433 on the basis of the information such as the patch reconstructed by the patch reconstruction unit 417 in step S406. The vertex information reconstruction unit 418 outputs the reconstructed vertex information 433 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
In step S408, the 2D decoding unit 415 decodes the encoded data of the texture image extracted by the demultiplexing unit 411 in step S401, and generates (restores) the texture image (texture 434). The 2D decoding unit 415 outputs the generated texture image (texture 434) to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
When the process of step S408 ends, the decoding process ends.
In such a decoding process, the decoding unit 421 of the decoding device 400 decodes the encoded data (bitstream), and generates the UV map indicating the correspondence between the plurality of vertices of the polygon and the texture of the polygon, the connection information indicating the connection relationship between the plurality of vertices, the geometry image in which the plurality of vertices is arranged so as to correspond to the UV map on the two-dimensional plane, the occupancy image corresponding to the geometry image, and the texture image in which the texture is arranged on the two-dimensional plane, and the vertex information reconstruction unit 418 reconstructs the position information of the plurality of vertices in the three-dimensional space on the basis of the UV map, the geometry image, and the occupancy image.
By doing so, the decoding device 400 can suppress the occurrence of the distortion of the shape of the texture 334 in the two-dimensional imaging, and thus, can suppress the reduction in the quality of the attribute information of the 3D data.
Note that depth values from a projection plane on which a plurality of vertices is projected to the vertices may be stored in the luminance component of the geometry image.
Furthermore, the decoding unit 421 may decode the encoded data (bitstream) to further generate conversion information for converting the positions of the vertices in the geometry image into the positions on the projection plane. Then, the vertex information reconstruction unit 418 may reconstruct the vertices on the projection plane from the geometry image using the occupancy image and the conversion information, and may reconstruct the position information of the plurality of vertices in the three-dimensional space using the vertices on the reconstructed projection plane and the depth values.
Note that this conversion information may include difference information indicating a difference between the position of the vertex in the geometry image and the position of the vertex in the projection plane.
Furthermore, this conversion information may include a difference value between vertices of the difference information.
Further, this conversion information may be stored in a color component of the geometry image.
Note that the decoding unit 421 may decode the encoded data and generate the conversion information encoded as data different from the geometry image.
The vertex information reconstruction unit 418 may reconstruct, from the geometry image, vertices on the projection plane on which a plurality of vertices corresponding to small regions of the texture arranged on a two-dimensional plane as continuous regions is projected for each patch, using the occupancy image and the conversion information.
Further, the vertex information reconstruction unit 418 may reconstruct the vertex on the projection plane set on the basis of the normal vector of the polygon corresponding to the patch.
In <Method 1-3> and the like, the distance (depth value) from the projection plane to the vertex in the three-dimensional space and the conversion information (for example, an offset value (dx, dy)) for restoring the depth value to the voxel grid are stored at the vertex position of the geometry image of the texel grid. That is, in this case, the position of the vertex is indicated by the depth value and the offset value (d, dx, dy).
The position information of the vertex is not limited to this example. For example, a bounding box (hereinafter, also referred to as patch bounding box) may be set for each patch, and the relative coordinates of the patch bounding box from the reference point may be used as the position information of the vertex. Then, similarly to the case of <Method 1-3> or the like, the relative coordinates may be stored in the geometry image.
For example, in an information processing apparatus (for example, an encoding device), the image generation unit may store the relative coordinates of the vertex from the reference point of the patch bounding box in the geometry image. Furthermore, in the information processing apparatus (for example, decoding device), the reconstruction unit may reconstruct the position information of the vertex in the three-dimensional space using the relative coordinates of the vertex from the reference point of the patch bounding box included in the geometry image. Note that the patch bounding box may be a three-dimensional region including a patch formed for each patch obtained by dividing 3D data using polygons.
For example, as illustrated on the left side of
A patch bounding box is formed for each patch. The shape of the patch bounding box is arbitrary. For example, the patch bounding box may be rectangular. In the case of the example of
In addition, one location of the patch bounding box 502 is set as a reference point. The position of the reference point is arbitrary. For example, a point at which each component of coordinates of the patch bounding box has a minimum value may be used as the reference point. In the case of the example of
Then, for each vertex, relative coordinates (dx, dy, dz) from the reference point are derived as position information. That is, the relative coordinates can also be said to be a difference vector having the reference point as a start point and the vertex as an end point. For example, in the case of
Then, as described above, the position information (relative coordinates from the reference point) may be stored in the geometry image. For example, in an information processing apparatus (for example, an encoding device), the image generation unit may store the relative coordinates of each vertex in the geometry image as a pixel value of a pixel corresponding to the vertex. In other words, for example, in a geometry image generated by decoding encoded data by a decoding unit of an information processing apparatus (for example, a decoding device), relative coordinates may be stored as pixel values of pixels corresponding to vertices. For example, as illustrated on the right side of
For example, in an information processing apparatus (for example, an encoding device), the image generation unit may store each component of the relative coordinates of each vertex in each component (for example, YUV 4:4:4) of the geometry image. In other words, for example, each component of the relative coordinates of each vertex may be stored in each component of a geometry image generated by decoding encoded data by a decoding unit of an information processing apparatus (for example, a decoding device). For example, the relative coordinate dx in the x-axis direction may be stored as the pixel value of the luminance component (Y). Further, the relative coordinate dy in the y-axis direction may be stored as the pixel value of the chrominance component (Cb). Furthermore, the relative coordinate dz in the z-axis direction may be stored as the pixel value of the chrominance component (Cr). Of course, the storage method is arbitrary and is not limited to this example.
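A minimal sketch of such a packing is shown below, assuming a YUV 4:4:4 geometry image represented as three planes of equal resolution and relative coordinates that fit within the chosen bit depth; the array layout and names are illustrative assumptions.

```python
import numpy as np

def pack_relative_coords(height, width, vertex_samples):
    """vertex_samples: iterable of (u, v, (dx, dy, dz)) entries on the texel grid."""
    geometry = np.zeros((3, height, width), dtype=np.uint16)  # planes: Y, Cb, Cr
    for u, v, (dx, dy, dz) in vertex_samples:
        geometry[0, v, u] = dx  # luminance component (Y)    <- dx
        geometry[1, v, u] = dy  # chrominance component (Cb) <- dy
        geometry[2, v, u] = dz  # chrominance component (Cr) <- dz
    return geometry
```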
The location of each vertex of a patch is limited to within its patch bounding box. Therefore, by representing the position information of a vertex by the relative coordinates from the reference point of the patch bounding box as described above, the range of values that each component of the position information can take is limited to at most the size (the length in each component direction) of the patch bounding box, which is a partial region of the three-dimensional space. That is, when the position information of the vertex is represented by the relative coordinates from the reference point of the patch bounding box, the change in pixel value in the spatial direction of the geometry image can be made gentler. Thus, a decrease in encoding efficiency can be suppressed.
The patch separation unit 551 is a processing unit similar to the patch separation unit 351, and executes processing related to separating 3D data (for example, a mesh) into patches. For example, the patch separation unit 551 may acquire the connectivity 331 input to the encoding device 300, the vertex information 332 of the voxel grid supplied from the mesh voxelization unit 311, and the UV map 333 input to the encoding device 300. In addition, the patch separation unit 551 may separate the vertex group indicated by the acquired vertex information 332 of the voxel grid into patches. That is, the patch separation unit 551 may set patches for the vertex information 332 and sort each vertex included in the vertex information 332 into a patch. A method of setting the patches is arbitrary. In addition, the patch separation unit 551 may perform enlargement, reduction, rotation, movement, and the like of the small region. Furthermore, the patch separation unit 551 may supply the vertex information 332, the connectivity 331, and the UV map 333 for each patch to the patch bounding box setting unit 552.
The patch bounding box setting unit 552 executes processing related to patch bounding box setting. For example, the patch bounding box setting unit 552 may acquire information (for example, vertex information 332 for each patch, connectivity 331, UV map 333, and the like) supplied from the patch separation unit 551. In addition, the patch bounding box setting unit 552 may set a patch bounding box for each patch.
In addition, the patch bounding box setting unit 552 may set a reference point of the patch bounding box, and derive, for each vertex in the patch, relative coordinates from the reference point as the position information. For example, the patch bounding box setting unit 552 may use a point (P0 (min_x, min_y, min_z)) at which each component of coordinates of the patch bounding box has a minimum value as the reference point.
In addition, the patch bounding box setting unit 552 may supply the information defining the patch bounding box set for each vertex and the derived position information (relative coordinates from the reference point) of each vertex to the patch information generation unit 553 together with the information and the like supplied from the patch separation unit 551.
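The following is a minimal sketch of the derivation described above for the patch bounding box setting unit 552: the axis-aligned box enclosing the vertices of a patch is taken as the patch bounding box, its minimum corner P0(min_x, min_y, min_z) is used as the reference point, and each vertex is expressed by its difference vector from P0. The function and variable names are illustrative assumptions.

```python
def derive_relative_coords(patch_vertices):
    """patch_vertices: list of (x, y, z) voxel-grid coordinates of one patch."""
    min_x = min(x for x, _, _ in patch_vertices)
    min_y = min(y for _, y, _ in patch_vertices)
    min_z = min(z for _, _, z in patch_vertices)
    reference_point = (min_x, min_y, min_z)               # P0 (min_x, min_y, min_z)
    relative_coords = [(x - min_x, y - min_y, z - min_z)  # difference vector (dx, dy, dz)
                       for x, y, z in patch_vertices]
    return reference_point, relative_coords
```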
The patch information generation unit 553 executes processing related to generation of patch information that stores information regarding a patch. For example, the patch information generation unit 553 may acquire information (for example, the vertex information 332 for each patch, the connectivity 331, the UV map 333, the position information of each vertex, the information defining the patch bounding box set for each vertex (for example, information indicating the position and size of the patch bounding box), and the like) supplied from the patch bounding box setting unit 552. In addition, the patch information generation unit 553 may generate patch information. In addition, the patch information generation unit 553 may store information or the like defining a patch bounding box in the patch information. In addition, the patch information generation unit 553 may supply the generated patch information to the patch image generation unit 554 together with the information and the like supplied from the patch bounding box setting unit 552.
The patch image generation unit 554 executes processing related to generation of a patch image. For example, the patch image generation unit 554 may acquire information (for example, vertex information 332 for each patch, connectivity 331, UV map 333, position information of each vertex, information defining a patch bounding box set for each vertex, patch information, and the like) supplied from the patch information generation unit 553. In addition, the patch image generation unit 554 may generate a patch image using these pieces of information. For example, the patch image generation unit 554 may set YUV 4:4:4 and generate a patch image for each component (Y, Cb, Cr).
In addition, the patch image generation unit 554 may store vertex position information (for example, relative coordinates from the reference point of the patch bounding box) in the patch image. For example, the patch image generation unit 554 may store the position information of the vertex as the pixel value of the pixel corresponding to the vertex of the patch image. In addition, the patch image generation unit 554 may store each component of the position information of each vertex in each component of the patch image.
In addition, the patch image generation unit 554 may supply the connectivity 331, the UV map 333, and the patch information to the meta information encoding unit 315 as the meta information. In addition, the patch image generation unit 554 may supply the patch image and the position information (for example, relative coordinates or the like from the reference point) of each vertex to the geometry image generation unit 313. Note that the position information of each vertex may be stored in the patch image. In addition, the patch image generation unit 554 may supply the generated patch image and the UV map 333 to the occupancy image generation unit 314.
With such a configuration, the encoding device 300 can suppress the occurrence of distortion of the shape of the texture 334 in the two-dimensional imaging, and thus, can suppress a reduction in the quality of the attribute information of the 3D data.
An example of a flow of an encoding process in this case will be described with reference to a flowchart in
When the encoding process is started, in step S501, the mesh voxelization unit 311 voxelizes the coordinates of each vertex included in the vertex information 332 to voxelize the mesh.
In step S502, the patch separation unit 551 generates a patch using the vertex information 332 and the like converted into the voxel grid, and sets a projection plane on which the generated patch is projected.
In step S503, the patch bounding box setting unit 552 sets a patch bounding box for each patch, sets a reference point for each patch bounding box, and derives a difference vector of each vertex. For example, the patch bounding box setting unit 552 may set, as the reference point, a point at which each component of coordinates of the patch bounding box has a minimum value.
In step S504, the patch information generation unit 553 generates patch information. For example, the patch information generation unit 553 may store information defining a patch bounding box in the patch information.
In step S505, the patch image generation unit 554 projects the patch onto the projection plane to generate a patch image, and stores a difference vector (relative coordinates from the reference point) as the pixel value of the vertex position.
In step S506, the geometry image generation unit 313 generates a geometry image on the basis of the patch image generated in step S505 and the UV map 333. At that time, the geometry image generation unit 313 stores position information (relative coordinates from the reference point) of each vertex in the geometry image. For example, the geometry image generation unit 313 may store position information (relative coordinates from the reference point) of each vertex in the geometry image as a pixel value of a pixel corresponding to the vertex. Furthermore, the geometry image generation unit 313 may store each component (each component of relative coordinates from the reference point) of the position information of each vertex in each component (Y, Cb, Cr) of the geometry image.
In step S507, the occupancy image generation unit 314 generates an occupancy image corresponding to the geometry image.
In step S508, the meta information encoding unit 315 encodes the meta information (connectivity 331, UV map 333, patch information, and the like) to generate the encoded data of the meta information.
In step S509, the 2D encoding unit 316 encodes the geometry image generated in step S506, and generates encoded data of the geometry image.
In step S510, the 2D encoding unit 317 encodes the occupancy image generated in step S507, and generates encoded data of the occupancy image.
In step S511, the 2D encoding unit 318 encodes a texture image that is the texture 334, and generates encoded data of the texture image.
In step S512, the multiplexing unit 319 multiplexes the encoded data of the geometry image, the encoded data of the occupancy image, the encoded data of the texture image, and the encoded data of the meta information to generate one bitstream. The multiplexing unit 319 outputs the generated bitstream to the outside of the encoding device 300.
When the process of step S512 ends, the encoding process ends.
By executing each processing in this manner, the encoding device 300 can suppress the occurrence of distortion of the shape of the texture 334 in the two-dimensional imaging, and thus, can suppress the reduction in the quality of the attribute information of the 3D data.
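Purely as an illustration, the encoding flow of steps S501 to S512 can be outlined as pseudocode-level Python as follows; the unit objects and their method names are assumptions introduced for this sketch.

```python
def encode(mesh, units):
    voxel_vertices = units.mesh_voxelization.voxelize(mesh.vertex_info)             # S501
    patches = units.patch_separation.separate(
        voxel_vertices, mesh.connectivity, mesh.uv_map)                              # S502
    boxes = units.patch_bbox_setting.set_boxes_and_difference_vectors(patches)       # S503
    patch_info = units.patch_info_generation.generate(patches, boxes)                # S504
    patch_images = units.patch_image_generation.project(patches, boxes)              # S505
    geometry_image = units.geometry_image_generation.generate(patch_images,
                                                              mesh.uv_map)           # S506
    occupancy_image = units.occupancy_image_generation.generate(geometry_image)      # S507
    meta_bits = units.meta_encoder.encode(mesh.connectivity, mesh.uv_map, patch_info)  # S508
    geo_bits = units.geo_encoder.encode(geometry_image)                              # S509
    occ_bits = units.occ_encoder.encode(occupancy_image)                             # S510
    tex_bits = units.tex_encoder.encode(mesh.texture)                                # S511
    return units.multiplexer.multiplex(geo_bits, occ_bits, tex_bits, meta_bits)      # S512
```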
The meta information decoding unit 412 decodes the supplied encoded data of the meta information to generate the meta information. The meta information includes connectivity 431, a UV map 432, and patch information. The meta information decoding unit 412 outputs the generated connectivity 431 and UV map 432 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh. The meta information decoding unit 412 also supplies the UV map 432 and the meta information to the patch reconstruction unit 417.
The patch reconstruction unit 417 reconstructs a patch on the basis of the supplied information such as the geometry image, the occupancy image, the UV map 432, and the patch information. The patch reconstruction unit 417 supplies the reconstructed patch to the vertex information reconstruction unit 418.
The vertex information reconstruction unit 418 reconstructs each vertex in the three-dimensional space on the basis of the supplied patch, and generates vertex information 433 including position information of each vertex in the three-dimensional space. The vertex information reconstruction unit 418 outputs the generated vertex information 433 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
That is, the patch reconstruction unit 417 and the vertex information reconstruction unit 418 reconstruct the position information of the plurality of vertices in the three-dimensional space on the basis of the UV map, the geometry image, and the occupancy image. The patch reconstruction unit 417 and the vertex information reconstruction unit 418 may further reconstruct the position information of each vertex in the three-dimensional space by using the relative coordinates of each vertex from the reference point of the patch bounding box included in the geometry image. That is, the patch reconstruction unit 417 and the vertex information reconstruction unit 418 can also be referred to as reconstruction units.
Note that the relative coordinates of each vertex from the reference point of the patch bounding box may be stored in the geometry image as a pixel value of a pixel corresponding to the vertex. For example, each component of the relative coordinates may be stored in each component of the geometry image. The patch reconstruction unit 417 may acquire the relative coordinates of each vertex stored in the geometry image in this manner.
In addition, the patch bounding box may be defined in the patch information. For example, information indicating the position and size of the patch bounding box may be stored in the patch information. For example, the meta information decoding unit 412 may decode the supplied encoded data of the meta information to generate the patch information (information indicating the position and size of the patch bounding box stored in the patch information).
The patch reconstruction unit 417 sets a patch bounding box for each patch on the basis of (information indicating the position and size of the patch bounding box stored in) the patch information. In addition, the patch reconstruction unit 417 sets a reference point of the patch bounding box. The reference point is an arbitrary point as described above. For example, a point at which each component of coordinates of the patch bounding box has a minimum value may be used as the reference point.
The patch reconstruction unit 417 derives the position information of each vertex in the three-dimensional space (that is, the three-dimensional coordinates of each vertex) using the set reference point and the position information (relative coordinates from the reference point) of each vertex, and reconstructs the patch.
The vertex information reconstruction unit 418 reconstructs the position information of each vertex in the three-dimensional space, that is, the vertex information, using the reconstructed patch.
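A minimal decoder-side sketch of this reconstruction, assuming that the relative coordinates are stored in the three components of the geometry image and that the patch information provides the reference point of each patch bounding box, is as follows; the data layouts and names are illustrative assumptions.

```python
def reconstruct_vertices(geometry_image, uv_map, patch_of_vertex, patch_info):
    """geometry_image: (3, H, W) numpy-like array holding dx, dy, dz per pixel.
    uv_map:          vertex id -> (u, v) pixel position in the geometry image.
    patch_of_vertex: vertex id -> patch id.
    patch_info:      patch id -> reference point (minimum corner of its bounding box).
    """
    vertices = {}
    for vertex_id, (u, v) in uv_map.items():
        dx, dy, dz = geometry_image[:, v, u]            # relative coordinates of the vertex
        ref_x, ref_y, ref_z = patch_info[patch_of_vertex[vertex_id]]
        vertices[vertex_id] = (ref_x + dx, ref_y + dy, ref_z + dz)
    return vertices
```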
With such a configuration, the decoding device 400 can suppress the occurrence of the distortion of the shape of the texture 334 in the two-dimensional imaging, and thus, can suppress the reduction in the quality of the attribute information of the 3D data.
An example of a flow of a decoding process in this case will be described with reference to a flowchart in
In step S552, the meta information decoding unit 412 decodes the encoded data of the meta information extracted from the bitstream by the demultiplexing unit 411 in step S551, and generates (restores) the meta information. The meta information includes connectivity 431, a UV map 432, and patch information. The meta information decoding unit 412 outputs the generated connectivity 431 and UV map 432 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh. Note that information indicating the position and size of the patch bounding box may be stored in the patch information. That is, the meta information decoding unit 412 may decode the encoded data and generate patch information storing information indicating the position and size of the patch bounding box.
In step S553, the 2D decoding unit 413 decodes the encoded data of the geometry image extracted from the bitstream by the demultiplexing unit 411 in step S551, and generates (restores) the geometry image. In this geometry image, position information of each vertex (relative coordinates from the reference point of the patch bounding box) is stored. For example, relative coordinates of vertices may be stored in the geometry image as pixel values of pixels corresponding to the vertices. Furthermore, each component of the relative coordinates of each vertex may be stored in each component of the geometry image. Note that the reference point of the patch bounding box may be, for example, a point at which each component of coordinates of the patch bounding box has a minimum value.
In step S554, the 2D decoding unit 414 decodes the encoded data of the occupancy image extracted from the bitstream by the demultiplexing unit 411 in step S551, and generates (restores) the occupancy image.
In step S555, the 2D decoding unit 415 decodes the encoded data of the texture image extracted from the bitstream by the demultiplexing unit 411 in step S551, and generates (restores) the texture image (texture 434). The 2D decoding unit 415 outputs the generated texture image (texture 434) to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
In step S556, the patch reconstruction unit 417 reconstructs the patch on the basis of the UV map 432 and the meta information generated by the meta information decoding unit 412 in step S552, the occupancy image generated by the 2D decoding unit 414 in step S554, and the like.
In step S557, the vertex information reconstruction unit 418 reconstructs the vertex information 433 on the basis of the information such as the patch reconstructed by the patch reconstruction unit 417 in step S556. The vertex information reconstruction unit 418 outputs the reconstructed vertex information 433 to the outside of the decoding device 400 as (data constituting) 3D data using the restored mesh.
When the process of step S557 ends, the decoding process ends.
By executing each processing as described above, the decoding device 400 can suppress the occurrence of the distortion of the shape of the texture 334 in the two-dimensional imaging, and thus, can suppress the reduction in the quality of the attribute information of the 3D data.
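Purely as an illustration, the decoding flow of steps S551 to S557 can be outlined as pseudocode-level Python as follows; the unit objects and their method names are assumptions introduced for this sketch.

```python
def decode(bitstream, units):
    streams = units.demultiplexer.demultiplex(bitstream)                            # S551
    connectivity, uv_map, patch_info = units.meta_decoder.decode(streams.meta)      # S552
    geometry_image = units.geo_decoder.decode(streams.geometry)                     # S553
    occupancy_image = units.occ_decoder.decode(streams.occupancy)                   # S554
    texture = units.tex_decoder.decode(streams.texture)                             # S555
    patches = units.patch_reconstruction.reconstruct(
        geometry_image, occupancy_image, uv_map, patch_info)                        # S556
    vertex_info = units.vertex_reconstruction.reconstruct(patches)                  # S557
    return connectivity, uv_map, vertex_info, texture
```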
As described above, when the geometry is projected onto a two-dimensional plane by the texel grid, a plurality of vertices may be projected onto the same pixel (u, v) of the geometry image. A plurality of vertices projected onto the same pixel of the geometry image in this manner is also referred to as overlapping points. However, only the position information of one vertex can be stored in each pixel of the geometry image. Therefore, when such overlapping points exist, there is a possibility that a vertex whose position information cannot be stored in the geometry image occurs.
Therefore, when the position information cannot be stored in the geometry image as described above, the position information may be stored in patch information or the like. For example, in an information processing apparatus (for example, an encoding device), when a plurality of vertices corresponds to one pixel in a geometry image, the image generation unit may store position information of a representative vertex among the plurality of vertices in the geometry image. Then, the information processing apparatus (for example, an encoding device) may further include a patch information generation unit that generates patch information that is information regarding a patch obtained by dividing the 3D data using the polygon, and stores position information of vertices other than the representative vertex among the plurality of vertices in the patch information.
Furthermore, in the information processing apparatus (for example, decoding device), the reconstruction unit may reconstruct the three-dimensional position information for a plurality of vertices corresponding to one pixel of the geometry image by using the position information stored in the geometry image and the position information stored in the patch information that is information regarding the patch obtained by dividing the 3D data using the polygon.
For example, as illustrated in
At the time of decoding, as illustrated in
Note that, for example, in an information processing apparatus (for example, an encoding device), the patch information generation unit may store the position information of each vertex other than the representative vertex among the plurality of vertices in the patch information in association with the identification information of the vertex. In other words, in the patch information generated by decoding the encoded data by the decoding unit of the information processing apparatus (for example, decoding device), the position information of the vertex may be stored in association with the identification information of the vertex.
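A minimal sketch of this handling of overlapping points is shown below: when several vertices fall on the same pixel (u, v), one representative keeps its position in the geometry image and the remaining vertices are recorded in the patch information keyed by their identification information. The choice of representative and the data layout are assumptions for illustration.

```python
def split_overlapping_points(vertex_samples):
    """vertex_samples: list of (vertex_id, (u, v), position) entries of one patch."""
    in_geometry_image = {}  # (u, v) -> (vertex_id, position) stored as the pixel value
    in_patch_info = {}      # vertex_id -> position stored in the patch information
    for vertex_id, uv, position in vertex_samples:
        if uv not in in_geometry_image:
            # The first vertex at this pixel acts as the representative vertex.
            in_geometry_image[uv] = (vertex_id, position)
        else:
            # Overlapping point: its position goes into the patch information,
            # associated with its identification information (vertex_id).
            in_patch_info[vertex_id] = position
    return in_geometry_image, in_patch_info
```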
For example, as illustrated in
Furthermore, the position information of the vertex may be any information. For example, as illustrated in
Furthermore, the position information of a vertex may include a depth value (d) from the projection plane on which the vertex is projected to the vertex, and difference information (dx, dy) indicating a difference between the position of the vertex in the geometry image and the position of the vertex in the projection plane. That is, the position information of the vertex may be (d, dx, dy).
For example, in the encoding device 300, when a plurality of vertices corresponds to one pixel of the geometry image, the geometry image generation unit 313 may store the position information of the representative vertex among the plurality of vertices in the geometry image. Then, the patch information generation unit 553 may generate patch information that is information regarding a patch obtained by dividing the 3D data using the polygon, and store the position information of vertices other than the representative vertex among the plurality of vertices in the patch information.
For example, in step S506 of the encoding process, the geometry image generation unit 313 may generate a geometry image and store position information (relative coordinates from the reference point) of a representative vertex among a plurality of vertices corresponding to the same pixel of the geometry image, in the geometry image. Then, in step S504, the patch information generation unit 553 may generate patch information, and store the position information of vertices other than the representative vertex among the plurality of vertices corresponding to the same pixel of the geometry image in the patch information.
In this way, the encoding device 300 can transmit the position information of all the vertices that become the overlapping points to the decoding side.
For example, in the decoding device 400, the patch reconstruction unit 417 may reconstruct a patch for a plurality of vertices corresponding to one pixel of the geometry image by using the position information stored in the geometry image and the position information stored in the patch information that is information regarding the patch obtained by dividing the 3D data using the polygon, and the vertex information reconstruction unit 418 may reconstruct the three-dimensional position information by using the patch.
For example, in step S556 of the decoding process, for a plurality of vertices corresponding to one pixel of the geometry image, the patch reconstruction unit 417 may reconstruct the patch by using the position information stored in the geometry image and the position information stored in the patch information that is the information regarding the patch obtained by dividing the 3D data using the polygon. Then, in step S557, the vertex information reconstruction unit 418 may reconstruct the three-dimensional position information using the patch.
By doing so, the decoding device 400 can restore the position information of all the vertices that become the overlapping points.
As described above, when the geometry is projected onto the two-dimensional plane by the texel grid, a patch of the geometry is generated on the basis of the shape of the input texture. The geometry image is encoded by the 2D encoding unit, but the 2D encoding unit generally has a restriction on the data length (bit length), such as 8 bits or 10 bits. However, for example, when the depth value from the projection plane to the vertex is stored in the geometry image, since there is no restriction on the depth value itself, the range of values that the depth value can take may exceed the data-length restriction of the 2D encoding unit. The same applies to decoding: the 2D decoding unit generally has a similar data-length (bit-length) constraint, such as 8 bits or 10 bits, which the range of depth values may exceed. The same also applies to the difference information. As described above, when the pixel value of the geometry image exceeds the restriction of the 2D encoding unit and the 2D decoding unit, it may be difficult to correctly encode and decode the geometry image.
Therefore, in such a case, by dividing the patch, the pixel value of the geometry image can be prevented from exceeding the constraints of the 2D encoding unit and the 2D decoding unit. For example, in the graph on the left side of
However, in the v3c-mesh, since the three vertices constituting a triangle of the mesh are treated as one set, it is necessary to create new vertices when a patch is divided. On the other hand, as described above, when the geometry image is generated so as to match the texture image (also referred to as texel-based projection), the shape of the patch image before the division cannot be changed. For example, it is assumed that the patch image 621 illustrated on the right side of
Therefore, for a vertex located at the boundary between the divided patches, position information as a vertex of any one of the patches is stored in the geometry image. Then, when the vertex information is reconstructed (when the three-dimensional position information is reconstructed), it is assumed that the vertex exists in both patches, and the position information stored in the geometry image is applied.
For example, an information processing apparatus (for example, an encoding device) may further include a patch image generation unit that generates a single or a plurality of patches using a plurality of vertices corresponding to small regions of texture arranged on a two-dimensional plane as a continuous region, projects each patch on a projection plane, and generates a patch image. Then, when the patch image generation unit generates a plurality of patches for one small region of the texture at the time of generating the patch image, the patch image generation unit may derive the position information of the boundary vertex located at the boundary of the patch in any one patch. Then, the image generation unit may generate the geometry image and the occupancy image by arranging the patch image generated as described above on a two-dimensional plane so that the plurality of vertices corresponds to the UV map.
Furthermore, for example, in an information processing apparatus (for example, a decoding device), the reconstruction unit may reconstruct the three-dimensional position information of the boundary vertex located at the boundary of a plurality of patches corresponding to one small region of the texture arranged on the two-dimensional plane as a continuous region using the position information derived in any one of the plurality of patches.
For example, in the case of the patch image 621 in
By doing so, patch division can be realized without changing the original patch shape on the encoding side, and each divided patch can be correctly reconstructed in the three-dimensional space on the decoding side.
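The following sketch illustrates only the boundary-vertex rule described above, assuming that sub-patch membership has already been determined and that the position information of a boundary vertex is derived in the first sub-patch, in processing order, that contains it; the names and dictionary-based layout are illustrative assumptions.

```python
def derive_positions_after_division(sub_patches, derive_position):
    """sub_patches: list of vertex-id lists, one per divided patch, in processing order.
    derive_position: callable (patch_index, vertex_id) -> position information."""
    stored = {}  # vertex_id -> position information written for that vertex
    for patch_index, vertex_ids in enumerate(sub_patches):
        for vertex_id in vertex_ids:
            if vertex_id in stored:
                # Boundary vertex shared with an earlier sub-patch: its position
                # information has already been derived there and is reused.
                continue
            stored[vertex_id] = derive_position(patch_index, vertex_id)
    return stored
```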
Here, to which of the patches forming the boundary the vertex on the boundary is assigned is arbitrary. For example, information indicating to which patch the vertex on the boundary has been assigned may be provided from the encoding side to the decoding side. For example, this information may be stored in the patch information.
For example, an information processing apparatus (for example, an encoding device) may further include a patch information generation unit that generates patch information that is information regarding a patch, and stores information indicating in which patch the position information of the boundary vertex located at the boundary of the patch has been derived. Furthermore, for example, in an information processing apparatus (for example, a decoding device), the reconstruction unit may reconstruct the three-dimensional position information using the position information derived in the patch indicated in the patch information that is information regarding the patch.
In addition, each of the divided patches may be processed in a predetermined processing order, and the patch to which a vertex on the boundary belongs may be determined according to the processing order. For example, the vertex on the boundary may be treated as a vertex of the patch processed first, the patch processed n-th, or the patch processed last.
For example, in an information processing apparatus (for example, an encoding device), the patch image generation unit may derive the position information of the boundary vertices in the patch selected on the basis of the patch processing order. Furthermore, for example, in an information processing apparatus (for example, a decoding device), the reconstruction unit may reconstruct the three-dimensional position information using the position information derived in a predetermined patch according to the patch processing order.
For example, in the encoding device 300, the patch image generation unit 554 may generate a single patch or a plurality of patches using a plurality of vertices corresponding to a small region of a texture arranged on a two-dimensional plane as a continuous region, and project each patch on a projection plane to generate a patch image. Furthermore, at that time, when the patch image generation unit 554 generates a plurality of patches for one small region of the texture, the position information of the boundary vertex located at the boundary of the patch may be derived in any one patch. Then, the geometry image generation unit 313 may generate the geometry image and the occupancy image by arranging the patch images on a two-dimensional plane so that the plurality of vertices corresponds to the UV map.
For example, in step S505 of the encoding process, the patch image generation unit 554 may generate a single or a plurality of patches using a plurality of vertices corresponding to a small region of a texture arranged on a two-dimensional plane as a continuous region, and project each patch on a projection plane to generate a patch image. Furthermore, at that time, when the patch image generation unit 554 generates a plurality of patches for one small region of the texture, the position information of the boundary vertex located at the boundary of the patch may be derived in any one patch. Then, in step S506, the geometry image generation unit 313 may generate the geometry image by arranging the patch image on a two-dimensional plane so that the plurality of vertices corresponds to the UV map.
By doing so, the encoding device 300 can implement patch division without changing the original patch shape.
For example, in the decoding device 400, the patch reconstruction unit 417 may reconstruct the three-dimensional position information of the boundary vertex located at the boundary of the plurality of patches corresponding to one small region of the texture arranged on the two-dimensional plane as the continuous region using the position information derived in any one of the plurality of patches, and the vertex information reconstruction unit 418 may reconstruct the three-dimensional position information using the plurality of patches.
For example, in step S556 of the decoding process, the patch reconstruction unit 417 may reconstruct the three-dimensional position information of the boundary vertex located at the boundary of the plurality of patches corresponding to one small region of the texture arranged on the two-dimensional plane as the continuous region using the position information derived in any one of the plurality of patches. Then, in step S557, the vertex information reconstruction unit 418 may reconstruct the three-dimensional position information using the plurality of patches.
By doing so, the decoding device 400 can correctly reconstruct each divided patch into the three-dimensional space.
Note that, in this case, the position information of the vertex may be any parameter. For example, it may include relative coordinates from the reference point of the patch bounding box of that vertex. That is, the position information of the vertex may be (dx, dy, dz). Furthermore, the position information of a vertex may include a depth value (d) from the projection plane on which the vertex is projected to the vertex, and difference information (dx, dy) indicating a difference between the position of the vertex in the geometry image and the position of the vertex in the projection plane. That is, the position information of the vertex may be (d, dx, dy).
When the position information of the vertex includes the relative coordinates from the reference point of the patch bounding box of the vertex (for example, when the position information of the vertex is (dx, dy, dz)), as described above in <5-1. Another Example of Position Information>, the pixel value change in the spatial direction in the geometry image can be expected to be gentler than when the position information of the vertex is (d, dx, dy). In other words, the width (range) of the value that can be taken by each component of the position information can be narrowed. Therefore, the possibility of performing patch division for preventing the value of the position information from exceeding the constraints of the encoding unit and the decoding unit as described above is reduced. That is, the possibility of reducing the number of patches increases.
For example, as illustrated in
As described above, by setting the position information of the vertex to include the relative coordinates from the reference point of the patch bounding box of the vertex (for example, the position information of the vertex is set to (dx, dy, dz)), the possibility of reducing the number of patches increases, so that suppression of reduction in encoding efficiency can be expected.
In the case of the texel-based projection, the geometry may be shifted not only in the normal direction of the projection plane but also in the projection plane direction due to the compression distortion. That is, compression distortion of the geometry may occur in three axial directions. Therefore, there is a possibility that the texture is shifted (so-called color shift occurs) with respect to the geometry.
Therefore, recoloring processing may be performed in which the texture corresponds to the geometry with the compression distortion. For example, the information processing apparatus (for example, an encoding device) may further include a decoding unit that decodes the encoded data of the geometry image generated by the encoding unit, and a recoloring processing unit that executes texture recoloring processing using the geometry image generated by the decoding unit. Then, the encoding unit may encode the texture after the recoloring processing.
The 2D encoding unit 316 supplies the encoded data of the generated geometry image not only to the multiplexing unit 319 but also to the 2D decoding unit 651.
The 2D decoding unit 651 performs processing related to decoding of the encoded data of the geometry image. For example, the 2D decoding unit 651 may acquire the encoded data of the geometry image supplied from the 2D encoding unit 316. Furthermore, the 2D decoding unit 651 may decode the encoded data to generate (restore) a geometry image. The generated (restored) geometry image includes compression distortion. The 2D decoding unit 651 supplies the generated (restored) geometry image to the recoloring processing unit 652.
The recoloring processing unit 652 performs processing related to the recoloring processing. For example, the recoloring processing unit 652 acquires the geometry image supplied from the 2D decoding unit 651. This geometry image is data generated (restored) by the 2D decoding unit 651 and includes compression distortion. The recoloring processing unit 652 also acquires the data of the mesh input to the encoding device 300 (the data enclosed by the dotted line 653). That is, the recoloring processing unit 652 acquires the connectivity 331, the vertex information 332, the UV map 333, and the texture 334. The recoloring processing unit 652 executes recoloring processing using the acquired information, and aligns the texture 334 with the geometry containing the compression distortion. Note that the method of the recoloring processing is arbitrary. For example, Meshlab's “Transfer: Vertex Attributes to Texture (1 or 2 meshes)” filter may be applied. The recoloring processing unit 652 supplies the texture 334 after the recoloring processing to the 2D encoding unit 318.
The texture 334 after the recoloring processing is encoded by the 2D encoding unit 318, and the encoded data is multiplexed with the other data by the multiplexing unit 319. That is, the texture 334 after the recoloring processing is provided to the decoding side. Therefore, since the decoding device obtains a texture that has been aligned with the geometry containing the compression distortion, the occurrence of a so-called color shift can be suppressed.
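A minimal sketch of this recoloring path is shown below: the geometry image is encoded, locally decoded (and therefore contains compression distortion), the texture is realigned to that distorted geometry, and the realigned texture is what is 2D-encoded. The function names are placeholders, and recolor() stands in for any recoloring method such as the filter mentioned above.

```python
def encode_texture_with_recoloring(units, geometry_image, mesh):
    geo_bits = units.geo_encoder.encode(geometry_image)
    # Local decoding reproduces the geometry the decoder will see,
    # including its compression distortion.
    distorted_geometry = units.geo_decoder.decode(geo_bits)
    # Realign the input texture to the distorted geometry (recoloring).
    aligned_texture = units.recoloring.recolor(
        mesh.connectivity, mesh.vertex_info, mesh.uv_map, mesh.texture,
        distorted_geometry)
    tex_bits = units.tex_encoder.encode(aligned_texture)
    return geo_bits, tex_bits
```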
An example of a flow of an encoding process in this case will be described with reference to a flowchart in
In step S609, the 2D decoding unit 651 decodes the encoded data of the geometry image generated in step S607, and generates (restores) the geometry image. This geometry image includes compression distortion.
In step S610, the recoloring processing unit 652 executes recoloring processing using the geometry image generated (restored) in step S609 and the mesh data input to the encoding device 300. That is, the alignment of the texture 334 with respect to the geometry with the compression distortion is executed.
In step S611, the 2D encoding unit 318 encodes the texture image after the recoloring processing obtained by the process in step S610, and generates the encoded data. In step S612, the multiplexing unit 319 multiplexes the encoded data of the texture image after the recoloring processing generated in step S611 with the encoded data of the meta information, the encoded data of the geometry image, and the encoded data of the occupancy image to generate a bitstream.
When the process of step S612 ends, the encoding process ends.
By executing the encoding process in this manner, the encoding device 300 can provide the decoding side with a texture that has been aligned with the geometry containing the compression distortion. Therefore, since the decoding device obtains such a texture, the occurrence of a so-called color shift can be suppressed.
Although the case where the present technology is applied to mesh encoding/decoding has been described above, the present technology is not limited to these examples, and can be applied to encoding/decoding of 3D data of an arbitrary standard. That is, as long as there is no contradiction with the present technology described above, specifications of various processes such as an encoding/decoding method and various types of data such as 3D data and metadata are arbitrary. Furthermore, in so far as there is no conflict with the present technology, part of the above-described processing or specifications may be omitted.
The above-described series of processing can be executed by hardware or software. When the series of processing is executed by software, a program constituting the software is installed in a computer. Here, the computer includes a computer incorporated in dedicated hardware, a general-purpose personal computer capable of executing various functions by installing various programs, and the like, for example.
In a computer 900 illustrated in the figure, a central processing unit (CPU) 901, a read only memory (ROM) 902, and a random access memory (RAM) 903 are mutually connected via a bus 904.
An input/output interface 910 is also connected to the bus 904. An input unit 911, an output unit 912, a storage unit 913, a communication unit 914, and a drive 915 are connected to the input/output interface 910.
The input unit 911 includes, for example, a keyboard, a mouse, a microphone, a touch panel, an input terminal, and the like. The output unit 912 includes, for example, a display, a speaker, an output terminal, and the like. The storage unit 913 includes, for example, a hard disk, a RAM disk, a nonvolatile memory, and the like. The communication unit 914 includes a network interface, for example. The drive 915 drives a removable medium 921 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
In the computer configured as described above, for example, the CPU 901 loads a program stored in the storage unit 913 into the RAM 903 via the input/output interface 910 and the bus 904 and executes the program, whereby the above-described series of processing is performed. The RAM 903 also appropriately stores data and the like necessary for the CPU 901 to execute various processes.
The program executed by the computer can be applied by being recorded on, for example, the removable medium 921 as a package medium or the like. In this case, the program can be installed in the storage unit 913 via the input/output interface 910 by attaching the removable medium 921 to the drive 915.
Furthermore, this program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting. In this case, the program can be received by the communication unit 914 and installed in the storage unit 913.
In addition, this program can be installed in the ROM 902 or the storage unit 913 in advance.
The present technology may be applied to an arbitrary configuration. For example, the present technology can be applied to various electronic devices.
Furthermore, for example, the present technology can also be implemented as a partial configuration of an apparatus, such as a processor (for example, a video processor) as a system large scale integration (LSI) or the like, a module (for example, a video module) using a plurality of processors or the like, a unit (for example, a video unit) using a plurality of modules or the like, or a set (for example, a video set) obtained by further adding other functions to a unit.
Furthermore, for example, the present technology can also be applied to a network system including a plurality of devices. For example, the present technology may be implemented as cloud computing shared and processed in cooperation by a plurality of devices via a network. For example, the present technology may be implemented in a cloud service that provides a service related to an image (moving image) to an arbitrary terminal such as a computer, an audio visual (AV) device, a portable information processing terminal, or an Internet of Things (IoT) device.
Note that, in the present specification, a system means a set of a plurality of components (devices, modules (parts), and the like), and it does not matter whether or not all the components are in the same housing. Consequently, both of a plurality of devices stored in different housings and connected via a network, and one device in which a plurality of modules is stored in one housing are systems.
<Field and Application to which Present Technology is Applicable>
The system, device, processing unit, and the like to which the present technology is applied may be used in arbitrary fields such as traffic, medical care, crime prevention, agriculture, livestock industry, mining, beauty care, factory, household appliance, weather, and natural surveillance, for example. Furthermore, any application thereof may be adopted.
Note that, in the present specification, the “flag” is information for identifying a plurality of states, and includes not only information used for identifying two states of true (1) and false (0) but also information capable of identifying three or more states. Therefore, the value that may be taken by the “flag” may be, for example, a binary value of 1/0, or a ternary value or more. That is, the number of bits forming this “flag” is arbitrary, and may be one bit or a plurality of bits. Furthermore, identification information (including the flag) is assumed to include not only the identification information itself in a bitstream but also difference information of the identification information with respect to certain reference information in the bitstream, and thus, in the present specification, the “flag” and “identification information” include not only the information itself but also the difference information with respect to the reference information.
Furthermore, various kinds of information (such as metadata) related to encoded data (a bitstream) may be transmitted or recorded in any form as long as it is associated with the encoded data. Here, the term “associating” means, when processing one data, allowing other data to be used (to be linked), for example. That is, the data associated with each other may be collected as one data or may be made individual data. For example, information associated with the encoded data (image) may be transmitted on a transmission path different from that of the encoded data (image). Furthermore, for example, the information associated with the encoded data (image) may be recorded in a recording medium different from that of the encoded data (image) (or another recording area of the same recording medium). Note that, this “association” may be not the entire data but a part of data. For example, an image and information corresponding to the image may be associated with each other in any unit such as a plurality of frames, one frame, or a part within a frame.
Note that, in the present specification, terms such as “combine”, “multiplex”, “add”, “integrate”, “include”, “store”, “put in”, “introduce”, “insert”, and the like mean, for example, to combine a plurality of objects into one, such as to combine encoded data and metadata into one data, and mean one method of “associating” described above.
Furthermore, embodiments of the present technology are not limited to the embodiments described above but can be modified in a wide variety of ways within a scope of the present technology.
For example, a configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units). Conversely, configurations described above as a plurality of devices (or processing units) may be collectively configured as one device (or processing unit). Furthermore, a configuration other than the above-described configurations may be added to the configuration of each device (or each processing unit). Moreover, if the configuration and operation of the entire system are substantially the same, a part of the configuration of a certain device (or processing unit) may be included in the configuration of another device (or another processing unit).
Furthermore, for example, the above-described program may be executed in an arbitrary device. In that case, it is sufficient that the device has a necessary function (functional block or the like) and can obtain necessary information.
Furthermore, for example, each step of one flowchart may be executed by one device, or may be shared and executed by a plurality of devices. Furthermore, when a plurality of processes is included in one step, the plurality of processes may be executed by one device, or may be shared and executed by a plurality of devices. In other words, a plurality of processes included in one step can also be executed as processes of a plurality of steps. Conversely, the processing described as a plurality of steps can be collectively executed as one step.
Furthermore, for example, in the program executed by the computer, process of steps describing the program may be executed in time series in the order described in the present specification, or may be executed in parallel or individually at necessary timing such as when a call is made. That is, as long as there is no contradiction, the processing of each step may be executed in an order different from the above-described order. Furthermore, the process of steps describing this program may be executed in parallel with the processing of another program, or may be executed in combination with the processing of another program.
Furthermore, for example, a plurality of technologies related to the present technology can be implemented independently as a single body as long as there is no contradiction. Of course, a plurality of arbitrary present technologies can be implemented in combination. For example, part or all of the present technologies described in any of the embodiments can be implemented in combination with part or all of the present technologies described in other embodiments. Furthermore, part or all of any of the above-described present technologies can be implemented by using together with another technology that is not described above.
Note that the present technology can adopt the following configurations.
(1) An information processing apparatus including:
(2) The information processing apparatus according to (1),
(3) The information processing apparatus according to (2), further including
(4) The information processing apparatus according to (3),
(5) The information processing apparatus according to (4),
(6) The information processing apparatus according to any one of (3) to (5), in which the image generation unit stores the conversion information in a color component of the geometry image.
(7) The information processing apparatus according to any one of (3) to (6),
(8) The information processing apparatus according to any one of (2) to (7), further including
(9) The information processing apparatus according to (8),
(10) The information processing apparatus according to any one of (1) to (9),
(11) The information processing apparatus according to (10),
(12) The information processing apparatus according to (10) or (11),
(13) The information processing apparatus according to any one of (10) to (12),
(14) The information processing apparatus according to any one of (10) to (13), further including
(15) The information processing apparatus according to any one of (1) to (14),
(16) The information processing apparatus according to (15),
(17) The information processing apparatus according to (15) or (16),
(18) The information processing apparatus according to (15) or (16),
(19) The information processing apparatus according to any one of (1) to (18), further including
(20) The information processing apparatus according to (19), further including
(21) The information processing apparatus according to (19),
(22) The information processing apparatus according to any one of (1) to (21), further including
(23) An information processing method including:
(31) An information processing apparatus including:
(32) The information processing apparatus according to (31),
(33) The information processing apparatus according to (32),
(34) The information processing apparatus according to (33),
(35) The information processing apparatus according to (34),
(36) The information processing apparatus according to any one of (33) to (35),
(37) The information processing apparatus according to any one of (33) to (36),
(38) The information processing apparatus according to any one of (33) to (37),
(39) The information processing apparatus according to (38),
(40) The information processing apparatus according to any one of (31) to (39),
(41) The information processing apparatus according to (40),
(42) The information processing apparatus according to (40) or (41),
(43) The information processing apparatus according to any one of (40) to (42),
(44) The information processing apparatus according to any one of (40) to (43),
(45) The information processing apparatus according to any one of (31) to (44),
(46) The information processing apparatus according to (45),
(47) The information processing apparatus according to (45) or (46),
(48) The information processing apparatus according to (45) or (46),
(49) The information processing apparatus according to any one of (31) to (48),
(50) The information processing apparatus according to (49),
(51) The information processing apparatus according to (49),
(52) An information processing method including:
Number | Date | Country | Kind |
---|---|---|---|
2021-076320 | Apr 2021 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2022/019103 | 4/27/2022 | WO |