Embodiments of the present disclosure relate to point cloud coding.
Point clouds are one of the major three-dimensional (3D) data representations, which provide, in addition to spatial coordinates, attributes associated with the points in a 3D world. Point clouds in their raw format require a huge amount of memory for storage or bandwidth for transmission. Furthermore, the emergence of higher-resolution point cloud capture technology further increases the size of point clouds. In order to make point clouds usable, compression is necessary. Two compression technologies have been proposed for point cloud compression/coding (PCC) standardization activities: video-based PCC (V-PCC) and geometry-based PCC (G-PCC). The V-PCC approach is based on 3D to two-dimensional (2D) projections, while G-PCC, on the contrary, encodes the content directly in 3D space. In order to achieve that, G-PCC utilizes data structures, such as an octree, that describe the point locations in 3D space.
According to one aspect of the present disclosure, a method for decoding a point cloud that is represented in a one-dimensional (1D) array that includes a set of points is provided. The method may include parsing, by at least one processor, a bitstream to obtain a first syntax element indicative of an enablement of multiple attribute parameter sets for the point cloud. The method may include determining, by the at least one processor, whether the first syntax element indicates that multiple attribute parameter sets are enabled for the point cloud. In response to determining that the multiple attribute parameter sets are enabled for the point cloud, the method may include decompressing, by the at least one processor, the point cloud based on the multiple attribute parameter sets.
According to another aspect of the present disclosure, a system for decoding a point cloud that is represented in a 1D array that includes a set of points is provided. The system may include at least one processor and memory storing instructions. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to parse a bitstream to obtain a first syntax element indicative of an enablement of multiple attribute parameter sets for the point cloud. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to determine whether the first syntax element indicates that multiple attribute parameter sets are enabled for the point cloud. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to determining that the multiple attribute parameter sets are enabled for the point cloud, decompress the point cloud based on the multiple attribute parameter sets.
According to a further aspect of the present disclosure, a method for encoding a point cloud that is represented in a 1D array that includes a set of points is provided. The method may include generating, by at least one processor, a first syntax element indicative of multiple attribute parameter sets for the point cloud. The method may include inputting, by the at least one processor, the first syntax element into a bitstream. In response to the multiple attribute parameter sets being enabled for the point cloud, the method may include compressing, by the at least one processor, the point cloud based on the multiple attribute parameter sets.
According to a further aspect of the present disclosure, a system for encoding a point cloud that is represented in a 1D array that includes a set of points is provided. The system may include at least one processor and memory storing instructions. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to generate a first syntax element indicative of multiple attribute parameter sets for the point cloud. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to input the first syntax element into a bitstream. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to the multiple attribute parameter sets being enabled for the point cloud, compress the point cloud based on the multiple attribute parameter sets.
These illustrative embodiments are mentioned not to limit or define the present disclosure, but to provide examples to aid understanding thereof. Additional embodiments are described in the Detailed Description, and further description is provided there.
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present disclosure and, together with the description, further serve to explain the principles of the present disclosure and to enable a person skilled in the pertinent art to make and use the present disclosure.
Embodiments of the present disclosure will be described with reference to the accompanying drawings.
Although some configurations and arrangements are discussed, it should be understood that this is done for illustrative purposes only. A person skilled in the pertinent art will recognize that other configurations and arrangements can be used without departing from the spirit and scope of the present disclosure. It will be apparent to a person skilled in the pertinent art that the present disclosure can also be employed in a variety of other applications.
It is noted that references in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” “some embodiments,” “certain embodiments,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it would be within the knowledge of a person skilled in the pertinent art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
In general, terminology may be understood at least in part from usage in context. For example, the term “one or more” as used herein, depending at least in part upon context, may be used to describe any feature, structure, or characteristic in a singular sense or may be used to describe combinations of features, structures or characteristics in a plural sense. Similarly, terms, such as “a,” “an,” or “the,” again, may be understood to convey a singular usage or to convey a plural usage, depending at least in part upon context. In addition, the term “based on” may be understood as not necessarily intended to convey an exclusive set of factors and may, instead, allow for existence of additional factors not necessarily expressly described, again, depending at least in part on context.
Various aspects of point cloud coding systems will now be described with reference to various apparatus and methods. These apparatus and methods will be described in the following detailed description and illustrated in the accompanying drawings by various modules, components, circuits, steps, operations, processes, algorithms, etc. (collectively referred to as “elements”). These elements may be implemented using electronic hardware, firmware, computer software, or any combination thereof. Whether such elements are implemented as hardware, firmware, or software depends upon the particular application and design constraints imposed on the overall system. The techniques described herein may be used for various point cloud coding applications. As described herein, point cloud coding includes both encoding and decoding a point cloud.
A point cloud is composed of a collection of points in a 3D space. Each point in the 3D space is associated with a geometry position together with the associated attribute information (e.g., color, reflectance, intensity, classification, etc.). In order to compress the point cloud data efficiently, the geometry of a point cloud can be compressed first, and then the corresponding attributes, including color or reflectance, can be compressed based upon the geometry information according to a point cloud coding technique, such as G-PCC. G-PCC has been widely used in virtual reality/augmented reality (VR/AR), telecommunication, autonomous vehicles, etc., for entertainment and industrial applications, e.g., light detection and ranging (LiDAR) sweep compression for automotive or robotics and high-definition (HD) maps for navigation. The Moving Picture Experts Group (MPEG) released the first version of the G-PCC standard, and the Audio Video Coding Standard (AVS) workgroup is also developing a G-PCC standard.
The existing G-PCC standards, however, cannot work well for a wide range of PCC inputs for many different applications. For example, besides the representation of levels (or coefficients in some cases), the representation of other information (e.g., parameters) used for G-PCC may be coded in the forms of syntax elements in the bitstream as well. Since G-PCC is organized in different levels by dividing a collection of points into different pieces (e.g., sequence, slices, etc.) associated with different properties (e.g., geometry, attributes, etc.), the parameter sets are also arranged in different levels (e.g., sequence-level, property-level, slice-level, etc.), for example, in the different headers. Moreover, multiple condition checks may be required for parsing some syntax elements in G-PCC, which further increases the complexity of organizing and parsing the representation of syntax elements.
To improve the flexibility and generality of point cloud coding, the present disclosure provides various novel schemes of syntax element representation and organization, which are compatible with any suitable G-PCC standards, including, but not limited to, AVS G-PCC standards and MPEG G-PCC standards.
Processor 102 may include microprocessors, such as graphic processing unit (GPU), image signal processor (ISP), central processing unit (CPU), digital signal processor (DSP), tensor processing unit (TPU), vision processing unit (VPU), neural processing unit (NPU), synergistic processing unit (SPU), or physics processing unit (PPU), microcontroller units (MCUs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functions described throughout the present disclosure. Although only one processor is shown in
Memory 104 can broadly include both memory (a.k.a. primary/system memory) and storage (a.k.a. secondary memory). For example, memory 104 may include random-access memory (RAM), read-only memory (ROM), static RAM (SRAM), dynamic RAM (DRAM), ferro-electric RAM (FRAM), electrically erasable programmable ROM (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage, hard disk drive (HDD), such as magnetic disk storage or other magnetic storage devices, Flash drive, solid-state drive (SSD), or any other medium that can be used to carry or store desired program code in the form of instructions that can be accessed and executed by processor 102. Broadly, memory 104 may be embodied by any computer-readable medium, such as a non-transitory computer-readable medium. Although only one memory is shown in
Interface 106 can broadly include a data interface and a communication interface that is configured to receive and transmit a signal in a process of receiving and transmitting information with other external network elements. For example, interface 106 may include input/output (I/O) devices and wired or wireless transceivers. Although only one interface is shown in
Processor 102, memory 104, and interface 106 may be implemented in various forms in system 100 or 200 for performing point cloud coding functions. In some embodiments, processor 102, memory 104, and interface 106 of system 100 or 200 are implemented (e.g., integrated) on one or more system-on-chips (SoCs). In one example, processor 102, memory 104, and interface 106 may be integrated on an application processor (AP) SoC that handles application processing in an operating system (OS) environment, including running point cloud encoding and decoding applications. In another example, processor 102, memory 104, and interface 106 may be integrated on a specialized processor chip for point cloud coding, such as a GPU or ISP chip dedicated to graphic processing in a real-time operating system (RTOS).
As shown in
Similarly, as shown in
As shown in
In some embodiments, geometry analysis module 306 is configured to perform geometry analysis using the octree scheme. Under the octree scheme, a cubical axis-aligned bounding box B may be defined by the two extreme points (0,0,0) and (2^d, 2^d, 2^d), where d is the maximum size of the given point cloud along the x, y, or z direction. All point cloud points may be included in this defined cube. A cube may be divided into eight sub-cubes, which creates the octree structure allowing one parent to have 8 children, and an octree structure may then be built by recursively subdividing sub-cubes, as shown in
Referring back to
In some embodiments, a prediction may be formed from neighboring coded attributes, for example, in predicting transform and lifting transform by attribute transform module 312. Then, the difference between the current attribute and the prediction may be coded. According to some aspects of the present disclosure, in the AVS G-PCC standard, after the geometry positions are coded, a Morton code or Hilbert code may be used to convert a point cloud in a 3D space (e.g., a point cloud cube) into a 1D array, as shown in
As shown in
In some embodiments, M and N are set as fixed numbers of 3 and 128, respectively. If more than 128 points before the current point are already coded, only 3 out of the previous 128 neighboring points could be used to form attribute predictors (prediction points) according to a predefined order. If there are fewer than 128 coded points before the current point, all coded points before the current point will be used as candidate points to find the prediction points. Among the previous up to 128 candidate points, up to 3 prediction points are selected, namely those with the smallest “distance” (e.g., Euclidean distance) to the current point. The Euclidean distance d as one example may be defined as follows, while other distance metrics can also be used in other examples:
where (x1, y1, z1) and (x2, y2, z2) are the coordinates of the current point and the candidate point along the Morton order, the Hilbert order, or the native input order, respectively. Once m prediction points (e.g., the 3 closest candidate points) have been selected, a weighted attribute average from these m points may be formed as the predictor to code the attribute of the current point, according to some embodiments. It is understood that in some examples, the prediction points may be selected from the candidate points that are in the cubes sharing the same face/line/point with the current point cloud.
Since the set of n candidate points needs to be stored in the memory and traversed in order to select the set of m prediction points for coding the attributes associated with the current position, the maximum number N of candidate points is introduced to limit the size of memory and the amount of computation resources that may be occupied by storing and searching the candidate points.
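The candidate window and nearest-neighbor selection described above can be sketched as follows. This is a minimal illustration, not the normative process: the function names are hypothetical, the distance is the textbook Euclidean distance (the disclosure's own expression may differ), and the inverse-distance weighting is an assumption, since the text only states that a weighted attribute average is formed.

```python
from math import sqrt

MAX_CANDIDATES = 128  # N in the text: size of the candidate window
MAX_PREDICTORS = 3    # M in the text: number of prediction points


def euclidean(p, q):
    """Textbook Euclidean distance between two 3D positions."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def select_prediction_points(coded, current_pos,
                             n=MAX_CANDIDATES, m=MAX_PREDICTORS):
    """Return up to m nearest points among the last n coded points.

    `coded` is a list of (position, attribute) pairs in coding order
    (e.g., Morton or Hilbert order).
    """
    candidates = coded[-n:]  # fewer than n coded points: use all of them
    return sorted(candidates,
                  key=lambda pt: euclidean(pt[0], current_pos))[:m]


def predict_attribute(predictors, current_pos):
    """Weighted attribute average; inverse-distance weights are an assumption."""
    if not predictors:
        raise ValueError("no coded points available for prediction")
    weighted = []
    for pos, attr in predictors:
        d = euclidean(pos, current_pos)
        if d == 0:
            return float(attr)  # coincident point: reuse its attribute
        weighted.append((1.0 / d, attr))
    total = sum(w for w, _ in weighted)
    return sum(w * a for w, a in weighted) / total
```

Bounding the window to the most recent N points keeps both the stored state and the search cost constant per point, which is the trade-off the paragraph above describes.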
According to some aspects of the present disclosure, the difference in attribute values between the current point and its predictor may be referred to as a “residual.” Depending on the application, PCC can be either lossless or lossy. Hence, the residual may or may not be quantized by using the predefined quantization process. According to the present disclosure, the residual without or with quantization may be referred to as a “level,” which is a signed integer (e.g., a positive or negative integer value) coded into the bitstream.
There are three color attributes for each point, which come from the three color components. If the levels for all the three color components are zeros, this point is called a zero-level point. Otherwise, if there is at least one non-zero level for one color component with the point, this point is called a non-zero level point. The number of consecutive zero-level points is referred to as a “zero-run length.” The zero-run length values and levels for non-zero level points are coded into the bitstream. More specifically, before coding the first point, encoder 101 may set the zero-run length counter as zero.
Starting from the first point along the predefined coding order, the residuals between the three color predictors and their corresponding color attributes for the current point can be obtained. Then, the corresponding levels for the three components of the current point can also be obtained. If the current point is a zero-level point, encoder 101 may increase the zero-run length value by one, and the process proceeds to the next point. If the current point is a non-zero level point, the zero-run length value will be coded first, and then the three color levels for this non-zero level point will be coded right after. After the level coding of a non-zero level point, the zero-run length value will be reset to zero, and the process proceeds to the next point until all points have been processed. On the decoding side, decoder 201 may decode the zero-run length value, and the three color levels corresponding to the number of zero-run length points are set as zero. Then, the levels for the non-zero level point are decoded, and then the next zero-run length value is decoded. This process continues until all points are decoded. Tables 1 and 2 illustrate example syntax elements used for color-residual coding and color-level coding, respectively.
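The zero-run length loop on the encoder and decoder sides may be illustrated with the sketch below. It works on symbolic tuples rather than on an entropy-coded bitstream, and the function names are hypothetical. Emitting a final run for trailing zero-level points is an assumption, since the text does not state how a run at the end of the point sequence is handled.

```python
ZERO = (0, 0, 0)


def encode_zero_runs(levels):
    """Turn per-point color levels into ('run', n) / ('levels', t) symbols."""
    symbols = []
    run = 0  # zero-run length counter, set to zero before the first point
    for lv in levels:
        if tuple(lv) == ZERO:
            run += 1                            # zero-level point: extend the run
        else:
            symbols.append(("run", run))        # zero-run length is coded first
            symbols.append(("levels", tuple(lv)))  # then the three color levels
            run = 0                             # reset after a non-zero level point
    if run:
        symbols.append(("run", run))            # assumption: flush trailing zeros
    return symbols


def decode_zero_runs(symbols, num_points):
    """Inverse of encode_zero_runs; num_points is known to the decoder."""
    out = []
    for kind, value in symbols:
        if kind == "run":
            out.extend([ZERO] * value)          # these points have all-zero levels
        else:
            out.append(value)
    if len(out) != num_points:
        raise ValueError("symbol stream does not match the point count")
    return out
```

For the input `[(0,0,0), (0,0,0), (1,-2,0)]`, the encoder emits a run of 2 followed by the levels `(1, -2, 0)`.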
For a non-zero level point, there is at least one non-zero level among the three components. The values of the three color-components are coded in the color_residual_coding( ) syntax structure. Several one-bit flags plus the remainder of the absolute level may be coded to represent the levels of the three color-components. The absolute level of the color residual, or the absolute level of the color residual minus one, may be coded in the function coded_level_coding( ) and is also referred to hereinafter as the “coded level.”
According to some aspects of the present disclosure, a first flag (color_first_comp_zero) is coded to indicate whether the first color-component is zero; if the first color-component is zero, a second flag (color_second_comp_zero) is coded to indicate whether the second color-component is zero; if the second color-component is zero, the absolute level minus one and the sign of the third color-component will be coded according to the following coded-level technique.
For instance, a first flag is coded to indicate whether the first color-component is zero; if the first color-component is zero, a second flag may be coded to indicate whether the second color-component is zero; if the second color-component is not zero, the absolute level minus one and the sign of the second color-component and the absolute level and sign of the third color-component will be coded according to the following coded-level technique.
According to another aspect of the present disclosure, a first flag may be coded to indicate whether the first color-component is zero; if the first color-component is not zero, the absolute level minus one and the sign of the first color-component, as well as the absolute levels and signs of the second and third color-components, will be coded according to the following coded-level technique.
For example, a first flag (coded_level_equal_zero) is coded to indicate whether the coded level is zero. If the coded level is the absolute level of one color-component minus one, namely, when the isComponentNoneZero flag is set to “true,” the sign (coded_level_sign) of the level of this color-component will be coded. On the other hand, if the first flag indicates that the coded level is not zero, and if the coded level is the absolute level of one color-component, e.g., when the isComponentNoneZero flag is set to “false,” the sign of the level of this color-component will be coded. A second flag (coded_level_gt1) will be coded to indicate whether the coded level is greater than one. If the coded level is greater than one, the parity of the coded level minus two is coded, and a third flag (coded_level_minus2_div2_gt0) will be coded to indicate whether the coded level minus two divided by two is greater than zero. If the coded level minus two divided by two is greater than zero, the coded level minus two divided by two minus one will be coded.
Referring to Tables 1 and 2, a color_first_comp_zero value equal to 0 specifies that the absolute coded level for the first component of color is not zero. A color_first_comp_zero value equal to 1 specifies that the absolute coded level for the first component is zero.
A color_second_comp_zero value equal to 0 specifies that the absolute coded level for the second component of color is not zero. A color_second_comp_zero value equal to 1 specifies that the absolute coded level for the second component is zero.
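The three cases for a non-zero level point can be summarized in a short sketch. The function name and the tuple layout are hypothetical; the sketch only lists which flags and which coded levels would be produced, and it represents the sign as a boolean (`None` when the component is zero) rather than assuming a bit polarity.

```python
def binarize_color_levels(levels):
    """Return (flags, coded) for the three color levels of one point.

    flags : dict holding color_first_comp_zero and, when the first
            component is zero, color_second_comp_zero.
    coded : list of (component_index, coded_level, is_negative) tuples.
            The first non-zero component is coded as |level| - 1, the
            remaining components as |level|.
    """
    c = list(levels)
    if len(c) != 3 or not any(c):
        raise ValueError("expected three levels with at least one non-zero")

    flags = {"color_first_comp_zero": int(c[0] == 0)}
    if c[0] == 0:
        # The second flag is only coded when the first component is zero.
        flags["color_second_comp_zero"] = int(c[1] == 0)

    first_nonzero = next(i for i, v in enumerate(c) if v != 0)
    coded = []
    for i in range(first_nonzero, 3):
        minus_one = (i == first_nonzero)  # known to be non-zero
        coded_level = abs(c[i]) - 1 if minus_one else abs(c[i])
        is_negative = (c[i] < 0) if c[i] != 0 else None
        coded.append((i, coded_level, is_negative))
    return flags, coded
```

Coding the first non-zero component as its absolute level minus one avoids spending bits on a value that the preceding flags have already ruled out.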
A coded_level_equal_zero value equal to 0 specifies that the absolute coded level for this component is not zero. A coded_level_equal_zero value equal to 1 specifies that the absolute coded level for this component is zero.
A coded_level_gt1 value equal to 0 specifies that the coded level for this component is one. A coded_level_gt1 value equal to 1 specifies that the coded level for this component is greater than one. When a coded_level_gt1 value is not included in the bitstream, decoder 201 may infer the coded_level_gt1 value is equal to 0.
A coded_level_minus2_parity specifies the parity of the coded level minus two for the current color-component. A coded_level_minus2_parity value equal to 0 specifies that the current coded level minus two is an even number. A coded_level_minus2_parity value equal to 1 specifies that the current coded level minus two is an odd number. When a coded_level_minus2_parity value is not present in the bitstream, decoder 201 may infer that coded_level_minus2_parity value is equal to 0.
A coded_level_minus2_div2_gt0 value equal to 0 specifies that the coded level minus two divided by two is zero. A coded_level_minus2_div2_gt0 value equal to 1 specifies that the coded level minus two divided by two is greater than zero. When a coded_level_minus2_div2_gt0 value is not present in the bitstream, decoder 201 may infer that the coded_level_minus2_div2_gt0 value is equal to 0.
A coded_level_minu2_div2_minus1 syntax element specifies the value of the coded level minus two divided by two minus one. When a coded_level_minu2_div2_minus1 syntax element is not present in the bitstream, decoder 201 may infer that the coded_level_minu2_div2_minus1 syntax element is equal to 0.
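The semantics above imply a simple decomposition of a non-negative coded level, sketched below with hypothetical helper functions. The dictionary keys reuse the syntax element names from the text; sign handling is left out, and absent elements are inferred to be 0 as described. The reconstruction formula is derived from those semantics and is not quoted from the disclosure's expression.

```python
def binarize_coded_level(coded_level):
    """Split a non-negative coded level into the syntax elements above."""
    if coded_level < 0:
        raise ValueError("coded level is an absolute value")
    syms = {"coded_level_equal_zero": int(coded_level == 0)}
    if coded_level == 0:
        return syms
    syms["coded_level_gt1"] = int(coded_level > 1)
    if coded_level > 1:
        rest = coded_level - 2
        syms["coded_level_minus2_parity"] = rest % 2
        syms["coded_level_minus2_div2_gt0"] = int(rest // 2 > 0)
        if rest // 2 > 0:
            syms["coded_level_minu2_div2_minus1"] = rest // 2 - 1
    return syms


def reconstruct_coded_level(syms):
    """Rebuild the coded level; absent elements are inferred to be 0."""
    if syms["coded_level_equal_zero"] == 1:
        return 0
    if syms.get("coded_level_gt1", 0) == 0:
        return 1
    parity = syms.get("coded_level_minus2_parity", 0)
    half = 0
    if syms.get("coded_level_minus2_div2_gt0", 0) == 1:
        half = syms.get("coded_level_minu2_div2_minus1", 0) + 1
    return 2 + parity + 2 * half
```

For a coded level of 7, the sketch yields parity 1, coded_level_minus2_div2_gt0 equal to 1, and coded_level_minu2_div2_minus1 equal to 1, which reconstructs as 2 + 1 + 2 × 2 = 7.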
A coded_level and a coded_level_sign are the return values of function coded_level_coding (isComponentminusOne), which represent the coded level. The coded level may include the absolute level of the color residual or the absolute level of the color residual minus one and the sign of non-zero color residual, as indicated below according to expression (2).
The residual levels of three color components, e.g., color_component[idx], where idx is an index from 0 to 2, are calculated from color_residual_coding( ).
Moreover, the zero-run length of the reflectance level and the non-zero reflectance-level may be coded into the bitstream. More specifically, before coding the first point, encoder 101 may set the zero-run length counter as zero. Starting from the first point along the predefined coding order, the residuals between the predictors and corresponding original points are obtained. Then, the corresponding reflectance-levels may be obtained. If the current reflectance-level is zero, encoder 101 increases the value of the zero-run length counter by one, and the process proceeds to the next point. If the reflectance-level is not zero, encoder 101 may code the zero-run length, followed by coding the non-zero reflectance-level. After coding a non-zero reflectance level, encoder 101 may reset the zero-run length counter to zero, and the process proceeds to the next point. On the decoding side, decoder 201 may decode the zero-run length, and the reflectance-levels corresponding to the number of zero-run length points are set as zero. Then, decoder 201 may decode the non-zero reflectance level, followed by decoding the next number of zero-run length. This process may continue until all points are decoded.
For a non-zero reflectance-level, if the current point is not a duplicated point, the sign of the reflectance-level is coded with a “residual_sign” syntax element. Then, an “abs_level_minus1_parity” syntax element, which indicates the parity of the absolute level minus one, may be coded by encoder 101. Another syntax element “abs_level_minus1_div2_gt0” may be coded to indicate whether the value of the absolute level minus one divided by two is greater than zero; if that value is greater than zero, encoder 101 may encode an “abs_level_minus1_div2_gt1” syntax element to indicate whether the value of the absolute level minus one divided by two is greater than one; if that value is greater than one, encoder 101 may encode an “abs_level_minu1_div2_minus2” syntax element to indicate the value of the absolute level minus one divided by two minus two. Table 3 shown below illustrates example reflectance-level coding syntax elements.
Referring to Table 3, the abs_level_minus1_parity syntax element specifies the parity of absolute reflectance level minus one. An abs_level_minus1_parity value equal to 0 may indicate that the absolute reflectance level minus one is an even number; on the other hand, an abs_level_minus1_parity value equal to 1 may indicate that the absolute reflectance level minus one is an odd number.
An abs_level_minus1_div2_gt0 value equal to 0 may indicate that the value of the absolute reflectance level minus one divided by two is zero. An abs_level_minus1_div2_gt0 value equal to 1 may indicate that the value of the absolute reflectance level minus one divided by two is greater than zero. When not present, decoder 201 may infer that the value of abs_level_minus1_div2_gt0 is equal to 0.
An abs_level_minus1_div2_gt1 value equal to 0 may indicate that the value of the absolute reflectance level minus one divided by two is one. An abs_level_minus1_div2_gt1 value equal to 1 may indicate that the value of the absolute reflectance level minus one divided by two is greater than one. When not present in the bitstream, decoder 201 may infer the value of the abs_level_minus1_div2_gt1 is equal to 0.
The abs_level_minu1_div2_minus2 syntax value may indicate the value of the absolute reflectance level minus 1 divided by two minus two. When not present, decoder 201 may infer that the value of abs_level_minu1_div2_minus2 is equal to 0.
A residual_sign value equal to 0 may indicate that the sign of the reflectance level is negative; on the other hand, a residual_sign value equal to 1 may indicate that the sign of the reflectance level is positive. When not present in the bitstream, decoder 201 may infer that the value of residual_sign is equal to 1. The reflectance may be calculated according to expression (3).
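One reconstruction that is consistent with the semantics of Table 3 is sketched below. It is derived from the element descriptions rather than quoted from the disclosure's expression, the function names are hypothetical, and the duplicated-point condition on residual_sign is ignored for simplicity.

```python
def binarize_reflectance_level(level):
    """Split a non-zero reflectance level into the Table 3 syntax elements."""
    if level == 0:
        raise ValueError("zero levels are covered by the zero-run length")
    rest = abs(level) - 1
    half = rest // 2
    syms = {
        "residual_sign": 1 if level > 0 else 0,  # 1: positive, 0: negative
        "abs_level_minus1_parity": rest % 2,
        "abs_level_minus1_div2_gt0": int(half > 0),
    }
    if half > 0:
        syms["abs_level_minus1_div2_gt1"] = int(half > 1)
        if half > 1:
            syms["abs_level_minu1_div2_minus2"] = half - 2
    return syms


def reconstruct_reflectance_level(syms):
    """Rebuild the signed level; absent elements take their inferred values."""
    half = 0
    if syms.get("abs_level_minus1_div2_gt0", 0) == 1:
        half = 1
        if syms.get("abs_level_minus1_div2_gt1", 0) == 1:
            half = syms.get("abs_level_minu1_div2_minus2", 0) + 2
    magnitude = 1 + syms.get("abs_level_minus1_parity", 0) + 2 * half
    sign = 1 if syms.get("residual_sign", 1) == 1 else -1  # inferred positive
    return sign * magnitude
```

For a level of −6, the absolute level minus one is 5, so the parity is 1 and the half value is 2, giving abs_level_minus1_div2_gt0 = 1, abs_level_minus1_div2_gt1 = 1, and abs_level_minu1_div2_minus2 = 0.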
Still further, encoder 101 may encode the value of the zero-run length into the bitstream. For example, encoder 101 may encode the zero_run_length_level_equal_zero syntax element (e.g., a first syntax element) into the bitstream to indicate whether the zero-run length is equal to zero; if it is not zero, encoder 101 may encode the zero_run_length_level_equal_one syntax element (e.g., a second syntax element) to indicate whether the zero-run length is equal to one; if it is not one, encoder 101 may encode the zero_run_length_level_equal_two syntax element (e.g., a third syntax element) into the bitstream to indicate whether the zero-run length is equal to two; if it is not two, encoder 101 may encode the zero_run_length_level_minus3_parity syntax element (e.g., a fourth syntax element) and the zero_run_length_level_minus3_div2 syntax element (e.g., a fifth syntax element) into the bitstream to indicate the parity of the zero-run length minus three and the value of the zero-run length minus three divided by two, respectively. Examples of the syntax elements used for zero-run length encoding are provided below in Table 4.
Referring to Table 4, a zero_run_length_level_minus3_parity specifies the parity of the zero-run length level minus three. zero_run_length_level_minus3_parity equal to 0 specifies that the zero-run length level minus three is an even number. zero_run_length_level_minus3_parity equal to 1 specifies that the zero-run length level minus three is an odd number. When not present, it is inferred to be equal to 0.
A zero_run_length_level_equal_zero value equal to 0 may indicate that the zero-run length level is not zero; on the other hand, a zero_run_length_level_equal_zero value equal to 1 specifies that the zero-run length level is zero.
A zero_run_length_level_equal_one value equal to 0 may indicate that the zero-run length level is not one; on the other hand, a zero_run_length_level_equal_one value equal to 1 specifies that the zero-run length level is one.
A zero_run_length_level_equal_two value equal to 0 may indicate that the zero-run length level is not two; on the other hand, a zero_run_length_level_equal_two value equal to 1 may indicate that the zero-run length level is two.
A zero_run_length_level_minus3_div2 syntax element may indicate the value of the zero-run length level minus three divided by two. When not present in the bitstream, decoder 201 may infer that the value of the zero_run_length_level_minus3_div2 syntax element is equal to 0. The variable zero_run_length_level may be calculated according to expression (4).
The value of zero_run_length may be calculated according to expression (5).
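The zero-run length syntax elements of Table 4 can be illustrated with the following sketch, which is derived from the element semantics rather than from the disclosure's expressions. The helper names are hypothetical, and the sketch only covers the zero-run length level; any further mapping from that level to the final zero-run length is not assumed here.

```python
def binarize_zero_run_length_level(n):
    """Split a non-negative zero-run length level into Table 4 elements."""
    if n < 0:
        raise ValueError("zero-run length level must be non-negative")
    syms = {"zero_run_length_level_equal_zero": int(n == 0)}
    if n == 0:
        return syms
    syms["zero_run_length_level_equal_one"] = int(n == 1)
    if n == 1:
        return syms
    syms["zero_run_length_level_equal_two"] = int(n == 2)
    if n == 2:
        return syms
    syms["zero_run_length_level_minus3_parity"] = (n - 3) % 2
    syms["zero_run_length_level_minus3_div2"] = (n - 3) // 2
    return syms


def reconstruct_zero_run_length_level(syms):
    """Rebuild the level; absent elements are inferred to be 0."""
    if syms["zero_run_length_level_equal_zero"] == 1:
        return 0
    if syms.get("zero_run_length_level_equal_one", 0) == 1:
        return 1
    if syms.get("zero_run_length_level_equal_two", 0) == 1:
        return 2
    parity = syms.get("zero_run_length_level_minus3_parity", 0)
    half = syms.get("zero_run_length_level_minus3_div2", 0)
    return 3 + parity + 2 * half
```

The three equality flags give the most frequent short runs (0, 1, and 2) a one- to three-flag representation, and only longer runs pay for the parity and half-value elements.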
When a point cloud bitstream (e.g., a geometry bitstream or an attribute bitstream) is input from a point cloud encoder (e.g., encoder 101), the input bitstream may be decoded by decoder 201 in a procedure opposite to that of the point cloud encoder. Thus, the details of decoding that are described above with respect to encoding may be skipped for ease of description. Arithmetic decoding modules 402 and 410 may be configured to decode the geometry bitstream and attribute bitstream, respectively, to obtain various information encoded into the bitstream. For example, arithmetic decoding module 410 may decode the attribute bitstream to obtain the attribute information associated with each point, such as the quantization levels or the coefficients of the attributes associated with each point. Optionally, dequantization module 412 may be configured to dequantize the quantization levels of attributes associated with each point to obtain the coefficients of attributes associated with each point. Besides the attribute information, arithmetic decoding module 410 may parse the bitstream to obtain various other information (e.g., in the form of syntax elements), such as the syntax element indicative of the order followed by the points in the 1D array for attribute coding.
Inverse attribute transform module 414 may be configured to perform inverse attribute transformation, such as inverse RAHT, inverse predicting transform, or inverse lifting transform, to transform the data from the transform domain (e.g., coefficients) back to the attribute domain (e.g., luma and/or chroma information for color attributes). Optionally, color inverse transform module 416 may be configured to convert YCbCr color attributes to RGB color attributes.
As to the geometry decoding, geometry synthesis module 404, reconstruction module 406, and coordinate inverse transform module 408 of decoder 201 may be configured to perform the inverse operations of geometry analysis module 306, voxelization module 304, and coordinate transform module 302 of encoder 101, respectively.
Consistent with the scope of the present disclosure, encoder 101 and decoder 201 may be configured to adopt various novel schemes of syntax element representation and organization, as disclosed herein, to improve the flexibility and generality of point cloud coding.
According to some aspects of the present disclosure, various attribute-presence syntax elements are introduced at different levels to control the enablement/disablement of all attributes or an individual attribute in point cloud coding. In some embodiments, the different parameters under the same condition check at the same level (e.g., associated with the same attribute) can be grouped altogether to reduce the number of condition checks, thereby further simplifying the scheme.
As shown in
According to some aspects of the present disclosure, the difference in attribute values between the current point and its predictor may be referred to as a “residual.” Depending on the application, PCC can be either lossless or lossy. Hence, the residual may or may not be quantized by using the predefined quantization process. According to the present disclosure, the residual without or with quantization may be referred to as a “level,” which is a signed integer (e.g., a positive or negative integer value) coded into the bitstream.
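The relationship between a residual and a level may be sketched in Python as follows. The predefined quantization process is not detailed in this passage, so the sketch assumes a uniform quantizer with a hypothetical step size qstep, where a step of 1 corresponds to the lossless case; the function names are likewise hypothetical.

```python
def residual(attribute: int, predictor: int) -> int:
    """Difference in attribute value between the current point and its predictor."""
    return attribute - predictor


def level(res: int, qstep: int = 1) -> int:
    """Map a residual to the signed integer level coded into the bitstream.

    Assumption: uniform quantization with rounding half away from zero.
    qstep == 1 leaves the residual unquantized (lossless coding).
    """
    if qstep < 1:
        raise ValueError("qstep must be a positive integer")
    if qstep == 1:
        return res
    sign = -1 if res < 0 else 1
    return sign * ((abs(res) + qstep // 2) // qstep)
```

For example, an attribute value of 100 with a predictor of 107 gives a residual of −7, which is coded as the level −7 in the lossless case and as −2 with the assumed step size of 4.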
To reduce bitstream overhead, the maximum number of attributes (e.g., color, reflectance, depth, etc.) may be indicated as the maximum number of attributes minus 1. To that end, a maxNumAttributesMinus1 syntax element may be coded into the bitstream. According to some aspects consistent with the present disclosure, when the maxNumAttributesMinus1 syntax element is not present in the bitstream, decoder 201 may infer that its value is equal to −1. Moreover, the present disclosure moves the colorQuantParam and reflQuantParam syntax elements from the SPS header to the attribute header, as illustrated below in Table 5.
Referring to Table 6, the maxNumAttributesMinus1 syntax element plus 1 may indicate the maximum number of supported attributes. The value of maxNumAttributesMinus1 may be in the range of 0 to 15, inclusive. When not present in the bitstream, decoder 201 may infer the value of maxNumAttributesMinus1 as equal to −1.
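The inference rule for maxNumAttributesMinus1 may be sketched in Python as follows. The function name is hypothetical, and an absent syntax element is modeled as None.

```python
from typing import Optional


def max_num_attributes(max_num_attributes_minus1: Optional[int]) -> int:
    """Return the maximum number of supported attributes.

    A coded value must lie in the range 0 to 15, inclusive. When the syntax
    element is absent (None), its value is inferred to be -1, so that no
    attributes are supported.
    """
    if max_num_attributes_minus1 is None:
        coded = -1
    elif 0 <= max_num_attributes_minus1 <= 15:
        coded = max_num_attributes_minus1
    else:
        raise ValueError("maxNumAttributesMinus1 must be in the range 0 to 15")
    return coded + 1
```

Inferring −1 for the absent case lets the same "plus 1" rule yield zero attributes without a separate code path.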
In the current AVS-GPCC specification, MPEG-GPCC specification, etc., each attribute only supports a single parameter set. For example, a point cloud may be associated with a single set of color data (e.g., a first attribute), a single set of reflectance data (e.g., a second attribute), a single set of distance data (e.g., a third attribute) and so on. However, there are scenarios in which supporting multiple parameter sets may be beneficial. These scenarios include, without limitation, a point cloud associated with a painting that has faded. For instance, a compressed point cloud of a renaissance painting may be associated with a first parameter set associated with the colors used when the painting was originally made, and a second parameter set associated with the present colors (which have faded since it was originally made).
Hence, enabling multiple attribute coding parameter sets for one type of attribute may be desirable. To support a wider range of GPCC applications, the present disclosure enables multiple attribute coding parameter sets for one type of attribute, as shown below in Tables 7-9.
Referring to Table 7, an sps_multi_data_set_flag may be used to indicate whether the bitstream includes multiple attribute coding parameter sets. For example, encoder 101 may generate an sps_multi_data_set_flag value (e.g., a first syntax element) equal to 1, which may indicate that multiple attribute coding parameter sets are enabled for the current point cloud. On the other hand, an sps_multi_data_set_flag value equal to 0 specifies that multiple attribute coding parameter sets are not enabled for the point cloud. When not present in the bitstream, decoder 201 may infer that the value of the sps_multi_data_set_flag is equal to 0.
Referring to Table 8, a multi_data_set_flag[attrIdx] value equal to 1 may indicate that multiple attribute parameter sets are enabled for the current type of attribute, and the number of allowed attribute coding parameter sets will be further specified by the attribute_num_data_set_minus1 syntax element. On the other hand, a multi_data_set_flag value equal to 0 specifies that multiple attribute parameter sets are not enabled for the current type of attribute, and the number of allowed attribute coding parameter sets is equal to 1. When not present in the bitstream, decoder 201 may infer that the value of the multi_data_set_flag is equal to 0. In some embodiments, the multi_data_set_flag may be indicated as a multi_data_set_flag[attrIdx], where the [attrIdx] is the attribute index of the attribute to which the multi_data_set_flag refers. By way of example and not limitation, assuming the attribute associated with the multi_data_set_flag is the “color attribute” with an index value of 0, the second syntax element coded into the bitstream may include multi_data_set_flag[0].
Referring to Table 8, the attribute_num_data_set_minus1[attrIdx] syntax element plus 1 may indicate the number of parameter sets for coding the current attribute. The current attribute may be indicated by the attribute index (attrIdx). For instance, an attribute index of 1 may indicate a color attribute, an attribute index of 2 may indicate a reflectance attribute, an attribute index of 3 may indicate a depth attribute, and so on. The value of attribute_num_data_set_minus1 may be in the range of 0 to 15, inclusive. When not present in the bitstream, decoder 201 may infer that the value of attribute_num_data_set_minus1 is equal to 0.
The number of allowed attribute coding parameter sets (also referred to herein as “parameter sets”) may be calculated according to expression (6).
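The way the three syntax elements and their inferred defaults combine into the number of parameter sets for one attribute may be sketched in Python as follows. The function name is hypothetical, and an absent syntax element is modeled as None.

```python
from typing import Optional


def num_parameter_sets(
    sps_multi_data_set_flag: Optional[int],
    multi_data_set_flag: Optional[int],
    attribute_num_data_set_minus1: Optional[int],
) -> int:
    """Number of allowed attribute coding parameter sets for one attribute.

    Each argument is the coded value, or None when the syntax element is not
    present in the bitstream, in which case its value is inferred to be 0.
    """
    sps_flag = sps_multi_data_set_flag or 0
    # The per-attribute flag only applies when the sequence-level flag is 1.
    attr_flag = (multi_data_set_flag or 0) if sps_flag else 0
    # The count syntax element only applies when the per-attribute flag is 1.
    minus1 = (attribute_num_data_set_minus1 or 0) if attr_flag else 0
    if not 0 <= minus1 <= 15:
        raise ValueError("attribute_num_data_set_minus1 must be in 0..15")
    return minus1 + 1
```

Because every absent element is inferred as 0, a bitstream that carries none of the three syntax elements yields exactly one parameter set per attribute.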
When all attributes may use the same transform-related syntax, alternative tables may be specified, as illustrated in Tables 10-12.
At 902, the encoder may generate a first syntax element indicative of multiple attribute parameter sets for the point cloud. For example, encoder 101 may generate an sps_multi_data_set_flag (e.g., a first syntax element). An sps_multi_data_set_flag value equal to 1 may indicate that multiple attribute coding parameter sets are enabled for the current point cloud. On the other hand, an sps_multi_data_set_flag value equal to 0 specifies that multiple attribute coding parameter sets are not enabled for the current point cloud. When not present in the bitstream, decoder 201 may infer that the value of the sps_multi_data_set_flag is equal to 0.
At 904, the encoder may input the first syntax element into a bitstream. For example, encoder 101 may input the sps_multi_data_set_flag into the bitstream.
At 906, the encoder may, in response to the first syntax element indicating that the multiple attribute parameter sets are enabled for the point cloud, generate a second syntax element indicative of multiple parameter sets for an attribute. For example, encoder 101 may generate a multi_data_set_flag (e.g., a second syntax element) to indicate whether an attribute has multiple parameter sets. Different multi_data_set_flag syntax elements may be generated for different attributes. For instance, a first multi_data_set_flag may be generated to indicate whether the color attribute has multiple parameter sets, a second multi_data_set_flag may be generated to indicate whether the reflectance attribute has multiple parameter sets, a third multi_data_set_flag may be generated to indicate whether the depth attribute has multiple parameter sets, and so on. In some embodiments, a multi_data_set_flag value equal to 1 may indicate that multiple attribute parameter sets are enabled for the current type of attribute. On the other hand, a multi_data_set_flag value equal to 0 specifies that multiple attribute coding parameter sets are not enabled for the current type of attribute. When not present in the bitstream, decoder 201 may infer that the value of the multi_data_set_flag is equal to 0.
At 908, the encoder may input the second syntax element indicative of multiple parameter sets for the attribute into the bitstream. For example, encoder 101 may input the multi_data_set_flag into the bitstream.
At 910, the encoder may, in response to the second syntax element indicating the attribute has multiple parameter sets, generate a third syntax element indicative of a number of parameter sets associated with the attribute. For example, encoder 101 may generate, for each type of attribute, an attribute_num_data_set_minus1 syntax element (e.g., a third syntax element) to indicate the number of parameter sets associated with a type of attribute. For instance, a first attribute_num_data_set_minus1 syntax element may be generated to indicate the number of parameter sets associated with the color attribute, a second attribute_num_data_set_minus1 syntax element may be generated to indicate the number of parameter sets associated with the reflectance attribute, a third attribute_num_data_set_minus1 syntax element may be generated to indicate the number of parameter sets associated with the depth attribute, and so on. The attribute_num_data_set_minus1[attrIdx] syntax element (e.g., a third syntax element) plus 1 may indicate the number of parameter sets for coding the current attribute. The current attribute may be indicated by the attribute index (attrIdx). For instance, an attribute index of 1 may indicate a color attribute, an attribute index of 2 may indicate a reflectance attribute, an attribute index of 3 may indicate a depth attribute, and so on. The value of attribute_num_data_set_minus1 may be in the range of 0 to 15, inclusive. When not present in the bitstream, decoder 201 may infer that the value of attribute_num_data_set_minus1 is equal to 0.
At 912, the encoder may input the third syntax element indicative of the number of parameter sets associated with the attribute into the bitstream. For example, encoder 101 may input the attribute_num_data_set_minus1 syntax element into the bitstream.
At 914, the encoder may, in response to the second syntax element indicating the attribute does not have multiple parameter sets, omit a generation of the third syntax element. For example, encoder 101 may not generate the attribute_num_data_set_minus1 syntax element.
At 916, the encoder may, in response to the multiple attribute parameter sets being enabled for the point cloud, compress the point cloud based on the multiple attribute parameter sets. For example, encoder 101 may compress and/or encode the point cloud by using multiple attribute parameter sets when an attribute has more than one parameter set associated therewith. For instance, encoder 101 may encode the point cloud with two or more color parameter sets. Encoder 101 may encode the point cloud with two or more reflectance parameter sets. Encoder 101 may encode the point cloud with two or more depth parameter sets.
At 918, the encoder may, in response to the multiple attribute parameter sets not being enabled for the point cloud, compress the point cloud based on a single attribute parameter set. For example, encoder 101 may compress and/or encode the point cloud by using a single attribute parameter set when each of the attributes is associated with a single parameter set.
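The signaling portion of operations 902 through 914 may be sketched in Python as follows. The sketch produces an ordered list of named syntax element values rather than an entropy-coded bitstream. The function name, the zero-based attribute indexing, and the policy of setting the sps_multi_data_set_flag to 1 only when at least one attribute has more than one parameter set are assumptions made for illustration.

```python
from typing import List, Tuple

Symbol = Tuple[str, int]


def generate_syntax_elements(sets_per_attribute: List[int]) -> List[Symbol]:
    """List the syntax elements an encoder could place into the bitstream.

    sets_per_attribute[attrIdx] is the number of parameter sets (1 to 16)
    used for the attribute with index attrIdx (zero-based by assumption).
    """
    for count in sets_per_attribute:
        if not 1 <= count <= 16:
            raise ValueError("each attribute needs 1 to 16 parameter sets")

    symbols: List[Symbol] = []
    # 902/904: first syntax element, generated and input into the bitstream.
    sps_flag = 1 if any(count > 1 for count in sets_per_attribute) else 0
    symbols.append(("sps_multi_data_set_flag", sps_flag))
    if not sps_flag:
        return symbols

    for attr_idx, count in enumerate(sets_per_attribute):
        # 906/908: second syntax element, one per attribute.
        multi_flag = 1 if count > 1 else 0
        symbols.append((f"multi_data_set_flag[{attr_idx}]", multi_flag))
        if multi_flag:
            # 910/912: third syntax element carries the count minus 1.
            symbols.append(
                (f"attribute_num_data_set_minus1[{attr_idx}]", count - 1)
            )
        # 914: otherwise the third syntax element is omitted.
    return symbols
```

For a point cloud with two color parameter sets and one reflectance parameter set, this sketch emits four syntax elements; when every attribute has a single parameter set, only the sps_multi_data_set_flag is emitted.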
At 1002, the decoder may parse a bitstream to obtain a first syntax element indicative of an enablement of multiple attribute parameter sets for the point cloud. For example, decoder 201 may parse a bitstream to obtain an sps_multi_data_set_flag (e.g., a first syntax element).
At 1004, the decoder may determine whether the first syntax element indicates that multiple attribute parameter sets are enabled for the point cloud. For example, decoder 201 may determine whether the multiple attribute parameter sets are enabled based on a value of the sps_multi_data_set_flag. An sps_multi_data_set_flag value equal to 1 may indicate that multiple attribute parameter sets are enabled for the current point cloud. On the other hand, an sps_multi_data_set_flag value equal to 0 specifies that multiple attribute parameter sets are not enabled for the current point cloud.
At 1006, the decoder may, in response to the first syntax element indicating that the multiple attribute parameter sets are enabled for the point cloud, parse the bitstream to obtain a second syntax element indicative of multiple parameter sets for an attribute. For example, when the sps_multi_data_set_flag is set to 1, decoder 201 may further parse the bitstream to obtain a multi_data_set_flag (e.g., a second syntax element).
At 1008, the decoder may determine whether the second syntax element indicates the attribute has multiple parameter sets. For example, based on a value of the multi_data_set_flag, decoder 201 may determine whether an attribute has multiple parameter sets. Different multi_data_set_flag syntax elements may be included in the bitstream for different attributes. For instance, a first multi_data_set_flag may indicate whether the color attribute has multiple parameter sets, a second multi_data_set_flag may indicate whether the reflectance attribute has multiple parameter sets, a third multi_data_set_flag may indicate whether the depth attribute has multiple parameter sets, and so on. In some embodiments, a multi_data_set_flag value equal to 1 may indicate that multiple attribute coding parameter sets are enabled for the current type of attribute. On the other hand, a multi_data_set_flag value equal to 0 specifies that multiple attribute parameter sets are not enabled for the current type of attribute. When not present in the bitstream, decoder 201 may infer that the value of the multi_data_set_flag is equal to 0.
At 1010, the decoder may, in response to the second syntax element indicating the attribute has multiple parameter sets, parse the bitstream to obtain a third syntax element indicative of a number of parameter sets associated with the attribute. For example, decoder 201 may parse the bitstream to obtain an attribute_num_data_set_minus1 syntax element (e.g., a third syntax element) for each type of attribute.
At 1012, the decoder may identify the number of parameter sets associated with the attribute based on the third syntax element. For example, the attribute_num_data_set_minus1[attrIdx] syntax element (e.g., a third syntax element) plus 1 may indicate the number of parameter sets for coding the current attribute. The current attribute may be indicated by the attribute index (attrIdx). For instance, an attribute index of 1 may indicate a color attribute, an attribute index of 2 may indicate a reflectance attribute, an attribute index of 3 may indicate a depth attribute, and so on. The value of attribute_num_data_set_minus1 may be in the range of 0 to 15, inclusive. When not present in the bitstream, decoder 201 may infer that the value of attribute_num_data_set_minus1 is equal to 0. Decoder 201 may identify the number of allowed attribute parameter sets (also referred to herein as “parameter sets”) using the calculation associated with expression (6).
At 1014, the decoder may, in response to the second syntax element indicating the attribute does not have multiple parameter sets, determine the attribute has a single parameter set associated therewith. For example, decoder 201 may parse the second syntax element from the bitstream. Based on the second syntax element having a value equal to 0, decoder 201 may determine that the associated attribute (e.g., color, reflectance, intensity, etc.) has a single parameter set associated therewith.
At 1016, the decoder may, in response to determining that the multiple attribute parameter sets are enabled for the point cloud, decompress the point cloud based on the multiple attribute parameter sets. For example, decoder 201 may decompress and/or decode the point cloud from the bitstream based on the multiple attribute parameter sets when present.
At 1018, the decoder may, in response to determining that the multiple attribute parameter sets are not enabled for the point cloud, decompress the point cloud based on a single attribute parameter set. For example, decoder 201 may decompress and/or decode the point cloud from the bitstream based on a single attribute parameter set when each attribute type has only a single parameter set associated therewith.
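The parsing portion of operations 1002 through 1014 may be sketched in Python as follows. The binarization of the syntax elements is not stated in this passage, so the sketch assumes one-bit flags and a four-bit unsigned field for attribute_num_data_set_minus1 (enough for the range 0 to 15), and assumes the number of attributes is already known to the decoder. The BitReader class and the function name are hypothetical, and arithmetic decoding is left out.

```python
from typing import List


class BitReader:
    """Minimal reader over a string of '0' and '1' characters."""

    def __init__(self, bits: str) -> None:
        self._bits = bits
        self._pos = 0

    def read(self, n: int) -> int:
        """Read n bits (n >= 1) as an unsigned integer, MSB first."""
        if self._pos + n > len(self._bits):
            raise EOFError("not enough bits left in the bitstream")
        value = int(self._bits[self._pos:self._pos + n], 2)
        self._pos += n
        return value


def parse_num_parameter_sets(reader: BitReader, num_attributes: int) -> List[int]:
    """Return the number of parameter sets for each attribute index."""
    # 1002/1004: first syntax element (sps_multi_data_set_flag).
    sps_flag = reader.read(1)
    counts: List[int] = []
    for _ in range(num_attributes):
        # 1006/1008: second syntax element, inferred as 0 when not present.
        multi_flag = reader.read(1) if sps_flag else 0
        if multi_flag:
            # 1010: third syntax element (assumed 4-bit fixed-length field).
            minus1 = reader.read(4)
        else:
            # 1014: the attribute has a single parameter set.
            minus1 = 0
        # 1012: number of parameter sets is the coded value plus 1.
        counts.append(minus1 + 1)
    return counts
```

Under these assumptions, the bit string "1100100" for two attributes parses as an enabled sequence-level flag, three parameter sets for the first attribute, and a single parameter set for the second.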
In various aspects of the present disclosure, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored as instructions on a non-transitory computer-readable medium. Computer-readable media includes computer storage media. Storage media may be any available media that can be accessed by a processor, such as processor 102 in
According to one aspect of the present disclosure, a method for decoding a point cloud that is represented in a 1D array that includes a set of points is provided. The method may include parsing, by at least one processor, a bitstream to obtain a first syntax element indicative of an enablement of multiple attribute parameter sets for the point cloud. The method may include determining, by the at least one processor, whether the first syntax element indicates that multiple attribute parameter sets are enabled for the point cloud. In response to determining that the multiple attribute parameter sets are enabled for the point cloud, the method may include decompressing, by the at least one processor, the point cloud based on the multiple attribute parameter sets.
In some embodiments, in response to determining that the multiple attribute parameter sets are not enabled for the point cloud, the method may include decompressing, by the at least one processor, the point cloud based on a single attribute parameter set.
In some embodiments, in response to the first syntax element indicating that the multiple attribute parameter sets are enabled for the point cloud, the method may include parsing, by the at least one processor, the bitstream to obtain a second syntax element indicative of multiple parameter sets for an attribute. In some embodiments, the method may include determining, by the at least one processor, whether the second syntax element indicates the attribute has multiple parameter sets.
In some embodiments, in response to the second syntax element indicating the attribute has multiple parameter sets, the method may include parsing, by the at least one processor, the bitstream to obtain a third syntax element indicative of a number of parameter sets associated with the attribute. In some embodiments, the method may include identifying, by the at least one processor, the number of parameter sets associated with the attribute based on the third syntax element.
In some embodiments, in response to the second syntax element indicating the attribute does not have multiple parameter sets, the method may include determining, by the at least one processor, the attribute has a single parameter set associated therewith.
In some embodiments, the first syntax element may include an sps_multi_data_set_flag. In some embodiments, the second syntax element may include a multi_data_set_flag. In some embodiments, the third syntax element may include an attribute_num_data_set_minus1[attrIdx] syntax element. In some embodiments, the value of the attribute_num_data_set_minus1[attrIdx] syntax element coded in the bitstream may be indicative of the number of allowed multiple attribute parameter sets for the attribute, which is the sum of the attribute_num_data_set_minus1[attrIdx] and 1.
According to another aspect of the present disclosure, a system for decoding a point cloud that is represented in a 1D array that includes a set of points is provided. The system may include at least one processor and memory storing instructions. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to parse a bitstream to obtain a first syntax element indicative of an enablement of multiple attribute parameter sets for the point cloud. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to determine whether the first syntax element indicates that multiple attribute parameter sets are enabled for the point cloud. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to determining that the multiple attribute parameter sets are enabled for the point cloud, decompress the point cloud based on the multiple attribute parameter sets.
In some embodiments, the memory storing instructions, which when executed by the at least one processor, may further cause the at least one processor to, in response to determining that the multiple attribute parameter sets are not enabled for the point cloud, decompress the point cloud based on a single attribute parameter set.
In some embodiments, the memory storing instructions, which when executed by the at least one processor, may further cause the at least one processor to, in response to the first syntax element indicating that the multiple attribute parameter sets are enabled for the point cloud, parse the bitstream to obtain a second syntax element indicative of multiple parameter sets for an attribute. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to determine whether the second syntax element indicates the attribute has multiple parameter sets.
In some embodiments, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to the second syntax element indicating the attribute has multiple parameter sets, parse the bitstream to obtain a third syntax element indicative of a number of parameter sets associated with the attribute. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to identify the number of parameter sets associated with the attribute based on the third syntax element.
In some embodiments, in response to the second syntax element indicating the attribute does not have multiple parameter sets, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to determine the attribute has a single parameter set associated therewith.
In some embodiments, the first syntax element may include an sps_multi_data_set_flag. In some embodiments, the second syntax element may include a multi_data_set_flag. In some embodiments, the third syntax element may include an attribute_num_data_set_minus1[attrIdx] syntax element. In some embodiments, the value of the attribute_num_data_set_minus1[attrIdx] syntax element coded in the bitstream may be indicative of the number of allowed multiple attribute parameter sets for the attribute, which is the sum of the attribute_num_data_set_minus1[attrIdx] and 1.
According to a further aspect of the present disclosure, a method for encoding a point cloud that is represented in a 1D array that includes a set of points is provided. The method may include generating, by at least one processor, a first syntax element indicative of multiple attribute parameter sets for the point cloud. The method may include inputting, by the at least one processor, the first syntax element into a bitstream. In response to the multiple attribute parameter sets being enabled for the point cloud, the method may include compressing, by the at least one processor, the point cloud based on the multiple attribute parameter sets.
In some embodiments, in response to the multiple attribute parameter sets not being enabled for the point cloud, the method may include compressing, by the at least one processor, the point cloud based on a single attribute parameter set.
In some embodiments, in response to the first syntax element indicating that the multiple attribute parameter sets are enabled for the point cloud, the method may include generating, by the at least one processor, a second syntax element indicative of multiple parameter sets for an attribute. In some embodiments, the method may include inputting, by the at least one processor, the second syntax element indicative of multiple parameter sets for the attribute into the bitstream.
In some embodiments, in response to the second syntax element indicating the attribute has multiple parameter sets, the method may include generating, by the at least one processor, a third syntax element indicative of a number of parameter sets associated with the attribute. In some embodiments, the method may include inputting, by the at least one processor, the third syntax element indicative of the number of parameter sets associated with the attribute into the bitstream.
In some embodiments, in response to the second syntax element indicating the attribute does not have multiple parameter sets, the method may include omitting, by the at least one processor, a generation of the third syntax element.
In some embodiments, the first syntax element may include an sps_multi_data_set_flag. In some embodiments, the second syntax element may include a multi_data_set_flag. In some embodiments, the third syntax element may include an attribute_num_data_set_minus1[attrIdx] syntax element. In some embodiments, the value of the attribute_num_data_set_minus1[attrIdx] syntax element coded in the bitstream may be indicative of the number of allowed multiple attribute parameter sets for the attribute, which is the sum of the attribute_num_data_set_minus1[attrIdx] and 1.
According to a further aspect of the present disclosure, a system for encoding a point cloud that is represented in a 1D array that includes a set of points is provided. The system may include at least one processor and memory storing instructions. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to generate a first syntax element indicative of multiple attribute parameter sets for the point cloud. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to input the first syntax element into a bitstream. The memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to the multiple attribute parameter sets being enabled for the point cloud, compress the point cloud based on the multiple attribute parameter sets.
In some embodiments, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to the multiple attribute parameter sets not being enabled for the point cloud, compress the point cloud based on a single attribute parameter set.
In some embodiments, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to the first syntax element indicating that the multiple attribute parameter sets are enabled for the point cloud, generate a second syntax element indicative of multiple parameter sets for an attribute. In some embodiments, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to input the second syntax element indicative of multiple parameter sets for the attribute into the bitstream.
In some embodiments, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to, in response to the second syntax element indicating the attribute has multiple parameter sets, generate a third syntax element indicative of a number of parameter sets associated with the attribute. In some embodiments, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to input the third syntax element indicative of the number of parameter sets associated with the attribute into the bitstream.
In some embodiments, in response to the second syntax element indicating the attribute does not have multiple parameter sets, the memory storing instructions, which when executed by the at least one processor, may cause the at least one processor to omit a generation of the third syntax element.
In some embodiments, the first syntax element may include an sps_multi_data_set_flag. In some embodiments, the second syntax element may include a multi_data_set_flag. In some embodiments, the third syntax element may include an attribute_num_data_set_minus1[attrIdx] syntax element. In some embodiments, the value of the attribute_num_data_set_minus1[attrIdx] syntax element coded in the bitstream may be indicative of the number of allowed multiple attribute parameter sets for the attribute, which is the sum of the attribute_num_data_set_minus1[attrIdx] and 1.
The foregoing description of the embodiments will so reveal the general nature of the present disclosure that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such embodiments, without undue experimentation, without departing from the general concept of the present disclosure. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.
Embodiments of the present disclosure have been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.
The Summary and Abstract sections may set forth one or more but not all exemplary embodiments of the present disclosure as contemplated by the inventor(s), and thus, are not intended to limit the present disclosure and the appended claims in any way.
Various functional blocks, modules, and steps are disclosed above. The arrangements provided are illustrative and without limitation. Accordingly, the functional blocks, modules, and steps may be reordered or combined in different ways than in the examples provided above. Likewise, some embodiments include only a subset of the functional blocks, modules, and steps, and any such subset is permitted.
The breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
This application is a national phase entry under 35 USC 371 of International Patent Application No. PCT/US2023/026010 filed on Jun. 22, 2023, which claims the benefit of priority to U.S. Provisional Application No. 63/366,904, filed Jun. 23, 2022, entitled “GEOMETRY POINT CLOUD CODING,” which are incorporated by reference herein in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2023/026010 | 6/22/2023 | WO | |

Number | Date | Country |
---|---|---|
63366904 | Jun 2022 | US |