The present application claims priority to Chinese Patent Application No. 202111138592.X, entitled “Point Cloud Encoding Method and Apparatus, Electronic Device, Medium and Program Product”, filed with China National Intellectual Property Administration on Sep. 27, 2021, the entire content of which is incorporated herein by reference.
The present application relates to the field of point cloud processing technologies, and particularly to a point cloud encoding method and apparatus, an electronic device, a medium, and a program product.
A point cloud is obtained by sampling a surface of an object using a three-dimensional scanning device, and one point cloud frame generally contains on the order of a million points, wherein each point contains attribute information, such as geometric information, color, reflectivity, or the like. Therefore, the three-dimensional point cloud has a huge data quantity, which brings huge challenges to storage, transmission, or the like, of the three-dimensional point cloud, such that compression of point clouds is quite necessary.
Currently, technicians usually encode point cloud data using a progressive octree, a prediction tree, dynamic binary decomposition, shape-adaptive wavelet transform, graph transformation and other methods.
However, the above encoding methods achieve a good compression performance only when the point cloud data has strong correlation between points. If there are many discontinuous regions in the point cloud (for example, a laser radar point cloud), the correlation between points in the point cloud data is weak, and the above encoding methods will generate more redundancy and thus have a poor compression performance when encoding such point cloud data.
In order to improve the compression performance when the laser radar point cloud is encoded, related technicians try to divide the laser radar point cloud into different local regions and encode the point cloud using various geometric models. Such a method can indeed achieve a better compression performance, but it cannot realize lossless encoding due to the filtering of abnormal points and floating point operations.
Therefore, how to improve the compression performance when lossless encoding is performed on the laser radar point cloud is a problem to be solved.
Embodiments of the present application provide a point cloud encoding method and apparatus, an electronic device, a medium and a program product, which are used for improving a compression performance when lossless encoding is performed on a laser radar point cloud.
Some embodiments of the present application provide a point cloud encoding method, which may include steps of:
In the above implementation process, image layer division is performed on the laser radar point cloud to determine the different types of image layers, thus facilitating further determining a region segmentation method according to the type of each image layer. Region segmentation is performed on the corresponding image layer according to the region segmentation method corresponding to each type, so as to obtain the region images of each image layer, thus facilitating arranging the region images of each image layer, thereby reducing an occupied space of the region images and a redundancy of image data storage. Further, each arranged image is encoded based on the encoding method correspondingly set for the type of the corresponding arranged image, thereby realizing lossless encoding of the laser radar point cloud and improving the compression performance during encoding of the laser radar point cloud.
In an embodiment, the types of the image layers may include: a noise type, a ground type and an object type. The step of performing image layer division on a to-be-processed laser radar point cloud to generate different types of image layers includes steps of:
In the above implementation process, the image layer division is performed on the laser radar point cloud by filtering, so as to obtain the image layer of the noise type and the image layer of the non-noise type. Further, the image layer division is performed on the image layer of the non-noise type by ground extraction, so as to obtain the image layer of the ground type and the image layer of the object type, so that a layered processing of the point clouds with different characteristics in the laser radar point cloud is realized.
In an embodiment, the step of performing region segmentation on each image layer using a region segmentation method correspondingly set for a type of the corresponding image layer, so as to obtain region images corresponding to each image layer may include steps of:
In the above implementation process, region segmentation is performed on the image layer of each type to obtain the region images corresponding to each image layer, thus facilitating arranging adjacently the region images of each image layer.
In an embodiment, the step of performing object segmentation on the image layer of the object type to obtain each object region image of an object type may include steps of:
In the above implementation process, the region segmentation is performed on the image layer of the object type to obtain object region images, such that objects in the image layer of the object type are segmented as independent region images, which further provides a basis for the subsequent arrangement of the region images of the object type.
In an embodiment, the step of performing ground segmentation on the image layer of the ground type to obtain each ground region image of a ground type may include steps of:
In the above implementation process, the ground region images of the image layer of the ground type are generated by Gaussian fitting, such that the ground region images of the image layer of the ground type are segmented as independent region images, which further provides a basis for the subsequent arrangement of the region images of the ground type.
In an embodiment, the step of performing coordinate system conversion on each coordinate point in the image layer of the object type based on a coordinate system of the image layer of the object type and a reference coordinate system, so as to obtain a mapped object image of the image layer of the object type in the reference coordinate system may include a step of:
In an embodiment, the step of performing noise segmentation on the image layer of the noise type to obtain noise region images of a noise type may include a step of:
In the above implementation process, noises in the image layer of the noise type are segmented into respective noise region images, such that the noise region images are segmented as independent units, which further provides a basis for the subsequent arrangement of the region images of the noise type.
In an embodiment, the step of arranging the region images corresponding to each image layer to obtain arranged images corresponding to each image layer may include steps of:
In the above implementation process, the region images of each type are arranged, such that every two adjacent region images in the arranged images have a connection point, and therefore, the region images are aggregated to reduce the occupied space of the images and thus reduce the data storage redundancy.
In an embodiment, the step of encoding each arranged image based on an encoding method correspondingly set for the type of the corresponding arranged image, so as to obtain encoded data of the laser radar point cloud may include steps of:
In the above implementation process, the arranged images of each type are encoded using the encoding method correspondingly set for the type of the arranged images to obtain the encoded data of the image layer of each type, such that the image layer of each type forms a data stream, thus facilitating data transmission.
Some other embodiments of the present application provide a point cloud encoding apparatus, which may include:
In an embodiment, the types of the image layers may include: a noise type, a ground type and an object type; the image layer division unit may be specifically configured to:
In an embodiment, the region segmentation unit may be specifically configured to:
In an embodiment, the region segmentation unit may be specifically configured to:
In an embodiment, the region segmentation unit may be specifically configured to:
In an embodiment, the region segmentation unit may be specifically configured to:
In an embodiment, the arranging unit may be specifically configured to:
In an embodiment, the encoding unit may be specifically configured to:
Further embodiments of the present application provide an electronic device, which may include:
Still further embodiments of the present application provide a computer-readable storage medium, having a computer program stored thereon, wherein the computer program, when executed by a processor, may implement the steps in the method according to any one of the above embodiments.
Other embodiments of the present application provide a computer program product which, when run on a computer, may cause the computer to perform the method according to any one of the above embodiments.
In order to make the above-mentioned objects, features, and advantages of the embodiments of the present application more apparent, preferred embodiments are described in detail hereinafter with reference to the accompanying drawings.
To describe the technical solutions in the embodiments of the present application more clearly, the following briefly describes the accompanying drawings required in the embodiments of the present application. It should be understood that the following accompanying drawings show merely some embodiments of the present application and therefore should not be considered as limiting the scope, and a person of ordinary skill in the art may still derive other related drawings from these accompanying drawings without creative efforts.
The technical solutions in the embodiments of the present application are clearly and completely described with reference to the accompanying drawings in the embodiments of the present application, and apparently, the described embodiments are not all but only a part of the embodiments of the present application. Generally, the components of the embodiments of the present application described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations. Accordingly, the following detailed description of the embodiments of the present application provided in the drawings is not intended to limit the scope of protection of the present application, but only represents selected embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative efforts shall fall within the protection scope of the present application.
It should be noted that similar reference signs and letters denote similar items in the following drawings. Therefore, once a certain item is defined in one figure, it does not need to be further defined and explained in the subsequent figures. Meanwhile, in the description of the present application, the terms such as “first”, “second”, or the like, are only used for distinguishing descriptions and are not intended to indicate or imply relative importance.
Some terms referred to in the embodiments of the present application will be described first to facilitate understanding by those skilled in the art.
Terminal device: it may be a mobile terminal, a fixed terminal, or a portable terminal, such as a mobile phone, a site, a unit, a device, a multimedia computer, a multimedia tablet, an Internet node, a communicator, a desktop computer, a laptop computer, a notebook computer, a netbook computer, a tablet computer, a personal communication system device, a personal navigation device, a personal digital assistant, an audio/video player, a digital camera/camcorder, a positioning device, a television receiver, a radio broadcast receiver, an electronic book device, a gaming device, or any combination thereof, including accessories and peripherals of these devices, or any combination thereof. It is also contemplated that the terminal device can support any type of user interface (for example, a wearable device), or the like.
Server: it may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server for providing a cloud service, a cloud database, cloud computing, a cloud function, a cloud storage, a network service, a cloud communication, a middleware service, a domain name service, a security service, and a basic cloud computing service, such as big data, an artificial intelligence platform, or the like.
A three-dimensional point cloud is an important representation of real world digitization. With the rapid development of three-dimensional scanning devices, the precision and resolution of the obtained point clouds are continuously improved. High-precision point clouds are widely applied to the construction of urban digital maps, and serve as technical support in a plurality of popular research areas, such as smart cities, unmanned driving, cultural relic preservation, or the like. A point cloud is a set of points obtained by scanning a surface of an object using a three-dimensional scanning device; one point cloud frame generally contains on the order of a million points, wherein each point contains attribute information, such as geometric information, color, reflectivity, or the like, and the data quantity is huge. The huge data quantity of the three-dimensional point cloud brings huge challenges to data storage, transmission, or the like, such that compression of point clouds is quite necessary.
Currently, technicians usually encode point cloud data using a progressive octree, a prediction tree, a dynamic binary decomposition, a shape-adaptive wavelet transform, a graph transformation and other methods.
However, the above encoding methods achieve a good compression performance only when the point cloud data has strong correlation between points. If there are many discontinuous regions in the point cloud (for example, a laser radar point cloud), the correlation between points in the point cloud data is weak, and the above encoding methods will generate more redundancy and thus have a poor compression performance when encoding such point cloud data.
In order to improve the compression performance when the laser radar point cloud is encoded, related technicians try to divide the laser radar point cloud into different local regions and encode the point cloud using various geometric models. Such a method can indeed achieve a better compression performance, but it cannot realize lossless encoding due to the filtering of abnormal points and floating point operations.
Thus, the present application provides a point cloud encoding method and apparatus, an electronic device, a medium and a program product, which are used for improving a compression performance when lossless encoding is performed on a laser radar point cloud.
In the embodiments of the present application, a subject for executing the method may be an electronic device, and optionally, the electronic device may be a server or a terminal device, but the present application is not limited thereto.
Reference is made to
Specifically, the types of the image layers may include: a noise type, a ground type and an object type, and following steps may be implemented when step 101 is performed:
Specifically, the to-be-processed laser radar point cloud is filtered using a filtering algorithm, so as to generate the image layer of the noise type and the image layer of the non-noise type.
As an embodiment, the to-be-processed laser radar point cloud is a point cloud generated by scanning a surrounding environment with a laser radar in an automatic driving scenario.
As an embodiment, the to-be-processed laser radar point cloud is filtered using a radius outlier removal filter (RORF) algorithm to generate the image layer of the noise type and the image layer of the non-noise type.
It should be noted that, in the embodiment of the present application, as an example for description, the RORF algorithm is taken as the filtering algorithm, and in practical applications, the filtering algorithm may be a conditional filtering algorithm or a domain filtering algorithm, which is not limited herein.
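As an illustrative sketch only (not the application's implementation), a radius-outlier split of a point cloud into a noise layer and a non-noise layer might look like the following; the `radius` and `min_neighbors` values are assumptions chosen for the example:

```python
import numpy as np

def radius_outlier_split(points, radius=1.0, min_neighbors=2):
    """Split a point cloud into non-noise and noise layers.

    A point stays in the non-noise layer when at least `min_neighbors`
    other points lie within `radius` of it (radius outlier removal).
    """
    # Pairwise distances (O(n^2) brute force; fine for a sketch).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Count neighbors within the radius, excluding the point itself.
    neighbor_counts = (dist <= radius).sum(axis=1) - 1
    keep = neighbor_counts >= min_neighbors
    return points[keep], points[~keep]  # (non-noise layer, noise layer)

# A tight cluster plus one isolated outlier.
cloud = np.array([[0, 0, 0], [0.1, 0, 0], [0, 0.1, 0], [10, 10, 10]], float)
non_noise, noise = radius_outlier_split(cloud, radius=0.5, min_neighbors=2)
```

For million-point frames, a KD-tree neighbor query would replace the brute-force distance matrix.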
Specifically, ground extraction may be performed on the image layer of the non-noise type using a fitting algorithm, so as to generate the image layer of the ground type and the image layer of the object type.
As an embodiment, ground extraction may be performed on the image layer of the non-noise type using an M-estimator sample consensus (MSAC) algorithm.
Specifically, in a hypothesis stage, in the MSAC, a small part of points are extracted from the image layer of the non-noise type as a subset using a strategy of a random sample consensus algorithm, and then, parameters of a ground model are estimated based on the extracted subset, wherein the ground model may be defined as:
ax + by + cz + d = 0  (1)
During hypothesis, in the MSAC, a plurality of planes of the ground model may be generated.
During verification, the points left in the point cloud are used to determine the most appropriate hypothesis. Typically, a cost function is used to evaluate the hypothesis, and can be defined as:
C = Σ_i ρ_2(e_i^2)  (2)
The hypothesis with the smallest cost value can be selected as H.
Optionally, the image layer of the ground type may be generated by fitting the ground.
Optionally, after ground extraction is performed on the image layer of the non-noise type, the remaining part of the point cloud may be used as the image layer of the object type.
It should be noted that, in the present application, as an example for description, the MSAC algorithm is taken as the fitting algorithm, and in practical applications, the fitting algorithm may be a least median method or a random sample consensus algorithm, which is not limited herein.
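Under the ground model of formula (1) and the capped cost of formula (2), the MSAC hypothesis-and-verification loop described above can be sketched as follows; the iteration count and inlier threshold are illustrative assumptions, not values from the application:

```python
import numpy as np

def msac_plane(points, n_iters=200, threshold=0.05, rng=None):
    """Fit a ground plane ax + by + cz + d = 0 with an MSAC-style loop.

    Residuals inside `threshold` contribute their squared error to the
    cost; residuals outside contribute a constant penalty. This capped
    loss is what distinguishes MSAC from plain RANSAC.
    """
    rng = np.random.default_rng(rng)
    best_cost, best_model = np.inf, None
    for _ in range(n_iters):
        # Hypothesis stage: plane through 3 randomly sampled points.
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:              # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ p0
        # Verification stage: capped squared-error cost over all points.
        e = np.abs(points @ normal + d)
        cost = np.where(e < threshold, e**2, threshold**2).sum()
        if cost < best_cost:
            best_cost, best_model = cost, (*normal, d)
    return best_model

# Mostly ground points on z = 0 plus a few "object" points above it.
rng = np.random.default_rng(0)
ground = np.column_stack([rng.uniform(-5, 5, (80, 2)), np.zeros(80)])
objects = np.column_stack([rng.uniform(-5, 5, (10, 2)), rng.uniform(1, 3, 10)])
a, b, c, d = msac_plane(np.vstack([ground, objects]), rng=1)
```

The recovered plane for this synthetic cloud is z = 0, and the off-plane points would remain for the object layer.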
As shown in
Referring to
Referring to
In the above implementation process, image layer division may be performed on the laser radar point cloud by filtering, so as to obtain the image layer of the noise type and the image layer of the non-noise type; further, image layer division may be performed on the image layer of the non-noise type by ground extraction, so as to obtain the image layer of the ground type and the image layer of the object type, thereby realizing layered processing of point clouds with different characteristics in the laser radar point cloud.
Step 102: performing region segmentation on each image layer using a region segmentation method correspondingly set for a type of the corresponding image layer, so as to obtain region images corresponding to each image layer.
Specifically, object segmentation is performed on the image layer of the object type based on a mapped segmentation algorithm, so as to obtain object region images of an object type, wherein each object region image is an independent unit.
Specifically, following steps may be implemented when step 102 is performed:
Specifically, following steps may be implemented when S1021 is performed:
Specifically, each coordinate point in the coordinate system of the image layer of the object type is mapped into the reference coordinate system using a preset resolution, so as to obtain the mapped object image of the image layer of the object type in the reference coordinate system.
Object segmentation can be performed on the mapped object image based on a robust segmentation algorithm to obtain the segmented object images.
The segmented object region images may be matched with objects in the image layer of the object type respectively to obtain matching results, wherein the matching results may include successful matching and unsuccessful matching.
In the above, points which are not successfully matched can be separated from the image layer of the object type and transferred to the image layer of the noise type.
Specifically, the successfully matched objects are screened out from the objects in the image layer of the object type according to the matching results.
Specifically, in the image layer of the object type, the object region images corresponding to the successfully matched objects are segmented.
As an embodiment,
In the above implementation process, region segmentation may be performed on the image layer of the object type to obtain object region images, such that objects in the image layer of the object type are divided into independent region images, which further provides a basis for the subsequent arrangement of the region images.
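The mapped-segmentation idea above, i.e. projecting object-layer points onto a reference grid at a preset resolution and segmenting connected regions, can be sketched as follows. The grid quantization and 4-connected flood fill here are illustrative stand-ins; the application's actual mapping and "robust segmentation algorithm" are not specified in this text:

```python
import numpy as np

def grid_map_and_label(points, resolution=1.0):
    """Map object-layer points onto a 2D grid at a preset resolution and
    label 4-connected occupied cells as separate object regions."""
    # Quantize x/y into grid cells (z is ignored for the mapped image).
    cells = np.floor(points[:, :2] / resolution).astype(int)
    occupied = {tuple(c) for c in cells}
    labels, next_label = {}, 0
    for cell in sorted(occupied):
        if cell in labels:
            continue
        # Flood-fill one 4-connected component of occupied cells.
        stack = [cell]
        while stack:
            cx, cy = stack.pop()
            if (cx, cy) in labels or (cx, cy) not in occupied:
                continue
            labels[(cx, cy)] = next_label
            stack.extend([(cx + 1, cy), (cx - 1, cy),
                          (cx, cy + 1), (cx, cy - 1)])
        next_label += 1
    # Each point inherits the label of its cell: one region image per label.
    return np.array([labels[tuple(c)] for c in cells]), next_label

pts = np.array([[0.2, 0.3, 1], [0.8, 0.1, 2],   # cluster A
                [5.0, 5.0, 1], [5.4, 5.9, 2]])  # cluster B
point_labels, n_regions = grid_map_and_label(pts, resolution=1.0)
```

Points whose cells end up in no valid region would then be the unmatched points transferred to the noise layer.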
Ground segmentation is performed on the image layer of the ground type using a Gaussian mixture model to obtain ground region images of the ground type.
Specifically, following steps may be implemented when S1022 is performed:
As an embodiment, a coordinate point (x, y, z) in the image layer of the ground type is converted to a corresponding point (θ, φ, z) in the reference coordinate system, wherein θ is elevation angle data for the coordinate point.
Specifically, the coordinate point (θ, φ, z) in the reference coordinate system and the coordinate point (x, y, z) in the image layer of the ground type may be converted by the following expressions:
The elevation angle data of each of the coordinate points in the reference coordinate system can be subjected to Gaussian fitting to obtain a plurality of Gaussian density functions, wherein an image corresponding to each Gaussian density function is a ground region image.
As an embodiment, the ground region images of the image layer of the ground type are obtained according to the images corresponding to respective Gaussian density functions.
In the above implementation process, the ground region images of the image layer of the ground type may be generated by Gaussian fitting, such that the ground region images of the image layer of the ground type are segmented as independent region images, which provides a basis for the subsequent arrangement of the region images of the ground type.
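The coordinate system conversion used for the ground layer maps (x, y, z) to (θ, φ, z). The application's own conversion expressions are not reproduced in this text, so the conventional elevation/deflection formulas below are an assumption:

```python
import math

def to_angular(x, y, z):
    """Convert a Cartesian point to the (theta, phi, z) form used for the
    ground layer: theta is the elevation angle, phi the deflection
    (azimuth) angle, and z is carried over unchanged.

    These are the conventional formulas, assumed for illustration.
    """
    phi = math.atan2(y, x)                    # deflection angle
    theta = math.atan2(z, math.hypot(x, y))   # elevation angle
    return theta, phi, z

theta, phi, z = to_angular(1.0, 1.0, 1.0)
```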
Specifically, any of the following manners may be implemented when S1023 is performed:
Specifically, in the present application, region division for the image layer of the noise type in the second manner is taken as an example for description. When the second manner is executed, noise segmentation may be performed on the noises in the image layer of the noise type to obtain noise region images of the noise type.
As an embodiment, the noises in the image layer of the noise type can be divided into the noise region images, wherein the noise region images are independent units.
In the above implementation process, noises in the image layer of the noise type may be divided into respective noise region images, such that noise region images are divided into independent units, which further provides a basis for the subsequent arrangement of the region images.
Step 103: arranging the region images corresponding to each image layer to obtain arranged images corresponding to each image layer.
Therefore, every two adjacent region images in the arranged images may have a connection point, wherein the type of each image layer is the same as that of the corresponding arranged images.
Specifically, following steps may be implemented when step 103 is performed:
Specifically, the object region images are aggregated, such that the object region images are adjacently arranged to obtain the arranged images of the object type.
As an embodiment, a packing algorithm may be used to surround each object region image with a minimum bounding box containing all points of the corresponding object region image, and the object region images may be aggregated by moving their bounding boxes.
As shown in
As shown in
In the above implementation process, the object region images can be packed to a smaller space by adjacently arranging the object region images, thereby saving the space and reducing data redundancy.
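A minimal shelf-packing sketch of the bounding-box arrangement described above follows; the application does not specify its packing algorithm, so this simple shelf packer and its shelf-width heuristic are illustrative assumptions:

```python
def shelf_pack(boxes):
    """Arrange region-image bounding boxes side by side on shelves so
    that adjacent images touch, compacting the overall occupied area.

    `boxes` is a list of (width, height) minimum bounding boxes; the
    returned offsets place each box's lower-left corner.
    """
    shelf_w = max(w for w, _ in boxes) * 2   # illustrative shelf width
    offsets, x, y, shelf_h = [], 0.0, 0.0, 0.0
    for w, h in boxes:
        if x + w > shelf_w:                  # start a new shelf
            x, y = 0.0, y + shelf_h
            shelf_h = 0.0
        offsets.append((x, y))
        x += w                               # next box sits adjacent
        shelf_h = max(shelf_h, h)
    return offsets, (shelf_w, y + shelf_h)

boxes = [(4, 2), (3, 3), (5, 1), (2, 2)]
offsets, (total_w, total_h) = shelf_pack(boxes)
```

Each placed box touches its neighbor on the shelf, mirroring the "connection point" property of the arranged images.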
Specifically, the ground region images can be aggregated using a Gaussian mixture model to obtain the arranged images of the ground type.
As an embodiment, the Gaussian mixture model may be used to perform non-linear division, and a graph corresponding to each Gaussian density function is used as one ground region image.
The Gaussian mixture model is described as the sum of M Gaussian density functions, and it can be expressed as:
p(v | w_i, μ_i, Σ_i) = Σ_{i=1}^{M} w_i g(v | μ_i, Σ_i)  (6)
The parameters of the Gaussian density functions are estimated using maximum likelihood estimation by maximizing the likelihood of the given fitting data V = {v_1, . . . , v_T}, and the formula is as follows:
p(V | w_i, μ_i, Σ_i) = Π_{t=1}^{T} p(v_t | w_i, μ_i, Σ_i)  (8)
The maximum likelihood estimates of the above parameters are obtained by performing iterations using the expectation maximization algorithm, since formula (8) does not admit a closed-form solution. However, the expectation maximization algorithm converges to a local optimum that depends on the initial conditions. Thus, the initial point affects the performance of the Gaussian mixture model in fitting the data distribution, and the Gaussian mixture model is therefore initialized using a modified clustering algorithm (k-means++).
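The fitting pipeline of formulas (6)-(8), k-means++-style seeding followed by expectation maximization iterations, can be sketched in one dimension (as used for the elevation angles) as follows; this is an illustrative implementation, not the application's:

```python
import numpy as np

def fit_gmm_1d(v, m, n_iter=50, rng=0):
    """Fit a 1D mixture of m Gaussian densities by EM, with
    k-means++-style seeding of the means (EM only reaches a local
    optimum, so the seeding matters)."""
    rng = np.random.default_rng(rng)
    # k-means++ seeding: first mean uniform, later means drawn with
    # probability proportional to squared distance from chosen means.
    mu = [v[rng.integers(len(v))]]
    for _ in range(m - 1):
        d2 = np.min((v[:, None] - np.array(mu)[None, :]) ** 2, axis=1)
        mu.append(v[rng.choice(len(v), p=d2 / d2.sum())])
    mu = np.array(mu)
    w = np.full(m, 1.0 / m)
    var = np.full(m, v.var() + 1e-6)
    for _ in range(n_iter):
        # E step: responsibilities under each Gaussian density.
        g = np.exp(-0.5 * (v[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = w * g
        r /= r.sum(axis=1, keepdims=True)
        # M step: re-estimate weights, means, variances.
        nk = r.sum(axis=0)
        w = nk / len(v)
        mu = (r * v[:, None]).sum(axis=0) / nk
        var = (r * (v[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return w, mu, var

# Two well-separated elevation-angle clusters.
rng = np.random.default_rng(1)
v = np.concatenate([rng.normal(-1.0, 0.05, 200), rng.normal(1.0, 0.05, 200)])
w, mu, var = fit_gmm_1d(v, m=2)
```

Each recovered Gaussian density then corresponds to one ground region image.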
The noise region images may be aggregated to arrange the noise region images, so as to obtain the arranged images of the noise type.
In the above implementation process, the region images of each type are adjacently arranged to obtain the arranged images of each type, and therefore, the region images are aggregated to reduce the occupied space of the images and thus reduce the redundancy.
Step 104: encoding each arranged image based on an encoding method correspondingly set for the type of the corresponding arranged image, so as to obtain encoded data of the laser radar point cloud.
Specifically, following steps may be implemented when step 104 is performed:
Specifically, the encoding method set for the arranged image of the noise type is binary differential encoding.
A process of encoding the arranged image of the noise type using binary differential encoding includes:
It should be noted that the reference coordinate system may be a Cartesian three-dimensional space coordinate system, or another three-dimensional space coordinate system, which is not limited herein.
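The differential half of the noise-layer encoding can be illustrated as follows; the binarization of the residual stream is omitted, and coordinate values are assumed to be integer-quantized so decoding is exact:

```python
def delta_encode(coords):
    """Differentially encode a sequence of integer coordinates: store
    the first value, then successive differences. Decoding by running
    sum recovers the input exactly, so the step is lossless."""
    out = [coords[0]]
    for prev, cur in zip(coords, coords[1:]):
        out.append(cur - prev)   # small residuals compress well
    return out

def delta_decode(deltas):
    vals, acc = [], 0
    for d in deltas:
        acc += d
        vals.append(acc)
    return vals

xs = [1000, 1003, 1001, 1010]   # quantized noise-point x coordinates
encoded = delta_encode(xs)
```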
Specifically, the encoding method set for the arranged images of the object type is octree encoding.
As an embodiment, a point cloud in the arranged images of the object type is encoded using a context-based octree. First, the point cloud is divided using an implicit octree; that is, the aggregated point cloud in the arranged images of the object type is divided, according to the size of the minimum cuboid bounding box containing all points, by a binary tree, a quadtree, or an octree, and is represented as a mixed tree, wherein nodes containing points are represented as 1 and nodes not containing points are represented as 0. The current node is then encoded using the occupancy of its neighbors as context.
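The occupancy-code serialization at the heart of the octree step can be sketched as follows. The neighbor-based context modeling and the binary/quadtree mixing are omitted, and the child-octant bit order and breadth-first byte order are assumptions made for the example:

```python
def octree_occupancy(points, depth):
    """Serialize a point set as per-node occupancy codes, breadth
    first: each internal node emits one 8-bit code whose bit k is 1
    when child octant k contains at least one point (1 = occupied,
    0 = empty). `points` are integer (x, y, z) triples inside a cube
    of side 2**depth."""
    codes, level = [], [(points, depth)]
    while level:
        nxt = []
        for pts, d in level:
            if d == 0:
                continue
            half = 1 << (d - 1)
            # Partition points into the 8 child octants.
            children = [[] for _ in range(8)]
            for x, y, z in pts:
                k = (x >= half) | ((y >= half) << 1) | ((z >= half) << 2)
                children[k].append((x % half, y % half, z % half))
            codes.append(sum(1 << k for k in range(8) if children[k]))
            nxt.extend((c, d - 1) for c in children if c)
        level = nxt
    return codes

codes = octree_occupancy([(0, 0, 0), (3, 3, 3)], depth=2)
```

A real codec would feed each code to an arithmetic coder conditioned on neighbor occupancy.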
Specifically, the encoding method set for the arranged images of the ground type is Gaussian differential encoding.
A process of encoding the arranged images of the ground type using Gaussian differential encoding is as follows.
First, points belonging to the same Gaussian density function are considered as one class. Second, the mean value of each Gaussian density function is taken as the elevation angle value θ corresponding to all the points in the class. The deflection angle φ of each point in the class is represented by linear fitting, i.e., by the parameters of the fitted straight line and the corresponding abscissa value. Then, for the Z values of each class, the differences between adjacent points are encoded. Next, the currently represented (θ, φ, z) is transformed back into the original spatial coordinate system to obtain a reconstructed coordinate (x′, y′, z′). To achieve a lossless effect, the difference between the original coordinate (x, y, z) and the reconstructed coordinate (x′, y′, z′) is encoded for each point. Finally, when the arranged images of the ground type are encoded, the elements required to be encoded are: the mean value of each Gaussian density function, the linear fitting parameters, the Z coordinate differences, and the differences between the original and reconstructed coordinates.
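The lossless-residual step at the end of the ground-layer encoding can be illustrated as follows, assuming integer-quantized coordinates so the residuals are exact; the Gaussian-mean and line-fit model producing the reconstruction is stubbed out:

```python
def lossless_residuals(original, reconstructed):
    """After the lossy model (Gaussian mean for theta, line fit for phi,
    z differences) is applied, encode per-point residuals between the
    original coordinates and the model reconstruction; the decoder adds
    them back exactly, making the scheme lossless."""
    return [(ox - rx, oy - ry, oz - rz)
            for (ox, oy, oz), (rx, ry, rz) in zip(original, reconstructed)]

def apply_residuals(reconstructed, residuals):
    return [(rx + dx, ry + dy, rz + dz)
            for (rx, ry, rz), (dx, dy, dz) in zip(reconstructed, residuals)]

orig = [(10, 20, 5), (11, 22, 5)]
recon = [(10, 19, 5), (12, 22, 4)]   # model output, slightly off
res = lossless_residuals(orig, recon)
```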
The encoded data of the laser radar point cloud may be obtained based on the encoded data of the image layer of the noise type, the encoded data of the image layer of the object type and the encoded data of the image layer of the ground type.
In the above implementation process, the arranged images of each type may be encoded to obtain the encoded data of the image layer of each type, such that the image layer of each type forms a data stream, thus facilitating data transmission.
As an embodiment, the same radar point cloud is encoded using the point cloud encoding method in the present application and a current point cloud geometric compression platform, and the encoding results are compared. The comparison result is shown in
In
In the above, a sequence encoded by each platform may include Ford, Approach, Exit, Join and bends.
Average represents an average value of numbers of bits in the encoding of plural sequences by each platform.
Bits Per Point (BPP) represents the number of bits per point of the point cloud, and Compression ratio gain represents the information gain rate compared to ACI.
From
In the above implementation process, image layer division is performed on the laser radar point cloud, so that the division of point clouds with different characteristics in the laser radar point cloud is realized, which further facilitates the determination of the region division method of the image layer of each type according to the characteristics of the point clouds. In addition, the corresponding image layer is subjected to region division according to the region division method of the image layer of each type, so that a set of region images of each type are obtained, facilitating the arrangement of the set of region images of each type, thus reducing the occupied space of the region image and the redundancy of image data storage; further, the encoding method for the image layer of each type is used to encode the corresponding arranged images, thereby realizing lossless encoding of the laser radar point cloud, and improving the compression performance of the laser radar point cloud.
Reference is made to
In an embodiment, the types of the image layers may include: a noise type, a ground type and an object type; and the image layer division unit 111 may be specifically configured to:
In an embodiment, the region segmentation unit 112 may be specifically configured to:
In an embodiment, the region segmentation unit 112 may be specifically configured to:
In an embodiment, the region segmentation unit 112 may be specifically configured to:
In an embodiment, the region segmentation unit 112 may be specifically configured to:
In an embodiment, the arranging unit 113 may be specifically configured to:
In an embodiment, the encoding unit 114 may be specifically configured to:
It should be noted that the apparatus 110 shown in
Reference is made to
An embodiment of the present application provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method process shown in
The present application further provides a computer program product which, when run on a computer, causes the computer to perform the method shown in
In several embodiments provided in the present application, it should be understood that the disclosed system and method may be implemented in other manners. The described system embodiment is only exemplary. For example, the division of system apparatus is only a logical function division and may be other division in actual implementation. For another example, a plurality of apparatuses or components may be combined or integrated into another system, or some features may be ignored or not performed.
In addition, the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one position, or may be distributed on a plurality of network units. A part or all of the units may be selected according to an actual need to achieve the objectives of the solutions in the embodiments.
The above description is only embodiments of the present application and is not intended to limit the protection scope of the present application, and various modifications and changes may be made to the present application by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
The present application provides the point cloud encoding method and apparatus, the electronic device, the medium, and the program product. The method includes: performing image layer division on the to-be-processed laser radar point cloud to generate the different types of image layers; performing region segmentation on each image layer using the region segmentation method correspondingly set for the type of the corresponding image layer, so as to obtain the region images corresponding to each image layer; arranging the region images corresponding to each image layer to obtain the arranged images corresponding to each image layer; and encoding each arranged image based on the encoding method correspondingly set for the type of the corresponding arranged image to obtain the encoded data of the laser radar point cloud. Therefore, the compression performance may be improved when lossless encoding is performed on the laser radar point cloud.
Furthermore, it may be understood that the point cloud encoding method and apparatus, the electronic device, the medium, and the program product according to the present application are reproducible and may be used in various industrial applications. For example, the point cloud encoding method and apparatus, the electronic device, the medium, and the program product according to the present application may be used in the field of point cloud processing technologies.
Number | Date | Country | Kind
202111138592.X | Sep 2021 | CN | national
Filing Document | Filing Date | Country | Kind
PCT/CN2021/129264 | 11/8/2021 | WO