The present disclosure relates generally to simplifying a point cloud, and in particular, to simplifying a point cloud that is either organized or not organized.
With the recent development of 3D sensing technologies, 3D point clouds have become a practical format to represent data points in many applications. A sensing device measures a large number of points on an object's surface, and the point cloud represents the set of points that has been measured. A point cloud typically includes a massive data set that defines a large number of data points in some coordinate system. For example, a laser scan of a physical object will typically produce a data set that contains millions of data points, each specified by a 3-tuple using orthogonal coordinates that represent 3D space (e.g. x, y, z).
The processing, analysis and reporting of such large point cloud data sets can be difficult. In particular, it is often the case that the size of a point cloud data set exceeds the design and performance capabilities of the systems that need to make use of this data. Consequently, methods for reducing the size of point cloud data sets are an important pre-processing step in order to reduce the volume of data to a level suitable for consuming systems. The simplified or reduced point cloud data can then be more efficiently processed.
There are a number of related art approaches for simplifying point cloud data. However, these related art approaches either carry a significant risk of losing data, such as key features of the objects and/or surfaces represented by the data (e.g. sub-sampling decimation, uniform spatial clustering), or are complicated to implement and therefore incur greater computational expense and require more processing time.
Therefore, a need exists in the art for an improved way to simplify large point cloud data sets.
Embodiments of the present disclosure are directed to simplifying a point cloud that is either organized or not organized, by resampling the point cloud to preserve a subset of key points. This approach reduces the number of points without changing the locations of the original points.
The embodiments of the present disclosure are based on a realization that the point cloud does not need to be represented in a format suitable for all applications. Specifically, the point cloud can be represented in a format tailored for a specific application or for different applications, such that the point cloud can be reformatted into different formats/representations. By reformatting the point cloud into different formats/representations, the point cloud can be reformatted or pruned to preserve only points necessary for specific applications. At least one goal is to design application-dependent resampling strategies, preserving selected information depending on specific underlying applications. For example, the task of contour detection in a point cloud usually requires careful and intensive computation, such as calculating surface normals and classifying points. Instead of working with an entire point cloud, it is more efficient to resample a small subset of points that is sensitive to the required contour information, making the subsequent computation much cheaper without losing detection performance. As another example, in a visualization and/or object modeling application, contours and some texture of specific objects (but not others) can be preserved.
We realized it is more efficient to store multiple versions of the point cloud, each pruned for a specific purpose, than one version of the point cloud suitable for all purposes. This can be true even when different pruned point clouds share the same points. For example, 100,000 points in an original point cloud can be reduced to 60,000 points, or pruned into five different pruned point groups of 5,000 points each. Thus, by pruning the point cloud for different applications to produce different pruned point sets, and executing a specific application with the corresponding pruned points, the pruning can preserve a subset of key points specific to that application. Other advantages can include reduced computational complexity and time, and reduced overall cost to run the specific application, when compared to running the application on the entire point cloud.
The present disclosure discloses techniques for selecting a subset of points that are rooted in graph signal processing, which is a framework to learn about the interaction between signals and graph structure(s). We use a graph to capture the local dependencies between points, representing a discrete version of the surface of an object. At least one advantage of using a graph is that it captures both local and global structures of point clouds. Under the framework of the present disclosure, the 3D coordinates and other attributes associated with each point are graph signals indexed by the nodes of the underlying graph. Thus, it becomes possible to formulate the resampling problem as sampling of graph signals. However, graph sampling techniques usually select samples deterministically, solving nonconvex optimization problems to obtain samples sequentially, which requires expensive computation. To reduce the computational cost, the present disclosure uses efficient randomized resampling strategies to choose a subset of key points from the input point cloud. The main idea is to generate subsamples according to a certain sampling distribution, which is both fast and effective at preserving information in the original input point cloud.
In other words, the present disclosure considers a feature-extraction based resampling framework; that is, the resampled points preserve selected information depending on the particular needs of a specific application. Then, based on a general feature-extraction operator, it is possible to quantify the quality of resampling by using a reconstruction error and to derive its exact form. The optimal sampling distribution can be obtained by optimizing the expected reconstruction error. The present disclosure provides an optimal sampling distribution that is guaranteed to be shift and rotation invariant. The feature-extraction operator is then specified to be a graph filter, and the resampling strategies based on all-pass, low-pass and high-pass graph filtering are studied. In each case, it is possible to derive the optimal sampling distribution and validate the performance on both simulated and real data.
Another way to explain this realization, or to better understand how the pruning can be accomplished, is that the present disclosure scores each node on the graph according to the structure of the graph, based on the values of its neighboring nodes. A scoring function(s) can be selected based on the specific application, such that each different application can have its own scoring function or a multitude of scoring functions. For example, for contour determination, the scoring function can be an error in representation of the node as a function of neighboring nodes. As another example, different scoring functions can consider different attributes of the node. We realized that the scoring can determine probabilities of the nodes, which can be used with “random” resampling to handle the points with the same “score” values.
In solving for resampling or processing the input point cloud, at least one system begins by accessing the input point cloud, wherein the input point cloud includes points, and each point includes a set of attributes including two dimensional (2D) and three dimensional (3D) coordinates and other attributes. The next step is to construct a graph (i.e. composed of graph vertices and graph edges) representing the input point cloud, based on each point in the input point cloud representing a node in the graph, and identifying and connecting two neighboring nodes in the graph to obtain a graph edge.
Then, a graph filtering function is determined based on the constructed graph, i.e. a graph operator is determined as per certain criteria to promote or maintain certain information in the input point cloud. A set of attributes from the input point cloud can also be selected according to the specific application requirement, e.g. to maintain geometric information and/or texture structure.
This is followed by filtering each point in the input point cloud by selecting a subset of attributes for the points, and applying the graph filtering function on the selected subset of attributes, to determine at least one value for each point in the input point cloud. Using the at least one value for each point in the input point cloud, a probability is produced for each point, based on the at least one value of the point compared to a total of all values of the points in the input point cloud, and a predetermined number of points in an output point cloud. In other words, an importance score can be calculated for each point in the point cloud using the selected graph operator, and based on the importance scores, a probability is generated for each point.
Finally, the input point cloud is sampled using random evaluation of the probabilities of each point, to obtain a subset of points in the input point cloud, wherein the subset of points is the output point cloud. That is, a subset of points is determined based on the probabilities, and an expected total number of points can be outputted for further usage. For example, the output point cloud can be stored in memory or outputted via an output interface in communication with the processor. It is noted that the output point cloud, i.e. the selected subset of key points from the system, can be processed more efficiently later than the input point cloud.
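The steps above (graph construction, graph filtering, scoring, probability generation and random sampling) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `resample_point_cloud`, the unweighted distance-threshold graph, the dense O(N²) distance computation, the particular high-pass operator (each point minus the average of its neighbors), the zero score given to isolated points, and sampling with replacement are all choices made here for illustration.

```python
import numpy as np


def resample_point_cloud(points, num_out, tau, rng):
    """Return (indices, probabilities) for a randomized resampling of `points`.

    points  : (N, 3) array of 3D coordinates (the selected attributes).
    num_out : predetermined number of points in the output point cloud.
    tau     : distance threshold used to connect neighboring points (assumed).
    rng     : numpy.random.Generator used for the random sampling.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]

    # Graph construction: connect two points when they are closer than tau.
    # Dense pairwise distances are only practical for small N (illustration).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    adjacency = (dist2 <= tau ** 2).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # assumed: no self-loops

    # Graph filtering: an assumed high-pass operator, x_i minus the mean of
    # its neighbors, i.e. the error in representing a node by its neighbors.
    degree = adjacency.sum(axis=1)
    isolated = degree == 0
    transition = adjacency / np.where(isolated, 1.0, degree)[:, None]
    response = points - transition @ points
    response[isolated] = 0.0  # assumed: isolated points get a zero score

    # One value (importance score) per point, then a probability per point:
    # the point's value divided by the total of all values.
    scores = np.sum(response ** 2, axis=1)
    total = scores.sum()
    probs = scores / total if total > 0 else np.full(n, 1.0 / n)

    # Random sampling of the predetermined number of points (with replacement).
    indices = rng.choice(n, size=num_out, replace=True, p=probs)
    return indices, probs
```

For a flat, regularly spaced patch, interior points are reproduced exactly by the average of their neighbors and receive zero probability, so only boundary (contour) points are drawn. The output point cloud is then `points[indices]`.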
For example, in one embodiment for large-scale visualization, it may be easier for a viewer to catch important details in a point cloud of a city environment when using high-pass graph filtering based resampling. In such an embodiment, the proposed high-pass graph filtering based resampling strategy can be used to select a small subset of points that highlights the contours of buildings and streets in an urban scene.
In another example, the present disclosure may have another embodiment for robust shape modeling, in which it may be more efficient and accurate to identify the object model parameters when using a proposed low-pass graph filtering based resampling strategy to select a small subset of points. Such modeling may involve finding the surfaces in a point cloud in which noise or outliers are present, and the present disclosure can be used to solve the problem(s).
According to an embodiment of the present disclosure, a system is provided for processing an input point cloud having points, wherein each point includes a set of attributes including two dimensional (2D) and three dimensional (3D) coordinates and other attributes. The system includes sensors sensing a scene, the sensors being in communication with a computer readable memory to produce the input point cloud. The system includes an output interface. The system further includes a processor in communication with the computer readable memory, wherein the processor is configured to access the input point cloud, and construct a graph representing the input point cloud, based on each point in the input point cloud representing a node in the graph, and identifying and connecting two neighboring nodes in the graph to obtain a graph edge. The processor is further configured to determine a graph filtering function based on the constructed graph; filter each point in the input point cloud by selecting a subset of attributes for the points and applying the graph filtering function on the selected subset of attributes, to determine at least one value for each point in the input point cloud; produce a probability for each point, based on the at least one value of the point compared to a total of all values of the points in the input point cloud, and a predetermined number of points in an output point cloud; sample the input point cloud using random evaluation of the probabilities of each point, to obtain a subset of points in the input point cloud, wherein the subset of points is the output point cloud; and, finally, store the output point cloud in the computer readable memory or output the output point cloud via the output interface in communication with the processor, wherein the output point cloud is used to assist in subsequent processing and in management of the input cloud data.
According to another embodiment of the present disclosure, a method is provided for processing an input point cloud having points, wherein each point includes a set of attributes including two dimensional (2D) and three dimensional (3D) coordinates and other attributes. The method includes sensing a scene via sensors, the sensors being in communication with a computer readable memory to produce the input point cloud. The method uses a processor in communication with the computer readable memory, wherein the processor is configured for accessing the input point cloud, and constructing a graph representing the input point cloud, based on each point in the input point cloud representing a node in the graph, and identifying and connecting two neighboring nodes in the graph to obtain a graph edge. The method further includes determining a graph filtering function based on the constructed graph; filtering each point in the input point cloud by selecting a subset of attributes for the points and applying the graph filtering function on the selected subset of attributes, to determine at least one value for each point in the input point cloud; producing a probability for each point, based on the at least one value of the point compared to a total of all values of the points in the input point cloud, and a predetermined number of points in an output point cloud; sampling the input point cloud using random evaluation of the probabilities of each point, to obtain a subset of points in the input point cloud, wherein the subset of points is the output point cloud; and, finally, storing the output point cloud in the computer readable memory or outputting the output point cloud via the output interface in communication with the processor, wherein the output point cloud is used to assist in subsequent processing and in management of the input cloud data.
According to another embodiment of the present disclosure, a non-transitory computer readable storage medium is provided, having embodied thereon a program executable by a computer for performing a method. The method is for processing a stored input point cloud having points, wherein each point includes a set of attributes including two dimensional (2D) and three dimensional (3D) coordinates and other attributes. The method includes sensing a scene via sensors, the sensors being in communication with the non-transitory computer readable storage medium to produce the input point cloud. The method further includes constructing a graph representing the input point cloud, based on each point in the input point cloud representing a node in the graph, and identifying and connecting two neighboring nodes in the graph to obtain a graph edge; determining a graph filtering function based on the constructed graph; filtering each point in the input point cloud by selecting a subset of attributes for the points and applying the graph filtering function on the selected subset of attributes, to determine at least one value for each point in the input point cloud; producing a probability for each point, based on the at least one value of the point compared to a total of all values of the points in the input point cloud, and a predetermined number of points in an output point cloud; sampling the input point cloud using random evaluation of the probabilities of each point, to obtain a subset of points in the input point cloud, wherein the subset of points is the output point cloud; and, finally, storing the output point cloud in the non-transitory computer readable storage medium or outputting the output point cloud via an output interface in communication with the computer, wherein the output point cloud is used to assist in subsequent processing and in management of the input cloud data.
The presently disclosed embodiments will be further explained with reference to the attached drawings. The drawings shown are not necessarily to scale, with emphasis instead generally being placed upon illustrating the principles of the presently disclosed embodiments.
While the above-identified drawings set forth presently disclosed embodiments, other embodiments are also contemplated, as noted in the discussion. This disclosure presents illustrative embodiments by way of representation and not limitation. Numerous other modifications and embodiments can be devised by those skilled in the art which fall within the scope and spirit of the principles of the presently disclosed embodiments.
The following description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the following description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing one or more exemplary embodiments. Contemplated are various changes that may be made in the function and arrangement of elements without departing from the spirit and scope of the subject matter disclosed as set forth in the appended claims.
Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it can be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, systems, processes, and other elements in the subject matter disclosed may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known processes, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. Further, like reference numbers and designations in the various drawings indicate like elements.
Also, individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed, but may have additional steps not discussed or included in a figure. Furthermore, not all operations in any particularly described process may occur in all embodiments. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, the function's termination can correspond to a return of the function to the calling function or the main function.
Furthermore, embodiments of the subject matter disclosed may be implemented, at least in part, either manually or automatically. Manual or automatic implementations may be executed, or at least assisted, through the use of machines, hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine readable medium. A processor(s) may perform the necessary tasks.
The embodiments of the present disclosure are based on a realization that the point cloud does not need to be represented in a format suitable for all applications when considering processing the input point cloud. In fact, the point cloud can be represented in a format tailored for a specific application or for different applications, and can be reformatted into different formats/representations. Reformatting or resampling the input point cloud is done to preserve only points necessary for a specific application or for multiple applications. The points preserved from the input point cloud carry selected information specific to the needs of the particular application. For example, for visualization and object modeling applications, contours and some texture of some specific objects are preserved, among other things. Specifically, we realized it is more efficient to store multiple versions of the point cloud reformatted or resampled for specific purposes than one version of the point cloud suitable for all purposes. By resampling the point cloud for different applications, we essentially produce different groups of resampled points, or subgroups of the overall input point cloud, and each specific application is then executed with its corresponding resampled points. Because of this realization, we observed reduced computational complexity and time, and reduced overall cost to run the specific application, when compared to running the application on the entire input point cloud.
The selection of a subset of points is rooted in graph signal processing, which is a framework to learn about the interaction between signals and graph structure(s). We found that by using a graph to capture the local dependencies between points, representing a discrete version of the surface of an object, the present disclosure is able to capture both local and global structures of point clouds. Under this framework, the 3D coordinates and other attributes associated with each point are graph signals indexed by nodes of the underlying graph. We discovered that this makes it possible to formulate the resampling problem as sampling of graph signals. This feature-extraction based resampling framework, i.e. resampled points preserving selected information per specific application, is based on a general feature-extraction operator, for which we quantify the quality of resampling using a reconstruction error and derive its exact form. The optimal sampling distribution is obtained by optimizing the expected reconstruction error. The proposed optimal sampling distribution is guaranteed to be shift and rotation invariant. We then present the feature-extraction operator as a graph filter and analyze the resampling strategies based on all-pass, low-pass and high-pass graph filtering. In each case, we obtain the optimal sampling distribution and validate the performance on both simulated and real data.
The resampling or pruning of the input point cloud can be generally explained as scoring each node on the graph according to the structure of the graph, based on the values of its neighboring nodes. We realized that the scoring can determine probabilities of the nodes, which can be used with “random” resampling to handle the points with the same “score” values. For example, for contour determination, the scoring function can be an error in representation of the node as a function of neighboring nodes; as another example, different scoring functions can consider different attributes of the node. A scoring function(s) is selected based on the specific application, where each different application may have its own scoring function or a multitude of scoring functions.
Referring to
Then, a graph filtering function 125 is determined based on the constructed graph 120, i.e. a graph operator is determined as per certain criteria to promote or maintain certain information in the input point cloud. A set of attributes 130 from the input point cloud can also be selected according to the specific application requirement, e.g. to maintain geometric information and/or texture structure.
This is followed by determining a value for each point 135, wherein each point in the input point cloud is filtered by selecting a subset of attributes 130 for the points, and applying the graph filtering function on the selected subset of attributes 130, to determine at least one value for each point 135 in the input point cloud. The at least one value for each point 135 in the input point cloud is used to produce a probability for each point 140, based on the at least one value of the point 135 compared to a total of all values of the points in the input point cloud, and a predetermined number of points in an output point cloud. In other words, an importance score can be calculated for each point in the point cloud using the selected graph operator, and based on the importance scores, a probability is generated for each point.
Finally, still referring to
Referring to
Further, another embodiment of the present disclosure can be for robust shape modeling, in which it may be more efficient and accurate to identify the object model parameters when using a proposed low-pass graph filtering based resampling strategy to select a small subset of points. Such modeling may involve finding the surfaces in a point cloud in which noise or outliers are present, and the present disclosure can be used to solve the problem(s).
To better understand the formulation of the task of resampling a 3D point cloud, we need to introduce graph signal processing, which lays a foundation for the methods and systems of the embodiments of the present disclosure.
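We consider a matrix representation of a point cloud with N points and K attributes,

$$X = \begin{bmatrix} s_1 & s_2 & \cdots & s_K \end{bmatrix} = \begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix} \in \mathbb{R}^{N \times K}, \qquad (1)$$

where $s_i \in \mathbb{R}^N$ represents the ith attribute and $x_i \in \mathbb{R}^K$ represents the ith point. Depending on the sensing device, attributes can be 3D coordinates, RGB colors, textures, and many others. To distinguish 3D coordinates from other attributes, we use $X_c \in \mathbb{R}^{N \times 3}$ to represent 3D coordinates and $X_o \in \mathbb{R}^{N \times (K-3)}$ to represent other attributes.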
The number of points N is usually huge. For example, a 3D scan of a building usually needs billions of 3D points. It is challenging to work with a large-scale point cloud from both storage and data analysis perspectives. In many applications, however, we are interested in a subset of 3D points with particular properties, such as key points in point cloud registration and contour points in contour detection. To reduce the storage and computation required, we consider sampling a subset of representative points from the original point cloud to reduce the scale. Since the original point cloud is sampled from an object, we call this task resampling. The procedure of resampling is to resample M (M<N) points from a point cloud, or select M rows from the point cloud matrix X. The resampled point cloud is
$$X_{\mathcal{M}} = \Psi X \in \mathbb{R}^{M \times K}, \qquad (2)$$
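where $\mathcal{M} = (\mathcal{M}_1, \ldots, \mathcal{M}_M)$ denotes the sequence of resampled indices, called the resampling set, with $\mathcal{M}_i \in \{1, \ldots, N\}$ and $|\mathcal{M}| = M$, and the resampling operator $\Psi$ is a linear mapping from $\mathbb{R}^N$ to $\mathbb{R}^M$, defined as

$$\Psi_{i,j} = \begin{cases} 1, & j = \mathcal{M}_i, \\ 0, & \text{otherwise}. \end{cases} \qquad (3)$$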
The efficiency of the proposed resampling strategy is critical. Since we work with a large-scale point cloud, we want to avoid expensive computation. To implement resampling in an efficient way, we consider a randomized resampling strategy. This means that the resampled indices are chosen according to a sampling distribution. Let $\{\pi_i\}_{i=1}^{N}$ be a series of sampling probabilities, where $\pi_i$ denotes the probability of selecting the ith sample in each random trial. Once the sampling distribution is chosen, it is efficient to generate samples. The goal here is to find a sampling distribution that preserves information in the original point cloud.
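As a minimal sketch of this randomized strategy, the following Python/NumPy snippet draws a resampling set from a given sampling distribution and applies the resampling operator by selecting rows of the point cloud matrix. The function names `draw_resampling_set` and `resampling_operator` are hypothetical, indices are 0-based rather than 1-based, and each trial is assumed to be independent (sampling with replacement).

```python
import numpy as np


def draw_resampling_set(pi, num_samples, rng):
    """Draw M indices, one per independent trial, from the distribution pi."""
    pi = np.asarray(pi, dtype=float)
    return rng.choice(pi.size, size=num_samples, replace=True, p=pi)


def resampling_operator(indices, num_points):
    """Explicit M x N selection matrix: entry (i, j) is 1 when j == indices[i]."""
    psi = np.zeros((len(indices), num_points))
    psi[np.arange(len(indices)), indices] = 1.0
    return psi


rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))            # toy point cloud: N = 8 points, K = 5
pi = np.full(8, 1.0 / 8)               # uniform sampling distribution
idx = draw_resampling_set(pi, 3, rng)  # resampling set with M = 3
X_resampled = X[idx]                   # same result as resampling_operator(idx, 8) @ X
```

In practice the explicit matrix is unnecessary: row indexing `X[idx]` gives the same resampled point cloud without building an M x N array.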
The invariance property of the proposed resampling strategy is also critical. When we shift or rotate a point cloud, the intrinsic distribution of 3D points does not change, and the output of the proposed resampling strategy should not change either.
Definition 1. A resampling strategy is shift-invariant when, if a sampling distribution $\pi$ is designed for a point cloud $X = [X_c \; X_o]$, then the same sampling distribution $\pi$ is designed for its shifted point cloud $[X_c + \mathbf{1}a^T \; X_o]$, where $a \in \mathbb{R}^3$.
Definition 2. A resampling strategy is rotation-invariant when, if a sampling distribution $\pi$ is designed for a point cloud $X = [X_c \; X_o]$, then the same sampling distribution $\pi$ is designed for its rotated point cloud $[X_c R \; X_o]$, where $R \in \mathbb{R}^{3 \times 3}$ is a 3D rotation matrix.
We should guarantee that the proposed resampling strategy is both shift and rotation invariant.
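To illustrate how a sampling distribution can satisfy both definitions, consider a score that measures how far each point lies from a weighted average of its neighbors. This is an illustrative assumption, not necessarily the distribution derived in the disclosure: the weights $T_{i,j}$ are assumed to depend only on pairwise Euclidean distances (which shifts and rotations leave unchanged) and to sum to one over $j$, and the sampling probability is assumed proportional to the squared norm of the residual $r_i$.

```latex
\begin{aligned}
r_i &= x_i^{(c)} - \sum_{j} T_{i,j}\, x_j^{(c)}, \qquad \sum_{j} T_{i,j} = 1, \qquad \pi_i \propto \|r_i\|_2^2, \\
\text{shift by } a:\quad
r_i' &= \bigl(x_i^{(c)} + a\bigr) - \sum_{j} T_{i,j}\bigl(x_j^{(c)} + a\bigr) = r_i, \\
\text{rotation by } R:\quad
r_i'' &= R^{T} x_i^{(c)} - \sum_{j} T_{i,j}\, R^{T} x_j^{(c)} = R^{T} r_i,
\qquad \|r_i''\|_2 = \|r_i\|_2 .
\end{aligned}
```

The shift cancels because the weights sum to one, and the rotation preserves the norm because $R$ is orthogonal, so the resulting $\pi$ is the same for the shifted and the rotated point cloud.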
A graph is a natural and efficient way to represent a point cloud because it is a discrete representation of the surface of an object. In computer graphics, polygon meshes, as a class of graphs with particular connectivity restrictions, are extensively used to represent the shape of an object. To construct a reliable mesh, we usually need sophisticated geometry analysis, such as calculating surface normals. The mesh representation is a simple tool for visualization, but may not be good at analyzing point clouds. Here we extend polygon meshes to general graphs by relaxing the connectivity restrictions. Such graphs are efficient to construct and are flexible to capture geometry information.
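We construct a general graph of a point cloud by encoding the local geometry information in an adjacency matrix $W \in \mathbb{R}^{N \times N}$. Let $x_i^{(c)} \in \mathbb{R}^3$ be the 3D coordinates of the ith point; that is, the ith row of $X_c$. The edge weight between two points $x_i^{(c)}$ and $x_j^{(c)}$ is

$$W_{i,j} = \begin{cases} \exp\!\left(-\dfrac{\|x_i^{(c)} - x_j^{(c)}\|_2^2}{\sigma^2}\right), & \|x_i^{(c)} - x_j^{(c)}\|_2 \le \tau, \\ 0, & \text{otherwise}, \end{cases} \qquad (4)$$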
where variance σ and threshold τ are hyperparameters. Equation (4) shows that when the Euclidean distance between two points is smaller than a threshold τ, we connect these two points by an edge, and the edge weight depends on the similarity of the two points in 3D space. We call this type of graph a τ-Graph in this disclosure. The weighted degree matrix D is a diagonal matrix with diagonal element $D_{i,i} = \sum_j W_{i,j}$ reflecting the density around the ith point. This graph is approximately a discrete representation of the original surface and can be efficiently constructed via a tree data structure, such as an octree. Here we only use the 3D coordinates to construct a graph, but it is also feasible to take other attributes into account (4). Given this graph, the attributes of point clouds are called graph signals. For example, an attribute s in (1) is a signal indexed by the graph. Unless explicitly stated otherwise, we assume a τ-Graph is in use.
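A τ-Graph of this kind can be sketched in Python with NumPy as follows. The sketch makes several assumptions that should be read as illustrative only: the function name `tau_graph` is hypothetical, the edge weight is taken to be a Gaussian kernel of the squared Euclidean distance, self-loops are removed, and dense pairwise distances are used instead of the tree data structure mentioned above, so it is suitable for small N only.

```python
import numpy as np


def tau_graph(coords, tau, sigma):
    """Return (W, degree) for a thresholded, Gaussian-weighted graph.

    coords : (N, 3) array of 3D coordinates.
    tau    : distance threshold; points farther apart are not connected.
    sigma  : kernel width controlling how fast weights decay with distance.
    """
    coords = np.asarray(coords, dtype=float)
    # Squared Euclidean distance between every pair of points (dense, O(N^2)).
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Assumed Gaussian edge weight, kept only for pairs within the threshold.
    W = np.where(dist2 <= tau ** 2, np.exp(-dist2 / sigma ** 2), 0.0)
    np.fill_diagonal(W, 0.0)  # assumed: a point is not its own neighbor
    # Diagonal of the weighted degree matrix D: density around each point.
    degree = W.sum(axis=1)
    return W, degree
```

Because the weights depend only on pairwise distances, shifting or rotating the coordinates leaves `W` and `degree` unchanged up to floating-point rounding.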
In another example of graph construction, each point is connected to a certain number of its nearest neighbors.
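A nearest-neighbor construction can be sketched in the same style. The function name `knn_adjacency`, the unweighted (0/1) edges, and the symmetrization step (an edge is kept if either endpoint selects the other) are assumptions made for this illustration; dense distances are again used for brevity.

```python
import numpy as np


def knn_adjacency(coords, k):
    """Symmetric 0/1 adjacency connecting each point to its k nearest neighbors."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(dist2, np.inf)  # exclude each point from its own neighbors
    # Indices of the k smallest distances in each row (order within k is arbitrary).
    neighbors = np.argpartition(dist2, k - 1, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), neighbors.ravel()] = 1.0
    # Assumed symmetrization: keep an edge if either endpoint selected the other.
    return np.maximum(A, A.T)
```

Unlike a distance threshold, this construction gives every point at least k neighbors, which can be convenient when the point density varies across the scene.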
A graph filter is a system that takes a graph signal as an input and produces another graph signal as an output. Let $A \in \mathbb{R}^{N \times N}$ be a graph shift operator, which is the most elementary nontrivial graph filter. Common choices of a graph shift operator are the adjacency matrix W (4), the transition matrix $T = D^{-1}W$, the graph Laplacian matrix $L = D - W$, and many other structure-related matrices. The graph shift replaces the signal value at a node with a weighted linear combination of values at its neighbors; that is,

$$y = A s \in \mathbb{R}^N,$$

where $s \in \mathbb{R}^N$ is an input graph signal (an attribute of a point cloud). Every linear, shift-invariant graph filter is a polynomial in the graph shift,

$$h(A) = \sum_{i=0}^{L-1} h_i A^i = h_0 I + h_1 A + \cdots + h_{L-1} A^{L-1}, \qquad (5)$$

where $h_i$ are filter coefficients and L is the length of this graph filter. Its output is given by the matrix-vector product

$$y = h(A) s \in \mathbb{R}^N.$$
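The output of a polynomial graph filter can be computed without forming any matrix power explicitly. The following sketch, with the hypothetical function name `apply_graph_filter`, evaluates the polynomial with Horner's rule, so a filter of length L costs L − 1 products with the graph shift; for a sparse shift operator, each product touches only the existing edges.

```python
import numpy as np


def apply_graph_filter(A, s, h):
    """Compute y = (h[0] I + h[1] A + ... + h[L-1] A^(L-1)) s by Horner's rule.

    A : (N, N) graph shift operator.
    s : (N,) graph signal, or (N, K) matrix with one signal per column.
    h : sequence of L filter coefficients, lowest order first.
    """
    s = np.asarray(s, dtype=float)
    y = h[-1] * s
    # Work from the highest-order coefficient down to h[0].
    for coeff in reversed(h[:-1]):
        y = A @ y + coeff * s
    return y
```

For example, `h = [1.0]` returns the signal unchanged (an all-pass filter), while `h = [1.0, -1.0]` with a transition matrix as the shift returns each node's value minus the weighted average of its neighbors.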
The eigendecomposition of a graph shift operator A is

$$A = V \Lambda V^T, \qquad (6)$$

where the eigenvectors of A form the columns of matrix V, and the eigenvalue matrix $\Lambda \in \mathbb{R}^{N \times N}$ is the diagonal matrix of corresponding eigenvalues $\lambda_1, \ldots, \lambda_N$ of A ($\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$). These eigenvalues represent frequencies on the graph, where $\lambda_1$ is the lowest frequency and $\lambda_N$ is the highest frequency. Correspondingly, $v_1$ captures the smallest variation on the graph and $v_N$ captures the highest variation on the graph. V is also called the graph Fourier basis. The graph Fourier transform of a graph signal $s \in \mathbb{R}^N$ is

$$\hat{s} = V^T s. \qquad (7)$$

The inverse graph Fourier transform is

$$s = V \hat{s} = \sum_{k=1}^{N} \hat{s}_k v_k,$$

where $v_k$ is the kth column of V and $\hat{s}_k$ is the kth component of $\hat{s}$. The vector $\hat{s}$ in (7) represents the signal's expansion in the eigenvector basis and describes the frequency components of the graph signal s. The inverse graph Fourier transform reconstructs the graph signal by combining graph frequency components.
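For a symmetric graph shift operator, the graph Fourier basis and both transforms can be sketched with NumPy as below. The function names are hypothetical, symmetry of A is assumed so that `numpy.linalg.eigh` applies and V is orthogonal, and the full eigendecomposition is only practical for small graphs.

```python
import numpy as np


def graph_fourier_basis(A):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric A."""
    eigvals, eigvecs = np.linalg.eigh(A)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder so lambda_1 >= ... >= lambda_N
    return eigvals[order], eigvecs[:, order]


def gft(V, s):
    """Graph Fourier transform: expansion coefficients of s in the basis V."""
    return V.T @ s


def igft(V, s_hat):
    """Inverse graph Fourier transform: recombine the frequency components."""
    return V @ s_hat
```

With this ordering, the first coefficients of `gft(V, s)` correspond to the smoothest components of the signal on the graph, and `igft(V, gft(V, s))` recovers `s` up to rounding.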
During resampling, we reduce the number of points and unavoidably lose information in a point cloud. Our goal here is to design an application-dependent resampling strategy, preserving selected information depending on particular needs. For example, in the task of contour detection in a point cloud, we usually need careful and intensive computation, such as calculating surface normals and classifying points. Instead of working with a large number of points, we consider efficiently resampling a small subset of points that is sensitive to the required contour information, making the subsequent computation much cheaper without losing contour information.
Let $f(\cdot)$ be a feature-extraction operator that extracts targeted information from a point cloud according to particular needs; that is, the features $f(X) \in \mathbb{R}^{N \times K}$ are extracted from a point cloud $X \in \mathbb{R}^{N \times K}$. We resample a point cloud $M$ times. At the $j$th draw, we independently choose a point $\mathcal{M}_j = i$ with probability $\pi_i$. Let $\psi \in \mathbb{R}^{M \times N}$ be the resampling operator (3) and $S \in \mathbb{R}^{N \times N}$ be a diagonal rescaling matrix with $S_{i,i} = 1/\sqrt{M \pi_i}$. We quantify the performance of a resampling operator as follows:
$$D_{f(X)}(\psi) = \| S \psi^T \psi f(X) - f(X) \|_2^2, \qquad (8)$$
where $\|\cdot\|_2$ is the spectral norm. $\psi^T \psi \in \mathbb{R}^{N \times N}$ is a zero-padding operator, which is a diagonal matrix with diagonal elements $(\psi^T \psi)_{i,i} = 1$ when the $i$th point is sampled, and 0 otherwise. The zero-padding operator $\psi^T \psi$ ensures the resampled points and the original point cloud have the same size. $S$ is used to compensate for non-uniform weights during resampling. $S \psi^T \psi f(X)$ represents the preserved features after resampling in a zero-padding form. From another aspect, $S \psi^T$ is the most naive interpolation operator that reconstructs the original feature $f(X)$ from its resampled version $\psi f(X)$. The evaluation metric $D_{f(X)}(\psi)$ measures the reconstruction error; that is, how much feature information is lost after resampling without using a sophisticated interpolation operator. When $D_{f(X)}(\psi)$ is small, the preserved features after resampling are close to the original features, meaning that little information is lost. The expectation $\mathbb{E}_{\psi \sim \pi}(D_{f(X)}(\psi))$ provides the expected error caused by resampling and quantifies the performance of a sampling distribution $\pi$. Our goal is to minimize $\mathbb{E}_{\psi \sim \pi}(D_{f(X)}(\psi))$ over $\pi$ to obtain an optimal sampling distribution in terms of preserving features $f(X)$. We now derive the exact form of the mean square error of the objective function.
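Lemma 1 The nonweighted version of the preserved feature is a biased estimator of the original feature,
$$\mathbb{E}_{\psi \sim \pi}\left(\psi^T \psi f(X)\right) \propto \pi \odot f(X), \quad \text{for all } f(X) \in \mathbb{R}^{N \times K},$$
where $\odot$ is the row-wise multiplication.
The reweighted version of the preserved feature is an unbiased estimator of the original feature; that is,
$$\mathbb{E}_{\psi \sim \pi}\left(S \psi^T \psi f(X)\right) = f(X), \quad \text{for all } f(X) \in \mathbb{R}^{N \times K}.$$
Theorem 1 The exact form of the mean square error between the preserved feature and the original feature is
$$\mathbb{E}_{\psi \sim \pi}\left(D_{f(X)}(\psi)\right) = \mathrm{Tr}\left(f(X) Q f(X)^T\right), \qquad (9)$$
where $Q \in \mathbb{R}^{N \times N}$ is a diagonal matrix with $Q_{i,i} = 1/\pi_i - 1$.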
A sufficient condition for a proposed resampling strategy to be shift and rotation-invariant is that the evaluation metric (8) is shift and rotation-invariant.
Definition 3 A feature-extraction operator $f(\cdot)$ is shift-invariant when the features extracted from a point cloud and its shifted version are the same; that is, $f([X_c \;\; X_o]) = f([X_c + \mathbf{1} a^T \;\; X_o])$, where the shift $a \in \mathbb{R}^3$.
Definition 4 A feature-extraction operator $f(\cdot)$ is rotation-invariant when the features extracted from a point cloud and its rotated version are the same; that is, $f([X_c \;\; X_o]) = f([X_c R \;\; X_o])$, where $R \in \mathbb{R}^{3 \times 3}$ is a 3D rotation matrix.
When $f(\cdot)$ is shift/rotation-invariant, (8) does not change through shifting or rotating, leading to a shift/rotation-invariant resampling strategy, and it is sufficient to minimize $\mathbb{E}_{\psi \sim \pi}(D_{f(X)}(\psi))$ to obtain a resampling strategy; however, when $f(\cdot)$ is shift/rotation-variant, (8) may change through shifting or rotating, leading to a shift/rotation-variant resampling strategy.
To handle shift variance, we can always recenter a point cloud to the origin before any processing; that is, we normalize the mean coordinates of the 3D points to zero. To handle rotation variance of $f(\cdot)$, we consider the following evaluation metric:
$$D_f(\psi) = \max_{X_c' : \|X_c'\|_2 = c} D_{f([X_c' \;\; X_o])}(\psi),$$
where $\|\cdot\|_2$ is the spectral norm and the constant $c = \|X_c\|_2$ is the spectral norm of the original 3D coordinates. The evaluation metric $D_f(\psi)$ considers the worst possible reconstruction error caused by rotation, in order to remove the influence of rotation. In (??), we treat the 3D coordinates as variables due to rotation. We constrain the spectral norm of the 3D coordinates because a rotation matrix is orthonormal and the spectral norm of the 3D coordinates does not change during rotation. We then minimize $\mathbb{E}_{\psi \sim \pi}(D_f(\psi))$ to obtain an invariant resampling strategy even when $f(\cdot)$ is variant.
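Theorem 2 Let $f(\cdot)$ be a rotation-variant linear feature-extraction operator, where $f(X) = FX$ with $F \in \mathbb{R}^{N \times N}$. The exact form of $\mathbb{E}_{\psi \sim \pi}(D_f(\psi))$ is
$$\mathbb{E}_{\psi \sim \pi}\left(D_f(\psi)\right) = c^2 \, \mathrm{Tr}\left(F Q F^T\right) + \mathrm{Tr}\left(F X_o Q (F X_o)^T\right), \qquad (10)$$
where $c = \|X_c\|_2$ and $Q \in \mathbb{R}^{N \times N}$ is a diagonal matrix with $Q_{i,i} = 1/\pi_i - 1$.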
We now derive the optimal sampling distributions by minimizing the expected reconstruction error.
For a shift and rotation-invariant feature-extraction operator, we minimize (8).
Theorem 3 Let $f(\cdot)$ be a shift and rotation-invariant feature-extraction operator. The corresponding optimal resampling strategy $\pi^*$ is
$$\pi_i^* \propto \|f_i(X)\|_2, \qquad (11)$$
where $f_i(X) \in \mathbb{R}^K$ is the $i$th row of $f(X)$.
For a shift and rotation-variant feature-extraction operator, we minimize (10).
Theorem 4 Let $f(\cdot)$ be a shift and rotation-variant linear feature-extraction operator, where $f(X) = FX$ with $F \in \mathbb{R}^{N \times N}$. The corresponding optimal resampling strategy $\pi^*$ is
$$\pi_i^* \propto \sqrt{c^2 \|F_i\|_2^2 + \|(F X_o)_i\|_2^2}, \qquad (12)$$
where the constant $c = \|X_c\|_2$, $F_i$ is the $i$th row of $F$ and $(F X_o)_i$ is the $i$th row of $F X_o$.
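The sampling distributions of Theorems 3 and 4 can be computed directly from row norms. The following is a minimal sketch in Python with NumPy; the function names `invariant_distribution`, `variant_linear_distribution` and `resample` are hypothetical, and the dense matrices are assumed to fit in memory.

```python
import numpy as np


def invariant_distribution(features):
    """Theorem 3: pi_i proportional to the l2 norm of the i-th feature row."""
    scores = np.linalg.norm(features, axis=1)
    return scores / scores.sum()


def variant_linear_distribution(F, X_c, X_o):
    """Theorem 4: pi_i proportional to sqrt(c^2 ||F_i||^2 + ||(F X_o)_i||^2)."""
    c = np.linalg.norm(X_c, 2)                      # spectral norm of the 3D coordinates
    row_F = np.linalg.norm(F, axis=1) ** 2          # squared norm of each row of F
    row_FXo = np.linalg.norm(F @ X_o, axis=1) ** 2  # squared norm of each row of F X_o
    scores = np.sqrt(c ** 2 * row_F + row_FXo)
    return scores / scores.sum()


def resample(pi, M, rng):
    """Draw M point indices independently (with replacement) according to pi."""
    return rng.choice(len(pi), size=M, replace=True, p=pi)
```

For example, `resample(invariant_distribution(f_X), 100, np.random.default_rng(0))` would return 100 indices of points to keep, where `f_X` stands for any feature matrix with one row per point.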
In this section, we design graph filters to extract features from a point cloud. Let the features extracted from a point cloud $X$ be
$$f(X) = h(A) X = \sum_{l=0}^{L-1} h_l A^l X,$$
which follows from the definition of graph filters (5). Similarly to filter design in classical signal processing, we design a graph filter either in the graph vertex domain or in the graph spectral domain.
In the graph vertex domain, for each point, a graph filter averages the attributes of its local points. For example, the output of the $i$th point, $f_i(X) = \sum_{l=0}^{L-1} h_l (A^l X)_i$, is a weighted average of the attributes of points that are within $L$ hops of the $i$th point. The $l$th graph filter coefficient, $h_l$, quantifies the contribution from the $l$th-hop neighbors.
We design the filter coefficients to change the weights in local averaging.
In the graph spectral domain, we first design a graph spectrum distribution and then use graph filter coefficients to fit this distribution. For example, a graph filter with length $L$ is
$$h(A) = V h(\Lambda) V^T = V \, \mathrm{diag}\left( \sum_{l=0}^{L-1} h_l \lambda_1^l, \ldots, \sum_{l=0}^{L-1} h_l \lambda_N^l \right) V^T,$$
where $V$ is the graph Fourier basis and $\lambda_i$ are graph frequencies (6). When we want the response of the $i$th graph frequency to be $c_i$, we set
$$h(\lambda_i) = \sum_{l=0}^{L-1} h_l \lambda_i^l = c_i$$
and solve a set of linear equations to obtain the graph filter coefficients $h_l$. It is also possible to use the Chebyshev polynomial to design graph filter coefficients [?]. We now consider some special cases of graph filters.
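Fitting filter coefficients to a desired spectral response amounts to solving a linear system whose matrix contains powers of the graph frequencies. The following minimal sketch uses a least-squares solve, which is an assumption made here to cover the case where there are more graph frequencies than coefficients; the function name and the example frequencies are hypothetical.

```python
import numpy as np


def fit_filter_coefficients(lam, c, L):
    """Least-squares fit of h_0..h_{L-1} so that sum_l h_l * lam_i**l is close to c_i."""
    # Columns are lam**0, lam**1, ..., lam**(L-1).
    vandermonde = np.vander(lam, N=L, increasing=True)
    h, *_ = np.linalg.lstsq(vandermonde, c, rcond=None)
    return h


lam = np.array([1.0, 0.5, -0.25, -1.0])  # hypothetical graph frequencies
c = 1.0 - lam                            # desired response at each frequency
h = fit_filter_coefficients(lam, c, L=2)
```

With this desired response, the fitted coefficients are approximately `h = [1, -1]`, which corresponds to the filter $I - A$.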
Let $h(\lambda_i) = 1$; that is, $h(A)$ is an identity matrix with $h_0 = 1$ and $h_i = 0$ for $i = 1, \ldots, L-1$. The intuition behind this setting is that the original point cloud is trustworthy and all points are uniformly sampled from an object without noise, reflecting the true geometric structure of the object. We want to preserve all the information, and the features are thus the original attributes themselves. Since $f(X) = X$, the feature-extraction operator $f(\cdot)$ is shift and rotation-variant. Based on Theorem 4, the optimal resampling strategy is
$$\pi_i^* \propto \sqrt{c^2 + \|(X_o)_i\|_2^2}. \qquad (13)$$
Here the feature-extraction matrix $F$ in (12) is an identity matrix and the norm of each row of $F$ is 1. When we only preserve 3D coordinates, we ignore the term of $X_o$ and obtain a constant sampling probability for each point, meaning that uniform sampling is the optimal resampling strategy to preserve the overall geometry information.
In image processing, a high-pass filter is used to extract edges and contours. Similarly, we use a high-pass graph filter to extract contours in a point cloud. Here we only consider the 3D coordinates as attributes ($X = X_c \in \mathbb{R}^{N \times 3}$), but the proposed method can be easily extended to other attributes.
A critical question is how to define contours in a point cloud. We consider that contour points break the trend formed by their neighboring points and bring innovation. Many previous works need sophisticated geometry-related computation, such as surface normals, to detect contours [?]. Instead of measuring sophisticated geometric properties, we describe the possibility of being a contour point by the local variation on graphs, which is the response of high-pass graph filtering. The corresponding local variation of the $i$th point is
$$f_i(X) = \|(h(A) X)_i\|_2^2, \qquad (14)$$
where $h(A)$ is a high-pass graph filter. The local variation $f(X) \in \mathbb{R}^N$ quantifies the energy of the response after high-pass graph filtering. The intuition behind this is that when the local variation of a point is high, its 3D coordinates cannot be well approximated from the 3D coordinates of its neighboring points; in other words, this point brings innovation by breaking the trend formed by its neighboring points and has a high possibility of being a contour point.
The following theorem shows that the local variation is rotation invariant, but shift variant.
Theorem 5 Let $f(X) = \mathrm{diag}(h(A) X X^T h(A)^T) \in \mathbb{R}^N$, where $\mathrm{diag}(\cdot)$ extracts the diagonal elements. $f(X)$ is rotation invariant, and shift variant unless $h(A)\mathbf{1} = 0 \in \mathbb{R}^N$.
To guarantee that the local variation is both shift and rotation invariant, we use a transition matrix as a graph shift operator; that is, $A = D^{-1} W$, where $D$ is the diagonal degree matrix. The reason is that $\mathbf{1} \in \mathbb{R}^N$ is an eigenvector of a transition matrix, $A\mathbf{1} = D^{-1} W \mathbf{1} = \mathbf{1}$. Thus,
$$h(A)\mathbf{1} = \sum_{l=0}^{N-1} h_l A^l \mathbf{1} = \sum_{l=0}^{N-1} h_l \mathbf{1} = 0$$
when $\sum_{l=0}^{N-1} h_l = 0$. A simple design is a Haar-like high-pass graph filter
$$h_{HH}(A) = I - A. \qquad (15)$$
Note that $\lambda_{max} = \max_i |\lambda_i| = 1$, where $\lambda_i$ are eigenvalues of $A$, because the graph shift operator is a transition matrix. In this case, $h_0 = 1$, $h_1 = -1$ and $h_i = 0$ for all $i > 1$, so $\sum_{l=0}^{N-1} h_l = 0$. Thus, a Haar-like high-pass graph filter is both shift and rotation invariant. The graph frequency response of a Haar-like high-pass graph filter is $h_{HH}(\lambda_i) = 1 - \lambda_i$. Since the eigenvalues are ordered descendingly, we have $1 - \lambda_i \le 1 - \lambda_{i+1}$, meaning that the low frequency response is relatively attenuated and the high frequency response is relatively amplified.
In the graph vertex domain, the response of the $i$th point is
$$(h_{HH}(A) X)_i = x_i - \sum_{j \in \mathcal{N}_i} A_{i,j} x_j.$$
Because $A$ is a transition matrix, $\sum_{j \in \mathcal{N}_i} A_{i,j} = 1$, and $h_{HH}(A)$ measures the difference between a point and the convex combination of its neighbors. The geometric interpretation of the proposed local variation is the Euclidean distance between the original point and the convex combination of its neighbors, reflecting how much information we know about a point from its neighbors. When the local variation of a point is large, the Euclidean distance between this point and the convex combination of its neighbors is long and this point provides a large amount of innovation.
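The Haar-like high-pass local variation can be sketched in a few lines of Python with NumPy. The graph construction below (Gaussian edge weights between points closer than a radius `tau`, with a bandwidth `sigma`) is an assumption made for this example, as are the function names and the L-shaped test curve; the dense distance matrix is only suitable for small point sets.

```python
import numpy as np


def transition_matrix(X, tau, sigma):
    """Assumed graph: Gaussian weights between points closer than tau, rows normalized."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # squared pairwise distances
    W = np.exp(-d2 / sigma ** 2) * (d2 <= tau ** 2)
    np.fill_diagonal(W, 0.0)                                 # no self-loops
    return W / W.sum(axis=1, keepdims=True)                  # A = D^{-1} W


def haar_highpass_variation(X, A):
    """f_i(X) = || x_i - sum_j A_ij x_j ||_2^2, the response energy of I - A."""
    residual = X - A @ X
    return (residual ** 2).sum(axis=1)


# Hypothetical L-shaped polyline with unit spacing; the corner is at index 5.
leg1 = [(float(t), 0.0, 0.0) for t in range(5, 0, -1)]
leg2 = [(0.0, float(t), 0.0) for t in range(0, 6)]
X = np.array(leg1 + leg2)

A = transition_matrix(X, tau=1.1, sigma=1.0)
f = haar_highpass_variation(X, A)
pi = f / f.sum()  # sampling distribution proportional to the local variation
```

In this example the points in the middle of each straight leg receive zero local variation, while the corner and the two open ends of the curve receive non-zero scores, so only those points would be resampled.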
We can verify the proposed local variation on some simple examples.
When the points are uniformly spread along the defined shape, the proposed local variation (14) satisfies Examples 1, 2 and 3 from the geometric perspective. For example, in
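The feature-extraction operator $f(X) = \|h_{HH}(A) X\|_2^2$ is shift and rotation-invariant. Based on Theorem 3, the optimal sampling distribution is
$$\pi_i^* \propto \|(h_{HH}(A) X)_i\|_2^2 = \Big\| x_i - \sum_{j \in \mathcal{N}_i} A_{i,j} x_j \Big\|_2^2,$$
where $A = D^{-1} W$ is a transition matrix.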
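Note that the graph Laplacian matrix is commonly used to measure variations. Let $L = D - W \in \mathbb{R}^{N \times N}$ be a graph Laplacian matrix. The graph Laplacian based total variation is
$$\mathrm{Tr}(X^T L X) = \frac{1}{2} \sum_{i} \sum_{j \in \mathcal{N}_i} W_{i,j} \|x_i - x_j\|_2^2,$$
where $\mathcal{N}_i$ is the set of neighbors of the $i$th node, and the variation contributed by the $i$th point is
$$f_i(X) = \sum_{j \in \mathcal{N}_i} W_{i,j} \|x_i - x_j\|_2^2. \qquad (18)$$
The variation here is defined based on the accumulation of pairwise differences. We call (18) the pairwise difference based local variation.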
The pairwise difference based local variation cannot capture geometry change and violates Example 2. We show a counter example in
the point cloud, including hinge, cone, table, chair, sofa and trash container. The first row shows (505) the original point clouds; the second (510) and third (515) rows show the resampled versions with respect to two local variations: pairwise difference based local variation (18) and Haar-like high-pass graph filtering based local variation (14). Two resampled versions have the same number of points, which is 10% of points in the original point cloud.
For two simulated objects, the hinge and the cone (first two rows), the pairwise difference based local variation (18) fails to detect the contours, whereas the Haar-like high-pass graph filtering based local variation (14) detects all the contours. For the real objects, the Haar-like high-pass graph filtering based resampling (14) also outperforms the pairwise difference based local variation (18). In summary, the Haar-like high-pass graph filtering based local variation (14) shows the contours of objects by using only 10% of the points.
The high-pass graph filtering based resampling strategy can be easily extended to detect
transient changes in other attributes. In
In classical signal processing, a low-pass filter is used to capture the main shape of a smooth signal and reduce noise. Similarly, we use a low-pass graph filter to capture the main shape of a point cloud and reduce the sampling noise introduced while acquiring the 3D points. Since we use the 3D coordinates of points to construct a graph (4), the 3D coordinates are naturally smooth on this graph, meaning that two adjacent points in the graph have similar coordinates in the 3D space. When noise and outliers occur, a low-pass graph filter, as a denoising operator, uses local neighboring information to approximate a true position for each point. Since the output after low-pass graph filtering is a denoised version of the original point cloud, it may be more appealing to resample from the denoised points than from the original points.
A straightforward choice is an ideal low-pass graph filter, which completely eliminates all graph frequencies above a bandwidth while passing those below unchanged. An ideal low-pass graph filter with bandwidth $b$ is
$$h_{IL}(A) = V_{(b)} V_{(b)}^T,$$
where $V_{(b)}$ is the first $b$ columns of $V$, and the graph frequency response is
$$h_{IL}(\lambda_i) = \begin{cases} 1, & i \le b, \\ 0, & \text{otherwise}. \end{cases}$$
The ideal low-pass graph filter $h_{IL}$ projects an input graph signal into a bandlimited subspace, and $h_{IL}(A)s$ is a bandlimited approximation to the original graph signal $s$. We show an example in
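The feature-extraction operator $f(X) = V_{(b)} V_{(b)}^T X$ is shift and rotation-variant. Based on Theorem 4, the corresponding optimal resampling strategy is
$$\pi_i^* \propto \sqrt{c^2 \|v_i\|_2^2 + \|v_i^T V_{(b)}^T X_o\|_2^2}, \qquad (20)$$
where $v_i \in \mathbb{R}^b$ is the $i$th row of $V_{(b)}$.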
A direct way to obtain the leverage scores $\|v_i\|_2$ requires the truncated eigendecomposition (6), whose computational cost is $O(Nb^2)$, where $b$ is the bandwidth. It is potentially possible to approximate the leverage scores through a fast algorithm [?, ?], where we use randomized techniques to avoid the eigendecomposition and the computational cost is $O(Nb \log(N))$. Another way to reduce the computation is to partition a graph into several subgraphs and obtain leverage scores in each subgraph.
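The direct approach can be sketched as follows, assuming a symmetric graph shift operator (for example, an adjacency matrix) so that `numpy.linalg.eigh` applies. This sketch computes a full eigendecomposition for simplicity rather than a truncated or randomized one; the function name and the small path graph are hypothetical.

```python
import numpy as np


def ideal_lowpass_distribution(A, X_c, X_o, b):
    """Sampling scores for the ideal low-pass filter with bandwidth b (symmetric A assumed)."""
    lam, V = np.linalg.eigh(A)
    order = np.argsort(lam)[::-1]           # descending eigenvalues: low graph frequencies first
    V_b = V[:, order[:b]]                   # first b columns of the graph Fourier basis
    leverage = (V_b ** 2).sum(axis=1)       # ||v_i||_2^2 for each row v_i of V_b
    c = np.linalg.norm(X_c, 2)              # spectral norm of the 3D coordinates
    projected = V_b @ (V_b.T @ X_o)         # rows are (V_b V_b^T X_o)_i
    scores = np.sqrt(c ** 2 * leverage + (projected ** 2).sum(axis=1))
    return scores / scores.sum(), leverage


# Hypothetical 4-node path graph with coordinates only (the other attributes are zero).
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
X_c = np.ones((4, 3))
X_o = np.zeros((4, 1))
pi_b2, lev_b2 = ideal_lowpass_distribution(W, X_c, X_o, b=2)
pi_b4, lev_b4 = ideal_lowpass_distribution(W, X_c, X_o, b=4)
```

When the bandwidth equals the number of points (`b=4` here), every leverage score equals one and the distribution becomes uniform.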
Note that this resampling strategy is similar to sampling and recovery of approximately
bandlimited graph signals, whose idea is to sample the signal coefficients at a few nodes and approximately recover the signal coefficients at all the other nodes. Here we model the attributes of the point cloud as graph signals, sample the attributes of a few points and approximately recover the attributes of all the other points.
We see that the resampling strategy based on ideal low-pass graph filtering tends to put more samples on points whose neighboring points vary rapidly in the 3D space, because small-variation areas introduce a lot of redundant information and we do not need to take many samples there. The graph takes care of the spatial distribution of a point cloud and analyzes the amount of information of each point via the graph Fourier basis. We also see that, with an increasing number of graph frequencies, the sampling scores tend to be uniform. This means that when we want to preserve the overall information, the importance score becomes equal everywhere.
Another simple choice is a Haar-like low-pass graph filter; that is,
$$h_{HL}(A) = I + \frac{1}{|\lambda_{max}|} A,$$
where $\lambda_{max} = \max_i |\lambda_i|$ with $\lambda_i$ being the eigenvalues of $A$. The normalization factor $\lambda_{max}$ avoids amplification of the magnitude. We denote $A_{norm} = A/|\lambda_{max}|$ for simplicity.
The graph frequency response is $h_{HL}(\lambda_i) = 1 + \lambda_i/|\lambda_{max}|$. Since the eigenvalues are ordered descendingly, we have $1 + \lambda_i \ge 1 + \lambda_{i+1}$, meaning that the low frequency response is relatively amplified and the high frequency response is relatively attenuated.
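In the graph vertex domain, the response of the $i$th point is
$$(h_{HL}(A) X)_i = x_i + \frac{1}{|\lambda_{max}|} \sum_{j \in \mathcal{N}_i} A_{i,j} x_j,$$
where $\mathcal{N}_i$ is the set of neighbors of the $i$th point. We see that $h_{HL}(A)$ averages the attributes of each point and its neighbors to provide a smooth output.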
The feature-extraction operator $f(X) = h_{HL}(A) X$ is shift and rotation-variant. Based on Theorem 4, the corresponding optimal resampling strategy is
$$\pi_i^* \propto \sqrt{c^2 \|(I + A_{norm})_i\|_2^2 + \|((I + A_{norm}) X_o)_i\|_2^2}.$$
To obtain this optimal sampling distribution, we need to compute the largest-magnitude eigenvalue $\lambda_{max}$, which takes $O(N)$, and compute $\|(I + A_{norm})_i\|_2^2$ and $\|((I + A_{norm}) X_o)_i\|_2^2$ for each row, which takes $O(\|\mathrm{vec}(A)\|_0)$, with $\|\mathrm{vec}(A)\|_0$ the number of nonzero elements in the graph shift operator. We can avoid computing the largest-magnitude eigenvalue by using a normalized adjacency matrix or a transition matrix as a graph shift operator. A normalized adjacency matrix is $D^{-1/2} W D^{-1/2}$, where $D$ is the diagonal degree matrix, and a transition matrix is obtained by normalizing the sum of each row of an adjacency matrix to be one; that is, $D^{-1} W$. In both cases, the largest eigenvalue is one, and we thus have $A = A_{norm}$.
low-pass graph filtering (??) on bunny. The point cloud of bunny includes 35,947 points. Each coordinate is contaminated by a Gaussian noise with mean zero and variance 0.002. In
To quantitatively evaluate the performance of resampling, we measure the Euclidean distance between each resampled point and the nearest point in the original noiseless point cloud; that is,
$$\mathrm{Error} = \sum_{i=1}^{M} \min_{j = 1, \ldots, N} \| (X_M)_i - X_j \|_2, \qquad (22)$$
where $X \in \mathbb{R}^{N \times k}$ is the noiseless point cloud, $X_M \in \mathbb{R}^{M \times k}$ is a resampled point cloud, and $(X_M)_i$ and $X_j$ denote their $i$th and $j$th rows. Since we resample from a noisy point cloud, the resampled points are shifted from the original points. The error metric in (22) quantifies the total shift. A smaller shift means better representation of
the original point cloud. For example, the error of the resampled point cloud in sub-
6.1824 and the error of the resampled point cloud in sub-
the advantage of using low-pass graph filtering during resampling.
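The error measure described above can be sketched with a brute-force nearest-neighbor search. This sketch assumes that the total shift is the sum, over the resampled points, of the Euclidean distance to the nearest noiseless point; the function name is hypothetical, and a spatial index would be preferable for large point clouds.

```python
import math


def resampling_error(X, X_M):
    """Sum over resampled points of the distance to the nearest noiseless point."""
    total = 0.0
    for p in X_M:
        total += min(math.dist(p, q) for q in X)  # nearest noiseless point to p
    return total


# Hypothetical noiseless cloud and a slightly shifted resampled cloud.
X = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
X_M = [(0.1, 0.0, 0.0), (1.0, 0.2, 0.0)]
error = resampling_error(X, X_M)
```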
Above we proposed some basic graph filtering based resampling tools, including all-pass, low-pass and high-pass graph filters. In this section, we propose some variant designs that show how to adapt them to satisfy special requirements.
For the pure high-pass graph filter proposed in Section 3.2, the importance score assigned to a point purely favors the local variance in the point cloud. For a point cloud that dynamically allocates point density, e.g. according to the level of interest, we would like to take the point density into consideration jointly with the local variance.
In one embodiment, considering that the degree matrix $D$ can be a representation of the density distribution, we propose to modify the Haar-like high-pass filter in Eqn. 15 in the following way,
$$h_{HH}(A) = D(I - A) = D - W = L, \qquad (23)$$
where the resulting high-pass graph operator is actually a graph Laplacian operator $L$. Note that the degree matrix $D$ above could be replaced by another density representation, in which case the resulting operator is not necessarily a graph Laplacian.
With point density being considered, the sampling probability transition between edge areas and flat areas becomes smoother. This approach may be more favorable if the input point cloud was pre-processed to emphasize areas of interest.
For a high-pass graph filter as proposed in Section 3.2 or Section 3.4.1, the importance score assigned to some points may be far less than those of other points. Typically, points from the smoothest areas can have a nearly zero sampling probability compared to points near edges or corners. As a result, points from flat areas have little chance of being selected during the resampling.
For some conventional surface modeling methods that build a surface from a point cloud, over-emphasizing the edge/corner areas while leaving the surfaces inside almost empty may pose challenges. To overcome this, there is a motivation for a trade-off between maintaining the overall geometry information and the contour information in the point cloud. To ensure that every point in the point cloud retains some chance of being selected, we propose to enforce a minimum sampling probability.
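One possible way to enforce such a floor is sketched below. The specific rule used here (clamp each probability from below, then renormalize) and the name `floor_distribution` are assumptions made for illustration, since the text does not fix a particular flooring rule.

```python
def floor_distribution(pi, p_min):
    """Clamp each sampling probability from below by p_min, then renormalize.

    After renormalization the smallest probability can be slightly below p_min,
    but no point is left with a zero chance of being selected.
    """
    floored = [max(p, p_min) for p in pi]
    total = sum(floored)
    return [p / total for p in floored]


# Hypothetical distribution in which the first point lies in a flat area.
pi_floored = floor_distribution([0.0, 0.5, 0.5], p_min=0.1)
```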
with flooring on one face of a cubic object. Sub-
Suppose we wish to apply a low-pass filter to the graph spectrum of the point cloud in order to remove high frequency components. The ideal low-pass graph filter proposed in Section 3.3 needs to compute at least some eigenvalue and eigenvector pairs, which may cause undesired computational complexity. Here we propose a k-polynomial or k-conjugate gradient filter [?] to be utilized in the proposed framework of this invention. In the following, we assume the basic graph operator is the symmetric normalized graph Laplacian operator $L$, without loss of generality with respect to other graph operators.
k-POLY
If we restrict ourselves to polynomial filtering, e.g., due to computational considerations, we can set up a problem of optimal polynomial low-pass filters. An optimal polynomial low-pass filter is based on Chebyshev polynomials.
Specifically, we propose to use a degree $k$ Chebyshev polynomial $h_{k\text{-CHEB}}$ defined over the interval $[0,2]$ with a stop band extending from $l \in (0,2)$ to 2. Since we define the graph spectrum in terms of the eigenspace of the symmetric normalized Laplacian $L$, all the eigenvalues of $L$ lie in the interval $[0,2]$. The Chebyshev polynomial is constructed by computing the roots of the degree $k$ Chebyshev polynomial, $\hat{r}(i) = \cos(\pi(2i-1)/2k)$ for $i = 1, \ldots, k$, over the interval $[-1,1]$, then shifting the roots to the interval $[l,2]$ via a linear transformation to obtain the roots $r_i$ of the polynomial $h_{k\text{-CHEB}}$, and scaling the polynomial using $r_0$ such that $h_{k\text{-CHEB}}(0) = 1$. This results in the formula
$$h_{k\text{-CHEB}}(L) = r_0 \prod_{i=1}^{k} (r_i I - L), \qquad r_0 = \prod_{i=1}^{k} r_i^{-1}.$$
Chebyshev polynomials are minimax optimal, uniformly suppressing all spectral components on the interval $[l,2]$ and growing faster than any other polynomial outside of $[l,2]$. The stop band frequency $l$ remains a design parameter that needs to be set prior to filtering.
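The construction described above (Chebyshev roots on $[-1,1]$, shifted to the stop band $[l,2]$, then scaled so that the response at zero is one) can be sketched with the standard library alone. The function names are hypothetical, and the scalar response is evaluated in the product form $\prod_i (1 - \lambda/r_i)$, which is one way to satisfy the normalization at zero.

```python
import math


def cheb_lowpass_roots(k, l):
    """Roots of the degree-k Chebyshev polynomial, mapped from [-1, 1] to [l, 2]."""
    roots = []
    for i in range(1, k + 1):
        r_hat = math.cos(math.pi * (2 * i - 1) / (2 * k))  # root in [-1, 1]
        roots.append(l + (r_hat + 1.0) * (2.0 - l) / 2.0)  # linear map to [l, 2]
    return roots


def cheb_lowpass_response(lam, roots):
    """Frequency response prod_i (1 - lam / r_i), which equals 1 at lam = 0."""
    h = 1.0
    for r in roots:
        h *= 1.0 - lam / r
    return h


roots = cheb_lowpass_roots(k=3, l=0.5)
passband_value = cheb_lowpass_response(0.0, roots)   # equals 1 by construction
stopband_value = cheb_lowpass_response(1.5, roots)   # small in magnitude
```

To filter a graph signal with this polynomial, the same product can be applied with the Laplacian in place of `lam`, using only sparse matrix-vector products and no eigendecomposition.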
k-CG
Furthermore, a k-CG low-pass filter [?] can also be used to mimic a low-pass filter in the proposed framework of this invention.
1410) methods with k=1,2,3.
With the approach proposed in Section 3, it is assumed that the importance score associated with each point is independent of any other points, and hence the score is kept the same no matter which points have been selected during the sampling procedure. In this section, we consider the case in which the importance score of a point is actively updated during the sampling procedure; that is, the importance score of a point is affected by the points that have already been sampled. Intuitively, the importance score should become zero for a point that has already been sampled, while the score of the point farthest from the sampled point should be kept unchanged.
Sampling distribution π is directly used as a measure of importance score in this section.
Here, we propose an active importance score $\pi^a_{i,r}$ of a point $i$ relative to a reference point $r$.
where $x_f$ is the point farthest from the point $x_r$. After a new sampling point is determined, the importance score is updated as follows.
$$\pi \Leftarrow \pi \, \pi^a \qquad (25)$$
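The update in (25) can be sketched as follows. The exact expression of the active importance score is not reproduced in this text, so the sketch assumes one simple form that meets the two stated requirements: the factor is the distance to the reference point divided by the distance from the reference point to its farthest point, which is zero at the sampled point and one at the farthest point. The function names are hypothetical, and at least two distinct points are assumed.

```python
import math


def active_factor(points, r):
    """Assumed factor: distance to the reference point r, normalized by the farthest distance."""
    distances = [math.dist(p, points[r]) for p in points]
    farthest = max(distances)  # distance from x_r to the farthest point x_f
    return [d / farthest for d in distances]


def update_scores(pi, points, r):
    """Element-wise update pi <- pi * pi_a after point r has been sampled."""
    return [p * a for p, a in zip(pi, active_factor(points, r))]


# Hypothetical three collinear points with uniform initial scores; point 0 is sampled.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
updated = update_scores([1.0 / 3, 1.0 / 3, 1.0 / 3], points, r=0)
```

The updated scores would need to be renormalized before being used as a sampling distribution.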
Unfortunately, as the distances to all other points need to be computed, the calculation of $\pi^a$, and hence the updating of $\pi$, involves a high computational complexity. One way to limit the complexity is to avoid updating the importance scores after every new sampling point is selected. Instead, the scores can be updated only after a first group of points has been sampled and before a second group of points is to be sampled. This may degrade performance, as the updating is conducted at a rather coarse step. Next we propose two novel ways to alleviate the complexity burden.
Instead of computing the exact importance score updating factor $\pi^a$ for every point, the factor for points within a certain neighborhood could be shared,
where $\hat{x}_i$ is the centroid point of the voxel that $x_i$ belongs to. A fast algorithm, e.g. octree decomposition, can be used to partition points into voxels.
m-Hop Approximation
In another embodiment, we assume that there is no impact on points that are far enough away from a newly sampled point. Given that the radius $\tau$ was used to construct the graph over the point cloud, let $m\tau$ be the approximated radius within which the importance score needs to be updated.
Unfortunately, it is not straightforward to reduce the number of distance computations using the above formula, as all the distances need to be evaluated against a threshold. A naive method to alleviate the problem is again to use voxel centroids to approximate per-point distances. Apart from the naive method, in this subsection, we propose a novel method as shown in
As in
point, each column corresponding to an attribute. A is a determined graph operator (for example, in matrix form), or a function of a basic graph operator. The set of points that have been sampled previously is denoted by Min.
The objective is to find a new sampling set Mout with n new sampling points, by
considering impacts of old sampling points on their neighborhood within a radius of mτ.
In Step 1, initialize an operator $Q$ equal to $A$.
In Step 2, we replace the rows in $Q$ with $1_i$ for sampled points, $\forall i \in M_{in}$, where the entries of the row vector $1_i$ are all 0, except that the $i$-th entry is equal to 1. This change in the graph operator corresponds to a modification of the (directed) graph structure: the graph edges that are linked to the already sampled points are removed, and self-loop edges are added on the already sampled points.
In Step 3, the importance scores $\xi_M$ and $\pi_M$ are initialized as $\|X - AX\|_{row}$, where $\|\cdot\|_{row}$ is a row-wise norm operator. The initial importance score can be regarded as a measurement of local information based on the underlying graph structure.
In Step 4, the importance score $\xi_M$ is modified so that its entries corresponding to non-sampled points are reset to 0. Now $\xi_M$ represents the local information that is carried by the old sampling point set $M_{in}$.
In each iteration of Step 5, the local information is propagated from the old sampling points to their neighborhood within a radius of $\tau$. By looping $m$ times in Step 5, this information is propagated within a range of $m\tau$, and it is represented by $\xi_M$.
In Step 6, the information $\xi_M$ carried by the old sampling point set is subtracted from the total information $\pi_M$ of the original point cloud, which yields the new information measurement $\pi_M$.
Finally, in Step 7, we select a new sampling set $M_{out}$ with $n$ new sampling points based on the new information measurement $\pi_M$.
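Steps 1 through 7 can be sketched in Python with NumPy as follows. Several details are assumptions made for this example rather than part of the description: the propagation in Step 5 is taken to be a multiplication by $Q$, negative residual scores are clipped to zero in Step 6, and Step 7 picks the $n$ highest remaining scores (a random draw proportional to the scores would be an alternative). The function name `m_hop_select` and the 5-node path graph are hypothetical.

```python
import numpy as np


def m_hop_select(X, A, M_in, n, m):
    """Select n new points, discounting information already carried by M_in."""
    Q = A.copy()                               # Step 1
    Q[M_in, :] = 0.0                           # Step 2: rows of sampled points ...
    Q[M_in, M_in] = 1.0                        # ... become unit rows (self-loops)
    pi = np.linalg.norm(X - A @ X, axis=1)     # Step 3: row-wise norm of X - AX
    xi = pi.copy()
    sampled = np.zeros(len(pi), dtype=bool)
    sampled[M_in] = True
    xi[~sampled] = 0.0                         # Step 4: keep only sampled entries
    for _ in range(m):                         # Step 5: propagate m hops (assumed xi <- Q xi)
        xi = Q @ xi
    pi_new = np.clip(pi - xi, 0.0, None)       # Step 6 (clipping is an added safeguard)
    M_out = np.argsort(-pi_new)[:n]            # Step 7 (assumed: n highest scores)
    return M_out, pi_new


# Hypothetical 5-node path graph with a transition matrix and one attribute per point.
A = np.array([[0.0, 1.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.5, 0.0],
              [0.0, 0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 0.0, 1.0, 0.0]])
X = np.array([[3.0], [0.0], [4.0], [0.0], [1.0]])
M_out, pi_new = m_hop_select(X, A, M_in=[2], n=2, m=1)
```

In this example the already sampled point 2 ends with a score of zero and the scores of its direct neighbors are reduced, so the two new samples are taken from elsewhere when possible.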
The algorithm proposed above can be called iteratively to gradually grow the sampling set, achieving a hierarchical representation of the original point cloud that starts from a very coarse representation. The resulting method is a “single pass” decomposition procedure, as the subsequent processing of each layer can be invoked immediately after that layer is produced, without waiting for the generation of another representation layer.
In another embodiment, we may not perform the norm operation in Step 3, but instead keep $X - AX$ within $\xi_M$ to store the actual local information. Then, in Step 6, we perform the norm operation on $\xi_M$ before updating $\pi_M$. Propagating the local information rather than a local importance score has advantages, especially when the local information is multi-dimensional.
We consider resampling a set of reference points from a large-scale point cloud. Reference points are used to represent the locations of other points, which could be used in compression, indexing, and visualization of a point cloud. The idea is to first encode the coordinates of the reference points, and then encode the coordinates of all the other points relative to their closest reference points. Let $\mathcal{M} = (\mathcal{M}_1, \ldots, \mathcal{M}_K)$ denote a sequence of $K$ reference indices with $\mathcal{M}_j \in \{1, \ldots, N\}$. The objective is to find a set of reference points by minimizing the following function.
where minMjϵM∥xM
When the weights are uniform, (28) is similar to K-means clustering, except that the cluster centers come from the original points. We simply call (28) the weighted K-means clustering problem. However, we cannot use ordinary clustering algorithms for a large-scale 3D point cloud. It is known that K-means clustering is computationally difficult (NP-hard). Even though there are efficient heuristic algorithms, such as Lloyd's algorithm and its variants, these algorithms take iterations to find the clusters, and each iteration involves the computation of the distances between each of the K cluster centers and the N data points. When K is huge, the total computational cost is huge. Since we work with a large-scale point cloud and millions of reference points, we want to find reference points without iterating; in other words, the task is to efficiently choose seeding points for a weighted K-means clustering.
Inspired by K-means++ [?], our idea is to sequentially update the sampling distribution. Let $\pi^{(i)}$ denote the sampling distribution at the $i$-th time. Each time, we generate one sample from $\pi^{(i)}$ and update the sampling distribution according to this new sample. The sampling distribution is updated based on the Euclidean distance to the closest reference point, which avoids sampling many points in a local area. The difference between the proposed weighted K-means++ and the original K-means++ is that we also consider the feature property. The algorithm sees
In Step 1, initialize a set of reference indices $\mathcal{M} = \emptyset$ and a sampling distribution $\pi$, where the probability to sample the $i$-th sample is proportional to the feature value; that is, $\pi_i = w_i / \sum_j w_j$.
In Step 2, generate one reference index $\mathcal{M}_1$ from the sampling distribution $\pi$ and put it into the set $\mathcal{M}$.
In Step 3, repeat Steps 3.1 and 3.2 until the cardinality of $\mathcal{M}$ reaches $K$.
In Step 3.1, update the sampling distribution $\pi$ by assigning $\pi_i = w_i D^2(x_i) / \sum_j w_j D^2(x_j)$, where $D(x_i) = \min_{\mathcal{M}_j \in \mathcal{M}} \|x_i - x_{\mathcal{M}_j}\|_2$ is the Euclidean distance between the $i$th point and its closest reference point.
In Step 3.2, generate another reference index $\mathcal{M}_2$ from the sampling distribution $\pi$ and put it into the set $\mathcal{M}$.
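The seeding procedure in Steps 1 through 3.2 can be sketched with the Python standard library. The function name `weighted_kmeanspp` is hypothetical, and the sketch assumes that at least $K$ distinct points have a positive weight, since otherwise all scores become zero before $K$ references are found.

```python
import math
import random


def weighted_kmeanspp(points, weights, K, rng):
    """Pick K reference indices by weighted K-means++ style seeding."""
    N = len(points)
    refs = []                                              # Step 1: empty reference set
    total = sum(weights)
    pi = [w / total for w in weights]                      # Step 1: pi_i = w_i / sum_j w_j
    first = rng.choices(range(N), weights=pi, k=1)[0]      # Step 2: first reference index
    refs.append(first)
    # Squared distance from every point to its closest reference point so far.
    d2 = [math.dist(p, points[first]) ** 2 for p in points]
    while len(refs) < K:                                   # Step 3
        scores = [w * d for w, d in zip(weights, d2)]      # Step 3.1: w_i * D^2(x_i)
        nxt = rng.choices(range(N), weights=scores, k=1)[0]  # Step 3.2
        refs.append(nxt)
        d2 = [min(d, math.dist(p, points[nxt]) ** 2) for d, p in zip(d2, points)]
    return refs


# Hypothetical point cloud with one zero-weight point (index 1).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
          (5.0, 5.0, 5.0), (6.0, 5.0, 5.0)]
refs = weighted_kmeanspp(points, [1.0, 0.0, 1.0, 1.0, 1.0], K=3, rng=random.Random(0))
```

Because an already selected point has zero distance to itself, its score is zero and it is never selected twice; `random.choices` normalizes the scores internally, so no explicit division by the sum is needed in Step 3.1.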
Similarly to the original K-means++, we can derive a theoretical bound for the error. We can use techniques similar to those in Section 4.1 to alleviate the complexity burden.
In this section, we apply the proposed resampling strategies to a few applications: large-scale
visualization, robust shape modeling, feature analysis, and hierarchical representation.
In this task, we use the proposed resampling strategy to efficiently visualize large-scale urban scenes. Since we are sensitive to the contours of buildings and streets in an urban scene, instead of showing an entire point cloud, we only show a selected subset of points. We consider a large-scale dataset, which involves several natural scenes with over 3 billion points in total and covers a range of diverse urban scenes: churches, streets, railroad tracks, squares, villages, soccer fields, castles. In
To look into some details, we show two zoom-in examples in
building and a church, which contain 381,903 and 1,622,239 points, respectively. Sub-
In this task, we use the proposed resampling strategy to achieve robust shape modeling. The goal is to efficiently obtain a model by using a small subset of points, instead of using all the points in an original point cloud. We want this model to reflect the true surface of an object, especially for a noisy point cloud.
In
points. In this noiseless case, the surface of the fitness can be modeled by a sphere. Sub-
In many real cases, the original points are collected with noise. To simulate the noisy case, we add Gaussian noise with mean zero and variance 0.02 to each point. We first obtain a uniformly resampled point cloud. Then a denoised point cloud is obtained by the low-pass graph filtering (21), and the resampling strategy is based on (20). Finally, we fit spheres to four point clouds: the original ball (noise-free), the noisy ball (with Gaussian noise added), the ball uniformly resampled from the noisy ball, and the ball resampled using a proposed low-pass graph filter. The statistics of the spheres are shown in
With the proposed high-pass graph filter based resampling, a small set of points with high
importance scores can be determined. Since such a small subset of points corresponds to edges and corners in the point cloud, these points can be regarded as key points of the original point cloud.
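A minimal sketch of key-point selection by importance score is given below. The filter definition lies outside this section, so the sketch assumes a Haar-like high-pass graph filter I − A, where A is the row-normalized adjacency matrix of a radius graph with Gaussian edge weights; under that assumption, the score of a point is the squared distance between the point and the weighted mean of its neighbors. The names `high_pass_scores` and `top_k_keypoints`, the `sigma` parameter, and the deterministic top-k selection (used here instead of random sampling proportional to the scores) are all illustrative choices.

```python
import math


def high_pass_scores(points, radius, sigma=1.0):
    """Score each point by how far it lies from the weighted mean of its neighbors."""
    scores = []
    for i, p in enumerate(points):
        neighbors = []
        for j, q in enumerate(points):
            if i == j:
                continue
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            if d2 <= radius * radius:
                neighbors.append((q, math.exp(-d2 / (sigma * sigma))))
        total = sum(w for _, w in neighbors)
        if total == 0.0:
            # Assumption: an isolated point gets score zero rather than being flagged.
            scores.append(0.0)
            continue
        mean = [sum(w * q[k] for q, w in neighbors) / total for k in range(len(p))]
        scores.append(sum((a - m) ** 2 for a, m in zip(p, mean)))
    return scores


def top_k_keypoints(points, radius, k):
    """Return the sorted indices of the k points with the highest scores."""
    scores = high_pass_scores(points, radius)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])
```

On a square outline sampled at unit spacing, points in the middle of an edge have symmetric neighbors and score zero, whereas the four corners score highest, which matches the observation that high scores concentrate on edges and corners. The pairwise neighbor search is quadratic in the number of points; a spatial index would be needed for large clouds.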
Moreover, feature descriptors can be defined on the selected key points of the original point
cloud. Such a feature descriptor is a collection of local descriptors built on attributes associated with the points within a local neighborhood, e.g., location, gradients, orientations and scales.
Based on the derived feature descriptors, some point cloud analysis tasks can be conducted.
For example, given a query point cloud and a point cloud database, similar point clouds can be searched for and retrieved. Similarity between two point clouds can be measured by computing the differences between their feature descriptors.
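The retrieval idea above can be sketched as a nearest-descriptor search. This sketch assumes that each point cloud has already been summarized by a fixed-length numeric descriptor vector and that the difference between descriptors is measured by Euclidean distance; neither choice is specified in the text, and the names `descriptor_distance` and `retrieve_similar` are hypothetical.

```python
import math


def descriptor_distance(desc_a, desc_b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(desc_a, desc_b)))


def retrieve_similar(query_desc, database, top_n=1):
    """Return the names of the top_n database entries closest to the query.

    `database` maps a point cloud name to its descriptor vector.
    """
    ranked = sorted(
        database,
        key=lambda name: descriptor_distance(query_desc, database[name]),
    )
    return ranked[:top_n]
```

A smaller distance means a more similar point cloud, so the first returned name is the best match. Other difference measures could be substituted in `descriptor_distance` without changing the retrieval logic.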
Due to limited processing capabilities, the population of points often needs to be controlled.
For example, a point cloud rendering device may only be able to display up to a certain number of points at a time, or a device that extracts features from a point cloud may only be able to handle a point set of limited size because of the limited computing resources available. A certain density of points corresponds to a scale level of a point cloud. Furthermore, operations like zoom-in and zoom-out need a series of scale levels of a point cloud, which results in a requirement to produce a hierarchical representation of a point cloud.
Assuming that a higher layer of the representation contains more points, let S_i be the
set of points in the i-th layer, with i=1 being the coarsest layer and i=N the finest layer.
A naive method to generate such a hierarchical representation is to produce a series of point
subsets at different scales from the raw point cloud independently, using a preferred resampling strategy as proposed earlier. With such a naive method, a point in a coarser layer is not necessarily present in a finer layer.
However, it may be advantageous to require that every point p ∈ S_l also satisfies p ∈ S_k
whenever l < k. In this way, the information from a coarser layer is not discarded when switching to a finer layer. The new information from a finer layer is additive to a coarser layer to produce a refined output. In other words, dropping the information from the coarser layer when moving to a finer layer would waste storage space or transmission rate.
Below we propose an advanced approach to generate a hierarchical representation of a
point cloud in an iterative way.
Let the finest layer S_N be equal to the raw point cloud. Suppose we need to generate a
coarser layer S_i given that the finer layer S_{i+1} is available.
It is recommended to apply the resampling strategies proposed above to the point set S_{i+1}.
That is, an independent nearest-neighbor graph is constructed on the point set S_{i+1}, where the neighborhood needs to be defined with a proper radius by considering the density of points in the current layer i+1. The radius should increase from a finer layer to a coarser layer. A preferred random sampling method is then applied to generate a subset S_i from S_{i+1}. With the proposed procedure, a hierarchical representation is generated from the densest point set (finest resolution) to the sparsest point set (coarsest resolution).
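The iterative procedure can be sketched as follows. Each coarser layer is drawn from the layer just above it, so the layers are nested by construction. The sketch uses uniform random sampling as a stand-in for the preferred resampling strategy, which is an assumption made to keep the example short; the names `uniform_resample` and `build_hierarchy` and the `growth` factor for the radius are hypothetical.

```python
import random


def uniform_resample(indices, points, count, radius, rng):
    """Stand-in sampler: picks `count` indices uniformly and ignores the radius.

    A graph-based strategy would build a radius graph on `indices` and sample
    according to its importance scores instead.
    """
    return sorted(rng.sample(indices, count))


def build_hierarchy(points, layer_sizes, base_radius, growth=2.0,
                    resample=uniform_resample, seed=0):
    """Build nested layers of point indices.

    `layer_sizes` lists the sizes of the coarser layers from the coarsest
    upward; the finest layer always contains every point. The returned list
    is ordered from coarsest (first) to finest (last).
    """
    rng = random.Random(seed)
    layers = [list(range(len(points)))]  # finest layer: the raw point cloud
    radius = base_radius
    for count in reversed(layer_sizes):
        finer = layers[0]
        if count > len(finer):
            raise ValueError("a coarser layer cannot hold more points than a finer one")
        # Sample the next coarser layer only from the current finer layer.
        layers.insert(0, resample(finer, points, count, radius, rng))
        radius *= growth  # the neighborhood radius grows toward coarser layers
    return layers
```

Because every layer is a subset of the next finer one, a viewer or decoder moving to a finer layer only needs the points that were not already present in the coarser layer.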
When viewing the original point cloud, the viewer can perform zoom-in and zoom-out
operations by navigating between different layers of such a hierarchical representation.
In addition, a spatial scalable coding scheme can be designed following the hierarchical
representation generation. A preferred spatial scalable encoding scheme is a two-pass procedure, where the first pass is to generate the hierarchical representation from SN to S1 and the second pass is for actual coding from S1 to SN. That is, the encoding starts from the coarsest layer S1. Given that Si has been coded, we propose to encode the extra points Si+\Si in a predictive way based on Si.
In one implementation, we use an existing method, e.g., an octree coding method, to encode
the coarsest layer S_1. Next, in order to encode a finer layer S_i using the points in S_{i−1} as point predictors, we propose to cluster the points in S_i using the points in S_{i−1} as centroids based on Euclidean distances. In this way, the new points in S_i \ S_{i−1} can be efficiently predicted by the points in S_{i−1}.
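The clustering-based prediction can be sketched as follows. Each new point of the finer layer is assigned to its nearest point in the coarser layer, which acts as the centroid and predictor. Representing a new point as a centroid index plus a residual vector is an assumption of this sketch, since the text does not specify how the prediction error is coded; the names `nearest_index`, `predict_layer` and `reconstruct_layer` are hypothetical, and entropy coding of the output is omitted.

```python
def nearest_index(point, centroids):
    """Index of the centroid closest to `point` in Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )


def predict_layer(coarse_points, new_points):
    """Encode each new point as (centroid index, residual vector).

    `coarse_points` are the already coded points of the coarser layer;
    `new_points` are the points present only in the finer layer.
    """
    encoded = []
    for p in new_points:
        c = nearest_index(p, coarse_points)
        residual = tuple(a - b for a, b in zip(p, coarse_points[c]))
        encoded.append((c, residual))
    return encoded


def reconstruct_layer(coarse_points, encoded):
    """Decoder side: rebuild the new points from centroids and residuals."""
    return [
        tuple(b + r for b, r in zip(coarse_points[c], residual))
        for c, residual in encoded
    ]
```

Since each new point lies close to its centroid, the residuals are typically small in magnitude, which is what makes them cheaper to code than raw coordinates.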
In the present disclosure, we proposed a resampling framework to select a subset of points to extract key features and reduce the subsequent computation in a large-scale point cloud. We formulated an optimization problem to obtain the optimal sampling distribution, which is also guaranteed to be shift and rotation invariant. We then specified the feature extraction operator to be a graph filter and studied the resampling strategies based on all-pass, low-pass and high-pass graph filtering. Several applications, including large-scale visualization, robust shape modeling, feature descriptor extraction, hierarchical representation and coding, are presented to validate the effectiveness and efficiency of the proposed resampling methods.
The computer 1711 can include a power source 1754; depending upon the application, the power source 1754 may optionally be located outside of the computer 1711. Linked through bus 1756 can be a user input interface 1757 adapted to connect to a display device 1748, wherein the display device 1748 can include a computer monitor, camera, television, projector, or mobile device, among others. A printer interface 1759 can also be connected through bus 1756 and adapted to connect to a printing device 1732, wherein the printing device 1732 can include a liquid inkjet printer, solid ink printer, large-scale commercial printer, thermal printer, UV printer, or dye-sublimation printer, among others. A network interface controller (NIC) 1734 is adapted to connect through the bus 1756 to a network 1736, wherein time series data or other data, among other things, can be rendered on a third party display device, third party imaging device, and/or third party printing device outside of the computer 1711.
Still referring to
Also, the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
Also, the embodiments of the present disclosure may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts concurrently, even though shown as sequential acts in illustrative embodiments. Further, use of ordinal terms such as “first,” “second,” in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.
Although the present disclosure has been described with reference to certain preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the present disclosure. Therefore, it is the aspect of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the present disclosure.
Number | Date | Country |
---|---|---|
62417007 | Nov 2016 | US |