The technology disclosed in this specification (hereinafter, “the present disclosure”) relates to an information processing apparatus, an information processing method, and a mobile device that perform processing for creating an environmental map.
Conventionally, mobile devices that move autonomously, such as walking robots and transport vehicles, have been developed. In order to realize autonomous movement, a mobile device requires an environmental map or an obstacle map (hereinafter collectively referred to as an "environmental map") representing the regions of the moving space where objects such as obstacles exist and the regions where no object exists. Moreover, in the case of a legged robot having high traversing performance in places with stairs or unevenness, it is preferable that the environmental map be a 2.5-dimensional environmental map including height information, in preparation for a moving space having slopes or steps.
For example, an information processing apparatus that creates an environmental map including height information on the basis of three-dimensional sensor data obtained from a three-dimensional distance sensor has been proposed (see Patent Document 1). This information processing apparatus creates a first environmental map on the basis of first detection data in a first height range, acquires second detection data in a second height range, creates a superimposed image in which a projection image of third detection data in a third height range, obtained from the second detection data, is superimposed on the first environmental map, and creates a second environmental map by setting, in the first environmental map, an entry-prohibited area based on the superimposed image.
However, since three-dimensional sensor data is a huge amount of data including an enormous number of three-dimensional points, creating an environmental map including height information from such data requires an enormous amount of calculation. For example, a desktop personal computer (PC) or a cloud system equipped with a central processing unit (CPU) capable of high-speed operation or a graphics processing unit (GPU) capable of parallel operation, together with a large-capacity memory system, can cope with such data and calculation volumes. On the other hand, an embedded device such as a small robot is not rich in calculation resources, so that it is difficult for such a device to create a low-dimensional environmental map on the basis of high-dimensional sensor data.
An object of the present disclosure is to provide an information processing apparatus, an information processing method, and a mobile device that efficiently create a low-dimensional environmental map on the basis of high-dimensional sensor data.
The present disclosure has been made in view of the above-described problems, and a first aspect thereof is an information processing apparatus including: an object construction unit that clusters sensor data including a high-dimensional point cloud and constructs a high-dimensional object for each cluster; a projection unit that subjects the high-dimensional object to low-dimensional projection; and a map construction unit that constructs a map on a basis of an object that has been subjected to the low-dimensional projection.
The object construction unit constructs a high-dimensional object having a size, a shape, and a posture based on geometric information extracted from a set of sensor data included in a cluster.
The projection unit orthogonally projects the high-dimensional object onto a reference plane parallel to a ground to generate a low-dimensional object. Then, the map construction unit constructs a low-dimensional map on the basis of the object.
The information processing apparatus according to the first aspect further includes a hidden surface removal unit that performs hidden surface removal processing on the object that has been subjected to the low-dimensional projection by the projection unit. The hidden surface removal unit performs the hidden surface removal processing using a Z buffer that stores height information or an index of the height information.
When subjecting the high-dimensional object to low-dimensional projection, the projection unit holds information necessary for height estimation among the high-dimensional information of the object, and the map construction unit may construct an environmental map to which a height estimated on the basis of the held information is assigned.
Furthermore, a second aspect of the present disclosure is an information processing method including: an object construction step of clustering sensor data including a high-dimensional point cloud and constructing a high-dimensional object on a basis of geometric information extracted from a cluster; a projection step of subjecting the high-dimensional object to low-dimensional projection; and a map construction step of constructing a map on a basis of an object that has been subjected to the low-dimensional projection.
Furthermore, a third aspect of the present disclosure is a mobile device including: a main body; a moving unit that moves the main body; a sensor mounted on the main body; a map construction unit that constructs an environmental map on the basis of sensor data from the sensor; a route planning unit that plans a route on the basis of the environmental map; and a control unit that controls an operation of the moving unit on the basis of the planned route, in which the map construction unit clusters the sensor data including a high-dimensional point cloud, and constructs a map on the basis of an object obtained by reducing a dimension of a high-dimensional object constructed on the basis of geometric information extracted from a cluster.
According to the present disclosure, it is possible to provide an information processing apparatus, an information processing method, and a mobile device that reduce a data amount and a calculation amount when a low-dimensional environmental map is created from high-dimensional sensor data.
Note that the effects described in the present description are merely examples, and the effects brought by the present disclosure are not limited thereto. Furthermore, the present disclosure may further provide additional effects in addition to the effects described above.
Still other objects, features, and advantages of the present disclosure will become apparent from a more detailed description based on embodiments as described later and the accompanying drawings.
Hereinafter, the present disclosure will be described in the following order with reference to the drawings.
In order to create an environmental map including height information from three-dimensional sensor data, both the amount of data and the amount of calculation are enormous. It is therefore difficult to create the environmental map by real-time processing with the limited calculation resources available on a small robot. Of course, even on a high-performance calculation system equipped with a high-performance processor and a large-capacity memory system, creating the environmental map by real-time processing is not easy, and there are cases where it is handled by lowering the resolution and frame rate of the three-dimensional sensor to reduce the data amount. Therefore, it is desirable to drastically reduce the data amount and the calculation amount in order to realize real-time creation of the environmental map regardless of the scale of the calculation resources.
Therefore, in the present disclosure, when a low-dimensional environmental map is created from high-dimensional sensor data, only geometric information is first extracted by clustering the high-dimensional sensor data. That is, in the present disclosure, one or more objects having geometric information regarding size, shape, and posture are created on the basis of the original high-dimensional sensor data, and projection processing and mapping processing are performed by utilizing the geometric information of the objects. Therefore, as compared with a case where the three-dimensional sensor data is used as it is, the amount of data and the amount of computation handled in the subsequent projection processing and mapping processing can be greatly reduced.
Moreover, in the present disclosure, memory accesses and the amount of calculation are reduced by applying, to the projection processing of objects, a hidden-surface-removal and rasterization method using a Z buffer, a technique from the rendering processing of 3D computer graphics (3DCG).
Therefore, according to the present disclosure, it is possible to generate the environmental map including the height information in real time even on a calculation system with limited calculation resources, such as an embedded device of a small robot. Furthermore, according to the present disclosure, feedback of three-dimensional space information can be performed in real time even on such a calculation system.
The data acquisition unit 101 acquires sensor data including a high-dimensional point cloud from, for example, a sensor (not illustrated) such as a depth camera, a LiDAR, a distance measurement sensor, or a stereo camera, and accumulates the acquired sensor data in the sensor data accumulation unit 111.
The object construction unit 102 clusters the sensor data including a large number of high-dimensional point clouds accumulated in the sensor data accumulation unit 111. As a result, the subsequent processing can handle the high-dimensional point cloud in cluster units, and the processing load can be greatly reduced as compared with processing the high-dimensional point cloud as it is. For example, a technique of clustering a point cloud by nearest-neighbor search is known. However, as described above, calculation resources are limited, so that it is preferable to perform the clustering of the point cloud with a low processing load and in a short processing time.
The geometric information extraction unit 103 analyzes a high-dimensional point cloud included in the cluster, and extracts geometric information regarding the size, shape, and posture of a high-dimensional object corresponding to the cluster. Specifically, the geometric information extraction unit 103 performs principal component analysis (PCA) on the high-dimensional sensor data included in the cluster to obtain a mean μ and a variance-covariance matrix (covariance matrix) Σ.
Then, the object construction unit 102 constructs a high-dimensional object having a size, a shape, and a posture based on the geometric information extracted by the geometric information extraction unit 103 from the high-dimensional point cloud included in the cluster. Hereinafter, a cluster including a high-dimensional point cloud can be treated as one high-dimensional object. In the present embodiment, the object construction unit 102 constructs a high-dimensional object including an ellipsoid defined by the mean μ and the covariance matrix Σ obtained by principal component analysis of the cluster. The ellipsoid is centered on the mean μ. Furthermore, the covariance matrix Σ is subjected to eigenvalue decomposition; the eigenvectors define the inclination of the ellipsoid, and the eigenvalues define the lengths of the major and minor axes of the ellipsoid. Therefore, by expressing the original high-dimensional point cloud as an object having a simple high-dimensional shape such as an ellipsoid through clustering and geometric information extraction, the object construction unit 102 can greatly reduce the amount of data.
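As a reference, the construction of such an ellipsoid object from one cluster can be sketched in Python as follows. This is a minimal sketch assuming NumPy; the class name EllipsoidObject, the scale parameter, and the convention that the semi-axis lengths are proportional to the square roots of the eigenvalues are illustrative assumptions, not details taken from the present embodiment.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class EllipsoidObject:          # hypothetical container for one cluster
    mean: np.ndarray            # center of the ellipsoid (the mean μ)
    cov: np.ndarray             # variance-covariance matrix Σ
    axes: np.ndarray            # eigenvector columns: orientation (inclination)
    radii: np.ndarray           # semi-axis lengths derived from the eigenvalues

def build_ellipsoid(points: np.ndarray, scale: float = 2.0) -> EllipsoidObject:
    """Summarize one (N, 3) cluster of points as a single ellipsoid object."""
    mu = points.mean(axis=0)                   # mean of the cluster
    sigma = np.cov(points, rowvar=False)       # covariance matrix of the cluster
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigenvalue decomposition of Σ
    radii = scale * np.sqrt(np.maximum(eigvals, 0.0))
    return EllipsoidObject(mean=mu, cov=sigma, axes=eigvecs, radii=radii)
```

In this way, a cluster of hundreds or thousands of points is reduced to a mean, a 3 × 3 covariance matrix, and their eigendecomposition.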
The projection unit 104 projects the high-dimensional object constructed by the object construction unit 102 onto a lower dimension by orthogonal projection. Specifically, the projection unit 104 projects the ellipsoid object constructed by the object construction unit 102 onto the reference plane. Here, the reference plane is a plane parallel to the ground, and the axis (z axis) orthogonal to the reference plane is defined as the height. By this projection processing, an ellipsoid that is a three-dimensional object representing a cluster is orthogonally projected onto the reference plane and converted into a two-dimensional object made of an ellipsoid flat plate. Since the original three-dimensional sensor data is thus reduced to a two-dimensional object including an ellipsoid flat plate on the reference plane, the data amount is greatly reduced. However, among the three-dimensional information of the three-dimensional object before orthogonal projection, information necessary for height estimation in the subsequent processing is held.
The map construction unit 105 constructs the environmental map by assigning height information to each point of the low-dimensional object formed of the ellipsoid flat plate on the reference plane on the basis of the height information of the high-dimensional object before the low-dimensional projection. For example, the map construction unit 105 searches for the three-dimensional object nearest to a query point whose height is to be estimated on the basis of, for example, a KD-tree or a Mahalanobis distance. Then, the found three-dimensional object is approximated by an ellipsoid flat plate in the three-dimensional space, and the height from the query point on the reference plane to the approximated ellipsoid flat plate is calculated. Furthermore, the reliability of the height information calculated at the query point is calculated from the variance in the thickness direction of the three-dimensional object before projection. Then, the map construction unit 105 outputs the constructed low-dimensional environmental map to an environment accumulation unit 112.
Note that processing from acquisition of sensor data to construction of the environmental map by the information processing apparatus 100 is performed at a frame rate of the sensor, for example.
First, the data acquisition unit 101 acquires sensor data including a large number of three-dimensional point clouds from sensors such as a depth camera, a LiDAR, a distance measurement sensor, and a stereo camera mounted on, for example, a small robot, and accumulates the sensor data in the sensor data accumulation unit 111 (step S201).
Next, the object construction unit 102 clusters the sensor data of the enormous high-dimensional point cloud accumulated in the sensor data accumulation unit 111 to generate one or more clusters including the high-dimensional point cloud (step S202).
Next, the geometric information extraction unit 103 analyzes the high-dimensional point cloud included in the cluster, and extracts geometric information regarding a size, a shape, and a posture of the high-dimensional object corresponding to the cluster (step S203). Specifically, the geometric information extraction unit 103 performs principal component analysis on the three-dimensional sensor data included in the cluster to obtain a mean μ and a covariance matrix Σ.
Next, the object construction unit 102 constructs a high-dimensional object having a size, a shape, and a posture based on the geometric information extracted by the geometric information extraction unit 103 from the high-dimensional point cloud included in the cluster (step S204). Specifically, the object construction unit 102 constructs an ellipsoid based on the mean μ and the covariance matrix Σ calculated in step S203 as a three-dimensional object corresponding to the cluster.
Next, the projection unit 104 orthogonally projects the high-dimensional object constructed by the object construction unit 102 onto the low-dimensional plane (step S205). Specifically, the projection unit 104 orthogonally projects the ellipsoid object constructed by the object construction unit 102 onto a reference plane parallel to the ground, and converts the ellipsoid object into a two-dimensional object including an ellipsoid flat plate on the reference plane.
Next, the map construction unit 105 constructs an environmental map by assigning height information to each point of the low-dimensionally projected object on the basis of the height information before projection (step S206). Specifically, when the height information is assigned to each point of the two-dimensional object made of the ellipsoid flat plate on the reference plane, the three-dimensional object closest to the query point subject to the height calculation is approximated by an ellipsoid flat plate, and the height from the query point on the reference plane to the approximated flat plate is calculated. Furthermore, the reliability of the height information calculated at the query point is calculated from the variance in the thickness direction of the three-dimensional object before projection.
Next, each process performed to create the environmental map in the information processing apparatus 100 will be described in detail with appropriate reference to the drawings.
The data acquisition unit 101 acquires sensor data including a large number of three-dimensional point clouds from the sensor.
The object construction unit 102 clusters sensor data including a large number of three-dimensional point clouds. For example, a technique of clustering a point cloud by nearest-neighbor search is known. However, as described above, calculation resources are limited, so that it is preferable to perform the clustering of the point cloud with a low processing load and in a short processing time. For that purpose, clustering of the point cloud may be performed with a low processing load and in a short processing time on the basis of, for example, the method disclosed in Patent Document 2. The clustering method used in the present embodiment will be briefly described below.
First, whether to divide a large number of three-dimensional point clouds acquired by the data acquisition unit 101 is determined on the basis of a predetermined division condition, and the point cloud is divided until the division condition is no longer satisfied. The division condition here includes one or more of a condition for the number of points included in the point cloud, a condition for the size of the region surrounding the point cloud, a condition for the density of points in the point cloud, and a condition of the point cloud.
Then, each point cloud that no longer satisfies the division condition is evaluated against a predetermined clustering condition, and is formed into a cluster in a case where the clustering condition is satisfied. The clustering condition here includes one or more of a condition for the number of points included in the point cloud, a condition for the size of the region surrounding the point cloud, a condition for the density of points in the point cloud, and a condition of the point cloud.
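The division-and-clustering flow described above can be outlined in Python as follows. This is only a schematic sketch: the threshold values (max_points, min_points, max_size) and the median split along the longest axis are hypothetical choices for illustration, not the method disclosed in Patent Document 2.

```python
import numpy as np

def split_and_cluster(points, max_points=500, min_points=10, max_size=1.0):
    """Recursively split an (N, 3) point cloud while the division condition
    holds, then emit each remaining group that satisfies the clustering
    condition as one cluster (illustrative thresholds only)."""
    clusters = []

    def recurse(pts):
        extent = pts.max(axis=0) - pts.min(axis=0)     # size of bounding region
        divide = len(pts) > max_points or extent.max() > max_size
        if divide and len(pts) >= 2 * min_points:
            axis = int(np.argmax(extent))              # split along longest axis
            median = np.median(pts[:, axis])
            left = pts[pts[:, axis] <= median]
            right = pts[pts[:, axis] > median]
            if len(left) and len(right):
                recurse(left)
                recurse(right)
                return
        if len(pts) >= min_points:                     # clustering condition
            clusters.append(pts)

    recurse(points)
    return clusters
```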
The geometric information extraction unit 103 analyzes the set of three-dimensional sensor data included in the cluster, and extracts geometric information on the size, shape, and posture of the object corresponding to the cluster. Specifically, the geometric information extraction unit 103 performs principal component analysis on the three-dimensional sensor data included in the cluster to obtain a mean μ and a covariance matrix Σ. Then, the object construction unit 102 constructs a three-dimensional object including an ellipsoid defined by the mean μ and the covariance matrix Σ.
In the subsequent processing, the data amount can be greatly reduced by using a three-dimensional object constructed on the basis of clustering and geometric information extraction of each cluster instead of the original large number of three-dimensional point clouds.
The projection unit 104 orthographically projects one or more three-dimensional objects constructed by the object construction unit 102 onto a reference plane parallel to the ground. Here, an axis (z axis) orthogonal to the reference plane is defined as a height. As a result of the orthogonal projection, the three-dimensional object is reduced in dimension to a two-dimensional object on the reference plane, and the data amount can be further reduced. However, among the three-dimensional information of the three-dimensional object before orthogonal projection, information necessary for height estimation in the subsequent map construction processing is held. Examples of the information necessary for the height estimation include the expected value and variance of the three-dimensional point cloud data included in each three-dimensional object, and the normal vector of the object.
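The projection of one ellipsoid object, together with the information held for height estimation, can be sketched as follows. The sketch assumes that the reference plane is z = 0 and that NumPy is available; the held dictionary is a hypothetical container for the retained quantities named above.

```python
import numpy as np

def project_ellipsoid(mean3: np.ndarray, cov3: np.ndarray):
    """Orthogonally project a 3D ellipsoid (mean, covariance) onto the z = 0
    reference plane. The shadow of the ellipsoid x^T Σ⁻¹ x <= 1 on the x-y
    plane is the ellipse described by the top-left 2x2 block of Σ."""
    mean2 = mean3[:2]                    # drop the height component of the center
    cov2 = cov3[:2, :2]                  # covariance of the projected 2D ellipse
    eigvals, eigvecs = np.linalg.eigh(cov3)
    held = {                             # information held for height estimation
        "z_expected": float(mean3[2]),   # expected value of the height
        "z_variance": float(cov3[2, 2]), # variance along the z axis
        "normal": eigvecs[:, 0],         # shortest axis ≈ normal of the flat plate
    }
    return mean2, cov2, held
```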
The map construction unit 105 constructs an environmental map by assigning height information to each point of a two-dimensional object formed of an ellipsoid flat plate on a reference plane on the basis of the height information of the three-dimensional object before orthogonal projection.
In the estimation calculation of the height information, the map construction unit 105 approximates each three-dimensional object before orthogonal projection by a flat plate in the three-dimensional space.
Next, the map construction unit 105 searches for the three-dimensional object nearest to the query point to be subjected to height estimation on the basis of, for example, a KD-tree or a Mahalanobis distance.
Then, the map construction unit 105 calculates the expected value of the height by linear interpolation at the point corresponding to the query point on the approximate ellipsoid flat plate 702. Furthermore, the map construction unit 105 calculates, as the reliability of the height information calculated at the query point, the variance in the thickness direction of the original ellipsoid object 502 at the point corresponding to the query point on the approximate ellipsoid flat plate 702. The variance in the thickness direction may be simply calculated by projecting the variance in the normal direction of the approximate ellipsoid flat plate 702 onto the z direction.
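Under the assumption that the nearest three-dimensional object has already been found, the plate approximation and the height estimation at one query point can be sketched as follows. The function name and the reliability formula (projecting the normal-direction variance onto the z axis as variance × n_z²) are one simple illustrative reading of the above description, not the definitive implementation.

```python
import numpy as np

def estimate_height(query_xy, mean3, cov3):
    """Approximate the nearest 3D ellipsoid by a flat plate through its mean and
    estimate the expected height and its reliability at a 2D query point."""
    eigvals, eigvecs = np.linalg.eigh(cov3)
    normal = eigvecs[:, 0]                    # thickness (shortest-axis) direction
    var_n = eigvals[0]                        # variance in the thickness direction
    if abs(normal[2]) < 1e-6:                 # nearly vertical plate: no estimate
        return None, None
    dx = query_xy[0] - mean3[0]
    dy = query_xy[1] - mean3[1]
    # Solve the plane equation n · (p - mean) = 0 for z at the query point:
    z = mean3[2] - (normal[0] * dx + normal[1] * dy) / normal[2]
    reliability_var = var_n * normal[2] ** 2  # variance projected onto the z axis
    return z, reliability_var
```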
A specific example of creating an environmental map using the information processing apparatus 100 according to the present embodiment will be described. Here, it is assumed that an environmental map near the entrance 800 of a building is created.
The map construction unit 105 constructs the environmental map by assigning height information to each point of the two-dimensional object projected onto the reference plane. The map construction unit 105 estimates the expected value of the height at each query point of the two-dimensional object by linear interpolation on an ellipsoid flat plate approximating the three-dimensional ellipsoid object.
In the above-described section B, the basic configuration and operation of the information processing apparatus 100 that performs the creation processing of the environmental map have been described. In this section C, examples of an information processing apparatus that further reduces the amount of data and the amount of computation in the environmental map creation process will be described.
When the projection unit 1304 orthogonally projects a plurality of three-dimensional objects onto the reference plane to make them two-dimensional, the hidden surface removal unit 1305 performs processing of erasing any portion that falls in the shadow of another two-dimensional object and cannot be seen from a viewpoint above. In the present example, the hidden surface removal processing is performed using the Z buffer 1306, which is a local memory that stores height information for each pixel. As a result, in a case where adjacent three-dimensional objects overlap in the height direction, the hidden surface removal unit 1305 removes the overlapping region of the lower three-dimensional object hidden by the higher three-dimensional object, further reducing the data amount.
In the first example, the Z buffer 1306 stores, as a Z value for each pixel (or each region serving as a processing unit) of the reference plane, the height information of the three-dimensional objects orthogonally projected so far. Then, when the projection unit 1304 orthogonally projects another three-dimensional object onto the same pixel, the height value stored in the Z buffer 1306 is compared with the height value of the three-dimensional object orthogonally projected this time (that is, a Z test). If the current height is higher, the orthogonal projection processing of the current three-dimensional object onto the reference plane is performed, and the height information of the pixel in the Z buffer 1306 is updated. On the other hand, if the current height is lower, the object is hidden under the already orthogonally projected two-dimensional object and cannot be seen, so that it is not necessary to newly perform the orthogonal projection processing onto the reference plane. If the orthogonal projection is performed starting from the three-dimensional object existing at the highest position and the three-dimensional object existing at the lowest position is processed last, calculation time can be saved. Therefore, according to the first example, by using the Z buffer 1306, the hidden surface removal unit 1305 can efficiently perform the hidden surface removal processing on the low-dimensional objects projected by the projection unit 1304.
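A minimal sketch of this Z-test loop follows. Here pixels() is a hypothetical method that yields the (row, column, height) fragments covered by one projected object, and the buffer size is an arbitrary assumption.

```python
import numpy as np

def rasterize_with_z_test(objects, shape=(200, 200)):
    """Paint projected objects into a Z buffer, keeping the highest surface at
    each pixel so that lower, hidden surfaces are discarded."""
    z_buffer = np.full(shape, -np.inf)          # stored height per pixel
    id_buffer = np.full(shape, -1, dtype=int)   # which object is visible where
    for k, obj in enumerate(objects):
        for r, c, h in obj.pixels():            # hypothetical fragment generator
            if h > z_buffer[r, c]:              # Z test: keep only if higher
                z_buffer[r, c] = h              # update the stored height
                id_buffer[r, c] = k
    return z_buffer, id_buffer
```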
Hidden surface removal is well known in the field of 3DCG. In that field, other methods such as a scan line method and a ray tracing method are known in addition to the method using a Z buffer. In the present embodiment, the hidden surface removal method using the Z buffer is adopted from the viewpoint of memory access and calculation amount.
Also in a second example, similarly to the first example, when the projection unit 1304 orthogonally projects a plurality of three-dimensional objects onto the reference plane to make them two-dimensional, the hidden surface removal processing is performed using the Z buffer. Like the first example, the second example is realized by the information processing apparatus 1300 having the functional configuration described above.
In the first example, the Z buffer 1306 stores the height information itself for each pixel. In the second example, on the other hand, the height information is indexed, and the Z buffer 1306 stores the index of the height information. Therefore, according to the second example, the capacity of the Z buffer 1306 can be reduced.
In the second example, in a case where the projection unit 1304 orthogonally projects another three-dimensional object onto the same pixel, the height information is acquired on the basis of the index previously stored in the Z buffer 1306, and a Z test against the height information of the three-dimensional object orthogonally projected this time is performed. The hidden surface removal processing according to the result of the Z test is similar to that of the first example.
Also in a third example, similarly to the first and second examples, when the projection unit 1304 orthogonally projects a plurality of three-dimensional objects onto the reference plane to make them two-dimensional, the hidden surface removal processing is performed using the Z buffer. Like the first example, the third example is realized by the information processing apparatus 1300 having the functional configuration described above.
In the third example, a bounding box is obtained in advance for each three-dimensional object before orthogonal projection by the projection unit 1304. The bounding box is a rectangular box of the minimum size that just encloses the object. Then, in a case where the projection unit 1304 orthogonally projects a certain three-dimensional object, the hidden surface removal unit 1305 limits the region to be processed on the Z buffer 1306 on the basis of the bounding box obtained in advance for that three-dimensional object.
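The restriction of the processed region can be sketched as follows. The Mahalanobis radius k and the grid resolution cell are illustrative assumptions; the half-extent formula is the standard bounding-box property of an ellipse defined by a covariance matrix.

```python
import numpy as np

def ellipse_bounding_box(mean2, cov2, k=2.0, cell=0.05):
    """Axis-aligned bounding box, in grid cells, of the projected ellipse
    (x - mean)^T cov⁻¹ (x - mean) <= k². The half-extent along each axis is
    k · sqrt of the corresponding diagonal element of cov."""
    half = k * np.sqrt(np.diag(cov2))
    lo = np.floor((np.asarray(mean2) - half) / cell).astype(int)
    hi = np.ceil((np.asarray(mean2) + half) / cell).astype(int)
    return lo, hi      # only cells within [lo, hi] need to touch the Z buffer
```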
Also in a fourth example, similarly to the first to third examples, when the projection unit 1304 orthogonally projects a plurality of three-dimensional objects onto the reference plane to make them two-dimensional, the hidden surface removal processing is performed using the Z buffer. Like the first example, the fourth example is realized by the information processing apparatus 1300 having the functional configuration described above.
In the fourth example, the projection unit 1304 sorts the three-dimensional objects orthogonally projected onto the reference plane in the height direction, and then the hidden surface removal unit 1305 performs the hidden surface removal processing.
In a case where the hidden surface removal unit 1305 performs the hidden surface removal processing of a three-dimensional object using the Z buffer 1306, if the height of the current three-dimensional object is lower, the object is hidden under the already orthogonally projected two-dimensional objects and cannot be seen, so that it is not necessary to newly perform the orthogonal projection processing onto the reference plane. Therefore, according to the fourth example, since the hidden surface removal unit 1305 performs the hidden surface removal processing in the order sorted in the height direction, the Z test for the current three-dimensional object can only result in "lower" at any pixel that has already been written, and such pixels can be rejected immediately without updating the buffer. Therefore, the calculation processing in the Z buffer 1306 can be simplified, and calculation time can be saved.
Also in a fifth example, similarly to the first to fourth examples, when the projection unit 1304 orthogonally projects a plurality of three-dimensional objects onto the reference plane to make them two-dimensional, the hidden surface removal processing is performed using the Z buffer. Like the first example, the fifth example is realized by the information processing apparatus 1300 having the functional configuration described above.
In the fifth example, the hidden surface removal unit 1305 uses an incremental method for the height calculation in the Z buffer 1306. That is, in the Z test for comparing the height value stored in the Z buffer 1306 with the height value of the three-dimensional object orthogonally projected this time, the height of the current three-dimensional object is not calculated independently for each pixel; instead, a height increment is added to the previous Z value. For example, if the flat plate approximating the three-dimensional object lies on a plane with normal vector (a, b, c), the height changes by the constant amount −a/c for every one-pixel step in the x direction, so the heights along a scan line can be obtained by repeated addition.
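The incremental calculation along one scan line can be sketched as follows, assuming that the plane coefficients and the starting height at the first pixel are given.

```python
def scanline_heights(z0, a, c, n_pixels, dx=1.0):
    """Incremental height calculation along one scan line. For a flat plate on
    the plane a·x + b·y + c·z + d = 0, one pixel step in x changes z by the
    constant -a·dx/c, so each height costs one addition instead of a solve."""
    dz = -a * dx / c            # constant increment derived from the normal
    heights = []
    z = z0
    for _ in range(n_pixels):
        heights.append(z)
        z += dz                 # incremental update of the Z value
    return heights
```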
Therefore, according to the fifth example, by performing the height calculation using the incremental method, the calculation processing in the Z buffer 1306 can be simplified, and the calculation time can be saved.
Also in a sixth example, similarly to the first to fifth examples, when the projection unit 1304 orthogonally projects a plurality of three-dimensional objects onto the reference plane to make them two-dimensional, the hidden surface removal processing is performed using the Z buffer. Like the first example, the sixth example is realized by the information processing apparatus 1300 having the functional configuration described above.
In the sixth example, the Z buffer 1306 is divided into a plurality of tiles, and the hidden surface removal unit 1305 performs the height calculation (Z test), which compares the height value stored in the Z buffer 1306 with the height value of the three-dimensional object orthogonally projected this time, on the same memory for each tile.
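A sketch of such tile-by-tile processing follows; the tile size and the fragment representation are assumptions for illustration. Because different tiles touch disjoint regions of memory, the per-tile loops could also run in parallel, which corresponds to the idea of the seventh example described next.

```python
import numpy as np

def z_test_tiled(z_buffer: np.ndarray, fragments, tile: int = 32) -> np.ndarray:
    """Group projected fragments (row, col, height) by tile so that all Z tests
    for one tile run over a small, contiguous block of the buffer."""
    buckets = {}
    for r, c, h in fragments:
        buckets.setdefault((r // tile, c // tile), []).append((r, c, h))
    for frags in buckets.values():     # one tile's data stays cache-resident
        for r, c, h in frags:
            if h > z_buffer[r, c]:     # Z test confined to the tile
                z_buffer[r, c] = h
    return z_buffer
```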
Also in a seventh example, similarly to the first to sixth examples, when the projection unit 1304 orthogonally projects a plurality of three-dimensional objects onto the reference plane to make them two-dimensional, the hidden surface removal processing is performed using the Z buffer. Like the first example, the seventh example is realized by the information processing apparatus 1300 having the functional configuration described above.
In the seventh example, similarly to the sixth example, the Z buffer 1306 is divided into a plurality of tiles, but the hidden surface removal unit 1305 simultaneously performs height calculations of a plurality of tiles.
After the hidden surface removal unit 2205 performs the hidden surface removal processing on the low-dimensional objects, the post filter processing unit 2207 performs filter processing such as processing of filling gaps generated between adjacent low-dimensional objects and processing of smoothing data (height, expected value, variance, and the like) that is discontinuous between adjacent low-dimensional objects.
The data acquisition unit 2501 acquires sensor data including a three-dimensional point cloud from a sensor mounted on, for example, a small robot, such as a depth camera, a LiDAR, a distance measurement sensor, or a stereo camera. The three-dimensional point cloud data acquired by the data acquisition unit 2501 is temporarily stored in a sensor data accumulation unit 2511.
The clustering unit 2502 clusters the large number of three-dimensional point clouds accumulated in the sensor data accumulation unit 2511. The clustering unit 2502 performs the clustering of the three-dimensional point clouds with a low processing load and in a short processing time using, for example, the technology disclosed in Patent Document 2. Furthermore, after performing principal component analysis on the three-dimensional point cloud included in each cluster to obtain the mean μ and the covariance matrix Σ, the clustering unit 2502 further performs eigenvalue decomposition on the covariance matrix Σ to generate an ellipsoid object having the mean μ as its center, the eigenvectors as the inclination of the ellipsoid, and the eigenvalues as the lengths of the major and minor axes of the ellipsoid.
Therefore, according to the ninth example, a low-dimensional environmental map can be suitably constructed in a case where the data acquisition unit 2501 acquires sensor data including a three-dimensional point cloud from a depth camera, a LiDAR, a distance measurement sensor, a stereo camera, or the like.
Similarly to the ninth example, a tenth example is realized by using the information processing apparatus 2500 having the functional configuration described above.
The two-dimensional projection unit 2503 orthogonally projects each cluster output from the clustering unit 2502 onto a reference plane parallel to the ground to generate a two-dimensional object including an ellipsoid flat plate. In the following processing, each cluster is treated as an ellipsoid flat plate. Therefore, according to the tenth example, it is possible to obtain an environmental map with necessary and sufficient accuracy while simplifying subsequent processing such as hidden surface removal and map construction.
The two-dimensional projection and normal component extraction unit 2603 orthogonally projects each cluster output from the clustering unit 2502 onto a reference plane parallel to the ground to generate a two-dimensional object including an ellipsoid flat plate. At that time, the two-dimensional projection and normal component extraction unit 2603 extracts a normal component of each ellipsoid object as information indicating the uncertainty of the original three-dimensional point cloud. Then, the map construction unit 2606 constructs a map in which information indicating uncertainty is assigned to each two-dimensional object after the hidden surface removal processing by the hidden surface removal unit 2603.
The map construction unit 2606 gives information indicating uncertainty to each ellipsoid flat plate object after the hidden surface removal processing by the hidden surface removal unit 2603 on the basis of the normal component extracted from the ellipsoid object before projection. For example, the map construction unit 2606 may color-code each ellipsoid flat plate object after the hidden surface removal processing according to the normal component of the corresponding ellipsoid object before projection.
According to an eleventh example, a map reflecting uncertainty of sensor data can be obtained. For example, the mobile robot can take a safe action such as selecting a more reliable route by using a map to which information indicating uncertainty is given.
The data input unit 2901 inputs sensor data from one or more sensor elements that recognize an external environment or output recognition data corresponding to the recognized external environment.
The depth calculation unit 2903 inputs sensor data via the guided filter 2902, executes depth calculation, and calculates a three-dimensional position of an object (such as an obstacle) existing in an external environment.
The environmental map construction unit 2904 constructs the three-dimensional environmental map of the current time on the basis of the result of the depth calculation by the depth calculation unit 2903. The environmental map construction unit 2904 stores the constructed three-dimensional environmental map of the current time in the three-dimensional environmental map accumulation unit 2910.
The guide generation unit 2905 generates the guide information on the basis of the three-dimensional environmental map of the previous time as the feedback information in the processing of constructing the three-dimensional environmental map from the sensor data. The guided filter 2902 processes the sensor data input from the data input unit 2901 on the basis of the guide information input from the guide generation unit 2905. The guided filter 2902 removes noise or outliers included in the sensor data input from the data input unit 2901 on the basis of the guide information input from the guide generation unit 2905, for example. The guided filter 2902 may be configured by, for example, a guided filter (see Non-Patent Document 1). The guided filter 2902 may include, for example, a Bayesian filter including a least squares regression (ridge regression) with a regularization term. As a result, the guided filter 2902 can obtain data from which noise or outliers have been removed. The guided filter 2902 outputs the processed data to the depth calculation unit 2903. The environmental map construction unit 2904 constructs the environmental map at the current time using a result of depth calculation from the data processed by the guided filter 2902.
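As one possible reading of this processing, a minimal one-dimensional guided filter in the spirit of Non-Patent Document 1 can be sketched as follows. Treating the sensor data as a 1D depth signal and the guide derived from the previous map as another 1D signal is an assumption for illustration, and eps plays the role of the regularization term in the ridge regression mentioned above.

```python
import numpy as np

def box_mean(x: np.ndarray, r: int) -> np.ndarray:
    """Mean over a sliding window of radius r (simple box filter)."""
    kernel = np.ones(2 * r + 1) / (2 * r + 1)
    return np.convolve(x, kernel, mode="same")

def guided_filter_1d(p: np.ndarray, guide: np.ndarray, r: int = 4,
                     eps: float = 1e-2) -> np.ndarray:
    """Smooth the noisy signal p while following edges present in guide, using
    a per-window linear model p ≈ a·guide + b."""
    mean_i = box_mean(guide, r)
    mean_p = box_mean(p, r)
    var_i = box_mean(guide * guide, r) - mean_i ** 2
    cov_ip = box_mean(guide * p, r) - mean_i * mean_p
    a = cov_ip / (var_i + eps)       # eps acts as the ridge regularization term
    b = mean_p - a * mean_i
    return box_mean(a, r) * guide + box_mean(b, r)
```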
A three-dimensional point cloud constituting the three-dimensional environmental map of the previous time is input to the guide generation unit 2905. The clustering unit 3001 clusters the three-dimensional point cloud, and expresses each cluster by a three-dimensional object including an ellipsoid. The two-dimensional projection unit 3002 orthogonally projects each three-dimensional object onto a reference plane to form a two-dimensional object. The hidden surface removal unit 3003 performs a hidden surface removal process between the two-dimensional objects using the Z buffer 3004. Then, the guide generation unit 2905 outputs, as the feedback information, the guide information including the reference plane on which each two-dimensional object after the hidden surface removal processing is arranged.
According to the twelfth example, in the information processing apparatus 2900 that constructs the three-dimensional environmental map from the sensor data using the guided filter 2902, the guide information serving as the feedback information is generated on the basis of the present disclosure, whereby the feedback processing can be performed efficiently with a lower calculation load.
The data acquisition unit 3101 acquires sensor data including three-dimensional point cloud data from a sensor such as a depth camera, a LiDAR, a distance measurement sensor, or a stereo camera, and temporarily stores the sensor data in the sensor data accumulation unit 3111.
The clustering unit 3102 clusters the large number of three-dimensional point clouds accumulated in the sensor data accumulation unit 3111 using, for example, the technology disclosed in Patent Document 2, and expresses each cluster as a three-dimensional object including an ellipsoid.
Furthermore, when the two-dimensional projection unit 3103 projects a three-dimensional object onto the reference plane, information necessary for height estimation among the high-dimensional information of the object is held. Examples of the information necessary for the height estimation include the expected value and variance of the three-dimensional point cloud data included in each three-dimensional object, and the normal vector of the object.
The map construction unit 3106 constructs a two-dimensional environmental map (HeightMap) including height information by assigning height information to each point on the reference plane after the hidden surface removal processing on the basis of the information (expected value, variance, normal vector, and the like) held for each ellipsoid flat plate. Specifically, the map construction unit 3106 approximates the ellipsoid object before projection corresponding to an ellipsoid flat plate by a plane, and calculates the height information by linearly interpolating the expected value of the height at each query point in the ellipsoid flat plate.
Furthermore, the map construction unit 3106 projects the variance in the thickness direction of the corresponding original three-dimensional object onto the z axis to obtain the variance at each query point in the ellipsoid flat plate, and uses the obtained variance as the reliability of the height information.
The HeightMap is a map method widely used in robots that move in a non-planar environment such as legged robots. According to the thirteenth example, it is possible to efficiently construct the HeightMap by reducing the data amount and the operation amount.
Heretofore, the first to thirteenth examples in which a two-dimensional environmental map is constructed by projecting sensor data mainly including a three-dimensional point cloud onto a two-dimensional reference plane have been mainly described. On the other hand, in item C-14, a description will be given of a fourteenth example in which a two-dimensional point cloud obtained from a two-dimensional sensor is linearly projected to construct a one-dimensional environmental map.
The data acquisition unit 3201 acquires sensor data from a two-dimensional sensor (not illustrated). The two-dimensional sensor is mounted, for example, so as to set the front or the traveling direction of the small robot as the viewing angle, and outputs sensor data including a two-dimensional point cloud that captures an obstacle or the like existing in front of the small robot. The two-dimensional point cloud data acquired by the data acquisition unit 3201 is temporarily stored in a sensor data accumulation unit 3211.
The clustering unit 3202 clusters the large number of two-dimensional point clouds accumulated in the sensor data accumulation unit 3211 using, for example, the technology disclosed in Patent Document 2, and constructs a two-dimensional object from each cluster. The constructed two-dimensional object is an ellipse having, as geometric information, the mean μ and the covariance matrix Σ of the two-dimensional point cloud in the cluster.
The one-dimensional projection unit 3203 orthogonally projects each two-dimensional object constructed by the clustering unit 3202 onto, for example, a reference straight line orthogonal to the traveling direction of the small robot, to make each two-dimensional object one-dimensional.
Furthermore, when the one-dimensional projection unit 3203 projects the two-dimensional object on the reference straight line, information necessary for depth estimation among the two-dimensional information of the object is held. Examples of the information necessary for the depth estimation include the expected value and variance of the two-dimensional point cloud data included in each two-dimensional object, and the normal vector of the object.
When the one-dimensional projection unit 3203 orthogonally projects the plurality of two-dimensional objects onto the reference straight line to make them one-dimensional, the hidden surface removal unit 3204 performs processing of erasing any portion that falls in the shadow of another one-dimensional object as seen from the viewpoint. The Z buffer 3205 stores, as a Z value for each pixel (or each region serving as a processing unit) of the reference straight line, the depth information of the two-dimensional objects orthogonally projected so far.
Then, when the one-dimensional projection unit 3203 orthogonally projects another two-dimensional object onto the same pixel, the depth value stored in the Z buffer 3205 is compared with the depth value of the two-dimensional object orthogonally projected this time. If the current depth is smaller, the orthogonal projection processing of the current two-dimensional object onto the reference straight line is performed, and the depth information of the pixel in the Z buffer 3205 is updated. On the other hand, if the current depth is larger, the object is hidden behind the already orthogonally projected one-dimensional object and cannot be seen, so that it is not necessary to newly perform the orthogonal projection processing onto the reference straight line.
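This depth test on the reference straight line can be sketched as follows; representing each projected object as a pixel interval carrying a single depth value is a simplifying assumption.

```python
import numpy as np

def depth_buffer_1d(segments, n_pixels: int = 320) -> np.ndarray:
    """Project 2D objects onto a reference straight line with a depth test.
    Each segment is (first_pixel, last_pixel, depth); the smallest depth,
    i.e., the nearest obstacle, wins at every pixel."""
    z_buffer = np.full(n_pixels, np.inf)       # stored depth per pixel
    for lo, hi, depth in segments:
        for px in range(max(lo, 0), min(hi + 1, n_pixels)):
            if depth < z_buffer[px]:           # nearer than the stored depth?
                z_buffer[px] = depth           # update with the nearer object
    return z_buffer
```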
As described above, according to the fourteenth example, it is possible to efficiently construct a one-dimensional environmental map with depth information while reducing the data amount and the calculation amount. Conventionally, in order to acquire depth information, it has been necessary to perform ray-cast processing of casting an invisible line (ray) from the current position toward the planned movement destination and determining whether there is a colliding object on the line. According to the fourteenth example, however, such processing is unnecessary. Furthermore, the one-dimensional environmental map with depth information constructed using the fourteenth example can be used for route planning of a robot that autonomously moves on a two-dimensional plane.
The robot can estimate the presence or absence of an obstacle in the traveling direction on the basis of the one-dimensional environmental map with depth information.
In this section, an example in which the information processing apparatus according to the present disclosure is applied to a robot will be described. As described above, since the information processing apparatus according to the present disclosure can create the environmental map by greatly reducing the data amount and the calculation amount, it is practically possible to mount the information processing apparatus on a small robot with limited calculation resources. The small robot can make a route plan using the created environmental map and autonomously perform movement work on the basis of the route plan.
The visual sensor 3802 is a sensor that visually recognizes the environment around the robot device 3800, and includes, for example, at least one of a camera (including a stereo camera), an infrared camera, a time of flight (TOF) sensor, a LiDAR, and the like, and outputs sensor data including, for example, a three-dimensional point cloud. The visual sensor 3802 is attached to the main body 3801 via the joint 3803 for moving a line-of-sight direction of the visual sensor 3802 up, down, left, and right. Furthermore, the robot device 3800 may include a sensor other than the visual sensor 3802, such as an inertial measurement unit (IMU) mounted on the main body 3801 or each of the legs 3810A to D, a ground sensor of the sole of each of the legs 3810A to D, or a tactile sensor of a surface of the main body 3801.
The legs 3810A to D as a moving means are connected to the main body 3801 via joints 3811A to D corresponding to hip joints, respectively. Each of the legs 3810A to D includes joints 3812A to D that connect the thigh link and the lower leg link. Each of the joints 3811A to D and the joints 3812A to D has at least a degree of freedom around the pitch axis. The joints 3811A to D and the joints 3812A to D are active joints, and each includes a motor for driving the joint, an encoder for detecting the position of the motor, a speed reducer, and a torque sensor for detecting torque on the output shaft side of the motor (none of them are illustrated). Therefore, the robot device 3800 is a four-legged robot in which a legged gait (walking) can be selected.
The visual sensor 3902 is a sensor that visually recognizes the environment around the robot device 3900, and includes, for example, at least one of a camera (including a stereo camera), an infrared camera, a TOF sensor, a LiDAR, and the like, and outputs sensor data including, for example, a three-dimensional point cloud. The visual sensor 3902 is attached to the main body 3901 via a joint 3903 for moving the line-of-sight direction of the visual sensor 3902 up, down, left, and right.
The right leg 3910R and the left leg 3910L as moving means are connected to the lower end of the main body 3901 via joints 3911R and 3911L corresponding to the hip joints, respectively. Right leg 3910R and left leg 3910L respectively include joints 3912R and 3912L corresponding to a knee joint connecting the thigh link and the lower leg link, and ground contact portions (or foot portions) 3913R and 3913L at the distal end of the lower leg link.
The right arm 3920R and the left arm 3920L are connected to the vicinity of the upper end of the main body 3901 via joints 3921R and 3921L corresponding to shoulder joints, respectively. The right arm 3920R and the left arm 3920L respectively include joints 3922R and 3922L corresponding to elbow joints connecting the upper arm link and the forearm link, and hands (or grips) 3923R and 3923L at the tip of the forearm link.
The joints 3911R and 3911L, the joints 3912R and 3912L, the joints 3921R and 3921L, and the joints 3922R and 3922L include a motor for driving a joint, an encoder for detecting a position of the motor, a speed reducer, and a torque sensor for detecting torque on an output shaft side of the motor (none of them are illustrated).
The control system 4000 operates under the overall control of a CPU 4001. In the illustrated example, the CPU 4001 has a multi-core configuration including a processor core 4001A and a processor core 4001B. The CPU 4001 is interconnected with each component in the control system 4000 via a bus 4010.
A storage device 4020 includes, for example, a large-capacity external storage device such as a hard disk drive (HDD) or a solid state drive (SSD), and stores files such as a program executed by the CPU 4001 and data generated by using or executing the program during execution. The CPU 4001 executes, for example, a device driver that drives a motor of each joint of the robot device 3800, a map construction program that constructs an environmental map on the basis of three-dimensional point data acquired from the visual sensor 3802, a route planning program that plans a route of the robot device 3800 on the basis of the environmental map, and the like.
A memory 4021 includes a read only memory (ROM) and a random access memory (RAM). The ROM stores, for example, a startup program and a basic input/output program of the control system 4000. The RAM is used for loading a program to be executed by the CPU 4001 and for temporarily storing data used during execution of the program.
A display unit 4022 includes, for example, a liquid crystal display or an organic electroluminescence (EL) display. The display unit 4022 displays data and execution results during execution of a program by the CPU 4001. For example, sensor data including a three-dimensional point cloud acquired from a sensor, an environmental map constructed from the three-dimensional point cloud, information regarding a movement route of the robot device 3800 planned on the basis of the environmental map, the search situation of the planned route, and the like are displayed on the display unit 4022.
A sensor input unit 4030 performs signal processing for taking sensor signals from various sensors mounted on the robot device 3800, such as the visual sensor 3802, into the control system 4000. A motor input/output unit 4040 performs input/output processing of a signal with each motor, such as outputting a command signal to the motor of each joint of the robot device 3800 and inputting a sensor signal of an encoder for detecting the position of the motor or a torque sensor on a side of an output shaft of the motor. A network input/output unit 4050 performs input/output processing between the control system 4000 and the cloud.
First, the CPU 4001 acquires sensor data including a three-dimensional point cloud from the sensor via the sensor input unit 4030 (step S4101).
Next, the CPU 4001 executes the environmental map construction program, and constructs an environmental map (step S4106) through clustering of a three-dimensional point cloud (step S4102), construction of a three-dimensional object (step S4103), two-dimensional projection (step S4104), and hidden surface removal processing (step S4105).
Next, the CPU 4001 plans a movement route of the robot device 3800 on the basis of the constructed environmental map (step S4107).
Then, the CPU 4001 controls driving of the motor of the active joint of each movable leg so that the robot device 3800 moves according to the planned route (step S4108).
The present disclosure has been described in detail with reference to a specific embodiment. However, it is obvious that those skilled in the art can make modifications and substitutions of the embodiment without departing from the scope of the present disclosure.
In the present specification, an embodiment has been mainly described in which the amount of data and the amount of computation when the environmental map is created by applying the present disclosure to a small robot having limited computation resources are reduced, but the gist of the present disclosure is not limited thereto. Of course, the present disclosure can also be similarly applied to a case where an environmental map is created on a high-performance computing system equipped with a high-performance processor and a large-capacity memory system or on a cloud.
Furthermore, in the present specification, an embodiment of creating a (2.5 dimensional) environmental map including height information mainly from three-dimensional sensor data has been mainly described, but the present disclosure can be generally applied to a case of creating a low-dimensional environmental map from high-dimensional sensor data.
The method of creating an environmental map according to the present disclosure can be applied to various types of autonomous mobile devices such as small robots, automatic conveyance vehicles, and unmanned aerial vehicles such as drones.
In short, the present disclosure has been described in an illustrative manner, and the contents disclosed in the present specification should not be interpreted in a limited manner. To determine the subject matter of the present disclosure, the claims should be taken into consideration.
Note that the present disclosure may also have the following configurations.
(1) An information processing apparatus including: an object construction unit that clusters sensor data including a high-dimensional point cloud and constructs a high-dimensional object for each cluster; a projection unit that subjects the high-dimensional object to low-dimensional projection; and a map construction unit that constructs a map on a basis of an object that has been subjected to the low-dimensional projection.
(1-1) The information processing apparatus according to (1) described above, further including a data acquisition unit that acquires sensor data from a sensor.
(1-2) The information processing apparatus according to (1-1) described above, in which the sensor is any one of a depth sensor, a distance measurement sensor, a LiDAR, and a stereo sensor.
(2) The information processing apparatus according to (1) described above, in which the object construction unit constructs a high-dimensional object having geometric information extracted from a set of sensor data included in a cluster.
(2-1) The information processing apparatus according to (2) described above, in which the object construction unit constructs a high-dimensional object having a size, a shape, and a posture based on the geometric information.
(2-2) The information processing apparatus according to (2) described above, in which the object construction unit constructs an ellipsoid object based on a mean μ and a covariance matrix Σ obtained by performing principal component analysis on a set of sensor data included in the cluster.
(3) The information processing apparatus according to any one of (1) and (2) described above, in which the projection unit orthogonally projects the high-dimensional object onto a reference plane parallel to a ground to generate a low-dimensional object, and the map construction unit constructs a low-dimensional map on a basis of the low-dimensional object.
(3-1) The information processing apparatus according to (3) described above, in which
(4) The information processing apparatus according to any one of (1) to (3) described above, further including a hidden surface removal unit that performs hidden surface removal processing on the object that has been subjected to the low-dimensional projection by the projection unit.
(5) The information processing apparatus according to (4) described above, in which the hidden surface removal unit performs the hidden surface removal processing using a Z buffer that stores height information or an index of the height information.
(6) The information processing apparatus according to (5) described above, in which the hidden surface removal unit limits a region to be processed on the Z buffer by a bounding box obtained for a high-dimensional object.
(7) The information processing apparatus according to any one of (4) to (6) described above, in which the hidden surface removal unit performs the hidden surface removal processing in an order in which high-dimensional objects are sorted in a height direction.
(8) The information processing apparatus according to (5) or (6) described above, in which the hidden surface removal unit uses an incremental method for height calculation in the Z buffer.
(9) The information processing apparatus according to (5) or (6) described above, in which the hidden surface removal unit divides the Z buffer into tiles and executes height calculation for each tile on the same memory.
(10) The information processing apparatus according to (5) or (6) described above, in which the hidden surface removal unit divides the Z buffer into tiles and simultaneously executes height calculations of a plurality of the tiles.
(11) The information processing apparatus according to any one of (4) to (10) described above, further including a filter processing unit that performs processing of filling or processing of smoothing a gap existing between low-dimensional objects after the hidden surface removal processing.
(12) The information processing apparatus according to any one of (1) to (11) described above, in which the object construction unit clusters sensor data including a three-dimensional point cloud, and constructs an ellipsoid object based on a mean μ and a covariance matrix Σ of the three-dimensional point cloud included in a cluster.
(13) The information processing apparatus according to (12) described above, in which the projection unit orthogonally projects the ellipsoid object onto a reference plane parallel to a ground to generate a two-dimensional object including an ellipsoid flat plate.
(14) The information processing apparatus according to (13) described above, in which
(14-1) The information processing apparatus according to (14) described above, in which the map construction unit color-codes and expresses the two-dimensional object on the basis of a size of the normal component of the ellipsoid object before projection.
(15) The information processing apparatus according to any one of (1) to (14) described above, in which the projection unit holds information necessary for height estimation among high-dimensional information of an object when subjecting the high-dimensional object to low-dimensional projection, and the map construction unit constructs an environmental map to which a height estimated on a basis of the held information is assigned.
(16) The information processing apparatus according to (15) described above, in which the map construction unit estimates a height of each point on a low-dimensional projection image on the basis of an expected value at a corresponding point of the high-dimensional object before projection.
(17) The information processing apparatus according to (15) or (16) described above, in which the map construction unit subjects variance in a thickness direction at corresponding points of the high-dimensional object before projection to low-dimensional projection to calculate variance of each point on a low-dimensional projection image, and constructs the environmental map to which information on reliability based on the variance is further added.
(18) The information processing apparatus according to (1) described above, in which
(19) An information processing method including: an object construction step of clustering sensor data including a high-dimensional point cloud and constructing a high-dimensional object on a basis of geometric information extracted from a cluster; a projection step of subjecting the high-dimensional object to low-dimensional projection; and a map construction step of constructing a map on a basis of an object that has been subjected to the low-dimensional projection.
(20) A mobile device including: a main body; a moving unit that moves the main body; a sensor mounted on the main body; a map construction unit that constructs an environmental map on a basis of sensor data from the sensor; a route planning unit that plans a route on a basis of the environmental map; and a control unit that controls an operation of the moving unit on a basis of the planned route, in which the map construction unit clusters the sensor data including a high-dimensional point cloud, and constructs a map on a basis of an object obtained by reducing a dimension of a high-dimensional object constructed on a basis of geometric information extracted from a cluster.
Number | Date | Country | Kind
--- | --- | --- | ---
2021-162687 | Oct 2021 | JP | national
Filing Document | Filing Date | Country | Kind
--- | --- | --- | ---
PCT/JP2022/030196 | 8/5/2022 | WO |