The present disclosure relates to a method and device to estimate the dimensions of an object represented by a point cloud, such as a 3-dimensional point cloud, and can find application in numerous spheres such as computer vision for example and is applicable to several kinds of object detection.
There are several methods for estimating the size of objects from a representation in the form of a point cloud. For example, methods derived from the field of data analysis and/or artificial intelligence propose constructing models of functions f, g and h, namely H=f(h), W=g(w) and D=h(d), to estimate the width W, height H and depth D of objects in metres from these same magnitudes w, h and d in pixels. These can be linear models or neural networks. Other approaches use deep neural networks to determine the pixels belonging to the contour of an object and to obtain the dimensions of the object by calculating the Euclidean distance between the points of the contour. However, such approaches are complex and require a database of labelled data to determine the parameters of the different models. In addition, the execution of neural networks requires specific architectures (GPU or TPU) to perform real-time inference, as well as precise clipping of the object whose dimensions are to be estimated.
There is therefore a need to propose simple solutions for estimating the dimensions of an object.
For this purpose, the disclosed technology proposes a method of estimating at least one dimension of an object represented by a point cloud relating to a scene comprising said object, wherein said estimation takes into account a distance, of at least one of said points of said point cloud, from a centre of said point cloud in said at least one direction.
In at least one embodiment, the method comprises:
In at least one embodiment, the method comprises:
In at least one embodiment, sequencing of the points is performed in increasing order of said distances obtained, and said estimation of said dimension of said object, in said at least one direction, takes into account the shortest distance obtained among the distances obtained for said two points.
In at least one embodiment, the sequencing of the points is performed in decreasing order of said distances obtained, and said estimation of said dimension of said object, in said at least one direction, takes into account the longest distance obtained among the distances obtained for said two points.
In at least one embodiment, the coordinate of said centre of the point cloud in one dimension is obtained as being the median of the coordinates of the points of the point cloud in said dimension.
In at least one embodiment, when sequencing of the points is performed in increasing order of said distances obtained, the estimated dimension is twice the distance, from the centre of the cloud, of the point in said sequence preceding the first point having a distance greater than a first value.
In at least one embodiment, when sequencing of the points is performed in decreasing order of said distances obtained, the estimated dimension is twice the distance, from the centre of the cloud, of the point in said sequence following after the first point having a distance greater than a first value.
In at least one embodiment, said method comprises normalization of said distances obtained.
In at least one embodiment, the estimation comprises:
In at least one embodiment:
In at least one embodiment:
In at least one embodiment, the method comprises:
In at least one embodiment, said at least two angles of rotation are obtained by incrementing the angle values by a constant pitch.
In at least one embodiment, said angles of rotation are sampled following the method of Nelder-Mead.
The characteristics given alone in the present application, in connection with some embodiments of the method of the present application, can be combined together in other embodiments of the present method.
The disclosed technology also relates to a recording medium readable by a computer on which there is recorded a computer programme comprising instructions for executing the steps of the method of estimating at least one dimension of an object represented by a point cloud relating to a scene comprising said object, wherein said estimation takes into account a distance, of at least one of said points of said point cloud, from a centre of said point cloud in said at least one direction.
The disclosed technology also relates to a device comprising one or more processors configured together or separately to execute the steps of the method of the disclosed technology, according to any of the embodiments thereof. Therefore, the disclosed technology concerns a device for estimating at least one dimension of an object represented by a point cloud relating to a scene comprising said object, the device comprising one or more processors configured together or separately to:
Other characteristics and advantages of the disclosed technology will become apparent from the description given below with reference to the appended drawings illustrating examples of embodiment which do not in any respect limit the disclosed technology.
The present disclosure concerns the estimation of one or more dimensions of at least one object in a scene, such as a scene captured by a digital image, this object being represented by a point cloud. A scene can represent an indoor or outdoor environment and may comprise one or more inert objects, animals, persons, a background, and so on. By object, in the disclosed technology, it can be meant an everyday object or part of said object, but also a living being such as a person or animal, a plant, or any other element captured in an image. By scene, it can be meant a scene representing a virtual image or real image.
In the example in
These capture means can be associated with processing means 10. The processing means 10 and capture means 20 can be included in one same single device 100 or can belong to different devices coupled together. The processing means particularly comprise means allowing a point cloud to be obtained of at least one portion of the scene in the image captured by the capture means. The processing means in this respect may comprise stereo vision, Radar vision, Lidar vision devices and/or devices of ToF type.
As illustrated in
The ROM memory 3 forms a recording medium conforming to at least one embodiment of the present disclosure, readable by the processor 1 and on which a computer programme PROG is recorded conforming to at least one embodiment of the present disclosure, comprising instructions to execute steps of the method of estimating at least one dimension of at least one object according to at least one embodiment of the present disclosure. The programme PROG defines functional modules of the device.
The user interface 30 can enable a user to interact with the system, for example in some embodiments with the capture means and/or processing means. This interaction with the capture and/or processing means can be optional in some embodiments. The user interface can be in several forms, in particular one or more screens, whether or not touch screens, one or more keyboards or stylus pens or tablet, mobile telephone or computer. In
The images obtained by the capture means are for example images representing scenes captured in the form of a point cloud as illustrated in
With the present disclosure, it is possible to estimate the dimension(s) of an object that has been captured by capture means, by estimating the dimension of a bounding box bounding the point cloud representing this object. In the present disclosure, it is the smallest bounding box which is sought, the dimensions of the object then being considered as the dimensions of the smallest bounding box.
An object in an image is considered, represented in the form of a point cloud, of which it is desired to estimate the dimension.
In the present example, the point cloud has three dimensions (x, y, z), but the disclosed technology can apply to a point cloud of one or more dimensions. The steps of the method described below relate to the estimation of the dimension in direction x (also called herein the X-axis); the steps are reproduced, in sequence or in parallel, for the other directions to obtain the dimensions in all directions. The steps described below are adapted for determining the smallest bounding box when the object lies along the axis of the capture means and, as explained later in the description, when rotation of the point cloud is envisaged in order to obtain the smallest bounding box.
At step E1, it is possible to determine the position of the centre of the point cloud and hence the coordinates thereof in the direction under consideration. According to embodiments, it is possible to obtain or determine the position of the centre of the point cloud (and hence the coordinates thereof) using different methods. For example, the coordinates of the centre in one direction can be determined as being the median of the coordinates of the points in this direction. Use of the median can make the method more robust in the face of measurement errors and outlier values which could be returned by the method.
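As an illustration, step E1 can be sketched as follows with NumPy (the function name and array layout are assumptions for illustration, not part of the disclosure):

```python
import numpy as np

def cloud_centre(points: np.ndarray) -> np.ndarray:
    """Per-axis centre of a point cloud, taken as the coordinate-wise
    median of an (N, 3) array of points, which makes the centre robust
    to outliers returned by the capture device."""
    return np.median(points, axis=0)
```

The coordinate of the centre in direction x is then simply `cloud_centre(points)[0]`.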
The coordinates of the points of the cloud in the direction under consideration are obtained by the capture means e.g. a Lidar. At step E2, the distances di are calculated between point Pi and the centre C in direction x.
For a point Pi, the distance di obtained is the norm of the difference between the coordinate of point Pi and the coordinate Cx of the centre in the direction under consideration, i.e. di=|xi−Cx|.
At step E3, the points Pi are sequenced (in other words, sorted or classified) as a function of their distance di from the coordinate Cx of the centre, for example in increasing order of distance di.
In other embodiments, the points Pi are sequenced as a function of their distance di from the coordinate Cx of the centre in decreasing order of distance di.
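Steps E2 and E3 can be sketched together as follows (a minimal NumPy sketch; the function name and signature are illustrative assumptions):

```python
import numpy as np

def sorted_axis_distances(coords, centre_coord, increasing=True):
    """Step E2: distances d_i = |x_i - Cx| between each point and the
    centre coordinate along one axis. Step E3: the distances returned
    in increasing (or decreasing) order."""
    d = np.abs(np.asarray(coords, dtype=float) - centre_coord)
    d.sort()
    return d if increasing else d[::-1]
```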
The curve in
When sequencing of the points is performed in increasing order of the distances obtained, the estimation of the dimension of the object, in the at least one direction, takes into account the shortest distance obtained among the distances obtained for the two points.
When sequencing of the points is performed in decreasing order of the distances obtained, the estimation of the dimension of the object, in the at least one direction, takes into account the longest distance obtained among the distances obtained for the two points.
In at least one embodiment, the dimension is estimated as being twice the distance (from the centre of the cloud) associated with the point preceding the first point having a distance greater than the first value, when the points are sorted in increasing order of distance.
In at least one embodiment, the dimension is estimated as being twice the distance associated with the point following after the first point having a distance greater than the first value, when the points are sorted in decreasing order of distance.
This first value, as illustrated in
This first value, for example, can be a threshold value defined as a function of the mean spacing between the distances of the points having an index lower than i. For example, if the difference between the distance associated with a point i and the distance associated with a point i+1 is greater than, for example, 1.5 times this mean spacing, then a break is detected.
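This mean-spacing heuristic can be sketched as follows (the factor 1.5 comes from the example above; the function name and fallback behaviour when no break is found are illustrative assumptions):

```python
import numpy as np

def break_index_mean_spacing(d_sorted, factor=1.5):
    """Return the index i of the point preceding the first 'break': a
    gap d[i+1] - d[i] larger than `factor` times the mean spacing of
    the gaps seen so far. `d_sorted` must be in increasing order.
    Falls back to the last index when no break is detected."""
    gaps = np.diff(d_sorted)
    for i in range(1, len(gaps)):
        mean_gap = np.mean(gaps[:i])
        if mean_gap > 0 and gaps[i] > factor * mean_gap:
            return i  # point i precedes the break
    return len(d_sorted) - 1
```

The estimated dimension is then twice the distance of the returned point, i.e. `2 * d_sorted[i]`.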
In at least one other embodiment, to determine this break, it is also possible to determine the differences in distance between two adjacent points, one by one, and to select the point i that has a maximum difference in distance with point i+1. The dimension of the object is obtained as being twice the distance di of said point i. Hereafter, point i is called the break point.
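The maximum-gap variant described above can be sketched as follows (NumPy sketch; the function name is an illustrative assumption):

```python
import numpy as np

def dimension_max_gap(d_sorted):
    """Break point = the point i with the largest gap d[i+1] - d[i]
    among adjacent points of the increasing sequence of distances;
    the estimated dimension is twice its distance from the centre."""
    gaps = np.diff(d_sorted)
    i = int(np.argmax(gaps))
    return 2.0 * d_sorted[i]
```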
In some embodiments, at a step E51, the curve in
i being an integer higher than or equal to 1, it is written:
d′i=(di−min(di))/(max(di)−min(di))
with min(di) and max(di) respectively being the minimum and maximum values of the distances di.
The X-axis of the normalized curve is likewise brought to the interval [0, 1], for example as x′i=(i−1)/(N−1), N being the number of points.
Within the space of this normalized representation of the distances di of the points i, the straight line of equation y=x is plotted, step E52, as illustrated in
Estimation of the distance, according to the dimension under consideration, may then comprise a step E53 to determine, for points i, a distance between the normalized representation thereof and the ordinate thereof on the curve y=x within the representational space.
The point on the normalized curve is determined that has the maximum distance between its ordinate on the normalized curve and its ordinate on the line y=x.
The dimension in the direction under consideration can then be estimated as being twice the distance between the centre and the break point, i.e. the point having the maximum distance between the ordinate thereof on the normalized curve and the ordinate thereof on the curve y=x. Here, as indicated in
If the index of the break point is denoted r, in the above example, r equals 7, then the dimension in the direction under consideration is equal to 2dr.
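Steps E51 to E53 can be sketched as follows (NumPy sketch; the function name is an illustrative assumption, and the deviation from the diagonal is measured pointwise):

```python
import numpy as np

def knee_break_index(d_sorted):
    """E51: min-max normalise the increasing distances to [0, 1] and
    normalise the point indices to [0, 1]. E52/E53: pick the point
    whose normalised distance deviates most from the diagonal y = x."""
    d = np.asarray(d_sorted, dtype=float)
    y = (d - d.min()) / (d.max() - d.min())   # normalised curve
    x = np.arange(len(d)) / (len(d) - 1)      # normalised index axis
    deviation = np.abs(y - x)                 # distance to the line y = x
    return int(np.argmax(deviation))
```

With r the returned index, the dimension in the direction under consideration is `2 * d_sorted[r]`.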
In some embodiments, this estimation can be made more robust by comparing the value M previously defined with a determined value, called the noise value, at step E54. The so-called noise value is determined as a function of a measurement noise of a capture device capturing an image of the point cloud. If M is higher than this measurement noise value (or threshold), then the dimension is equal to 2dr; otherwise the dimension is equal to 2M. This can advantageously help overcome interference noise. The threshold value can be a function of the accuracy of the device capturing the point cloud; for some capture devices, it can be in the region of a few millimetres to a few tens of millimetres. For example, if a LIDAR has an accuracy of +/−2 mm, then any difference of this magnitude does not correspond to a break but to noise. This measurement noise can be filtered out by only giving consideration to breaks that are greater than this noise.
The steps of the method can be repeated or performed in parallel for at least one other dimension of the object, for example for the two other directions y and z in the detailed examples.
It is therefore possible to estimate the dimension in three dimensions corresponding to the width, depth and height, the three estimated dimensions defining a box bounding the region of interest as illustrated in
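The full per-axis pipeline can be sketched end to end as follows (NumPy sketch combining steps E1 to E53; the function name, the degenerate-axis guard and the array layout are illustrative assumptions):

```python
import numpy as np

def bounding_box_dimensions(points: np.ndarray) -> np.ndarray:
    """For each axis: median centre (E1), sorted distances (E2/E3),
    break point via deviation from y = x (E51-E53), and dimension
    taken as twice the break-point distance."""
    centre = np.median(points, axis=0)
    dims = []
    for axis in range(points.shape[1]):
        d = np.sort(np.abs(points[:, axis] - centre[axis]))
        if d[-1] == d[0]:  # degenerate axis: all points coincide
            dims.append(0.0)
            continue
        y = (d - d[0]) / (d[-1] - d[0])          # normalised distances
        x = np.arange(len(d)) / (len(d) - 1)     # normalised indices
        r = int(np.argmax(np.abs(y - x)))        # break point index
        dims.append(2.0 * d[r])
    return np.array(dims)
```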
In some embodiments, the preceding steps can be iterated to check that the box obtained is the smallest box bounding the region of interest, as illustrated in
In
In some embodiments, rotation can be performed by incrementing the angle values by a constant or variable pitch.
In some embodiments, the angles chosen for each iteration are sampled using an optimization algorithm. For example, among the optimization algorithms that can be used, the Nelder-Mead method can be envisaged for performing rotation of the point cloud. The Nelder-Mead method allows determining the parameters that minimize a certain criterion; in our case, the parameters are, for example, the rotation angles of the point cloud, and the criterion is, for example, the volume of the bounding box.
This can help towards limiting the number of iterations necessary to determine the optimal angles (i.e. those minimizing the size of the bounding box) and hence the dimensions of the bounding box corresponding to the object, thereby obtaining the dimensions (and hence the size) of the point cloud of the object.
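This rotation search can be sketched as follows, assuming SciPy's Nelder-Mead implementation is available (the function names are illustrative, and the box extents are taken here as max − min for simplicity rather than the break-point estimate):

```python
import numpy as np
from scipy.optimize import minimize

def rotation_matrix(angles: np.ndarray) -> np.ndarray:
    """Composition of rotations about the z, y and x axes (radians)."""
    a, b, c = angles
    rz = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
    ry = np.array([[ np.cos(b), 0.0, np.sin(b)],
                   [ 0.0,       1.0, 0.0],
                   [-np.sin(b), 0.0, np.cos(b)]])
    rx = np.array([[1.0, 0.0,        0.0],
                   [0.0, np.cos(c), -np.sin(c)],
                   [0.0, np.sin(c),  np.cos(c)]])
    return rz @ ry @ rx

def smallest_box_volume(points: np.ndarray) -> float:
    """Search with Nelder-Mead for the rotation angles minimising the
    volume of the axis-aligned box bounding the rotated cloud."""
    def volume(angles):
        rotated = points @ rotation_matrix(angles).T
        return float(np.prod(rotated.max(axis=0) - rotated.min(axis=0)))
    result = minimize(volume, x0=np.zeros(3), method="Nelder-Mead")
    return result.fun
```

For a unit cube tilted about the z axis, the optimiser should recover a bounding-box volume close to 1, whereas the axis-aligned box of the tilted cloud is larger.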
One application of this disclosure can particularly find advantage in the manufacturing industry, for example for verification of the conformity of a part produced on a production line. By allowing estimation of some dimensions of a part (e.g. a part detected more or less automatically, for example with the method described in the previously cited patent application by the Applicant), the present disclosure can help towards verifying the shape and/or size of a product and/or whether it conforms to an expected result or to specifications, this being at least partially automatic (e.g. without human intervention). The obtaining of better knowledge of the size (and optionally location) of objects handled by robots can also help towards making robot movements more stable and more precise.
Other applications can concern the logistics sector. The embodiments described in this disclosure can be used to locate goods in warehouses. Knowledge of the size of objects can help toward limiting the storage space required for storing goods, and toward management of flows of goods in warehouses, by optimizing warehousing thereof e.g. in containers.
Other applications can concern the automated driving of vehicles. Estimating the dimensions of obstacles on the roadway (other vehicles, objects, roadworks, and so on) can help toward choosing the autopilot response to be given to their presence (deviation from trajectory, emergency stop of the vehicle, etc.).
Other applications can concern environmental mapping to allow the navigation of robots, drones and automated vehicles in the presence of obstacles.
Number | Date | Country | Kind |
---|---|---|---|
2304867 | May 2023 | FR | national |