The invention pertains to a method and a system for perceiving and estimating the position of physical bodies carrying out, in a manner which is efficient in terms of computational power and energy consumption, a multi-sensor fusion.
By “physical body” is meant any physical object or substance that exhibits individuality and can be detected and identified by an appropriate sensor. Thus, inanimate objects, be they natural or artificial, plants, animals, human beings, but also liquid or solid particles in suspension in the air, such as clouds, or indeed liquid or gaseous masses, are considered to be physical bodies.
The invention applies especially to the field of the navigation of robots, drones, autonomous vehicles, etc. and more generally to that of perception.
With the explosion of computation means that can be integrated into a robot, robotics applications have multiplied in recent years, from industrial production to home automation, from space and underwater exploration to mass-market toy drones. The tasks carried out in robotic applications have become progressively more complex, ever more often requiring robots to be able to move around in unknown environments; this has made it ever more important to develop means and techniques of perception, that is to say that allow the discovery and interpretation of surrounding space. An important application which uses perception in robotics is navigation, which consists in fixing a destination objective for a robot, and leaving it to arrive thereat while taking care to avoid unknown and potentially mobile obstacles; the robot is then responsible for planning its trajectory itself. A typical example, forming the subject of intense research, is the autonomous car.
To allow knowledge of the whole environment while limiting to the maximum the dead angles, and to alleviate any possible defect of a sensor, one generally has recourse to the integration of multiple sensors. When several sensors, possibly of different types, cover the same space, it is necessary to be able to combine the information extracted from each of them: one then speaks of multi-sensor fusion.
There exist two main families of perception techniques: geometric procedures, which are aimed at identifying the geometry of the objects of the surrounding space, and occupancy grid based procedures, which are aimed at determining whether a certain location is occupied by an obstacle (more generally, by a physical body). The invention pertains to occupancy grid based techniques.
The theoretical foundations of multi-sensor perception and fusion procedures based on occupancy grids are described in the article by A. Elfes, “Occupancy grids: a stochastic spatial representation for active robot perception” (Sixth Conference on Uncertainty in AI, 1990). This publication is not concerned with the practical implementation of the procedures, direct application of which would require complex floating-point computations.
The article by K. Konolige “Improved occupancy grids for map building” (Autonomous Robots, 4, 351-367, 1997), that by J. Adarve et al. “Computing occupancy grids from multiple sensors using linear opinion pools” (Proceedings of the IEEE International Conference on Robotics and Automation, 2012) and that by T. Rakotovao et al. “Real-time power-efficient integration of multi-sensor occupancy grid on many core” (2015 International Workshop on Advanced Robotics and its Social Impact, Jun. 30, 2015) describe enhancements of the techniques based on occupancy grids. Here again, the implementation of these techniques requires massive recourse to floating-point computation.
Documents US 2014/035775, FR 2006/050860 and DE 102009007395 describe multi-sensor perception and fusion procedures and systems based on occupancy grids, applied to the autonomous driving of terrestrial vehicles. All these procedures require, for their implementation, floating-point computations.
However, recourse to floating-point computation demands considerable resources in terms of computational power, which are hardly compatible with the constraints specific to embedded systems. For the record, the floating-point format—defined by IEEE standard 754—represents a number by means of three elements: a sign (1 bit), a mantissa (23 or 52 bits) and an exponent (8 or 11 bits). Performing computations using numbers represented in floating point is much more complex (that is to say requires many more elementary operations) than performing computations on integers. This therefore requires the use of a faster processor and/or of a dedicated hardware accelerator, with an unfavorable impact in terms of cost, bulkiness and electrical consumption.
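The three-field layout mentioned above can be made concrete with a short sketch. It decomposes a number encoded in the 32-bit format (1 sign bit, 8 exponent bits, 23 mantissa bits) using only the Python standard library; the function name `decompose_float32` is an illustrative choice, not something taken from the present description.

```python
import struct

def decompose_float32(x: float) -> tuple[int, int, int]:
    """Split a number, encoded as IEEE 754 binary32, into its three bit fields."""
    # Pack as a big-endian 32-bit float, then reinterpret the same 4 bytes as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF      # 23 bits (the leading 1 is implicit, not stored)
    return sign, exponent, mantissa

print(decompose_float32(1.0))       # (0, 127, 0)
print(decompose_float32(-0.15625))  # (1, 124, 2097152)
```

Even a simple addition of two such numbers involves aligning exponents, adding mantissas and renormalizing, which is why integer-only processing is attractive on small embedded devices.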
Consider, by way of example, an environment of 20 m×50 m discretized into 100 000 cells of 10 cm×10 cm and observed by two sensors operating at 25 Hz. If data fusion must be performed, in accordance with the prior art, by implementing the theory of A. Elfes by means of floating-point computations, the computational power required is of the order of 5-50 GFlops (billions of floating-point operations per second). If we consider a grid of 150 000 cells with 8 sensors operating at 50 Hz (assumptions which seem more realistic for applications in the automotive field), a need for 60-600 GFlops of computational power is obtained.
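The scaling between the two scenarios can be checked with a few lines of integer arithmetic. The sketch below only counts cell updates per second; the cost in floating-point operations per update is not stated in the text, so the figure derived at the end (about 1 000 to 10 000 operations per update) is merely what the quoted orders of magnitude imply.

```python
def cell_updates_per_second(n_cells: int, n_sensors: int, rate_hz: int) -> int:
    """Number of (cell, measurement) fusions to perform each second."""
    return n_cells * n_sensors * rate_hz

# 20 m x 50 m at 10 cm resolution, computed in centimetres to stay in integers.
n_cells_small = (2000 // 10) * (5000 // 10)

small = cell_updates_per_second(n_cells_small, 2, 25)
large = cell_updates_per_second(150_000, 8, 50)

print(n_cells_small)   # 100000
print(small, large)    # 5000000 60000000
print(large // small)  # 12, the same factor as between 5-50 and 60-600 GFlops

# Implied cost per update if 5 GFlops corresponds to the first scenario.
print(5_000_000_000 // small)  # 1000
```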
The invention is aimed at providing a procedure for perceiving physical bodies by multi-sensor fusion requiring fewer computational resources and, therefore, better adapted to embedded solutions. More particularly it is aimed:
on the one hand, at allowing the use of simple embedded computation devices, not necessarily supporting floating-point operations;
on the other hand, even in the case where floating-point computations would be supported, at reducing the energy consumption of the computation device, by avoiding or greatly limiting the actual recourse to such computations. Such a reduction in consumption is advantageous by itself; moreover, it makes it possible to reduce heating, thereby improving the robustness and the lifetime of the computation device, while relaxing the constraints in terms of thermal dissipation.
By way of example, the inventors have been able to note that the invention makes it possible to carry out the fusion on a digital system offering computational performance equivalent to that available in the ISO 26262 ASIL-D certified platforms used in the automotive sector. This is not possible using the fusion techniques known from the prior art.
Hereinafter, the application to the perception of obstacles will be specifically considered, but the invention is in no way limited to this typical case. Other possible applications relate to the detection of clouds or precipitations, or indeed of concentrations of pollutants, or else the distribution and the movements of people in a public space. In all these cases, it is possible to detect physical bodies with the aid of sensors, to measure the distance thereof and to use the measurements to compute an occupancy grid.
A subject of the invention making it possible to achieve this aim is a method for perceiving physical bodies comprising the following steps, implemented by a computer or a dedicated digital electronic circuit:
a) Acquisition of a plurality of distance measurements of said physical bodies, arising from one or more sensors;
b) Application, to each said distance measurement, of an inverse model of the corresponding sensor on an occupancy grid providing a discretized spatial representation of an environment of said sensor, so as to determine a probability of occupancy by a physical body of a set of cells of said occupancy grid; and
c) Construction of a consolidated occupancy grid, each cell of which exhibits an occupancy probability computed by fusing the occupancy probabilities estimated during step b);
characterized in that each said inverse sensor model is a discrete model, associating with each cell of the corresponding occupancy grid, and for each distance measurement, a probability class chosen inside one and the same set of finite cardinality, each said probability class being identified by an integer index; and in that, during said step c), the probability of occupancy of each cell of the consolidated occupancy grid is determined by means of integer computations performed on the indices of the probability classes determined during said step b).
Another subject of the invention is a system for perceiving physical bodies comprising:
at least one input port for receiving a plurality of signals representative of distance measurements of said physical bodies, arising from one or more sensors;
a data processing module configured to receive as input said signals and to use them to construct a consolidated occupancy grid by applying a method such as defined hereinabove; and
at least one output port for a signal representative of said consolidated occupancy grid.
Other characteristics, details and advantages of the invention will emerge on reading the description given with reference to the appended drawings, which drawings are given by way of example and illustrate respectively:
In the detailed description which follows, reference will be made to the case of the perception of obstacles. However, everything that is described applies more generally to the perception of physical bodies of any sort.
Usually, sensors employed for navigation provide information about the distance of surrounding obstacles; one then speaks of distance sensors. To account for the precision of a sensor, for its possible errors or for its resolution, a probabilistic model is introduced. The idea is that a measurement output by the sensor does not necessarily indicate the exact distance between the obstacle and the sensor, and that consequently it is appropriate to reason about the probability that the obstacle is at a given distance knowing the response of the sensor.
If d denotes the real distance between an obstacle and the sensor, and z the output of the sensor, one is concerned with the conditional probability density function p(z|d) which models the relationship between the real position of an obstacle and its estimation seen by the sensor (“direct model”).
Hereinafter, Ω will denote a spatial reference frame with one, two or three dimensions; an occupancy grid GO is a partition of a continuous and bounded subset of Ω into a number N of parts, called cells and designated by an index i ∈ [0, N−1]. The cell of index i is denoted ci. Without loss of generality, we shall consider hereinafter a one-dimensional occupancy grid observed by a single distance sensor C (or a plurality of co-located sensors), the index i increasing with distance from the sensor (c0 therefore being the cell closest to the sensor and cN−1 the one furthest away). This configuration is illustrated by
An obstacle A is a bounded continuous subset of Ω. A cell ci is said to be occupied by an obstacle A if A∩ci≠Ø, to be not occupied by A if A∩ci=Ø. Stated otherwise, if the obstacle covers the cell even partially, the latter is considered to be occupied. Other conventions are possible, but in any event a cell must be either free, or occupied.
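The occupancy convention above (a cell is occupied as soon as an obstacle overlaps it, even partially) can be illustrated on a one-dimensional grid. In this sketch, which is only an illustration, cells are half-open intervals of equal width, obstacles are closed intervals, and all lengths are integers in an arbitrary unit; these modeling choices and the function name are assumptions.

```python
def occupancy_states(obstacles, n_cells, cell_width):
    """Return one boolean per cell: True if some obstacle intersects the cell.

    obstacles  -- list of (start, end) closed intervals, start <= end
    n_cells    -- number N of cells c_0 .. c_{N-1}
    cell_width -- width of each cell, same unit as the obstacle bounds
    """
    states = []
    for i in range(n_cells):
        lo, hi = i * cell_width, (i + 1) * cell_width  # cell c_i = [lo, hi)
        # Non-empty intersection of [a, b] with [lo, hi): a < hi and b >= lo.
        occupied = any(a < hi and b >= lo for a, b in obstacles)
        states.append(occupied)
    return states

# An obstacle spanning 25..31 touches cells c_2 = [20, 30) and c_3 = [30, 40).
print(occupancy_states([(25, 31)], n_cells=5, cell_width=10))
# [False, False, True, True, False]
```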
For each of the cells of the grid, we consider the binary random experiment “state” which can have one of the two outcomes {occupied; vacant} consisting in knowing whether or not the cell contains an obstacle. The state of the cell ci will be denoted ei, oi will denote the realization ei=occupied and vi will denote the realization ei=vacant. In a grid, it is considered that all the cells are independent, so that
∀ i, j ∈ [0, N−1], i ≠ j: P(oi ∧ oj) = P(oi)·P(oj) (1)
where ∧ is the logical operator “and” and P(·) denotes the probability of an event (not to be confused with a probability density, designated by a lower case “p”).
It is also considered that the position of the obstacles can only be known with the aid of uncertain distance sensors, characterized by a probabilistic model such as described above, which may be written more generally as p(z|x⃗), x⃗ being the position of an obstacle (in several dimensions it is a vector, expressed in Cartesian, spherical, polar coordinates, etc., and not a simple scalar). These sensors may be laser rangefinders (also called lidars), sonars, infrared radars, time-of-flight cameras, etc.
A measurement z arising from a sensor makes it possible to determine the probability of occupancy P(oi|z) of a cell ci. For a given measurement z, the set of probabilities P(oi|z) ∀ i ∈ [0, N−1] constitutes the inverse model of the sensor on the grid. Whilst the direct model of the sensor describes the response of the sensor as a function of the material world, the inverse model expresses the impact of the measurement on the occupancy grid, which is the model of the material world that is adopted, thereby justifying the term inverse model.
In accordance with the usage which prevails in the literature,
A more apposite version of the inverse model of
It should be stressed that the notions of “occupancy” and of “obstacle distance” are not entirely equivalent. Indeed, saying that an obstacle is at a sensor distance z does not signify only that a certain cell is occupied, but also that all the other cells of the grid that are closer to the sensor are free (otherwise, the first obstacle would have been seen at a distance of less than z). In the aforementioned
If the notion of an uncertain sensor characterized by its (direct) model p(z|x⃗) is taken into account, and if di denotes the distance of the cell ci with respect to the measurement point and x⃗i denotes the point of the cell ci closest to said measurement point, we have:
p(z|x⃗i) = p(z|v0 ∧ … ∧ vi−1 ∧ oi) (2)
Equation (2) indicates that the model of the sensor evaluated at a point on the boundary of a cell of the grid (x⃗i) is equal to the probability density of the response of the sensor for a corresponding grid configuration, namely a grid where the cells closer than cell i are vacant, cell i is occupied, and the occupancy states of the cells further away than cell i are not determined. It is by utilizing this information that A. Elfes, in the aforementioned article, proposed a method for constructing the inverse model of the sensor. An explanation of this method is given hereinbelow.
Bayes' theorem makes it possible to express the inverse model of a sensor P(oi|z) in the following manner:
P(oi|z) = p(z|oi)·P(oi) / [p(z|oi)·P(oi) + p(z|vi)·P(vi)] (3)
where P(oi) and P(vi) designate the a priori probabilities (that is to say without knowing the position of the obstacles, or the output of the sensor) that the cell ci is occupied or free, respectively. Hereinafter, we will make the assumption P(oi)=P(vi)=1/2, but a generalization does not pose any fundamental theoretical difficulty.
We then obtain:
P(oi|z) = p(z|oi) / [p(z|oi) + p(z|vi)] (4)
The computation of the terms p(z|oi) and p(z|vi) can be done by using Kolmogorov's theorem over all the possible configurations of the grid. A configuration is formed by an N-tuple {e0, e1 . . . eN−1} where ei ∈ {oi, vi}; a grid of N cells has 2^N possible configurations.
For the term p(z|oi) the possible grid configurations are of the form Go
It is therefore possible to write:
The terms P(Go
whose terms can be computed directly on the basis of the direct model of the sensor. By feeding (6) back into (4) we can therefore compute, in principle, the inverse model of the sensor on the occupancy grid considered.
The main limitation of this procedure stems from the exponential explosion of the number of terms in the sum of equation (6), making it practically impossible to compute the sum. Indeed, in the prior art, analytical expressions for the inverse model are generally used, not involving expressing it directly as a function of p(z|oi). This implies the loss of the relationship with the direct model, which is the only one to be directly accessible by experiment, and therefore of the capacity to measure the error introduced by the modeling.
A first contribution of the present invention is a simplified procedure for computing the inverse model (linear in N instead of exponential) without introducing any approximation with respect to the use of equations (6) and (4). Hereinafter we shall consider, with the aim of simplifying the disclosure but without loss of generality, the case of a one-dimensional occupancy grid, where the position of an obstacle is represented by a scalar variable x and for which (6) can be written:
As we are considering a one-dimensional grid with N cells, xk can take only the position value of one of the cells cj with j ∈ [0; N−1]. Thus, the terms of the sum (7) can take only N different values. By factorizing all the grids giving the same value of p(z|xk), it is therefore possible to reduce equation (7) to a sum of N terms only, instead of 2^(N−1). The complexity of the computation goes from exponential to linear.
We begin by considering the case ei=oi. We fix ci in its occupied state and we fix the distance of the first obstacle as being that of the cell ck, so that this distance is actually xk. We therefore seek the number of grids which are such that the first obstacle is seen at the position xk knowing that the cell ci is occupied. Three typical cases can be envisaged:
Equation (7) in the case ei=oi may therefore be written:
It is possible to repeat the same reasoning for the case ei=vi. In this case also it is possible to distinguish three possibilities:
Equation (7) in the case ei=vi may therefore be written:
By feeding expressions (8) and (9) into (4), one thus succeeds in constructing the one-dimensional inverse sensor model on the occupancy grid on the basis of its direct model with linear complexity with respect to the size N of the grid:
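The linear-time construction can be sketched as follows and checked against a brute-force enumeration of all grids. Several points are assumptions made for the sake of a runnable example: the Gaussian direct model, the function names, and the convention that the all-vacant grid contributes no term. The weights 1/2^(k+1), 1/2^i and 1/2^k come from counting, among the 2^(N−1) equally likely configurations of the other cells, those whose first occupied cell is ck.

```python
import math
from itertools import product

def gaussian_direct_model(z, x, sigma=0.2):
    """Hypothetical direct model p(z|x): Gaussian noise around the true distance x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def inverse_model_linear(z, cell_positions, direct=gaussian_direct_model):
    """P(o_i|z) for every cell, in O(N), assuming P(o_i) = P(v_i) = 1/2."""
    n = len(cell_positions)
    d = [direct(z, x) for x in cell_positions]
    # suffix[i] = sum over k >= i of d[k] / 2**k (first obstacle beyond a vacant cell).
    suffix = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        suffix[k] = suffix[k + 1] + math.ldexp(d[k], -k)
    probs = []
    prefix = 0.0  # sum over k < i of d[k] / 2**(k + 1) (first obstacle before cell i)
    for i in range(n):
        p_z_occ = prefix + math.ldexp(d[i], -i)  # p(z|o_i)
        p_z_vac = prefix + suffix[i + 1]         # p(z|v_i)
        total = p_z_occ + p_z_vac
        probs.append(p_z_occ / total if total > 0.0 else 0.5)
        prefix += math.ldexp(d[i], -(i + 1))
    return probs

def inverse_model_bruteforce(z, cell_positions, direct=gaussian_direct_model):
    """Same quantity by enumerating every grid; exponential cost, for checking only."""
    n = len(cell_positions)
    d = [direct(z, x) for x in cell_positions]
    weight = 1.0 / 2 ** (n - 1)  # each configuration of the other cells is equally likely
    probs = []
    for i in range(n):
        likelihood = {True: 0.0, False: 0.0}
        for others in product((False, True), repeat=n - 1):
            for occupied in (True, False):
                grid = others[:i] + (occupied,) + others[i:]
                if True in grid:  # the all-vacant grid contributes nothing here
                    likelihood[occupied] += d[grid.index(True)] * weight
        total = likelihood[True] + likelihood[False]
        probs.append(likelihood[True] / total if total > 0.0 else 0.5)
    return probs

cells = [0.1 * i for i in range(8)]
fast = inverse_model_linear(0.35, cells)
slow = inverse_model_bruteforce(0.35, cells)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # agreement up to rounding error
```

The linear version touches each cell a constant number of times, whereas the enumeration visits 2^(N−1) grids per cell and becomes unusable beyond a few tens of cells.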
Such simplifications are also possible in higher dimensions, by using polar or spherical coordinates in particular.
The construction of the inverse model depends greatly on the definition of the grid. It is therefore interesting to study the impact of a variation of spatial resolution on the inverse model.
It is interesting to note that the relationship existing between the precision of the sensor and the resolution of the grid is apparent only if the inverse model is computed on the basis of the direct model (equations 7-10). This relationship is lost if one is content with an approximate analytical expression for the inverse model, as in the prior art; this constitutes an additional advantage of the approach proposed by the invention.
On the basis of the inverse models of two sensors on one and the same occupancy grid, the fusion of the data of the two sensors is performed with the aid of the following equation:
P(oi|z1 ∧ z2) = P(oi|z1)·P(oi|z2) / [P(oi|z1)·P(oi|z2) + (1 − P(oi|z1))·(1 − P(oi|z2))] (11)
where z1 and z2 are the measurements provided by the two sensors. The generalization to more than two sensors is immediate: it suffices to consider P(oi|z1 ∧ z2) as the inverse model of a “virtual” sensor and to fuse it with the measurement provided by a third sensor, and so on. Equation (11) is valid only if P(oi)=P(vi)=1/2, but its generalization to other assumptions is trivial.
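For reference, the floating-point fusion that the invention seeks to avoid can be sketched as below, assuming equal priors P(oi)=P(vi)=1/2 as in the text. The function names are illustrative; the fold over several sensors follows the “virtual sensor” argument.

```python
from functools import reduce

def fuse(p1: float, p2: float) -> float:
    """Bayesian fusion of two occupancy probabilities of the same cell (equal priors).

    Undefined (division by zero) when one input is 0 and the other is 1.
    """
    num = p1 * p2
    return num / (num + (1.0 - p1) * (1.0 - p2))

def fuse_all(probabilities) -> float:
    """Fuse any number of sensors by treating each partial result as a virtual sensor."""
    return reduce(fuse, probabilities, 0.5)  # 0.5 carries no information

print(fuse(0.8, 0.8))             # two agreeing sensors reinforce each other (> 0.8)
print(fuse(0.8, 0.2))             # two opposite opinions cancel out (0.5 up to rounding)
print(fuse_all([0.6, 0.7, 0.9]))  # three sensors
```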
This floating-point computation must be performed for each cell of the grid and at a rate at least as high as the acquisition frequency of the sensors, thus requiring considerable computational power.
A second contribution of the invention consists of a method for the Bayesian fusion of the data arising from multiple sensors in an occupancy grid without floating-point computation, thereby making it possible to considerably reduce the complexity of the computations required for the construction of occupancy grids and thus to address more varied application fields, especially those for which the embedded computational capacity is very limited.
This aspect of the invention rests upon representing the probability on the interval [0; 1] in a discretized manner, by way of probability classes identified by integer indices.
In what follows, a countable subset of [0; 1], whose elements pn can therefore be identified by a signed integer index “n”, will be called a “system of probability classes” S = {pn, n ∈ ℤ}. If the data fusion function expressed by equation (11) hereinabove is called “F”, we can write:
F(p1, p2) = p1·p2 / [p1·p2 + (1 − p1)·(1 − p2)] (12)
A particularly interesting case is that of a system of classes such that the result of the fusion of two probability classes of the system also belongs to the system; formally: ∀ pi, pj ∈ S, F(pi, pj) ∈ S. One then speaks of an “error-free” system of classes, since the fusion does not introduce any error or approximation. It is therefore possible to label the probability values by the indices of the corresponding classes, and the result of a fusion is also identified by an index. The fusion problem then amounts to determining an appropriate function Fd which, with two integer indices, associates another integer index. Formally:
∀ (k, l) ∈ ℤ², ∃ i ∈ ℤ : F(pk, pl) = pi
and we write Fd(k, l)=i.
The computation of Fd(k,l) requires only the knowledge of the indices k and l and integer arithmetic on these indices; no floating-point computation is necessary for computing the fusion of the information pk and pl. Moreover, if the system of classes is error-free, the index obtained with the aid of Fd(k,l) designates a probability value that is strictly identical to that obtained, using floating-point numbers, by applying equation (11). The procedure thus allows the fusion of the probability classes free of error with respect to a floating-point computation.
It is possible to generalize this approach by considering the case where the system S is constructed as being the union of several sub-systems of classes which, individually, are error-free. In this case it is possible that the system S considered as a whole may not be error-free. It is therefore necessary to introduce an approximation step into the definition of the fusion function Fd:
∀ (k, l) ∈ ℤ², ∃ i ∈ ℤ : F(pk, pl) = p and pi ≤ p < pi+1
It is thereafter possible to round down, to round up or to round to the nearest class, and therefore to choose Fd(k,l)=i or Fd(k,l)=i+1 according to the approximation model. In all cases, the error remains bounded by the width of the support of the system of classes.
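The approximation step can be sketched as a search in a sorted list of class values. This is a minimal illustration, assuming a finite system stored in increasing order; the returned value is a position in that list (not the signed index n), and the function name and the three mode labels are hypothetical.

```python
from bisect import bisect_right

def class_index(p, classes, mode="nearest"):
    """Position of the class chosen to represent probability p.

    classes -- probabilities sorted in increasing order
    mode    -- "down", "up" or "nearest"
    """
    if p <= classes[0]:
        return 0
    if p >= classes[-1]:
        return len(classes) - 1
    i = bisect_right(classes, p) - 1  # classes[i] <= p < classes[i + 1]
    if mode == "down" or classes[i] == p:
        return i
    if mode == "up":
        return i + 1
    # Nearest class; ties go to the lower one.
    return i if p - classes[i] <= classes[i + 1] - p else i + 1

classes = [0.1, 0.25, 0.5, 0.75, 0.9]
print(class_index(0.6, classes, "down"))     # 2
print(class_index(0.6, classes, "up"))       # 3
print(class_index(0.6, classes, "nearest"))  # 2
```

Such a search would typically be run once, offline, to tabulate Fd; at run time only the integer table is consulted.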
A trivial example of an error-free system is S={1/2, 1}. Any system of classes comprising probabilities different from 1/2, 1 and 0 necessarily comprises an infinity of elements. In practice, for obvious implementational reasons, only systems of probability classes of finite cardinality will be considered. However, given that the sensors are finite in number and that their outputs (quantized and digitized) can take only a finite number of values, it is proven that it is possible to carry out “error-free” fusions even on the basis of systems of probability classes of finite cardinality.
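Whether a finite set of probabilities is closed under the fusion function F can be tested exhaustively with exact rational arithmetic. The sketch below is an illustration only; `is_error_free` is a hypothetical helper name, and sets containing both 0 and 1 are excluded because F(0, 1) is undefined.

```python
from fractions import Fraction
from itertools import product

def fuse(p1, p2):
    """Fusion function F, evaluated exactly when given Fraction arguments."""
    num = p1 * p2
    return num / (num + (1 - p1) * (1 - p2))

def is_error_free(classes):
    """True if fusing any two classes of the set yields a class of the set."""
    s = set(classes)
    return all(fuse(a, b) in s for a, b in product(s, repeat=2))

half = Fraction(1, 2)
print(is_error_free({half, Fraction(1)}))     # True: the trivial system {1/2, 1}
print(is_error_free({half, Fraction(3, 4)}))  # False: F(3/4, 3/4) = 9/10 is missing
```

The second call shows why a system containing a value other than 0, 1/2 and 1 cannot be both finite and closed: each fusion of 3/4 with itself produces a new value closer to 1.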
A first example of practical interest is a system of probability classes formed by the union of the following two error-free sub-systems of classes:
Sk− = {pn = 1/(2 − k·n), n ≤ 0} (13)
Sk+ = {pn = (k·n + 1)/(k·n + 2), n ≥ 0} (14)
where k is a positive integer (k ∈ ℕ*). Their union is the system Sk = Sk+ ∪ Sk−, which is not error-free.
a probability of 0.5 indicates uncertain occupancy; it is therefore not useful to be very precise around this value; on the other hand, precision is useful in proximity to the extreme values 0 and 1;
beyond a certain value of |n|, the various values of the probability classes pn get extremely close together; the error introduced by truncating the systems of classes is therefore negligible.
It is preferable to choose a low value for k (for example not greater than 5) so as to prevent only the probabilities very close to 0 or to 1 being finely sampled. In fact, the more precise the sensor, the higher the value of k can be.
The inverse model of
A first possibility consists in replacing the values of the inverse model—represented by the curve MI in
It is noted that, whatever the type of approximation considered, the spatially discretized inverse system (curve MI in
It was shown above, with reference to
As was explained above, the benefit of using systems of probability classes of the type of Sk is that the data fusion requires only computations on the indices “n”, and therefore on integers.
The case of Sk− is considered first. Let i, j ∈ ℤ with i, j < 0. By substituting (13) into (12) and by performing simple computations, it is found that:
F(pi, pj) = pn with n = i + j − k·i·j (15).
The case of Sk+ is considered next. Let i, j ∈ ℤ with i, j > 0. By substituting (14) into (12) and by performing simple computations, it is found that:
F(pi, pj) = pn with n = i + j + k·i·j (16).
If i=0, the probability pi is equal to 0.5 and then F(pi, pj)=pj: a sensor for which the occupancy probability is 0.5 does not provide any information.
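Equations (15) and (16) can be checked with exact rational arithmetic. The sketch assumes that the classes take the form pn = (k·n + 1)/(k·n + 2) for n ≥ 0 and pn = 1/(2 − k·n) for n ≤ 0, a form chosen because it is consistent with equations (15), (16) and (21); the function names are illustrative.

```python
from fractions import Fraction

def p_class(n: int, k: int) -> Fraction:
    """Probability value of class n in S_k (assumed closed form, see lead-in)."""
    if n >= 0:
        return Fraction(k * n + 1, k * n + 2)
    return Fraction(1, 2 - k * n)

def fuse(p1, p2):
    """Reference fusion function F on exact fractions."""
    num = p1 * p2
    return num / (num + (1 - p1) * (1 - p2))

def fd_same_side(i: int, j: int, k: int) -> int:
    """Integer-only fusion when both indices lie on the same side of 0."""
    if i <= 0 and j <= 0:
        return i + j - k * i * j  # equation (15)
    if i >= 0 and j >= 0:
        return i + j + k * i * j  # equation (16)
    raise ValueError("indices of opposite signs need the mixed-sign rule")

k = 1
print(p_class(1, k), fuse(p_class(1, k), p_class(1, k)), p_class(fd_same_side(1, 1, k), k))
# 2/3 4/5 4/5
```

With i = 0 both branches return j, which matches the remark that a class of probability 0.5 carries no information.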
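There remains the case where one of the probabilities to be fused is less than 0.5 (and therefore belongs to the system Sk−) and the other is greater than 0.5 (and therefore belongs to the system Sk+). In this case we have:
F(pi, pj) = pn with n = (i + j)/(1 + k·j) if i<0, j>0 and |i|≥|j|; (17)
F(pi, pj) = pn with n = (i + j)/(1 − k·i) if i<0, j>0 and |i|≤|j|. (18)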
These quotients are not, in general, integers. It is possible, however, to implement an integer division operation “÷” delivering an integer result, namely the index of a probability class whose value is close to the real result. The above expressions then become:
F(pi, pj) = pn with n = (i + j) ÷ (1 + k·j) if i<0, j>0 and |i|≥|j|; (19)
F(pi, pj) = pn with n = (i + j) ÷ (1 − k·i) if i<0, j>0 and |i|≤|j|. (20)
The error introduced by applying equations (19) and (20) is bounded by the maximum distance between two successive classes. The two classes of maximum distance are p0 and p1, consequently the maximum error is given by
E(k)=p1−p0=k/(2k+4) (21)
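A complete integer fusion function for Sk, covering equations (15), (16), (19) and (20), might look like the sketch below. Two points are assumptions: the closed form of the classes (pn = (k·n + 1)/(k·n + 2) for n ≥ 0, pn = 1/(2 − k·n) for n ≤ 0, consistent with equations (15), (16) and (21)) and the choice of truncation toward zero for the operation “÷”, which the text leaves open. The exact fractions are used only to measure the error.

```python
from fractions import Fraction

def p_class(n: int, k: int) -> Fraction:
    """Probability value of class n in S_k (assumed closed form, see lead-in)."""
    return Fraction(k * n + 1, k * n + 2) if n >= 0 else Fraction(1, 2 - k * n)

def fuse(p1, p2):
    """Reference fusion function F on exact fractions."""
    num = p1 * p2
    return num / (num + (1 - p1) * (1 - p2))

def div_toward_zero(a: int, b: int) -> int:
    """Integer division of a by b > 0, truncating toward zero (assumed rounding rule)."""
    q = abs(a) // b
    return q if a >= 0 else -q

def fd(i: int, j: int, k: int) -> int:
    """Index of the class representing F(p_i, p_j), using integer operations only."""
    if i > j:
        i, j = j, i  # F is symmetric; order the indices so that i <= j
    if j <= 0:
        return i + j - k * i * j                  # equation (15)
    if i >= 0:
        return i + j + k * i * j                  # equation (16)
    if -i >= j:                                   # i < 0 < j and |i| >= |j|
        return div_toward_zero(i + j, 1 + k * j)  # equation (19)
    return div_toward_zero(i + j, 1 - k * i)      # equation (20)

k = 2
worst = max(abs(fuse(p_class(i, k), p_class(j, k)) - p_class(fd(i, j, k), k))
            for i in range(-20, 0) for j in range(1, 21))
print(worst <= Fraction(k, 2 * k + 4))  # True: the error stays within E(k)
```

Truncating toward zero always selects the class closer to p0 = 0.5, that is, the less committed of the two neighbouring classes; rounding to the nearest index would be an equally valid choice.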
It is noted that the situation considered here entails the two sensors giving contradictory information; in certain cases, it will be preferred not to apply equations (19), (20), but a rule for managing conflicts, which may be known from the prior art. For example, especially when needing to avoid potentially dangerous collisions, it may be preferable to take the highest estimate of occupancy probability; in this case we therefore have simply:
F(pi, pj)=max(pi, pj) (22)
Another system of classes of practical interest can be defined by recurrence.
Let p be an occupancy probability lying strictly between 0.5 and 1: 0.5 < p < 1. The series pn is then defined by recurrence in the following manner:
The definition of pn is thereafter extended to negative integer values of n in the following manner:
Thereafter, we define the following two systems of classes, with parameter pϵ]0.5, 1[:
Gp+ = {pn, n ≥ 0} (25)
Gp− = {pn, n ≤ 0} (26)
where pn is defined by (23) or (24), depending on whether n is positive or negative. By construction, the systems Gp− and Gp+ are error-free. Indeed, if fp denotes the function of one variable which, with a probability x, associates fp(x)=F(x,p), it is immediately seen that ∀ i ∈ ℕ*, pi = fp^i(p), where the exponent “i” signifies “composed i times with itself”. Consequently:
F(pi, pj) = F(fp^i(p), fp^j(p)) = fp^(i+j)(p) = pi+j (27)
It is deduced therefrom that Gp+ is error-free. By noting that ∀x, yϵ[0,1] we have F(1−x, 1−y)=1−F(x,y), the same reasoning can be applied to Gp−, which is therefore also error-free.
By putting Gp=Gp−∪Gp+, a new system of classes is obtained that can be used in a data fusion procedure according to the invention. What is noteworthy is that, unlike the system Sk defined above, Gp is error-free over the whole of its defining set.
Equation (27) makes it possible to find the formula for fusing the classes in Gp:
Fd(i, j) = i + j ∀ i, j ∈ ℤ (28)
Moreover, the parameter p makes it possible to finely control the error introduced by the quantization; indeed if we put p=1/2+ε, we have:
E(p)=p1−p0=ε (29)
The system Gp is very interesting since it makes it possible to carry out the whole fusion in the simplest possible manner (an integer addition) and error-free, and to curb the error introduced by the quantization by the choice of the parameter p.
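The additive rule (28) can be verified exactly. The sketch assumes that the recurrence yields p0 = 1/2, p1 = p and pn+1 = F(pn, p), with the negative indices obtained symmetrically (p−n = 1 − pn); under that assumption the odds pn/(1 − pn) equal (p/(1 − p))^n, which gives the closed form used below. The saturation at ±n_max, added to keep the number of classes finite, is also an assumption, as are the function names.

```python
from fractions import Fraction

def gp_class(n: int, p: Fraction) -> Fraction:
    """Class n of G_p, assuming its odds are (p / (1 - p)) ** n."""
    w = (p / (1 - p)) ** n  # Fraction supports negative integer exponents
    return w / (1 + w)

def fuse(p1, p2):
    """Reference fusion function F on exact fractions."""
    num = p1 * p2
    return num / (num + (1 - p1) * (1 - p2))

def fd_gp(i: int, j: int, n_max: int) -> int:
    """Equation (28) with saturation so that the result stays in [-n_max, n_max]."""
    return max(-n_max, min(n_max, i + j))

p = Fraction(3, 5)  # p = 1/2 + eps with eps = 1/10
print(gp_class(0, p), gp_class(1, p), gp_class(-1, p))  # 1/2 3/5 2/5
print(fuse(gp_class(2, p), gp_class(-5, p)) == gp_class(-3, p))  # True
print(gp_class(1, p) - gp_class(0, p))  # 1/10, i.e. eps as in equation (29)
```

Because the index behaves like a scaled log-odds value, agreeing sensors accumulate and contradicting sensors cancel, all with a single integer addition per cell.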
In the embodiment of
Each processing block COk therefore receives as input the measurements corresponding to a respective acquisition sheet zk (references z1 . . . zNC), and delivers as output an occupancy grid, in the form of a vector of integers gk representing the indices of the probability classes associated with the various cells of the grid. Thus, the grid gk contains the occupancy information estimated with the aid of the measurements of the sheet k alone, that is to say of the vector of measurements zk.
The consolidation hardware block F comprises four combinatorial logic circuits F1, F2, F3 and F4 implementing equations (15), (16), (19) and (20) respectively; it receives at its input the occupancy grids g1 . . . gNC and delivers at its output a “consolidated” occupancy grid, represented in its turn by a vector of integers, indices of the probability classes associated with the various cells of this consolidated grid.
As was explained above, equations (19) and (20) might not be implemented; thus blocks F3 and F4 may be absent, or be replaced with circuits for managing detection conflicts, for example implementing the rule expressed by equation (22) (choice of the maximum probability).
If the inverse models associated with the various acquisition sheets are identical, the blocks CO1 . . . CONC are also identical, and can be replaced with a single hardware block for computing occupancy probabilities, performing the processings in a sequential manner.
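In software terms, the consolidation amounts to folding an integer fusion function over the grids, cell by cell. The sketch below is a minimal illustration with hypothetical names; the fusion function is passed as a parameter, and a saturating integer addition is used as a stand-in rather than the four circuits F1 to F4 described above.

```python
from functools import reduce

def consolidate(grids, fd):
    """Fuse several grids of class indices into one consolidated grid.

    grids -- non-empty list of equal-length lists of integers (one list per block CO_k)
    fd    -- integer fusion function taking two indices and returning an index
    """
    n_cells = len(grids[0])
    if any(len(g) != n_cells for g in grids):
        raise ValueError("all grids must have the same number of cells")
    # zip(*grids) yields, for each cell, the tuple of indices given by every grid.
    return [reduce(fd, cell_indices) for cell_indices in zip(*grids)]

def saturating_add(i, j, n_max=4):
    """Stand-in fusion rule: integer addition clamped to [-n_max, n_max]."""
    return max(-n_max, min(n_max, i + j))

g1 = [0, 2, -1]
g2 = [1, 2, -3]
g3 = [0, -1, 4]
print(consolidate([g1, g2, g3], saturating_add))  # [1, 3, 0]
```

A single sequential block, as mentioned for identical inverse models, corresponds to calling the same per-grid routine in a loop before this fold.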
The data processing module MTD1 can also be associated with distance sensors of any other type.
In this embodiment, the main difficulty resides in the fact that the occupancy grid on the one hand and the sensors on the other hand each have their own inherent frame associated therewith. Thus, the evaluation of the location of the obstacles makes it necessary to perform changes of frame.
As a variant, the equations for changing frame can be held in conversion tables stored in memories contained in the modules Rk. Thus, even in this case, it is possible to circumvent the floating-point computation and only perform operations on integers. On the other hand, these conversion tables may be fairly voluminous and their storage may have a non-negligible cost in terms of silicon surface area.
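The conversion-table variant can be sketched as follows: the floating-point geometry is evaluated once, offline, and the run-time path reduces to an integer table lookup. Everything specific here is an assumption made for illustration: a planar sensor with evenly spaced beams and range bins, a row-major grid, the value −1 for points outside the grid, and all function and parameter names.

```python
import math

def build_conversion_table(n_beams, n_range_bins, fov_rad, range_step,
                           sensor_xy, cell_size, grid_w, grid_h):
    """Offline step: map each (beam, range bin) to a grid cell index, or -1 if outside."""
    table = []
    for b in range(n_beams):
        # Beam centres are spread evenly over the field of view, centred on the x axis.
        angle = -fov_rad / 2 + (b + 0.5) * fov_rad / n_beams
        row = []
        for r in range(n_range_bins):
            dist = (r + 0.5) * range_step  # centre of the range bin
            x = sensor_xy[0] + dist * math.cos(angle)
            y = sensor_xy[1] + dist * math.sin(angle)
            col, line = math.floor(x / cell_size), math.floor(y / cell_size)
            inside = 0 <= col < grid_w and 0 <= line < grid_h
            row.append(line * grid_w + col if inside else -1)
        table.append(row)
    return table

def cell_of(table, beam, range_bin):
    """Run-time step: pure integer indexing, no floating-point arithmetic."""
    return table[beam][range_bin]

table = build_conversion_table(n_beams=1, n_range_bins=5, fov_rad=0.1, range_step=1.0,
                               sensor_xy=(0.0, 0.5), cell_size=1.0, grid_w=4, grid_h=1)
print(table[0])  # [0, 1, 2, 3, -1]
```

The table holds n_beams × n_range_bins integers per sensor, which illustrates the storage cost mentioned above.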
In the embodiments of
Number | Date | Country | Kind |
---|---|---|---|
1558919 | Sep 2015 | FR | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2016/072530 | 9/22/2016 | WO | 00 |