The present invention concerns a method and a system for determining the location of an object as a receiver or a transmitter.
Indoor localization systems have become a popular research area in recent years, and many universities and companies do research in this field.
There are numerous applications of indoor localization, covering a broad range of fields. One example is inventory management: keeping track of the location and the amount of a particular good in a warehouse. Another application is object tracking, such as localizing medical personnel or equipment in a hospital or tracking people with limited mobility to understand if they need assistance. Yet another application that has drawn considerable attention is location-aware services. When you visit a museum, you are given a device, and if you want to get information about a specific piece of artwork, you can enter a code related to that artwork to retrieve it. However, if the device were aware of the location of the person, it could detect which artwork the person is approaching and retrieve information about it directly, without the user having to enter anything. Other important fields where indoor localization is used are security and rescue operations, such as detecting the location of police dogs trained to find explosives, or localizing firemen in a building on fire.
Positioning systems, covering both indoor and outdoor localization, can be divided into three main topologies. The first is the self-positioning system, where the receiver receives data from distributed transmitters in order to determine its own position (e.g. GPS). The second topology is remote positioning, where receivers located at possibly multiple locations measure the signal coming from the object in order to localize it. The third topology, which has two subcategories, is called indirect positioning: a data link is used to transfer position information from the self-positioning system to a remote site or vice versa.
Besides different topologies of indoor localization, there are also different modalities. One of the modalities is GPS. Although GPS is one of the most accurate outdoor localization systems, its performance is poor indoors because the satellite signals are shielded by concrete and penetrate buildings poorly. Another modality is radio-frequency identification (RFID). This method has found applications in tracking objects in warehouses or assembly lines. There are two different RFID setups: passive and active RFID tags. Passive RFID tags are inexpensive to manufacture, but the reading ranges are short, typically 1-2 meters, and the tag readers are expensive to manufacture. Active tags are more expensive to produce, but they have increased range and the tag readers are less expensive. Another modality is the cellular network. This enables indoor localization when the building is covered by multiple base stations. However, the accuracy depends dramatically on how well the location is covered and on the building's structure. Wireless local area network (WLAN) is also used for indoor localization. However, this approach needs preinstalled infrastructure and the performance fluctuates since the channel changes over time. Another approach uses ultra-wideband (UWB) radio, which is one of the most promising technologies for indoor localization. First, UWB can send sharp pulses that enable precise estimation of the times of arrival at the receivers. Second, UWB can penetrate walls, equipment and clothing, thus the signal can be observed even behind obstructions. Finally, one can use sound and ultrasound. The equipment needed for ultrasound localization is inexpensive and easily accessible. Furthermore, there are planar ultrasound transceivers, that is, transmitters and receivers that emit and receive in the 2D plane. This reduces the total number of reflections to be dealt with and simplifies the design.
Although indoor localization has been studied for twenty years, the field is still an active research area and open for further developments.
It is an aim of the present invention to obviate or mitigate one or more of the disadvantages of the prior art.
It is another aim of the present invention to find an alternative to the solutions of the prior art.
According to the invention, these aims are achieved by means of a method for determining the location of a transmitter in a space defined by one or more reflective surfaces, comprising the steps of
sending a signal from this transmitter;
receiving by a set of receivers the transmitted signal and echoes of the transmitted signal reflected by these reflective surfaces;
finding by a first computing module the location of the virtual sources of the echoes;
mirroring by a second computing module the virtual sources into the space and obtaining mirrored virtual sources;
combining by a third computing module the locations of the mirrored virtual sources so as to obtain the location of the transmitter.
Advantageously the location of the set of receivers and the location and/or orientation of the reflective surfaces are known.
The method according to the invention locates a source or transmitter in a room with general geometry bounded by reflective surfaces using the measurements by the set of receivers, e.g. a microphone array. In other words, the geometry of the space defined by the reflective surfaces (e.g. a room) does not have to be convex, and a direct path between the transmitter and the receiver(s) is not necessary. The method assumes knowledge of the room geometry and of the microphone positions.
The proposed method corresponds to a system architecture fitting in the remote localization topology. The method utilizes the direct signal (if present) and early reflections to localize the source. In this context the expression “direct signal” indicates the signal sent by the transmitter and received directly, i.e. without reflections, by a receiver or the set of receivers.
The method according to the invention advantageously makes use of early reflections or echoes, making it possible to localize the source even when there is no line of sight. The method of images is used to reduce the problem of localizing the source in an indoor environment to the localization of multiple sources in free space, from which the source position is then estimated.
In order to localize the multiple virtual sources, combinatorially selected echoes from each receiver of the set of receivers are used, and the resulting location is tested to check whether the echoes correspond to a single virtual source, for example, whether the echoes uniquely localize a single virtual source.
After locating the virtual sources, the method according to the invention finds the position or location inside the room that results in the generation of such virtual sources.
The step of mirroring of the method comprises applying the method of images in reverse order (“inverse method of images”, or “inverse image source model”). This mirroring procedure could have additional applications whenever it is possible to observe anything through echoes and the geometry of the reflective surfaces is known. A possible application is e.g. GPS in urban environments, where the method according to the invention makes it possible to “see” the satellite only through (possibly multiple) reflections.
The method according to the invention localizes an arbitrary number of image sources and reflects them iteratively until they are in the room. The main idea is then to use the knowledge of the locations of the virtual sources for finding the (unknown) location of the real and original source. Advantageously the inverse method of images reflects the locations of the virtual sources back to the real source location.
The virtual source could be of arbitrary order (1st, 2nd, 3rd, . . . ).
Advantageously the step of mirroring comprises:
drawing the lines connecting a virtual source to the set of receivers,
finding the reflective surface which intersects these lines,
reflecting the virtual source across this reflective surface and generating a reflected virtual source,
storing the points of intersections on the reflective surface,
checking if the reflected virtual source is inside the space defined by the reflective surfaces,
repeating the previous steps if the reflected virtual source is not inside the space defined by the reflective surfaces.
In other words, if the reflected virtual source is not inside the space defined by the reflective surfaces, the following steps are performed:
drawing the lines connecting the stored points of intersections and the reflected virtual source,
finding the new reflective surface which intersects these lines,
reflecting the reflected virtual source across this reflective surface and generating a new reflected virtual source,
storing the points of intersections on the new reflective surface,
checking if the new reflected virtual source is inside the space defined by the reflective surfaces,
repeating the previous steps if the new reflected virtual source is not inside the space defined by the reflective surfaces.
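The mirroring steps above can be illustrated with a minimal Python sketch. It is not the implementation of the invention: it assumes a 2D room given as a convex polygon with counterclockwise corners (the non-convex case needs additional visibility and obstruction checks), and all function names (`mirror_into_room`, `segment_hit`, `reflect`, `inside`) are hypothetical. When the lines to different receivers cross different walls, the sketch reflects across the wall with the most intersections.

```python
import numpy as np

def cross2(a, b):
    """Scalar 2D cross product."""
    return a[0] * b[1] - a[1] * b[0]

def inside(p, corners):
    """True if p lies strictly inside a convex polygon with counterclockwise corners."""
    n = len(corners)
    return all(cross2(corners[(i + 1) % n] - corners[i], p - corners[i]) > 0
               for i in range(n))

def segment_hit(a, b, c0, c1, eps=1e-9):
    """Intersection of segment a-b with wall segment c0-c1, or None."""
    d, e, w = b - a, c1 - c0, c0 - a
    denom = cross2(d, e)
    if abs(denom) < eps:                 # segment parallel to the wall
        return None
    t = cross2(w, e) / denom             # position along a-b
    u = cross2(w, d) / denom             # position along c0-c1
    if eps < t <= 1.0 and 0.0 <= u <= 1.0:
        return a + t * d
    return None

def reflect(p, c0, c1):
    """Mirror p across the line through the wall end points c0 and c1."""
    e = c1 - c0
    n = np.array([e[1], -e[0]]) / np.linalg.norm(e)   # unit normal of the wall
    return p - 2.0 * n * np.dot(n, p - c0)

def mirror_into_room(virtual_source, receivers, corners, max_order=10):
    """Reflect a virtual source back across walls until it lies inside the room."""
    corners = np.asarray(corners, dtype=float)
    v = np.asarray(virtual_source, dtype=float)
    anchors = [np.asarray(m, dtype=float) for m in receivers]
    n = len(corners)
    for _ in range(max_order):
        if inside(v, corners):
            return v
        hits = {}                        # wall index -> intersection points
        for a in anchors:
            for i in range(n):
                h = segment_hit(a, v, corners[i], corners[(i + 1) % n])
                if h is not None:
                    hits.setdefault(i, []).append(h)
        if not hits:
            return None                  # no wall found: drop this virtual source
        wall = max(hits, key=lambda i: len(hits[i]))   # most intersections wins
        v = reflect(v, corners[wall], corners[(wall + 1) % n])
        anchors = hits[wall]             # stored points of intersection
    return v if inside(v, corners) else None
```

For example, in a 4 m by 3 m rectangular room with corners (0, 0), (4, 0), (4, 3), (0, 3) and receivers at (2, 2), (1, 2) and (3, 1), the second-order virtual source at (7, 5) is first reflected across the wall y = 3 to (7, 1) and then across the wall x = 4, which brings it back to the source position (1, 1).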
In one embodiment, the method according to the invention comprises echoes' sorting, i.e. grouping the echoes corresponding to a single virtual source.
In one embodiment, grouping the echoes corresponding to a single virtual source comprises checking if
is less than a threshold, wherein M is the number of receivers, ŝ is the estimated location of the virtual source with the current selection of echoes, $m_i$ is the location of the ith receiver and $r_i$ is the distance between the transmitter and the ith receiver.
Advantageously the method according to the invention comprises optimization of the source location. A measure based on the simulated room impulse response from the estimated source location is defined to find the best estimate for the true source position within the reflected virtual sources.
After choosing the estimate from the set of reflected virtual sources, its position is optimized based on the difference between simulated and recorded impulse responses.
In one embodiment the optimization of the source location estimate comprises a gradient descent method, which is used for optimizing the estimated location based on the simulated room impulse response.
The method according to the invention could comprise the tracking of the transmitter by using an optimization method. The optimization algorithm can then be applied to the problem of source tracking, where the new position of the source is estimated by applying optimization based on the position estimated in the previous time instance. In other words, the optimization method is applied for tracking a moving source based on previous position estimates.
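The refinement and tracking idea can be sketched in Python as follows. This is only an illustration under a strong simplifying assumption: instead of the difference between simulated and recorded room impulse responses, the cost is taken to be the sum of squared differences between measured ranges and the ranges predicted from the current position estimate. The names `refine_position` and `track`, the step size and the iteration count are hypothetical choices.

```python
import numpy as np

def refine_position(x0, mics, ranges, step=0.1, iters=300):
    """Gradient descent on sum_i (||x - m_i|| - r_i)^2, started from x0.

    Assumes the estimate never coincides exactly with a microphone position.
    """
    x = np.array(x0, dtype=float)
    mics = np.asarray(mics, dtype=float)
    ranges = np.asarray(ranges, dtype=float)
    for _ in range(iters):
        diff = x - mics                           # shape (M, 2)
        dist = np.linalg.norm(diff, axis=1)       # ranges predicted from x
        grad = 2.0 * ((dist - ranges) / dist) @ diff
        x -= step * grad
    return x

def track(x_init, mics, range_frames):
    """Track a moving source: each frame starts from the previous estimate."""
    estimates = []
    x = np.array(x_init, dtype=float)
    for ranges in range_frames:
        x = refine_position(x, mics, ranges)
        estimates.append(x.copy())
    return estimates
```

Starting each frame from the previous estimate keeps the descent close to the true position when the source moves little between time instances, which matters because the cost is not convex.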
By duality, in one particular embodiment, the method can be applied to localizing a microphone using multiple transmitters. In another particular embodiment, the method of the invention can be applied to GPS, wherein the transmitter is a mobile device and the receivers are satellites (which, although opposite to the standard arrangement whereby mobile devices are receivers and satellites are transmitters, is conceptually the same). It will be understood that the method of the invention can equally be applied to GPS wherein the transmitter is a satellite and the receiver is a mobile device. The application of the method of the invention to GPS will be described in more detail later.
The present invention concerns also a system for determining the location of a transmitter, comprising:
one or more reflective surfaces
one transmitter for sending a signal;
a set of receivers for receiving the transmitted signal and echoes of the transmitted signal reflected by these reflective surfaces;
a first computing module for finding the location of the virtual sources of said echoes;
a second computing module for mirroring the virtual sources into the room and obtaining mirrored virtual sources;
a third computing module for combining the locations of these mirrored virtual sources so as to find the location of the transmitter.
In one preferred embodiment, the first computing module, the second computing module and the third computing module are the same module.
In one preferred embodiment, the signal is a UWB signal, the transmitter being a UWB transmitter and the receivers being UWB receivers. However other kinds of signals (e.g. acoustic, RF, light, etc.) can be used, as will be discussed.
The present invention concerns also a computer program product, comprising:
a tangible computer usable medium including computer usable program code for determining the location of a transmitter sending a signal received by a set of receivers, the set of receivers receiving also echoes of the transmitted signal reflected by one or more reflective surfaces, the computer usable program code being used for
finding by a first computing module the location of the virtual sources of these echoes;
mirroring by a second computing module the virtual sources into the space and obtaining mirrored virtual sources;
combining by a third computing module the locations of these mirrored virtual sources so as to obtain the location of the transmitter.
By duality, the same procedure can be performed to localize a receiver with several transmitters in a non-convex room.
The present invention concerns then also a method for determining the location of a receiver in a space defined by one or more reflective surfaces, comprising the steps of
sending a signal from a set of transmitters;
receiving by this receiver the transmitted signal and echoes of the transmitted signal reflected by these reflective surfaces;
finding by a first computing module the location of the virtual receivers of the echoes;
mirroring by a second computing module the virtual receivers into the space and obtaining mirrored virtual receivers;
combining by a third computing module the locations of the mirrored virtual receivers so as to obtain the location of the receiver.
The present invention concerns then also a system for determining the location of a receiver, comprising:
one or more reflective surfaces
a set of transmitters for sending a signal;
this receiver for receiving the transmitted signal and echoes of the transmitted signal reflected by said reflective surfaces;
a first computing module for finding the location of the virtual receivers of the echoes;
a second computing module for mirroring the virtual receivers into the room and obtaining mirrored virtual receivers;
a third computing module for combining the locations of the mirrored virtual receivers so as to find the location of the receiver.
The present invention concerns then also a computer program product, comprising:
a tangible computer usable medium including computer usable program code for determining the location of a receiver receiving a signal transmitted by a set of transmitters, the receiver receiving also echoes of the transmitted signal reflected by one or more reflective surfaces, the computer usable program code being used for
finding by a first computing module the location of the virtual receivers of the echoes;
mirroring by a second computing module the virtual receivers into the space and obtaining mirrored virtual receivers;
combining by a third computing module the locations of the mirrored virtual receivers so as to obtain the location of the receiver.
Experiments performed by the applicant have demonstrated the effectiveness, the accuracy and the robustness of the proposed methods and systems.
The present invention concerns also a computer data carrier storing presentation content created with the described methods.
The invention will be better understood with the aid of the description of an embodiment given by way of example and illustrated by the figures.
The present invention will now be described in more detail in connection with its embodiment for determining the location of a loudspeaker (or, in general, of a transmitter) by knowing the geometry of a room, i.e. the location and the orientation of its walls (or, in general, of its reflective surfaces), and the location of a set of microphones (or, in general, of receivers). However, the present invention finds applicability in connection with many other fields, as will be discussed. For example, the described method and system can be used for determining the location of a receiver by knowing the geometry of a room, i.e. the location and the orientation of its walls or reflective surfaces, and the location of a set of transmitters.
The present invention will now be described in more detail in connection with an ultrasonic signal. However, the present invention finds applicability in connection with other kinds of signals, e.g. and in a non-limiting way a light signal, an RF signal, etc.
The present invention will now be described in more detail in connection with a room. The example described is with respect to a 2D room for ease of understanding of the principles of the present invention; however, it will be understood that the invention can be applied, using the same principles, to a 3D room. It will also be understood that the present invention does not necessarily need to be applied in a room.
Throughout this application vectors are denoted as boldface letters, e.g. x. The position of the ith microphone is denoted as $m_i$. Without loss of generality, one corner of the room is assumed to be at the origin, and the corners are numbered in counterclockwise direction starting from ‘1’ for the corner at the origin. The coordinates of the ith corner of the room are denoted by $c_i$, and the unit outward normal vector of the ith wall is denoted by $n_i$. The inner product between two vectors a and b is denoted as $\langle a, b \rangle = a^T b$, and $\|a\|$ denotes the $\ell_2$-norm of a, defined as $\|a\| = \sqrt{\langle a, a \rangle}$. The coordinates of the source are denoted by $s \in \mathbb{R}^2$. The virtual sources (to be explained) generated by reflecting the source, in the specified order, across the walls i, j, k are denoted as $v_{i,j,k}$.
Times of arrival (TOA) are measurements of the absolute propagation time for the signal to reach the microphones after being emitted by the source. TOA can be measured if there is synchronization between the microphones and the source, namely, if the microphones have the information about when the signal was emitted by the source.
If the microphones do not know when the signal is emitted by the source, the absolute propagation time for the signal to reach the microphones is unknown, thus TOA are not available. In this case one can measure the time differences of arrival (TDOA) between microphones if they have a common time reference. This can be achieved by designating one of the microphones as the reference microphone, and finding the difference between the time when the signal reaches the reference microphone and the time when it reaches each of the other microphones.
Direction of arrival (DOA) is the angle, with respect to a predefined reference, at which the signal arrives at a receiver.
Triangulation is determining the position of an object by measuring the angles between the object and predefined fixed anchors.
Trilateration is a method for finding the position of an object by using distance measurements to at least three anchors; in the case where one uses more than three anchors, the problem is called multilateration.
It should be noted that for TOA, TDOA and DOA, preferably at least three receivers are used for unique localization in free space in 2D, and four in 3D.
It should be understood that in the present application the terms “virtual source” and “image source” are used interchangeably and both terms stand for mirror images of a true source across one or multiple walls. Image sources are used to model echoes. First order echoes can be seen as coming from first order image sources, and higher order echoes can be seen as coming from higher order image sources.
If the times of arrival are known, the distance from the source to the microphones can be found by using the speed of propagation of sound in air c=343.2 m/s (the precise value depends on other factors such as the temperature). Then by trilateration, the position of the source can be found by intersecting the distance circles 1a-d as shown in
When the times of arrival are measured without error, the circles will intersect at a single point and the intersection gives the position of the source. However, if there is jitter in the distance measurements, the circles 1a-d do not intersect at a single point as shown in
Referring to
$$r_i = \|x - m_i\| + \varepsilon_i, \quad i = 1, 2, \ldots, M,$$
where $\varepsilon_i$ is random measurement jitter and M is the number of microphones, the position of the source 2 can be estimated by finding the position in the 2D plane that minimizes the sum of the squares of the differences between the measured distances r1, r2 of the source 2 to the microphones and the distances between the microphones m1-m4 (generalized as $m_i$) and the test position 7. This optimization problem can be written as:
$$\min_{x \in \mathbb{R}^2} \sum_{i=1}^{M} \left( r_i - \|x - m_i\| \right)^2. \quad (1)$$
The solution to this problem yields the maximum likelihood estimator if errors follow a Gaussian distribution with covariance matrix proportional to the identity matrix. However, this problem is not convex, and there is no efficient algorithm to find the globally optimal solution. There are methods for approximating the solution, such as by semidefinite relaxation as disclosed in:
However, it is reported that although semi-definite relaxation yields good results for some instances, it can perform badly if the relaxation is not tight as disclosed in
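Another optimization problem mentioned is the so-called ‘squared-range-based least squares’ problem, obtained by squaring the distances in (1) and defined as:
$$\min_{x \in \mathbb{R}^2} \sum_{i=1}^{M} \left( \|x - m_i\|^2 - r_i^2 \right)^2. \quad (2)$$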
Although this is still a nonconvex problem, the solution can be found efficiently and globally by the method described in the publication “Exact and Approximate Solutions of Source Localization Problems”.
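To formulate the problem so that the solution can be found efficiently, it is first written in constrained form as:
$$\min_{x \in \mathbb{R}^2,\ \alpha \in \mathbb{R}} \sum_{i=1}^{M} \left( \alpha - 2 m_i^T x + \|m_i\|^2 - r_i^2 \right)^2 \quad \text{subject to} \quad \|x\|^2 = \alpha.$$
By using the substitution $y = (x^T, \alpha)^T$ the problem is written as:
$$\min_{y \in \mathbb{R}^3} \left\{ \|A y - b\|^2 : y^T D y + 2 f^T y = 0 \right\}, \quad (3)$$
where the ith row of A is $(-2 m_i^T, 1)$, the ith entry of b is $r_i^2 - \|m_i\|^2$, $D = \mathrm{diag}(1, 1, 0)$ and $f = (0, 0, -1/2)^T$.
The resulting problem consists of the minimization of a quadratic objective subject to a single quadratic equality constraint; such problems are called generalized trust region sub-problems (GTRS) in the optimization literature, as disclosed in: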
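It is shown that $y \in \mathbb{R}^3$ is an optimal solution if and only if there exists $\lambda \in \mathbb{R}$ such that:
$$(A^T A + \lambda D)\, y = A^T b - \lambda f,$$
$$y^T D y + 2 f^T y = 0,$$
$$A^T A + \lambda D \succeq 0.$$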
It follows that the optimal solution to (3) is given by:
$$\hat{y}(\lambda) = (A^T A + \lambda D)^{-1} (A^T b - \lambda f), \quad (4)$$
where λ is the unique solution of:
$$\hat{y}(\lambda)^T D\, \hat{y}(\lambda) + 2 f^T \hat{y}(\lambda) = 0 \quad (5)$$
over the interval where $A^T A + \lambda D$ is positive definite. The interval
satisfying this property can be found by using congruence transformations. By Sylvester's law of inertia, the matrix $C^T H C$, where C is a non-singular matrix, has the same number of positive, negative and zero eigenvalues as the matrix H, as is disclosed in:
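By saying that H is congruent to G if $G = C^T H C$ for some non-singular C and denoting this equivalence relation by $H \sim G$, we have:
$$A^T A + \lambda D = (A^T A)^{1/2} \left( I + \lambda (A^T A)^{-1/2} D (A^T A)^{-1/2} \right) (A^T A)^{1/2} \sim I + \lambda (A^T A)^{-1/2} D (A^T A)^{-1/2}.$$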
Since all of the matrices on the right-hand side of the equation have nonnegative eigenvalues, it follows that:
It is given in the publication “Linear Algebra and Its Applications” that λ can be found by a simple bisection algorithm, since the function (5) is decreasing on this interval.
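The solution procedure for the squared-range-based least squares problem can be sketched in Python as below. This is a minimal sketch, not a definitive implementation: it assumes a 2D setting, microphones that are not collinear (so that $A^T A$ is invertible), and the usual GTRS definitions of A, b, D and f, which are stated in the code comments. The function name `srls_localize` and the tolerance are hypothetical.

```python
import numpy as np

def srls_localize(mics, ranges, tol=1e-12):
    """Squared-range least squares in 2D, solved as a GTRS by bisection on lambda."""
    mics = np.asarray(mics, dtype=float)
    r = np.asarray(ranges, dtype=float)
    # Assumed definitions: rows of A are (-2 m_i^T, 1), b_i = r_i^2 - ||m_i||^2,
    # D = diag(1, 1, 0), f = (0, 0, -1/2), so that y = (x, ||x||^2).
    A = np.hstack([-2.0 * mics, np.ones((len(mics), 1))])
    b = r**2 - np.sum(mics**2, axis=1)
    D = np.diag([1.0, 1.0, 0.0])
    f = np.array([0.0, 0.0, -0.5])
    AtA, Atb = A.T @ A, A.T @ b

    def y_hat(lam):
        return np.linalg.solve(AtA + lam * D, Atb - lam * f)

    def phi(lam):
        y = y_hat(lam)
        return y @ D @ y + 2.0 * f @ y

    # A^T A + lam*D is positive definite for lam > -1/lam_max, where lam_max is
    # the largest eigenvalue of (A^T A)^{-1} D.
    lam_max = np.max(np.linalg.eigvals(np.linalg.solve(AtA, D)).real)
    lo = -1.0 / lam_max
    hi = max(1.0, 2.0 * abs(lo))
    while phi(hi) > 0:                   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0:                 # phi is decreasing: root lies to the right
            lo = mid
        else:
            hi = mid
    return y_hat(0.5 * (lo + hi))[:2]
```

With jitter-free ranges the root is at λ = 0 and the estimate coincides with the true source position; with jitter the bisection selects the λ for which the constraint $\|x\|^2 = \alpha$ holds.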
Referring to
$$d_i = \|s - m_i\| - \|s\|, \quad i = 1, 2, \ldots, M.$$
Geometrically, the points in the 2D plane that have a fixed distance difference to two fixed anchors trace a hyperbola. Since the distance difference to two anchors yields one hyperbola, three or more microphones give multiple hyperbolas, and with precise range difference measurements the intersection of the hyperbolas yields the source position ‘s’. However, if there is jitter in the measurements and we have more than two hyperbolas, the hyperbolas will not intersect at a single point. In this case, we can solve an optimization problem to find the best position estimate for the source position ‘s’. Rewriting the range difference equality we have:
$$-2 d_i \|s\| - 2 m_i^T s = d_i^2 - \|m_i\|^2, \quad i = 1, 2, \ldots, M,$$
which is satisfied for jitter-free measurements.
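The rewritten range difference equality can be checked numerically. The following short Python sketch uses a hypothetical 2D setup (the source and microphone coordinates are arbitrary illustrative values), with the reference microphone assumed to be at the origin as in the definition of $d_i$.

```python
import numpy as np

# Hypothetical 2D setup: the reference microphone sits at the origin.
s = np.array([1.0, 2.0])                                  # source position
mics = np.array([[3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])     # other microphones m_i

# Range differences d_i = ||s - m_i|| - ||s||
d = np.linalg.norm(s - mics, axis=1) - np.linalg.norm(s)

lhs = -2.0 * d * np.linalg.norm(s) - 2.0 * mics @ s
rhs = d**2 - np.sum(mics**2, axis=1)

print(np.allclose(lhs, rhs))   # True for jitter-free range differences
```

Adding a small perturbation to `d` makes the two sides differ, which is the residual that the least squares formulation minimizes.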
However, when there is jitter, the equality does not hold, but a reasonable estimate for the source position ‘s’ can be found by solving the so-called squared-range-difference-based least squares problem:
$$\min_{s \in \mathbb{R}^2} \sum_{i=1}^{M} \left( -2 m_i^T s - 2 d_i \|s\| - g_i \right)^2,$$
where $g_i = d_i^2 - \|m_i\|^2$,
as is disclosed in
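A closed form solution to this problem is derived in the publication “Exact and Approximate Solutions of Source Localization Problems”. First, the problem is written in constrained form:
$$\min_{y \in \mathbb{R}^3} \left\{ \|B y - g\|^2 : y^T C y = 0,\ y_3 \ge 0 \right\},$$
where $y = (s^T, \|s\|)^T$, the ith row of B is $(-2 m_i^T, -2 d_i)$, the ith entry of g is $g_i = d_i^2 - \|m_i\|^2$, and $C = \mathrm{diag}(1, 1, -1)$.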
It is shown in the publication “Exact and Approximate Solutions of Source Localization Problems” that a sufficient condition for y to be the optimal point of the problem is that there exists $\lambda \in \mathbb{R}$ such that:
$$(B^T B + \lambda C)\, y = B^T g,$$
$$B^T B + \lambda C \succeq 0,$$
$$y^T C y = 0,$$
$$y_3 \ge 0.$$
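Using the optimality conditions, a prototype procedure for finding the optimal solution is given as:
$$y(\lambda)^T C\, y(\lambda) = 0, \quad \lambda \in I_1,$$
where
$$y(\lambda) = (B^T B + \lambda C)^{-1} B^T g,$$
$$t(\lambda) = (B^T B + \lambda C)^{-1} B^T g,$$
$$y(\lambda)^T C\, y(\lambda) = 0,$$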
and $B^T B + \lambda C$ has at most one negative eigenvalue. The intervals of λ corresponding to these settings can be found by the congruence relation, as before:
$$B^T B + \lambda C = (B^T B)^{1/2} \left( I + \lambda (B^T B)^{-1/2} C (B^T B)^{-1/2} \right) (B^T B)^{1/2} \sim I + \lambda (B^T B)^{-1/2} C (B^T B)^{-1/2}.$$
Defining the matrix:
$$V = (B^T B)^{-1/2} C (B^T B)^{-1/2}$$
and denoting the ith eigenvalue of V as $\lambda_i(V)$, where the eigenvalues are ordered in decreasing order as $\lambda_1 \ge \lambda_2 \ge \lambda_3$. Since $B^T B$ is positive definite and C has one negative and two strictly positive eigenvalues, we have $\lambda_1 \ge \lambda_2 \ge 0 \ge \lambda_3$. From the congruence relation we see that the signs of the eigenvalues of $B^T B + \lambda C$ are the same as those of $I + \lambda V$, which has eigenvalues $1 + \lambda \lambda_i(V)$. Using these facts, there are three disjoint intervals where $B^T B + \lambda C$ has at most one negative eigenvalue:
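The end points of these intervals follow from the sign of each eigenvalue $1 + \lambda \lambda_i(V)$. As a worked sketch, assuming the strict inequalities $\lambda_1(V) \ge \lambda_2(V) > 0 > \lambda_3(V)$ (which hold when C has no zero eigenvalue) and assuming that the labels $I_0$, $I_1$, $I_2$ are assigned from left to right on the real line, the intervals can be written as:

```latex
I_0 = \left( -\frac{1}{\lambda_2(V)},\ -\frac{1}{\lambda_1(V)} \right), \qquad
I_1 = \left( -\frac{1}{\lambda_1(V)},\ -\frac{1}{\lambda_3(V)} \right), \qquad
I_2 = \left( -\frac{1}{\lambda_3(V)},\ \infty \right).
```

Under these assumptions, all three eigenvalues $1 + \lambda \lambda_i(V)$ are positive on $I_1$, so $B^T B + \lambda C$ is positive definite there; on $I_0$ only $1 + \lambda \lambda_1(V)$ is negative, and on $I_2$ only $1 + \lambda \lambda_3(V)$ is negative. For $\lambda < -1/\lambda_2(V)$ two eigenvalues are negative, so that range is excluded.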
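Using these intervals, the full procedure is defined as:
$$y(\lambda)^T C\, y(\lambda) = 0, \quad \lambda \in I_1,$$
where
$$y(\lambda) = (B^T B + \lambda C)^{-1} B^T g,$$
$$y(\lambda)^T C\, y(\lambda) = 0, \quad \lambda \in I_0 \cup I_2.$$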
The method of images (also known as the image source model) provides that reflections coming from walls can be viewed as direct signals coming from virtual sources. These virtual sources are obtained by mirroring the true source across the reflecting walls (possibly across multiple walls) as disclosed in:
The positions of these virtual sources $v_i$ can be found by:
$$v_i = s - 2 N_i (s - p_i), \quad (8)$$
where $N_i := n_i n_i^T$ is the orthogonal projection operator onto the normal to wall i, and $p_i$ is any point belonging to the ith wall.
To find higher order virtual sources one can reflect the source across multiple walls, or equivalently reflect a virtual source across a wall, as:
$$v_{i,j} = v_i - 2 N_j (v_i - p_j). \quad (9)$$
By using the method of images, we are reducing the problem of localizing the source in a room, to localization of multiple sources in free space.
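The mirroring of a source across one wall and then across a second wall can be illustrated with a short Python sketch. The room dimensions, wall labels and the function name `image_source` are hypothetical and chosen only for illustration; each wall is described by one point on it and its unit normal.

```python
import numpy as np

def image_source(point, wall_point, normal):
    """Mirror `point` across the wall through `wall_point` with unit normal `normal`."""
    n = np.asarray(normal, dtype=float)
    N = np.outer(n, n)                       # projection onto the wall normal
    return point - 2.0 * N @ (point - wall_point)

s = np.array([1.0, 1.0])                     # true source
walls = {
    1: (np.array([4.0, 0.0]), np.array([1.0, 0.0])),   # wall x = 4
    2: (np.array([0.0, 3.0]), np.array([0.0, 1.0])),   # wall y = 3
}

v1 = image_source(s, *walls[1])              # first-order virtual source: (7, 1)
v12 = image_source(v1, *walls[2])            # second-order virtual source: (7, 5)
```

Because the reflection is its own inverse, applying `image_source` to `v12` with wall 2 and then wall 1 returns the original source, which is the property the inverse method of images relies on.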
One of the goals of the present invention is localization of a source which transmits a signal (e.g. an ultrasonic source or radio source) in a known reverberant room having general geometry, not limited to convex, bounded by at least some planar walls, from the measurements by a receiver (e.g. a microphone array). In the present description an example in which the source is an ultrasonic source and a receiver is a microphone array will be described, however it will be understood that the present invention is not limited to such an embodiment and other suitable types of sources and receivers can be used.
When the room is convex, assuming point microphones so they do not block the signals, the source is visible by all microphones in the microphone array (i.e. each microphone in the microphone array can receive a signal (such as an acoustic signal) which is emitted by the source). When the source is visible, all microphones (receivers) hear the direct signal, and the direct signal arrives before any echo. Thus, in the convex room setting, these direct signals can be used for the localization of the source, and it is reported that the performance of the localization algorithms decreases with reverberation, although there are notable exceptions.
In an exemplary problem setting of the present invention, the room is not assumed to be convex, thus there are positions in the room where the source is only partially visible or not visible to the microphones in the microphone array (i.e. some microphones in the microphone array cannot directly receive a signal, such as an acoustic signal, which is emitted by the source), and thus the direct signal may not be heard. In the context of the present invention a “direct” signal is a signal which has not been reflected (e.g. which has not been reflected by a wall or object). However, the echoes reflected from the walls are received by those microphones in the microphone array which do not receive the direct signal. In this setting, echoes, which in general degrade localization performance in a convex room, are used to localize the source in a room with general geometry.
With reference to
The building blocks of a method according to the present invention will now be described: the forward model for generating virtual sources given a source position inside the room, localization of virtual sources from the recordings by the microphone array, reflecting the localized virtual sources into the room, estimating the source position from multiple reflected sources and optimizing the position of the source location estimate.
In an embodiment of the present invention a forward model is used which generates the recordings by the microphone array given the room geometry, source position and the microphone positions. The approach used for the forward model in this exemplary embodiment is based on the method disclosed in:
The positions of the virtual sources can be found by the equations disclosed in the previous sections. For the generation of virtual sources and checking if each virtual source is heard by a microphone in a specific position, there are three aspects that need to be tested: validity, visibility and obstruction.
Validity: The virtual source needs to correspond to valid echoes. For a candidate virtual source to be valid, the generating source needs to be directly adjacent to that wall. An example of an invalid virtual source is reflecting a first generation virtual source back in the room, across the same wall that generated it in the first place. This would correspond to two consecutive bounces off the same wall, which is physically infeasible. Visibility: With reference to
Although this is sufficient for a first order echo, for higher order echoes, one needs to check visibility also in the walls that were used to generate lower order virtual sources generating it, i.e., point of intersection ‘a’ of the generating wall and the line drawn between the virtual source V2 V2,4 and microphone m1 needs to be visible from the parent walls generating the virtual source. A parent wall is a wall that is part of the sequence of walls (reflections) that lead to a particular image source. Visibility of an image source from a certain point means that the receiver at that point can hear the signal from the image source. Conditions for visibility of higher order image sources are also illustrated in
Obstructions: In a convex room, since any convex combination of points belonging to the room is also inside the room, there is no obstruction of the source. However, as shown in
Localization of Virtual Sources
In this section we discuss the localization of virtual sources in two settings, where we have TOA and TDOA. First we consider the case where we have the signals recorded by M microphones containing the TOA. At the receiver, we do not know whether the signal is coming directly from the source or through a reflection from a wall. In particular, if the signal is reflected, we do not know which wall(s) generate the reflection.
In order to localize the virtual sources, we take one arrival time from each microphone combinatorially and we calculate the range by multiplying it with the speed of sound, to obtain the distance ri between the virtual source and microphone where i=1, 2, . . . , M.
Using this measure, we say that a particular combination of echoes corresponds to a single virtual source if the score is less than a chosen threshold.
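The combinatorial search described above can be sketched as follows. This is a minimal illustration, not the method as claimed: the speed of sound value, the linearized least-squares solver, and the sum-of-squared-residuals score are assumptions standing in for the localization score defined in the text, and all function names are hypothetical.

```python
import itertools
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s


def localize_from_ranges(mics, ranges):
    """Linearized least-squares position from ranges to known microphones."""
    # Subtracting the first range equation from the others removes |x|^2.
    A = 2.0 * (mics[1:] - mics[0])
    b = (ranges[0] ** 2 - ranges[1:] ** 2
         + np.sum(mics[1:] ** 2, axis=1) - np.sum(mics[0] ** 2))
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


def residual_score(mics, ranges, x):
    """Stand-in score: squared mismatch between measured and implied ranges."""
    return float(np.sum((np.linalg.norm(mics - x, axis=1) - ranges) ** 2))


def find_virtual_sources(mics, arrival_times, threshold):
    """Try one arrival time per microphone and keep consistent combinations."""
    found = []
    for combo in itertools.product(*arrival_times):
        ranges = C_SOUND * np.array(combo)  # r_i = c * t_i
        x = localize_from_ranges(mics, ranges)
        score = residual_score(mics, ranges, x)
        if score < threshold:
            found.append((x, score))
    return found
```

The number of combinations grows as the product of the per-microphone pulse counts, so a practical implementation would likely prune candidates before solving.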
In the case where we do not have TOA but TDOA, we use a similar approach to find correct echo combinations corresponding to a single virtual source. We designate one microphone as the reference microphone and, without loss of generality, assume it to be at the origin. Then we go through each pulse recorded in that reference microphone, combinatorially take one pulse from each of the other microphones, and multiply the times by the speed of sound to obtain distances. Before using the chosen echo combination in the squared-range-difference-based least squares optimization, we shift the pulses so that the distance difference in the reference microphone equals 0 (so that it indeed becomes the reference). Formally, if ti are the time instants of the selected pulses from microphones i=0, 1, . . . , M−1, where microphone 0 is the reference, we define the range differences as di=c(ti−t0), where c is the speed of propagation of sound. Then we localize the virtual source using the squared-range-difference-based least squares algorithm with distances d1, . . . , dM−1 and check whether the echo combination was correct by evaluating the range-difference localization score, GRDL, defined as:
Again, we accept the chosen echo combination as coming from a single virtual source if the score is less than a threshold.
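The range-difference step can be illustrated with a short sketch. The speed of sound value and the function names are assumptions, and the residual shown is a simple stand-in for GRDL rather than the score defined in the text; the squared-range-difference least squares solver itself is not reproduced here.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s


def range_differences(times):
    """d_i = c * (t_i - t_0), with microphone 0 as the reference."""
    t = np.asarray(times, dtype=float)
    return C_SOUND * (t - t[0])


def range_difference_residual(mics, d, x):
    """Stand-in score: mismatch between measured and implied range differences."""
    dist = np.linalg.norm(mics - x, axis=1)
    return float(np.sum((dist - dist[0] - d) ** 2))
```

Because only differences enter, an unknown common emission time added to every ti cancels out, which is why the reference pulse can be shifted to zero.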
Reflecting Localized Virtual Sources
After finding the location of the virtual source, since the room geometry is known, one can use the method of images in reverse order to find the real source position that would have generated that virtual source.
Referring to
One problem that might occur while applying the inverse method of images is that the lines drawn to multiple microphones may intersect multiple walls. This may happen due to errors in virtual source localization or jitter. In that case one may choose to drop that localized virtual source or reflect across the wall with the highest number of intersections.
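As an illustration of the inverse method of images, the sketch below mirrors a localized virtual source back across a known sequence of walls in 2D. It assumes the wall sequence has already been determined (for example from the line intersections discussed above) and that each wall is given by two endpoints; the function names are hypothetical.

```python
import numpy as np


def reflect_across_wall(point, wall_a, wall_b):
    """Mirror a 2D point across the line through wall_a and wall_b."""
    p, a, b = (np.asarray(v, dtype=float) for v in (point, wall_a, wall_b))
    direction = (b - a) / np.linalg.norm(b - a)
    foot = a + np.dot(p - a, direction) * direction  # projection onto the wall line
    return 2.0 * foot - p


def unfold_virtual_source(virtual_source, wall_sequence):
    """Undo the reflections in reverse order: last generating wall first."""
    p = np.asarray(virtual_source, dtype=float)
    for wall_a, wall_b in reversed(wall_sequence):
        p = reflect_across_wall(p, wall_a, wall_b)
    return p
```

For a first order virtual source the sequence contains a single wall, and the unfolding reduces to one reflection.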
Estimating the Source Position
So far we have multiple localized virtual sources that are reflected inside the room. The remaining question is how to pick the position estimate for the true source. To this end, one may make use of the localization scores, GLOC (or GRDL for TDOA), of the virtual sources, where the lower GLOC is, the closer the estimated virtual source is to the true one. However, as the measurement jitter increases, incorrect echo combinations start to mimic correct echo combinations coming from false virtual sources and give low localization scores. Hence, for stable localization in the case of high measurement jitter, a different measure (score) of how accurate a position estimate is may be used:
We derive the score based on the following idea when we have TOA recordings. If the estimated source position is close to the true source, the simulated room impulse response from the estimated source will be close to the recorded one as shown in
and ri,j is the jth pulse recorded by the ith microphone, and r̂i,k is the kth pulse that would have been recorded by the ith microphone if the source were at ŝ.
Using this measure, we pick the reflected source that gives the lowest GRIR as the position estimate.
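A possible reading of this selection step is sketched below. The exact form of GRIR is not reproduced here; the code assumes a nearest-pulse matching score, in which each recorded pulse is compared with the closest simulated pulse at the same microphone and the squared gaps are summed. The `simulate` callable, which would return the simulated pulses per microphone for a candidate position, is hypothetical.

```python
import numpy as np


def rir_score(recorded, simulated):
    """Sum over microphones of squared gaps to the nearest simulated pulse."""
    total = 0.0
    for rec_i, sim_i in zip(recorded, simulated):
        rec_i = np.asarray(rec_i, dtype=float)
        sim_i = np.asarray(sim_i, dtype=float)
        gaps = np.abs(rec_i[:, None] - sim_i[None, :])  # all pairwise gaps
        total += float(np.sum(gaps.min(axis=1) ** 2))
    return total


def pick_best(candidates, recorded, simulate):
    """Return the candidate position with the lowest score."""
    scores = [rir_score(recorded, simulate(c)) for c in candidates]
    return candidates[int(np.argmin(scores))]
```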
Optimizing the Position Estimate
Above we have defined a score based on RIR to choose among the localized and reflected virtual sources the virtual source that accurately estimates the true source position. We can further improve the estimate by moving it inside the room so that the RIR score, GRIR, is minimized.
Towards this end, we define the virtual source that simulates the echo closest in time to the jth echo recorded in the ith microphone by v̂(i,j) as:
where we denote virtual sources of any order by single subscript. With this notation we define the RIR score again, this time as an explicit function of the estimated source position:
To minimize the score, one may solve the optimization problem:
Although this problem is again non-convex, if the initial position estimate is good enough, we can find the minimizer iteratively using gradient descent or any other local search technique. The gradient of the RIR score with respect to the source position is calculated as:
where the product is over the wall sequence that generates the virtual source v̂(i,j). Then the position is optimized iteratively by setting:
$\hat{s} \leftarrow \hat{s} - \eta \nabla G_{RIR}(\hat{s})$,
where η≧0 is the learning rate. The algorithm may be stopped when the l2 norm of the update in the source position is smaller than a predefined positive threshold.
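The update rule and stopping criterion can be sketched generically as follows. The gradient is passed in as a callable because the gradient of the RIR score is not reproduced here; the default step size, tolerance, and iteration cap are illustrative assumptions.

```python
import numpy as np


def gradient_descent(grad, s0, eta=0.1, tol=1e-8, max_iter=10000):
    """Iterate s <- s - eta * grad(s) until the update norm drops below tol."""
    s = np.asarray(s0, dtype=float)
    for _ in range(max_iter):
        step = eta * grad(s)
        s = s - step
        if np.linalg.norm(step) < tol:  # l2 norm of the position update
            break
    return s
```

As a toy usage example, minimizing the squared distance to a fixed point with `grad = lambda s: 2.0 * (s - target)` converges to `target` from any starting point; the actual score is non-convex, so the starting point matters.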
Given the measurements with jitter εi,k, resulting in microphone i from the virtual source vk obtained from the true source position, as:
$r_{i,k} = \|v_k - m_i\| + \varepsilon_{i,k}$
we solve the optimization problem:
equivalently,
If the jitter is i.i.d. Gaussian, the optimization problem gives the source position that will generate the echoes with maximum likelihood. Denoting the likelihood of obtaining the set of recorded echoes as:
and taking negative logarithm to get the negative log-likelihood, we have:
where const is a constant that depends on σ. Maximizing the likelihood is equivalent to minimizing the negative log-likelihood, which yields (11). Hence, by solving (11) we obtain the maximum likelihood estimator for the position that generates the set of recorded pulses.
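The step from the likelihood to the least squares problem can be written out as a short worked derivation. This is a sketch under the stated assumption of i.i.d. zero-mean Gaussian jitter with standard deviation σ, using the measurement model above; N denotes the total number of labeled pulses and is introduced here only for illustration.

```latex
% Likelihood of the recorded pulses for a candidate source \hat{s},
% whose virtual sources are \hat{v}_k:
L(\hat{s}) = \prod_{i,k} \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left( -\frac{\left( r_{i,k} - \|\hat{v}_k - m_i\| \right)^2}{2\sigma^2} \right)

% Negative log-likelihood:
-\log L(\hat{s}) = \frac{1}{2\sigma^2} \sum_{i,k}
  \left( r_{i,k} - \|\hat{v}_k - m_i\| \right)^2
  + \frac{N}{2} \log\!\left( 2\pi\sigma^2 \right)
```

The second term does not depend on the candidate position, so minimizing the negative log-likelihood reduces to minimizing the sum of squared residuals.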
However, since the signals coming from the virtual sources are unlabeled, we do not have direct access to ri,k, thus we do not know:
$r_{i,k} - \|\hat{v}_k - m_i\|$.
The minimization problem (10) can be viewed as a heuristic method aiming to solve (11), that estimates
$r_{i,k} - \|\hat{v}_k - m_i\|$
by taking the virtual source that gives the closest time difference to ri,k.
We will now present simulation results of the present invention for source localization with TOA measurements. First, the results of localization in an L-shaped room with and without measurement jitter will be shown. Then it will be shown how the present invention performs for a room with complex geometry that has no parallel walls. We conclude by applying the present invention to tracking a moving source.
It should be noted that in all of the
For testing the developed indoor localization algorithm, we take a typical nonconvex room having L-shaped geometry shown in
Referring to
An outcome of the localization from jitter-free measurements can be seen in FIGS. 14A,B. It can be seen that the present invention finds estimates close to the true source position, and the localization scores mark positions close to the true source as being good. It is also seen that the RIR score chooses, among the reflected virtual sources, the one closest to the true source position. The crossed dot, which depicts the outcome of the optimization algorithm started from the position of the striped dot, is seen to localize the source perfectly.
FIGS. 15A,B show an outcome of localization when there is measurement jitter drawn i.i.d. from a centered Gaussian with σ=0.05. It is seen that although there are reflected sources in the vicinity of the true source position, the best reflected sources in terms of localization scores are away from it. Thus, picking the reflected source with the best localization score as the estimate of the true source position is not a valid option. However, here too, the reflected source with the best RIR score is the one closest to the true source position, and the estimate is further improved by applying the optimization step based on that position.
As the last localization simulation, we show the result of using the present invention for a room with very complex geometry, with measurement jitter drawn i.i.d. from a centered Gaussian with σ=0.1. As can be seen in FIGS. 16A,B, although the reflected sources are distributed over a broad range, the vicinity of the true source position is still dense. Furthermore, although the positions with the best localization scores are spread out, the RIR score picks the one that is closest to the true source position, and the optimization algorithm gives an even closer estimate.
One approach to source tracking is to take measurements at distinct time instances and localize the position independently of the previous ones. However, since the position of the source depends on its history, one can leverage the previous position estimates in localizing the source.
In this simulation we compare the performance of tracking a moving source with two approaches. The first approach goes through all of the steps of the algorithm: recording the signal, finding echoes corresponding to virtual sources and localizing them, reflecting the localized virtual sources, and taking the one with the best (lowest) RIR score, but without applying the optimization algorithm. The second approach localizes the first position using the first approach and additionally applies the optimization algorithm; for the other time instances, it applies only the optimization algorithm, starting from the position estimate of the previous time instance.
FIGS. 17A,B show the results of the two approaches where the source traces the curve:
$s \in \mathbb{R}^2$, where $s_1(t) = 4 - 3\cos^3(2\pi t/120)$ and $s_2(t) = 6.5 + 2\sin^3(2\pi t/120)$ for t=0, 1, 2, . . . , 120, and the jitter is drawn i.i.d. from a centered Gaussian with σ=0.05.
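For reference, the test trajectory can be generated with a few lines. The sketch reads the exponent 3 as the cube of the cosine and sine, which is an assumption about the original notation; the function name is hypothetical.

```python
import numpy as np


def source_trajectory():
    """Positions s(t) = (s1(t), s2(t)) for t = 0, 1, ..., 120."""
    t = np.arange(121)
    phase = 2.0 * np.pi * t / 120.0
    s1 = 4.0 - 3.0 * np.cos(phase) ** 3  # assumed: cube of the cosine
    s2 = 6.5 + 2.0 * np.sin(phase) ** 3  # assumed: cube of the sine
    return np.column_stack([s1, s2])
```

Under this reading the curve is closed, so the source returns to its starting position at t=120.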
We will now discuss the objective functions behind the optimization step and plot the average localization error for the special case of a square room through simulations.
s=(4,5)T
and first order virtual sources generated from this position.
$E_l := \|s - \hat{s}\|$.
In order to find the optimal position, the algorithm is started from the vicinity of the true source position and the gradient descent algorithm is used. The plot shows that the localization error based on minimization of (11) increases smoothly as the jitter is increased.
However, as explained above, since we do not have the labels for the echoes, we cannot minimize (11), so we estimate its solution by using the heuristic method (10). The contours for (10) are plotted in
Finally, we will now describe
Processor unit 304 serves to execute instructions for software that may be loaded into memory 306. Processor unit 304 may be a set of one or more processors or may be a multi-processor core, depending on the particular implementation. Further, processor unit 304 may be implemented using one or more heterogeneous processor systems in which a main processor is present with secondary processors on a single chip. As another illustrative example, the processor unit 304 may be a symmetric multi-processor system containing multiple processors of the same type.
In some embodiments, the memory 306 shown in
The communications unit 310 shown in
The input/output unit 312 shown in
Instructions for the operating system and applications or programs are located on the persistent storage 308. These instructions may be loaded into the memory 306 for execution by processor unit 304. The processes of the different embodiments may be performed by processor unit 304 using computer implemented instructions, which may be located in a memory, such as memory 306. These instructions are referred to as program code, computer usable program code, or computer readable program code that may be read and executed by a processor in processor unit 304. The program code in the different embodiments may be embodied on different physical or tangible computer readable media, such as memory 306 or persistent storage 308.
Program code 316 is located in a functional form on the computer readable media 318 that is selectively removable and may be loaded onto or transferred to data processing system 300 for execution by processor unit 304. Program code 316 and computer readable media 318 form a computer program product 320 in these examples. In one example, the computer readable media 318 may be in a tangible form, such as, for example, an optical or magnetic disc that is inserted or placed into a drive or other device that is part of persistent storage 308 for transfer onto a storage device, such as a hard drive that is part of persistent storage 308. In a tangible form, the computer readable media 318 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory that is connected to data processing system 300. The tangible form of computer readable media 318 is also referred to as computer recordable storage media. In some instances, computer readable media 318 may not be removable.
Alternatively, the program code 316 may be transferred to data processing system 300 from computer readable media 318 through a communications link to communications unit 310 and/or through a connection to input/output unit 312. The communications link and/or the connection may be physical or wireless in the illustrative examples. The computer readable media also may take the form of non-tangible media, such as communications links or wireless transmissions containing the program code.
The different components illustrated for data processing system 300 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a data processing system including components in addition to or in place of those illustrated for data processing system 300. Other components shown in
Therefore, as explained at least in connection with
A further embodiment of the present invention provides a computer data carrier storing presentation content created while employing the methods of the present invention.
Although the present invention has been described in more detail in connection with its embodiment for determining the location of a loudspeaker or a microphone, the present invention finds applicability in connection with many other fields.
The present invention can be used for determining the exact position of a receiver r, which is a person in the
Knowing the position of the satellite s, the position of the buildings B1, B2, etc. (this is possible e.g. by using an electronic map) and applying the method according to the invention, it is possible to accurately locate the mobile device r and then the person, without any error.
Number | Date | Country | Kind |
---|---|---|---|
2935/12 | Dec 2012 | CH | national |
Number | Date | Country | |
---|---|---|---|
61919316 | Dec 2013 | US |
Number | Date | Country | |
---|---|---|---|
Parent | PCT/EP2013/077694 | Dec 2013 | US |
Child | 14575912 | US |