Motion sensor assisted room shape reconstruction and self-localization using first-order acoustic echoes

BACKGROUND OF THE INVENTION
1. Field of the Invention

The present invention relates to simultaneous indoor localization and, more particularly, to room shape reconstruction using a single mobile computing device.

2. Description of the Related Art

With the development of mobile devices, many applications related to public safety, medical care, or commercial use have become available by using sensory information collected by the devices. In many cases, these applications rely heavily on the localization feature provided by the devices. Localization has therefore become an integral part of applications in which location information is critical.

Outdoor localization is largely considered a solved problem. The satellite-based Global Positioning System (GPS) provides satisfactory accuracy and coverage in most outdoor environments. However, it cannot offer acceptable performance for indoor localization, as the microwave signals are heavily attenuated when penetrating construction materials. In addition, the multi-path propagation caused by reflections off construction surfaces leads to significant losses of localization accuracy.

Indoor localization has been an active research area in recent years. Most work focuses on simultaneous localization and mapping (SLAM), which builds a map of the environment while determining the device's position within the map. Several techniques have been demonstrated to be effective for indoor localization, such as those utilizing location-specific signatures from WiFi, Bluetooth, and UWB signals as well as LED light. Most existing techniques require some prior information about the surrounding environment, such as anchor nodes in a UWB-based system whose positions are fixed and known. Additionally, these techniques invariably require infrastructure that is functioning (i.e., powered up) during the localization and mapping process. There are applications, however, where indoor mapping and localization may be required in the absence of pre-established infrastructure. A simple example is the need of first responders when a natural disaster leads to a power outage that in turn renders any pre-established infrastructure inaccessible.

BRIEF SUMMARY OF THE INVENTION

The present invention comprises a device for performing simultaneous localization and mapping in an enclosed space having a loudspeaker capable of emitting a predetermined sound, a microphone co-located with the loudspeaker, and a processor interconnected to the microphone, wherein the processor is programmed to receive a series of echoes of the predetermined sound when emitted by the loudspeaker from a corresponding series of non-collinear locations within the enclosed space and to determine the shape of the enclosed space based on the series of echoes from the corresponding series of locations. The processor is programmed to determine the shape of the enclosed space by measuring the distance between each of the series of locations and a preceding one of the series of locations. The processor is programmed to determine the shape of the enclosed space by measuring the distance between each of the series of locations and all walls of the enclosed space. The processor is programmed to determine the shape of the enclosed space by identifying first-order echoes from within the series of echoes received at each of the corresponding series of locations. The processor is programmed to determine the shape of the enclosed space by reconstructing all possible shapes of the enclosed space and selecting the shape with the greatest number of edges. The processor is programmed to determine the location of each of the series of non-collinear locations within the shape of the enclosed space. The predetermined sound may comprise a chirp signal sweeping from a first frequency to a second frequency, where the first frequency is 30 Hz and the second frequency is 8 kHz.

The invention thus involves a single mobile computing device that is equipped with a loudspeaker and a microphone as well as various motion sensors and that is programmed to perform room shape reconstruction. The requisite equipment is generally available in conventional smartphones and laptop computers, which can be programmed using applications to implement the present invention. The invention provides a technology that allows simultaneous room shape recovery and self-localization without any external infrastructure. Mobile device 10 serves as a co-located acoustic transmitter and receiver that emits a probing signal and receives its acoustic echoes; together with the information gathered through internal sensors, the device can autonomously reconstruct any 2-D convex polygonal room shape while self-localizing with respect to the reconstructed room shape.

The present invention also encompasses a method of performing simultaneous localization and mapping in an enclosed space, comprising the steps of providing a loudspeaker capable of emitting a predetermined sound, emitting the predetermined sound from the loudspeaker from each of a series of locations within the enclosed space, receiving a corresponding series of echoes of the predetermined sound from each of the series of locations with a microphone co-located with the loudspeaker, and using a processor interconnected to the microphone to determine shape of the enclosed space based on the series of echoes received from the corresponding series of locations.

The method of the present invention can thus use a single mobile device with acoustic features and motion sensors to simultaneously recover the room shape and localize the device itself. The effectiveness of the invention was demonstrated for SLAM in 2-D convex polygonal rooms. In the method of the invention, the mobile device serves as a co-located acoustic transmitter and receiver. Specifically, it transmits a probing signal to excite the acoustic response of the indoor environment, and receives and records the echoes. By measuring the time of arrival (ToA) of the echoes, the distance between mobile device 10 and each reflector (wall) can be recovered. To reconstruct the environment from the ToA information, it has been proved that the transmission-reception process needs to be performed at least three times, at three distinct non-collinear positions. Moreover, to improve the reconstruction, the inertial sensors mounted in the mobile device, such as the accelerometer and magnetometer, are used to track the device's trajectory. However, the motion direction information estimated by the inertial sensors is known to be highly inaccurate and will not lead to acceptable performance for localization and mapping. Therefore, in this method, only the path lengths, i.e., the distances between consecutive measurement points, are estimated and used. Given the ToA information collected at three distinct non-collinear measurement points and the distance information between consecutive measurement points, the developed technology can reconstruct any convex polygon in 2-D, as well as localize the device itself using acoustic echoes. Thus, in the technique of the present invention, 2-D SLAM can be achieved by using the acoustic functions and motion sensors of a single mobile device, without any pre-established infrastructure or external power supply.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING(S)

The present invention will be more fully understood and appreciated by reading the following Detailed Description in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic of a room shape reconstruction system according to the present invention;

FIG. 2 is a schematic of the path of a mobile device for a room shape reconstruction system according to the present invention;

FIG. 3 is a schematic of the path angles in a room shape reconstruction system according to the present invention;

FIG. 4 is a flowchart of a process for room shape reconstruction according to the present invention;

FIG. 5 is a schematic of an image source model according to the present invention;

FIG. 6 is a schematic of room shape geometry according to the present invention;

FIG. 7 is another schematic of room shape geometry according to the present invention;

FIG. 8 is another schematic of a mobile device employed to measure the geometry of a room according to the present invention;

FIG. 9A is a graph of a transmitted signal convolved with itself according to the present invention;

FIG. 9B is a graph of a transmitted signal convolved with its windowed version according to the present invention;

FIG. 10A is a graph of correlator outputs according to the present invention;

FIG. 10B is a graph of correlator outputs according to the present invention;

FIG. 10C is a graph of correlator outputs according to the present invention;

FIG. 11A is a graph of peak detection according to the present invention;

FIG. 11B is a graph of peak detection according to the present invention;

FIG. 11C is a graph of peak detection according to the present invention; and

FIG. 12 is a graph of a comparison between the ground truth and experiment result according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Referring to the figures, wherein like numerals refer to like parts throughout, there is seen in FIG. 1 a mobile computer device 10 having an internal processor 12 as well as a co-located loudspeaker 14 and microphone 16 and a motion sensor 18, such as an accelerometer. Device 10 is programmed to use the data from loudspeaker 14, microphone 16, and motion sensor 18 for room shape reconstruction according to the present invention. Mobile device 10 may be a conventional smartphone, laptop computer, etc. that has the required components and is programmed via an application to process the signals received from co-located loudspeaker 14, microphone 16, and motion sensor 18.

As seen in FIG. 1, device 10 may be positioned inside a room 20 having a plurality of walls 22, 24, 26, and 28. Device 10 is used to take measurements at different points of the room, at a minimum three measurement points that are not on the same line. For example, a user may hold device 10 and simply walk around the room so long as he or she is not traveling along a straight line. The measurements are processed together with the information extracted from internal sensors to recover the 2-D room shape (which in the example of FIG. 1 is a simple rectangle) as well as the relative locations of all measurement points. Once this is accomplished, device 10 can easily track itself in real time as it travels around the room.

Mobile device 10 provides co-located loudspeaker 14 and microphone 16, and is moved around inside the room whose shape is to be reconstructed. At each measurement point, device 10 is programmed to emit a probing acoustic signal s(t) and to receive and record the echoes r(t). As seen in FIG. 2, mobile device 10 travels along the path O₁-O₂-O₃ inside the polygonal room enclosed by six walls W₁-W₆. The proposed technique is able to reconstruct both the 2-D room shape, i.e., W₁-W₆, and the device locations O₁-O₂-O₃.

Image Source Model

The basic technique to link the acoustic echoes and the room shape begins with a classic model widely used in acoustics and optics, called the image source model (ISM). In FIG. 2, the first-order image sources of O₁ with respect to the edges W₁ and W₆ are shown as S₁⁽¹⁾ and S₁⁽⁶⁾. There are also second-order and even higher-order image sources, which induce higher-order echoes, such as the reflecting path O₂-R_2,2-R_2,4-O₂. To reconstruct the room shape, the parameters of interest are the distances between the source and each reflector (wall), each of which is precisely half the distance between the source and its first-order image with respect to that wall. If the distance between the i-th wall and the j-th source is denoted as r_j,i, it is related to the time of arrival (ToA) of the corresponding echo by the following formula:

$τ_{i}^{(j)} = \frac{2 r_{j, i}}{c},$

where τ_i^(j) is the round-trip travelling time of the probing signal that is reflected by the edge W_i and returns to the source O_j, and c is the speed of sound. Here, the emission time is assumed to be t=0.

All the distances collected at a single source are denoted as a vector $\vec{r}_j$. It can be shown that three sets of distances $\vec{r}_j$, or equivalently, three sets of ToA information collected respectively at three distinct, non-collinear locations inside the room, are sufficient to reconstruct the room shape. As seen in FIG. 2, mobile device 10 moves along a random path and collects information at at least three sources O₁-O₃.

Room Shape Reconstruction and Self-Localization with Known Path Lengths

With only first-order echo information, it is conventionally known that, without any additional information such as the relative distances of the measurement points, it is impossible to reconstruct all 2-D convex polygons. In particular, if the room shape is a rectangle, it has been shown that there are infinitely many parallelograms of completely different shapes that yield the same set of first-order echoes. However, with various internal sensors, it is now feasible using the present invention to measure, for example, the distance between two measurement points, or even the angles if the user of device 10 walks along a different straight line after each measurement point. In the present invention, the distance measured by motion sensor 18 between two neighboring measurement points is used to supply the necessary information. Specifically, the distances between O₁ and O₂, as well as between O₂ and O₃, which are denoted as d₁₂ and d₂₃ in FIG. 2, may be measured. With this additional information, the steps below may be used to establish the 2-D room shape while self-localizing with respect to the reconstructed room shape.

Peak Detection

To achieve better resolution for determining τ_i^(j), or equivalently r_j,i, wide-band signals are usually used. In the case of acoustic signals, a chirp signal is used because its auto-correlation provides a good approximation to the Dirac delta function. Therefore, to obtain the ToA information, the received signal r(t) is first convolved with the probing signal s(t). Whenever there is an echo (first or higher order), a peak occurs at the output of the correlator. The first and most significant peak corresponds to the line-of-sight (LOS) path (i.e., the signal received directly by the microphone without reflecting off any wall). This LOS arrival time is recorded and subtracted from the arrival times of subsequent echoes; the differences are precisely the times each echo travels along its path. All detected echoes are collected into the distance set $\vec{r}_j$ at each source O_j.
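The peak detection step can be illustrated with a minimal sketch. It assumes that the correlator is a plain cross-correlation (matched filter) of the recording with the probe, that the earliest detected peak is the LOS arrival, and that peaks are picked greedily with a guard interval; the function names, the sampling rate, the speed of sound of 343 m/s, and the guard parameter are illustrative choices rather than details taken from the description.

```python
import math

def make_chirp(f0, f1, n_samples, fs):
    """Linear chirp sweeping from f0 to f1 (Hz) over n_samples samples at rate fs."""
    duration = n_samples / fs
    sweep_rate = (f1 - f0) / duration
    return [math.sin(2 * math.pi * (f0 * t + 0.5 * sweep_rate * t * t))
            for t in (n / fs for n in range(n_samples))]

def correlate(received, probe):
    """Correlator output m[k] = sum_n received[k + n] * probe[n] for every full overlap."""
    n_lags = len(received) - len(probe) + 1
    return [sum(received[k + n] * p for n, p in enumerate(probe))
            for k in range(n_lags)]

def pick_peaks(corr, n_peaks, guard):
    """Greedy peak picking: take the largest |corr|, blank +/- guard lags around it, repeat."""
    mag = [abs(v) for v in corr]
    peaks = []
    for _ in range(n_peaks):
        k = max(range(len(mag)), key=mag.__getitem__)
        peaks.append(k)
        for i in range(max(0, k - guard), min(len(mag), k + guard + 1)):
            mag[i] = 0.0
    return sorted(peaks)

def echo_distances(peak_lags, fs, c=343.0):
    """Treat the earliest peak as LOS, subtract it, and halve the round-trip path length."""
    los = peak_lags[0]
    return [0.5 * c * (k - los) / fs for k in peak_lags[1:]]
```

A large guard interval suppresses the auto-correlation sidelobes of strong arrivals but also hides echoes that arrive closer together than the guard, so in practice the value would need to be tuned to the probe bandwidth.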

Reconstruction for the Ideal Case

Consider a convex planar K-polygon as shown in FIG. 3. Without loss of generality, the origin of the coordinate system is fixed at O₁, and the x-axis is chosen to be towards O₂. As indicated in FIG. 3, the path angle is denoted as π-φ, which is assumed to be within (0,π). Then, it is straightforward to show that

(r_2,i−r_1,i)+d₁₂cos θ_i=0, (1)
(r_3,i−r_2,i)+d₂₃cos(θ_i−φ)=0. (2)
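As a brief check of (1) and (2), assume (this is not stated explicitly above) that θ_i is the angle between the x-axis and the outward unit normal n_i=(cos θ_i, sin θ_i) of the ith wall, and that φ is the heading of O₂O₃ measured from the x-axis, so that o₂=(d₁₂, 0) and o₃−o₂=d₂₃(cos φ, sin φ). The distance from a point o to the ith wall is then r_1,i−⟨n_i, o⟩, which gives

$\begin{aligned} r_{2, i} &= r_{1, i} - \langle n_{i}, o_{2} \rangle = r_{1, i} - d_{12} \cos θ_{i}, \\ r_{3, i} &= r_{2, i} - \langle n_{i}, o_{3} - o_{2} \rangle = r_{2, i} - d_{23} \cos (θ_{i} - φ). \end{aligned}$

Moving all terms to one side yields (1) and (2).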

The ideal case refers to the case in which echoes corresponding to different walls are correctly labeled at different nodes, and only the first-order echoes are present in the distance sets. Thus, in each $\vec{r}_j$, the r_j,i's are sorted in the same order i=1, . . . , K, though they may not arrive in this order. The system needs to determine the uniqueness of φ and the θ_i's according to (1) and (2). The solutions to (1) and (2) are given by:

$θ_{i} = \pm \arccos \left(- \frac{r_{2, i} - r_{1, i}}{d_{12}}\right), \quad (3)$
$θ_{i} - φ = \pm \arccos \left(- \frac{r_{3, i} - r_{2, i}}{d_{23}}\right), \quad (4)$

and these two equations yield four possible sign combinations. However, only two sign combinations satisfy (1) and (2) simultaneously for all i=1, . . . , K, and those two are reflections of each other with respect to O₁O₂.

Notice that in such a coordinate system, the first two sources are located at (0, 0) and (d₁₂, 0), and once φ and the θ_i's are determined, the coordinates of O₃ are determined as well. Hence, self-localization can be accomplished.
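The ideal case can be sketched as follows. The sketch assumes correctly labeled first-order distances, takes cos θ_i and cos(θ_i−φ) directly from (1) and (2), compares angles modulo 2π, and keeps only φ in (0,π) to discard the mirrored solution. The function names and the tolerance are hypothetical, and O₃ is computed as (d₁₂+d₂₃cos φ, d₂₃sin φ) under the assumption that φ is the heading of O₂O₃ relative to the x-axis.

```python
import math

def wrap(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def solve_ideal_case(r1, r2, r3, d12, d23, tol=1e-6):
    """Return (phi, thetas, o3) with phi in (0, pi), or None if no consistent solution exists."""
    alphas = [-(b - a) / d12 for a, b in zip(r1, r2)]   # cos(theta_i) from (1)
    betas = [-(c - b) / d23 for b, c in zip(r2, r3)]    # cos(theta_i - phi) from (2)
    if any(abs(v) > 1.0 for v in alphas + betas):
        return None  # arccos undefined: these distances cannot satisfy (1) and (2)
    acos_a = [math.acos(v) for v in alphas]
    acos_b = [math.acos(v) for v in betas]
    signs = [(s, t) for s in (1, -1) for t in (1, -1)]
    for s0, t0 in signs:
        # Candidate phi from the first wall: phi = s*arccos(alpha) - t*arccos(beta).
        phi = wrap(s0 * acos_a[0] - t0 * acos_b[0])
        if not 0.0 < phi < math.pi:
            continue  # keep O3 above the x-axis
        thetas = []
        for a, b in zip(acos_a, acos_b):
            hits = [s * a for s, t in signs if abs(wrap(s * a - t * b - phi)) < tol]
            if not hits:
                break  # this wall admits no sign combination with the candidate phi
            thetas.append(hits[0])
        else:
            o3 = (d12 + d23 * math.cos(phi), d23 * math.sin(phi))
            return phi, thetas, o3
    return None
```

With noisy measurements, the tolerance would have to be relaxed and the cosines clipped or rejected near ±1; the sketch does neither.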

Echo Labelling

In practice, the received echoes are not correctly labeled at the different measurement points, i.e., one does not know a priori which first-order echoes correspond to the same wall; notice that at different nodes, echoes from different walls may not arrive in the same order. In addition, $\vec{r}_j$ may contain higher-order echoes. Therefore, the higher-order echoes have to be eliminated, and the first-order echoes have to be labeled in the correct order; this is done by trying different echo combinations to solve (1) and (2). With random measurement points, no solution to (1) and (2) can be obtained for all i=1, . . . , K except for the correct set of first-order echoes. Let the length of $\vec{r}_j$ be denoted as N_j, and let N=min{N₁,N₂,N₃}. To find the correct labels of the echoes, each combination of K distances is selected from each distance set $\vec{r}_j$ and plugged into (3) and (4) to determine whether it yields a valid solution to (1) and (2) for all i=1, . . . , K. As the actual number of walls is not known a priori, K needs to vary from 3 to N, corresponding to polygons with a varying number of sides. There may be multiple polygons satisfying (1) and (2); e.g., if the original shape is a pentagon, it is possible that four sets of first-order echoes will also correctly solve (1) and (2), yielding a quadrilateral. Thus, the polygon with the greatest number of edges is chosen as the final reconstructed shape.
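A brute-force version of this search might look like the sketch below. It assumes noise-free distances, tries K from N down to 3 so that the polygon with the most edges is found first, and accepts a labeling when every echo triple admits the same φ in (0,π). The names `label_echoes` and `consistent_solution` and the tolerance are hypothetical, and the exhaustive enumeration is only practical for small echo sets.

```python
import itertools
import math

def wrap(angle):
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def phi_theta_candidates(x1, x2, x3, d12, d23):
    """All (phi, theta) pairs allowed by one echo triple, or [] if arccos is undefined."""
    alpha = -(x2 - x1) / d12
    beta = -(x3 - x2) / d23
    if abs(alpha) > 1.0 or abs(beta) > 1.0:
        return []
    a, b = math.acos(alpha), math.acos(beta)
    return [(wrap(s * a - t * b), s * a) for s in (1, -1) for t in (1, -1)]

def consistent_solution(triples, d12, d23, tol):
    """Return (phi, thetas) if every triple admits the same phi in (0, pi), else None."""
    cands = [phi_theta_candidates(x1, x2, x3, d12, d23) for x1, x2, x3 in triples]
    if any(not c for c in cands):
        return None
    for phi, theta0 in cands[0]:
        if not 0.0 < phi < math.pi:
            continue
        thetas = [theta0]
        for c in cands[1:]:
            hits = [th for ph, th in c if abs(wrap(ph - phi)) < tol]
            if not hits:
                break
            thetas.append(hits[0])
        else:
            return phi, thetas
    return None

def label_echoes(set1, set2, set3, d12, d23, tol=1e-6):
    """Search for the largest K and a labeling whose K triples solve (1) and (2) jointly."""
    n = min(len(set1), len(set2), len(set3))
    for k in range(n, 2, -1):  # prefer the polygon with the most edges
        for c1 in itertools.combinations(set1, k):
            for p2 in itertools.permutations(set2, k):
                for p3 in itertools.permutations(set3, k):
                    triples = list(zip(c1, p2, p3))
                    sol = consistent_solution(triples, d12, d23, tol)
                    if sol is not None:
                        return k, triples, sol
    return None
```

The function returns the number of walls, the matched echo triples (one per wall), and the recovered (φ, θ_i's), or `None` when no labeling is consistent.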

Self-Localizing

Once the 2-D room shape is reconstructed from at least three measurement points, the coordinates of the three measurement points are automatically recovered in the process. Subsequently, echoes collected at other points are used to determine the locations of those points, i.e., self-localization can be readily accomplished.
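One way to carry out this subsequent localization is sketched below. The approach is an assumption rather than a procedure given in the description: each reconstructed wall is represented by its normal angle θ_i and its distance r_1,i from the origin O₁, a new position o satisfies cos θ_i·x+sin θ_i·y=r_1,i−r_i, and unlabeled echoes are assigned to walls by trying every permutation and keeping the least-squares fit with the smallest residual. All function names are hypothetical.

```python
import itertools
import math

def least_squares_position(thetas, rhs):
    """Solve cos(theta_i) x + sin(theta_i) y = rhs_i in the least-squares sense.

    Returns ((x, y), residual_norm) using the 2x2 normal equations.
    """
    sxx = sxy = syy = bx = by = 0.0
    for th, v in zip(thetas, rhs):
        cx, cy = math.cos(th), math.sin(th)
        sxx += cx * cx
        sxy += cx * cy
        syy += cy * cy
        bx += cx * v
        by += cy * v
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("wall normals are (nearly) parallel")
    x = (syy * bx - sxy * by) / det
    y = (sxx * by - sxy * bx) / det
    res = math.sqrt(sum((math.cos(th) * x + math.sin(th) * y - v) ** 2
                        for th, v in zip(thetas, rhs)))
    return (x, y), res

def localize(thetas, r_origin, echoes):
    """Try every assignment of unlabeled echo distances to walls; keep the best fit."""
    best = None
    for perm in itertools.permutations(echoes, len(thetas)):
        rhs = [r0 - r for r0, r in zip(r_origin, perm)]
        pos, res = least_squares_position(thetas, rhs)
        if best is None or res < best[1]:
            best = (pos, res)
    return best
```

With at least three non-parallel walls the system is overdetermined, so the residual separates the correct assignment from wrong ones in the noise-free case.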

The concrete steps of the system are seen in FIG. 4.

Room Impulse Response Model

Acoustic signal propagation from a loudspeaker to a microphone in a room can be described by the room impulse response (RIR), which can be formulated as the summation of both line-of-sight (LOS) and reflected components. In practice, if the microphone and loudspeaker are much closer to each other compared to the distance between the device and the walls, the device is referred to as a co-located device. For a co-located device at a measurement point denoted by O_j, the RIR is

$h^{(j)} (t) = \sum_{i} α_{i}^{(j)} δ (t - τ_{i}^{(j)})$

where the α_i^(j)'s and τ_i^(j)'s are the path gains and delays from the transmitter to the receiver, respectively. Since higher-order reflective paths typically have much weaker power than lower-order ones, the RIR can be approximated by its first N_j+1 components, comprising the LOS path and N_j reflective paths:

$h^{(j)} (t) \approx \sum_{i = 0}^{N_{j}} α_{i}^{(j)} δ (t - τ_{i}^{(j)}),$

It is assumed that the N_j reflective paths contain all first-order reflections and those higher-order ones that are detectable. Given the transmitted signal s(t), the received signal at O_j is

r^(j)(t)=s(t)*h^(j)(t)+ω(t),

where ω(t) is the additive noise. The τ_i^(j)'s can be obtained directly from r^(j)(t) if s(t−τ_i^(j)) decays before s(t−τ_i+1^(j)) arrives at the receiver. However, it is difficult to generate such acoustic signals, which require extremely wide bandwidth. A better way to obtain the τ_i^(j)'s is to consider the correlator output:

m^(j)(t)=r^(j)(t)*s(t).

If s(t) has a good auto-correlation property, the first peak of m^(j)(t) corresponds to the LOS component, while the other peaks correspond to reflective components. Hence, the time difference of arrival (TDOA) can be obtained even when the loudspeaker and microphone are not synchronized. The present invention uses chirp signals, which are easy to generate and have good auto-correlation properties.

Since the loudspeaker and microphone are co-located, τ₀, which corresponds to the delay of the LOS path, is close to zero. Define a column vector

${\tilde{r}}_{j} = \left\{ \frac{(τ_{i}^{(j)} - τ_{0}^{(j)}) c}{2} \right\}_{i = 1}^{N_{j}},$

where c is the speed of sound. Then $\tilde{r}_j$ contains all the distances between the device and the walls. Hence, synchronization between the loudspeaker and the microphone is not required for a co-located device if only the distances between the measurement point and the walls are of interest.

Image Source Model

In the conventional image source model, reflections within a constrained space can be viewed as LOS propagation from virtual sources to the receiver in free space. Suppose the coordinate of O_j is denoted by o_j. As shown in FIG. 5, the first-order image source of O_j with respect to the ith wall is

$\tilde{o}_{j, i} = 2 \langle p_{i} - o_{j}, n_{i} \rangle n_{i} + o_{j},$

where p_i is any point on the ith wall and n_i is the outward unit normal vector of the ith wall. Thus

$τ_{i}^{(j)} = \frac{\| {\tilde{o}}_{j, i} - o_{j} \|_{2}}{c} .$

Let r_j,i be the distance between O_j and the ith wall; then r_j,i=½τ_i^(j)c, which is equal to half of the distance between o_j and õ_j,i. The second-order image source of O_j with respect to the ith and the kth walls is

$\tilde{o}_{j, ik} = 2 \langle p_{k} - \tilde{o}_{j, i}, n_{k} \rangle n_{k} + \tilde{o}_{j, i}.$

Similarly, let r_j,ik be half of the distance between o_j and õ_j,ik. Following the same steps, higher-order image sources can be represented by lower-order image sources. Then $\tilde{r}_j$ is associated with the image sources. The term echo is used to refer to either the delay τ_i^(j) or the corresponding distance when no ambiguity arises.
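The image source construction can be illustrated with a short sketch. It assumes 2-D points stored as tuples, unit-length wall normals, and a speed of sound of 343 m/s; the function names are illustrative.

```python
import math

def reflect(point, wall_point, unit_normal):
    """Mirror `point` across the wall through `wall_point` with unit normal `unit_normal`.

    Implements o + 2 <p - o, n> n, the first-order image source formula.
    """
    dot = sum((p - o) * n for p, o, n in zip(wall_point, point, unit_normal))
    return tuple(o + 2.0 * dot * n for o, n in zip(point, unit_normal))

def half_distance(source, image):
    """Half the source-to-image distance; for a first-order image this is the wall distance."""
    return 0.5 * math.hypot(image[0] - source[0], image[1] - source[1])

def echo_delay(source, image, c=343.0):
    """Propagation delay of the echo associated with an image source."""
    return math.hypot(image[0] - source[0], image[1] - source[1]) / c
```

A second-order image is obtained by applying `reflect` to a first-order image with the parameters of the second wall, mirroring the recursion in the formulas above.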

Two Extreme Cases

There are some special cases for room shape reconstruction and mobile device location. For instance, suppose distances between each pair of measurement points are given and the three measurement points are not collinear. In this case, only the room shape is of interest. By geometry, there exists at most one common tangent line for three circles with non-collinear centers. Thus, the room shape is uniquely determined by first-order echoes.

The second special case is when the reconstruction is free of geometry information about the measurement points. In this case, both the room shape and the position of the device are of interest. The conventional art has shown that a large class of convex polygons can be reconstructed from first-order echoes that are correctly labeled. The basic idea is that many convex polygons can be generated by the intersection of a triangle and some lines. As long as the triangle is obtained, the coordinates of the measurement points are also determined. Therefore, the rest of the reconstruction work is exactly the same as in the previous case. However, parallelograms cannot be reconstructed uniquely under this assumption.

Recovery with Known Path Lengths

Geometry

Consider a convex planar K-polygon. As shown in FIG. 6, mobile device 10 with co-located loudspeaker 14 and microphone 16 emits pulses and receives echoes at {O_j}_j=1^3. Without loss of generality, we assume that O₁ is the origin, O₂ lies on the x-axis, and O₃ lies above the x-axis. Let φ=(π−∠O₁O₂O₃)ϵ(0,π), and let the lengths of O₁O₂ and O₂O₃ be denoted by d₁₂ and d₂₃, respectively. Suppose mobile device 10 is capable of measuring its path length when moving from one place to another, i.e., d₁₂ and d₂₃ are known by the device. Our goal is to simultaneously determine the room shape and the coordinates of O₃ from first-order echoes. If φϵ(0, 2π), i.e., we do not have control of where to place O₃, then the reconstruction is subject to reflection ambiguity (cf. Theorem III.3).

From FIG. 6, it is straightforward to show that

(r_2,i−r_1,i)+d₁₂cos θ_i=0,
(r_3,i−r_2,i)+d₂₃cos(θ_i−φ)=0.

Ideal Case

Let r_j={r_j,i}_i=1^K be a column vector. It is assumed that for all j's, the one-to-one mapping f_j: r_j ↦ $\tilde{r}_j$ is known. In other words, the r_j,i's have been correctly chosen from $\tilde{r}_j$ for j=1, 2, 3 and i=1, . . . , K. In the remainder of this description, we say that the received echoes are grouped if echoes are chosen from the $\tilde{r}_j$'s according to the f_j's. The remaining problem is to determine the uniqueness of the θ_i's and φ given (3) and (4).

Define

$α_{ii'} = - \frac{r_{2, i} - r_{1, i'}}{d_{12}} \quad \text{and} \quad β_{ii'} = - \frac{r_{3, i'} - r_{2, i}}{d_{23}} .$

For simplicity we denote α_ii and β_ii by α_i and β_i, respectively. Given correctly labeled echoes, by (3) and (4), we have

θ_i=±arccos α_i and θ_i−φ=±arccos β_i. (5)

Thus, there are four possible sign combinations for a given i,

θ_i=arccos α_i and θ_i−φ=arccos β_i, (6)
θ_i=arccos α_i and θ_i−φ=−arccos β_i, (7)
θ_i=−arccos α_i and θ_i−φ=arccos β_i, (8)
θ_i=−arccos α_i and θ_i−φ=−arccos β_i. (9)

Definition III.1. Given a room R and a location O, O is feasible if the co-located device at O can receive all the first-order echoes of a signal emitted at O.

Lemma III.1. Suppose O₁, O₂ and O₃ are feasible and not collinear. Given grouped first-order echoes, with probability 1, there exist exactly two sign combinations such that (3) and (4) hold simultaneously for all i if φ and the directions of both $\overrightarrow{O_1O_2}$ and $\overrightarrow{O_2O_3}$ are randomly chosen. The two possible sign combinations have opposite signs for φ and all θ_i's and correspond to reflections of each other.

Proof.

Assume that the ground truth of the polygon is (6) for all iϵ{1, . . . ,K}. Note that (6) implies that (9) holds for θ′_i=−θ_i and φ′=−φ for all i, which is the reflection of the room.

Suppose multiple sign combinations hold for a wall. Without loss of generality, let i=1. From (6) we have

φ=arccos α₁−arccos β₁. (10)

Assume that one of the following equations also holds,

φ=−arccos α₁−arccos β₁, (11)
φ=arccos α₁+arccos β₁, (12)
φ=−arccos α₁+arccos β₁. (13)

Then, the following three cases exist:

1) If (10) and (11) hold, θ₁=0, which implies that O₁O₂ is perpendicular to the first wall, and φ=−arccos β₁.

2) If (10) and (12) hold, arccos β₁=0, which implies that O₂O₃ is perpendicular to the first wall.

3) If (10) and (13) hold, φ=0, which contradicts the assumption that O₁, O₂ and O₃ are not collinear.

With probability 1, the first two cases do not occur since both φ and the directions of $\overrightarrow{O_1O_2}$ and $\overrightarrow{O_2O_3}$ are randomly chosen.

If a subset of (7)-(9) holds for i and i′ simultaneously, then (θ_i,θ_i′)ϵ{θ_i=0,θ_i=φ,φ=0}×{θ_i′=0,θ_i′=φ,φ=0}, which, again, does not occur due to the randomly chosen measurement points. Similarly, it can be shown that for more than two walls, (6) implies that none of (7)-(9) holds for all walls.

Echo Labeling

Since echoes may arrive in different orders at different O_j's and $\tilde{r}_j$ contains higher-order echoes if N_j>K, f_j is unknown. Then the θ_i's and φ are also unknown. Therefore, we need to find the mapping f_j first. We can then estimate the θ_i's, the room shape, and the location of the device. We say the received echoes are ungrouped if echoes are chosen according to f′_j≠f_j for some j.

Lemma III.2. Given ungrouped echoes, with probability 1, there are only two possible cases:

1) there exists no solution to (3) and (4), given no parallel edges.

2) the reconstructed room shape has larger dimension with respect to parallel edges.

Proof. The proof is illustrated by considering only the case of K=4. The result can be easily extended to K=3 and K>4.

The ground truth is (6) for all i. Consider first parallelograms, excluding odd higher-order echoes resulting from a pair of parallel walls. The distances between O_j (j=1,2,3) and the four walls satisfy

r_1,1+r_1,2=r_2,1+r_2,2=r_3,1+r_3,2=a (14)
and
r_1,3+r_1,4=r_2,3+r_2,4=r_3,3+r_3,4=b (15)

We can see that for some f_j's, pairs of {α_ii′,β_ii′} (i,i′ϵ{1, 2, 3, 4}) are related to each other. Consider, for example, the f_j's resulting in {α₁₂,α₂₁,α₃₄,α₄₃} and {β₁₂,β₂₁,β₃₄,β₄₃}. Since α₁₂+α₂₁=0, α₃₄+α₄₃=0, β₁₂+β₂₁=0 and β₃₄+β₄₃=0, we have

arccos α₂₁=π±arccos α₁₂
arccos α₄₃=π±arccos α₃₄
arccos β₂₁=π±arccos β₁₂
arccos β₄₃=π±arccos β₃₄

Thus (5) reduces to two equations.

φ=±arccos α₁₂±arccos β₁₂
φ=±arccos α₃₄±arccos β₃₄

With probability 1, these two equations do not hold simultaneously, as α₁₂, β₁₂ are independent of α₃₄, β₃₄ due to the randomly chosen measurement points. Other f′_j(≠f_j)'s always have at least two equations with independent choices of α and β. Hence, no solution can be found for those instances.

Suppose the f′_j's are chosen such that we have α_ii′ and β_ii′ (i≠i′, i′≠i″). For rooms with no more than one pair of parallel walls, almost surely only echoes chosen according to f′_j's can make (6) hold for all i. This is because for those rooms, at least one of (14) and (15) does not hold. Thus some α_ii′'s and β_ii′'s are not related, since r_1i′, r_2i, and r_3i″ are randomly chosen from $\tilde{r}_1$, $\tilde{r}_2$, and $\tilde{r}_3$, respectively.

Given parallel edges, however, higher order echoes may also satisfy (3) and (4). For instance, as shown in FIG. 7, suppose that wall 1 and 3 are parallel. Then, it is easy to verify that

r_j,131−r_j′,131=r_j,1−r_j′,1
and
r_j,313−r_j′,313=r_j,3−r_j′,3.

where j≠j′. Hence, (3) and (4) provide the same cos θ₁, cos θ₃, cos(θ₁−φ) and cos(θ₃−φ) if r_j,1 and r_j,3 are replaced by r_j,131 and r_j,313, respectively. By Lemma III.1, the third-order echoes resulting from a pair of parallel edges may lead to a larger room with the same normal vectors. Similarly, one can prove that any odd higher-order echoes resulting from a pair of parallel edges lead to a larger room with the same normal vectors. Therefore, Lemma III.2 is proved.
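The third-order identity can be checked in a simple coordinate frame. As an illustrative assumption, let the parallel walls 1 and 3 lie on the lines x=a and x=−b, and let O_j=(x_j, y_j) with −b<x_j<a. Mirroring O_j across wall 1, then wall 3, then wall 1 again gives

$\begin{aligned} \tilde{o}_{j, 1} &= (2 a - x_{j}, y_{j}), \\ \tilde{o}_{j, 13} &= (x_{j} - 2 a - 2 b, y_{j}), \\ \tilde{o}_{j, 131} &= (4 a + 2 b - x_{j}, y_{j}), \end{aligned}$

so that r_j,1=a−x_j and r_j,131=½‖õ_j,131−o_j‖=(a−x_j)+(a+b). The offset a+b is the spacing between the two walls and is the same at every measurement point, hence r_j,131−r_j′,131=r_j,1−r_j′,1.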

Given Lemma III.1 and Lemma III.2, it is possible to conclude that the grouped first-order echoes provide either a unique room or a room with the smallest dimension. Then we have the following result on the identifiability of any convex polygonal room by using only first-order echoes.

Theorem III.3. One can recover, with probability 1, any convex planar K-polygon subject to reflection ambiguity, by using the first-order echoes received at three random points in the feasible region, with known d₁₂ and d₂₃ and unknown φϵ(0, 2π).

Remark 1: Both the room shape and the coordinates of O₃ are subject to reflection ambiguity for φϵ(0, 2π). If, however, it is possible to restrict φϵ(0,π), the SLAM will be free of such ambiguity.

Remark 2: In reality, it is inevitable that reflections from the ceiling and the floor are collected. However, by Theorem III.3, if distances corresponding to the echoes from the ceiling and the floor are chosen, no polygon can be recovered as long as the trajectory is perpendicular to the walls.

Recovery with Known Length of O₁O₂

Geometry

The path lengths obtained by motion sensors may contain errors, and some of the path lengths may not be accurate enough. In the case where either d₁₂ or d₂₃ is not accurate enough, the inaccurate path length is discarded. Without loss of generality, assume only d₁₂ is known. As shown in FIG. 8, let O₁ be the origin and O₂ be on the x-axis; the direction of $\overrightarrow{O_1O_2}$ with respect to the desired room is also unknown. O₃(x₃,y₃) (y₃≠0) is randomly chosen. By geometry, (4) can be rewritten as

(r_3,i−r_1,i)+x₃cos θ_i+y₃sin θ_i=0. (16)

Equation (16) can also be written in matrix form as

A[x₃,y₃]^T=b, (17)

where A=[cos θ_i, sin θ_i]_K×2 and b=[−(r_3,i−r_1,i)]_K×1. Let A(:,i) and A(j,:) be the ith column and jth row of A, respectively.

Ideal Case

Similar to the previous section, it may be assumed that the r_j,i's have been correctly chosen from $\tilde{r}_j$ for j=1, 2, 3 and i=1, . . . , K. Then, since cos θ_i is uniquely determined by (3), the remaining question is whether (17) provides a unique solution for (x₃,y₃) and the θ_i's given the cos θ_i's and b.
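This question can be probed numerically with a small sketch. It assumes noise-free, grouped echoes, with b_i=−(r_3,i−r_1,i) as in (16), enumerates every sign pattern for sin θ_i=±√(1−cos²θ_i), solves (17) by least squares, and keeps the patterns whose residual vanishes. The function names and the tolerance are hypothetical.

```python
import itertools
import math

def fit_o3(cos_thetas, sin_signs, b):
    """Least-squares solution of A [x3, y3]^T = b.

    Row i of A is [cos(theta_i), sign_i * sqrt(1 - cos(theta_i)^2)].
    Returns ((x3, y3), residual_norm), or None if A is rank deficient.
    """
    rows = [(c, s * math.sqrt(max(0.0, 1.0 - c * c)))
            for c, s in zip(cos_thetas, sin_signs)]
    sxx = sum(cx * cx for cx, _ in rows)
    sxy = sum(cx * cy for cx, cy in rows)
    syy = sum(cy * cy for _, cy in rows)
    bx = sum(cx * v for (cx, _), v in zip(rows, b))
    by = sum(cy * v for (_, cy), v in zip(rows, b))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    x = (syy * bx - sxy * by) / det
    y = (sxx * by - sxy * bx) / det
    res = math.sqrt(sum((cx * x + cy * y - v) ** 2 for (cx, cy), v in zip(rows, b)))
    return (x, y), res

def consistent_sign_patterns(cos_thetas, b, tol=1e-9):
    """Enumerate all sign patterns of sin(theta_i); keep those that solve (17) exactly."""
    found = []
    for signs in itertools.product((1, -1), repeat=len(cos_thetas)):
        fit = fit_o3(cos_thetas, signs, b)
        if fit is not None and fit[1] < tol:
            found.append((signs, fit[0]))
    return found
```

For a generic triangle, the search is expected to return two patterns, the ground truth and its mirror image with y₃ negated, which matches the reflection ambiguity discussed in this section.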

Lemma IV.1. Suppose acoustic signals are emitted and received at three non-collinear feasible points O_i (i=1, 2, 3), where the coordinates of O_i are randomly chosen. If either d₁₂ or d₂₃ is missing, then SLAM can be done for any non-parallelogram convex polygon, subject to reflection ambiguity, given grouped first-order echoes.

Proof. Given grouped echoes, we can compute cos θ_i by (3) for i ∈ {1, . . . , K}. Then sin θ_i=±√(1−cos²θ_i). For simplicity, it is possible to assume that the ground truth of sin θ_i is √(1−cos²θ_i) for all i. Note that if A[x,y]^T=b has a solution (x₃,y₃) (y₃>0), then A⁻[x,y]^T=b also has a solution (x₃,−y₃) where

A⁻=[cos θ_i,−sin θ_i]_K×2

which is the reflection of the ground truth.

Assume there exist α and β (α,β≠0) such that α cos θ_i+β sin θ_i=0 for all i ∈ {1, . . . , K}. Then

√(α²+β²) sin(θ_i+arctan(α/β))=0. (18)

Only

$θ_{i} = - \arctan \frac{α}{β}$ and $θ_{i} = π - \arctan \frac{α}{β}$

make (18) hold. Since there are at least three walls with different θ_i, rank(A)=2. Recall that as (x₃,y₃) is a solution to (17),

rank(A)=rank(Ã)=2,

where Ã=[A,b]. In other words, given grouped first-order echoes and the correct sign combination of {sin θ_i}, i=1, . . . , K, the room shape can be recovered without ambiguity if y₃>0. If the sign of y₃ is unknown, the reconstruction result is subject to reflection ambiguity.

Let A_π be a matrix with sign combination of {sin θ_i} different from the ground truth and its reflection and let Ã_π=[A_π,b]. Without loss of generality, it is possible to assume that the first two rows of Ã are linearly independent. As a result, there is a linear row transform F(·) such that

$F (\tilde{A}) = [\begin{matrix} {\tilde{A}}_{2 \times 3}^{*} \\ 0_{(K - 2) \times 3} \end{matrix}],$

where Ã*_2×3=Ã(1:2,:) is a full row rank matrix. Applying the linear row transformation F(·) to Ã_π gives

$F ({\tilde{A}}_{π}) = (\begin{matrix} A^{*} (:, 1) & A^{*} (:, 2) & A^{*} (:, 3) \\ 0_{(K - 2) \times 1} & A^{*'} (:, 2) & 0_{(K - 2) \times 1} \end{matrix}),$

where A*′(:,2) has at least 1 non-zero entry. Hence, rank(Ã_π)=3 and no solution can be found.

Therefore, only A and A⁻ provide unique solutions (x,y) and (x,−y), respectively. In other words, SLAM is accomplished.
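The sign argument in the proof above can be checked numerically. The sketch below, written under stated assumptions, enumerates every sign pattern of sin θ_i and keeps those for which the augmented matrix [A, b] has rank at most 2, tested through its 3×3 minors. The noiseless data are synthesized from the room-2 geometry of the example; the helper names and the tolerance are hypothetical.

```python
import itertools
import math

def det3(m):
    """Determinant of a 3 x 3 matrix given as three row tuples."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def has_rank_at_most_2(rows, tol=1e-9):
    """A K x 3 matrix has rank <= 2 iff every 3 x 3 minor vanishes."""
    return all(abs(det3(trip)) < tol for trip in itertools.combinations(rows, 3))

def consistent_sign_patterns(cos_t, abs_sin_t, b):
    """Sign patterns of sin(theta_i) for which [A, b] has rank <= 2."""
    good = []
    for signs in itertools.product((1, -1), repeat=len(cos_t)):
        rows = [(c, sg * s, bi) for c, s, sg, bi in zip(cos_t, abs_sin_t, signs, b)]
        if has_rank_at_most_2(rows):
            good.append(signs)
    return good

# Hypothetical noiseless data built from the room-2 angles and O3 = (1/3, 0.5).
thetas = [math.radians(t) for t in (153.4349, -135.0, -71.5651, 56.3099)]
x3, y3 = 1.0 / 3.0, 0.5
cos_t = [math.cos(t) for t in thetas]
abs_sin_t = [abs(math.sin(t)) for t in thetas]       # the sign is treated as unknown
b = [x3 * math.cos(t) + y3 * math.sin(t) for t in thetas]  # b_i = -(r3_i - r1_i) by (16)
patterns = consistent_sign_patterns(cos_t, abs_sin_t, b)
print(patterns)
```

For this four-wall example only two patterns survive: the true signs of sin θ_i and their negation, which corresponds to the reflected room.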

Echo Labeling

The following lemma guarantees that, given ungrouped echoes, SLAM can be achieved in any convex polygon except a parallelogram.

Lemma IV.2. Suppose acoustic signals are emitted and received at three non-collinear feasible points O_j (j=1, 2, 3), where the coordinates of O_j are randomly chosen. If either d₁₂ or d₂₃ is missing: (i) no solution to (3) and (16) can be found given ungrouped echoes collected in any convex polygon free of parallel edges; and (ii) multiple solutions to (3) and (16) can be found given ungrouped echoes collected in any non-parallelogram convex polygon with parallel edges, but the dimension of each such room is greater than that of the ground truth.

Proof. All odd higher order echoes resulting from parallel edges are excluded first. Given ungrouped echoes resulting from at least three non-parallel walls:

A′=[cos θ_ii′, sin θ_ii′]_K′×2
and
Ã′=[A′,b′]

where i ∈ {1, . . . , N₂}, i′ ∈ {1, . . . , N₁}, i≠i′ for at least one entry, K′ is not necessarily equal to K and the j-th entry of b′ is −(r_3,j′−r_1,j). For simplicity, consider the case where sin θ_ii′=√(1−cos²θ_ii′) for all i. Similar to the proof of Lemma IV.1:

rank(A′)=rank(A′_π)=2,

where A′_π is a matrix with signs of {sin θ_ii′} different from A′. Let Ã′_π=[A′_π,b′]. Since b′ is independent of A′_π,

rank(Ã′)=rank(Ã′_π)=3.

Therefore, with probability 1, no solution can be found if the echoes chosen according to some f′_j contain echoes resulting from at least three non-parallel walls.

If the chosen echoes contain odd higher-order echoes resulting from a pair of parallel walls, then the outward normal vectors remain invariant but the dimension becomes larger, which is similar to Lemma III.2.

Lemmas IV.1 and IV.2 imply that, for a non-parallelogram convex polygon, the grouped first-order echoes provide a unique solution (subject to reflection ambiguity) to (3) and (16), such that the reconstructed room shape is either the smallest one or the unique one. In other words, SLAM is accomplished by choosing the smallest room shape and the corresponding coordinate of O₃. The following lemma establishes that if either d₁₂ or d₂₃ is missing, a parallelogram cannot be recovered uniquely.

Lemma IV.3. Suppose acoustic signals are emitted and received at three non-collinear feasible points O_j (j=1, 2, 3), where the coordinates of O_j are randomly chosen. If either d₁₂ or d₂₃ is missing, then a parallelogram cannot be reconstructed given ungrouped first-order echoes.

Proof.

An example may be given to show that if the shape of the room is a parallelogram, there exist multiple rooms satisfying (3) and (16). The ground truth is assumed to be

$A = [\begin{matrix} \cos θ_{i} & \sin θ_{i} \\ \cos θ_{i^{'}} & \sin θ_{i^{'}} \\ \cos θ_{j} & \sin θ_{j} \\ \cos θ_{j^{'}} & \sin θ_{j^{'}} \end{matrix}] \quad \text{and} \quad b = [\begin{matrix} - (r_{3, i} - r_{1, i}) \\ - (r_{3, i^{'}} - r_{1, i^{'}}) \\ - (r_{3, j} - r_{1, j}) \\ - (r_{3, j^{'}} - r_{1, j^{'}}) \end{matrix}],$

where

r_1,i+r_1,i′=r_2,i+r_2,i′=r_3,i+r_3,i′
and
r_1,j+r_1,j′=r_2,j+r_2,j′=r_3,j+r_3,j′.

Let

$A' = [\begin{matrix} \cos θ_{{ii}^{'}} & \sin θ_{{ii}^{'}} \\ \cos θ_{i^{'} i} & \sin θ_{i^{'} i} \\ \cos θ_{{jj}^{'}} & \sin θ_{{jj}^{'}} \\ \cos θ_{j^{'} j} & \sin θ_{j^{'} j} \end{matrix}] \quad \text{and} \quad b' = [\begin{matrix} - (r_{3, i} - r_{1, i^{'}}) \\ - (r_{3, i^{'}} - r_{1, i}) \\ - (r_{3, j} - r_{1, j^{'}}) \\ - (r_{3, j^{'}} - r_{1, j}) \end{matrix}].$

Then

cos θ_ii′+cos θ_i′i=0
and
cos θ_jj′+cos θ_j′j=0.

Moreover, since sin θ=±√(1−cos²θ),

sin θ_ii′+sin θ_i′i=0
and
sin θ_jj′+sin θ_j′j=0

can hold if the signs of the square roots are chosen properly. Then, rank(A′)=rank([A′,b′])=2. Thus, a room shape and a coordinate of O₃ other than those of the ground truth and its reflection satisfy both (3) and (16).

Given Lemmas IV.1-IV.3, the following result on the identifiability of any convex polygon except a parallelogram is obtained by using only first-order echoes.

Theorem IV.4. Suppose acoustic signals are emitted and received at three non-collinear feasible points O_j (j=1, 2, 3), where the coordinates of O_j are randomly chosen. If only the distance between two of the three measurement points is known, then SLAM can be accomplished given ungrouped echoes for any convex polygon except a parallelogram.

Practical Algorithm

Two distances between three consecutive measurement points are sufficient and necessary for SLAM given any convex polygon in 2-D. The remaining question is how to make the algorithm robust in the noisy case.

Peak-Detection Algorithm

A simple peak-detection algorithm may be used based on the idea that peaks have steep slopes. At the receiver, |m^(j)(t)| is used instead of m^(j)(t) itself. Since the LOS component is much stronger than the reflected components, the LOS peak can be easily detected. Let t₀^(j) be the time at which the LOS peak appears in the correlator output. Suppose the n-th local maximum after the LOS peak appears at t_n^(j) with magnitude m_n^(j) (n=1, 2, 3, . . . ). Then (t_n^(j),m_n^(j)) are points in the 2-D plane. Define the slopes of the peak centered at (t_n^(j),m_n^(j)) to be

$g_{l, n}^{(j)} = \frac{m_{n}^{(j)} - m_{n - 1}^{(j)}}{t_{n}^{(j)} - t_{n - 1}^{(j)}}$

and

$g_{r, n}^{(j)} = \frac{m_{n + 1}^{(j)} - m_{n}^{(j)}}{t_{n + 1}^{(j)} - t_{n}^{(j)}} .$

A peak centered at (t_n^(j),m_n^(j)) is said to be “steep” if g_l,n^(j) and −g_r,n^(j) are greater than a given positive threshold g_th. The experimental results suggest that g_l,n±1^(j) and g_r,n±1^(j) should also be considered. As a result, a peak centered at (t_n^(j),m_n^(j)) is “steep” if one of the following conditions is satisfied:

1) g_l,n^(j) > g_th and −g_r,n^(j) > g_th,
2) α_l g_l,n^(j) + (1−α_l) g_l,n−1^(j) > g_th and −g_r,n^(j) > g_th,
3) g_l,n^(j) > g_th and −α_r g_r,n^(j) − (1−α_r) g_r,n+1^(j) > g_th,
4) α_l g_l,n^(j) + (1−α_l) g_l,n−1^(j) > g_th and −α_r g_r,n^(j) − (1−α_r) g_r,n+1^(j) > g_th,

where α_l, α_r ∈ (0,1) depend on {t_n−2^(j), t_n−1^(j), t_n^(j)} and {t_n^(j), t_n+1^(j), t_n+2^(j)}, respectively. Hence, the τ_i^(j)'s can be obtained from the detected peaks.
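The slope definitions and the basic steepness test can be sketched in Python as follows. Only the first condition (both slopes exceed g_th in magnitude) is implemented; the weighted variants are omitted because the form of α_l and α_r is not specified here. The sample values and the function names are hypothetical.

```python
def slopes(t, m, n):
    """Left and right slopes g_l and g_r of the peak at index n (1 <= n <= len(t) - 2).

    t and m hold the times and magnitudes of consecutive local maxima.
    """
    g_l = (m[n] - m[n - 1]) / (t[n] - t[n - 1])
    g_r = (m[n + 1] - m[n]) / (t[n + 1] - t[n])
    return g_l, g_r

def is_steep(t, m, n, g_th):
    """Basic condition: rising faster than g_th on the left, falling faster on the right."""
    g_l, g_r = slopes(t, m, n)
    return g_l > g_th and -g_r > g_th

def steep_indices(t, m, g_th):
    """Indices of all interior local maxima that pass the basic steepness test."""
    return [n for n in range(1, len(t) - 1) if is_steep(t, m, n, g_th)]

# Hypothetical local maxima: times in seconds, magnitudes in arbitrary units.
t = [0.000, 0.001, 0.002, 0.003, 0.004]
m = [0.10, 0.12, 0.90, 0.15, 0.14]
steep = steep_indices(t, m, g_th=100.0)
print(steep)
```

In this toy input only the middle local maximum rises and falls sharply enough to be kept.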

In practice, due to the non-ideal auto-correlation property, it is necessary to assume that no TDOA exists in [0,t_min] and that the time difference between contiguous peaks is greater than Δt. Two peaks are “close” to each other if the difference of their appearance times is less than Δt. Let M be the set of peaks that are steep enough and P be the set of detected peaks. Suppose the maximum distance between the measurement points and the walls is less than t_max·c/2. The peak detection algorithm can be summarized as Algorithm 1.

Algorithm 1 Peak detection algorithm

1: find LOS peak (t₀^(j), m₀^(j)).
2: find local maxima appearing from t₀^(j) + t_min to t₀^(j) + t_max.
3: find all peaks that are “steep” and store them in M.
4: store the peak with the largest magnitude of M in P.
5: if |P| < |M| then
6:   if there exist peaks in M whose locations are “close” to the stored peak then
7:     remove those peaks from M.
8:   end if
9: end if

Then the candidate distances are obtained by (2).
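One way to read steps 4 to 9 of Algorithm 1 is as a greedy selection that is repeated until M is exhausted; the loop is an interpretation, since the listing shows a single pass. The sketch below also converts the selected peaks to distances under the assumed relation r = c·τ/2, which is consistent with the bound t_max·c/2 stated above but is not necessarily the exact form of (2). All names and input values are hypothetical.

```python
def select_peaks(steep_peaks, dt):
    """Greedy selection sketch for steps 4-9 of Algorithm 1.

    steep_peaks: list of (time, magnitude) pairs, i.e. the set M.
    dt: minimum separation; peaks closer than dt to a stored peak are dropped.
    Returns the set P sorted by time.
    """
    remaining = sorted(steep_peaks, key=lambda p: p[1], reverse=True)
    selected = []
    while remaining:
        best = remaining.pop(0)  # largest remaining magnitude
        selected.append(best)
        # Remove peaks that are "close" to the stored peak.
        remaining = [p for p in remaining if abs(p[0] - best[0]) >= dt]
    return sorted(selected)

def to_distances(selected, t_los, c=346.0):
    """Assumed form of (2): distance r = c * tau / 2, with tau measured from the LOS peak."""
    return [c * (t - t_los) / 2.0 for t, _ in selected]

# Hypothetical steep peaks (seconds, magnitude); the LOS peak is assumed at t = 0.
M = [(0.0100, 0.80), (0.0104, 0.50), (0.0200, 0.60), (0.0209, 0.65), (0.0300, 0.30)]
P = select_peaks(M, dt=0.5 / 346.0)
distances = to_distances(P, t_los=0.0)
print(P)
print(distances)
```

Processing the peaks in order of decreasing magnitude means that, within any group of close peaks, the strongest one is the one retained.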

SLAM Given Distances Between Consecutive Measurement Points

In the noisy case, the distances extracted from m^(j)(t) are corrupted by noise. Define

$\vec{r}_{j} = \tilde{r}_{j} + n_{j}$

as the corrupted distances, where n_j is the error. In the presence of noise, r̃_j is subject to measurement errors. Hence the values of φ solved from (5) for different i's are not identical. The essential idea of a straightforward practical algorithm that handles the measurement errors is given below:

- 1) For K from 3 to N, choose K entries from r̃_j (j=1, 2, 3), where N=min{N₁, N₂, N₃}.
- 2) For a given K, exhaust all possible echo combinations and compute φ_i^k=±arccos α_i±arccos β_i for each i, k and each sign combination, where i=1, . . . , K and k=1, . . . , $\binom{N}{K} (K!)^{2}$.
- 3) Choose the echo and sign combination with minimum variance of φ_i^k for a given K. Then choose the largest K and the corresponding echo and sign combination with the variance less than the threshold (variance criterion).
- 4) Estimate the θ_i's and φ using the obtained combination of echoes and reconstruct the polygon.
- 5) If some θ_i's are close to each other, then keep the one corresponding to the smallest distance between O₁ and the walls.

The corresponding algorithm is summarized as Algorithm 2.
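The variance criterion can be illustrated for already labeled echoes with the sketch below. Because (5) is not reproduced in this section, the sketch uses a stand-in angle ψ, the heading of O₂O₃ relative to O₁O₂, together with the assumed definitions α_i=(r_1,i−r_2,i)/d₁₂ and β_i=(r_2,i−r_3,i)/d₂₃; these are illustrative assumptions and may differ from the document's φ, α_i and β_i. The data are synthesized from the room-2 geometry, and ψ is restricted to (0, π) to remove the reflection ambiguity.

```python
import itertools
import math
import statistics

def clamp(v):
    """Keep a cosine value inside [-1, 1] before calling acos."""
    return max(-1.0, min(1.0, v))

def wrap(angle):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def heading_candidates(alpha, beta):
    """All psi in (0, pi) of the form (+/-)arccos(alpha) - (+/-)arccos(beta)."""
    a, b = math.acos(clamp(alpha)), math.acos(clamp(beta))
    cands = {round(wrap(sa * a - sb * b), 12) for sa in (1, -1) for sb in (1, -1)}
    return [psi for psi in cands if 0.0 < psi < math.pi]

def min_variance_heading(r1, r2, r3, d12, d23):
    """Pick one candidate per wall so that the candidates agree best across walls."""
    per_wall = [heading_candidates((a - b) / d12, (b - c) / d23)
                for a, b, c in zip(r1, r2, r3)]
    best = min(itertools.product(*per_wall), key=statistics.pvariance)
    return statistics.mean(best), statistics.pvariance(best)

# Hypothetical noiseless data synthesized from the room-2 geometry.
thetas = [math.radians(t) for t in (153.4349, -135.0, -71.5651, 56.3099)]
r1 = [1.7889, 1.4142, 1.8974, 3.3282]
o2, o3 = (0.5, 0.0), (1.0 / 3.0, 0.5)
r2 = [r - o2[0] * math.cos(t) - o2[1] * math.sin(t) for r, t in zip(r1, thetas)]
r3 = [r - o3[0] * math.cos(t) - o3[1] * math.sin(t) for r, t in zip(r1, thetas)]
d12 = 0.5
d23 = math.hypot(o3[0] - o2[0], o3[1] - o2[1])
psi, var = min_variance_heading(r1, r2, r3, d12, d23)
print(math.degrees(psi), var)
```

With noisy distances the minimum variance is no longer near zero, which is why the steps above compare it against a threshold.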

SLAM Given The Distance Between Two Measurement Points

In a noisy case, the echo and sign combination is chosen such that the matrix is close to a rank-2 matrix. A straightforward idea of the practical algorithm that handles the measurement errors is given below and the practical algorithm is summarized as Algorithm 3.

- 1) For K from 3 to N, choose K entries from r̃_j (j=1, 2, 3), where N=min{N₁, N₂, N₃}.
- 2) For a given K, exhaust all possible echo combinations and compute cos θ_i and sin θ_i for each i, k and each sign combination, where i=1, . . . , K and k=1, . . . , $\binom{N}{K} (K!)^{2}$. Then compute the minimum distance between Ã_ik and any rank-2 matrix.
- 3) For each K, find the Ã_ik with the least distance to any rank-2 matrix.
- 4) If a unique solution O₃(x,y) (y>0) can be obtained, choose the echo and sign combination resulting in a unique solution with minimum distance between Ã_ik and any rank-2 matrix for a given K. Then choose the Ã_ik with the largest K such that the distance between Ã_ik and any rank-2 matrix is less than the threshold.
- 5) Estimate the θ_i's and (x₃,y₃) using the chosen Ã_ik and reconstruct the polygon.
- 6) If some θ_i's are close to each other, then keep the one corresponding to the smallest distance between O₁ and the walls.
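The distance between a K×3 matrix Ã and the nearest matrix of rank at most 2 can be computed from a singular value decomposition: by the Eckart-Young theorem it equals the third singular value of Ã. The following sketch, which assumes NumPy is available, compares a consistent sign pattern with an inconsistent one using data synthesized from the room-2 geometry. The function name `rank2_distance` and the choice of the flipped wall are hypothetical.

```python
import numpy as np

def rank2_distance(a_tilde):
    """Distance from a K x 3 matrix to the nearest matrix of rank <= 2.

    By the Eckart-Young theorem this is the third singular value
    (in both the spectral and the Frobenius norm).
    """
    s = np.linalg.svd(np.asarray(a_tilde, dtype=float), compute_uv=False)
    return float(s[2])

# Hypothetical noiseless data built from the room-2 angles and O3 = (1/3, 0.5).
thetas = np.radians([153.4349, -135.0, -71.5651, 56.3099])
x3, y3 = 1.0 / 3.0, 0.5
b = x3 * np.cos(thetas) + y3 * np.sin(thetas)   # b_i = -(r3_i - r1_i) by (16)

a_true = np.column_stack([np.cos(thetas), np.sin(thetas), b])
flip_first = np.array([-1.0, 1.0, 1.0, 1.0])    # wrong sign for wall 1 only
a_wrong = np.column_stack([np.cos(thetas), flip_first * np.sin(thetas), b])

d_true = rank2_distance(a_true)
d_wrong = rank2_distance(a_wrong)
print(d_true, d_wrong)
```

In the noisy case neither distance is exactly zero, so the combination with the smallest distance below the threshold is retained, as in the steps above.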

Algorithm 2 Reconstruct convex polygon given distances between consecutive measurement points

1: Set i = 3 and variance threshold V_th
2: if i ≤ N then
3:   Set th = inf
4:   Choose one echo combination with i elements from $\binom{N}{K}^{3} (K!)^{2}$ feasible echo combinations
5:   Compute θ_i's and φ by the chosen echo combination
6:   if θ_i's and φ are obtained given certain sign combinations then
7:     Compute Var[φ], which is the variance of φ
8:     if Var[φ] < th then
9:       Keep the echo and sign combination and set th = Var[φ]
10:     end if
11:   else if there exist echo combinations that have not been chosen yet then
12:     Return to step 4
13:   end if
14:   i = i + 1
15: else
16:   Choose the recovered room with the largest number of walls such that Var[φ] < V_th
17: end if
18: Keep the edges that form a shape with smallest area.

Algorithm 3 Reconstruct convex polygon given one of the distances between consecutive measurement points

1: Set i = 3 and distance threshold d_th
2: if i ≤ N then
3:   Set d = inf
4:   Choose one echo combination with i elements from $\binom{N}{K}^{3} (K!)^{2}$ feasible echo combinations
5:   Compute cos θ_i's and sin θ_i's by the chosen echo combination
6:   if cos θ_i's and sin θ_i's are valid given certain sign combinations then
7:     Compute d(Ã), the distance between Ã and any rank-2 matrix
8:     if d(Ã) < d then
9:       Store the echo and sign combination and set d = d(Ã)
10:     end if
11:   else if there exist echo combinations that have not been chosen yet then
12:     Return to step 4
13:   end if
14:   i = i + 1
15: else
16:   Choose the recovered room with the largest number of walls such that d(Ã) < d_th
17: end if
18: Keep the edges that form a shape with smallest area.

EXAMPLE

Since a rectangle is the most common room shape, the method proposed in Section III for the present invention was tested in a real room. Since the third-order echoes resulting from parallel walls only change the dimension of the room, only the first- and second-order echoes were considered.

Given Two Distances of Three Consecutive Measurement Points

Using a laptop as the microphone 18 and an HTC M8 phone as the loudspeaker 16, the speaker of the cell phone was pointed towards each wall to ensure that the corresponding first-order echo is strong enough, because the loudspeaker of the cell phone is not omnidirectional and is power limited. Note that the microphone will record both first-order echoes and some higher-order ones.

A chirp signal linearly sweeping from 30 Hz to 8 kHz was emitted by the cell phone. The sample rate at the receiver is f_s=96 kHz. It has been shown in the art that if a chirp signal is correlated with its windowed version, the output may resemble a delta function. Simulation shows that the candidate distances obtained by correlating the received signal with the triangularly windowed version of the chirp outperform those obtained from the correlator output using the original chirp. The comparison is shown in FIG. 9.
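The correlation step can be sketched as follows using only the Python standard library. The sweep limits and sample rate are taken from the text; the chirp duration, the delays, the echo gain and the guard interval are hypothetical values chosen to keep the example small, and the exact shape of the triangular window is an assumption.

```python
import math

FS = 96_000               # sample rate from the example, Hz
F0, F1 = 30.0, 8_000.0    # sweep limits from the example, Hz
DURATION = 0.01           # assumed chirp length in seconds (not given in the text)

def linear_chirp(fs=FS, f0=F0, f1=F1, duration=DURATION):
    """Samples of a linear chirp sweeping from f0 to f1."""
    n = int(round(fs * duration))
    k = (f1 - f0) / duration  # sweep rate in Hz/s
    return [math.sin(2.0 * math.pi * (f0 * (i / fs) + 0.5 * k * (i / fs) ** 2))
            for i in range(n)]

def triangular_window(n):
    """Symmetric triangular window with small positive end points."""
    mid = (n - 1) / 2.0
    return [1.0 - abs(i - mid) / (mid + 1.0) for i in range(n)]

def correlate(received, template):
    """Direct sliding correlation; entry k is the correlation at lag k samples."""
    m = len(template)
    return [sum(received[k + i] * template[i] for i in range(m))
            for k in range(len(received) - m + 1)]

chirp = linear_chirp()
template = [w * s for w, s in zip(triangular_window(len(chirp)), chirp)]

# Hypothetical received signal: LOS copy plus one attenuated echo.
los_delay, echo_delay, echo_gain = 40, 300, 0.4   # delays in samples
received = [0.0] * (len(chirp) + los_delay + echo_delay + 100)
for i, s in enumerate(chirp):
    received[los_delay + i] += s
    received[los_delay + echo_delay + i] += echo_gain * s

output = [abs(v) for v in correlate(received, template)]
los_peak = max(range(len(output)), key=output.__getitem__)
search_start = los_peak + 150  # hypothetical guard interval past the main lobe
echo_peak = max(range(search_start, len(output)), key=output.__getitem__)
print(los_peak, echo_peak)
```

The lag difference between the two peaks, divided by the sample rate, is the kind of time difference that the peak detection stage is meant to extract.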

FIGS. 10A-10C show a sample path of the correlator output collected in the room where this experiment was conducted. Peaks marked with ellipses in FIGS. 10A-10C were determined manually. Algorithm 1 is then used to detect the desired peaks from the correlator output. It is also assumed that the distances between the walls and the measurement points are between 0.6 m and 6.5 m. The minimum difference of appearance time is

$Δ t = \frac{0.5 m}{c},$

where c=346 m/s. g_th is set to 5f_s. Under these assumptions, the local maxima of FIGS. 10A-10C are shown in FIGS. 11A-11C, and the peaks detected by Algorithm 1 are indicated by arrows. The desired peaks were always detected. In order to detect fewer peaks in the presence of noise, one possible modification of Algorithm 1 is to ignore all peaks with magnitude less than a predetermined threshold. Notably, only part of the detected echoes are used for reconstruction due to computational complexity.
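As a worked check of these timing parameters, and assuming that t_min and t_max follow from the 0.6 m and 6.5 m limits through the round-trip relation t=2r/c (an interpretation consistent with the bound t_max·c/2, not stated explicitly):

$Δ t = \frac{0.5}{346}\ \text{s} \approx 1.445\ \text{ms} \approx 139\ \text{samples at}\ f_{s} = 96\ \text{kHz},$

$t_{\min} = \frac{2 \times 0.6}{346}\ \text{s} \approx 3.47\ \text{ms}, \qquad t_{\max} = \frac{2 \times 6.5}{346}\ \text{s} \approx 37.6\ \text{ms}.$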

The proposed algorithm for SLAM is verified by an experiment in which d₁₂ and d₂₃ are measured with a tape measure. Even if some elements of r_j have measurement errors up to 10 cm, SLAM is accomplished by the proposed algorithm with small error in both the room shape and the coordinate of O₃, given only first-order echoes. In the presence of higher-order echoes, the proposed algorithm performs poorly when the variance criterion is the only criterion used to determine the correct combination of echoes. Since most rooms are regular, a heuristic constraint is added: all the angles between two adjacent walls are between 50° and 130°. An interesting phenomenon is that sometimes the proposed algorithm is unable to provide the correct room shape, but the estimate of c is always close to the true value. Therefore, one can use the algorithm in Section III to obtain c and then reconstruct the room shape independently with full knowledge of the geometry information of the measurement points. The comparison between the SLAM result and the ground truth is illustrated in FIG. 12.

Given the Distance Between O₁and O₂

Here it is assumed that O₃ always lies above the x-axis, i.e., y₃>0. Thus the SLAM result is free of ambiguity. In the noiseless case, simulations show that the algorithm of the present invention achieves successful SLAM given all the first-order echoes and some second-order echoes. In the noisy case, the candidate distances, including all those that correspond to the first-order echoes and some that correspond to the second-order echoes, are corrupted by Gaussian noise distributed as N(0, 0.005²). The heuristic constraint of the last section is also applied. Two rooms are used to test the proposed algorithm. For room 1, assume that O₂(1,0) and O₃(0.5,1). Then d₁₂=1 and d₂₃=1.1180. The distances between the walls and the measurement points and the real angles of the walls are given in Tables I and II.

TABLE I

Real distances of room 1

	O₁	O₂	O₃
wall 1	1.4142	2.1213	1.0607
wall 2	1.3093	1.9640	2.3926
wall 3	1.5	1	2.1160
wall 4	2.5981	1.7321	1.6651

TABLE II

Real angles of room 1

wall 1	wall 2	wall 3	wall 4
135°	−130°	−60°	30°

The sample of the corrupted distances and the recovered angles are given in table III and IV, respectively.

TABLE III

Sample of corrupted distances of room 1

	O₁	O₂	O₃
wall 1	1.4178	2.1276	1.0485
wall 2	1.3124	1.9749	2.3814
wall 3	1.4914	1.0111	2.1160
wall 4	2.5978	1.7324	1.6804

TABLE IV

Recovered angles of room 1

wall 1	wall 2	wall 3	wall 4
129.5125°	−131.4876°	−66.0266°	29.0305°

The parameters of the second room are given in Tables V and VI. Assume that O₂(0.5,0) and O₃(⅓,0.5). Then d₁₂=0.5 and d₂₃=0.5270.

TABLE V

Real distances of room 2

	O₁	O₂	O₃
wall 1	1.7889	2.2361	1.8634
wall 2	1.4142	1.7678	2.0035
wall 3	1.8974	1.7393	2.2663
wall 4	3.3282	3.0509	2.7273

TABLE VI

Real angles of room 2

wall 1	wall 2	wall 3	wall 4
153.4349°	−135°	−71.5651°	56.3099°
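The room-2 parameters can be cross-checked with a short forward computation. Rearranging (16) gives r_j,i=r_1,i−x_j cos θ_i−y_j sin θ_i for a measurement point O_j(x_j,y_j), with O₁ at the origin. The sketch below applies this relation to the tabulated angles and O₁ distances and compares the result with the tabulated O₂ and O₃ distances; the function and variable names are hypothetical.

```python
import math

def predicted_distance(r1_i, theta_deg, point):
    """Distance from `point` to wall i, given the distance r1_i from the origin O1.

    Rearranged from (16): r_{j,i} = r_{1,i} - x_j cos(theta_i) - y_j sin(theta_i).
    """
    th = math.radians(theta_deg)
    return r1_i - point[0] * math.cos(th) - point[1] * math.sin(th)

# Room-2 values copied from the tables (angles in degrees).
thetas = [153.4349, -135.0, -71.5651, 56.3099]
r1 = [1.7889, 1.4142, 1.8974, 3.3282]
points = {"O2": (0.5, 0.0), "O3": (1.0 / 3.0, 0.5)}
tabulated = {
    "O2": [2.2361, 1.7678, 1.7393, 3.0509],
    "O3": [1.8634, 2.0035, 2.2663, 2.7273],
}

max_err = 0.0
for name, point in points.items():
    for r1_i, theta, ref in zip(r1, thetas, tabulated[name]):
        max_err = max(max_err, abs(predicted_distance(r1_i, theta, point) - ref))

d23 = math.hypot(points["O3"][0] - points["O2"][0], points["O3"][1] - points["O2"][1])
print(f"largest deviation from the table: {max_err:.1e}, d23 = {d23:.4f}")
```

Deviations at the level of the fourth decimal place are expected because the tabulated values are rounded.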

The simulation result is shown in table VII and VIII.

TABLE VII

Sample of corrupted distances of room 2

	O₁	O₂	O₃
wall 1	1.7879	2.2290	1.8705
wall 2	1.4187	1.7702	2.0049
wall 3	1.8935	1.7384	2.2673
wall 4	3.3212	3.0499	2.7352

TABLE VIII

Recovered angles of room 2

wall 1	wall 2	wall 3	wall 4
151.9001°	−134.6765°	−71.9195°	57.1366°

From the simulation results, it may be seen that in the noisy case the present invention can reconstruct the room shape given d₁₂. In both cases, however, the present invention was unable to recover the coordinate of O₃ from the corrupted distances. A possible reason is that the angles of the walls are estimated directly from the elements of A, while the coordinate of O₃ is obtained by

$[\begin{matrix} x_{3} \\ y_{3} \end{matrix}] = A^{- 1} b .$

A⁻¹ is more vulnerable to noise than A. Thus, the coordinate of O₃ may not be obtained in the noisy case while the normal vectors of the walls can be estimated.

The present invention makes progress in acoustic SLAM integrating measurement from internal motion sensors along with echo measurements for localization and mapping. A simple approach based on gradient test is used to detect peaks of the correlator output which are used to compute candidate distances. Experiment results show that the developed system can recover all desired first order echoes along with some high order echoes as well as some spurious peaks. With the distances between consecutive measurement points obtained through internal sensors, the present invention can recover any 2-D convex polygon while self-localizing using the collected acoustic echoes. In the presence of noise, a simple algorithm is devised that is effective in recovering the room shape even in the presence of higher order echoes.

The present invention may also be applied to 3D SLAM, which has found applications in both navigation and construction monitoring. The present invention can be extended to the 3D case: it can be shown that, in an idealized case, four measurement points that do not reside on a single plane can recover any convex 3D polyhedron when the distances between consecutive measurement points (in this case there are three of them) are known. Other interesting problems include 3D SLAM for shoebox rooms, as they are among the rooms most often encountered in practice. For a shoebox, the outward normal vectors are always subject to rotation and translation ambiguity due to its symmetry; therefore, only the coordinates of the measurement points and the dimensions of the shoebox are of interest. For a shoebox, fewer than four measurement points may be needed when a complete set of first-order echoes (in this case including those from the floor and the ceiling) is available. Additionally, many room shapes besides the shoebox have some special structural information that can be exploited. For instance, the floor is almost always perpendicular to the walls, and there often exist two adjacent walls that are perpendicular to each other. This structural information, namely that three connected planes are perpendicular to each other, can be exploited for echo labeling, which is more challenging for 3D SLAM. Even with labeled echoes, 3D SLAM often requires solving a bilinear optimization problem for arbitrary convex polyhedra whose corresponding cost function is non-convex, so that multiple local minima exist. Clearly, having more measurement points or other geometry information may impose additional constraints and can help resolve the inherent ambiguity, i.e., help identify the correct solution.

As described above, the present invention may be a system, a method, and/or a computer program associated therewith and is described herein with reference to flowcharts and block diagrams of methods and systems. The flowchart and block diagrams illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer programs of the present invention. It should be understood that each block of the flowcharts and block diagrams can be implemented by computer readable program instructions in software, firmware, or dedicated analog or digital circuits. These computer readable program instructions may be implemented on the processor of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine that implements a part or all of any of the blocks in the flowcharts and block diagrams. Each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. It should also be noted that each block of the block diagrams and flowchart illustrations, or combinations of blocks in the block diagrams and flowcharts, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

