This application claims priority from French patent application no. 2001465, filed Feb. 14, 2020, the content of which is incorporated herein by reference.
The subject disclosure relates generally to camera-assisted maps and navigation, and specifically to a method and system for use in navigating in a facility.
People often find it difficult to navigate inside a facility and therefore need special assistance. For example, travelers rely on maps, sign boards, customer care staff, fellow travelers, indoor maps, etc. Current infrastructure and known systems have not adequately addressed these problems.
Known systems for indoor guidance use a combination of the Global Positioning System (GPS) and other technologies such as Bluetooth, Infrared, Wi-Fi, RFID, etc. to provide detailed and accurate location information to users. For example, a representative of this category of systems is disclosed in U.S. Pat. No. 9,539,164. However, such systems are impractical due to the need for additional hardware to provide location information. Moreover, GPS services are not accessible inside facilities because GPS is satellite-based and a line of sight to the satellites is required for the GPS service.
The specification proposes a method and system for use in navigating in a facility which does not require the use of additional hardware and which also does not depend on the accessibility of GPS services.
A first aspect of the subject disclosure provides a computer-implemented method for use in navigating in a facility, comprising: receiving, from a camera, at least one image; estimating, by a processor, a current location of the camera in the facility based on the at least one image and model data of the facility; generating, by the processor, a virtual path from the current location of the camera to a destination location in the facility using the model data of the facility; and generating and outputting navigation information to the destination location according to the virtual path.
In some examples, the model data of the facility comprises image data of a plurality of images, the image data of each image comprising location information corresponding to a location in the facility from which the image was acquired; object information corresponding to an object of the facility in the image; distance information corresponding to a distance between the object and the location; first relationship information specifying, as a first relationship, a distance and a relative direction to navigate from one object to another object of the image; and second relationship information specifying, as a second relationship, a distance and a relative direction to navigate from the location from which the image was acquired to a location from which another image was acquired.
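By way of a non-limiting illustration only, the model data described in the preceding example could be organized roughly as in the following sketch. All class and field names are hypothetical and merely illustrate one possible in-memory layout of the location information, object information, distance information and first and second relationship information.

```python
# Illustrative sketch only: one possible in-memory representation of the model
# data described above. All class and field names are hypothetical.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ObjectInfo:
    object_id: str          # e.g. "pillar_3", "signboard_12"
    image_block: int        # index of the image block containing the object
    features: List[float]   # extracted feature vector (e.g. HOG, color histogram)
    distance_m: float       # distance from the acquisition location to the object

@dataclass
class FirstRelationship:    # object -> object within one image
    from_object: str
    to_object: str
    distance_m: float
    direction: str          # relative direction, e.g. "NE"

@dataclass
class SecondRelationship:   # acquisition location -> acquisition location
    from_image: str
    to_image: str
    distance_m: float
    direction: str

@dataclass
class ImageData:
    image_id: str
    location: tuple         # (x, y) coordinates of the acquisition location
    objects: List[ObjectInfo] = field(default_factory=list)
    first_relationships: List[FirstRelationship] = field(default_factory=list)
    second_relationships: List[SecondRelationship] = field(default_factory=list)

# The model data of the facility is then simply a collection of ImageData records.
ModelData = List[ImageData]
```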
In some examples, estimating the current location comprises: dividing, by the processor, the at least one image into one or more image blocks; detecting, by the processor in the one or more image blocks, object candidates corresponding to objects of the facility based on the object information from the model data of the facility; determining, by the processor, distance values to the detected object candidates based on the object information and the distance information of the corresponding object from the model data of the facility; determining, by the processor, a distance between object candidates based on the distance values; and estimating, by the processor, the current location of the camera based on the location information from the model data of the facility and the distance values to the detected object candidates.
In some examples, estimating the current location further comprises: performing, by the processor, object classification with respect to the object candidates of the image based on the distance values and the distance to detect the objects of the facility.
In some examples, the computer-implemented method further comprises: receiving, via an input device, information about a destination object in the facility to which the camera is to be navigated; searching, by the processor in the model data, for at least one image block of an image, the object information in the model data of the facility corresponding to the information about the destination object; estimating, by the processor as the destination location, a location of the destination object based on image data of images comprising the destination object.
In some examples, generating a virtual path comprises: determining, by the processor, a relation between the object candidates in the image blocks of the image and the destination object based on the first and second relationship information in the model data of the facility; and deriving, by the processor, the virtual path based on the determined relation.
In some examples, outputting the navigation information comprises displaying, on a display, the at least one image and the navigation information.
In some examples, the computer-implemented method further comprises generating, by the processor, the model data of the facility. The generating comprises: acquiring, by the camera from a plurality of locations within the facility, one or more images. The generating further comprises, for each of the plurality of images, the steps of determining, by the processor, depth information based on the image and image information provided by the camera; generating, by the processor, location information based on the location from which the image was acquired; dividing, by the processor, the image into one or more image blocks; detecting, by the processor, objects of the facility in the one or more image blocks and generating object information defining features of the detected objects, the object information including information indicating the image block of the image; determining, by the processor, a distance between detected objects in the one or more image blocks and the location using the depth information and generating distance information corresponding to the detected object in an image block; calculating, by the processor, a distance between detected objects in the one or more image blocks and a relative direction describing how to navigate from one object in a first image block to another object in a second image block, and generating first relationship information based on the distance and the relative direction, the first relationship information including information indicating the first and second image blocks of the image; determining, by the processor, a distance between the location from which the image was acquired and another location from which another image was acquired based on the location information of the image and the other image, and a relative direction describing how to navigate from the location to the other location, and generating second relationship information based on the distance and the relative direction, the second relationship information including information indicating the image and the other image; and generating, by the processor, image data of the image, including the location information, the object information, the first relationship information and the second relationship information.
In some examples, the computer-implemented method further comprises: storing, by the processor, the at least one image; and performing, by the processor, machine learning operations using the at least one image and the model data of the facility to generate updated model data of the facility.
A second aspect of the subject disclosure provides a computing system for use in navigating in a facility, comprising: a processor; a camera device; and at least one memory device accessible by the processor. The memory device contains a body of program instructions which, when executed by the processor, cause the computing system to implement a method comprising: receiving, from the camera, at least one image; estimating a current location of the camera in the facility based on the at least one image and model data of the facility; generating a virtual path from the current location of the camera to a destination location in the facility using the model data of the facility; and generating and outputting navigation information to the destination location according to the virtual path.
In some examples, the system is further arranged to perform the method according to examples of the first aspect of the subject disclosure.
According to a third aspect, a computer program product is provided. The computer program product comprises instructions which, when executed by a computer, cause the computer to perform the method according to the first aspect and the examples thereof.
The above-described aspects and examples present a simplified summary in order to provide a basic understanding of some aspects of the methods and the computing systems discussed herein. This summary is not an extensive overview of the methods and the computing systems discussed herein. It is not intended to identify key/critical elements or to delineate the scope of such methods and the computing systems. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
The accompanying drawings illustrate various examples of the subject disclosure and, together with the general description given above, and the detailed description of the examples given below, serve to explain the examples of the subject disclosure. In the drawings, like reference numerals are used to indicate like parts in the various views.
Before turning to the detailed description of examples, some more general aspects of the techniques involved will be explained first.
The subject disclosure generally pertains to navigating in a facility. The term “navigate” has its common meaning and is especially understood as the determination of position and direction to a destination. The term “facility” includes all types of buildings and structures. Examples of facilities include airport buildings with one or more floors (e.g., check-in areas and terminal buildings), hospitals, shopping malls, etc. The subject disclosure more specifically concerns navigating inside a facility from a position in the facility to another position in the facility (i.e., indoor). As will be understood, the subject disclosure is not limited to indoor navigation and may also be used in outdoor navigation, i.e., navigating outside facilities, e.g., in a city. In the subject disclosure, elements, components and objects within the facility are commonly referred to as objects of the facility. Examples of objects include walls, pillars, doors, seats, sign boards, desks, kiosks, etc.
The subject disclosure uses machine learning (ML) techniques and applies algorithms and statistical models that computer systems use to perform special tasks without using explicit instructions, relying mainly on patterns and inference instead. Machine learning algorithms build a mathematical model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to perform the task. The subject disclosure uses, as its data basis, a mathematical model referred to as model data of the facility herein. The model data is based on a plurality of images from the facility.
As will be described, the subject disclosure uses techniques of object detection and object classification to detect instances of semantic objects of a certain class such as humans, buildings, cars, etc. in digital images and videos. Every object class has its own special features that help in classifying the class. The techniques for object detection and object classification are, e.g., ML-based or deep learning-based. Known ML-based approaches include histogram of oriented gradients (HOG) features. The object detection also includes feature extraction to extract the features from the digital images and feature recognition to recognize the extracted features as features that help in classification.
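As a hedged illustration of the HOG-based feature extraction mentioned above, the following sketch computes a HOG feature vector for an image block using scikit-image; it shows one known technique only, not the claimed implementation, and the function name and parameter values are assumptions.

```python
# Minimal sketch of HOG feature extraction with scikit-image (illustration only).
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog

def extract_hog_features(image_rgb: np.ndarray) -> np.ndarray:
    """Return a HOG feature vector for an RGB image block."""
    gray = rgb2gray(image_rgb)
    features = hog(
        gray,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        feature_vector=True,
    )
    return features

# The resulting vectors can be compared (e.g. by cosine similarity) with the
# feature vectors stored as object information in the model data.
```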
Now turning to
The method illustrated in
The method 300 starts at block 310 with generating the model data of the facility. Generating the model data will be described in more detail below with reference to
In an example, as a result of block 310, the model data of the facility comprises image data of a plurality of images. As will be described with reference to
At block 320, at least one image (also referred to as a tile) is received from the camera. In one specific example, the at least one image may also be represented by frames of a video stream acquired by a video camera. At a location within the facility, the camera is used to acquire at least one image from the surroundings at the location (i.e., a physical space in one direction from the location). The image may be transmitted in any suitable form, e.g., via the coupling, by the camera to the computing system which receives the image. Alternatively, the image may be stored in any suitable form at a memory coupled to the camera and retrieved by the computing system.
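Purely as an illustrative sketch, an image (or a frame of a video stream) could be received from a camera using OpenCV as follows; the device index and the function name are assumptions.

```python
# Hedged sketch: grabbing a single frame from a camera with OpenCV.
import cv2

def receive_image(device_index: int = 0):
    cap = cv2.VideoCapture(device_index)
    try:
        ok, frame = cap.read()          # BGR image as a numpy array
        if not ok:
            raise RuntimeError("No image could be acquired from the camera")
        return frame
    finally:
        cap.release()
```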
An example of an image 600 received from the camera is depicted in
Optionally, at block 325, information about a destination object in the facility to which the camera is to be navigated is received. At block 325, the destination location is estimated.
In one example, the operation of block 325 involves receiving, via an input device, information about a destination object (e.g., a name of the destination object). The information may be input in any suitable form, including text input, speech input, etc. The information about the destination object is used to search for an image or image block of an image including object information corresponding to the destination object. For example, in case the object information includes the name of the object, the model data can be searched using the name of the destination object as search key. If found, the location of the destination object is estimated, as the destination location, based on the image data of the images including the destination object.
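A minimal sketch of such a search, assuming the hypothetical ImageData/ObjectInfo structures sketched earlier and a simple name-based match, might look as follows; averaging the matching acquisition locations is only one simple way to estimate the destination location.

```python
# Illustration only: search the model data for a destination object by name.
from typing import List, Optional, Tuple

def find_destination_location(model: "List[ImageData]",
                              destination_name: str) -> Optional[Tuple[float, float]]:
    # Collect the acquisition locations of all images whose object information
    # matches the requested destination object.
    locations = [img.location
                 for img in model
                 if any(obj.object_id == destination_name for obj in img.objects)]
    if not locations:
        return None
    # Very simple estimate: average of the matching acquisition locations.
    xs, ys = zip(*locations)
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```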
At block 330, a current location of the camera in the facility is estimated. The estimating is performed based on the image and the model data of the facility. For navigating in the facility, the current location of the camera represents the starting point of the path to navigate along to the destination point. In block 330, the location where the image was acquired, i.e., where the camera was placed when acquiring the image, is estimated and used as the current location. Also, the orientation of the camera may be estimated (i.e., the direction such as North, East, South, West, etc., into which the camera was pointing when acquiring the image) with the estimated location as the base or reference point.
According to an example, the estimating in block 330 of the method of
First, at block 420, the image is divided into one or more image blocks. The image blocks are non-overlapping and contiguous (directly adjacent to each other). An example of dividing the image is exemplified in
In
Then, at block 422, object candidates corresponding to objects of the facility are detected in the one or more image blocks. In one example, the detecting of object candidates is performed for each of the image blocks separately and can thus be performed in parallel. This process is also referred to as object detection using the object detection model as described above to detect regions of interest in the image block. For example, features may be extracted from the image block and compared with information of features in the object information, i.e., features extracted from objects of the facility.
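For illustration only, the division of block 420 into n×n non-overlapping, contiguous image blocks could be implemented as in the following sketch (edge pixels beyond a multiple of n are cropped for simplicity); each returned block can then be processed separately, e.g., in parallel, to detect object candidates.

```python
# Sketch of dividing an image into n x n non-overlapping, contiguous blocks
# of equal size (illustration only).
import numpy as np

def divide_into_blocks(image: np.ndarray, n: int) -> list:
    h, w = image.shape[:2]
    bh, bw = h // n, w // n
    return [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(n) for c in range(n)]
```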
In
At block 424, distance values to the object candidates detected in block 422 are determined. As described above, the model data comprises the object information corresponding to an object of the facility and the distance information specifying a distance between the object and a location of the camera which acquired an image with the object. From the model data, for each object corresponding to a detected object candidate, the respective distance information is obtained based on the object information. Based on characteristics of the object candidate (e.g., the width in number of pixels) and the object in the model data, the distance value may be determined based on the distance information for the object. This determining may include triangulation or ML techniques, as will be understood by the skilled person.
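As a purely illustrative sketch of how such a distance value might be obtained from the apparent pixel width of an object candidate, a simple pinhole-camera relation could be used; the focal length value below is an assumption, and more elaborate triangulation or ML techniques may be used instead.

```python
# Illustrative distance estimate using a pinhole-camera relation: if the real
# width of an object is known from the model data and its apparent width in
# pixels is measured, the distance follows from the focal length (assumed value).
def estimate_distance(real_width_m: float,
                      pixel_width: float,
                      focal_length_px: float = 1000.0) -> float:
    return real_width_m * focal_length_px / pixel_width

# Example: an object known to be 2 m wide that appears 250 px wide is roughly
# 2 * 1000 / 250 = 8 m away from the camera.
```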
In
Moreover, at block 426, a distance between the object candidates detected in block 422 is determined based on the distance values determined in block 424. Again, as will be understood by the skilled person, triangulation or ML techniques may be applied.
In the example of
Finally, at block 430, the current location of the camera is estimated based on the location information from the model data and the distance values determined in block 426. As described, the location information specifies a location in the facility where the camera was placed when acquiring a sample or training image from which the image data and the object information of objects in the sample or training image were generated. For example, the location may be assumed as a reference location to which the distance information to the objects corresponds. Based thereon, relative locations of the objects may be determined. From the relative locations of the objects, the location of the camera can be derived using the determined distance values (e.g., at the intersection point of the distance values d11 and d12 from the relative location of O1 and the distance values d21 and d22 from the relative location of O2). As will be understood by the skilled person, other techniques including triangulation or ML techniques may be used.
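One simple, illustrative way to carry out such an estimation is a least-squares trilateration over the relative object locations and the determined distance values, as in the following sketch; this is only an example of the kind of computation involved, and the disclosed method may equally rely on other triangulation or ML techniques.

```python
# Hedged sketch: estimate the camera location from distance values to objects
# whose relative locations are known (simple least-squares trilateration).
import numpy as np

def trilaterate(anchor_xy: np.ndarray, distances: np.ndarray) -> np.ndarray:
    """anchor_xy: (k, 2) relative object locations; distances: (k,) values."""
    x0, y0 = anchor_xy[0]
    d0 = distances[0]
    # Linearise by subtracting the first circle equation from the others.
    A = 2 * (anchor_xy[1:] - anchor_xy[0])
    b = (np.sum(anchor_xy[1:] ** 2, axis=1) - x0 ** 2 - y0 ** 2
         + d0 ** 2 - distances[1:] ** 2)
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return solution  # estimated (x, y) of the camera

# Example: two objects at (0, 0) and (4, 0), each 2.5 m away, place the camera
# near x = 2, i.e. on the perpendicular bisector between the two objects.
```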
Optionally, in one example, the estimating in block 330 of the method of
The classified objects may be used to derive distance values and a relation between the classified objects.
Turning back to
According to an example, the generating in block 340 of the method of
First, at block 440, a relation between the object candidates and the destination object is determined. The destination object may be specified in accordance with the information received at block 325, as described above. The relation is determined based on the first and/or second relationship information from the model data. Based on the relations, distances and relative directions from one image block to another image block and from one image to another image are determined. In case of multiple relations, a score is determined and only the relation having the strongest score is used.
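Purely as an illustration of deriving a path from such relations, the first and second relationships may be viewed as weighted edges of a graph, with the shortest path between the current location and the destination object serving as the virtual path; the sketch below uses networkx for brevity, omits the scoring of competing relations, and uses hypothetical node names.

```python
# Illustrative sketch: relationships as weighted graph edges, virtual path as
# the shortest path between start and destination (assumptions throughout).
import networkx as nx

def derive_virtual_path(relationships, start_node, destination_node):
    """relationships: iterable of (from_node, to_node, distance_m) tuples."""
    graph = nx.Graph()
    for src, dst, distance in relationships:
        graph.add_edge(src, dst, weight=distance)
    return nx.shortest_path(graph, start_node, destination_node, weight="weight")

# Example:
# edges = [("O1", "O2", 5.0), ("O2", "gate_12", 50.0), ("O1", "gate_12", 80.0)]
# derive_virtual_path(edges, "O1", "gate_12")  ->  ["O1", "O2", "gate_12"]
```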
Examples for determining the relation according to block 442 are depicted in
The example of
The example of
At block 442 of
In one example of generating the virtual path in block 340 of the method of
An example of path classification with respect to image 600 of
In
Turning back to
In the method of
As described above, the method of
The generating in block 310 of
The following processing is performed for each of the plurality of images sequentially, in parallel, batch-wise, location-wise, or in any other suitable fashion. The processing is performed by a computer system receiving the plurality of images from the camera or retrieving the plurality of images from a storage device.
At block 482, depth information is determined. The depth information specifies distances from the location to the objects. In effect, since images are two-dimensional representations of the physical space around the location, the depth information represents the information of the third dimension. The depth information may be determined by a sensor associated with the camera, or by applying techniques such as stereo triangulation or time-of-flight. Also, the depth information may be determined using ML techniques based on the image (i.e., the image data) and image information provided by the camera, such as the metadata described above. The depth information may be determined for each individual pixel or groups of pixels of the image.
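As a hedged example of one of the techniques mentioned above, stereo triangulation yields a depth value from the disparity between two camera views as depth = focal_length × baseline / disparity; the numeric parameters in the following sketch are assumptions for illustration only.

```python
# Hedged sketch of depth from stereo triangulation (assumed camera parameters).
def stereo_depth(disparity_px: float,
                 focal_length_px: float = 1000.0,
                 baseline_m: float = 0.1) -> float:
    if disparity_px <= 0:
        raise ValueError("Disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example: a disparity of 20 px yields a depth of 1000 * 0.1 / 20 = 5 m.
```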
At block 484, location information is generated based on the location from which the image was acquired. As described above, the additional information such as the metadata associated with the image includes information on the location such that the location information may correspond to or be derived from the information on the location in the additional information. The location information may be represented by coordinates of a coordinate system applied to the facility, or relative to an adjacent or reference location. For example, the location information includes information that the location is five meters in the North direction and two meters in the East direction away from the reference location. It will be understood by the skilled person that any suitable representation of the location information can be used as long as the location is uniquely identified in the facility.
At block 486, the image is divided into one or more image blocks. The image blocks are non-overlapping and contiguous (directly adjacent to each other). An example of dividing the image is illustrated in FIG. 5. The image may be divided into one image block only. That is, the whole image is taken as the image block. The image may also be divided into 2×2, 4×4, or in general n×n (n being an integer number), image blocks, all having the same size. Dividing the image is however not limited to the example of
At block 488, objects of the facility are detected in the one or more image blocks. More specifically, the detecting in block 488 is performed in each image block. For example, as it will be understood by the skilled person, ML techniques may be used to detect objects in the image blocks. Also other techniques for detecting objects are apparent to the skilled person. The detecting results in object information describing features and characteristics of the detected objects. For example, object information may include information of a histogram, color, size, texture, etc. of the object. Also, the object information includes information indicating the image block of the image (e.g., an identifier for the image block).
At block 490, a distance between detected objects and the location is determined. In the determining, the depth information determined in block 482 is used. For example, the distance between the detected objects can be derived based on the distance of each detected object from the location, e.g., using triangulation or ML techniques. In one example, the distance between the objects and the location may also be measured. For each detected object or each image block, distance information is generated based on the determined distance.
A distance between detected objects in image blocks and a relative direction is calculated at block 492. For example, the distance between objects in image blocks may be calculated based on the depth information and/or the distance information. The relative direction describes how to navigate from one detected object in a first image block to another detected object in a second image block. For example, the distance may be five meters and the relative direction may be Northeast in order to describe that one navigates from the first object toward the Northeast and moves five meters to arrive at the second object. In one example, the distance and the relative direction form a first relationship based on which first relationship information is generated. Additionally, the first relationship information may indicate the image blocks including the detected objects (e.g., using identifiers of the image blocks). First relationships between image blocks are illustrated in
Moreover, at block 494, a distance between locations is determined. In order to describe a relationship (referred to as a second relationship) between a location from which a first image was acquired and another location from which a second image was acquired, the distance therebetween as well as a relative direction are determined. The determining is based on the respective location information of the images (e.g., the first and second image). Similar to the above-described relative direction, the relative direction with respect to the locations from which the images were acquired describes how to navigate from the location from which the first image was acquired to the location from which the second image was acquired. For example, the distance may be 50 meters and the relative direction may be North in order to describe that one navigates from the location from which the first image was acquired toward the North and moves 50 meters to arrive at the location from which the second image was acquired. In one example, the distance and the relative direction are used to generate second relationship information. Additionally, the second relationship information may indicate the images and/or the locations (e.g., using identifiers of the images or coordinates of the locations). Second relationships between images are illustrated in
At block 496, image data of the image is generated. The image data at least include the location information, the object information, the first relationship information and the second relationship information.
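For illustration only, the distances and relative directions entering the first and second relationship information could be derived from planar (x, y) coordinates as in the following sketch, where +y is taken as North and +x as East; these conventions and the eight-sector compass are assumptions.

```python
# Sketch of computing the distance and a coarse relative direction between two
# locations given as planar (x, y) coordinates (illustrative conventions only).
import math

_COMPASS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def distance_and_direction(from_xy, to_xy):
    dx, dy = to_xy[0] - from_xy[0], to_xy[1] - from_xy[1]
    distance = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    direction = _COMPASS[int((angle + 22.5) // 45) % 8]
    return distance, direction

# Example: from (0, 0) to (0, 50) -> (50.0, "N"), i.e. "move 50 m to the North".
```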
Performing the steps of the blocks 482 to 496 for a plurality of sample or training images acquired in block 480 and generating image data for each image generates the model data. The model data forms the model of the facility.
Finally,
In general, the routines executed to implement examples of the subject disclosure, whether implemented as part of an operating system or a specific application, component, program, object, module or sequence of instructions, or even a subset thereof, may be referred to herein as “computer program code,” or simply “program code.” Program code typically comprises computer-readable instructions that are resident at various times in various memory and storage devices in a computer and that, when read and executed by one or more processors in a computer, cause that computer to perform the operations necessary to execute operations and/or elements embodying the various aspects of the examples of the subject disclosure. Computer-readable program instructions for carrying out operations of the examples of the subject disclosure may be, for example, assembly language or either source code or object code written in any combination of one or more programming languages.
Various program code described herein may be identified based upon the application within which it is implemented in specific examples of the subject disclosure. However, it should be appreciated that any particular program nomenclature that follows is used merely for convenience, and thus the subject disclosure should not be limited to use solely in any specific application identified and/or implied by such nomenclature. Furthermore, given the generally endless number of manners in which computer programs may be organized into routines, procedures, methods, modules, objects, and the like, as well as the various manners in which program functionality may be allocated among various software layers that are resident within a typical computer (e.g., operating systems, libraries, API's, applications, applets, etc.), it should be appreciated that the examples of the subject disclosure are not limited to the specific organization and allocation of program functionality described herein.
The program code embodied in any of the applications/modules described herein is capable of being individually or collectively distributed as a program product in a variety of different forms. In particular, the program code may be distributed using a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to carry out aspects of the examples of the subject disclosure.
Computer-readable storage media, which is inherently non-transitory, may include volatile and non-volatile, and removable and non-removable tangible media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data. Computer-readable storage media may further include random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other solid state memory technology, portable compact disc read-only memory (CD-ROM), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and which can be read by a computer. A computer-readable storage medium should not be construed as transitory signals per se (e.g., radio waves or other propagating electromagnetic waves, electromagnetic waves propagating through a transmission media such as a waveguide, or electrical signals transmitted through a wire). Computer-readable program instructions may be downloaded to a computer, another type of programmable data processing apparatus, or another device from a computer-readable storage medium or to an external computer or external storage device via a network.
Computer-readable program instructions stored in a computer-readable medium may be used to direct a computer, other types of programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions that implement the functions, acts, and/or operations specified in the flow charts, sequence diagrams, and/or block diagrams. The computer program instructions may be provided to one or more processors of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the one or more processors, cause a series of computations to be performed to implement the functions, acts, and/or operations specified in the flow charts, sequence diagrams, and/or block diagrams.
In certain alternative examples, the functions, acts, and/or operations specified in the flow charts, sequence diagrams, and/or block diagrams may be re-ordered, processed serially, and/or processed concurrently consistent with examples of the subject disclosure. Moreover, any of the flow charts, sequence diagrams, and/or block diagrams may include more or fewer blocks than those illustrated consistent with examples of the subject disclosure.
The terminology used herein is for the purpose of describing particular examples only and is not intended to be limiting of the examples of the subject disclosure. It will be further understood that the terms “comprises” and/or “comprising,” when used in this subject disclosure, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Furthermore, to the extent that the terms “includes”, “having”, “has”, “with”, “comprised of”, or variants thereof are used, such terms are intended to be inclusive in a manner similar to the term “comprising”.
While all of the examples have been illustrated by a description of various examples and while these examples have been described in considerable detail, it is not the intention to restrict or in any way limit the scope to such detail. Additional advantages and modifications will readily appear to those skilled in the art. The subject disclosure in its broader aspects is therefore not limited to the specific details, representative apparatus and method, and illustrative examples shown and described. Accordingly, departures may be made from such details without departing from the scope of the general concept.