A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
The present application generally relates to robotics, and more specifically to systems and methods for detecting and correcting diverged computer readable maps for robotic devices.
The foregoing needs are satisfied by the present disclosure, which provides for, inter alia, systems and methods for detecting and correcting diverged computer readable maps for robotic devices.
Exemplary embodiments described herein have innovative features, no single one of which is indispensable or solely responsible for their desirable attributes. Without limiting the scope of the claims, some of the advantageous features will now be summarized. One skilled in the art would appreciate that, as used herein, the term robot may generally refer to an autonomous vehicle or object that travels a route, executes a task, or otherwise moves automatically upon executing or processing computer readable instructions.
According to at least one non-limiting exemplary embodiment, a method for detecting erroneous or divergent maps produced by a robot is disclosed. The method comprises: receiving, via at least one processor coupled to a robot, a computer readable map based on data collected by one or more sensors during navigation of a route; calculating at least one scoring metric, wherein the at least one scoring metric indicates quality of the computer readable map as a function of location on the map; and providing the at least one scoring metric to a model to determine whether the computer readable map is diverged.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a footprint score, wherein the footprint score is determined based on the at least one processor superimposing a plurality of footprints of the robot at discrete locations along the route; determining that one or more pixels including only occupied pixels or undetected pixels lie within a respective boundary of the plurality of footprints; and determining the footprint score based on the distance of the one or more pixels from the boundary, wherein individual pixels of the one or more pixels contribute a value to the footprint score proportional to their distance to the boundary.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a scan consistency score, wherein the scan consistency score is determined based on the at least one processor simulating a scan taken by one or more sensors at a plurality of discrete locations along the route, wherein the scan includes a plurality of distance measurements between the one or more sensors and objects taken at a plurality of angles, the simulating of the scan comprises extending digital rays representing the distance measurements for the plurality of angles; determining a magnitude of penetration of the digital rays through occupied pixels or unknown pixels on the computer readable map; and determining a scan consistency score for each node along the route based on the magnitude of penetration at each node.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a scan alignment score, wherein the scan alignment score is determined based on the at least one processor selecting a first location along the route and a plurality of other discrete locations proximate to the first location; computing a first translational difference between the first location and each of the plurality of other discrete locations using data from one or both of LiDAR sensors and odometry IMUs; and determining the scan alignment score based on a difference between the first translational difference and a second translational distance, the second translational distance corresponding to the distance between the first location and the plurality of other discrete locations on the map.
According to at least one non-limiting exemplary embodiment, the method further includes the at least one processor projecting a scan at a second plurality of locations, each one of the second plurality of locations corresponds to the first translational difference corresponding to the plurality of other discrete locations; and determining free space overlap between the scan taken at the first location and each one of the second plurality of locations, wherein non-overlapping free space increases the scan alignment score.
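By way of non-limiting illustration, the scan alignment comparison described above may be sketched in Python as follows. The function and argument names are hypothetical, the free-space grids are assumed to be precomputed, and the sketch is offered under those assumptions rather than as the disclosed implementation.

    import numpy as np

    def scan_alignment_score(first_pose, nearby_poses, measured_offsets,
                             reference_free, projected_free, overlap_weight=1.0):
        """Illustrative sketch of the scan alignment score.

        first_pose, nearby_poses : (x, y) locations of route nodes on the map.
        measured_offsets         : first translational differences, estimated from
                                   LiDAR scan matching and/or odometry (IMU) data.
        reference_free           : boolean grid of free space observed at first_pose.
        projected_free           : boolean grids of the same scan re-projected at each
                                   measured offset (the second plurality of locations).
        """
        score = 0.0
        for pose, offset, free in zip(nearby_poses, measured_offsets, projected_free):
            # second translational distance: separation of the two locations on the map
            map_offset = np.asarray(pose, float) - np.asarray(first_pose, float)
            # disagreement between the sensed offset and the mapped offset
            score += float(np.linalg.norm(np.asarray(offset, float) - map_offset))
            # non-overlapping free space between the reference scan and its projection
            score += overlap_weight * float(np.logical_xor(reference_free, free).sum())
        return score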
According to at least one non-limiting exemplary embodiment, the at least one processor may comprise one or more processors of the robot or one or more processors on a server in communication with the robot.
According to at least one non-limiting exemplary embodiment, the method further includes scan matching, wherein scan matching comprises the at least one processor determining a transformation minimizing spatial discrepancies between nearest neighboring points of two successive scans.
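Scan matching of this kind is commonly realized with an iterative-closest-point (ICP) style procedure. The following minimal two-dimensional sketch is illustrative only and not the disclosed implementation: it repeatedly pairs nearest neighboring points of two successive scans and solves for the rigid transformation minimizing their spatial discrepancy.

    import numpy as np

    def icp_2d(source, target, iterations=20):
        """Minimal 2D ICP sketch: estimates a rigid transform (R, t) aligning the
        `source` scan onto the `target` scan by repeatedly pairing nearest
        neighbors and solving the least-squares alignment (Kabsch/SVD).
        Both inputs are (N, 2) arrays of scan points."""
        src = np.asarray(source, float).copy()
        tgt = np.asarray(target, float)
        R_total, t_total = np.eye(2), np.zeros(2)
        for _ in range(iterations):
            # pair each source point with its nearest neighbor in the target scan
            d2 = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)
            matched = tgt[d2.argmin(axis=1)]
            # closed-form rigid alignment of the matched pairs
            src_c, tgt_c = src.mean(0), matched.mean(0)
            H = (src - src_c).T @ (matched - tgt_c)
            U, _, Vt = np.linalg.svd(H)
            R = Vt.T @ U.T
            if np.linalg.det(R) < 0:          # keep a proper rotation (no reflection)
                Vt[-1] *= -1
                R = Vt.T @ U.T
            t = tgt_c - R @ src_c
            src = src @ R.T + t               # apply the incremental transform
            R_total, t_total = R @ R_total, R @ t_total + t
        return R_total, t_total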
Another aspect provides a robotic system, comprising at least one processor coupled to a robot configured to execute computer readable instructions to receive a computer readable map based on data collected by one or more sensors during navigation of a route; calculate at least one scoring metric, wherein the at least one scoring metric indicates quality of the computer readable map as a function of location on the map; and provide the at least one scoring metric to a model to determine whether the computer readable map is diverged.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a footprint score, wherein the footprint score is determined based on the at least one processor superimposing a plurality of footprints of the robot at discrete locations along the route; determining that one or more pixels including only occupied pixels or undetected pixels lie within a respective boundary of the plurality of footprints; and determining the footprint score based on the distance of the one or more pixels from the boundary, wherein individual pixels of the one or more pixels contribute a value to the footprint score proportional to their distance to the boundary.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a scan consistency score, wherein the scan consistency score is determined based on the at least one processor simulating a scan taken by one or more sensors at a plurality of discrete locations along the route, the scan includes a plurality of distance measurements between the one or more sensors and objects taken at a plurality of angles, the simulating of the scan comprises extending digital rays representing the distance measurements for the plurality of angles; determining a magnitude of penetration of the digital rays through occupied pixels or unknown pixels on the computer readable map; and determining a scan consistency score for each node along the route based on the magnitude of penetration at each node.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a scan alignment score, wherein the scan alignment score is determined based on the at least one processor selecting a first location along the route and a plurality of other discrete locations proximate to the first location; computing a first translational difference between the first location and each of the plurality of other discrete locations using data from one or both of LiDAR sensors and odometry IMUs; and determining the scan alignment score based on a difference between the first translational difference and a second translational distance, the second translational distance corresponding to the distance between the first location and the plurality of other discrete locations on the map.
According to at least one non-limiting exemplary embodiment, the at least one processor is further configured to project a scan at a second plurality of locations, each one of the second plurality of locations corresponds to the first translational difference corresponding to the plurality of other discrete locations; and determine free space overlap between the scan taken at the first location and each one of the second plurality of locations, wherein non-overlapping free space increases the scan alignment score.
According to at least one non-limiting exemplary embodiment, the at least one processor coupled to the robot executes computer readable instructions to perform scan matching, wherein scan matching comprises the at least one processor determining a transformation minimizing spatial discrepancies between nearest neighboring points of two successive scans.
Another aspect provides a non-transitory computer readable storage medium having a plurality of instructions stored thereon which, when executed by at least one processor coupled to a robot, configure the at least one processor to receive a computer readable map based on data collected by one or more sensors during navigation of a route; calculate at least one scoring metric, wherein the at least one scoring metric indicates quality of the computer readable map as a function of location on the map; and provide the at least one scoring metric to a model to determine whether the computer readable map is diverged.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a footprint score, wherein the footprint score is determined based on the at least one processor superimposing a plurality of footprints of the robot at discrete locations along the route; determining that one or more pixels including only occupied pixels or undetected pixels lie within a respective boundary of the plurality of footprints; and determining the footprint score based on the distance of the one or more pixels from the boundary, wherein individual pixels of the one or more pixels contribute a value to the footprint score proportional to their distance to the boundary.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a scan consistency score, wherein the scan consistency score is determined based on the at least one processor simulating a scan taken by one or more sensors at a plurality of discrete locations along the route, the scan includes a plurality of distance measurements between the one or more sensors and objects taken at a plurality of angles, the simulating of the scan comprises extending digital rays representing the distance measurements for the plurality of angles; determining a magnitude of penetration of the digital rays through occupied pixels or unknown pixels on the computer readable map; and determining a scan consistency score for each node along the route based on the magnitude of penetration at each node.
According to at least one non-limiting exemplary embodiment, the at least one scoring metric includes a scan alignment score, wherein the scan alignment score is determined based on the at least one processor selecting a first location along the route and a plurality of other discrete locations proximate to the first location; computing a first translational difference between the first location and each of the plurality of other discrete locations using data from one or both of LiDAR sensors and odometry IMUs; and determining the scan alignment score based on a difference between the first translational difference and a second translational distance, the second translational distance corresponding to the distance between the first location and the plurality of other discrete locations on the map.
According to at least one non-limiting exemplary embodiment, the at least one processor is further configured to project a scan at a second plurality of locations, each one of the second plurality of locations corresponds to the first translational difference corresponding to the plurality of other discrete locations; and determine free space overlap between the scan taken at the first location and each one of the second plurality of locations, wherein non-overlapping free space increases the scan alignment score.
According to at least one non-limiting exemplary embodiment, the at least one processor is further configured to perform scan matching, wherein scan matching comprises the at least one processor determining a transformation minimizing spatial discrepancies between nearest neighboring points of two successive scans.
These and other objects, features, and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the disclosure. As used in the specification and in the claims, the singular form of “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise.
The disclosed aspects will hereinafter be described in conjunction with the appended drawings, provided to illustrate and not to limit the disclosed aspects, wherein like designations denote like elements.
All Figures disclosed herein are © Copyright 2023 Brain Corporation. All rights reserved.
Currently, many robots operate using computer readable maps which represent the physical environment around them. These maps are constructed using data from various sensors and internal monitoring units of the robots, each of which is subject to various errors, drift, biases, faulty calibration, and other issues. To represent the environment more accurately on the map, the processors of the robots aggregate not only sensory data but other constraints. For instance, to determine how far a robot has traveled in a straight line, data from a gyroscope, encoder, and light-detection and ranging sensor (“LiDAR”) may be used. Each of these instruments may produce a slightly different value for the translation due to imperfections (e.g., noise, bias, drift, etc.). Accordingly, the processor must account for the differing measurements and infer a most likely value for its translation. Other constraints may also be implemented. For instance, loop closures, or points along a route visited more than once by the robot, may constrain the inferences by ensuring these points are at the same location on the map. Additional constraints may be feature-based, wherein a robot seeing a feature at two or more locations can infer its position relative to that feature. Inferring a most likely value of a parameter, such as the locations of objects/the robot on the map, using imperfect data and various constraints may be referred to herein as optimization. For robots with well-tuned sensors navigating short routes, optimization of the maps is minimal. However, optimization processes for maps created by robots with many sensors (difficult to calibrate) navigating long routes (subject to odometry drift) in feature-poor environments (difficult to constrain) may fail to accurately represent the environment, thereby producing a diverged map. These diverged maps may cause, for example, a robot to think it is colliding with an object or stuck when, in the physical environment, it is able to move. Divergent map detection typically involves identifying many patterns which for humans are quite intuitive, wherein a human could look at most maps and immediately tell if they are diverged due to their familiarity with 3-dimensional space. However, for many robots, each producing unique maps of their unique environments, human review may become costly. Accordingly, there is a need in the art for systems and methods which automatically detect diverged maps produced by robots as well as correct them to be usable by the robots.
Various aspects of the novel systems, apparatuses, and methods disclosed herein are described more fully hereinafter with reference to the accompanying drawings. This disclosure can, however, be embodied in many different forms and should not be construed as limited to any specific structure or function presented throughout this disclosure. Rather, these aspects are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. Based on the teachings herein, one skilled in the art would appreciate that the scope of the disclosure is intended to cover any aspect of the novel systems, apparatuses, and methods disclosed herein, whether implemented independently of, or combined with, any other aspect of the disclosure. For example, an apparatus may be implemented or a method may be practiced using any number of the aspects set forth herein. In addition, the scope of the disclosure is intended to cover such an apparatus or method that is practiced using other structure, functionality, or structure and functionality in addition to or other than the various aspects of the disclosure set forth herein. It should be understood that any aspect disclosed herein may be implemented by one or more elements of a claim.
Although particular aspects are described herein, many variations and permutations of these aspects fall within the scope of the disclosure. Although some benefits and advantages of the preferred aspects are mentioned, the scope of the disclosure is not intended to be limited to particular benefits, uses, and/or objectives. The detailed description and drawings are merely illustrative of the disclosure rather than limiting, the scope of the disclosure being defined by the appended claims and equivalents thereof.
The present disclosure provides for systems and methods for detecting and correcting diverged computer readable maps for robotic devices. As used herein, a robot may include mechanical and/or virtual entities configured to carry out a complex series of tasks or actions autonomously. In some exemplary embodiments, robots may be machines that are guided and/or instructed by computer programs and/or electronic circuitry. In some exemplary embodiments, robots may include electro-mechanical components that are configured for navigation, where the robot may move from one location to another. Such robots may include autonomous and/or semi-autonomous cars, floor cleaners, rovers, drones, planes, boats, carts, trams, wheelchairs, industrial equipment, stocking machines, mobile platforms, personal transportation devices (e.g., hover boards, SEGWAY® vehicles etc.), trailer movers, vehicles, and the like. Robots may also include any autonomous and/or semi-autonomous machine for transporting items, people, animals, cargo, freight, objects, luggage, and/or anything desirable from one location to another.
As used herein, drift refers to an accumulation of error over time from a sensor unit or instrument. Drift is typically found in instruments which rely on integration of a signal over time, including gyroscopes, accelerometers, and the like, wherein variance in the integration intervals causes a small, yet compounding error to be propagated which may eventually grow to being substantial. For instance, many gyroscopes suffer from drift, primarily along the yaw axis due to the lack of a reference (gravity) to correct errors with respect to, which causes errors to accumulate and grow over time. This is particularly impactful for robots operating on flat surfaces (e.g., floors) that move in 2-dimensional space and turn about the yaw axis. In the case of gyroscopes, drift may be accounted for by nulling any sensed motion when it is known the gyroscope is idle; however, for robots 102, fully stopping in the middle of a task may be undesirable.
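As a simple numerical illustration (not drawn from the disclosure), even a small constant yaw-rate bias compounds when integrated into a heading estimate; the bias and rate values below are hypothetical.

    # Illustrative only: a 0.01 deg/s yaw-rate bias integrated at 100 Hz.
    bias_dps = 0.01                      # constant gyroscope bias, degrees per second
    dt = 0.01                            # integration interval, seconds (100 Hz)
    heading_error = 0.0
    for _ in range(60 * 60 * 100):       # one hour of operation
        heading_error += bias_dps * dt   # the error accumulates at every integration step
    print(heading_error)                 # roughly 36 degrees of accumulated yaw drift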
As used herein, a feature may comprise one or more numeric values (e.g., floating point, decimal, a tensor of values, etc.) characterizing an input from a sensor unit 114 including, but not limited to, detection of an object, the object itself, portions of the object, parameters of the object (e.g., size, shape, color, orientation, edges, etc.), an image as a whole, portions of the image (e.g., a hand of a painting of a human), color values of pixels of an image, depth values of pixels of a depth image, brightness of an image, changes of features over time (e.g., velocity, trajectory, etc. of an object), sounds, spectral energy of a spectrum bandwidth, motor feedback (i.e., encoder values), sensor values (e.g., gyroscope, accelerometer, GPS, magnetometer, etc. readings), a binary categorical variable, an enumerated type, a character/string, or any other characteristic of a sensory input. For example, a bottle of soap on a shelf may be a feature of the shelf, wherein a yellow price tag may be a feature of the bottle of soap and the shelf may be a feature of a store environment. The number of soap bottles sold may be a feature of the sales environment.
As used herein, network interfaces may include any signal, data, or software interface with a component, network, or process including, without limitation, those of the FireWire (e.g., FW400, FW800, FWS800T, FWS1600, FWS3200, etc.), universal serial bus (“USB”) (e.g., USB 1.X, USB 2.0, USB 3.0, USB Type-C, etc.), Ethernet (e.g., 10/100, 10/100/1000 (Gigabit Ethernet), 10-Gig-E, etc.), multimedia over coax alliance technology (“MoCA”), Coaxsys (e.g., TVNET™), radio frequency tuner (e.g., in-band or OOB, cable modem, etc.), Wi-Fi (802.11), WiMAX (e.g., WiMAX (802.16)), PAN (e.g., PAN/802.15), cellular (e.g., 3G, 4G, or 5G including LTE/LTE-A/TD-LTE, GSM, etc., and variants thereof), IrDA families, etc. As used herein, Wi-Fi may include one or more of IEEE-Std. 802.11, variants of IEEE-Std. 802.11, standards related to IEEE-Std. 802.11 (e.g., 802.11 a/b/g/n/ac/ad/af/ah/ai/aj/aq/ax/ay), and/or other wireless standards.
As used herein, processor, microprocessor, and/or digital processor may include any type of digital processing device such as, without limitation, digital signal processors (“DSPs”), reduced instruction set computers (“RISC”), complex instruction set computers (“CISC”) processors, microprocessors, gate arrays (e.g., field programmable gate arrays (“FPGAs”)), programmable logic device (“PLDs”), reconfigurable computer fabrics (“RCFs”), array processors, secure microprocessors, and application-specific integrated circuits (“ASICs”). Such digital processors may be contained on a single unitary integrated circuit die or distributed across multiple components.
As used herein, computer program and/or software may include any sequence of human or machine cognizable steps which perform a function. Such computer program and/or software may be rendered in any programming language or environment including, for example, C/C++, C#, Fortran, COBOL, MATLAB™, PASCAL, GO, RUST, SCALA, Python, assembly language, markup languages (e.g., HTML, SGML, XML, VoXML), and the like, as well as object-oriented environments such as the Common Object Request Broker Architecture (“CORBA”), JAVA™ (including J2ME, Java Beans, etc.), Binary Runtime Environment (e.g., “BREW”), and the like.
As used herein, connection, link, and/or wireless link may include a causal link between any two or more entities (whether physical or logical/virtual), which enables information exchange between the entities.
As used herein, computer and/or computing device may include, but are not limited to, personal computers (“PCs”) and minicomputers, whether desktop, laptop, or otherwise, mainframe computers, workstations, servers, personal digital assistants (“PDAs”), handheld computers, embedded computers, programmable logic devices, personal communicators, tablet computers, mobile devices, portable navigation aids, J2ME equipped devices, cellular telephones, smart phones, personal integrated communication or entertainment devices, and/or any other device capable of executing a set of instructions and processing an incoming data signal.
Detailed descriptions of the various embodiments of the system and methods of the disclosure are now provided. While many examples discussed herein may refer to specific exemplary embodiments, it will be appreciated that the described systems and methods contained herein are applicable to any kind of robot. Myriad other embodiments or uses for the technology described herein would be readily envisaged by those having ordinary skill in the art, given the contents of the present disclosure.
Advantageously, the systems and methods of this disclosure at least: (i) enable automatic detection of diverged maps; (ii) provide scoring metrics which are location-specific to indicate where on a map it began to diverge; and (iii) enable models to correct for diverged maps via reducing or minimizing the scoring metrics. Other advantages are readily discernable by one having ordinary skill in the art given the contents of the present disclosure.
Controller 118 may control the various operations performed by robot 102. Controller 118 may include and/or comprise one or more processing devices or processors (e.g., microprocessors) and other peripherals. As previously mentioned and used herein, processor, microprocessor, and/or digital processor may include any type of digital processing device such as, without limitation, digital signal processors (“DSPs”), reduced instruction set computers (“RISC”), complex instruction set computers (“CISC”), microprocessors, gate arrays (e.g., field programmable gate arrays (“FPGAs”)), programmable logic device (“PLDs”), reconfigurable computer fabrics (“RCFs”), array processors, secure microprocessors and application-specific integrated circuits (“ASICs”). Peripherals may include hardware accelerators configured to perform a specific function using hardware elements such as, without limitation, encryption/decryption hardware, algebraic processors (e.g., tensor processing units, quadratic problem solvers, multipliers, etc.), data compressors, encoders, arithmetic logic units (“ALU”), and the like. Such digital processors may be contained on a single unitary integrated circuit die or distributed across multiple components.
Controller 118 may be operatively and/or communicatively coupled to memory 120. Memory 120 may include any type of integrated circuit or other storage device configured to store digital data including, without limitation, read-only memory (“ROM”), random access memory (“RAM”), non-volatile random access memory (“NVRAM”), programmable read-only memory (“PROM”), electrically erasable programmable read-only memory (“EEPROM”), dynamic random-access memory (“DRAM”), mobile DRAM, synchronous DRAM (“SDRAM”), double data rate SDRAM (“DDR/2 SDRAM”), extended data output (“EDO”) RAM, fast page mode RAM (“FPM”), reduced latency DRAM (“RLDRAM”), static RAM (“SRAM”), flash memory (e.g., NAND/NOR), memristor memory, pseudostatic RAM (“PSRAM”), etc. Memory 120 may provide computer-readable instructions and data to controller 118. For example, memory 120 may be a non-transitory, computer-readable storage apparatus and/or medium having a plurality of instructions stored thereon, the instructions being executable by a processing apparatus (e.g., controller 118) to operate robot 102. In some cases, the computer-readable instructions may be configured to, when executed by the processing apparatus, cause the processing apparatus to perform the various methods, features, and/or functionality described in this disclosure. Accordingly, controller 118 may perform logical and/or arithmetic operations based on program instructions stored within memory 120. In some cases, the instructions and/or data of memory 120 may be stored in a combination of hardware, some located locally within robot 102, and some located remote from robot 102 (e.g., in a cloud, server, network, etc.).
It should be readily apparent to one of ordinary skill in the art that a processor may be internal to or on-board robot 102 and/or may be external to robot 102 and be communicatively coupled to controller 118 of robot 102 utilizing communication units 116 wherein the external processor may receive data from robot 102, process the data, and transmit computer-readable instructions back to controller 118. In at least one non-limiting exemplary embodiment, the processor may be on a remote server (not shown).
In some exemplary embodiments, memory 120, shown in
Still referring to
Returning to
In exemplary embodiments, navigation units 106 may include systems and methods that may computationally construct and update a map of an environment, localize robot 102 (e.g., find its position) in a map, and navigate robot 102 to/from destinations. The mapping may be performed by imposing data obtained in part by sensor units 114 into a computer-readable map representative at least in part of the environment. In exemplary embodiments, a map of an environment may be uploaded to robot 102 through user interface units 112, uploaded wirelessly or through wired connection, or taught to robot 102 by a user.
In exemplary embodiments, navigation units 106 may include components and/or software configured to provide directional instructions for robot 102 to navigate. Navigation units 106 may process maps, routes, and localization information generated by mapping and localization units, data from sensor units 114, and/or other operative units 104.
Still referring to
Actuator unit 108 may also include any system used for actuating and, in some cases actuating task units to perform tasks. For example, actuator unit 108 may include driven magnet systems, motors/engines (e.g., electric motors, combustion engines, steam engines, and/or any type of motor/engine known in the art), solenoid/ratchet systems, piezoelectric systems (e.g., an inchworm motor), magnetostrictive elements, gesticulation, and/or any actuator known in the art.
According to exemplary embodiments, sensor units 114 may comprise systems and/or methods that may detect characteristics within and/or around robot 102. Sensor units 114 may comprise a plurality and/or a combination of sensors. Sensor units 114 may include sensors that are internal to robot 102 or external, and/or have components that are partially internal and/or partially external. In some cases, sensor units 114 may include one or more exteroceptive sensors, such as sonars, light detection and ranging (“LiDAR”) sensors, radars, lasers, cameras (including video cameras (e.g., red-green-blue (“RGB”) cameras, infrared cameras, three-dimensional (“3D”) cameras, thermal cameras, etc.), time of flight (“ToF”) cameras, structured light cameras, etc.), antennas, motion detectors, microphones, and/or any other sensor known in the art. According to some exemplary embodiments, sensor units 114 may collect raw measurements (e.g., currents, voltages, resistances, gate logic, etc.) and/or transformed measurements (e.g., distances, angles, detected points in obstacles, etc.). In some cases, measurements may be aggregated and/or summarized. Sensor units 114 may generate data based at least in part on distance or height measurements. Such data may be stored in data structures, such as matrices, arrays, queues, lists, stacks, bags, etc.
According to exemplary embodiments, sensor units 114 may include sensors that may measure internal characteristics of robot 102. For example, sensor units 114 may measure temperature, power levels, statuses, and/or any characteristic of robot 102. In some cases, sensor units 114 may be configured to determine the odometry of robot 102. For example, sensor units 114 may include proprioceptive sensors, which may comprise sensors such as accelerometers, inertial measurement units (“IMU”), odometers, gyroscopes, speedometers, cameras (e.g., using visual odometry), clock/timer, and the like. Odometry may facilitate autonomous navigation and/or autonomous actions of robot 102. This odometry may include robot 102's position (e.g., where position may include robot's location, displacement and/or orientation, and may sometimes be interchangeable with the term pose as used herein) relative to the initial location. Such data may be stored in data structures, such as matrices, arrays, queues, lists, stacks, bags, etc. According to exemplary embodiments, the data structure of the sensor data may be called an image.
According to exemplary embodiments, sensor units 114 may be in part external to the robot 102 and coupled to communications units 116. For example, a security camera within an environment of a robot 102 may provide a controller 118 of the robot 102 with a video feed via wired or wireless communication channel(s). In some instances, sensor units 114 may include sensors configured to detect a presence of an object at a location such as, for example without limitation, a pressure or motion sensor may be disposed at a shopping cart storage location of a grocery store, wherein the controller 118 of the robot 102 may utilize data from the pressure or motion sensor to determine if the robot 102 should retrieve more shopping carts for customers.
According to exemplary embodiments, user interface units 112 may be configured to enable a user to interact with robot 102. For example, user interface units 112 may include touch panels, buttons, keypads/keyboards, ports (e.g., universal serial bus (“USB”), digital visual interface (“DVI”), Display Port, E-Sata, FireWire, PS/2, Serial, VGA, SCSI, audio port, high-definition multimedia interface (“HDMI”), personal computer memory card international association (“PCMCIA”) ports, memory card ports (e.g., secure digital (“SD”) and miniSD), and/or ports for computer-readable medium), mice, rollerballs, consoles, vibrators, audio transducers, and/or any interface for a user to input and/or receive data and/or commands, whether coupled wirelessly or through wires. Users may interact through voice commands or gestures. User interface units 112 may include a display, such as, without limitation, liquid crystal display (“LCDs”), light-emitting diode (“LED”) displays, LED LCD displays, in-plane-switching (“IPS”) displays, cathode ray tubes, plasma displays, high definition (“HD”) panels, 4K displays, retina displays, organic LED displays, touchscreens, surfaces, canvases, and/or any displays, televisions, monitors, panels, and/or devices known in the art for visual presentation. According to exemplary embodiments, user interface units 112 may be positioned on the body of robot 102. According to exemplary embodiments, user interface units 112 may be positioned away from the body of robot 102 and may be communicatively coupled to robot 102 (e.g., via communication units including transmitters, receivers, and/or transceivers) directly or indirectly (e.g., through a network, server, and/or a cloud). According to exemplary embodiments, user interface units 112 may include one or more projections of images on a surface (e.g., the floor) proximally located to the robot, e.g., to provide information to the occupant or to people around the robot. The information could be the direction of future movement of the robot, such as an indication of moving forward, left, right, back, at an angle, and/or any other direction. In some cases, such information may utilize arrows, colors, symbols, etc.
According to exemplary embodiments, communications unit 116 may include one or more receivers, transmitters, and/or transceivers. Communications unit 116 may be configured to send/receive a transmission protocol, such as BLUETOOTH®, ZIGBEE®, Wi-Fi, induction wireless data transmission, radio frequencies, radio transmission, radio-frequency identification (“RFID”), near-field communication (“NFC”), infrared, network interfaces, cellular technologies such as 3G (3.5G, 3.75G, 3GPP/3GPP2/HSPA+), 4G (4GPP/4GPP2/LTE/LTE-TDD/LTE-FDD), 5G (5GPP/5GPP2), or 5G LTE (long-term evolution, and variants thereof including LTE-A, LTE-U, LTE-A Pro, etc.), high-speed downlink packet access (“HSDPA”), high-speed uplink packet access (“HSUPA”), time division multiple access (“TDMA”), code division multiple access (“CDMA”) (e.g., IS-95A, wideband code division multiple access (“WCDMA”), etc.), frequency hopping spread spectrum (“FHSS”), direct sequence spread spectrum (“DSSS”), global system for mobile communication (“GSM”), Personal Area Network (“PAN”) (e.g., PAN/802.15), worldwide interoperability for microwave access (“WiMAX”), 802.20, long term evolution (“LTE”) (e.g., LTE/LTE-A), time division LTE (“TD-LTE”), global system for mobile communication (“GSM”), narrowband/frequency-division multiple access (“FDMA”), orthogonal frequency-division multiplexing (“OFDM”), analog cellular, cellular digital packet data (“CDPD”), satellite systems, millimeter wave or microwave systems, acoustic, infrared (e.g., infrared data association (“IrDA”)), and/or any other form of wireless data transmission.
Communications unit 116 may also be configured to send/receive signals utilizing a transmission protocol over wired connections, such as any cable that has a signal line and ground. For example, such cables may include Ethernet cables, coaxial cables, Universal Serial Bus (“USB”), FireWire, and/or any connection known in the art. Such protocols may be used by communications unit 116 to communicate to external systems, such as computers, smart phones, tablets, data capture systems, mobile telecommunications networks, clouds, servers, or the like. Communications unit 116 may be configured to send and receive signals comprising numbers, letters, alphanumeric characters, and/or symbols. In some cases, signals may be encrypted, using algorithms such as 128-bit or 256-bit keys and/or other encryption algorithms complying with standards such as the Advanced Encryption Standard (“AES”), RSA, Data Encryption Standard (“DES”), Triple DES, and the like. Communications unit 116 may be configured to send and receive statuses, commands, and other data/information. For example, communications unit 116 may communicate with a user operator to allow the user to control robot 102. Communications unit 116 may communicate with a server/network (e.g., a network) to allow robot 102 to send data, statuses, commands, and other communications to the server. The server may also be communicatively coupled to computer(s) and/or device(s) that may be used to monitor and/or control robot 102 remotely. Communications unit 116 may also receive updates (e.g., firmware or data updates), data, statuses, commands, and other communications from a server for robot 102.
In exemplary embodiments, operating system 110 may be configured to manage memory 120, controller 118, power supply 122, modules in operative units 104, and/or any software, hardware, and/or features of robot 102. For example, and without limitation, operating system 110 may include device drivers to manage hardware resources for robot 102.
In exemplary embodiments, power supply 122 may include one or more batteries, including, without limitation, lithium, lithium ion, nickel-cadmium, nickel-metal hydride, nickel-hydrogen, carbon-zinc, silver-oxide, zinc-carbon, zinc-air, mercury oxide, alkaline, or any other type of battery known in the art. Certain batteries may be rechargeable, such as wirelessly (e.g., by resonant circuit and/or a resonant tank circuit) and/or by plugging into an external power source. Power supply 122 may also be any supplier of energy, including wall sockets and electronic devices that convert solar, wind, water, nuclear, hydrogen, gasoline, natural gas, fossil fuels, mechanical energy, steam, and/or any power source into electricity.
One or more of the units described with respect to
As used herein, a robot 102, a controller 118, or any other controller, processor, or robot performing a task, operation or transformation illustrated in the figures below comprises a controller executing computer readable instructions stored on a non-transitory computer readable storage apparatus, such as memory 120, as would be appreciated by one skilled in the art.
Next referring to
One of ordinary skill in the art would appreciate that the architecture illustrated in
One of ordinary skill in the art would appreciate that a controller 118 of a robot 102 may include one or more processing devices 138 and may further include other peripheral devices used for processing information, such as ASICs, DSPs, proportional-integral-derivative (“PID”) controllers, hardware accelerators (e.g., encryption/decryption hardware), and/or other peripherals (e.g., analog to digital converters) described above in
Individual beams 208 of photons may localize respective points 204 of the wall 206 in a point cloud, the point cloud comprising a plurality of points 204 localized in 2D or 3D space as illustrated in
According to at least one non-limiting exemplary embodiment, sensor 202 may be illustrative of a depth camera or other ToF sensor configurable to measure distance, wherein the sensor 202 being a planar LiDAR sensor is not intended to be limiting. Depth cameras may operate similarly to planar LiDAR sensors (i.e., measure distance based on a ToF of beams 208); however, depth cameras may emit beams 208 using a single pulse or flash of electromagnetic energy, rather than sweeping a laser beam across a field of view. Depth cameras may additionally comprise a two-dimensional field of view rather than a one-dimensional, planar field of view.
According to at least one non-limiting exemplary embodiment, sensor 202 may be illustrative of a structured light LiDAR sensor configurable to sense distance and shape of an object by projecting a structured pattern onto the object and observing deformations of the pattern. For example, the size of the projected pattern may represent distance to the object and distortions in the pattern may provide information of the shape of the surface of the object. Structured light sensors may emit beams 208 along a plane as illustrated or in a predetermined pattern (e.g., a circle or series of separated parallel lines).
For robots 102 operating in three-dimensional space, such as aquatic robots or drones, the states for each node 216 may include (x, y, z, yaw, pitch, roll) state variables. Additionally, based on the functions of the robot 102, certain states of actuatable features may be added to denote the tasks to be performed by the robot 102. For instance, if the robot 102 is an item transport robot, additional state parameters which denote, e.g., “drop off object” and “pick up object” may be utilized. Other states may include the state of light sources on the robot 102, states of joints of a robotic arm, states of a vacuum for floor cleaning robots, and the like.
Each node 216 is connected to previous and subsequent nodes 216 via a link 218. Links 218 may denote a shortest path from one node 216 to the next node 216. Typically, the nodes 216 are spaced close enough such that the links 218, which comprise straight-line approximations, are sufficiently small to enable smooth movement even through turns. Other methods of non-linear interpolation may be utilized; however, such methods may be computationally taxing to calculate in real time and may or may not be preferred for all robots 102.
The nodes 216 denote a location of the robot 102 in the environment and exist on a computer readable map of the environment. The robot 102 may be defined on the map via a footprint, corresponding to the size/shape occupied by the robot 102 in the physical world, wherein the footprint includes a dimensionless origin point. The origin point defines the dimensionless location of the robot 102 in space. A transform is stored in memory 120 (e.g., specified by a manufacturer of the robot 102) and/or, in some embodiments, continuously updated (e.g., using calibration methods) between the robot origin and sensor origins 210 for each sensor of the robot 102, thereby enabling the controller 118 to translate a, e.g., 5-meter distance measured by a sensor 202 to an object into a distance from the robot 102 and/or location in the physical space.
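By way of non-limiting illustration, the stored robot-to-sensor transform may be applied as follows to place a range measurement onto the map; the function names, offsets, and the planar (x, y, yaw) convention are assumptions for illustration only.

    import numpy as np

    def rot2d(theta):
        # 2D rotation matrix for a yaw angle theta (radians)
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s], [s, c]])

    def range_to_map(robot_pose, sensor_offset, beam_angle, distance):
        """robot_pose: (x, y, yaw) of the robot origin on the map.
        sensor_offset: (x, y, yaw) of the sensor origin 210 in the robot frame,
        i.e., the stored transform. Returns the measured point in map coordinates."""
        rx, ry, ryaw = robot_pose
        sx, sy, syaw = sensor_offset
        # sensor origin expressed in the map frame
        sensor_map = np.array([rx, ry]) + rot2d(ryaw) @ np.array([sx, sy])
        # beam endpoint in the map frame
        total_angle = ryaw + syaw + beam_angle
        return sensor_map + distance * np.array([np.cos(total_angle), np.sin(total_angle)])

    # e.g., a 5-meter return straight ahead of a front-mounted LiDAR
    print(range_to_map((2.0, 3.0, np.pi / 2), (0.3, 0.0, 0.0), 0.0, 5.0))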
According to at least one non-limiting exemplary embodiment, the nodes 216 and links 218 therebetween are configured to reflect a user-demonstrated path. For instance, an operator may drive, push, lead, or otherwise move a robot 102 through a path. Navigation units 106 and sensor units 114 may provide data to the controller 118 to enable it to track the motions effected by the operator and sense the physical environment to construct a computer readable map.
Beams 208-2, 208-3, and 208-4 are incident upon the object 302 and therefore localize its surface nearest the sensor 202 at a distance corresponding to the measured distance by the sensor 202. Beams 208 do not travel into objects, so only the closest surface is localized. The closest surface is shown in dark black comprising occupied pixels 306. Since the top, bottom and rightmost boundaries (as illustrated in the figure) of the object 302 are not sensible from the perspective of sensor 202, the controller 118 cannot label the other edges as occupied, nor can it identify the internal space defined by these boundaries as occupied. Rather, the area behind the occupied segment 306 is denoted with unknown 304 pixels. In some embodiments, the entire map may be initialized to unknown space 304 to define the region of unknown space 304 illustrated in
Pixel states may change based on a set of priorities. An unknown pixel 304 may be changed to an occupied pixel 306 or free space pixel 308 if a new measurement is taken that either senses an object at the location of the pixel, thereby changing the unknown pixels 304 to an object pixel 306, or a traced ray (e.g., 208-1, 208-5) extends through the unknown pixel 304, changing the unknown pixel 304 to free space 308. An occupied pixel 306 may be changed to a free space pixel if a ray 208 extending from the sensor origin 210 to an object passes through the occupied pixel 306 (e.g., upon viewing the occupied pixel from a different perspective). An occupied pixel 306 may also be changed to free space 308 if a path of a ray which corresponds to no distance measurement (e.g., 208-1, 208-5) travels through the occupied pixel 306. In some embodiments, free space pixels 308 may be reverted to unknown pixels after a threshold time to require the robot 102 to plan its paths only in free space which it has measured as free space recently.
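A minimal sketch of the above update priorities follows; the state constants, timeout value, and function names are illustrative assumptions rather than the disclosed implementation.

    UNKNOWN, FREE, OCCUPIED = 0, 1, 2   # hypothetical pixel-state constants

    def update_pixel(current_state, hit, traced_through):
        """Apply the update priorities described above to a single map pixel.

        hit            : True if a new measurement sensed an object at this pixel.
        traced_through : True if a ray from the sensor origin passed through this
                         pixel on its way to an object or along a no-return path.
        """
        if hit:
            return OCCUPIED        # an object was sensed at this location
        if traced_through:
            return FREE            # the beam passed through, so the cell is free space
        return current_state       # no new information; keep the prior state

    def age_free_space(state, seconds_since_observed, timeout_s=300.0):
        """Optionally revert stale free space to unknown after a threshold time
        (the timeout value is illustrative)."""
        if state == FREE and seconds_since_observed > timeout_s:
            return UNKNOWN
        return state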
One skilled in the art may appreciate that computer readable maps for robots 102 may not perfectly represent an environment in which the robot 102 operates, while still being usable for the robot 102 to complete its tasks. For instance, small errors in a location of an object may be present; however, unless the robot 102 is required to interact with or navigate around the object, the errors may not cause a substantial issue to arise. In other instances, a pixel that should be classified as occupied by the boundary 306 of object 302 may be misclassified as free space. However, the surrounding pixels properly classified as occupied may be sufficient to cause the robot to consider boundary 306 to be in an occupied state despite a small number of pixels classified as free space. Accordingly, a map as used herein is considered diverged if the map does not accurately reflect the physical environment and/or the robot 102 is unable to perform its functions using the map. Various causes for map divergences are contemplated herein, none of which are intended to be limiting. A common error may arise from improper calibration of one or more sensors which, if undetected, could propagate errors through a map, as will be discussed below. Improper sensory calibration may cause errors in robot localization, which may further delocalize objects on the map. Another common source for map divergences is optimization of the computer readable map, which attempts to infer, using limited sensory data, the proper locations/positions of objects on the map by aggregating multiple measurements (which could be prone to calibration errors). Many feature-to-feature alignments, some of which are discussed herein, used to correct mapping errors can fail to determine a proper solution in environments with few features (e.g., many repeating identical hallways or aisles in a grocery store). Lastly, drift in odometry may cause a map to diverge due to an accumulation of error which, if left unaccounted for, would accumulate localization errors and propagate them through a map. In some instances, portions of a diverged map may be usable while other portions of the map are unusable, thereby making the entire map unusable.
Various other calibration errors may also be present which cause segments 406 to not properly represent the true position of the surface 402. For instance, a translational error downward on the page away from surface 402 in the sensor 202 position may cause the distances measured by beams 208 to be larger, and therefore localize segments 406 to be further from the sensor 202/robot 102. In some instances, such as for ToF sensors, ghosts may appear due to reflections, glare, and other environmental conditions. For instance, a LiDAR sensor 202 with beams 208 incident upon a reflective surface will not localize said surface and may instead localize another object upon a beam 208 reflecting off the reflective surface, reflecting off the object, and returning to the sensor 202, wherein the sensor 202 does not know a reflection occurred and cannot account for the change in beam path. Accordingly, the sensor 202 localizes a point 204 based on the ToF of the beam 208 while assuming the beam 208 traveled uninhibited in its emitted direction. The result would erroneously localize the other object at a distance and location based on the ToF, including the longer reflection path, and along the original emitted angle of the beam. Other environmental conditions, such as excessive white lighting, may drown out return signals, causing them to be unreadable or read incorrectly.
The positional errors in the localized segments 406 of surface 402 may not be materially impactful as the robot 102 navigates at its current position. For instance, the segments 406 may not protrude enough to contact the robot 102 body. However, often robots 102 navigate to a same location multiple times, wherein these protruding segments 406 may contact the robot 102 on a computer readable map. It is appreciated that the segments 406 are digital representations of where the controller 118 perceives surface 402 to be, wherein the robot's position intersecting with these segments 406 on the computer readable map does not necessarily involve physical contact.
A first map divergence scoring metric, referred to hereinafter as a footprint score, is discussed next in
S_fp = Σ_{n=1}^{N} Σ_{p=1}^{P} [ A·N(F) + B·d(p_unknown, p_occupied, F) ]        (Eqn. 1)
wherein N is the number of route nodes 216, P is the total number of pixels of the computer readable map, and A and B are constant weights. Function N(F) determines the number of non-free space pixels which lie within the footprint, F, boundary. Function d(p_unknown, p_occupied, F) determines, for each pixel of the map with occupied or unknown states, how far into the footprint boundary the pixel penetrates, and takes a value of zero for all unknown or occupied pixels not within the boundary of the footprint. For consistency of explanation herein, a higher score will correspond to more errors, such as more occupied pixels penetrating a footprint, though one skilled in the art may appreciate the scoring may be inverted and/or normalized. It can be appreciated that the robot footprint 502 may intersect with segment 406 at positions between nodes. However, calculating the footprint score at nodes 216 can be sufficient to score a map without undue computations.
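A minimal sketch of Eqn. 1 on a two-dimensional occupancy grid, assuming an axis-aligned rectangular footprint at each node, is given below; the state labels and function names are illustrative, with the count term N(F) and the penetration-depth term d(·) evaluated per node as described above.

    import numpy as np

    UNKNOWN, FREE, OCCUPIED = 0, 1, 2       # illustrative pixel states

    def footprint_score(grid, node_footprints, A=1.0, B=1.0):
        """Sketch of Eqn. 1: for each route node, count the non-free-space pixels
        inside the superimposed footprint (the N(F) term) and weight each such
        pixel by how far it penetrates from the footprint boundary (the d(...) term).

        grid            : 2D array of pixel states.
        node_footprints : per-node (row_min, row_max, col_min, col_max) bounds of
                          the (here axis-aligned) robot footprint on the map.
        """
        score = 0.0
        for r0, r1, c0, c1 in node_footprints:
            patch = grid[r0:r1, c0:c1]
            bad = (patch != FREE)                   # occupied or unknown pixels inside the footprint
            score += A * float(bad.sum())           # N(F): count of offending pixels
            rows, cols = np.nonzero(bad)
            # d(...): distance of each offending pixel to the nearest footprint edge
            depth = np.minimum.reduce([rows, (r1 - r0 - 1) - rows,
                                       cols, (c1 - c0 - 1) - cols]) + 1
            score += B * float(depth.sum())
        return score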
Next, a second scoring metric will be discussed in reference to
According to at least one non-limiting exemplary embodiment, the erroneous localization of the surface 402 shown by segments 602-1 and 602-2 may be caused by drift in odometry failing to accurately localize the robot 102, and thereby the sensor 202, at the proper physical location. In some instances, both odometry drift and miscalibrations may both cause errors to be present in a map.
As discussed above in reference to
Once the robot 102 has completed the route 604, the controller 118 may analyze the produced map for divergences. As shown in
A plurality of beams 208, which may be illustrative of simulations of the beams emitted by a LiDAR sensor or depth camera, are illustrated. Many of the beams 208 pass through a perceived object 602-2 denoted by occupied pixels within portion 608 because while at the location 202-1 the sensor did not detect the portion 608. Additionally, portion 608 was not cleared via ray tracing upon the robot 102 navigating to the portion 608 at a later time. Accordingly, there is a large disagreement between the local scan acquired at location 202-1 and the resulting computer readable map.
First, in
If the controller 118 determines all N nodes 216 of the route have been analyzed (i.e., n=N) at block 704, the controller 118 moves to block 714.
If at block 704 the controller 118 determines not all N nodes 216 of the route have been analyzed (i.e., n<N), the controller 118 moves to block 706.
Block 706 includes the controller 118 superimposing a digital representation of the robot, referred to as a footprint, onto the node n of the route. The footprint may estimate the size and shape of the robot 102 on the computer readable map and may comprise the same dimensions of the robot 102 in physical space scaled to the computer readable map.
Block 708 includes the controller 118 determining a first value based on the presence of one or more undetected space pixels 304 or object pixels 306 being present within the robot footprint boundaries. Assuming the robot 102 includes sensors capable of sensing the area along its direction of travel, it would not make physical sense for undetected space 304 to be present within the robot 102 body at a location where the robot 102 was traveling (i.e., the nth node 216), as this area should have been sensed prior to the robot 102 navigating there. Further, any object pixels 306 present within the boundary of the footprint (i.e., within an area occupied by the robot 102) do not make physical sense because a collision would have occurred if these pixels 306 represented true objects. Accordingly, it can be determined that these pixels 304, 306 that lie within the robot 102 footprint imposed at the node n are erroneous, and do not accurately represent the physical space. Given the physical limitations considered, i.e., that a robot 102 cannot have an object within itself without colliding with the object, these pixels 304, 306 will be scored against the quality of the map at the node n.
Block 710 includes the controller 118 determining a second value based on the penetration depth within the footprint of the one or more object pixels 306 or undetected space pixels 304 within the robot footprint. In one exemplary embodiment, for every undetected space or object pixel 304, 306 within the footprint, a distance to a nearest point along the boundary of the footprint is calculated. The aggregate of these distance measurements may form the second value. The second value should increase as the penetration depth of the pixels 304, 306 increases. Analogously, the second value calculates a penalty for unknown space 304 or objects 306 penetrating deep into the robot 102 footprint. As stated above, object pixels 306 penetrating deep into the footprint does not make physical sense as a collision would have occurred. Further, assuming the robot 102 does not have blind spots along its direction of travel, all regions where the robot 102 travels to and occupies should be sensed for objects, and therefore should include no unknown space 304. These pixels 304, 306 often penetrate the robot 102 footprint after map optimizations are performed or as a result of odometry drift and/or sensor calibration, wherein the second value increases to penalize the optimizations for causing the pixels 304, 306 to now intrude onto locations where the robot 102 needs to occupy in executing the route.
Block 712 includes the controller 118 incrementing n by one and returning to block 704 to repeat the footprint analysis for all N nodes of the route.
Block 714 includes the controller 118 determining a final footprint score of the route. In some embodiments, the final footprint score may be a single value based on the weighted summation of all the pixels 304, 306 which penetrate all N footprints for all N route nodes 216. In some embodiments, the footprint score may be a function of the route nodes 216 rather than a summation and may be stored as an array or matrix, wherein each element of the array/matrix corresponds to a footprint score for a corresponding node 216. While a single aggregate value for the footprint score may be used to determine if the map, as a whole, is diverged, denoting the footprint score per route node 216 may indicate locations where the map quality is poor. For example, by displaying the route nodes on the map and color coding the nodes based on their footprint score values, a heatmap of high and low scoring areas may be generated which may quickly communicate to a viewer where on the map footprint violations are occurring.
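As a further non-limiting sketch, and reusing the hypothetical footprint_penalty helper sketched above, the per-node penalties may either be kept as a function of the route nodes (e.g., for heatmap-style display) or summed into a single route-level footprint score; footprint_scores is an illustrative name only.

def footprint_scores(grid, nodes_rc, half_h, half_w):
    # Per-node depth penalties (one element per route node 216).
    per_node = [footprint_penalty(grid, n, half_h, half_w)[1] for n in nodes_rc]
    total = sum(per_node)      # single aggregate value for the whole route
    return total, per_node     # per-node values indicate where quality is poor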
Next, in
Block 722 includes the controller 118 superimposing a scan taken by a sensor 202 and associated with the node n on a computer readable map. For each node 216 of the route, at least one associated scan by a LiDAR sensor 202 is measured. The scan includes a plurality of distances captured at different angles around an origin 210 of the sensor 202, the origin 210 being at a known location on the map. The measurement may be considered as a set of values comprising an x, y, and, if applicable, z location of the origin 210 as well as θ and distance values extending from the origin 210.
Block 724 includes the controller 118 extending a plurality of rays corresponding to the scan of block 722. The rays travel radially from the origin 210 of the sensor a distance equal to the distances measured in the scan. Each ray projected may correspond to a discrete angle along which a LiDAR sensor 202 emits a beam 208. The projected rays correspond to the distances measured while the robot 102 is at the node n and not the distances that would be measured if the current map correctly reflected the environment, as shown in
Block 726 includes the controller 118 determining a first value based on the magnitude by which the rays travel through unknown space 304 or through object pixels 306. Stated another way, the controller 118 performs a ray tracing procedure on the computer readable map for the scan taken at the node n. If a ray passes through a mapped object pixel 306 and extends substantially behind the object pixel 306, it is likely that the object pixel 306 is erroneously localized or the sensor origin 210 is erroneously localized, as it would otherwise be impossible for a light-based ToF LiDAR sensor to sense points behind (i.e., through) an object. Thus, rays extending behind an object, into space that would be out of the line of sight of the sensor if the map were an accurate reflection of the physical space, are used as a penalty against map quality. Similarly for the unknown space 304 pixels, the rays are simulated scans by the LiDAR sensor 202, wherein the path traveled by the beams 208 comprises measured/sensed space and therefore should not be marked as unknown 304, but may have been determined to be unknown 304 by the controller 118 performing optimizations to the map.
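By way of a non-limiting illustration only, the following Python sketch approximates the penalty of block 726 by stepping along each simulated ray and accumulating the length traveled through occupied or unknown cells; the grid representation, the (angle, range) scan format, and the name scan_consistency_penalty are assumptions for illustration.

import math
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, 2   # assumed cell-state encoding

def scan_consistency_penalty(grid, origin_xy, scan, resolution, step=None):
    """Accumulate the length (in meters) that simulated rays travel through
    OCCUPIED or UNKNOWN cells of the map; scan is a list of (angle, range)."""
    step = step if step is not None else resolution / 2.0  # sub-cell sampling
    penalty = 0.0
    ox, oy = origin_xy
    for angle, rng in scan:
        dist = 0.0
        while dist < rng:
            x = ox + dist * math.cos(angle)
            y = oy + dist * math.sin(angle)
            r, c = int(y / resolution), int(x / resolution)
            if 0 <= r < grid.shape[0] and 0 <= c < grid.shape[1]:
                if grid[r, c] in (OCCUPIED, UNKNOWN):
                    penalty += step   # length of ray inside inconsistent cells
            dist += step
    return penalty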
By way of an illustrative non-limiting example with reference to
Block 728 includes the controller 118 incrementing n by one.
Block 730 includes the controller 118 determining a scan consistency score based on the first values for all N nodes 216. Like the footprint score discussed above, the scan consistency score may comprise a single numeric value as a summation of all scan consistency scores for all nodes, or as a function of the scan consistency score for each node 216. The scan consistency score provides analysis for localized disagreements between a measurement by a sensor of a physical space and the final, optimized map of the physical space.
Block 738 includes the controller 118 retrieving a scan taken from node 216 n and one or more scans corresponding to one or more other nodes 216 proximate the node 216 n. The at least one other node 216 may be sequential nodes, e.g., n±5, or may be nodes 216 which are within a threshold distance from node 216 n (e.g., a switchback as shown in
Using the scans from these nodes 216, the controller 118 may align them to determine a transform which corresponds to the translations and rotations performed to move the robot 102 from one node to another. A more detailed explanation of how scan alignment yields translation information is provided below in
According to at least one non-limiting exemplary embodiment, block 738 may calculate the translation instead using data from other localizing sensors such as, but not limited to, gyroscopes, encoders, motor feedback, and the like which measure the state of the robot 102 rather than external features (e.g., LiDAR sensors detecting ranges to objects). In some embodiments, both methods for determining actual translation by the robot 102 may be used to better measure the true displacement of the robot 102 between nodes. It is noted that the “actual/true” displacement of the robot 102 may not correspond to the depicted displacement between nodes 216 on the map, but rather corresponds to measurements collected by these sensors of physical space/states before, e.g., map optimizations. While data from wheel encoders may be more reliable for determining displacement of the robot 102 in a local frame, unlike scan matching it cannot be used for non-sequential route nodes 216 because the non-sequential route nodes 216 may be separated by a length of route sufficient to allow drift in the odometry units to propagate meaningful errors.
Block 740 includes the controller 118 computing a first value equal to a difference between the (i) calculated translation between node n and the at least one other node selected, and (ii) a translation between node n and the at least one other node based on the map. If the map were an accurate reflection of the physical space and displacement of the robot 102, there should be no disagreement. However, map optimizations which attempt to align features may fail to represent the physical space if the optimizations align incorrect features, which may be common in feature-poor environments such as grocery stores with many aisles that appear identical to a LiDAR sensor. For instance, an optimizer may align the ends of one aisle with the ends of another aisle, causing the nodes 216 which pass through both aisles to be closer together than in the physical space (e.g.,
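A minimal sketch of the first value of block 740 is provided below, assuming the measured translation (from scan matching or odometry) and the node positions on the map are available as planar coordinates; translation_disagreement is an illustrative name only.

import numpy as np

def translation_disagreement(measured_delta_xy, node_n_xy, node_m_xy):
    """Difference between the sensor-measured translation from node n to node m
    and the translation implied by the node positions on the optimized map."""
    mapped_delta = np.asarray(node_m_xy, dtype=float) - np.asarray(node_n_xy, dtype=float)
    return float(np.linalg.norm(np.asarray(measured_delta_xy, dtype=float) - mapped_delta))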
Block 742 includes the controller 118 further calculating, in addition to the disagreements between the measured translation and the mapped translation of the robot 102 computed at block 740, the alignment of free space 308 detected at node n with the free space detected at the at least one other node. As discussed above in
To state it another way, at node n the controller 118 may calculate a plurality of free space pixels 308, denoted as M_n, comprising a tensor of pixels and their corresponding states (i.e., unknown, occupied, free space). The controller 118 may also calculate M_m for nodes 216 m, wherein nodes 216 m correspond to the selected at least one other node in block 738. The controller 118 may perform a pairwise comparison of these matrices, M_n and M_m, for pixels at the same physical location on the map. For instance, if pixel (x1, y1) of M_n is a free space pixel but a pixel (x2, y2) of M_m corresponding to the same physical space/location includes an occupied pixel 306, the second value may increase as a means to penalize the map quality. If (x1, y1) of M_n is occupied but the same location is free space in M_m, no penalty is added, as the penalty indicating the disagreement will be added when node m becomes the base node for comparison (i.e., to avoid double counting penalties). If both are free space pixels, no penalty is added. If pixel (x1, y1) of M_n is a free space pixel but a pixel (x2, y2) of M_m corresponding to the same physical space/location includes an unknown pixel 304, no penalty is added. If pixel (x3, y3) of a third M_n′ corresponding to the same physical space/location as free space pixel (x1, y1) and unknown pixel (x2, y2) is an occupied pixel, a penalty would be assigned when n′ is the base node for comparison because the same space/location cannot be both free space and occupied. The controller 118 may continue to perform this comparison for every element of the matrices, then repeat for every pair of nodes.
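A non-limiting sketch of this pairwise comparison is shown below, assuming M_n and M_m are equally sized two-dimensional arrays of cell states registered to the same region of the map; only the free-at-base versus occupied-at-other case is penalized, so each disagreement is counted once across a pair of nodes. The name free_space_conflict_penalty is an assumption for illustration.

import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, 2   # assumed cell-state encoding

def free_space_conflict_penalty(m_base, m_other, weight=1.0):
    """Count cells that are FREE in the base node's matrix but OCCUPIED in the
    other node's matrix; UNKNOWN cells and free/free agreements add no penalty."""
    conflicts = np.logical_and(m_base == FREE, m_other == OCCUPIED)
    return weight * float(np.count_nonzero(conflicts))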
Block 744 includes the controller 118 incrementing n by one.
Block 746 includes the controller 118 determining a scan alignment score based on the cumulative values of the first and second values for all N nodes 216 of the route. That is, based on the first value for node n (block 740) and the second value for the node n (block 742), a scan alignment score may be determined for the single node n. The same score may then be calculated for all N nodes 216 and be represented as a single value or as a score-per-node function, similar to the footprint and scan consistency scores discussed above. The scan alignment score provides two constraints: (i) ensuring that the translation of the robot 102 depicted on the map is accurate to its true translation in physical space, and (ii) ensuring that physical objects are not substantially moved, as will be illustrated below in
The route 802 includes two loops between aisles formed by the objects 804, 806, 808, and 810. Since the aisles appear identical, and there are no other features nearby to identify which aisle is which, the controller 118, in performing map optimizations, may erroneously align the end of the middle aisle between objects 806 and 808 with the end of the lower aisle between objects 808 and 810. Since the encoder is under-biased for left hand turns, the final left hand turns when exiting the aisle between objects 808, 810 and entering the aisle between objects 806, 808 in
Further, since the environment is feature-poor, when exiting the aisle between objects 808, 810 the controller 118 may not be able to determine whether the robot 102 is exiting from between objects 808, 810 or objects 806, 808 (further made difficult by the biased odometry) and may combine the two exits, as shown by the objects 808 and 810 on the map 812 in
For instance, the footprint score (
If odometry drift and sensor calibration can be controlled and accounted for, the robot 102 should produce a map of the environment shown in
In
To illustrate further, with reference to
To detect if the map in
In addition to feature matching, another primary constraint used in map construction is loop closures, or points along the route 802 where the robot 102 visits more than once. These points are not only constrained to be at the same physical space (i.e., should be at the same pixel on the map) but also constrain the segment of route beginning and ending at the point (i.e., the “loop”). For instance, consider a perfect circle route which begins and ends at the same location. The route is constrained in that (i) it must include two points at the same location, and (ii) includes a loop with a length measured by sensor units 114. If the optimized computer readable map does not include the same length of loop as measured via sensor units 114, the scan alignment score will increase.
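By way of a non-limiting sketch, the loop-closure constraint may be checked by comparing the loop length measured by sensor units 114 with the length of the same loop on the optimized map; the representation of the measured motion as scalar step lengths and the name loop_length_mismatch are assumptions for illustration.

import numpy as np

def loop_length_mismatch(measured_step_lengths, map_loop_nodes_xy):
    """measured_step_lengths: per-step distances (meters) reported by sensor
    units along the loop; map_loop_nodes_xy: (N, 2) node positions of the same
    loop on the optimized map. Returns the absolute length disagreement."""
    measured_len = float(np.sum(measured_step_lengths))
    nodes = np.asarray(map_loop_nodes_xy, dtype=float)
    map_len = float(np.sum(np.linalg.norm(np.diff(nodes, axis=0), axis=1)))
    return abs(measured_len - map_len)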
Given enough maps, the three scoring functions denoted herein, f_fp(n), f_s(n), and f_sa(n) for nodes n∈[1, N], corresponding to the footprint score, scan consistency score, and scan alignment score, respectively, may provide sufficient data to form patterns used to identify diverged maps, and where the maps diverge. More specifically, the scores as a function of location along the route, n, in addition to the map data itself (i.e., the matrix of pixels and their corresponding states) may be provided as training data to a model to configure the model to predict if a given map is diverged using the three scores. Preferably, the computation of the scores and execution of the model to detect diverged maps are performed by a processor separate from the robot 102 so as not to overburden the controller 118 with tasks beyond its normal task of operating the robot 102. However, the controller 118 may perform the inference provided (i) the robot 102 is idle, or (ii) the controller 118 comprises sufficient computing bandwidth.
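As a non-limiting sketch only, the three per-node score functions may be summarized into a fixed-length feature vector per map and provided, together with divergence labels, to a simple classifier; the disclosure does not prescribe a particular model, and the helper names and the use of scikit-learn's LogisticRegression here are assumptions for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

def summarize(scores):
    s = np.asarray(scores, dtype=float)
    return [s.sum(), s.mean(), s.max()]          # simple per-score statistics

def build_features(f_fp, f_s, f_sa):
    # One fixed-length feature row per map from the three score functions.
    return np.array(summarize(f_fp) + summarize(f_s) + summarize(f_sa))

def train_divergence_model(maps_scores, labels):
    """maps_scores: list of (f_fp, f_s, f_sa) per-node score arrays, one per
    map; labels: 1 if the corresponding map was diverged, 0 otherwise."""
    X = np.stack([build_features(*m) for m in maps_scores])
    return LogisticRegression(max_iter=1000).fit(X, np.asarray(labels))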
The input nodes 902 may receive a numeric value x_i of a sensory input of a feature, i being an integer index. For example, x_i may represent color values of an ith pixel of a color image. The input nodes 902 may output the numeric value x_i to one or more intermediate nodes 906 via links 904. Each intermediate node 906 may be configured to receive a numeric value on its respective input link 904 and output another numeric value k_{i,j} to links 908 following equation 2 below:
k_{i,j} = a_{i,j}x_0 + b_{i,j}x_1 + c_{i,j}x_2 + d_{i,j}x_3     (Eqn. 2)
Index i corresponds to a node number within a layer (e.g., x_1 denotes the first input node 902 of the input layer, indexing from zero). Index j corresponds to a layer, wherein j would be equal to one for the first intermediate layer 914-1 of the neural network 900 illustrated; however, j may be any number corresponding to a neural network 900 comprising any number of intermediate layers 914. Constants a, b, c, and d represent weights to be learned in accordance with a training process. The number of constants of equation 2 may depend on the number of input links 904 to a respective intermediate node 906. In this embodiment, all intermediate nodes 906 are linked to all input nodes 902, but this is not intended to be limiting. Intermediate nodes 906 of the second (rightmost) intermediate layer 914-2 may output values k_{i,2} to respective links 912 following equation 2 above. It is appreciated that constants a, b, c, d may be of different values for each intermediate node 906. Further, although the above equation 2 utilizes addition of inputs multiplied by respective learned coefficients, other operations are applicable, such as convolution operations, thresholds for input values for producing an output, and/or biases, wherein the above equation is intended to be illustrative and non-limiting.
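For illustration only, the computation of equation 2 for all intermediate nodes of one layer may be expressed as a weighted sum over the input values, as in the following sketch; the function name intermediate_layer and the vectorized form are assumptions, not part of the disclosure.

import numpy as np

def intermediate_layer(x, W):
    """x: input values [x_0, x_1, x_2, x_3]; W: learned weights with one row
    [a, b, c, d] per intermediate node. Returns k_{i,j} for every node i."""
    return np.asarray(W, dtype=float) @ np.asarray(x, dtype=float)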
Output nodes 910 may be configured to receive at least one numeric value k_{i,j} from at least an ith intermediate node 906 of a final (i.e., rightmost) intermediate layer 914. As illustrated, for example, each output node 910 receives numeric values k_{0,2} through k_{7,2} from the eight intermediate nodes 906 of the second intermediate layer 914-2. The output of the output nodes 910 may comprise a classification of a feature of the input nodes 902. The output e_i of the output nodes 910 may be calculated following a substantially similar equation as equation 2 above (i.e., based on learned weights and inputs from connections 912). Following the above example where inputs x_i comprise pixel color values of an RGB image, the output nodes 910 may output a classification e_i of each input pixel (e.g., pixel i is a car, train, dog, person, background, soap, or any other classification). Other outputs of the output nodes 910 are considered, such as, for example, output nodes 910 predicting a temperature within an environment at a future time based on temperature measurements provided to input nodes 902 at prior times and/or at different locations.
The training process comprises providing the neural network 900 with both input and output pairs of values to the input nodes 902 and output nodes 910, respectively, such that weights of the intermediate nodes 906 may be determined. An input and output pair comprise a ground truth data input comprising values for the input nodes 902 and corresponding correct values for the output nodes 910 (e.g., an image and corresponding annotations or labels). The determined weights configure the neural network 900 to receive input to input nodes 902 and determine a correct output at the output nodes 910. By way of illustrative example, annotated (i.e., labeled) images may be utilized to train a neural network 900 to identify objects or features within the image based on the annotations and the image itself; the annotations may comprise, e.g., pixels encoded with “cat” or “not cat” information if the training is intended to configure the neural network 900 to identify cats within an image. The unannotated images of the training pairs (i.e., pixel RGB color values) may be provided to input nodes 902 and the annotations of the image (i.e., classifications for each pixel) may be provided to the output nodes 910, wherein weights of the intermediate nodes 906 may be adjusted such that the neural network 900 generates the annotations of the image based on the provided pixel color values to the input nodes 902. This process may be repeated using a substantial number of labeled images (e.g., hundreds or more) such that ideal weights of each intermediate node 906 may be determined. The training process is complete when predictions made by the neural network 900 fall below a threshold error rate which may be defined using a cost function.
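The training idea described above may be sketched, under simplifying assumptions, as gradient descent on a cost function over the provided input/output pairs; plain mean-squared error on a single linear layer is used purely for illustration and is not the specific training procedure of the disclosure.

import numpy as np

def train(X, Y, lr=0.01, tol=1e-3, max_epochs=10000):
    """X: (num_pairs, num_inputs) ground-truth inputs; Y: (num_pairs, num_outputs)
    corresponding correct outputs. Returns learned weights W."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(Y.shape[1], X.shape[1]))
    for _ in range(max_epochs):
        pred = X @ W.T                       # forward pass (cf. equation 2)
        err = pred - Y
        cost = float(np.mean(err ** 2))      # cost function
        if cost < tol:                       # training complete below threshold
            break
        W -= lr * (err.T @ X) / len(X)       # adjust weights toward the labels
    return W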
As used herein, a training pair may comprise any set of information provided to input and output of the neural network 900 for use in training the neural network 900. For example, a training pair may comprise an image and one or more labels of the image (e.g., an image depicting a cat and a bounding box associated with a region occupied by the cat within the image).
Neural network 900 may be configured to receive any set of numeric values representative of any feature and provide an output set of numeric values representative of the feature. For example, the inputs may comprise color values of a color image and outputs may comprise classifications for each pixel of the image. As another example, inputs may comprise numeric values for a time dependent trend of a parameter (e.g., temperature fluctuations within a building measured by a sensor) and output nodes 910 may provide a predicted value for the parameter at a future time based on the observed trends, wherein the trends may be utilized to train the neural network 900. Training of the neural network 900 may comprise providing the neural network 900 with a sufficiently large number of training input/output pairs comprising ground truth (i.e., highly accurate) training data. As a third example, audio information may be provided to input nodes 902 and a meaning of the audio information may be provided to output nodes 910 to train the neural network 900 to identify words and speech patterns.
Generation of the sufficiently large number of input/output training pairs may be difficult and/or costly to produce. Accordingly, most contemporary neural networks 900 are configured to perform a certain task (e.g., classify a certain type of object within an image) based on training pairs provided, wherein the neural networks 900 may fail at other tasks due to a lack of sufficient training data and other computational factors (e.g., processing power). For example, a neural network 900 may be trained to identify cereal boxes within images, but the same neural network 900 may fail to identify soap bars within the images.
As used herein, a model may comprise the weights of intermediate nodes 906 and output nodes 910 learned during a training process. The model may be analogous to a neural network 900 with fixed weights (e.g., constants a, b, c, d of equation 2), wherein the values of the fixed weights are learned during the training process. A trained model, as used herein, may include any mathematical model derived based on a training of a neural network 900. One skilled in the art may appreciate that utilizing a model from a trained neural network 900 to perform a function (e.g., identify a feature within sensor data from a robot 102) utilizes significantly less computational resources than training of the neural network 900, as the values of the weights are fixed. This is analogous to using a predetermined equation to solve a problem as compared to determining the equation itself based on a set of inputs and results.
According to at least one non-limiting exemplary embodiment, one or more outputs k_{i,j} from intermediate nodes 906 of a jth intermediate layer 914 may be utilized as inputs to one or more intermediate nodes 906 of an mth intermediate layer 914, wherein index m may be greater than or less than j (e.g., a recurrent or feed forward neural network). According to at least one non-limiting exemplary embodiment, a neural network 900 may comprise N dimensions for an N dimensional feature (e.g., a 9-dimensional input image or point cloud), wherein only one dimension has been illustrated for clarity. One skilled in the art may appreciate a plurality of other embodiments of a neural network 900, wherein the neural network 900 illustrated represents a simplified embodiment of a neural network to illustrate the structure, utility, and training of neural networks and is not intended to be limiting. The exact configuration of the neural network used may depend on (i) processing resources available, (ii) training data available, (iii) quality of the training data, and/or (iv) difficulty or complexity of the classification/problem. Further, programs such as AutoKeras utilize automatic machine learning (“AutoML”) to enable one of ordinary skill in the art to optimize a neural network 900 design to a specified task or data set.
Scan matching comprises a controller 118 of the robot 102 determining a transformation 1004 along x, y, and θ which aligns the two sets of points 204. That is, the transformation 1004 minimizes the spatial discrepancies 1002 between nearest neighboring points of the two successive scans. Starting in
It is appreciated that scan matching may include the controller 118 iteratively applying rotations, translations, rotations, translations, etc. until discrepancies 1002 are minimized. That is, controller 118 may apply small rotations which reduce discrepancies 1002, followed by small translations further reducing discrepancies 1002, and iteratively repeat until the discrepancies 1002 no longer decrease under any rotation or translation. Controller 118 immediately applying the correct rotation of θ° followed by the correct translation of [x1, y1] is for illustrative purposes only and is not intended to be limiting. Such iterative algorithms may include, without limitation, iterative closest point (“ICP”) and/or pyramid scan matching algorithms commonly used within the art.
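As a non-limiting illustration of this iterative alignment, the following sketch pairs nearest-neighboring points and solves a best-fit rigid transform per iteration (an ICP-style approach); the two-dimensional point-set representation and helper names are assumptions, and this is not presented as the specific matcher used by controller 118.

import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst
    (both (N, 2) numpy arrays of paired points)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, dst_c - R @ src_c

def icp(src, dst, iters=50, tol=1e-6):
    """Iteratively align src to dst; returns the overall (R, t), analogous to
    the transform 1004 that minimizes the discrepancies 1002."""
    tree = cKDTree(dst)
    cur = src.copy()
    prev_err = np.inf
    for _ in range(iters):
        dists, idx = tree.query(cur)           # nearest-neighbour pairing
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t                    # apply incremental transform
        err = float(dists.mean())              # remaining discrepancy
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return best_rigid_transform(src, cur)      # overall transform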
Transform 1004 may denote the transformation or change in position of the object of which points 204 localize as perceived from the reference frame of the robot 102. When viewed from an external reference frame, such as one defined about a static origin point in the environment (e.g., a point on a floor), the transform 1004 may instead denote the change in position of the robot 102 between the first and second scans, assuming the object is stationary. Accordingly, scan matching may be utilized to determine motions of the robot 102 between two scans. That is, the robot 102 may translate and rotate by an amount denoted by transform 1004 during the time between acquisition of the first set of points 204 and the second set of points 204 from the same sensor 202.
Although illustrated in two dimensions, scan matching between two sets of points 204 may occur in three dimensions. That is, transform 1004 may further include [z, pitch, roll] transformations which cause the two sets of points 204 to align in some non-limiting exemplary embodiments.
In addition to determining a displacement, the scan matching may also be utilized to correlate two features of two scans as the same feature. A first scan of an object would indicate its shape, wherein the object could be found in a second scan by aligning points of the second scan to points in the first scan representing the object, provided both scans sense at least in part the same surface of the object. Identifying common features within scans is often used as an additional constraint in localization and map construction because identifying common features may indicate the position of the robot 102 relative to environment objects seen at other locations along a route. The transform 1004 may indicate the displacement between the first scan of the object and the second scan of the object. Scan matching for identifying common features between multiple scans may not always yield a perfect solution such as the one illustrated in
As the robot 102 moves around its environment and executes various routes and/or tasks, a computer readable map 1104 is produced by the controller 118 of robot 102. Every subsequent iteration or execution of tasks/routes may generate a new version of the computer readable map. The new version may be based, at least in part, on prior iterations of the map in conjunction with new sensory data collected during the present task/route execution. Small changes in the environment, noise and imperfections in sensory instruments, calibration drift, and various map optimizations may cause the maps to differ slightly from each other, even for robots 102 navigating the same static environment.
If mapping is successful with minimal noise, calibration errors, or other discrepancies, the map 1104 produced by the robot 102 during the most recent task or route execution may be communicated to the remote system 1100 via communications units 116 (step 1102). The map 1104 is provided to a map analysis 1106 block which executes at least one of methods 700, 716, and 732 to provide various scoring metrics to the map 1104. If the map 1104 scores below a threshold, the map 1104 is considered accurate enough to utilize during subsequent navigations. Tighter (i.e., lower) thresholds may be employed if the map 1104 is used for additional purposes beyond the robot 102 merely navigating, such as accurately representing the environment to report performance metrics or interacting with small objects on the map. If the map 1104 passes (i.e., scores below) the threshold, the remote system 1100 may provide communications 1114 to the robot 102 indicating the map 1104 is validated with sufficient quality for navigation.
In some instances, however, the map 1104 produced by the robot 102 may contain large errors, thereby producing a combined score above the threshold required by the first map analysis 1106. Accordingly, the map 1104 is provided to a parameterization block 1108, which produces a plurality of versions of the map 1104. Each of the plurality of versions may be produced by performing different optimizations and/or considering different sensory data with varying degrees of weight. For example, the map 1104 produced by the robot 102 may serve as a first map version. The same sensory data used to produce the map 1104 may be processed again without, e.g., gyroscopic data to produce a second map version. A third map version may be produced by adding artificial noise to the data of one or more LiDAR sensors. A fourth version may be produced without considering loop closures, or with loop closure consideration if the first map was produced without loop closures being considered. A fifth version may include or exclude the use of feature matching during map construction, which comprises the process of identifying a feature or object at multiple times or positions and inferring a location of the robot 102 from the position of the known object (assuming the object is static). These and other parameterizations of the computer readable map are considered without limitation, wherein one may produce one or more map versions by individually adjusting, removing, or adding optimization parameters during map construction. Preferably, producing more map versions increases the likelihood that at least one of the maps is of sufficient quality for the robot 102 to utilize. Individually adding, removing, or adjusting parameters of the map construction may isolate the failing optimization or noisy component. It is appreciated that various parameterized versions of a map may be further dependent on the specific mapping operations and/or specific sensor units of the robot 102, wherein the use of the three (3) scoring metrics of methods 700, 716, and/or 732 to evaluate the post-construction result is not limited to any particular optimization methodology.
The plurality of map versions is provided again to a map analysis 1110 block which scores the plurality of map versions in accordance with methods 700, 716, and/or 732. In rare circumstances, all of the parameterized map versions may still fail the threshold score. In that case, the robot 102 is provided a communication 1116 indicating the map is not usable and should be re-trained or re-mapped. Alternatively, in some embodiments, prior maps which do pass the score threshold are utilized instead of the most recent map which does not pass the score threshold, wherein communication 1116 indicates to the robot 102 that it should use the prior versions. Such mapping failures typically arise from broken or faulty sensors and/or mechanical parts, wherein a robot 102 should be prevented from navigating with such faults and communications 1116 may cause the robot 102 to request maintenance before navigating again. If the robot 102 may operate both autonomously and manually, the controller 118 would only disable the autonomous functions. Depending on the risks of the environment and the degree to which the environment changes, a robot 102 operator may decide to revert to older map versions or preclude the robot 102 from operating as a safety precaution (e.g., a home cleaning robot likely does not pose a safety risk using an old map or single faulty sensor, as opposed to a robotic forklift in a warehouse).
Of the plurality of map versions, a subset of two or more may pass the scoring threshold. Accordingly, the remote system 1100 chooses the lowest score in block 1112 and provides the map with the corresponding lowest score to the robot 102 via communications 1118 for the robot 102 to utilize during navigation, reporting metrics, or other tasks.
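A minimal sketch of this selection flow is shown below, assuming a combined scoring helper score_map (standing in for methods 700, 716, and/or 732) where lower scores indicate better maps; the function name and return labels are illustrative assumptions only.

def select_map(uploaded_map, reparameterized_maps, score_map, threshold):
    """Return a (decision, map) pair mirroring communications 1114/1116/1118."""
    if score_map(uploaded_map) < threshold:
        return "validated", uploaded_map               # cf. communication 1114
    scored = [(score_map(m), m) for m in reparameterized_maps]
    passing = [(s, m) for s, m in scored if s < threshold]
    if not passing:
        return "remap_required", None                  # cf. communication 1116
    best_score, best_map = min(passing, key=lambda sm: sm[0])
    return "use_version", best_map                     # cf. communication 1118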
In performing the map optimizations on maps that fail validation at block 1106 and subsequently producing a validated map after block 1110, a training pair is created. The training pair includes the initial poor-quality map 1104 which fails block 1106 and the final one or more maps which pass block 1110. Returning briefly to
It will be recognized that while certain aspects of the disclosure are described in terms of a specific sequence of steps of a method, these descriptions are only illustrative of the broader methods of the disclosure and may be modified as required by the particular application. Certain steps may be rendered unnecessary or optional under certain circumstances. Additionally, certain steps or functionality may be added to the disclosed embodiments, or the order of performance of two or more steps permuted. All such variations are encompassed within the disclosure disclosed and claimed herein.
While the above detailed description has shown, described, and pointed out novel features of the disclosure as applied to various exemplary embodiments, it will be understood that various omissions, substitutions, and changes in the form and details of the device or process illustrated may be made by those skilled in the art without departing from the disclosure. The foregoing description is of the best mode presently contemplated of carrying out the disclosure. This description is in no way meant to be limiting, but rather should be taken as illustrative of the general principles of the disclosure. The scope of the disclosure should be determined with reference to the claims.
While the disclosure has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. The disclosure is not limited to the disclosed embodiments. Variations to the disclosed embodiments and/or implementations may be understood and effected by those skilled in the art in practicing the claimed disclosure, from a study of the drawings, the disclosure and the appended claims.
It should be noted that the use of particular terminology when describing certain features or aspects of the disclosure should not be taken to imply that the terminology is being re-defined herein to be restricted to include any specific characteristics of the features or aspects of the disclosure with which that terminology is associated. Terms and phrases used in this application, and variations thereof, especially in the appended claims, unless otherwise expressly stated, should be construed as open ended as opposed to limiting. As examples of the foregoing, the term “including” should be read to mean “including, without limitation,” “including but not limited to,” or the like; the term “comprising” as used herein is synonymous with “including,” “containing,” or “characterized by,” and is inclusive or open-ended and does not exclude additional, unrecited elements or method steps; the term “having” should be interpreted as “having at least;” the term “such as” should be interpreted as “such as, without limitation;” the term “includes” should be interpreted as “includes but is not limited to;” the term “example” is used to provide exemplary instances of the item in discussion, not an exhaustive or limiting list thereof, and should be interpreted as “example, but without limitation;” adjectives such as “known,” “normal,” “standard,” and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time, but instead should be read to encompass known, normal, or standard technologies that may be available or known now or at any time in the future; and use of terms like “preferably,” “preferred,” “desired,” or “desirable,” and words of similar meaning should not be understood as implying that certain features are critical, essential, or even important to the structure or function of the present disclosure, but instead as merely intended to highlight alternative or additional features that may or may not be utilized in a particular embodiment. Likewise, a group of items linked with the conjunction “and” should not be read as requiring that each and every one of those items be present in the grouping, but rather should be read as “and/or” unless expressly stated otherwise. Similarly, a group of items linked with the conjunction “or” should not be read as requiring mutual exclusivity among that group, but rather should be read as “and/or” unless expressly stated otherwise. The terms “about” or “approximate” and the like are synonymous and are used to indicate that the value modified by the term has an understood range associated with it, where the range may be ±20%, ±15%, ±10%, ±5%, or ±1%. The term “substantially” is used to indicate that a result (e.g., measurement value) is close to a targeted value, where close may mean, for example, the result is within 80% of the value, within 90% of the value, within 95% of the value, or within 99% of the value. Also, as used herein “defined” or “determined” may include “predefined” or “predetermined” and/or otherwise determined values, conditions, thresholds, measurements, and the like.
This application claims the benefit of U.S. Provisional Application No. 63/426,465, filed Nov. 18, 2022, which is incorporated herein by reference in its entirety.