One or more aspects of embodiments according to the present invention relate to robotic agents, and more particularly to a system and method for controlling a swarm of robotic agents.
In various situations, groups or “swarms” of mobile robotic agents may operate together, e.g., to accomplish a task as a group. In such a situation it may be impractical for one or more human operators to control the motion of each robotic agent individually. If the completion of a task involves more than rudimentary behavior, centralized control may have disadvantages, such as the potential for a communications bottleneck at the central controller, and pre-programmed behavior may be poorly suited for having the swarm move in response to information obtained in real time, e.g., by the robotic agents themselves.
Thus, there is a need for an improved system and/or method for controlling the motion of a swarm of robotic agents.
Aspects of embodiments of the present disclosure are directed toward a system and/or method for generating an artificial topography in a distributed array of robotic agents. Each robotic agent stores, and periodically updates, a parameter value or “A-value” in accordance with a process including, e.g., averaging neighboring A-values received from close neighbor robotic agents, biasing the A-value based on external commands or measured environmental parameters, and decreasing the A-value by a cooling rate factor. Averaging among neighboring robotic agents eventually results in a globally smoothed distribution of A-values. A local gradient may be estimated for the distribution of A-values, and the robotic agents may be programmed to move in the direction of the gradient, toward increasing A-values. This behavior may be employed to cause the robotic agents to follow a robotic agent with a fixed, relatively large, A-value, or, if the A-values are biased by features (e.g., gradients or steps) in environmental parameters, to converge on such features.
According to an embodiment of the present invention there is provided a method for controlling a plurality of robotic agents, the method including: storing, in each of the robotic agents, a respective first parameter value; sending, by a first robotic agent of the plurality of robotic agents, the respective first parameter value to each of a plurality of first close neighbor robotic agents of the plurality of robotic agents, each of the first close neighbor robotic agents having a distance, to the first robotic agent, less than a threshold distance; receiving, by the first robotic agent, a respective first parameter value from each of the first close neighbor robotic agents; calculating, by the first robotic agent, a new first parameter value, the calculating including calculating an average of: the first parameter value of the first robotic agent; and the received first parameter values; updating the first parameter value of the first robotic agent to equal the new first parameter value; calculating, by the first robotic agent, an estimated gradient of: the first parameter value of the first robotic agent; and the received first parameter values; calculating, by the first robotic agent, a resultant virtual force vector as a sum of one or more vector quantities including the estimated gradient; and generating, by the first robotic agent, a net thrust force, on the first robotic agent, parallel to the resultant virtual force vector.
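The update-and-move cycle described above can be sketched in code. The following is a minimal illustrative sketch, not the claimed method itself: the 2-D geometry, the inverse-square gradient estimator, and the function name `update_step` are assumptions made for the example.

```python
import math

def update_step(own_value, neighbor_values, neighbor_offsets):
    """One cycle of the averaging-and-thrust process (illustrative sketch).

    own_value        -- this agent's current first parameter value (A-value)
    neighbor_values  -- A-values received from close neighbor robotic agents
    neighbor_offsets -- (dx, dy) position of each close neighbor relative to
                        this agent

    Returns (new_value, thrust_direction): the updated A-value and a unit
    vector, parallel to the resultant virtual force vector, along which a
    net thrust force would be generated.
    """
    # Average the agent's own value with the received values.
    new_value = (own_value + sum(neighbor_values)) / (1 + len(neighbor_values))

    # Estimate the local gradient: weight each neighbor's offset direction by
    # the A-value difference (a crude finite-difference estimate; an agent
    # could instead use, e.g., a least-squares fit).
    gx = gy = 0.0
    for (dx, dy), v in zip(neighbor_offsets, neighbor_values):
        d = math.hypot(dx, dy)
        if d > 0:
            gx += (v - own_value) * dx / (d * d)
            gy += (v - own_value) * dy / (d * d)

    # Resultant virtual force: here only the estimated gradient; other vector
    # quantities (e.g., neighbor-spacing forces) could be summed in.
    mag = math.hypot(gx, gy)
    thrust = (gx / mag, gy / mag) if mag > 0 else (0.0, 0.0)
    return new_value, thrust
```

An agent with one close neighbor directly “up-gradient” would thus both raise its A-value toward the neighbor's and thrust toward it.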
In one embodiment, the method includes setting, by a second robotic agent different from the first robotic agent, the respective first parameter value of the second robotic agent to a value received from a central controller.
In one embodiment, the method includes moving, by the second robotic agent, in a direction along a path received by the second robotic agent from a central controller.
In one embodiment, each of the robotic agents is configured to measure a first value of an environmental parameter.
In one embodiment, the calculating of the new first parameter value further includes calculating a function of: the average of: the first parameter value of the first robotic agent; and the received first parameter values; and the first value of the environmental parameter.
In one embodiment, the method includes receiving, by the first robotic agent, a respective value of the environmental parameter from each of the first close neighbor robotic agents, wherein the calculating of the new first parameter value further includes calculating a function of: the average of: the first parameter value of the first robotic agent; and the received first parameter values; and the received values of the environmental parameter.
In one embodiment, the function is a weighted sum of: the average of: the first parameter value of the first robotic agent; and the received first parameter values; and a magnitude of an estimated gradient in: the first value of the environmental parameter; and the received values of the environmental parameter.
In one embodiment, the sending of the respective first parameter value includes transmitting the respective first parameter value by wireless communication, and the receiving of the respective first parameter values includes receiving the respective first parameter values by wireless communication.
In one embodiment, the threshold distance is less than or equal to a range of the wireless communication.
In one embodiment, the average is a weighted average.
In one embodiment, the average is a weighted average, each term of the weighted average being weighted in inverse proportion to a distance to a first close neighbor robotic agent, of the plurality of first close neighbor robotic agents, from which the received respective first parameter value was received.
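The inverse-distance weighting of this embodiment can be written out as a short sketch; giving the agent's own value unit weight is an assumption of the example, not something the embodiment specifies.

```python
def inverse_distance_average(own_value, neighbor_values, neighbor_distances):
    """Weighted average in which each received value is weighted in inverse
    proportion to the distance of the close neighbor that sent it.

    The agent's own value is given unit weight here (an illustrative choice).
    """
    weights = [1.0] + [1.0 / d for d in neighbor_distances]
    values = [own_value] + list(neighbor_values)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

A nearby neighbor thus pulls the average more strongly than a distant one.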
In one embodiment, the calculating of the new first parameter value further includes multiplying by a cooling rate factor, the cooling rate factor being a number greater than 0.01 and less than 0.99.
In one embodiment, the calculating of the new first parameter value further includes calculating, for a first close neighbor robotic agent, of the plurality of first close neighbor robotic agents: a distance between the first robotic agent and the first close neighbor robotic agent; a virtual force vector corresponding to the first close neighbor robotic agent, the virtual force vector having a magnitude that is a function of the distance between the first robotic agent and the first close neighbor robotic agent, and having a direction along a straight line connecting the first robotic agent and the first close neighbor robotic agent, the virtual force vector corresponding to: a repulsive force when the distance between the first robotic agent and the first close neighbor robotic agent is less than an equilibrium distance; and an attractive force when the distance between the first robotic agent and the first close neighbor robotic agent is greater than the equilibrium distance.
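The repulsive/attractive virtual force around an equilibrium distance can be sketched as follows. The linear (spring-like) force law, the equilibrium distance of 10, and the gain `k` are illustrative assumptions; the embodiment only requires the sign behavior on either side of the equilibrium distance.

```python
import math

def spacing_force(dx, dy, equilibrium=10.0, k=0.5):
    """Virtual force exerted on this agent by one close neighbor at relative
    position (dx, dy): repulsive inside the equilibrium distance, attractive
    outside it (illustrative linear force law).
    """
    d = math.hypot(dx, dy)
    if d == 0:
        return (0.0, 0.0)  # coincident agents: no defined direction
    # Positive magnitude pulls toward the neighbor (attractive);
    # negative pushes away (repulsive).
    magnitude = k * (d - equilibrium)
    return (magnitude * dx / d, magnitude * dy / d)
```

Summing such forces over all close neighbors tends to hold the swarm at a uniform spacing.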
According to an embodiment of the present invention there is provided a system including: a plurality of robotic agents, each of the robotic agents being configured to store a respective first parameter value, a first robotic agent of the plurality of robotic agents being further configured to: send its first parameter value to a second robotic agent of the plurality of robotic agents, the second robotic agent having a distance to the first robotic agent less than a threshold distance, receive a respective first parameter value from the second robotic agent; calculate a new first parameter value, the calculating of the new first parameter value including calculating an average of a plurality of values including the first parameter value of the first robotic agent and the received first parameter value; and update the first parameter value of the first robotic agent to equal the new first parameter value.
In one embodiment, the first robotic agent is further configured to: sense a first value of an environmental parameter; and receive a second value of the environmental parameter sensed by the second robotic agent; and calculate an estimated gradient in the environmental parameter based at least in part on the first value and the second value.
In one embodiment, the first robotic agent is configured to calculate the new first parameter value, based further on the estimated gradient.
In one embodiment, the average is a weighted average.
In one embodiment, a third robotic agent of the plurality of robotic agents is further configured to store a constant respective first parameter value.
In one embodiment, the third robotic agent is further configured to follow a prescribed path.
These and other features and advantages of the present invention will be appreciated and understood with reference to the specification, claims, and appended drawings wherein:
The detailed description set forth below in connection with the appended drawings is intended as a description of exemplary embodiments of a swarm autopilot provided in accordance with the present invention and is not intended to represent the only forms in which the present invention may be constructed or utilized. The description sets forth the features of the present invention in connection with the illustrated embodiments. It is to be understood, however, that the same or equivalent functions and structures may be accomplished by different embodiments that are also intended to be encompassed within the spirit and scope of the invention. As denoted elsewhere herein, like element numbers are intended to indicate like elements or features.
Related art methods for controlling a distributed group of robotic agents include methods that operate by local interaction rules and methods that develop coherent group behaviors. Controlling a distributed group may be challenging, however. Some control methods may use either centralized control or preprogrammed behavior scenarios, or may provide only rudimentary behaviors.
In some embodiments according to the present invention, an autopilot for distributed systems provides the benefits of having fully distributed robotic agents while assuring nimble adaptation of the group, or “swarm” of robotic agents. The autopilot monitors the state of the system and adjusts the control inputs to bring the system into alignment with a target. In some embodiments, the autopilot is just as distributed as the group of robotic agents it is controlling.
The autopilot may employ a parameter referred to herein as an “A-value”. Each robotic agent may have a respective A-value, and each robotic agent may have a system and method of recognizing this value in neighboring robotic agents, a set of response rules specifying how to respond (e.g., how to move) in response to the A-values of neighboring robotic agents, and a set of update rules for how to update the robotic agent's own A-value based on the A-values of neighboring robotic agents.
At a system level the array of A-values (the array including one A-value for each robotic agent in the swarm) may define an artificial topography. The response rules may relate to moving up or down gradients in this artificial topography. The update rules may relate to smoothing the topography to provide stable gradients and to gradually diffuse local information to the swarm through a distributed network formed of communications links between robotic agents and their close neighbors.
Although the set of A-values may form an array, the A-values may also be treated as sample points in an underlying distribution in a continuous space. A continuous distribution can be smooth, i.e., having well-defined derivatives, or non-smooth, e.g., comprising non-differentiable step functions. An underlying distribution that is smooth may facilitate a more stable response to changes in the A-values over time than a non-smooth distribution.
High-level autopilot commands or instructions may include (e.g., consist of) defining high points and low points in the topography; these high and low points may then create a coherent system-scale gradient map that may be used to guide the group.
Referring to
In some embodiments a swarm, such as the swarm illustrated in
Referring to
Each robotic agent may be in possession of (e.g., it may store in local memory) information about its position and environment, and information that it has received from neighbors within its communication range, referred to herein as “close” neighbors. The communication system may enable each robotic agent, when it receives information from a close neighbor, to identify (e.g., by a unique identifier identifying the sending robotic agent) the robotic agent from which the information was sent. In one embodiment, the robotic agent X0 maintains or “hosts” a neighbor list including the information of
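The neighbor list hosted by a robotic agent such as X0 might be represented as follows. The exact fields of the embodiment's table are not reproduced in this document chunk, so the fields below are representative assumptions only.

```python
from dataclasses import dataclass

@dataclass
class NeighborRecord:
    """One entry in an agent's close-neighbor list (representative fields;
    the actual table of the embodiment may differ)."""
    agent_id: str     # unique identifier of the sending robotic agent
    position: tuple   # most recently reported position (e.g., GPS fix)
    a_value: float    # most recently received A-value
    last_heard: float # timestamp of last message, e.g., for pruning stale entries

# A host agent X0 might then maintain a mapping {record.agent_id: record},
# updated whenever a message is received from a close neighbor.
```

Keyed by the sender's unique identifier, such a structure lets the agent associate each incoming A-value with a known relative position.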
Once relative positions are known, each robotic agent may use this information to behave according to any of a number of swarm algorithms. Referring to
In some embodiments the swarm forms a distributed sensor array. Each robotic agent may measure an environmental characteristic (e.g., temperature), and by comparing environmental measurements with those obtained by, e.g., close neighbors, a robotic agent may estimate, and respond to, variations or gradients in the environment that may be challenging for any single robotic agent to detect. For example, referring to
Following gradients in the environment is one way for a swarm to behave collectively. The decentralized nature of the group is not a problem in this case because the controlling inputs, e.g., temperature measurements, are distributed. In embodiments in which the intent is for the swarm to follow control inputs provided by a central controller (which may be in direct communication with only a subset of the robotic agents, e.g., with only one or a small number of the robotic agents), other methods may be employed.
In one embodiment of an autopilot, a “swarm-the-attractor” behavior involves no additional robotic agent capabilities over those described above. This autopilot may automate a task that requires constant monitoring and adjustment. In one embodiment, the swarm as a whole, regardless of size, groups around a single robotic agent, referred to herein as an “attractor”, which may be controlled by a human operator or may be sent on a programmed, autonomous mission. Such a task may ordinarily be challenging because the individualized objective for each member of the group (e.g., moving toward the attractor) is a different instruction for each robotic agent depending on its location. Also, in some embodiments, robotic agents that are distant from the attractor may have no direct knowledge of which robotic agent is the attractor or where the attractor is.
The A-value (an artificial number, as mentioned above) may be used to accomplish this task. The A-value may be passed between robotic agents just like any other parameter, and the respective A-values of the robotic agents of the swarm may be used as data points from which local gradients may be calculated. Such gradients may be used to guide the behavior of the swarm in predictable and controllable ways. In some embodiments an autopilot using the A-value has two principal elements: (i) a method for creating a stable gradient in A-values sloping up toward the attractor, and (ii) a method for each robotic agent to calculate this gradient based on measurements from its close neighbors and to respond with a motive force in the up-gradient direction.
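One way, among others, for a robotic agent to calculate a local gradient from its own A-value and those of its close neighbors is a least-squares plane fit. This sketch assumes 2-D positions; the embodiments do not prescribe a particular estimator.

```python
def estimate_gradient(own_pos, own_value, neighbor_pos, neighbor_values):
    """Least-squares estimate of the local A-value gradient.

    Fits a plane A(x, y) ~ a*x + b*y + c through this agent's sample and its
    close neighbors' samples, and returns (a, b): the up-gradient direction,
    scaled by steepness.
    """
    xs = [own_pos[0]] + [p[0] for p in neighbor_pos]
    ys = [own_pos[1]] + [p[1] for p in neighbor_pos]
    vs = [own_value] + list(neighbor_values)
    n = len(vs)
    # Center the data and solve the 2x2 normal equations.
    mx, my, mv = sum(xs) / n, sum(ys) / n, sum(vs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxv = sum((x - mx) * (v - mv) for x, v in zip(xs, vs))
    syv = sum((y - my) * (v - mv) for y, v in zip(ys, vs))
    det = sxx * syy - sxy * sxy
    if det == 0:
        return (0.0, 0.0)  # samples are collinear: gradient underdetermined
    a = (sxv * syy - syv * sxy) / det
    b = (syv * sxx - sxv * sxy) / det
    return (a, b)
```

Moving parallel to the returned vector moves the agent toward increasing A-values.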
The stable gradient in A-values may be created through a simulated diffusion process, which may be analogous to the diffusion of heat from a local source out through a cooling medium. A-values may diffuse through the group as a result of each robotic agent's repeatedly and continuously sending its A-value to its close neighbors, and updating its A-value to be the average of its close neighbors' A-values. This process may result in the A-values within the entire swarm converging to respective stable “equilibrium” values.
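The diffusion process can be simulated on a small one-dimensional chain of agents. The chain topology, the fixed attractor value, and the inclusion of the cooling rate factor (described elsewhere herein) are illustrative assumptions; the cooling factor is what causes the equilibrium A-values to slope down with distance from the attractor rather than flattening out.

```python
def diffuse(values, fixed, k=0.9, iterations=200):
    """Simulate A-value diffusion on a 1-D chain of agents.

    values -- initial A-values, one per agent
    fixed  -- dict {index: value} of agents holding a constant A-value
              (e.g., an attractor at one end of the chain)
    k      -- cooling rate factor, 0 < k < 1 (illustrative)

    Each iteration, every non-fixed agent replaces its A-value with k times
    the average of its own value and its close (adjacent) neighbors' values.
    """
    values = list(values)
    for _ in range(iterations):
        nxt = []
        for i, v in enumerate(values):
            if i in fixed:
                nxt.append(fixed[i])
                continue
            neigh = [values[j] for j in (i - 1, i + 1) if 0 <= j < len(values)]
            nxt.append(k * (v + sum(neigh)) / (1 + len(neigh)))
        values = nxt
    return values

# With an attractor (constant high A-value) at index 0, the chain settles
# into a stable gradient sloping up toward the attractor.
profile = diffuse([0.0] * 5, fixed={0: 10.0})
```

After enough iterations the profile is monotone, which is the stable “equilibrium” gradient the up-gradient response rules can then follow.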
In some embodiments, biasing of one or more A-values may be used to influence the behavior of the swarm. For example, referring to
If the algorithm for calculating the resultant virtual force vector is suitably modified, the entire swarm may be caused to “migrate” or move as a unit. For example, instead of being simply a vector sum of virtual forces from close neighbors, as in the embodiment of
Although the two operations, averaging and biasing, may take place locally, this system and method may rapidly form a consistent, reliable topography of A-values across the whole swarm. Any robotic agent may then measure this topography (e.g., using the method described in the context of
The A-value may also be used in embodiments that relate to incorporating environmental readings. In one embodiment, referred to herein as a “search and gather” process, the robotic agents first spread out to detect an object of interest (e.g., an airplane's “black box”), and then gather around the target object. For example, referring to
When a robotic agent has found a target, it may take on a set or pre-determined high A-value and become an attractor 710, as illustrated in
Using similar processes, a swarm may be used to map a gradient, e.g., in an environmental parameter. The temperature, for example, may be such an environmental parameter and is shown in
In another embodiment, a swarm may be configured to map a discrete boundary, such as the edge of an oil slick, as illustrated in
In this manner, by biasing A-values based on the local variation in environmental readings, rather than on the absolute value of those readings, the autopilot can drive the swarm to spontaneously migrate toward, and spread out along, important structures in the environment.
In these examples, the biasing of the A-value according to a gradient may be performed, for example, by executing, in each robotic agent, the following algorithm for updating the A-value:
Ai,n = k Āi,n-1 + Gi,n-1

where Ai,n is the new A-value (i.e., the A-value at time step n) of the ith robotic agent, k is the cooling rate factor, Āi,n-1 is the average of the A-values, at the previous time step, of all of the robotic agents within communication range of the ith robotic agent, and Gi,n-1 is the magnitude of the gradient of environmental readings at the ith robotic agent at the previous time step. Once the A-values have been determined, each robotic agent may estimate a local gradient of the A-value (again, using a method analogous to that described in the context of
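A direct transcription of this update rule into code might look as follows; the default cooling-rate value is an illustrative assumption, and the environmental-gradient magnitude is taken as an input rather than computed here.

```python
def update_a_value(neighbor_a_values, env_gradient_magnitude, k=0.9):
    """One biased A-value update: A[i,n] = k * Abar[i,n-1] + G[i,n-1].

    neighbor_a_values      -- previous-step A-values of the robotic agents
                              within communication range of this agent
    env_gradient_magnitude -- magnitude of the gradient of environmental
                              readings at this agent at the previous step (G)
    k                      -- cooling rate factor, 0 < k < 1
    """
    a_bar = sum(neighbor_a_values) / len(neighbor_a_values)
    return k * a_bar + env_gradient_magnitude
```

Agents sitting on strong environmental gradients thereby accumulate high A-values, biasing the topography so the swarm converges on such features.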
In some embodiments, the swarm autopilot is a concept that bridges the gap between a unified goal and a distributed system. It is a system and/or method of creating a consistent, unified, useful, predictable response from the whole swarm, despite the lack of unified computation. The diffusion of A-values creates a smooth global information structure that individual robotic agents may access locally to determine their respective course of action.
A swarm autopilot according to embodiments of the present invention may be useful in situations where the system is too complex (e.g., there are too many robotic agents) or too remote for centralized control (e.g., when it is infeasible to communicate directly with all robotic agents). A swarm autopilot according to embodiments of the present invention may be implemented in a set of robotic agents when each robotic agent is capable of (i) sensing the spatial location of close neighbors, (ii) discerning the A-values of those neighbors (iii) updating its own A-value and (iv) moving in alignment with the local gradient of A-values.
The swarm autopilot has been described above for the example of ocean surface vehicles using radio communications to share GPS information and A-values, but the invention is not limited thereto, and the method lends itself to a range of other distributed embodiments. For example, a submerged three-dimensional (3D) swarm of autonomous underwater vehicles may be formed by vehicles capable of operating underwater. All the methods described above for two-dimensional (2D) swarming may also be employed in the context of 3D swarming, including swarm-the-attractor, search-and-gather and edge-tracking. As such, an autopilot according to embodiments of the present invention may be useful for submerged swarms.
Communication may be challenging in underwater settings because the radio frequency and microwave electromagnetic waves that, on the surface, may carry satellite GPS signals and communications between the robotic agents may be rapidly attenuated under water. Acoustic signaling or signaling using light may be used instead. For example, each robotic agent may transmit its A-value by making sound at a specific pitch (or by emitting light at a specific frequency or wavelength, or pulsed at a specific rate). A neighboring robotic agent may then receive the A-value as encoded in this pitch, and may use the strength and direction of the incoming signal as proxies for (i.e., to infer) range and bearing. If each robotic agent also has the ability to calculate the average of the incoming pitches and to adjust its own pitch correspondingly, the swarm of robotic agents may be able to embody a distributed autopilot according to embodiments of the present invention.
In some such embodiments it is unnecessary for a robotic agent to perform sophisticated mathematical calculations. A more direct, reflexive response may be sufficient to implement an autopilot; for example, each robotic agent may be configured to be slightly more attracted toward higher-pitched sounds (higher A-values), to be slightly more repelled from lower-pitched sounds (lower A-values), and to adjust its pitch toward the average pitch around it. In this way a robotic agent may tend to move in the up-gradient direction without actually having to calculate that gradient.
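The reflexive behavior can be sketched as a single step; the gain and rate constants, and the 2-D bearing representation, are illustrative assumptions.

```python
def reflexive_response(own_pitch, neighbor_pitches_bearings, gain=0.1, rate=0.5):
    """One reflexive acoustic-autopilot step (illustrative sketch).

    neighbor_pitches_bearings -- non-empty list of (pitch, unit_bearing)
    pairs, where pitch encodes the neighbor's A-value and unit_bearing is
    the 2-D unit vector toward that neighbor, inferred from the incoming
    signal's direction.

    Returns (new_pitch, (fx, fy)): the agent drifts toward higher-pitched
    (higher A-value) neighbors and away from lower-pitched ones, and nudges
    its own pitch toward the local average -- moving up-gradient without
    ever computing a gradient explicitly.
    """
    fx = fy = 0.0
    for pitch, (bx, by) in neighbor_pitches_bearings:
        # Attraction proportional to how much higher the neighbor's pitch is;
        # negative differences repel.
        w = gain * (pitch - own_pitch)
        fx += w * bx
        fy += w * by
    avg = sum(p for p, _ in neighbor_pitches_bearings) / len(neighbor_pitches_bearings)
    new_pitch = own_pitch + rate * (avg - own_pitch)
    return new_pitch, (fx, fy)
```

With one higher-pitched neighbor on each side, the net drift points toward the higher pitch while the agent's own pitch relaxes toward the local average.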
When alternate sensing or communication techniques like acoustics and/or light are employed it may be possible to use the autopilot with extremely small robotic agents, potentially small enough to be used inside a living body, such as a nano-scale vehicle capable of mobility, sensing, signaling and processing. If designed with the appropriate capabilities (e.g., those described above, including sensing range, bearing and parameter value of close neighbors (either directly or by proxy), determining the parameter gradient, being able to follow that gradient, and providing an average parameter value to the close neighbors) it may be possible to use the autopilot to guide large groups using any of the control methods described above.
Chemical signaling is common in biological systems and may be incorporated in a nano-robotic system to make the autopilot work. Sensed variations in the composition of emitted chemicals may be taken into account when calculating A-values. If robotic agents are able to directly sense local gradients in chemical concentrations, they may be configured to move towards neighbors signaling higher A-values.
The swarm autopilot concept may be particularly useful in a microscopic environment where, due to size, numbers and the in situ application, control of individual robotic agents may be infeasible. In such a situation, the swarm may still be controlled through the use of attractors, and external equipment such as magnetic resonance imaging (MRI) may monitor the behavior of the group.
The creation of a smooth, sloping topography of A-values may be useful in situations beyond the control of a swarm of robotic agents. For example, a distributed sensor array may be deployed to monitor a large area for an anomalous signal (in a configuration analogous to that of
In light of the above, in various embodiments an autopilot employing an artificial parameter (the A-value) may be simply configured to guide various behaviors, such as maintaining the separation between robotic agents, and causing the swarm to move, e.g., to follow an attractor, or to gather on a feature, such as a region having a high temperature gradient.
It will be understood that, although the terms “first”, “second”, “third”, etc., may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first element, component, region, layer or section discussed below could be termed a second element, component, region, layer or section, without departing from the spirit and scope of the inventive concept.
Spatially relative terms, such as “beneath”, “below”, “lower”, “under”, “above”, “upper” and the like, may be used herein for ease of description to describe one element or feature's relationship to another element(s) or feature(s) as illustrated in the figures. It will be understood that such spatially relative terms are intended to encompass different orientations of the device in use or in operation, in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as “below” or “beneath” or “under” other elements or features would then be oriented “above” the other elements or features. Thus, the example terms “below” and “under” can encompass both an orientation of above and below. The device may be otherwise oriented (e.g., rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein should be interpreted accordingly. In addition, it will also be understood that when a layer is referred to as being “between” two layers, it can be the only layer between the two layers, or one or more intervening layers may also be present.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the inventive concept. As used herein, the terms “substantially,” “about,” and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art. As used herein, the term “major component” means a component constituting at least half, by weight, of a composition, and the term “major portion”, when applied to a plurality of items, means at least half of the items.
As used herein, the singular forms “a” and “an” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising”, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list. Further, the use of “may” when describing embodiments of the inventive concept refers to “one or more embodiments of the present invention”. Also, the term “exemplary” is intended to refer to an example or illustration. As used herein, the terms “use,” “using,” and “used” may be considered synonymous with the terms “utilize,” “utilizing,” and “utilized,” respectively.
It will be understood that when an element or layer is referred to as being “on”, “connected to”, “coupled to”, or “adjacent to” another element or layer, it may be directly on, connected to, coupled to, or adjacent to the other element or layer, or one or more intervening elements or layers may be present. In contrast, when an element or layer is referred to as being “directly on”, “directly connected to”, “directly coupled to”, or “immediately adjacent to” another element or layer, there are no intervening elements or layers present.
Any numerical range recited herein is intended to include all sub-ranges of the same numerical precision subsumed within the recited range. For example, a range of “1.0 to 10.0” is intended to include all subranges between (and including) the recited minimum value of 1.0 and the recited maximum value of 10.0, that is, having a minimum value equal to or greater than 1.0 and a maximum value equal to or less than 10.0, such as, for example, 2.4 to 7.6. Any maximum numerical limitation recited herein is intended to include all lower numerical limitations subsumed therein and any minimum numerical limitation recited in this specification is intended to include all higher numerical limitations subsumed therein.
Although exemplary embodiments of a swarm autopilot have been specifically described and illustrated herein, many modifications and variations will be apparent to those skilled in the art. Accordingly, it is to be understood that a swarm autopilot constructed according to principles of this invention may be embodied other than as specifically described herein. The invention is also defined in the following claims, and equivalents thereof.
The present application claims priority to and the benefit of U.S. Provisional Application No. 62/216,166, filed Sep. 9, 2015, entitled “SWARM AUTOPILOT”, the entire content of which is incorporated herein by reference.