Robots are often equipped with various types of machine learning models that are trained to perform various tasks and/or to enable the robots to engage with dynamic environments. These models are sometimes trained by causing real-world physical robots to repeatedly perform tasks, with outcomes of the repeated tasks being used as training examples to tune the models. However, extremely large numbers of repetitions may be required in order to sufficiently train a machine learning model to perform tasks in a satisfactory manner.
The time and costs associated with training machine learning models through real-world operation of physical robots may be reduced and/or avoided by simulating robot operation in simulated (or “virtual”) environments. For example, a three-dimensional (3D) virtual environment may be simulated with various objects to be acted upon by a robot. The robot itself may also be simulated in the virtual environment, and the simulated robot may be operated to perform various tasks on the simulated objects. The machine learning model(s) can be trained based on outcomes of these simulated tasks. However, a large number of recorded “training episodes”—instances where a simulated robot interacts with a simulated object—may need to be generated in order to sufficiently train a machine learning model such as a reinforcement machine learning model. Much of the computational cost of generating these training episodes lies in operating a robot controller, whether it be a real-world robot controller (e.g., integral with a real-world robot or operating outside of a robot) or a robot controller that is simulated outside of the virtual environment.
Implementations are described herein for controlling a plurality of simulated robots in a virtual environment using a single robot controller. More particularly, but not exclusively, implementations are described herein for controlling the plurality of simulated robots based on common/shared joint commands received from the single robot controller to interact with multiple instances of an interactive object that are simulated in the virtual environment with a distribution of distinct physical characteristics, such as a distribution of distinct poses. Causing the plurality of simulated robots to operate on corresponding instances of the same interactive object in disjoint world states—e.g., each instance having a slightly different pose or other varied physical characteristic—accelerates the process of creating training episodes. These techniques also provide an efficient way to ascertain measures of tolerance of robot joints (e.g., grippers) and sensors.
In various implementations, the robot controller may generate and issue a set of joint commands based on the state of the robot and/or the state of the virtual environment. The state of the virtual environment may be ascertained via data generated by one or more virtual sensors based on their observations of the virtual environment. In fact, it may be the case that the robot controller is unable to distinguish between operating in the real world and operating in a simulated environment. In some implementations, the state of the virtual environment may correspond to an instance of the interactive object being observed in a “baseline” pose. Sensor data capturing this pose may be what is provided to the robot controller in order for the robot controller to generate the set of joint commands for interacting with the interactive object.
However, in addition to the instance of the interactive object in the baseline pose, a plurality of additional instances of the interactive object may be rendered in the virtual environment as well. A pose of each instance of the interactive object may be altered (e.g., rotated, translated, etc.) relative to the poses of the other instances of the interactive object, including the baseline pose. Each of the plurality of simulated robots may then attempt to interact with a respective instance of the interactive object. As mentioned previously, each of the plurality of simulated robots receives the same set of joint commands, also referred to herein as a “common” set of joint commands, that is generated based on the baseline pose of the interactive object. Consequently, each of the plurality of simulated robots operates its joint(s) in the same way to interact with its respective instance of the interactive object.
However, each instance of the interactive object (other than the baseline instance) has a pose that is distinct from the poses of the other instances. Consequently, the outcome of these operations may vary depending on the tolerance of the simulated robot (and hence, of the real-world robot it simulates) to deviations of the interactive object from what was sensed. Put another way, by holding constant the set of joint commands issued across the plurality of simulated robots, while varying the pose of a respective instance of the interactive object for each simulated robot, it can be determined how much tolerance the simulated robot has for deviations of interactive objects from their expected/observed poses.
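By way of non-limiting illustration, the following Python sketch demonstrates this general pattern of holding a common set of joint commands constant while varying object poses across simulated robots. The `sim` and `controller` interfaces (e.g., `spawn_object_instance`, `render_sensor_data`, `plan`, `execute`, `interaction_succeeded`) are hypothetical placeholders, not part of any particular simulation framework.

```python
# Hypothetical sketch: one common set of joint commands, many perturbed object instances.
import random

def run_common_command_sweep(sim, controller, num_robots=16, max_offset_m=0.002):
    """Apply one set of joint commands, planned from the baseline pose, to every
    simulated robot, each acting on an object instance with a perturbed pose."""
    baseline = sim.spawn_object_instance(offset_m=0.0)        # baseline instance
    sensor_data = sim.render_sensor_data(baseline)            # what the controller "sees"
    joint_commands = controller.plan(sensor_data)             # generated once, from the baseline

    outcomes = []
    for i in range(num_robots):
        offset = random.uniform(-max_offset_m, max_offset_m)  # distinct pose per instance
        instance = sim.spawn_object_instance(offset_m=offset)
        robot = sim.spawn_robot(i)
        robot.execute(joint_commands)                         # same commands for every robot
        outcomes.append((offset, sim.interaction_succeeded(robot, instance)))
    return outcomes                                           # (pose offset, success) pairs
```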
In various implementations, various parameters associated with the robot controller may be altered based on outcomes of the same set of joint commands being used to interact with the multiple instances of the interactive object in distinct poses. For example, a machine learning model such as a reinforcement learning policy may be trained based on success or failure of each simulated robot.
In some implementations, the outcomes may be analyzed to ascertain inherent tolerances of component(s) of the robot controller and/or the real-world robot it represents. For example, it may be observed that the robot is able to successfully interact with instances of the interactive object with poses that are within some translational and/or rotational tolerance of the baseline. Outside of those tolerances, the simulated robot may fail.
These tolerances may be subsequently associated with components of the robot controller and/or the real-world robot controlled by the robot controller. For example, the observed tolerance of a particular configuration of a simulated robot arm having a particular type of simulated gripper may be attributed to the real-world equivalents. Alternatively, the tolerances may be taken into account when selecting sensors for the real-world robot. For instance, if the simulated robot is able to successfully operate on instances of the interactive object having poses within 0.5 millimeters of the baseline pose, then sensors that are accurate within 0.5 millimeters may suffice for real-world operation of the robot.
In some implementations, a computer implemented method may be provided that includes: simulating a three-dimensional (3D) environment, wherein the simulated 3D environment includes a plurality of simulated robots controlled by a single robot controller; rendering multiple instances of an interactive object in the simulated 3D environment, wherein each instance of the interactive object has a simulated physical characteristic that is unique among the multiple instances of the interactive object; and receiving, from the robot controller, a common set of joint commands to be issued to each of the plurality of simulated robots, wherein for each simulated robot of the plurality of simulated robots, the common set of joint commands causes actuation of one or more joints of the simulated robot to interact with a respective instance of the interactive object in the simulated 3D environment.
In various implementations, the robot controller may be integral with a real-world robot that is operably coupled with the one or more processors. In various implementations, the common set of joint commands received from the robot controller may be intercepted from a joint command channel between one or more processors of the robot controller and one or more joints of the real-world robot.
In various implementations, the simulated physical characteristic may be a pose, and the rendering may include: selecting a baseline pose of one of the multiple instances of the interactive object; and for each of the other instances of the interactive object, altering the baseline pose to yield the unique pose for the instance of the interactive object.
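A minimal sketch of this baseline-plus-alteration approach is shown below; the `Pose` structure and the jitter magnitudes are illustrative assumptions only.

```python
# Illustrative sketch: derive unique poses by slightly altering a baseline pose.
import math
import random
from dataclasses import dataclass

@dataclass
class Pose:
    x: float    # meters
    y: float    # meters
    yaw: float  # radians

def perturbed_poses(baseline: Pose, count: int,
                    max_translation_m: float = 0.001,
                    max_rotation_rad: float = math.radians(2)) -> list:
    """Return `count` poses: the baseline plus small random translations/rotations."""
    poses = [baseline]  # the first instance keeps the baseline pose
    for _ in range(count - 1):
        poses.append(Pose(
            x=baseline.x + random.uniform(-max_translation_m, max_translation_m),
            y=baseline.y + random.uniform(-max_translation_m, max_translation_m),
            yaw=baseline.yaw + random.uniform(-max_rotation_rad, max_rotation_rad),
        ))
    return poses
```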
In various implementations, the simulated physical characteristic may be a pose, and the method may further include providing sensor data to the robot controller. The sensor data may capture the one of the multiple instances of the interactive object in the baseline pose. The robot controller may generate the common set of joint commands based on the sensor data.
In various implementations, the method may include: determining outcomes of the interactions between the plurality of simulated robots and the multiple instances of the interactive object; and based on the outcomes, adjusting one or more parameters associated with operation of one or more components of a real-world robot. In various implementations, adjusting one or more parameters may include training a machine learning model based on the outcomes. In various implementations, the machine learning model may take the form of a reinforcement learning policy.
Other implementations may include a non-transitory computer readable storage medium storing instructions executable by a processor to perform a method such as one or more of the methods described above. Yet another implementation may include a control system including memory and one or more processors operable to execute instructions, stored in the memory, to implement one or more modules or engines that, alone or collectively, perform a method such as one or more of the methods described above.
It should be appreciated that all combinations of the foregoing concepts and additional concepts described in greater detail herein are contemplated as being part of the subject matter disclosed herein. For example, all combinations of claimed subject matter appearing at the end of this disclosure are contemplated as being part of the subject matter disclosed herein.
A robot 100 may be in communication with simulation system 130. Robot 100 may take various forms, including but not limited to a telepresence robot (e.g., which may be as simple as a wheeled vehicle equipped with a display and a camera), a robot arm, a humanoid, an animal, an insect, an aquatic creature, a wheeled device, a submersible vehicle, an unmanned aerial vehicle (“UAV”), and so forth. One non-limiting example of a robot arm is depicted in the accompanying figures.
In some implementations, logic 102 may be operably coupled with one or more joints 1041-n, one or more end effectors 106, and/or one or more sensors 1081-m, e.g., via one or more buses 110. As used herein, “joint” 104 of a robot may broadly refer to actuators, motors (e.g., servo motors), shafts, gear trains, pumps (e.g., air or liquid), pistons, drives, propellers, flaps, rotors, or other components that may create and/or undergo propulsion, rotation, and/or motion. Some joints 104 may be independently controllable, although this is not required. In some instances, the more joints robot 100 has, the more degrees of freedom of movement it may have.
As used herein, “end effector” 106 may refer to a variety of tools that may be operated by robot 100 in order to accomplish various tasks. For example, some robots may be equipped with an end effector 106 that takes the form of a claw with two opposing “fingers” or “digits.” Such a claw is one type of “gripper” known as an “impactive” gripper. Other types of grippers may include but are not limited to “ingressive” (e.g., physically penetrating an object using pins, needles, etc.), “astrictive” (e.g., using suction or vacuum to pick up an object), or “contigutive” (e.g., using surface tension, freezing, or adhesive to pick up an object). More generally, other types of end effectors may include but are not limited to drills, brushes, force-torque sensors, cutting tools, deburring tools, welding torches, containers, trays, and so forth. In some implementations, end effector 106 may be removable, and various types of modular end effectors may be installed onto robot 100, depending on the circumstances. Some robots, such as some telepresence robots, may not be equipped with end effectors. Instead, some telepresence robots may include displays to render visual representations of the users controlling the telepresence robots, as well as speakers and/or microphones that facilitate the telepresence robot “acting” like the user.
Sensors 108 may take various forms, including but not limited to 3D laser scanners or other 3D vision sensors (e.g., stereographic cameras used to perform stereo visual odometry) configured to provide depth measurements, two-dimensional cameras (e.g., RGB, infrared), light sensors (e.g., passive infrared), force sensors, pressure sensors, pressure wave sensors (e.g., microphones), proximity sensors (also referred to as “distance sensors”), depth sensors, torque sensors, barcode readers, radio frequency identification (“RFID”) readers, radars, range finders, accelerometers, gyroscopes, compasses, position coordinate sensors (e.g., global positioning system, or “GPS”), speedometers, edge detectors, and so forth. While sensors 1081-m are depicted as being integral with robot 100, this is not meant to be limiting.
Simulation system 130 may include one or more computing systems connected by one or more networks (not depicted). An example of such a computing system is depicted schematically herein as computer system 610, described below.
Various modules or engines may be implemented as part of simulation system 130 as software, hardware, or any combination of the two. For example, simulation system 130 may include a simulation engine 136 and a graph engine 138, described below.
Simulation engine 136 may be configured to perform selected aspects of the present disclosure to simulate a virtual environment in which the aforementioned robot avatars can be operated. For example, simulation engine 136 may be configured to simulate a three-dimensional (3D) environment that includes an interactive object. The virtual environment may include a plurality of robot avatars that are controlled by a robot controller (e.g., 102 and 103 of robot 100 in combination) that is external from the virtual environment. Note that the virtual environment need not be rendered visually on a display. In many cases, the virtual environment and the operations of robot avatars within it may be simulated without any visual representation being provided on a display as output.
Simulation engine 136 may be further configured to provide, to the robot controller that controls multiple robot avatars in the virtual environment, sensor data that is generated from a perspective of at least one of the robot avatars that is controlled by the robot controller. As an example, suppose a particular robot avatar's vision sensor is pointed in a direction of a particular virtual object in the virtual environment. Simulation engine 136 may generate and/or provide, to the robot controller that controls that robot avatar, simulated vision sensor data that depicts the particular virtual object as it would appear from the perspective of the particular robot avatar (and more particularly, its vision sensor) in the virtual environment.
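As a simplified illustration of generating vision data from an avatar's perspective, the sketch below projects a point in the virtual environment into the pixel coordinates of a simulated camera. The pinhole-camera model and intrinsic parameters are assumptions for illustration; an actual simulation engine may use a full renderer.

```python
# Toy sketch: project a world point into a simulated camera attached to a robot avatar.
import numpy as np

def project_point(point_world, cam_position, cam_rotation,
                  fx=600.0, fy=600.0, cx=320.0, cy=240.0):
    """Pinhole projection; cam_rotation is a 3x3 world-to-camera rotation matrix."""
    p_cam = cam_rotation @ (np.asarray(point_world, dtype=float)
                            - np.asarray(cam_position, dtype=float))
    if p_cam[2] <= 0:
        return None                        # the point is behind the simulated camera
    u = fx * p_cam[0] / p_cam[2] + cx      # horizontal pixel coordinate
    v = fy * p_cam[1] / p_cam[2] + cy      # vertical pixel coordinate
    return (u, v)

# Example: an object half a meter in front of the avatar's (identity-oriented) camera.
pixel = project_point([0.0, 0.0, 0.5], [0.0, 0.0, 0.0], np.eye(3))
```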
Simulation engine 136 may also be configured to receive, from the robot controller that controls multiple robot avatars in the virtual environment, a shared or common set of joint commands that cause actuation of one or more joints of each of the multiple robot avatars that is controlled by the robot controller. For example, the external robot controller may process the sensor data received from simulation engine 136 to make various determinations, such as recognizing an object and/or its pose (perception), and/or planning a path to the object and/or a grasp to be used to interact with the object. The external robot controller may make these determinations and may generate (execution) joint commands for one or more joints of a robot associated with the robot controller.
In the context of the virtual environment simulated by simulation engine 136, this common set of joint commands may be used, e.g., by simulation engine 136, to actuate joint(s) of the multiple robot avatars that are controlled by the external robot controller. Given that the common set of joint commands is provided to each of the robot avatars, it follows that each robot avatar may actuate its joints in the same way. Put another way, the joint commands are held constant across the multiple robot avatars.
In order to generate training episodes that can be used, for instance, to train a reinforcement learning machine learning model, variance may be introduced across the plurality of robot avatars by varying poses of instances of an interactive object being acted upon by the plurality of robot avatars. For example, one “baseline” instance of the interactive object may be rendered in the virtual environment in a “baseline” pose. Multiple other instances of the interactive object may likewise be rendered in the virtual environment, one for each robot avatar. Each instance of the interactive object may be rendered with a simulated physical characteristic, such as a pose, mass, etc., that is unique amongst the multiple instances of the interactive object.
Consequently, even though each robot avatar may actuate its joints in the same way, in response to the common set of joint commands, the outcome of each robot avatar's actuation may vary depending on a respective simulated physical characteristic of the instance of the interactive object the robot avatar acts upon. Simulated physical characteristics of interactive object instances may be varied from each other in various ways. For example, poses may be varied via translation, rotation (along any axis), and/or repositioning of components that are repositionable. Other physical characteristics, such as size, mass, surface texture, etc., may be altered in other ways, such as via expansion (growth) or contraction. By introducing slight variances between simulated physical characteristics (e.g., poses) of instances of interactive objects, it is possible to ascertain tolerance(s) of components of the robot, such as one or more sensors 108 and/or one or more joints 104.
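The sketch below illustrates varying non-pose characteristics across instances; the particular fields (scale, mass, and a friction coefficient standing in for surface texture) and the jitter range are illustrative assumptions.

```python
# Illustrative sketch: vary size, mass, and surface properties across object instances.
import random
from dataclasses import dataclass

@dataclass
class ObjectInstance:
    scale: float     # multiplicative size factor (1.0 = baseline)
    mass_kg: float
    friction: float  # stands in for surface texture in this toy example

def varied_instances(baseline: ObjectInstance, count: int, jitter: float = 0.05) -> list:
    """Expand/contract and re-weight each instance slightly relative to the baseline."""
    instances = [baseline]
    for _ in range(count - 1):
        instances.append(ObjectInstance(
            scale=baseline.scale * (1 + random.uniform(-jitter, jitter)),
            mass_kg=baseline.mass_kg * (1 + random.uniform(-jitter, jitter)),
            friction=baseline.friction * (1 + random.uniform(-jitter, jitter)),
        ))
    return instances
```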
Robot avatars and/or components related thereto may be generated and/or organized for use by simulation engine 136 in various ways. In some implementations, a graph engine 138 may be configured to represent robot avatars and/or their constituent components, and in some cases, other environmental factors, as nodes/edges of graphs. In some implementations, graph engine 138 may generate these graphs as acyclic directed graphs. In some cases these acyclic directed graphs may take the form of dependency graphs that define dependencies between various robot components. An example of such a graph is described below as graph 400.
Representing robot avatars and other components as acyclic directed dependency graphs may provide a variety of technical benefits. One benefit is that robot avatars may in effect become portable in that their graphs can be transitioned from one virtual environment to another. As one non-limiting example, different rooms/areas of a building may be represented by distinct virtual environments. When a robot avatar “leaves” a first virtual environment corresponding to a first room of the building, e.g., by opening and entering a doorway to a second room, the robot avatar's graph may be transferred from the first virtual environment to a second virtual environment corresponding to the second room. In some such implementations, the graph may be updated to include nodes corresponding to environmental conditions and/or factors associated with the second room that may not be present in the first room (e.g., different temperatures, humidity, particulates in the area, etc.).
Another benefit is that components of robot avatars can be easily swapped out and/or reconfigured, e.g., for testing and/or training purposes. For example, to test two different light detection and ranging (“LIDAR”) sensors on a real-world physical robot, it may be necessary to acquire the two LIDAR sensors, physically swap them out, update the robot's configuration/firmware, and/or perform various other tasks to sufficiently test the two different sensors. By contrast, using the graphs and the virtual environment techniques described herein, a LIDAR node of the robot avatar's graph that represents the first LIDAR sensor can simply be replaced with a node representing the second LIDAR sensor.
Yet another benefit of using graphs as described herein is that outside influences on operation of real life robots may be represented as nodes and/or edges of the graph that can correspondingly influence operation of robot avatars in the virtual environment. In some implementations, one or more nodes of a directed acyclic graph may represent a simulated environmental condition of the virtual environment. These environmental condition nodes may be connected to sensor nodes so that the environmental condition nodes can project their environmental influence onto the sensors corresponding to the connected sensor nodes. The sensor nodes in turn may detect this environmental influence and provide sensor data indicative thereof to higher nodes of the graph.
As one non-limiting example, a node coupled to (and therefore configured to influence) a vision sensor may represent particulate, smoke, or other visual obstructions that are present in an area. As another example, a node configured to simulate realistic cross wind patterns may be coupled to a wind sensor node of an unmanned aerial vehicle (“UAV”) avatar to simulate cross winds that might influence flight of a real-world UAV. Additionally, in some implementations, a node coupled to a sensor node may represent a simulated condition of that sensor of the robot avatar. For example, a node connected to a vision sensor may simulate dirt and/or debris that has collected on a lens of the vision sensor, e.g., using Gaussian blur or other similar blurring techniques.
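A rough sketch of such a graph is shown below, in which an environmental-condition node (crosswind) is a child of a sensor node (anemometer) whose reading is propagated up to the root; a blur node could similarly be attached beneath a vision sensor node. The classes and the world-state dictionary are illustrative assumptions, not an existing API.

```python
# Rough sketch: environmental-condition nodes influencing sensor nodes in an acyclic graph.
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def evaluate(self, world):
        """Default: evaluate children and pass their outputs up the graph."""
        return {child.name: child.evaluate(world) for child in self.children}

class CrosswindNode(Node):
    def evaluate(self, world):
        return world.get("crosswind_mps", 0.0)     # simulated crosswind speed

class AnemometerNode(Node):
    def evaluate(self, world):
        # The sensor "detects" whatever its environmental children project onto it.
        return sum(child.evaluate(world) for child in self.children)

# Fragment of a robot avatar graph: crosswind -> anemometer -> controller root.
graph = Node("robot_controller", children=[
    AnemometerNode("anemometer", children=[CrosswindNode("crosswind")]),
])
readings = graph.evaluate({"crosswind_mps": 3.2})   # {"anemometer": 3.2}
```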
In the real world (i.e., non-simulated environment), a robot controller may receive, e.g., from one or more sensors (e.g., 1081-m), sensor data that informs the robot controller about a state of the environment in which the robot operates. The robot controller may process the sensor data (perception) to make various determinations and/or decisions (planning) based on the state, such as path planning, grasp selection, localization, mapping, etc. Many of these determinations and/or decisions may be made by the robot controller using one or more machine learning models. Based on these determinations/decisions, the robot controller may provide (execution) joint commands to various joint(s) (e.g., 1041-n of robot 100).
When a robot controller is coupled with virtual environment 240 simulated by simulation engine 136, a plurality of robot avatars 200′1-16 may be operated by the robot controller in a similar fashion. Sixteen robot avatars 200′1-16 are depicted in this example, although this is not meant to be limiting.
Instead of receiving real-world sensor data from real-world sensors (e.g., 108, 248), simulation engine 136 may simulate sensor data within virtual environment 240, e.g., based on a perspective of one or more of the robot avatars 200′1-16 within virtual environment 240. For instance, simulated vision data may be generated from the perspective of a simulated sensor 248′ of first robot avatar 200′1.
Additionally, and as shown by the arrows in the accompanying figures, data may flow in both directions between the robot controller and virtual environment 240: simulated sensor data may be provided to the robot controller, and the common set of joint commands generated by the robot controller in response may be propagated to the robot avatars 200′1-16.
It is not necessary that a fully-functional robot be coupled with simulation engine 136 in order to simulate robot avatar(s). In some implementations, a robot controller may be executed wholly or partially in software to simulate inputs to (e.g., sensor data) and outputs from (e.g., joint commands) a robot. Such a simulated robot controller may take various forms, such as a computing device with one or more processors and/or other hardware. A simulated robot controller may be configured to provide inputs and receive outputs in a fashion that resembles, as closely as possible, an actual robot controller integral with a real-world robot (e.g., 200). Thus, for example, the simulated robot controller may output joint commands at the same frequency as they are output by a real robot controller. Similarly, the simulated robot controller may retrieve sensor data at the same frequency as real sensors of a real-world robot. Additionally or alternatively, in some implementations, aspects of a robot that form a robot controller, such as logic 102, memory 103, and/or various buses to/from joints/sensors, may be physically extracted from a robot and, as a standalone robot controller, may be coupled with simulation system 130.
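The following sketch shows one way a software-only robot controller might be ticked at the same control frequency as its real-world counterpart; the callback interfaces and the 100 Hz rate are assumptions for illustration.

```python
# Sketch: a simulated controller loop ticking at a fixed, real-controller-like rate.
import time

def run_simulated_controller(get_sensor_data, plan_joint_commands, send_joint_commands,
                             hz=100.0, duration_s=1.0):
    """Pull sensor data and push joint commands once per control cycle."""
    period = 1.0 / hz
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        cycle_start = time.monotonic()
        sensor_data = get_sensor_data()              # simulated sensor readings
        commands = plan_joint_commands(sensor_data)  # perception/planning/execution
        send_joint_commands(commands)                # joint commands back to the simulation
        # Sleep off the remainder of the cycle so commands are issued at the real rate.
        time.sleep(max(0.0, period - (time.monotonic() - cycle_start)))
```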
Robots (e.g., 200), standalone robot controllers, and/or simulated robot controllers may be coupled to or “plugged into” virtual environment 240 via simulation engine 136 using various communication technologies. If a particular robot controller or simulated robot controller is co-present with simulation system 130, it may be coupled with simulation engine 136 using one or more personal area networks (e.g., Bluetooth), various types of universal serial bus (“USB”) technology, or other types of wired technology. If a particular robot controller (simulated, standalone, or integral with a robot) is remote from simulation system 130, the robot controller may be coupled with simulation engine 136 over one or more local area and/or wide area networks, such as the Internet.
In various implementations, each instance 250′ of the interactive object may be rendered with a pose (or more generally, a simulated physical characteristic) that is varied from the rendered poses of the other instances. For example, in the first row of the depicted arrangement, second instance 250′2 is translated slightly to the left relative to the baseline pose of first instance 250′1, third instance 250′3 is translated slightly to the left relative to second instance 250′2, and fourth instance 250′4 is translated slightly to the left relative to third instance 250′3.
The opposite is true in the second row. Fifth instance 250′5 is translated slightly to the right relative to the baseline pose of first instance 250′1. Sixth instance 250′6 is translated slightly to the right relative to fifth instance 250′5. Seventh instance 250′7 is translated slightly to the right relative to sixth instance 250′6. And eighth instance 250′8 is translated slightly to the right relative to seventh instance 250′7. Note that there is no significance to the arrangement of translations (or rotations) depicted in these examples; the arrangement is merely illustrative.
In addition to translation being used to vary poses, in some implementations, poses may be varied in other ways. For example, in the third row of the depicted arrangement, instances 250′9-12 may be rotated by varying amounts relative to the baseline pose of first instance 250′1.
Moreover, while not depicted in the figures, other simulated physical characteristics of the instances, such as size, mass, or surface texture, may be varied as well.
As noted previously, the robot controller of robot 200 may receive simulated sensor data, e.g., from simulated sensor 248′ of first robot avatar 200′1, that captures first instance 250′1 of interactive object 250 in the baseline pose depicted at the top left of the arrangement described above. Based on this simulated sensor data, the robot controller may generate a set of joint commands that is used to operate first robot avatar 200′1 to interact with first instance 250′1 of interactive object 250.
The same or “common” set of joint commands are also used to operate the other robot avatars 200′2-16 to interact with the other instances 250′2-16 of interactive object 250. For instance, second robot avatar 200′2 may actuate its joints in the same way to interact with second instance 250′2 of interactive object 250. Third robot avatar 200′3 may actuate its joints in the same way to interact with third instance 250′3 of interactive object 250. And so on.
As the pose of each instance 250′ of interactive object 250 varies to a greater degree from the baseline pose of first instance 250′1, it is increasingly likely that execution of the common set of joint commands will result in an unsuccessful operation by the respective robot avatar 200′. For example, it may be the case that robot avatars 200′1-3 are able to successfully act upon instances 250′1-3, but fourth robot avatar 200′4 is unable to successfully act upon fourth instance 250′4 of interactive object 250 because the variance of the pose of fourth instance 250′4 is outside of a tolerance of robot avatar 200′ (and hence, of real-world robot 200).
The outcomes (e.g., successful or unsuccessful) of robot avatars 200′1-16 acting upon instances 250′1-16 of interactive object 250 may be recorded, e.g., as training episodes. These training episodes may then be used for various purposes, such as adjusting one or more parameters associated with operation of one or more components of a real-world robot. In some implementations, the outcomes may be used to train a machine learning model such as a reinforcement learning policy, e.g., as part of a reward function. Additionally or alternatively, in some implementations, the outcomes may be used to learn tolerances of robot 200. For example, an operational tolerance of an end effector (e.g., 106) to variations between captured sensor data and reality can be ascertained. Additionally or alternatively, a tolerance of a vision sensor (e.g., 248) may be ascertained. For example, if robot avatars 200′ were successful in acting upon instances 250′ with poses that were translated less than some threshold distance from the baseline pose, a vision sensor having corresponding resolution capabilities may be usable with the robot (or in the same context).
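A minimal sketch of turning per-avatar outcomes into training episodes and an estimated pose tolerance is shown below; the `Episode` fields and the binary reward are illustrative assumptions.

```python
# Sketch: outcomes -> training episodes and an estimated pose tolerance.
from dataclasses import dataclass

@dataclass
class Episode:
    observation: object    # e.g., simulated sensor data capturing the baseline instance
    action: object         # the common set of joint commands
    pose_offset_m: float   # deviation of the acted-upon instance from the baseline pose
    reward: float          # 1.0 on success, 0.0 on failure

def build_episodes(observation, joint_commands, offsets, successes):
    return [Episode(observation, joint_commands, off, 1.0 if ok else 0.0)
            for off, ok in zip(offsets, successes)]

def estimated_tolerance_m(episodes):
    """Largest pose deviation that still produced a successful interaction."""
    good = [ep.pose_offset_m for ep in episodes if ep.reward > 0]
    return max(good, default=0.0)
```

An estimated tolerance on the order of 0.5 millimeters, for instance, would be consistent with selecting sensors accurate to within 0.5 millimeters, as described previously.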
Graph 400 includes, as a root node, a robot controller 402 that is external to the virtual environment 240. In other implementations, the robot controller may not be represented as a node, and instead, a root node may act as an interface between the robot controller and child nodes (which may represent sensors and/or other robot controllers simulated in the virtual environment). Robot controller 402 may be implemented with various hardware and software, and may include components such as logic 102, memory 103, and in some cases, bus(ses) such as bus 110 of robot 100. In this example, robot controller 402 includes a perception module 403, a planning module 406, and an execution module 407, which are described below.
Perception module 403 may receive sensor data from any number of sensors. In the real world, this sensor data may come from real life sensors of the robot in which robot controller 402 is integral. In virtual environment 240, this sensor data may be simulated by and propagated up from various sensor nodes 4081, 4082, 4083, . . . that represent virtual sensors simulated by simulation engine 136. For example, a vision sensor 4081 may provide simulated vision data, an anemometer 4082 may provide simulated data about wind speed, a torque sensor 4083 may provide simulated torque data captured at, for example, one or more robot joints 404, and so forth.
In some implementations, simulated environmental conditions may also be represented as nodes of graph 400. These environmental conditions may be propagated up from their respective nodes to the sensor(s) that would normally sense them in real life. For example, airborne particulate (e.g., smoke) that is desired to be simulated in virtual environment 240 may be represented by an airborne particulate node 411. In various implementations, aspects of the desired airborne particulate to simulate, such as its density, particle average size, etc., may be configured into node 411, e.g., by a user who defines node 411.
In some implementations, aside from being observed by a sensor, an environmental condition may affect a sensor. This is demonstrated by Gaussian blur node 415, which may be configured to simulate an effect of particulate debris collected on a lens of vision sensor 4081. To this end, in some implementations, the lens of vision sensor 4081 may be represented by its own node 413. In some implementations, having a separate node for a sensor component such as a lens may enable that component to be swapped out and/or configured separately from other components of the sensor. For example, a different lens could be deployed on vision sensor node 4081 by simply replacing lens node 413 with a different lens node having, for instance, a different focal length.
As another example of an environmental condition, suppose the robot represented by graph 400 is a UAV that is configured to, for instance, pick up and/or deliver packages. In some such implementations, a crosswind node 417 may be defined that simulates crosswinds that might be experienced, for instance, when the UAV is at a certain altitude, in a particular area, etc. By virtue of the crosswind node 417 being a child node of anemometer node 4082, the simulated cross winds may be propagated up to, and detected by, the anemometer that is represented by node 4082.
Perception module 403 may be configured to gather sensor data from the various simulated sensors represented by nodes 4081, 4082, 4083, . . . during each iteration of robot controller 402 (which may occur, for instance, at a robot controller's operational frequency). Perception module 403 may then generate, for instance, a current state. Based on this current state, planning module 406 and/or execution module 407 may make various determinations and generate joint commands to cause joint(s) of the robot avatar represented by graph 400 to be actuated.
Planning module 406 may perform what is sometimes referred to as “offline” planning to define, at a high level, a series of waypoints along a path for one or more reference points of a robot to meet. Execution module 407 may generate joint commands, e.g., taking into account sensor data received during each iteration, that will cause robot avatar joints to be actuated to meet these waypoints (as closely as possible). For example, execution module 407 may include a real-time trajectory planning module 409 that takes into account the most recent sensor data to generate joint commands. These joint commands may be propagated to various simulated robot avatar joints 4041-M to cause various types of joint actuation.
In some implementations, real-time trajectory planning module 409 may provide data such as object recognition and/or pose data to a grasp planner 419. Grasp planner 419 may then generate and provide, to gripper joints 4041-N, joint commands that cause a simulated robot gripper to take various actions, such as grasping, releasing, etc. In other implementations, grasp planner 419 may not be represented by its own node and may be incorporated into execution module 407. Additionally or alternatively, real-time trajectory planning module 409 may generate and provide, to other robot joints 404N+1-M, joint commands to cause those joints to actuate in various ways.
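The sketch below loosely mirrors one controller iteration through perception (403), real-time trajectory planning (409) over an offline plan (406), execution (407), and grasp planning (419); the data structures and helper callables are assumptions for illustration, not the disclosed implementation.

```python
# Simplified sketch of one perception -> planning -> execution cycle.
def controller_iteration(sensor_readings, joint_space_plan, grasp_planner):
    # Perception (403): fuse the latest simulated sensor readings into a current state.
    state = {
        "object_pose": sensor_readings.get("vision"),
        "wind_mps": sensor_readings.get("anemometer"),
        "joint_torques": sensor_readings.get("torque"),
    }

    # Real-time trajectory planning (409): take the next joint-space target from the
    # offline plan (406), assumed here to be a list of {joint_name: angle} dicts.
    arm_commands = joint_space_plan.pop(0) if joint_space_plan else {}

    # Execution (407): merge arm commands with gripper commands from the grasp
    # planner (419) for propagation to the simulated joints.
    commands = dict(arm_commands)
    commands.update(grasp_planner(state))
    return commands
```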
An example method 500 for practicing selected aspects of the present disclosure will now be described. For convenience, the operations of method 500 are described with reference to a system that performs the operations, such as simulation system 130.
At block 502, the system, e.g., by way of simulation engine 136, may simulate a three-dimensional (3D) environment. The simulated 3D environment may include a plurality of simulated robots (e.g., robot avatars 200′1-16 in virtual environment 240) that are controlled by a single robot controller.
At block 504, the system, e.g., by way of simulation engine 136, may render multiple instances (e.g., 250′1-16 in virtual environment 240) of an interactive object in the simulated 3D environment. Each instance of the interactive object may have a simulated physical characteristic, such as a pose, that is unique among the multiple instances of the interactive object. In some implementations, one of the multiple instances (e.g., 250′1) may be rendered in a baseline pose, and the poses of the other instances may be altered relative to the baseline pose.
At block 506, the system, e.g., by way of simulation engine 136, may provide sensor data to the robot controller. In some such implementations, the sensor data may capture the one of the multiple instances (e.g., 250′1) of the interactive object in the baseline pose. The robot controller may generate the common set of joint commands based on this sensor data.
At block 508, the system, e.g., by way of simulation engine 136, may receive, from the robot controller, a common set of joint commands to be issued to each of the plurality of simulated robots. At block 510, the system, e.g., by way of simulation engine 136, may cause actuation of one or more joints of each simulated robot to interact with a respective instance of the interactive object in the simulated 3D environment.
At block 512, the system, e.g., by way of simulation engine 136, may determine outcomes (e.g., successful, unsuccessful) of the interactions between the plurality of simulated robots and the multiple instances of the interactive object. Based on the outcomes, at block 514, the system may adjust one or more parameters associated with operation of one or more components of a real-world robot. For example, tolerance(s) may be ascertained and/or reinforcement learning policies may be trained.
User interface input devices 622 may include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a touchscreen incorporated into the display, audio input devices such as voice recognition systems, microphones, and/or other types of input devices. In general, use of the term “input device” is intended to include all possible types of devices and ways to input information into computer system 610 or onto a communication network.
User interface output devices 620 may include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices. The display subsystem may include a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), a projection device, or some other mechanism for creating a visible image. The display subsystem may also provide non-visual display such as via audio output devices. In general, use of the term “output device” is intended to include all possible types of devices and ways to output information from computer system 610 to the user or to another machine or computer system.
Storage subsystem 624 stores programming and data constructs that provide the functionality of some or all of the modules described herein. For example, the storage subsystem 624 may include the logic to perform selected aspects of method 500, and/or to implement one or more aspects of robot 100 or simulation system 130. Memory 625 used in the storage subsystem 624 can include a number of memories including a main random access memory (RAM) 630 for storage of instructions and data during program execution and a read only memory (ROM) 632 in which fixed instructions are stored. A file storage subsystem 626 can provide persistent storage for program and data files, and may include a hard disk drive, a CD-ROM drive, an optical drive, or removable media cartridges. Modules implementing the functionality of certain implementations may be stored by file storage subsystem 626 in the storage subsystem 624, or in other machines accessible by the processor(s) 614.
Bus subsystem 612 provides a mechanism for letting the various components and subsystems of computer system 610 communicate with each other as intended. Although bus subsystem 612 is shown schematically as a single bus, alternative implementations of the bus subsystem may use multiple busses.
Computer system 610 can be of varying types including a workstation, server, computing cluster, blade server, server farm, smart phone, smart watch, smart glasses, set top box, tablet computer, laptop, or any other data processing system or computing device. Due to the ever-changing nature of computers and networks, the description of computer system 610 is intended only as a specific example for purposes of illustrating some implementations. Many other configurations of computer system 610 are possible, having more or fewer components than the computer system described herein.
While several implementations have been described and illustrated herein, a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein may be utilized, and each of such variations and/or modifications is deemed to be within the scope of the implementations described herein. More generally, all parameters, dimensions, materials, and configurations described herein are meant to be exemplary, and the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific implementations described herein. It is, therefore, to be understood that the foregoing implementations are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, implementations may be practiced otherwise than as specifically described and claimed. Implementations of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the scope of the present disclosure.