The present disclosure relates to robotic systems, controllers, and methods for controlling manipulation of objects, and more particularly to systems and methods for object orientation and manipulation by machine learning based control.
Various industrial activities such as product assembly and packaging involve manipulation of objects by robotic assemblies of different types. One or more robots usually grasp and hold objects while carrying out an assembly operation or while moving such objects between two points in the assembly line. Also, in many non-contact-based sub-processes, it may be desired that an object be placed on a surface or conveyor in a particular pose, for example to avoid collision with other objects or entities. Given that objects may come in different shapes and sizes and may be of different rigidities, each of them may have specific handling requirements. In view of such handling requirements, the robots manipulating such objects are required to approach the objects from a limited number of sides, angles, and surfaces.
Furthermore, specific to some processes, the objects may be required to be oriented in one or more particular orientations for successful execution of such processes. For example, during automated product assembly tasks, it is necessary to properly orient incoming parts or sub-assemblies of parts so that a robot can grasp the part in an installable orientation. Available solutions are based on hard automation and are mostly single-part orientation systems. For example, a vibratory bowl feeder is not usable for any part other than the part it is designed for. Such conventional approaches lack meaningful, fast, and flexible schemes and measures to orient objects in any desired pose. Any attempt to make such systems flexible requires reconfiguring the hard automation, which is time consuming, labor intensive, and often infeasible. Accordingly, it is desired to have systems, methods, and controllers that can help orient objects, components, and parts in any desired manner. Such a need is especially pressing in industrial processes since objects, components, and parts can be highly varied in shape, and market demand is moving more and more towards mass customization. Therefore, there is a need to move away from hard automation and single-part orientation systems to more flexible and generalized part orientation systems.
Example embodiments described herein are directed towards machine control with particular emphasis on controlling operations of robots. It is an object of some embodiments to provide control policies for orienting randomly positioned objects into any desired pose. Automated object orientation using available techniques requires knowledge of contact and impulse dynamics of the candidate object. Some example embodiments are based on the realization that the contact and impulse dynamics of an irregular part are infeasible to model with traditional physical methods. Some example embodiments are also based on the realization that, specific to each desired orientation, a robot attempting to orient an object in a desired orientation would require a custom program to grasp and flip or invert the object to the desired orientation. As the robot can, at best, only approach the object in a direction contained in the hemisphere above the platform on which the object is supported, it is not feasible to define a custom program for each desired orientation.
Some example embodiments are directed towards providing maneuvering schemes that help a robot in orienting an object in a desired orientation without increasing the takt time significantly. It is also an object of some example embodiments to provide an improved design of a robotic assembly for object orientation using an array of actuators controlled using commands that increase the likelihood of changing a current orientation of an object to the desired orientation without taking into consideration the contact and impulse dynamics of the object. Some example embodiments use an array of actuators to impart an impulse on a surface that the objects are lying on, with the objective of flipping them on a different face/side.
Some example embodiments also aim to provide machine learning based control schemes to control object manipulators. Some example embodiments provide self-supervised learning based approaches to generate control policies for orienting objects into desired poses. The object's current state, defined by a location and an orientation of the object, is taken as an input by a machine learning system powered by a learned function, and the machine learning system outputs a selection of the actuator(s) to be fired and the impulse duration(s) that have the best probability of putting the object into a desired pose. The desired pose may be, for example, a pose that is acceptable for robot grasping. Some example embodiments use learned probabilistic models that directly yield likely outcomes as a function of the current state of an object and applied control commands to actuators, and use those models to choose the optimal control commands.
It is a realization of some example embodiments that one approach to learning manipulation is to learn a full state-space model of the system dynamics, using various system identification methods. Although this approach is very productive for linear systems, some example embodiments recognize that the complicated non-linear nature of contact dynamics requires the application of advanced methods for learning non-linear and possibly hybrid discrete/continuous dynamical models. Various universal function approximation methods including neural networks may be used to learn system dynamics. Recent interest in model-based reinforcement learning has renewed research efforts to find good methods for learning world models. In this regard, Contact Nets have considerably improved the accuracy of predictive models with respect to earlier dynamical models based on standard neural networks. Some example embodiments are based on the realization that learning such models is quite complicated, and might also be an overkill for the control problem, where prediction of the entire future trajectory of the manipulated object is not really necessary, and predicting only the stable resting state is sufficient.
In this regard, some example embodiments are based on learning predictive models that predict only the resting state of the manipulated object as a result of a particular action (actuator fired). Similar to stochastic learning automata (SLA), these predictive models are probabilistic, to capture the inherent stochasticity of the complex contact dynamics involved. However, unlike SLA, example embodiments are based on models that use the full continuous state of the manipulated object, measured as precisely as technically and economically feasible, for the purposes of predicting the resting state. The underlying control problem possesses significant aleatoric (inherent) uncertainty (mostly due to chaotic bifurcation dynamics and contact phenomena), but not necessarily significant epistemic (observational) uncertainty, and there is no reason to artificially inject such epistemic uncertainty by quantizing the state; rather, a more productive approach is to measure the continuous state as accurately as possible, and then employ machine learning methods that can work with the full continuous state.
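By way of a non-limiting illustration, the following Python sketch shows one way such a probabilistic resting-state model could be realized from logged transitions. The class name, the state encoding, and the nearest-neighbor estimate are assumptions of this sketch and not a prescribed implementation; the probability of a given resting orientation is estimated empirically as the fraction of the nearest logged states, under the same command, that came to rest in that orientation.

```python
import numpy as np

# Hypothetical sketch of an empirical resting-state model built from
# logged (state, command, resting orientation) transitions. The class
# name, state encoding, and nearest-neighbor estimate are illustrative.

class RestingStateModel:
    """Estimates P(resting orientation | continuous state, command)."""

    def __init__(self, states, commands, outcomes, k=15):
        self.states = np.asarray(states, dtype=float)  # (N, d) poses
        self.commands = list(commands)                 # discrete command ids
        self.outcomes = list(outcomes)                 # observed resting poses
        self.k = k

    def outcome_probability(self, state, command, target):
        # Fraction of the k nearest logged states, under this command,
        # that came to rest in the target orientation.
        idx = [i for i, c in enumerate(self.commands) if c == command]
        if not idx:
            return 0.0
        dists = np.linalg.norm(self.states[idx] - np.asarray(state), axis=1)
        nearest = np.argsort(dists)[: self.k]
        hits = sum(self.outcomes[idx[j]] == target for j in nearest)
        return hits / len(nearest)

    def best_command(self, state, target, vocabulary):
        # Pick the command with the highest estimated probability of
        # leaving the object at rest in the target orientation.
        return max(vocabulary,
                   key=lambda c: self.outcome_probability(state, c, target))
```

Because the model works directly with the full continuous state, no quantization of the measured pose is needed; only the command vocabulary is discrete.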
Some embodiments are directed towards providing closed loop controllers for executing such control policies for orienting randomly positioned objects into any desired pose. The closed loop controllers continuously process the input states indicative of poses of the object and the predicted commands causing change in orientation of the object, to reach the desired states indicative of the desired poses. In this regard, the closed loop controller obtains the input states from pose data detected by a suitable system and obtains the predicted commands from a machine learning system to command a set of actuators to apply impulse forces to the object to reach the desired states. Some example embodiments may utilize the closed loop controller to train the machine learning system from scratch or on the fly. In this way, example embodiments provide measures for deciding how to control the system in order to manipulate the objects in an optimal way to reach the desired state.
Some embodiments are directed towards providing robotic assemblies for orienting such randomly positioned objects into any desired pose. Some example embodiments provide a robotic assembly comprising a robot for manipulating the object and a closed loop controller for controlling the operations of the robot. The controller may also control a set of actuators to orient the object in a desired pose by providing optimal sets of control commands to the set of actuators. In this regard, the controller interfaces with an imaging system that provides pose data of the object and a machine learning system that maps the pose data to a set of control commands causing the set of actuators to apply impulse forces to the object such that the forces change the orientation of the object to a desired or target orientation. The controller may command the robot to manipulate the object when the object is oriented in the desired orientation.
The desired orientation for the object may be predefined in a database or may be provided as an input to the system, for example by a user/operator or by other machines. In this way, example embodiments of this invention help improve the overall efficiency of robotic manipulation tasks by providing an added capability of orienting parts in a desired manner. Such an additional capability may be configured specific to each individual part being manipulated, without requiring individual machines or assemblies for each such part. For these reasons, example embodiments of this invention may be seamlessly integrated into existing systems such as production line assemblies and similar robots without requiring reconfiguration.
In order to achieve the aforementioned objectives and advantages, example embodiments of this disclosure provide a controller and a method for robotic manipulation of a component or an object, a robotic assembly for manipulating the component or object and a method for controlling the robotic assembly.
Some example embodiments provide a robotic assembly comprising a supporting surface configured to support an object and a set of actuators coupled with the supporting surface. Each actuator of the set of actuators is configured to apply an impulse to the supporting surface with energy governed by a corresponding control command of a set of control commands. The robotic assembly also comprises a memory configured to store a learned function trained with machine learning to map a location and an orientation of the object at the supporting surface to one or more commands of the set of control commands. The robotic assembly also comprises a processor communicatively coupled to the other components of the robotic assembly.
The processor accepts a plurality of instances of pose data of the object at the supporting surface. According to some example embodiments, the plurality of instances of the pose data of the object may be provided by an imaging system that detects a location and orientation of the object on the supporting surface at discrete time intervals. The processor obtains a first current location and a first current orientation of the object based on a first instance of the plurality of instances of the pose data of the object. The object may have a plurality of stable orientations and one or more of the plurality of stable orientations may include a target orientation of the object in which the object is desired to be oriented. The processor then executes the learned function stored in the memory to map the first current location and the first current orientation of the object to at least one control command of the set of control commands.
The processor further submits the at least one control command to the set of actuators to apply a corresponding specific distribution of energy at the first current location of the object at the supporting surface to increase a likelihood of changing the first current orientation to a target orientation of the object. The processor then obtains a second current orientation of the object based on a second instance of the plurality of instances of the pose data of the object and additionally or optionally commands a robotic manipulator to manipulate the supported object based on a match between the second orientation of the object and the target orientation.
According to some example embodiments, a method for robotic manipulation of an object supported at a supporting surface of a robot is provided. The method comprises receiving a plurality of instances of pose data of the object at the supporting surface and obtaining a first current location and a first current orientation of the object based on a first instance of the plurality of instances of the pose data of the object. The method further comprises executing a learned function to map the first current location and the first current orientation of the object to at least one control command of a set of control commands. The learned function is trained with machine learning to map a location and an orientation of the object at the supporting surface to one or more commands of the set of control commands. The method further comprises submitting the at least one control command to a set of actuators to apply a corresponding specific distribution of energy at the first current location of the object at the supporting surface to increase a likelihood of changing the first current orientation to a target orientation of the object. Each actuator of the set of actuators is configured to apply an impulse to the supporting surface with energy governed by a corresponding control command of the set of control commands. The method further comprises obtaining a second current orientation of the object based on a second instance of the plurality of instances of the pose data of the object and commanding a robotic manipulator to manipulate the supported object based on a match between the second orientation of the object and the target orientation.
Some example embodiments provide a robotic controller for controlling a robotic assembly comprising a flexible bowl feeder supporting an object, a robotic arm for manipulating the supported object, and a set of actuators. The robotic controller is in communication with an impulse generator for controlling the set of actuators, an imaging system for generating a plurality of instances of pose data of the object, and the robotic arm. The robotic controller comprises an interface configured to communicate with a database of learned associations between candidate orientations of the object and one or more control commands of a set of control commands yielding new orientations. The object may have a plurality of stable orientations and one or more of the plurality of stable orientations include a target orientation of the object. The interface is further configured to accept the target orientation of the object and accept the plurality of instances of pose data of the supported object.
The robotic controller also comprises a memory configured to store executable instructions and a processor configured to execute the executable instructions to obtain a current location and a first current orientation of the supported object based on a first instance of the plurality of instances of the pose data of the supported object. The processor is further configured to query the database with the first current orientation to obtain at least one control command of the set of control commands and submit the at least one control command to the impulse generator to cause the set of actuators to apply a corresponding specific distribution of energy at the current location of the supported object to increase a likelihood of changing the first current orientation to the target orientation of the object.
The processor is further configured to obtain a second current orientation of the object based on a second instance of the plurality of instances of the pose data of the supported object and command the robotic arm to manipulate the supported object based on a match between the second orientation of the object and the target orientation. The robotic controller may be implemented as one or more cloud based services whereby the imaging system uploads the plurality of instances of the pose data of the supported object and the impulse generator downloads the at least one control command.
Some example embodiments also provide systems and methods for training a function such as a classifier that is trained using one or a combination of the k-nearest neighbors (k-NN) algorithm, a support-vector machine (SVM), a radius-neighborhood (rN) algorithm, a random forest, a relevance vector machine (RVM), reinforcement learning (RL), or backward propagation minimizing a loss function. The training may comprise collecting a target orientation of an object on a supporting surface and controlling the set of actuators using multiple sets of random control commands to apply different distributions of energy at different locations of the supporting surface. The training also comprises detecting, using an imaging system, an orientation change of the object for each of the different distributions of energy, and training parameters of the function to produce the sets of control commands that increase the likelihood that the distributions of energy applied at the current location of the object change a current orientation of the object to the target orientation.
One beneficial aspect, among the many realized, is that embodiments of the present disclosure can continuously generate control commands for changing an orientation of an object without significantly increasing the wait time for the robotic manipulator. Such a capability leads to an improved robot capable of performing manipulation tasks without the need for separate sophisticated techniques for object manipulation. Such an improved robot also reduces the human intervention required in the process, thereby leading to reduced handling times and handling errors.
Some features of the embodiments of the present disclosure as described herein may be used in the automotive industry, aerospace industry, nonlinear control systems, and process engineering. Also, some features of the embodiments of the present disclosure can be used with a material-handling robot with multiple end-effectors suitable for applications in semiconductor wafer processing systems. For example, the above operations may be utilized to pick/place a wafer or substrate from/to an offset station.
The presently disclosed embodiments will be further explained with reference to the attached drawings. The drawings shown are not necessarily to scale, with emphasis instead generally being placed upon illustrating the principles of the presently disclosed embodiments.
The following description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the following description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing one or more exemplary embodiments. Contemplated are various changes that may be made in the function and arrangement of elements without departing from the spirit and scope of the subject matter disclosed as set forth in the appended claims.
Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, systems, processes, and other elements in the subject matter disclosed may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known processes, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. Further, like reference numbers and designations in the various drawings indicate like elements.
Also, individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed but may have additional steps not discussed or included in a figure. Furthermore, not all operations in any particularly described process may occur in all embodiments. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, the function's termination can correspond to a return of the function to the calling function or the main function.
Robots are deployed in several tasks ranging from day-to-day activities to complex industrial jobs. Many situations require robots to interact with objects and perform one or more tasks including but not limited to grasping and moving the objects. The objects with which the robots may interact serve as payloads for the robots. Quite often, the interactions of the robots with their corresponding payloads include robotic manipulations performed by manipulators on the payloads. A pose of an object may define a location and orientation of the object with respect to some reference plane such as an object supporting surface. In order to control some operations of such robotic manipulators, it is often desired that the object be presented to the robotic manipulators in one or more desired poses. This is especially important in product assembly lines, warehouses, and pick-and-place tasks where an operation on the object may be selectively performed on one or more surfaces of the object or where the object is required to be presented in a desired configuration to fit into the scenario. Additionally, in many robotic uses, it is not sufficient to guide the robot to a pose where the part is graspable; the part must also be oriented in a way that the robot can grasp the part without occluding the areas of the part that will need to be accessed for product assembly to proceed.
Robotic manipulators operate on objects under the guidance of computed trajectories. Trajectory computation in turn requires accurate physical models of the payloads and the environment to generate desired motions for the manipulators to accomplish designated tasks. All of these tasks rely on an assessment of the object in some desired configuration. Therefore, object orientation and the subsequent manipulation are important blocks in industrial tasks.
Example embodiments of the present disclosure are directed towards providing techniques and measures for efficiently orienting any object in a desired target orientation.
The robotic controller 11 controls one or more other components of the robotic assembly 10. In this regard, it is contemplated that the robotic controller 11 is communicatively coupled to other components of the robotic assembly 10 through suitable means such as wired or wireless medium. Various tasks performed by the robotic controller 11 may be executed by one or more processors that interface with one or more memories. The memories may store program instructions for execution by the processors and data utilized by the processors. In some example embodiments, at least one of the memories may store a learned function mapping a state of the object 5 to a set of control commands for the actuators 9. Additionally or optionally the robotic controller 11 may comprise or interface with one or more databases of learned association between states of objects and control commands for the actuators 9. The robotic controller 11 may additionally comprise an interface for data communication with other machines and devices such as the other components of the robotic assembly 10.
The impulse controller 13 may be a standalone component of the robotic assembly 10 or may be a part of one or more other components of the robotic assembly 10. For example, in some example embodiments, the impulse controller 13 may be a part of the robotic controller 11 while in some other example embodiments, the impulse controller 13 may be a part of the electromechanical subsystem 3. Irrespective of its association with other components of the robotic assembly 10, the impulse controller 13 may be realized using one or more processors and suitable circuitry to generate impulse signals for a set of actuators 9 of the electromechanical subsystem 3. For example, the impulse controller 13 may comprise a microprocessor and an array of high-powered field effect transistor (FET) pulldown pulse generators. The impulse controller 13 produces impulse signals for the actuators 9 under the guidance of control commands issued by the robotic controller 11. The control commands may indicate a subset of the set of actuators that are to be fired and/or the durations and energies with which the actuators are to be fired. Accordingly, the impulse controller 13 may send impulse signals to only those actuators of the set of actuators 9 that are indicated in the control command.
The electromechanical subsystem 3 comprises a supporting surface 7 for supporting one or more objects such as the object 5 and the set of actuators 9 for applying impulse forces to the supporting surface 7. In some example embodiments, the object 5 may be a part, component or any solid body that is capable of being flipped, tilted or moved without being disintegrated upon application of impulse forces on it. For example, the object 5 may be a nut, a bolt, an electronic component, a package, some mechanical or electrical component and the like. The object 5 may have a plurality of stable orientations (i.e. orientations in which the object is in static equilibrium). One or more of the plurality of stable orientations may include a target orientation of the object 5.
The application of some impulse forces by the actuators 9 causes a change in one or more of a location and an orientation of the object 5. In this regard, the actuators 9 may be activated by the impulse controller 13 by providing suitable electrical signals to the actuators 9. The impulse controller 13 operates under the guidance of control commands issued by the robotic controller 11. The control commands for the set of actuators 9 may be defined such that a specific set of control commands submitted to the set of the actuators 9 causes a corresponding specific distribution of energy applied to the supporting surface 7. A control command may be a digital command that identifies the actuators that are to be powered or fired and the duration for which those actuators are to be fired. It may be contemplated that a control command may cause different motions of the object 5 at different locations on the supporting surface 7. That is, the same control command may cause a certain motion of the object 5 at a first current location of the object 5 while it may cause a different motion of the object 5 at a second current location different from the first current location of the object 5.
In some example embodiments, a control command may be defined specific to an actuator of the set of actuators 9. For example, a control command may indicate “fire actuator #n for time period t, where t1≤t≤t2”. In some example embodiments, a control command may be defined for a plurality of actuators. For example, a control command may indicate “fire actuator #n1 for time period tn1, actuator #n2 for time period tn2, . . . , actuator #np for time period tnp, where t1≤tn1≤t2, t3≤tn2≤t4, . . . , tx≤tnp≤ty”.
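As a non-limiting illustration, and anticipating the seven-actuator, three-duration example discussed further below, such a quantized command might be represented as follows; the names and the specific counts are assumptions of the sketch only.

```python
from dataclasses import dataclass
from itertools import product

# Illustrative sketch of a quantized control-command representation.
# The actuator count and durations are assumptions for the example only.

@dataclass(frozen=True)
class ActuatorPulse:
    actuator_id: int   # which actuator of the set to fire
    duration_ms: int   # how long to energize it

# A control command may address one or several actuators.
Command = tuple[ActuatorPulse, ...]

N_ACTUATORS = 7
DURATIONS_MS = (17, 21, 25)   # three quantized power levels

# Single-actuator command vocabulary: 7 actuators x 3 durations = 21 commands.
VOCABULARY: list[Command] = [
    (ActuatorPulse(a, d),)
    for a, d in product(range(N_ACTUATORS), DURATIONS_MS)
]
assert len(VOCABULARY) == 21
```

Keeping the vocabulary small and discrete in this way is what makes the learned mapping from object state to command tractable.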
The imaging system defined by the imaging device 15A and the illumination source 15B, may additionally or optionally comprise an image processor (not shown) to process images captured by the imaging device 15A. In some example embodiments, the image processing may be performed by a processor that is external to the imaging system such as by the robotic controller 11 itself. The imaging device 15A may be a digital camera or any other suitable imaging source for capturing still images and/or videos of the supporting surface 7 of the electromechanical subsystem 3. The illumination source or illuminator 15B illuminates a field of view of the imaging device 15A. In some example embodiments, the illumination source 15B may be a ring light powered by suitable illumination subsystems such as light emitting diodes (LEDs) or incandescent bulbs. In some example embodiments, the illumination source 15B may be a point source or a beam source illuminator. In some example embodiments, the illumination source 15B may be optional to the imaging system. The image processor may execute suitable image processing algorithms such as machine vision procedures to process the images captured by the imaging device 15A to detect a state or pose of the objects 5 on the supporting surface 7. A state or pose of an object as used herein may correspond to one or more of a location and an orientation of the object on the supporting surface 7.
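For illustration only, one simple machine vision procedure for recovering a planar pose from a top-down image is sketched below, assuming OpenCV is available; the Otsu thresholding scheme is an assumption, and a complete system would typically also classify which stable face of the object is resting on the surface.

```python
import cv2

# Illustrative machine vision sketch (assuming OpenCV): recover a planar
# pose (location and in-plane angle) of the largest object seen in a
# top-down image of the supporting surface.

def detect_pose(image_bgr):
    """Return ((x, y), angle_deg) of the largest blob, or None."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    # minAreaRect gives the center and rotation of the tightest
    # bounding box, a fast stand-in for a full pose estimator.
    (cx, cy), _, angle = cv2.minAreaRect(largest)
    return (cx, cy), angle
```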
The robotic controller 11 executes a series of feedback based operations to orient the object 5 at the supporting surface 7 to a target orientation of the object 5. It may be contemplated that the robotic controller 11 may be configured for any number and type of objects and may orient each object in its desired orientation serially. In some other example embodiments, the robotic controller 11 may orient multiple objects in their respective orientations in parallel or jointly, using some or all of the actuators 9 in conjunction with other components of the robotic assembly 10. According to some example embodiments, the robotic controller 11 may terminate operations when an object has been oriented in its desired target orientation. According to some example embodiments, the robotic controller 11 may communicate to another machine or device a signal indicative of successful orientation of an object in its desired target orientation. For example, the robotic controller 11 may output to a robot an output signal indicating that the object 5 has been oriented in its desired target state so that the robot can initiate further action on the oriented object 5. In another example embodiment, upon successful orientation of the object 5 in its desired target orientation on the supporting surface 7, the robotic controller 11 directly commands the robotic manipulator 17 to manipulate the oriented object 5.
In some example embodiments, the robotic manipulator 17 may be optional to the robotic assembly 10. The robotic manipulator 17 upon being commanded by the robotic controller 11 performs one or more manipulation tasks such as grasping on the object 5. The robotic manipulator 17 may be realized using any suitable configuration according to the manipulation task to be performed on the object 5. For example, in some example embodiments, the robotic manipulator 17 may comprise a robotic arm having grippers for grasping the oriented object 5.
The aforementioned components of the robotic assembly operate in conjunction to facilitate orientation and manipulation of an unoriented object. A detailed working of the robotic assembly 10 is described next with reference to
A target orientation of the object 5 on the supporting surface 7 is accepted 102, for example, from a user through an interface. In some example embodiments, the target orientation may be accepted from another program or device or may be loaded from a memory. In some example embodiments, a target location of the object 5 on the supporting surface may also be accepted at step 102. The supporting surface 7 with the object 5 is imaged 104 with the aid of the imaging device 15A and the illumination source 15B. Image processing using any suitable algorithm including but not limited to machine vision procedures is performed on the captured image of the supporting surface 7 with the object 5 to detect 106 a current location and a current orientation of the object 5 on the supporting surface 7. Hereinafter, the object 5 on the supporting surface 7 may also be referred to as the supported object 5.
The robotic controller 11 performs a check 108 whether the current orientation is similar to the target orientation defined for the supported object 5. The degree of similarity or match may be ascertained with respect to a threshold that may be configurable. For example, the overlap between the current orientation and the target orientation of the supported object may be expressed as a quantified value such as a whole number, fraction or percentage. If the quantified value is greater than or equal to a threshold value, the robotic controller 11 declares it as a match (outputs “Yes” at step 108), otherwise not a match (outputs “No” at step 108).
If the outcome of the check at 108 returns a “Yes” (i.e. the current orientation of the object 5 matches the target orientation) the control of steps passes to step 110 where the robotic controller commands the robotic manipulator 17 to manipulate the supported object 5. However, if the outcome of the check at 108 returns a “No” (i.e. the current orientation of the object 5 does not match the target orientation) the control of steps passes to step 112 where the robotic controller 11 executes a learned function stored in memory to map the current location and the current orientation of the supported object 5 to at least one control command for the set of actuators 9. The robotic controller 11 invokes the learned function from a memory and submits the detected current location and current orientation of the object 5 as an input to the learned function.
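A minimal sketch of the feedback loop of steps 102 through 114 is given below. The hardware- and model-facing callables belong to other components of the robotic assembly 10 and are therefore passed in as parameters; the similarity threshold and the iteration cap are illustrative assumptions, not values prescribed by this disclosure.

```python
# Minimal sketch of the feedback loop of steps 102-114. All callables
# (capture, detect, similarity, learned_function, fire, manipulate) are
# supplied by the surrounding system; threshold and cap are illustrative.

def orient_and_manipulate(target, capture, detect, similarity,
                          learned_function, fire, manipulate,
                          threshold=0.95, max_iters=100):
    for _ in range(max_iters):
        image = capture()                          # step 104: image surface
        location, orientation = detect(image)      # step 106: detect pose
        if similarity(orientation, target) >= threshold:    # step 108
            manipulate()                           # step 110: command robot
            return True
        command = learned_function(location, orientation)   # step 112
        fire(command)                              # step 114: apply impulse
    return False  # target orientation not reached within the cap
```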
In some example embodiments, the learned function may interface with a training database storing data indicative of an association between the object's states and control commands for the actuators. The extract of the training database may be consolidated to select at least one control command. In some example embodiments, the extract may be sorted by one or more of actuator identifier (an identity of each actuator) and impulse duration. The most common command (e.g., the command that was the statistical “mode” for successful graspability) may accordingly be selected. In some example embodiments, a command with the lowest required actuator impulse duration (energy) may be selected. In some example embodiments, an inverse-distance weighting may be used to select the command.
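The following sketch illustrates the three consolidation strategies mentioned above, under the assumption that the database extract is available as ((actuator_id, duration_ms), distance) tuples, where distance is the state-space distance of the training entry to the queried pose; all names are hypothetical.

```python
from collections import Counter

# Illustrative consolidation strategies over a non-empty neighbor
# extract; each neighbor is ((actuator_id, duration_ms), distance).

def select_most_common(neighbors):
    # "Mode" selection: the command occurring most often among the
    # neighboring entries that led to a successful (graspable) pose.
    return Counter(cmd for cmd, _ in neighbors).most_common(1)[0][0]

def select_lowest_energy(neighbors):
    # Prefer the command with the shortest (lowest-energy) impulse.
    return min(neighbors, key=lambda n: n[0][1])[0]

def select_inverse_distance_weighted(neighbors, eps=1e-6):
    # Weight each neighbor's vote by the inverse of its distance, so
    # that entries closer to the current pose count more heavily.
    scores = Counter()
    for cmd, dist in neighbors:
        scores[cmd] += 1.0 / (dist + eps)
    return scores.most_common(1)[0][0]
```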
This selection process is significantly simplified by the quantization of the commands, so that the total set of possible commands is relatively small. For example, for a system with seven actuators and three power levels of duration 17, 21, and 25 milliseconds, the full command vocabulary may be 21 possibilities, which may be relatively tractable for the machine learning procedures to operate with. The training database may be embodied locally as a part of the robotic assembly 10 or may be partly or fully hosted on a cloud service.
In some example embodiments, the entire command selection function may be fully hosted on a cloud service, typically by uploading the raw image from the camera, implementing step 106 of detecting current location and orientation, step 108 of location comparison, and step 112 of execution of the learned function to select at least one control command, and subsequently downloading the selected control command to be executed locally. Such an embodiment provides the ability to utilize a small and inexpensive local computing resource, while effectively time-sharing an expensive GPU-based machine-learning algorithm, as well as allowing the use of a proprietary learned function and/or proprietary database. In another example embodiment, the current location and orientation determination 106 may be performed locally, and only the orientation and location may be uploaded to the cloud and the selected firing command subsequently downloaded from the cloud, again allowing the effective use of expensive machine-learning hardware and/or securely using a proprietary learned function and/or a proprietary database.
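A hedged sketch of the second variant, in which pose detection runs locally and only the detected location and orientation are uploaded, might look as follows; the endpoint URL and the JSON schema are purely hypothetical placeholders and not part of the disclosed system.

```python
import requests

# Hedged sketch: upload (location, orientation), download the selected
# firing command. URL and response shape are hypothetical placeholders.

SERVICE_URL = "https://example.invalid/api/select-command"  # placeholder

def select_command_remote(location, orientation):
    payload = {"location": list(location), "orientation": orientation}
    response = requests.post(SERVICE_URL, json=payload, timeout=2.0)
    response.raise_for_status()
    # Assumed response shape: {"command": [{"actuator": 3, "ms": 21}]}
    return response.json()["command"]
```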
The learned function predicts at least one control command for the actuators 9 corresponding to the inputted current location and current orientation of the object 5. The at least one control command is submitted to the impulse controller 13 for activating one or more of the actuators 9. In some example embodiments, some learned functions may have configurations where no command may be selected according to the nominal command selection procedure of the embodiments described above. In such cases where the learned function's procedures do not fetch any command, a random command may be selected.
Using the at least one control command, the set of actuators 9 are fired 114 to apply a corresponding specific distribution of energy at the first current location of the object 5 at the supporting surface 7 to increase a likelihood of changing the first current orientation to the target orientation of the object 5. It may be contemplated that each actuator of the set of actuators 9 applies an impulse force to the supporting surface 7 with energy governed by a corresponding control command or a set of control commands. The control of steps then passes back to step 104 to image the supporting surface 7 again and repeat steps 106 and 108 until the object 5 is oriented in the target orientation.
In this way, the robotic assembly 10 using the method 100 can orient the object 5 in any desired target orientation. In some example embodiments, it may only be desired that the object 5 be oriented in the target orientation and there may be no requirement for manipulation of the object 5. In such scenarios, if the check at step 108 returns a “Yes”, the method 100 may be terminated and step 110 may not be executed.
The learned function may be a classifier learned using any one or a combination of suitable learning algorithms such as the k-nearest neighbors (k-NN) algorithm, a support-vector machine (SVM) algorithm, a radius-neighborhood (rN) algorithm, a random forest algorithm, a relevance vector machine (RVM) algorithm, a reinforcement learning (RL) algorithm, or backward propagation minimizing a loss function. To learn such a function and obtain an association between a state of the object and control commands for the actuators, a training procedure may be implemented beforehand. In some example embodiments, the learning of the function may be a continuous process.
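As one non-limiting possibility, a k-NN classifier over the recorded start states could be fitted as sketched below using scikit-learn; the state encoding, which uses the sine and cosine of the orientation angle to avoid the wrap-around discontinuity of a raw angle, is an assumption of this sketch.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical sketch: fit a k-NN classifier mapping a continuous start
# state to the index of the command that produced the target orientation.
# The sin/cos encoding of the angle avoids the 0/2*pi wrap-around.

def encode_state(x, y, theta_rad):
    return [x, y, np.sin(theta_rad), np.cos(theta_rad)]

def fit_command_classifier(start_states, command_labels, k=15):
    """start_states: iterable of (x, y, theta_rad) tuples harvested from
    the training database; command_labels: index of the command that
    changed that start state to the target orientation."""
    X = np.array([encode_state(*s) for s in start_states])
    y = np.array(command_labels)
    return KNeighborsClassifier(n_neighbors=k).fit(X, y)

# Usage (illustrative): predict a command for a newly detected pose.
# model = fit_command_classifier(states, labels)
# command_index = model.predict([encode_state(0.12, -0.03, 1.57)])[0]
```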
The training assembly 20 comprises a training controller 21, a training subsystem 23, a training imaging system comprising an imaging device 25A and an image processor 25B, a pulse controller 27 and a training database 29. Additionally, the training assembly 20 comprises an interface (not shown) for communicating data among the components of the training assembly 20 or with an external device. For example, the interface may be embodied as an input interface for accepting a target orientation of an object at the training supporting surface.
Structurally and functionally, the training controller 21 may be similar to the robotic controller 11 of
The training subsystem 23 comprises a training supporting surface 23A for supporting objects for orientation and a plurality of actuators 23B for applying impulse forces on the training supporting surface 23A. The imaging device 25A with the aid of the image processor 25B images the training supporting surface 23A and detects the location and orientation of objects placed on the training supporting surface 23A. The location and/or the orientation of each object on the training supporting surface 23A may be defined relative to a plane 23C of the training supporting surface 23A. The pulse controller 27 comprises a random pulse generator which may be controlled based on an input from the training controller 21. The random pulse generator generates random pulses for the actuators 23B, each random pulse activating one or more random actuators of the actuators 23B for a fixed or random duration of time. The training database 29 together with the training controller 21 may define a machine learning system.
In order to obtain a learned association between object states and control commands for actuators, the operational aspects of the training assembly 20 are described with reference to
The training supporting surface 23A supporting a training object is imaged 202 and the location and orientation of the supported training object is detected 204. The location and orientation detected at step 204 may be referred to as ‘prior location’ and ‘prior orientation’ of the training object. A random action command in the form of a random pulse from the pulse controller 27 is generated 206 and submitted to the set of actuators 23B. In this regard, the pulse controller 27 selects a randomly chosen actuator from the actuators 23B and a randomly chosen impulse power and transmits this random action command to the actuator array 23B as a part of step 206.
According to some example embodiments, for some machine learning procedures (for example kNN and rN) it may be preferred that the pulse controller 27 generates quantized random action commands rather than continuous-valued commands. For example, instead of randomly choosing the impulse power from the set of real numbers, it may be preferred to choose from a small set of integers.
Accordingly, the actuators 23B are activated 208 to apply one or more impulse forces to the training supporting surface 23A. This results in a kinetic impulse being applied to the object, causing the object to move to a new orientation and/or position. The distribution of energy produced by an impulse force is location dependent, at least along axes that lie in the plane 23C. In some scenarios, the application of the impulse forces may or may not change the location and/or the orientation of the object on the training supporting surface 23A. To ascertain this, the training supporting surface 23A is imaged 210 again and the post-command location and orientation of the object on the training supporting surface 23A is detected 212. The location and orientation of the object detected at step 212 may be considered as a new location and a new orientation of the object on the training supporting surface 23A. The prior location and the prior orientation detected at step 204, the random action command used at step 208, and the new location and new orientation detected at step 212 are stored 214, for example as an entry in the database 29. For example, the prior location and the prior orientation detected at step 204 may define a start state of the object and are stored under the column “Start State” of the database 29, the random action command used at step 208 is stored under the column “Action Command” in the database 29, and the new location and new orientation detected at step 212 may define the new state of the object and are stored under the column “New State” in the database 29. In some example embodiments, the post-command orientation of the object on the training supporting surface 23A may be detected as ‘indeterminate’ at step 212. In such scenarios, the control of steps may return to step 206, where a random control command is generated again and the actuators are activated at step 208 to apply impulse forces to the object to change its orientation from ‘indeterminate’ to a recognizable orientation.
The steps 202-214 may be repeated continuously and/or sequentially, and several entries may be recorded in the database 29. In this way, an association between a start state (a starting location and starting orientation), a new state (a new location and a new orientation), and the random action command(s) that caused the change from the start state to the new state may be learned, over several instances, in an unsupervised manner as an outcome of the training method 200. In some example embodiments, the steps 202-214 may be carried out with a reference orientation of an object in consideration. For example, only entries for such iterations in which the new orientation of the object is the same as the reference orientation of the object may be recorded in the database, along with the immediately prior location of the object, the immediately prior orientation of the object, and the random action command(s) used. In some example embodiments, the training procedures may be configured according to the needs of the end objective. For example, where the end objective is robotic manipulation of the object, the training procedure may be carried out with graspability of the object by a robot as the deciding factor for registering an entry in the database. In such scenarios, for every new orientation detected at step 212, a robotic manipulator may attempt to grasp the object to determine a graspability of the object in the new orientation. Accordingly, an entry may be recorded in the database 29 to include whether the object is graspable in the new orientation or not.
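A compact sketch of this autonomous data-collection loop (steps 202 through 214, including the handling of ‘indeterminate’ orientations and the optional graspability check) is given below; the hardware-facing callables are supplied by the training assembly 20 and are assumptions of the sketch.

```python
import random

# Compact sketch of the data-collection loop of method 200. The callables
# (detect, fire, try_grasp, log) are supplied by the training assembly.

def collect_transitions(detect, fire, try_grasp, log, vocabulary,
                        n_samples=60_000):
    for _ in range(n_samples):
        start_loc, start_ori = detect()            # steps 202-204
        command = random.choice(vocabulary)        # step 206
        fire(command)                              # step 208
        new_loc, new_ori = detect()                # steps 210-212
        if new_ori == "indeterminate":
            continue  # fire a fresh random command rather than record
        log({                                      # step 214: database entry
            "start_state": (start_loc, start_ori),
            "action_command": command,
            "new_state": (new_loc, new_ori),
            "graspable": try_grasp(),              # optional graspability tag
        })
```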
It may be noted that the same random control command may cause different motions of the object in two instances if the starting location of the object is different in the two instances. For example, the same impulse command to a center actuator with an object to the left of center may move the object further left, whereas if the object is to the right of the center actuator, the same impulse command may move the object further to the right. In this case, the induced rotations of the object may also be opposite. For this reason, it is important to record each entry with the start location along with the start orientation of the object under the “Start State” column in the training database 29. Similarly, during the prediction stage, the imaging system delivers both the current location and the current orientation of the object.
In some preferred implementations, the training database 29 or 29A may be further processed by machine learning methods such as support vector machines (SVM), random forest decision trees (RF), a neural network (NN), principal component analysis (PCA), or other machine learning methods. However, because the impulse manipulation process may be rapid and because in most preferred implementations the image processing procedures generate both position and orientation as well as robot graspability information, the learning process through the method 200 may be executed several times in a single day, fully autonomously and without requiring any human intervention beyond setting the system up and supplying example cases of “graspable” and “not graspable”. For example, the learning process through the method 200 may be executed at rates over once per second, yielding sixty thousand training data samples in a single overnight run.
With such large volumes of data available and such a small requirement in human effort, purely memory-based machine learning methods such as kNN (k-nearest-neighbor) and rN (radius neighborhood) can be used effectively with the example embodiments. Both kNN and rN have only a single tunable parameter (the number of samples “k” and the radius of the neighborhood “r”, respectively), and so relatively little if any human expertise is required. In some example embodiments, by using a LOO (Leave One Out) methodology, the training database and learning algorithm can be tested against themselves, to see whether, given the orientation and position of an object and a database containing all entries except the known-result entry, the algorithm generates the known-result command. One such preferred embodiment may use LOO to optimize the number of samples “k” (for a kNN-based machine learning system) or the radius “r” (for an rN-based machine learning system) by means of generating an ROC curve. By choosing “k” or “r” as the value where the AUC (Area Under Curve) of the ROC curve is maximized (possibly subject to a limit on computational time as well as optimization for accuracy), the choice of “k” or “r” may be made essentially automatically, thereby reducing human intervention even further.
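One way this LOO/ROC-AUC selection of “k” could be scripted is sketched below using scikit-learn; the candidate values of “k” and the binary graspability labeling are assumptions of the sketch rather than prescribed choices.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict, LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score

# Sketch of LOO / ROC-AUC selection of "k" for a kNN model predicting
# whether a (state, command) pair leads to a graspable pose. X encodes
# the (state, command) features; y is a 0/1 graspability label; both
# are assumed to come from the training database.

def choose_k_by_auc(X, y, candidate_ks=(1, 3, 5, 9, 15, 25)):
    best_k, best_auc = None, -1.0
    loo = LeaveOneOut()
    for k in candidate_ks:
        model = KNeighborsClassifier(n_neighbors=k)
        # Each sample is scored by a model fitted on all other samples,
        # i.e., the database is tested "against itself" as described.
        probs = cross_val_predict(model, X, y, cv=loo,
                                  method="predict_proba")[:, 1]
        auc = roc_auc_score(y, probs)
        if auc > best_auc:
            best_k, best_auc = k, auc
    return best_k, best_auc
```

In practice the LOO evaluation may be subsampled to respect the computational-time limit mentioned above.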
By using the ROC AUC optimization, the entire machine learning process becomes completely self-supervised, requiring no human intervention at all beyond the mechanical setup and the human providing designation of one or more orientations and positions for the object.
The training procedures of
Another aspect of some example embodiments of this disclosure is a novel design for an electromechanical subsystem for orienting an object in its desired target orientation which is described next with reference to
The base plate 501 may be a rigid and preferably heavy plate that provides a stable mounting base for the impulse manipulator assembly. In at least some embodiments, the set of vibration dampers 503 are used to isolate the moving objects of the impulse manipulator assembly and minimize noise during operation. There may be one vibration damper for every long spacer tube 505. The set of long spacer tubes 505 provide support for the impulse actuator mounting plate 507. Although
Above the impulse actuator mounting plate 507 is the set of short spacer tubes 511. Although
Above the semi-flexible bowl base panel 513 are a set of retention springs 517. Although
In some example embodiments, the actuator array 509 may comprise a plurality of solenoids that generate vertical motion of the cores 509A upon activation. The vertical motion of the cores 509A causes the cores 509A to penetrate through the bowl base support plate 515 and strike the semi-flexible bowl base panel 513 with an impulse force. The impulse force thus applied depends on the impulse power supplied to the actuator/solenoid. The lower bound on the impulse power may be the minimum energy required to push a solenoid through the gap generated by the short spacer tubes 511 and just touch the semi-flexible bowl base panel 513. The upper bound on the impulse power duration is reached when a solenoid remains in the fully extended position even after the semi-flexible bowl base panel 513 has lifted by inertia away from the solenoid and is no longer in contact with it. The value of the upper bound on the impulse power duration may vary depending on the density, rigidity, and thickness of the semi-flexible bowl base panel 513.
In some other example embodiments, the actuator array 509 may comprise other forms of actuators and may produce the desired impulses by the use of hydraulic or pneumatic cylinders or strikers, piezoelectric transducers, rotary strikers, impingement of a fluid such as pressurized water or compressed air on the semi-flexible bowl base panel 513, or even direct impingement of a fluid such as compressed air, pressurized water, or pressurized oil flowing through plate 503 or the part retention corral 519 containing perforations, nozzles, or areas of penetrable mesh. In some example embodiments, more than one type of impulse transducer may be used in a single embodiment; for example, electromagnetic solenoids may generate upward impulses while pulses of compressed air fired horizontally from nozzles in the part retention corral simultaneously generate horizontal shear forces.
Additionally, in some example embodiments, the impulse manipulator shown in
The unique structure of the electromechanical system 3 realized as an impulse manipulator assembly shown in
The controller 611 controls operations of a robotic manipulator 613 and the electromechanical subsystem 3. In some example embodiments, the controller 611 may be a centralized controller for controlling other operations in the assembly line such as the operations of the conveyors 603A, 603B, the object dispenser 601, one or more imaging systems and the like. In some example embodiments, each component of the assembly line may have a separate controller and the controller 611 may only control the operations of the electromechanical system 3.
For each one of the objects 605, the target orientation may be predefined, dynamically defined, or provided by another program or device. The supporting surface 607 of the electromechanical subsystem 3 that receives the incoming objects 605A may be equipped with suitable sensors (not shown) such as weight sensors for detecting presence or arrival of an object 605B on the supporting surface 607. In some example embodiments, the presence or arrival of the object 605B on the supporting surface 607 may be detected by an imaging system (not shown) comprising an imaging device, illumination source and an image processor similar to the embodiment described with reference to
Upon detection of the object 605B on the supporting surface 607, the controller 611 obtains a target orientation corresponding to the object 605B. In some example embodiments, the target orientation may be user defined and may be provided as an input via an input interface. In some example embodiments, the target orientation may be dynamically determined on the fly. In this regard the controller 611 may invoke an imaging system similar to the one described with reference to
The controller 611 also obtains a current location of the object 605B on the supporting surface 607 and a current orientation of the object 605B on the supporting surface 607 in a manner similar to that described for step 106 of
The controller 703 comprises one or more processors 703A, a memory 703B, and an interface 703C. The one or more processors 703A execute instructions, programs and codes stored in the memory 703B to carry out the functionalities performed by the controller 703. In this regard, the one or more processors may be embodied by any one or more of a myriad of processing circuitry implementations, for example as an FPGA, an ASIC, a microprocessor, a CPU, and/or the like. In some embodiments, the processors may include one or more sub-processors, remote processors (e.g., “cloud” processors) and/or the like, and/or may be in communication with one or more additional processors for performing such functionality. For example, in at least one embodiment, one or more of the processors 703A may be in communication with, and/or operate in conjunction with, another processor external to the controller 703.
The memory 703B may provide storage functionality, for example to store data processed by the controller 703 and/or instructions for providing the functionality described herein. In some embodiments, the processors 703A may be in communication with the memory 703B via a bus for passing information among components of the controller 703. The memory 703B may be non-transitory and may include, for example, one or more volatile and/or non-volatile memories. For example, the memory 703B may be an electronic storage device (for example, a computer readable storage medium) comprising gates configured to store data (for example, bits) that may be retrievable by a machine (for example, a computing device like the processors 703A). The memory 703B may be configured to store information, data, content, applications, instructions, or the like, for enabling the controller to carry out various functions in accordance with the example embodiments of the present invention. For example, the memory could be configured to buffer data for processing by the processors 703A. Additionally, or alternatively, the memory 703B could be configured to store instructions for execution by the processors 703A.
The controller 703 utilizes the interface 703C to communicate with components and devices external to the controller 703. In this regard, the interface 703C may be any suitable input/output interface, a communication interface and the like.
The robotic manipulator 701 comprises a robot controller 701A having a structure similar to the controller 703 and a robot 701B. The robot 701B may be any electromechanical robot that is capable of manipulating objects, parts and components. The robotic system 700 further comprises one or more sensors 705 for sensing and detecting one or more parameters, data or information. For example, the sensors 705 may include image sensors, sound sensors, temperature sensors, gyro sensors, position sensors, weight sensors, electromagnetic sensors and the like. The robotic system 700 also comprises a user interface 707 for receiving inputs, data, and feedback from a user such as an operator of the robotic system 700.
The robotic system 700 also comprises an imaging system 709 comprising one or more imaging devices 709A and an image processor 709B. The imaging devices 709A comprise a plurality of image sensors, for example, a near-field image sensor configured for capturing image data of objects in a near field of view, and optics embodied by one or more lens(es) and/or other optical components configured to enable light to traverse through and interact with the image sensor. The image processor 709B may include one or more processors or a microcontroller hard coded for executing image processing algorithms.
The robotic system 700 also comprises a storage 711 comprising a database 711A of learned associations between object states and control commands and other databases 711B. The database 711A may be similar to database 29 of
The robotic system 700 also comprises an actuator assembly 713 having an impulse controller 713A and an array of actuators 713B. The impulse controller 713A may be similar to the impulse controller 13 of
One or more components 701-713 of the robotic system 700 may communicate with each other through a network 715. The network 715 may be wired, wireless or a combination of both.
The robot arm 811 is controlled using a robot control system 830 such as the robot controller 701A that receives a command or task that may be externally supplied. An example of the command or task could be touching or grasping an object using the gripper 813. The robot control system 830 sends a control signal to the robot arm 811. The control signal may be the torques to be applied at each of the joints of the robot arm 811 and the opening/closing of the gripper 813. The state of the robotic device 800 may be derived using sensors 820. These sensors may include encoders at the joints of the robot arm 811, a camera that can observe the environment of the robotic device 800, and tactile sensors that can be attached to the fingers of the gripper 813. The robot control system 830 may execute a policy to achieve some task or command.
In some example embodiments the sensors 820 may include a camera as the environment sensor 821. The camera may be an RGBD camera, which can supply both an RGB color image and a depth image. The intrinsic calibration information of the RGBD camera can be used to convert the depth image into a 3D point cloud. In another embodiment, the camera can be a stereo camera consisting of two color cameras, from which depth and a 3D point cloud can be computed. In yet another embodiment, the camera can be a single RGB camera, and the 3D point cloud can be estimated directly using machine learning. In another embodiment, there could be more than one camera. Finally, in another embodiment, the camera can be attached at some point on the robot arm 811, the gripper 813, or the base of the gripper 815.
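For illustration, the standard pinhole back-projection that converts a depth image into a 3D point cloud, given the camera intrinsics (fx, fy, cx, cy), is sketched below; the function name is illustrative and the intrinsics are placeholders supplied by the camera's calibration.

```python
import numpy as np

# Sketch of pinhole back-projection: turn a metric depth image into a
# 3D point cloud using the camera intrinsics (fx, fy, cx, cy).

def depth_to_point_cloud(depth_m, fx, fy, cx, cy):
    """depth_m: (H, W) array of metric depths; returns (H*W, 3) points."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```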
In this way, example embodiments provide a universal object orientation assembly that can orient any rigid body in a desired orientation and can be customized according to the needs and properties of the object. Such a universal orientation assembly does not require any mechanical alteration of the bowl or corral. In order to adapt the exemplar impulse manipulator to different objects, only the software for the manipulator needs to be updated, thereby providing a highly flexible system. Furthermore, such an update to the software may be performed in a self-supervised training manner as discussed previously, without human intervention. Additionally, example embodiments described herein provide means to orient objects in a very short interval, thereby enabling application of the embodiments to a wide variety of industrial and robotic tasks.
The above-described embodiments of the present disclosure can be implemented in any of numerous ways. For example, the embodiments may be implemented using hardware, software or a combination thereof. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. Such processors may be implemented as integrated circuits, with one or more processors in an integrated circuit component. Though, a processor may be implemented using circuitry in any suitable format.
Also, the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
Also, the embodiments of the present disclosure may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts concurrently, even though shown as sequential acts in illustrative embodiments. Further, the use of ordinal terms such as “first” and “second” in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another, or the temporal order in which acts of a method are performed, but such terms are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for use of the ordinal term).
Although the present disclosure has been described with reference to certain preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the present disclosure. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the present disclosure.