The present disclosure is generally directed to industrial systems, and more specifically, to machine learning systems involving human-robot collaboration.
In factories, industrial robots are programmed to perform tasks such as welding, assembly, pick and place, and so on. However, there are many challenges associated with industrial robots. For example, if even a small change is required to the manufacturing line, an integrator is often called in to redesign and repurpose the robots to meet the new task specification. Furthermore, these robots are highly inflexible with respect to the robot programming interface, are often difficult to use, and require extensive programming knowledge, which limits the ability of the line worker to easily repurpose the robot.
To overcome these challenges, human-robot collaboration in factories is on the rise, in which the robot needs to learn what the human does. Typically, such robot learning involves explicitly teaching the robot. Example implementations described herein involve a more adaptive and flexible technique in which the robot learns by observing human actions. In existing technologies, the human generally performs the task in a correct sequence for the robot to understand and learn, or wears sensors to provide more accurate readings of the human demonstration. Furthermore, these technologies compare the quality of the product at the end of the task with the robot task execution. However, in a manufacturing line, the quality information may not be available after each task is performed, which thereby requires an estimation of the quality of each task.
In example implementations described herein, there are systems and methods that record the human actions as the human is performing the task, categorize these tasks into subtasks by observing the change points in the human actions, and then estimate the quality of the subtasks based on the final product quality. Furthermore, the subtask sequence order is also determined, which is then sent to multiple robots performing the same task for robot learning.
Aspects of the present disclosure can involve a method, which can involve receiving information associated with a plurality of subtasks, the received information associated with human actions to train an associated robot in an edge system; conducting a quality evaluation on each of the plurality of subtasks; determining one or more subtask sequences from the plurality of subtasks; evaluating each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences; and outputting ones of the one or more subtask sequences to train the associated robot based on the evaluation of the each of the one or more subtask sequences.
Aspects of the present disclosure can involve a computer program, which can involve instructions including receiving information associated with a plurality of subtasks, the received information associated with human actions to train an associated robot in an edge system; conducting a quality evaluation on each of the plurality of subtasks; determining one or more subtask sequences from the plurality of subtasks; evaluating each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences; and outputting ones of the one or more subtask sequences to train the associated robot based on the evaluation of the each of the one or more subtask sequences. The computer program can be stored in a non-transitory computer readable medium and executed by one or more processors.
Aspects of the present disclosure can involve a system, which can involve means for receiving information associated with a plurality of subtasks, the received information associated with human actions to train an associated robot in an edge system; means for conducting a quality evaluation on each of the plurality of subtasks; means for determining one or more subtask sequences from the plurality of subtasks; means for evaluating each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences; and means for outputting ones of the one or more subtask sequences to train the associated robot based on the evaluation of the each of the one or more subtask sequences.
Aspects of the present disclosure can involve an apparatus, which can involve a processor, configured to receive information associated with a plurality of subtasks, the received information associated with human actions to train an associated robot in an edge system; conduct a quality evaluation on each of the plurality of subtasks; determine one or more subtask sequences from the plurality of subtasks; evaluate each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences; and output ones of the one or more subtask sequences to train the associated robot based on the evaluation of the each of the one or more subtask sequences.
Aspects of the present disclosure can involve an apparatus, which can involve one or more computer readable mediums storing instructions, and a processor that executes the instructions stored in the one or more computer readable mediums to perform a process involving receiving information associated with a plurality of subtasks, the received information associated with human actions to train an associated robot in an edge system; conducting a quality evaluation on each of the plurality of subtasks; determining one or more subtask sequences from the plurality of subtasks; evaluating each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences; and outputting ones of the one or more subtask sequences to train the associated robot based on the evaluation of the each of the one or more subtask sequences.
The following detailed description provides details of the figures and example implementations of the present application. Reference numerals and descriptions of redundant elements between figures are omitted for clarity. Terms used throughout the description are provided as examples and are not intended to be limiting. For example, the use of the term “automatic” may involve fully automatic or semi-automatic implementations involving user or administrator control over certain aspects of the implementation, depending on the desired implementation of one of ordinary skill in the art practicing implementations of the present application. Selection can be conducted by a user through a user interface or other input means, or can be implemented through a desired algorithm. Example implementations as described herein can be utilized either singularly or in combination and the functionality of the example implementations can be implemented through any means according to the desired implementations.
In a factory, there is a well-defined task description template (i.e. Work Order) that details a sequence of tasks used to complete a product. A product is manufactured in a work cell. Each work cell has an assigned robot and human worker. Human workers can change over time or may not be present in the work cell with the robot.
In example implementations described herein, the robot downloads the task template for the specific product, observes the human tasks, learns subtasks, and takes product quality information as input.
In example implementations described herein, all robots in a factory feed this information to a central robot knowledge server along with their metadata (e.g., robot identifier (ID), human operator profile, product ID, and so on).
In example implementations, a global machine learning (ML) algorithm determines the correct subtasks for a given task by considering {subtask, quality} pairings over all inputs from all robots and feeds this information back to each robot that is doing the task.
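For illustration, a minimal sketch of such {subtask, quality} aggregation over all inputs from all robots is shown below; the record fields and the mean-quality summary are assumptions for the sake of the example, not details fixed by the disclosure.

```python
# Minimal sketch of aggregating {subtask, quality} pairings across robots.
# Record fields and the mean-quality summary are illustrative assumptions.
from collections import defaultdict
from statistics import mean

reports = [
    {"robot_id": "r1", "task": "assembly", "subtask": "pick", "quality": 1},
    {"robot_id": "r2", "task": "assembly", "subtask": "pick", "quality": 0},
    {"robot_id": "r2", "task": "assembly", "subtask": "place", "quality": 1},
]

pairings = defaultdict(list)
for report in reports:
    pairings[(report["task"], report["subtask"])].append(report["quality"])

# Per-subtask quality estimate considered over all robots doing the task.
subtask_quality = {key: mean(values) for key, values in pairings.items()}
```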
In example implementations, the global algorithm determines an optimal order of subtasks which is then transmitted to each robot performing the task for robot learning.
In example implementations described herein, a core learning system is proposed that is connected to all the edge learning systems, where the edge learning systems gather the video data of the human performing a task, process these actions into subtasks, and send them to the core learning system for subtask evaluation and subtask sequence reconstruction. Thereon, the core learning system sends the updated subtask sequence for a task to the edge learning system for the robot to learn the task efficiently. Even in the case where no human is present at the work cell (i.e., the case in
Although the example implementation described above involves robot vision 2011 or other camera or imaging devices installed in the robot 201 to observe the human worker, other systems can also be utilized to observe the human worker, and the present disclosure is not limited thereto. For example, the human worker can be observed by a separate camera and/or other imaging sensors (e.g., infrared sensors, depth cameras, etc.) that view the area in which the human worker is operating, and so on, in accordance with the desired implementation.
The edge learning system 301 is a system that articulates the task using the task template acquisition module 3011, based on the task template sent by the ERP system 5011; records the human actions from the robot vision using the robot vision module 3012; divides the tasks into subtasks; and generates respective subtask videos using the subtask learning module 3013. These subtask videos are stored in the edge video database (DB) 3015 using the edge video module 3014. The edge video module 3014 saves the current videos generated at the edge and updates the videos sent by the core learning system 401. The updated videos in the edge video module 3014 are then sent to the robot learning module 3016 for the robot to start learning the subtasks in a sequential manner in order to achieve higher accuracy in task completion.
The core learning system 401 involves the subtask evaluation module 4011, which takes the subtask videos from the subtask learning module 3013 and the product quality check system 5012 and uses machine learning algorithms to predict the subtask quality. The estimated subtask quality is then sent to the task reconstruction module 4012, which uses the quality information and the frequency of the correct subtask sequence to evaluate the subtask sequences. The evaluated subtask sequence can be used to train the associated robot via the robot learning module 3016 as follows. The evaluation of the subtask sequence is used to select the subtask sequence. The selected subtask sequence is then sent to the core video module 4013, which requests the respective videos from the edge video module 3014 and stores the subtask videos in the core video database (DB) 4014. The selected subtask sequence and the subtask videos can then be sent to the robot learning module 3016 over 7013.
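For illustration, the edge-to-core payload implied by this dataflow might be structured as below; the field names and types are assumptions rather than identifiers from the disclosure.

```python
# Hedged sketch of the edge-to-core payload implied by the dataflow above;
# all field names and types are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SubtaskRecord:
    subtask_id: str
    feature_vector: List[float]   # extracted from the subtask video clip
    video_ref: str                # key into the edge video DB 3015

@dataclass
class EdgeReport:
    robot_id: str
    task_id: str
    subtasks: List[SubtaskRecord] = field(default_factory=list)

# The edge learning system 301 would send an EdgeReport to the subtask
# evaluation module 4011; the core returns the selected subtask sequence
# together with references to the corresponding videos.
```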
These feature vectors, along with the metadata about the task (such as task ID, work cell ID, worker ID, etc.), are transmitted to the core learning system 401. In the example of
However, in
To generate the quality evaluation/quality check of each of the subtasks, in a first step of the subtask evaluation module 4011, for each subtask STi, fi is used as the sampled feature vector for the respective subtask according to distribution Pi[t]. In a second step, the subtask evaluation module 4011 clusters the feature vectors and applies a suitable threshold to learn a binary function Ψi that represents a quality checker after each subtask STi. In a third step, the subtask evaluation module 4011 sets Ψi(fi)=qci.
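A minimal sketch of these three steps follows, assuming scikit-learn's KMeans for the clustering and a quantile cutoff for the threshold; both are illustrative choices, as the disclosure does not fix the clustering method or thresholding rule.

```python
# Hedged sketch of the per-subtask quality checker (first through third steps).
# KMeans, the cluster count, and the quantile threshold are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def learn_quality_checker(features: np.ndarray, n_clusters: int = 2,
                          threshold: float = 0.5):
    """Cluster sampled subtask feature vectors and derive a binary function Psi_i."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(features)
    # Distance of each sample to its nearest learned cluster center.
    dists = np.min(km.transform(features), axis=1)
    cutoff = np.quantile(dists, threshold)  # assumed thresholding rule

    def psi(f: np.ndarray) -> int:
        """Quality check qc_i: 1 if f falls near a learned cluster, else 0."""
        d = np.min(km.transform(f.reshape(1, -1)), axis=1)[0]
        return int(d <= cutoff)

    return psi

# Usage: psi_i = learn_quality_checker(sampled_features); qc_i = psi_i(f_i)
```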
To evaluate each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences, in a fourth step, the subtask evaluation module 4011 uses these generated qcis (the quality check/evaluation for each subtask) to assess their predictive ability with respect to the final quality check for the task, qcFinal. In a fifth step, the subtask evaluation module 4011 will construct a function that uses qc1, qc2, qc3, . . . , qc(last subtask-1) to predict qcFinal. In a sixth step, the subtask evaluation module 4011 obtains the actual quality check QC for the task from the product quality check system 5012. In a seventh step, the subtask evaluation module 4011 uses a validation dataset and generates a reward based on the prediction of qcFinal compared with the actual quality check QC of the task. In an eighth step, the subtask evaluation module 4011 uses this reward to update Pi[t+1] based on Pi[t] for each i. In a ninth step, the first through eighth steps are reiterated for as many training epochs as necessary. In a tenth step, based on Pi[tfinal], the subtask evaluation module 4011 assigns the qci that are effective quality checks for each subtask STi.
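One plausible reading of the fourth through tenth steps is sketched below; the logistic-regression predictor, the accuracy-based reward, and the multiplicative update of Pi[t+1] are stand-ins for components the disclosure leaves unspecified.

```python
# Hedged sketch of the reward-driven loop (fourth through tenth steps).
# The qcFinal predictor, reward, and P_i update rule are assumptions: a
# logistic regression and an EXP3-style multiplicative update stand in.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def train_quality_checkers(candidate_qcs, qc_actual, P, epochs=20, lr=0.5):
    """candidate_qcs: (n_subtasks, n_candidates, n_samples) binary checks;
    qc_actual: (n_samples,) actual task QC; P: (n_subtasks, n_candidates),
    each row a sampling distribution P_i summing to 1."""
    n_subtasks, n_candidates, n_samples = candidate_qcs.shape
    split = n_samples // 2  # train/validation split (assumed); both QC
                            # classes must appear in the training half
    for _ in range(epochs):
        # First step (reiterated): sample one candidate checker per subtask
        # i according to P_i[t].
        chosen = [rng.choice(n_candidates, p=P[i]) for i in range(n_subtasks)]
        X = np.stack([candidate_qcs[i, c] for i, c in enumerate(chosen)], axis=1)
        # Fifth step: function predicting qcFinal from qc_1..qc_(n-1).
        model = LogisticRegression().fit(X[:split], qc_actual[:split])
        # Seventh step: reward = validation accuracy against the actual QC.
        reward = model.score(X[split:], qc_actual[split:])
        # Eighth step: reward-weighted multiplicative update of P_i[t+1].
        for i, c in enumerate(chosen):
            P[i, c] *= np.exp(lr * reward)
            P[i] /= P[i].sum()
    return P  # tenth step: argmax over each P_i picks the effective check
```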
Similarly, the second sequence was observed 59 times, of which it was correct 46 times, and the third sequence was observed only five times, of which it was correct four times. In such a case, the second sequence will be the suitable sequence for the robot to learn: this sequence has the maximum number of correct observations and occurred the maximum number of times.
Thus, let x be the number of times the sequence was observed out of n total observations, and let y be the number of times the sequence was correct. Then,
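The expression itself is not reproduced in this text. One plausible form of the selection score, assuming it weights the correctness rate y/x by the observation frequency x/n (which reduces to y/n), is sketched below using the counts from the example above.

```python
# One assumed form of the selection score; the disclosure's Equation 1 is
# not reproduced here, so this is a plausible reconstruction, not the
# patented formula: correctness rate y/x weighted by frequency x/n = y/n.
def sequence_score(x: int, y: int, n: int) -> float:
    """x: observations of this sequence, y: correct observations,
    n: total observations across all sequences."""
    return (y / x) * (x / n)

# Counts from the example above (the first sequence's counts are not given).
counts = {"second": (59, 46), "third": (5, 4)}
n_total = sum(x for x, _ in counts.values())
best = max(counts, key=lambda s: sequence_score(*counts[s], n_total))
# best == "second": its score 46/64 outweighs the third sequence's 4/64.
```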
An example of a solution description according to the example implementations is provided herein with respect to
In the first step, the work order is first sent to the respective edge learning system 301, where the human and the robot are both present in the work cell. In the second step, as the work order is received, the robot vision module 3012 will start recording the human performing the task.
In a third step, after the task video is recorded, the subtask learning module 3013 will process this video by looking for any significant changes in the human actions in order to split the video into multiple subtask videos. These subtasks identified by the subtask learning module 3013 are
In the fourth step of the subtask learning module 3013, the subtasks identified in the third step and their respective video clips are then given a unique identifier (ID), and features are extracted from the individual video clips using Convolutional Neural Network (CNN) based methods such as I3D. In example implementations, the CNN based methods can be replaced by other neural network based methods, such as, but not limited to, recurrent neural network (RNN) based methods, segment-based methods, multi-stream networks, and so on depending on the desired implementation, and the present disclosure is not limited to the CNN based methods. In a fifth step, the video clips are then stored in the edge video database (DB) 3015 through the edge video module 3014. In a sixth step, the subtasks and their respective features from the fourth step are then sent to the subtask evaluation module 4011 in the core learning system 401. In a seventh step, the subtask evaluation module 4011 predicts the subtask quality, which is then used to predict the task quality, and then compares it with the actual quality of the task provided by the product quality check system 5012. The steps involved in the subtask evaluation are as follows.
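Returning to the fourth step, a minimal sketch of the clip feature extraction follows, with torchvision's pretrained R3D-18 standing in for I3D; the model choice, input shape, and preprocessing are assumptions rather than requirements of the disclosure.

```python
# Hedged sketch of clip feature extraction (fourth step). torchvision's
# R3D-18 stands in for I3D; input shape and preprocessing are assumptions.
import torch
from torchvision.models.video import r3d_18

model = r3d_18(weights="DEFAULT")
model.fc = torch.nn.Identity()  # drop the classifier head: expose 512-d features
model.eval()

def extract_clip_features(clip: torch.Tensor) -> torch.Tensor:
    """clip: (batch, 3, frames, height, width), e.g. (1, 3, 16, 112, 112)."""
    with torch.no_grad():
        return model(clip)  # -> (batch, 512) feature vectors per subtask clip

features = extract_clip_features(torch.randn(1, 3, 16, 112, 112))
```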
In an eighth step, the subtask evaluation module 4011 will then produce a quality check for each of the subtasks, and the task reconstruction module 4012 then selects the best sequence from the multiple correct subtask sequences using Equation 1. An example of the different subtask sequences is shown in
In a ninth step, after selecting the best subtask sequence for a given task, the core video module 4013 requests the videos from the edge video module and sends the videos to the other edge learning systems as shown in
In a tenth step, the robot is ready to start learning the task using the video clips of the subtask sequence. The video frames are first extracted from the video clips, unique identifiers are given to each of the frames, and the frames are segmented to identify the actions. Using the action frames for a subtask, the trajectory for that subtask is generated. Multiple trajectories are generated for a subtask, and a model is trained that will be used for testing the robot actions in simulation. Thereafter, the learned and tested model is transferred to the real robot for real-time task execution.
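A skeleton of this tenth step's dataflow follows; every helper is a stub standing in for a component (frame extraction, action segmentation, trajectory generation, policy training) whose interface the disclosure does not specify.

```python
# Hedged skeleton of the tenth step's dataflow; every helper is a stub for
# a component the disclosure does not specify.
from typing import List

NUM_TRAJECTORIES = 10  # assumed; the disclosure only says "multiple"

def extract_frames(clip) -> List:          # frames receive unique identifiers
    return list(enumerate(clip))

def segment_actions(frames) -> List:       # isolate the action frames
    return frames

def generate_trajectory(actions) -> List:  # waypoints derived from action frames
    return actions

def train_policy(trajectories):            # model to be tested in simulation
    return trajectories

def learn_subtask_sequence(video_clips) -> List:
    policies = []
    for clip in video_clips:               # one clip per subtask, in sequence order
        frames = extract_frames(clip)
        actions = segment_actions(frames)
        trajs = [generate_trajectory(actions) for _ in range(NUM_TRAJECTORIES)]
        policies.append(train_policy(trajs))
    return policies                        # learned models transfer to the real robot
```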
Example implementations involve a system for training and managing machine learning models in an industrial setting. Specifically, by leveraging the similarity across certain production areas, it is possible to group these areas together to efficiently train models that use human pose data to predict human activities or the specific task(s) the workers are engaged in. Specifically, the example implementations do away with previous methods of independent model construction for each production area and take advantage of the commonality amongst different environments.
Computer device 1605 in computing environment 1600 can include one or more processing units, cores, or processors 1610, memory 1615 (e.g., RAM, ROM, and/or the like), internal storage 1620 (e.g., magnetic, optical, solid state storage, and/or organic), and/or I/O interface 1625, any of which can be coupled on a communication mechanism or bus 1630 for communicating information or embedded in the computer device 1605. I/O interface 1625 is also configured to receive images from cameras or provide images to projectors or displays, depending on the desired implementation.
Computer device 1605 can be communicatively coupled to input/user interface 1635 and output device/interface 1640. Either one or both of input/user interface 1635 and output device/interface 1640 can be a wired or wireless interface and can be detachable. Input/user interface 1635 may include any device, component, sensor, or interface, physical or virtual, that can be used to provide input (e.g., buttons, touch-screen interface, keyboard, a pointing/cursor control, microphone, camera, braille, motion sensor, optical reader, and/or the like). Output device/interface 1640 may include a display, television, monitor, printer, speaker, braille, or the like. In some example implementations, input/user interface 1635 and output device/interface 1640 can be embedded with or physically coupled to the computer device 1605. In other example implementations, other computer devices may function as or provide the functions of input/user interface 1635 and output device/interface 1640 for a computer device 1605.
Examples of computer device 1605 may include, but are not limited to, highly mobile devices (e.g., smartphones, devices in vehicles and other machines, devices carried by humans and animals, and the like), mobile devices (e.g., tablets, notebooks, laptops, personal computers, portable televisions, radios, and the like), and devices not designed for mobility (e.g., desktop computers, other computers, information kiosks, televisions with one or more processors embedded therein and/or coupled thereto, radios, and the like).
Computer device 1605 can be communicatively coupled (e.g., via I/O interface 1625) to external storage 1645 and network 1650 for communicating with any number of networked components, devices, and systems, including one or more computer devices of the same or different configuration. Computer device 1605 or any connected computer device can be functioning as, providing services of, or referred to as a server, client, thin server, general machine, special-purpose machine, or another label.
I/O interface 1625 can include, but is not limited to, wired and/or wireless interfaces using any communication or I/O protocols or standards (e.g., Ethernet, 802.11x, Universal Serial Bus, WiMax, modem, a cellular network protocol, and the like) for communicating information to and/or from at least all the connected components, devices, and network in computing environment 1600. Network 1650 can be any network or combination of networks (e.g., the Internet, local area network, wide area network, a telephonic network, a cellular network, satellite network, and the like).
Computer device 1605 can use and/or communicate using computer-usable or computer-readable media, including transitory media and non-transitory media. Transitory media include transmission media (e.g., metal cables, fiber optics), signals, carrier waves, and the like. Non-transitory media include magnetic media (e.g., disks and tapes), optical media (e.g., CD ROM, digital video disks, Blu-ray disks), solid state media (e.g., RAM, ROM, flash memory, solid-state storage), and other non-volatile storage or memory.
Computer device 1605 can be used to implement techniques, methods, applications, processes, or computer-executable instructions in some example computing environments. Computer-executable instructions can be retrieved from transitory media and stored on and retrieved from non-transitory media. The executable instructions can originate from one or more of any programming, scripting, and machine languages (e.g., C, C++, C#, Java, Visual Basic, Python, Perl, JavaScript, and others).
Processor(s) 1610 can execute under any operating system (OS) (not shown), in a native or virtual environment. One or more applications can be deployed that include logic unit 1660, application programming interface (API) unit 1665, input unit 1670, output unit 1675, and inter-unit communication mechanism 1695 for the different units to communicate with each other, with the OS, and with other applications (not shown). The described units and elements can be varied in design, function, configuration, or implementation and are not limited to the descriptions provided.
In some example implementations, when information or an execution instruction is received by API unit 1665, it may be communicated to one or more other units (e.g., logic unit 1660, input unit 1670, output unit 1675). In some instances, logic unit 1660 may be configured to control the information flow among the units and direct the services provided by API unit 1665, input unit 1670, output unit 1675, in some example implementations described above. For example, the flow of one or more processes or implementations may be controlled by logic unit 1660 alone or in conjunction with API unit 1665. The input unit 1670 may be configured to obtain input for the calculations described in the example implementations, and the output unit 1675 may be configured to provide output based on the calculations described in example implementations.
Processor(s) 1610 can be configured to receive information associated with a plurality of subtasks, the received information associated with human actions to train an associated robot in an edge system; conduct a quality evaluation on each of the plurality of subtasks; determine one or more subtask sequences from the plurality of subtasks; evaluate each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences; and output ones of the one or more subtask sequences to train the associated robot based on the evaluation of the each of the one or more subtask sequences as illustrated in
In example implementations, the information associated with the plurality of subtasks from the edge system associated with a robot can involve video clips, each of the video clips associated with a subtask from the plurality of subtasks; wherein the processor(s) 1610 is configured to output ones of the one or more subtask sequences to train the associated robot based on the evaluation of the each of the one or more subtask sequences by providing ones of the video clips associated with ones of the subtasks associated with the each of the one or more subtask sequences as illustrated in
In example implementations, the robot can involve robot vision configured to record video from which the video clips are generated; wherein a manufacturing system is configured to provide a task involving the plurality of subtasks to the edge system for execution and to provide a quality evaluation of the task for the evaluation of the each of the one or more subtask sequences as illustrated at 3012 and 5012 of
Depending on the desired implementation, the video clips can involve the human actions of the plurality of subtasks as illustrated in
Processor(s) 1610 can be further configured to recognize each of the plurality of subtasks based on change point detection to the human actions as determined from feature extraction, wherein detected change points from the change point detection are utilized to separate the each of the plurality of subtasks by time period as illustrated in
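A minimal sketch of such change-point-based separation follows, using the ruptures library's PELT detector over a per-frame feature trace; the library, cost model, and penalty value are assumptions, not choices fixed by the disclosure.

```python
# Hedged sketch of change point detection over per-frame features; the
# ruptures library, "rbf" cost model, and penalty value are assumptions.
import numpy as np
import ruptures as rpt

def split_into_subtasks(frame_features: np.ndarray, penalty: float = 10.0):
    """frame_features: (n_frames, feature_dim) trace of the human actions."""
    breakpoints = rpt.Pelt(model="rbf").fit(frame_features).predict(pen=penalty)
    # Convert breakpoints into (start, end) frame ranges, one per subtask,
    # so each subtask is separated by time period.
    starts = [0] + breakpoints[:-1]
    return list(zip(starts, breakpoints))

segments = split_into_subtasks(np.random.rand(300, 64))
```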
Depending on the desired implementation, the edge system can be configured to identify the plurality of subtasks and provide the information associated with the plurality of subtasks based on the identification as illustrated at 3013 on
In example implementations, processor(s) 1610 can conduct the evaluating the each of the one or more subtask sequences based on the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences by constructing a function configured to provide a quality evaluation for the each of the one or more subtask sequences from the quality evaluation of the each of the plurality of subtasks associated with the each of the one or more subtask sequences; utilizing a validation set to evaluate the quality evaluation for the each of the one or more subtask sequences; modifying the function based on the evaluation of the quality evaluation for the each of the one or more subtask sequences based on reinforcement learning; iteratively repeating the constructing, utilizing, and modifying to finalize the function; and executing the finalized function to evaluate the each of the one or more subtask sequences as illustrated by the subtask evaluation module 4011,
Processor(s) 1610 can also be configured to train the associated robot with the outputted evaluation, the training the associated robot involving selecting ones of the one or more subtask sequences based on the outputted evaluation and frequency of the each of the one or more subtask sequences; extracting video frames corresponding to each of the selected ones of the one or more subtask sequences; segmenting actions from the extracted video frames; determining trajectory and trajectory parameters for the associated robot from the segmented actions; and executing reinforcement learning on the associated robot based on the trajectory, the trajectory parameters, and the segmented actions to learn the selected ones of the one or more subtask sequences as illustrated by robot learning in
Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations within a computer. These algorithmic descriptions and symbolic representations are the means used by those skilled in the data processing arts to convey the essence of their innovations to others skilled in the art. An algorithm is a series of defined steps leading to a desired end state or result. In example implementations, the steps carried out require physical manipulations of tangible quantities for achieving a tangible result.
Unless specifically stated otherwise, as apparent from the discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” “displaying,” or the like, can include the actions and processes of a computer system or other information processing device that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other information storage, transmission or display devices.
Example implementations may also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may include one or more general-purpose computers selectively activated or reconfigured by one or more computer programs. Such computer programs may be stored in a computer readable medium, such as a computer-readable storage medium or a computer-readable signal medium. A computer-readable storage medium may involve tangible mediums such as, but not limited to optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of tangible or non-transitory media suitable for storing electronic information. A computer readable signal medium may include mediums such as carrier waves. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Computer programs can involve pure software implementations that involve instructions that perform the operations of the desired implementation.
Various general-purpose systems may be used with programs and modules in accordance with the examples herein, or it may prove convenient to construct a more specialized apparatus to perform desired method steps. In addition, the example implementations are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the example implementations as described herein. The instructions of the programming language(s) may be executed by one or more processing devices, e.g., central processing units (CPUs), processors, or controllers.
As is known in the art, the operations described above can be performed by hardware, software, or some combination of software and hardware. Various aspects of the example implementations may be implemented using circuits and logic devices (hardware), while other aspects may be implemented using instructions stored on a machine-readable medium (software), which if executed by a processor, would cause the processor to perform a method to carry out implementations of the present application. Further, some example implementations of the present application may be performed solely in hardware, whereas other example implementations may be performed solely in software. Moreover, the various functions described can be performed in a single unit, or can be spread across a number of components in any number of ways. When performed by software, the methods may be executed by a processor, such as a general-purpose computer, based on instructions stored on a computer-readable medium. If desired, the instructions can be stored on the medium in a compressed and/or encrypted format.
Moreover, other implementations of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the teachings of the present application. Various aspects and/or components of the described example implementations may be used singly or in any combination. It is intended that the specification and example implementations be considered as examples only, with the true scope and spirit of the present application being indicated by the following claims.