The present disclosure relates to the fields of robotics and artificial intelligence (AI). More particularly, the present disclosure relates to computerized robotic systems employing a coupling device for coupling one or more objects to a robotic system, and a locking mechanism for locking the one or more objects with the robotic system. Further, the present disclosure also relates to integration of a marker system with the robotic system, for grasping and interacting with the one or more objects. Furthermore, the present disclosure also relates to integration of electronic libraries of mini-manipulations with transformed robotic instructions for replicating movements, processes, and techniques with real-time electronic adjustments.
Research and development in robotics have been undertaken for decades, but the progress has been mostly in the field of heavy industrial applications such as automobile manufacturing automation or military applications. Simple robotics systems have been designed for the consumer markets, but they have not seen a wide application in the home-consumer robotics space, thus far. With advances in technology, combined with a population with higher incomes, the market may be ripe to create opportunities for technological advances to improve people's lives. Robotics has continued to improve automation technology with enhanced artificial intelligence and emulation of human skills and tasks in many forms in operating a robotic apparatus or a humanoid.
The notion of robots replacing humans in certain areas and executing tasks that humans would typically perform is an ideology in continuous evolution since robots were first developed in the 1970s. Manufacturing sectors have long used robots in teach-playback mode, where the robot is taught, via pendant or offline fixed-trajectory generation and download, which motions to copy continuously and without alteration or deviation. Companies have taken the pre-programmed trajectory-execution of computer-taught trajectories and robot motion-playback into such application domains as mixing drinks, welding or painting cars, and others. However, all of these conventional applications use a 1:1 computer-to-robot or teach-playback principle that is intended to have the robot only execute the motion-commands faithfully, which usually means following a taught/pre-computed trajectory without deviation.
Additionally, in conventional robotic systems, one or more objects or manipulators or end-effectors are coupled directly to the robotic systems. In the conventional systems, the coupling devices often suffer from stability issues, which may render operation of the robotic systems inefficient or inaccurate. Although there have been improvements in the coupling devices to improve stability and accuracy of the coupling, these systems tend to be cumbersome and complex. Also, the configuration of coupling devices in the conventional systems may require the entire coupling device to be replaced or altered as per the configuration of the one or more objects or manipulators or end-effectors to be coupled to the robotic system, which is undesirable. The present disclosure is directed to overcoming one or more of the limitations stated above.
The information disclosed in this background of the disclosure section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgment or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Embodiments of the present disclosure are directed to methods, computer program products, and computer systems of a multi-level robotic system for high speed and high fidelity manipulation operations segmented into, in one embodiment, two physical and logical subsystems made up of instrumented, articulated and controller-actuated subsystems, including a larger and coarser-motion macro-manipulation system responsible for operations in larger unconstrained environment workspaces at a reduced endpoint accuracy, and a smaller and finer-motion micro-manipulation system responsible for operations in a smaller workspace, while interacting with tooling and the environment at a higher endpoint motion accuracy, carrying out mini-manipulation trajectory-following tasks based on mini-manipulation commands provided through a dual-level database specific to the macro- and micro-manipulation subsystems, supported by a dedicated and separate distributed processor and sensor architecture operating under an overall real-time operating system communicating with all subsystems over multiple bus interfaces specific to sensor, command and database-elements.
Systems and methods are provided for operating universal robotic assistant systems. In some embodiments, a method for operating a robotic assistant system comprises: receiving, by one or more processors configured in a robotic assistant system, environment data corresponding to a current environment, from one or more sensors configured in the robotic assistant system; determining, by the one or more processors, a type of the current environment based on the received environment data; detecting, by the one or more processors, one or more objects in the current environment, wherein the one or more objects are associated with the type of the current environment; identifying, by the one or more processors, for each of the one or more objects, one or more interactions based on a type of the one or more objects and the type of the current environment; retrieving, by the one or more processors, interaction data corresponding to the one or more objects from a remote storage associated with the robotic assistant system; and executing, by the one or more processors, the one or more interactions on the corresponding one or more objects, based on the interaction data.
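By way of a non-limiting illustration, the sequence of steps recited above may be sketched in Python as follows. The names `REMOTE_LIBRARY`, `EnvironmentData`, `determine_environment_type`, `detect_objects` and `run_assistant` are hypothetical and do not appear in the disclosure; the in-memory dictionary merely stands in for the remote storage, and the overlap-based choice of environment type is an assumption made for the sketch.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the remote storage:
# environment type -> object type -> interactions available for that object.
REMOTE_LIBRARY = {
    "kitchen": {
        "pot": ["grasp_handle", "pour_contents"],
        "spoon": ["pick_up", "stir"],
    },
    "garage": {
        "wrench": ["pick_up"],
    },
}


@dataclass
class EnvironmentData:
    position: tuple[float, float]  # e.g. from a navigation system
    detected_labels: list[str]     # e.g. labels derived from camera images


def determine_environment_type(data: EnvironmentData) -> str:
    """Assumed rule: pick the candidate whose known objects overlap most with what was seen."""
    def overlap(env: str) -> int:
        return len(set(REMOTE_LIBRARY[env]) & set(data.detected_labels))
    return max(REMOTE_LIBRARY, key=overlap)


def detect_objects(data: EnvironmentData, env_type: str) -> list[str]:
    """Keep only the objects associated with the determined environment type."""
    return [label for label in data.detected_labels if label in REMOTE_LIBRARY[env_type]]


def run_assistant(data: EnvironmentData, execute) -> list:
    """Receive data, determine the environment, detect objects, then execute interactions."""
    env_type = determine_environment_type(data)
    results = []
    for obj in detect_objects(data, env_type):
        for interaction in REMOTE_LIBRARY[env_type][obj]:  # retrieved interaction data
            results.append(execute(obj, interaction))
    return results


data = EnvironmentData(position=(0.0, 0.0), detected_labels=["pot", "window", "spoon"])
log = run_assistant(data, lambda obj, action: f"{action}:{obj}")
```

In this sketch the `execute` callback stands in for whatever the one or more manipulation devices would actually do; it is injected so that the control flow can be exercised without hardware.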
In some embodiments, determining the type of the current environment includes transmitting, by the one or more processors, the environment data to a remote storage associated with the universal robotic assistant systems, wherein the remote storage comprises a library of environment candidates; and receiving, by the one or more processors, in response to the transmitted environment data, the type of the current environment determined based on the environment data, from among the library of environment candidates.
In some embodiments, each of the one or more processors is communicatively connected to a central processor associated with the robotic assistant system.
In some embodiments, the environment data includes position data and image data of the current environment.
In some embodiments, the position data and the image data are obtained from the one or more sensors, wherein the one or more sensors comprises at least one of a navigation system and one or more image capturing devices.
In some embodiments, detecting the one or more objects is based on at least one of the type of the current environment, the environment data corresponding to the current environment, and object data.
In some embodiments, the one or more objects are detected from a plurality of objects associated with the type of the current environment, wherein the plurality of objects are retrieved from a remote storage.
In some embodiments, the object data is collected by the one or more sensors comprising one or more cameras.
In some embodiments, detecting the one or more objects and the type of the one or more objects further comprises analyzing features of the one or more objects, wherein the features comprise at least one of shape, size, texture, color, state, material and pose of the one or more objects.
In some embodiments, analyzing the features of the one or more objects includes detecting one or more markers disposed on each of the one or more objects.
In some embodiments, the one or more interactions identified for each of the one or more objects based on the type of objects and the type of the current environment indicates the one or more interactions to be performed by the respective object or on the respective object within the current environment.
In some embodiments, the interaction data of each of the one or more interactions comprises a sequence of motions to be performed by or on the one or more objects and one or more optimal standard positions of one or more manipulation devices, configured to interact with the one or more objects, relative to the corresponding one or more objects.
In some embodiments, executing at least one of the one or more interactions on the corresponding one or more objects includes, for each of the one or more interactions: positioning, by the one or more processors, one or more manipulation devices within a proximity of the corresponding one or more objects; identifying, by the one or more processors, an optimal standard position of the one or more manipulation devices relative to the corresponding one or more objects, wherein the optimal standard position is selected from one or more standard positions of the one or more manipulation devices; positioning, by the one or more processors, the one or more manipulation devices at the identified optimal standard position using one or more positioning techniques; and executing, by the one or more processors, using the one or more manipulation devices, the one or more interactions on the corresponding one or more objects.
In some embodiments, the one or more positioning techniques includes at least one of object template matching technique and marker-based technique, wherein the object template matching technique is used for standard objects and the marker-based technique is used for standard and non-standard objects.
In some embodiments, positioning one or more manipulation devices at an optimal standard position using the object template matching technique includes: retrieving, by the one or more processors, an object template of a target object from a remote storage associated with the universal robotic assistant system, wherein the target object is an object currently being subjected to one or more interactions, wherein the object template comprises at least one of shape, color, surface and material characteristics of the target object; positioning, by the one or more processors, the one or more manipulation devices to a first position proximal to the target object; receiving, by the one or more processors, one or more images, in real-time, of the target object from at least one image capturing device associated with the one or more manipulation devices, wherein the one or more images are captured by at least one image capturing device when the one or more manipulation devices are at the first position; comparing, by the one or more processors, the object template of the target object with the one or more images of the target object; and performing, by the one or more processors, at least one of: adjusting position of the one or more manipulation devices towards the optimal standard position based on position of the one or more manipulation devices in previous iteration and reiterating the steps of receiving and comparing, when the comparison results in mismatch; or inferring that the one or more manipulation devices reached the optimal standard position when the comparison results in a match.
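By way of a non-limiting illustration, the iterate-compare-adjust loop of the object template matching technique may be sketched as follows. The functions `mismatch` and `servo_to_template`, the tolerance, and the toy `capture` and `propose_step` callbacks are all hypothetical; in particular, representing an image as a short tuple of numbers is an assumption made only to keep the sketch self-contained.

```python
from typing import Callable, Sequence


def mismatch(template: Sequence[float], image: Sequence[float]) -> float:
    """Sum of absolute differences between template and image descriptors."""
    return sum(abs(t - i) for t, i in zip(template, image))


def servo_to_template(
    start: tuple,
    template: Sequence[float],
    capture: Callable[[tuple], Sequence[float]],
    propose_step: Callable[[tuple, Sequence[float], Sequence[float]], tuple],
    tolerance: float = 1e-3,
    max_iterations: int = 100,
):
    """Repeat receive -> compare -> adjust until the image matches the template.

    Returns (position, matched). The position in each iteration is derived
    from the position in the previous iteration.
    """
    position = start
    for _ in range(max_iterations):
        image = capture(position)                  # receive a real-time image
        if mismatch(template, image) <= tolerance:  # compare with the template
            return position, True                  # match: optimal standard position reached
        position = propose_step(position, template, image)  # mismatch: adjust
    return position, False


# Toy setup (assumption): the "image" descriptor is simply the camera position,
# so the template is matched exactly when the device sits at (0.5, 0.2).
template = (0.5, 0.2)
capture = lambda pos: pos
propose_step = lambda pos, tpl, img: tuple(p + 0.5 * (t - i) for p, t, i in zip(pos, tpl, img))

final_position, matched = servo_to_template((0.0, 0.0), template, capture, propose_step)
```

The iteration cap is a design choice of the sketch, not of the disclosure; it simply prevents an endless loop when no match can be reached.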
In some embodiments, positioning one or more manipulation devices at an optimal standard position using the marker-based technique includes: detecting one or more markers associated with a target object; and adjusting position of the one or more manipulation devices towards the optimal standard position based on the detected one or more markers associated with the target object, wherein the position is adjusted using a real-time image of the target object received from at least one image capturing device associated with the one or more manipulation devices.
In some embodiments, the one or more markers includes at least one of: a physical marker disposed on the target object; and a virtual marker corresponding to one or more points on the target object, wherein the one or more markers enable computation of position parameters comprising distance, orientation, angle, and slope, of the one or more manipulation devices with respect to the target object.
In some embodiments, the one or more markers associated with the target object are physical markers when the target object is a standard object and the one or more markers associated with the target object are virtual markers when the target object is a non-standard object.
In some embodiments, the one or more markers include the physical marker disposed on the target object, wherein the physical marker is a triangle-shaped marker, and wherein adjusting position of the one or more manipulation devices includes: moving, by the one or more processors, the one or more manipulation devices towards the triangle-shaped marker until at least one side of the triangle-shaped marker has a preferred length; rotating, by the one or more processors, the one or more manipulation devices until a bottom vertex of the triangle-shaped marker is disposed in a bottom position of the real-time image of the target object; shifting, by the one or more processors, the one or more manipulation devices along an X and/or Y axis of the real-time image of the target object until a center of the triangle-shaped marker is in a center position of the real-time image of the target object; and adjusting, by the one or more processors, a slope of the one or more manipulation devices until at least one of two conditions is met: each angle of the triangle-shaped marker is approximately equal to 60 degrees, or the difference between the angles is within a predetermined maximum difference that is smaller than their difference prior to initiating the adjustment of the position of the one or more manipulation devices, wherein meeting at least one of the two conditions indicates that the one or more manipulation devices reached the optimal standard position.
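By way of a non-limiting illustration, the four geometric conditions recited above (side length, bottom vertex, centering, and angles near 60 degrees) may be evaluated from the image coordinates of the three marker vertices as sketched below. The function names, the tolerances, and the convention that vertex 0 is the designated bottom vertex and that the image Y axis points downward are assumptions of the sketch.

```python
import math


def side_lengths(tri):
    """Lengths of the three sides, in pixels."""
    a, b, c = tri
    return [math.dist(a, b), math.dist(b, c), math.dist(c, a)]


def interior_angles(tri):
    """Interior angle at each vertex, in degrees."""
    angles = []
    for k in range(3):
        p, q, r = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_a)))))
    return angles


def centroid(tri):
    return (sum(p[0] for p in tri) / 3, sum(p[1] for p in tri) / 3)


def alignment_checks(tri, image_size, preferred_side,
                     length_tol=2.0, center_tol=2.0, angle_tol=1.0):
    """Report which of the four positioning conditions currently hold."""
    width, height = image_size
    cx, cy = centroid(tri)
    return {
        # move towards the marker until a side has the preferred length
        "distance_ok": any(abs(s - preferred_side) <= length_tol for s in side_lengths(tri)),
        # rotate until the designated vertex (index 0) is lowest in the image
        "rotation_ok": tri[0][1] == max(p[1] for p in tri),
        # shift along X and/or Y until the marker center is at the image center
        "centered": abs(cx - width / 2) <= center_tol and abs(cy - height / 2) <= center_tol,
        # adjust the slope until each angle is approximately 60 degrees
        "slope_ok": all(abs(a - 60.0) <= angle_tol for a in interior_angles(tri)),
    }


# An equilateral marker of side 100 px, centered in a 640x480 image, bottom vertex first.
R = 100 / math.sqrt(3)  # circumradius of an equilateral triangle of side 100
aligned = [(320.0, 240.0 + R), (370.0, 240.0 - R / 2), (270.0, 240.0 - R / 2)]
skewed = [(320.0, 300.0), (400.0, 200.0), (270.0, 210.0)]
aligned_report = alignment_checks(aligned, (640, 480), preferred_side=100.0)
skewed_report = alignment_checks(skewed, (640, 480), preferred_side=100.0)
```

A controller could loop over these checks and issue the corresponding move, rotate, shift, or slope correction for whichever condition is not yet satisfied.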
In some embodiments, the one or more markers include the physical marker disposed on the target object, wherein the physical marker is a chessboard-shaped marker, and wherein adjusting position of the one or more manipulation devices includes: calibrating, by the one or more processors, each image capturing device associated with the one or more manipulation devices using the chessboard-shaped marker, wherein the calibration comprises estimating at least one of focal length, principal point and distortion coefficients of each image capturing device with respect to the chessboard-shaped marker; identifying, by the one or more processors, in real-time, images of the target object and image co-ordinates of corners of square slots in the chessboard-shaped marker; assigning, by the one or more processors, real-world co-ordinates to each internal corner among the corners of the square slots in the real-time image based on the image co-ordinates; and determining, by the one or more processors, position of the one or more manipulation devices based on the calibration, the image co-ordinates and the real-world co-ordinates with respect to the chessboard-shaped marker, wherein the steps of calibrating, identifying, assigning and determining are repeated until the position of the one or more manipulation devices is equal to the optimal standard position.
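By way of a non-limiting illustration, the step of assigning real-world co-ordinates to the internal corners, together with a simple pinhole-camera distance estimate, may be sketched as follows. The function names, the 25 mm square size, and the 800-pixel focal length are assumptions of the sketch; the distance formula holds only for a board that faces the camera squarely and ignores lens distortion. In practice, a computer-vision library such as OpenCV (for example its `findChessboardCorners` and `calibrateCamera` functions) would typically perform corner detection and calibration.

```python
def internal_corner_world_coords(rows: int, cols: int, square_size: float):
    """World co-ordinates of a rows x cols grid of internal corners.

    The board plane is taken as Z = 0 and the first internal corner as the origin.
    """
    return [(c * square_size, r * square_size, 0.0)
            for r in range(rows) for c in range(cols)]


def estimate_distance(focal_length_px: float, square_size: float, square_size_px: float) -> float:
    """Pinhole model: a square of side S at depth Z appears f * S / Z pixels wide."""
    return focal_length_px * square_size / square_size_px


# A board with 6 x 9 internal corners and 25 mm squares (assumed values).
world_points = internal_corner_world_coords(rows=6, cols=9, square_size=0.025)

# With an assumed focal length of 800 px, a square that appears 40 px wide
# is estimated to be 0.5 m from the camera.
distance_m = estimate_distance(focal_length_px=800.0, square_size=0.025, square_size_px=40.0)
```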
In some embodiments, the virtual markers are placed on the target object using at least one of shape analysis technique, particle filtering technique and Convolutional Neural Network (CNN) technique.
In some embodiments, placing the virtual markers using shape analysis technique includes: receiving, by the one or more processors, real-time images of a target object from at least one image capturing device associated with one or more manipulating devices; determining, by the one or more processors, shape of the target object and longest and shortest sides of the target object, wherein the sides of the target object are determined as longest and shortest with reference to length of each side of the target object; determining, by the one or more processors, geometric centre of the target object based on the shape of the target object and the longest and the shortest sides of the target object; projecting, by the one or more processors, an equilateral triangle on the target object, wherein each side of the equilateral triangle is equal to half of the shortest side of the target object, the equilateral triangle is oriented along the longest side of the target object, and the geometric centre of the equilateral triangle coincides with the geometric centre of the target object; and placing, by the one or more processors, the virtual markers at each vertex of the equilateral triangle.
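By way of a non-limiting illustration, the projection of the equilateral triangle may be sketched as follows for a target object whose outline is available as a polygon. The function name `virtual_markers` is hypothetical. Two simplifying assumptions are made: the geometric centre is approximated by the mean of the polygon vertices, and "oriented along the longest side" is taken to mean that one triangle vertex points in the direction of the longest side.

```python
import math


def virtual_markers(polygon):
    """Return three marker positions at the vertices of the projected equilateral triangle."""
    n = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    lengths = [math.dist(p, q) for p, q in edges]

    shortest = min(lengths)
    p, q = edges[lengths.index(max(lengths))]           # longest side
    heading = math.atan2(q[1] - p[1], q[0] - p[0])       # its direction

    # Geometric centre, approximated here by the mean of the vertices.
    cx = sum(x for x, _ in polygon) / n
    cy = sum(y for _, y in polygon) / n

    side = shortest / 2            # each triangle side is half the shortest side
    radius = side / math.sqrt(3)   # circumradius of an equilateral triangle

    return [
        (cx + radius * math.cos(heading + k * 2 * math.pi / 3),
         cy + radius * math.sin(heading + k * 2 * math.pi / 3))
        for k in range(3)
    ]


# A 4 x 2 rectangular object: the shortest side is 2, so the triangle side is 1.
markers = virtual_markers([(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)])
```

Because the three vertices are placed on a circle around the object centre at 120-degree intervals, the centre of the triangle coincides with the object centre by construction.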
In some embodiments, placing the virtual markers using particle filtering technique includes: retrieving, by the one or more processors, one or more ideal values corresponding to ideal positions of a target object from a remote storage associated with the universal robotic assistant systems; receiving, by the one or more processors, real-time images of the target object from at least one image capturing device associated with one or more manipulating devices; generating, by the one or more processors, special points within boundaries of the target object using the real-time images; determining, by the one or more processors, an estimated value for combination of visual features in neighborhood of each special point, wherein the visual features comprises at least one of histograms of gradients, spatial color distributions and texture features; comparing, by the one or more processors, each estimated value with each of the one or more ideal values to identify respective proximal match; and placing, by the one or more processors, the virtual markers at each position on the target object corresponding to each proximal match.
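By way of a non-limiting illustration, the comparing and placing steps recited above may be sketched as a nearest-match search over feature descriptors. The function name `place_virtual_markers` and the two-element descriptors are hypothetical; the sketch covers only the matching of estimated values against ideal values and is not a complete particle filter, and the use of Euclidean distance as the measure of proximity is an assumption.

```python
import math


def place_virtual_markers(candidates, ideal_values):
    """For each ideal value, return the special point whose estimated value is closest.

    candidates: list of (point, descriptor) pairs, one per special point.
    ideal_values: list of descriptors retrieved from storage.
    """
    markers = []
    for ideal in ideal_values:
        point, _ = min(candidates, key=lambda cand: math.dist(cand[1], ideal))
        markers.append(point)
    return markers


# Special points with assumed descriptors (e.g. summarising gradients, colour, texture).
candidates = [
    ((10, 10), (0.1, 0.9)),
    ((50, 20), (0.8, 0.2)),
    ((30, 40), (0.5, 0.5)),
]
ideal_values = [(0.75, 0.25), (0.0, 1.0)]
marker_positions = place_virtual_markers(candidates, ideal_values)
```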
In some embodiments, placing the virtual markers using the CNN technique includes: downloading, by the one or more processors, a CNN model corresponding to a target object from libraries stored in a remote storage associated with the universal robotic assistant systems; and detecting positions on the target object for placing the virtual markers based on the CNN model.
Hereafter, various embodiments of the present disclosure are explained in terms of a kitchen environment. However, this should not be construed as a limitation of the present disclosure, as the present disclosure may be applicable to any environment other than a kitchen environment.
Embodiments of the present disclosure are directed to methods, computer program products, and computer systems of a robotic apparatus with robotic instructions replicating a food dish with substantially the same result as if a chef had prepared the food dish. In a first embodiment, the robotic assistant system in a standardized robotic kitchen comprises two robotic arms and hands that replicate the precise movements of the chef in same sequence (or substantially the same sequence). The two robotic arms and hands replicate the movements in the same timing (or substantially the same timing) to prepare the food dish based on a previously recorded document (a recipe-script) of the chef's precise movements in preparing the same food dish. In a second embodiment, a computer-controlled cooking apparatus prepares a food dish based on a sensory-curve, such as temperature over time, which was previously recorded in a software file where the chef prepared the same food dish with the cooking apparatus with sensors for which a computer recorded the sensor values over time when the chef previously prepared the food dish on the cooking apparatus fitted with the sensors. In a third embodiment, the kitchen apparatus comprises the robotic arms in the first embodiment and the cooking apparatus with sensors in the second embodiment to prepare a dish that combines both the robotic arms and one or more sensory curves, where the robotic arms are capable of quality-checking a food dish during the cooking process, for such characteristics as taste, smell, and appearance, allowing for any cooking adjustments to the preparation steps of the food dish. In a fourth embodiment, the kitchen apparatus comprises a food storage system with computer-controlled containers and container identifiers for storing and supplying ingredients for a user to prepare the food dish by following the chef's cooking instructions. 
In a fifth embodiment, a robotic kitchen comprises a robotic assistant system with arms and a kitchen apparatus in which the robotic assistant system moves around the kitchen apparatus to prepare a food dish by emulating a chef's precise cooking movements, including possible real-time modifications/adaptations to the preparation process defined in the recipe-script.
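By way of a non-limiting illustration, playback of a previously recorded sensory curve, such as the temperature over time described for the second embodiment, may be sketched as a lookup with linear interpolation between recorded samples. The function name `setpoint_at`, the sample values, and the choice of linear interpolation are assumptions of the sketch.

```python
from bisect import bisect_right


def setpoint_at(curve, t):
    """Interpolate the recorded value at time t.

    curve: list of (time_s, value) samples sorted by time.
    Times before the first or after the last sample are clamped to the end values.
    """
    times = [sample[0] for sample in curve]
    if t <= times[0]:
        return curve[0][1]
    if t >= times[-1]:
        return curve[-1][1]
    i = bisect_right(times, t)  # index of the first sample strictly after t
    (t0, v0), (t1, v1) = curve[i - 1], curve[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Assumed recording: heat from 20 to 100 degrees C in 60 s, hold, then cool to 60 degrees C.
recorded_curve = [(0, 20.0), (60, 100.0), (180, 100.0), (240, 60.0)]
target_at_30s = setpoint_at(recorded_curve, 30)    # halfway up the heating ramp
target_at_210s = setpoint_at(recorded_curve, 210)  # halfway down the cooling ramp
```

A cooking apparatus could query such a function at each control tick and drive its heating element towards the returned value.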
A robotic cooking engine comprises detection, recording, and chef emulation cooking movements, controlling significant parameters, such as temperature and time, and processing the execution with designated appliances, equipment, and tools, thereby reproducing a gourmet dish that tastes identical to the same dish prepared by a chef and served at a specific and convenient time. In one embodiment, a robotic cooking engine provides robotic arms for replicating a chef's identical movements with the same ingredients and techniques to produce an identical tasting dish.
The underlying motivation of the present disclosure centers around humans being monitored with sensors during their natural execution of an activity, and then being able to use monitoring-sensors, capturing-sensors, computers, and software to generate information and commands to replicate the human activity using one or more robotic and/or automated systems. While one can conceive of multiple such activities (e.g. cooking, painting, playing an instrument, etc.), one aspect of the present disclosure is directed to the cooking of a meal: in essence, a robotic meal preparation application. Monitoring a human chef is carried out in an instrumented application-specific setting (a standardized kitchen in this case), and involves using sensors and computers to watch, monitor, record, and interpret the motions and actions of the human chef, in order to develop a robot-executable set of commands that is robust to variations and changes in the environment and that allows a robotic or automated system in a robotic kitchen to prepare the same dish to the standards and quality of the dish prepared by the human chef.
The use of multimodal sensing systems is the means by which the necessary raw data is collected. Sensors capable of collecting and providing such data include environment and geometrical sensors, such as two-dimensional (cameras, etc.) and three-dimensional (lasers, sonar, etc.) sensors, as well as human motion-capture systems (human-worn camera-targets, instrumented suits/exoskeletons, instrumented gloves, etc.), as well as instrumented (sensors) and powered (actuators) equipment used during recipe creation and execution (instrumented appliances, cooking-equipment, tools, ingredient dispensers, etc.). All this data is collected by one or more distributed/central computers and processed by various software processes. The processors of the distributed/central computers will process and abstract the data to the point that a human and a computer-controlled robotic kitchen can understand the activities, tasks, actions, equipment, ingredients and methods, and processes used by the human, including replication of key skills of a particular chef. The raw data is processed by one or more software abstraction engines to create a recipe-script that is both human-readable and, through further processing, machine-understandable and machine-executable, spelling out all actions and motions for all steps of a particular recipe that a robotic kitchen would have to execute. These commands range in complexity from controlling individual joints, to a particular joint-motion profile over time, to abstraction levels of commands, with lower-level motion-execution commands embedded therein, associated with specific steps in a recipe. Abstract motion-commands (e.g. “crack an egg into the pan”, “sear to a golden color on both sides”, etc.) can be generated from the raw data, refined, and optimized through a multitude of iterative learning processes, carried out live and/or off-line, allowing the robotic kitchen systems to successfully deal with measurement-uncertainties, ingredient variations, etc., enabling complex (adaptive) minimanipulation motions using fingered-hands mounted to robot-arms and wrists, based on fairly abstract, high-level commands (e.g. “grab the pot by the handle”, “pour out the contents”, “grab the spoon off the countertop and stir the soup”, etc.).
The ability to create machine-executable command sequences, now contained within digital files capable of being shared/transmitted, allowing any robotic kitchen to execute them, opens up the option to execute the dish-preparation steps anywhere at any time. Hence, it allows the option to buy/sell recipes online, allowing users to access and distribute recipes on a per-use or subscription basis.
The replication of a dish prepared by a human is performed by a robotic kitchen, which is in essence a standardized replica of the instrumented kitchen used by the human chef during the creation of the dish, except that the human's actions are now carried out by a set of robotic arms and hands, computer-monitored and computer-controllable appliances, equipment, tools, dispensers, etc. The degree of dish-replication fidelity will thus be closely tied to the degree to which the robotic kitchen is a replica of the kitchen (and all its elements and ingredients), in which the human chef was observed while preparing the dish.
Broadly stated, a humanoid having a robot computer controller operated by robot operating system (ROS) with robotic instructions comprises a database having a plurality of electronic minimanipulation libraries, each electronic minimanipulation library including a plurality of minimanipulation elements. The plurality of electronic minimanipulation libraries can be combined to create one or more machine executable application-specific instruction sets, and the plurality of minimanipulation elements within an electronic minimanipulation library can be combined to create one or more machine executable application-specific instruction sets; a robotic structure having an upper body and a lower body connected to a head through an articulated neck, the upper body including torso, shoulder, arms, and hands; and a control system, communicatively coupled to the database, a sensory system, a sensor data interpretation system, a motion planner, and actuators and associated controllers, the control system executing application-specific instruction sets to operate the robotic structure.
In addition, embodiments of the present disclosure are directed to methods, computer program products, and computer systems of a robotic apparatus for executing robotic instructions from one or more libraries of minimanipulations. Two types of parameters, elemental parameters and application parameters, affect the operations of minimanipulations. During the creation phase of a minimanipulation, the elemental parameters provide the variables that test the various combinations, permutations, and the degrees of freedom to produce successful minimanipulations. During the execution phase of minimanipulations, application parameters are programmable or can be customized to tailor one or more libraries of minimanipulations to a particular application, such as food preparation, making sushi, playing piano, painting, picking up a book, and other types of applications.
Minimanipulations comprise a new way of creating a general programmable-by-example platform for humanoid robots. The state of the art largely requires explicit development of control software by expert programmers for each and every step of a robotic action or action sequence. The exceptions are very repetitive low-level tasks, such as factory assembly, where the rudiments of learning-by-imitation are present. A minimanipulation library provides a large suite of higher-level sensing-and-execution sequences that are common building blocks for complex tasks, such as cooking, taking care of the infirm, or other tasks performed by the next generation of humanoid robots. More specifically, unlike the previous art, the present disclosure provides the following distinctive features. First, a potentially very large library of pre-defined/pre-learned sensing-and-action sequences, called minimanipulations, is provided. Second, each minimanipulation encodes the preconditions required for the sensing-and-action sequences to successfully produce the desired functional results (i.e. the post conditions) with a well-defined probability of success (e.g. 100% or 97% depending on the complexity and difficulty of the minimanipulation). Third, each minimanipulation references a set of variables whose values may be set a-priori or via sensing operations, before executing the minimanipulation actions. Fourth, each minimanipulation changes the value of a set of variables to represent the functional result (the post conditions) of executing the action sequence in the minimanipulation. Fifth, minimanipulations may be acquired by repeated observation of a human tutor (e.g. an expert chef) to determine the sensing-and-action sequence, and to determine the range of acceptable values for the variables. Sixth, minimanipulations may be composed into larger units to perform end-to-end tasks, such as preparing a meal, or cleaning up a room.
These larger units are multi-stage applications of minimanipulations either in a strict sequence, in parallel, or respecting a partial order wherein some steps must occur before others, but not in a totally ordered sequence (e.g. to prepare a given dish, three ingredients need to be combined in exact amounts into a mixing bowl, and then mixed; the order of putting each ingredient into the bowl is not constrained, but all must be placed before mixing). Seventh, the assembly of minimanipulations into end-to-end tasks is performed by robotic planning, taking into account the preconditions and post conditions of the component minimanipulations. Eighth, case-based reasoning, wherein observation of humans performing end-to-end tasks, or other robots doing so, or the same robot's past experience, can be used to acquire a library of reusable robotic plans from cases (specific instances of performing an end-to-end task), both successful ones to replicate, and unsuccessful ones to learn what to avoid.
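By way of a non-limiting illustration, the assembly of minimanipulations into a partially ordered plan from their preconditions and post conditions may be sketched as follows, using the mixing-bowl example given above. The class `MiniManipulation`, the function `plan`, and the fact names are hypothetical; the sketch uses the topological sorter of the Python standard library and assumes that each fact is produced by at most one minimanipulation.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class MiniManipulation:
    name: str
    preconditions: frozenset   # facts that must hold before execution
    postconditions: frozenset  # facts that hold after successful execution


def plan(minimanipulations, initial_state):
    """Order steps so that every precondition holds initially or is produced earlier."""
    producers = {fact: mm.name for mm in minimanipulations for fact in mm.postconditions}
    graph = {}
    for mm in minimanipulations:
        depends_on = set()
        for fact in mm.preconditions:
            if fact in initial_state:
                continue
            if fact not in producers:
                raise ValueError(f"no step establishes precondition {fact!r}")
            depends_on.add(producers[fact])
        graph[mm.name] = depends_on
    # static_order yields each step only after all the steps it depends on.
    return list(TopologicalSorter(graph).static_order())


steps = [
    MiniManipulation("mix", frozenset({"flour_in_bowl", "eggs_in_bowl", "milk_in_bowl"}),
                     frozenset({"batter_ready"})),
    MiniManipulation("add_flour", frozenset({"bowl_on_counter"}), frozenset({"flour_in_bowl"})),
    MiniManipulation("add_eggs", frozenset({"bowl_on_counter"}), frozenset({"eggs_in_bowl"})),
    MiniManipulation("add_milk", frozenset({"bowl_on_counter"}), frozenset({"milk_in_bowl"})),
]
order = plan(steps, initial_state={"bowl_on_counter"})
```

The three ingredient steps may appear in any relative order in the result, while the mixing step always comes last, which mirrors the partial order described in the example.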
In a first aspect of the present disclosure, the robotic apparatus performs a task by replicating a human-skill operation, such as food preparation, playing piano, or painting, by accessing one or more libraries of minimanipulations. The replication process of the robotic apparatus emulates the transfer of a human's intelligence or skill set through a pair of hands, such as how a chef uses a pair of hands to prepare a particular dish; or a piano maestro playing a master piano piece through his or her pair of hands (and perhaps through the feet and body motions, as well). In a second aspect of the present disclosure, the robotic apparatus comprises a humanoid for home applications, where the humanoid is designed to be a programmable or customizable robot offering psychological, emotional, and/or functional comfort, thereby providing pleasure to the user. In a third aspect of the present disclosure, one or more minimanipulation libraries are created and executed as, first, one or more general minimanipulation libraries, and second, as one or more application-specific minimanipulation libraries. One or more general minimanipulation libraries are created based on the elemental parameters and the degrees of freedom of a humanoid or a robotic apparatus. The humanoid or the robotic apparatus is programmable, so that the one or more general minimanipulation libraries can be programmed or customized to become one or more application-specific minimanipulation libraries specifically tailored to the user's request within the operational capabilities of the humanoid or the robotic apparatus.
Some embodiments of the present disclosure are directed to the technical features relating to the ability of being able to create complex robotic humanoid movements, actions and interactions with tools and the environment by automatically building movements for the humanoid, actions, and behaviors of the humanoid based on a set of computer-encoded robotic movement and action primitives. The primitives are defined by motion/actions of articulated degrees of freedom that range in complexity from simple to complex, and which can be combined in any form in serial/parallel fashion. These motion-primitives are termed to be Minimanipulations (MMs) and each MM has a clear time-indexed command input-structure, and output behavior-/performance-profile that are intended to achieve a certain function. MMs can range from the simple (‘index a single finger joint by 1 degree’) to the more involved (such as ‘grab the utensil’) to the even more complex (‘fetch the knife and cut the bread’) to the fairly abstract (‘play the 1st bar of Schubert's piano concerto #1’).
Thus, MMs are software-based and represented by input and output data sets and inherent processing algorithms and performance descriptors, akin to individual programs with input/output data files and subroutines, contained within individual run-time source-code, which, when compiled, generates object-code that can be collected within various software libraries, termed Minimanipulation-Libraries (MMLs). MMLs can be grouped into multiple groupings, whether these be associated with (i) particular hardware elements (finger/hand, wrist, arm, torso, foot, legs, etc.), (ii) behavioral elements (contacting, grasping, handling, etc.), or even (iii) application-domains (cooking, painting, playing a musical instrument, etc.). Furthermore, within each of these groupings, MMLs can be arranged based on multiple levels (simple to complex) relating to the complexity of behavior desired.
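One way to picture the grouping described above is the following minimal sketch. It is an illustration under stated assumptions, not the disclosed data format: the class names `Command`, `Minimanipulation`, and `MinimanipulationLibrary`, their fields, and the example entries are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Command:
    """One time-indexed command for a single joint (hypothetical layout)."""
    t: float          # time index in seconds
    joint: str        # joint name
    position: float   # commanded position
    velocity: float = 0.0
    torque: float = 0.0

@dataclass
class Minimanipulation:
    name: str
    level: int        # 0 = simplest, larger = more complex
    hardware: str     # e.g. "hand"
    behavior: str     # e.g. "grasping"
    domain: str       # e.g. "cooking"
    commands: list = field(default_factory=list)

class MinimanipulationLibrary:
    """Indexes MMs by hardware element, behavioral element, and domain."""
    def __init__(self):
        self._index = defaultdict(list)

    def add(self, mm):
        for kind in ("hardware", "behavior", "domain"):
            self._index[(kind, getattr(mm, kind))].append(mm)

    def lookup(self, kind, value):
        # Return matches ordered from simple to complex.
        return sorted(self._index.get((kind, value), []), key=lambda m: m.level)

lib = MinimanipulationLibrary()
lib.add(Minimanipulation("grab_utensil", 1, "hand", "grasping", "cooking"))
lib.add(Minimanipulation("index_finger_joint", 0, "hand", "contacting", "cooking"))
names = [m.name for m in lib.lookup("hardware", "hand")]
print(names)
```

The same entry is reachable through any of the three groupings, and each lookup returns results ordered by the assumed `level` field.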
It should thus be understood that the concept of Minimanipulation (MM) (definitions and associations, measurement and control variables and their combinations and value-usage and -modification, etc.) and its implementation through usage of multiple MMLs in a near infinite combination, relates to the definition and control of basic behaviors (movements and interactions) of one or more degrees of freedom (movable joints under actuator control) at levels ranging from a single joint (knuckle, etc.) to combinations of joints (fingers and hand, arm, etc.) to ever higher degree of freedom systems (torso, upper-body, etc.) in a sequence and combination that achieves a desirable and successful movement sequence in free space and achieves a desirable degree of interaction with the real world so as to be able to enact a desirable function or output by the robot system, on and with, the surrounding world via tools, utensils, and other items.
Examples for the above definition can range from (i) a simple command sequence for a digit to flick a marble along a table, through (ii) stirring a liquid in a pot using a utensil, to (iii) playing a piece of music on an instrument (violin, piano, harp, etc.). The basic notion is that MMs are represented at multiple levels by a set of MM commands executed in sequence and in parallel at successive points in time, and together create a movement and action/interaction with the outside world to arrive at a desirable function (stirring the liquid, striking the bow on the violin, etc.) to achieve a desirable outcome (cooking pasta sauce, playing part of a Bach concerto, etc.).
The basic elements of any low-to-high MM sequence comprise movements for each subsystem, and combinations thereof are described as a set of commanded positions/velocities and forces/torques executed by one or more articulating joints under actuator power, in such a sequence as required. Fidelity of execution is guaranteed through a closed-loop behavior described within each MM sequence and enforced by local and global control algorithms inherent to each articulated joint controller and higher-level behavioral controllers.
Implementation of the above movements (described by articulating joint positions and velocities) and environment interactions (described by joint/interface torques and forces) is achieved by having computer playback desirable values for all required variables (positions/velocities and forces/torques) and feeding these to a controller system that faithfully implements them on each joint as a function of time at each time step. These variables and their sequence and feedback loops (hence not just data files, but also control programs), to ascertain the fidelity of the commanded movement/interactions, are all described in data-files that are combined into multi-level MMLs, which can be accessed and combined in multiple ways to allow a humanoid robot to execute multiple actions, such as cooking a meal, playing a piece of classical music on a piano, lifting an infirm person into/out of a bed, etc. There are MMLs that describe simple rudimentary movement/interactions, which are then used as building-blocks for ever higher-level MMLs that describe ever-higher levels of manipulation, such as ‘grasp’, ‘lift’, ‘cut’ to higher level primitives, such as ‘stir liquid in pot’/‘pluck harp-string to g-flat’ or even high-level actions, such as ‘make a vinaigrette dressing’/‘paint a rural Brittany summer landscape’/‘play Bach's Piano-concerto #1’, etc. Higher level commands are simply a combination towards a sequence of serial/parallel lower- and mid-level MM primitives that are executed along a common timed stepped sequence, which is overseen by a combination of a set of planners running sequence/path/interaction profiles with feedback controllers to ensure the required execution fidelity (as defined in the output data contained within each MM sequence).
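The playback-with-feedback idea above can be sketched for a single simulated joint. This is a deliberately simplified illustration: the proportional gain, the idealized joint response, and the function name `play_back` are assumptions, not details taken from the disclosure.

```python
def play_back(setpoints, gain=0.5, steps_per_setpoint=20, start=0.0):
    """Feed time-indexed position setpoints to a proportional loop on one joint.

    The joint model is idealized (assumed): each control step moves the joint
    by gain * error, where error is commanded minus measured position.
    """
    position = start
    reached = []
    for target in setpoints:
        for _ in range(steps_per_setpoint):
            error = target - position   # feedback: commanded vs. measured
            position += gain * error    # idealized joint response
        reached.append(position)
    return reached

commanded = [0.1, 0.3, 0.2]             # hypothetical joint positions in radians
reached = play_back(commanded)
max_error = max(abs(r - c) for r, c in zip(reached, commanded))
print(max_error)
```

A higher-level MM would, under the same assumptions, be a timed sequence of such playbacks across several joints, with the tracking error serving as one possible measure of execution fidelity.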
The values for the desirable positions/velocities and forces/torques and their execution playback sequence(s) can be achieved in multiple ways. One possible way is through watching and distilling the actions and movements of a human executing the same task, and distilling from the observation data (video, sensors, modeling software, etc.) the necessary variables and their values as a function of time and associating them with different minimanipulations at various levels by using specialized software algorithms to distill the required MM data (variables, sequences, etc.) into various types of low-to-high MMLs. This approach would allow a computer program to automatically generate the MMLs and define all sequences and associations automatically without any human involvement.
Another way would be (again by way of an automated computer-controlled process employing specialized algorithms) to learn from online data (videos, pictures, sound logs, etc.) how to build a required sequence of actionable sequences using existing low-level MMLs to build the proper sequence and combinations to generate a task-specific MML.
Yet another way, although most certainly more (time-) inefficient and less cost-effective, might be for a human programmer to assemble a set of low-level MM primitives to create an ever-higher level set of actions/sequences in a higher-level MML to achieve a more complex task-sequence, again composed of pre-existing lower-level MMLs.
Modifications and improvements to individual variables (meaning joint position/velocities and torques/forces at each incremental time-interval and their associated gains and combination algorithms) and the motion/interaction sequences are also possible and can be effected in many different ways. It is possible to have learning algorithms monitor each and every motion/interaction sequence and perform simple variable-perturbations to ascertain the outcome, in order to decide if, how, and when to modify which variable(s) and sequence(s) so as to achieve a higher level of execution fidelity at levels ranging from low- to high-levels of various MMLs. Such a process would be fully automatic and allow for updated data sets to be exchanged across multiple platforms that are interconnected, thereby allowing for massively parallel and cloud-based learning via cloud computing.
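A simple variable-perturbation loop of the kind mentioned above might look like the following sketch. It assumes a single tunable variable (a controller gain) and a toy fidelity measure (accumulated tracking error on an idealized joint); both are hypothetical stand-ins rather than the disclosed learning algorithm.

```python
def tracking_error(gain, target=1.0, steps=10):
    """Accumulated position error of an idealized joint driven with this gain."""
    position, total = 0.0, 0.0
    for _ in range(steps):
        position += gain * (target - position)
        total += abs(target - position)
    return total

def perturb_and_keep(gain, delta=0.05, rounds=30):
    """Perturb the variable up or down and keep a change only if error drops."""
    best = tracking_error(gain)
    for _ in range(rounds):
        for candidate in (gain + delta, gain - delta):
            score = tracking_error(candidate)
            if score < best:
                gain, best = candidate, score
                break
    return gain, best

tuned_gain, tuned_error = perturb_and_keep(0.2)
print(tuned_gain, tuned_error)
```

In this toy setting the search settles near a gain of 1.0, where the idealized joint reaches its target in one step; a real system would perturb many variables and judge outcomes from sensor data.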
Advantageously, the robotic apparatus in a standardized robotic kitchen has the capabilities to prepare a wide array of cuisines from around the world through a global network and database access, as compared to a chef who may specialize in one type of cuisine. The standardized robotic kitchen also is able to capture and record favorite food dishes for replication by the robotic apparatus whenever desired to enjoy the food dish without the repetitive process of laboring to prepare the same dish repeatedly.
The structures and methods of the present disclosure are disclosed in detail in the description below. This summary does not purport to define the disclosure. The disclosure is defined by the claims. These and other embodiments, features, aspects, and advantages of the disclosure will become better understood with regard to the following description, appended claims, and accompanying drawings.
In some embodiments, an electronic inventory system comprises a storage unit configured to store one or more objects; one or more image capturing devices configured in the storage unit to: capture one or more images of each of the one or more objects, in real-time; and transmit each of the one or more images to a display screen configured on the storage unit and one or more embedded processors configured in the storage unit; one or more sensors configured in the storage unit to provide corresponding sensor data to at least one of the one or more embedded processors associated with position and orientation of each of the one or more objects; one or more light sources configured in the storage unit to facilitate the one or more image capturing devices in capturing one or more images of each of the one or more objects in the storage unit, by providing uniform illumination in the storage unit; one or more embedded processors configured in the storage unit, wherein the one or more embedded processors interact with a central processor of the robotic assistant system through a communication network, configured to: detect each of the one or more objects stored in the storage unit based on the one or more images and the sensor data; and transmit the one or more images and the sensor data to the central processor in real-time or periodically.
In some embodiments, the one or more sensors comprise at least one of a temperature sensor, a humidity sensor, an ultrasound sensor, a laser measurement sensor, and a SONAR sensor.
In some embodiments, the one or more embedded processors detect each of the one or more objects by detecting presence/absence of the one or more objects, estimating content stored in the one or more objects, detecting position and orientation of each of the one or more objects, reading at least one of visual markers and radio type markers attached to each of the one or more objects and reading object identifiers.
In some embodiments, the one or more embedded processors detect the one or more objects based on Convolutional Neural Network (CNN) techniques.
In some embodiments, the storage unit comprises a display screen fixed on an external surface of the storage unit, configured to display images and videos of the one or more objects and one or more interactions performed on each of the one or more objects, in real-time.
In some embodiments, the display screen enables a user to visualize and locate each of the one or more objects stored in the storage unit, without opening doors of the storage unit.
In some embodiments, the storage unit is further configured with motor devices to enable performing one or more actions on doors of the storage unit, automatically, wherein the one or more actions comprise at least one of opening, closing, locking and unlocking the doors of the storage unit.
In some embodiments, each of the one or more sensors, each of the one or more light sources and each of the one or more image capturing devices of the storage unit are electrically connected to an extension board configured in the storage unit, wherein the extension board of each storage unit is connected to a Power over Ethernet (PoE) switch.
In some embodiments, the storage unit is further configured with a fan block for providing air circulation inside the storage unit and a thermoelectric cooler element to cool electric components in the storage unit.
In one non-limiting embodiment of the present disclosure, a coupling device for coupling one or more objects to a robotic system is provided. The coupling device comprises a first coupling member defined onto the robotic system and a second coupling member defined onto the one or more objects, and connectable with the first coupling member. A locking mechanism is defined at an interface of each of the first coupling member and the second coupling member, for coupling the one or more objects with the robotic system.
In an embodiment, the first coupling member is defined by a first connection surface connectable to the robotic system and a first mating surface defined with a plurality of first projections along its periphery.
In an embodiment, the second coupling member is defined by a second connection surface connectable to the one or more objects and a second mating surface defined with a plurality of second projections along its periphery.
In an embodiment, the plurality of first projections and the plurality of second projections are complementary to each other to facilitate coupling of the first coupling member with the second coupling member.
In an embodiment, the first connection surface is connectable with the robotic system by at least one of a mechanical means, an electromechanical means, a vacuum means and a magnetic means.
In an embodiment, the second connection surface is connectable to the one or more objects by at least one of the mechanical means, the electromechanical means, the vacuum means and the magnetic means.
In an embodiment, the materials of the first coupling member and the second coupling member are selected to facilitate joining between the first mating surface and the second mating surface.
In an embodiment, the first coupling member is made of either an electromagnetic material or a ferromagnetic material.
In an embodiment, the second coupling member is made of either the ferromagnetic material or the electromagnetic material.
In an embodiment, an interface port is defined on the first coupling member and interfaced to the robotic system, for peripheral connection between the robotic system and the second coupling member, to facilitate manipulation of the one or more objects by the robotic system.
In an embodiment, each of the one or more objects is at least one of a kitchen appliance and a kitchen tool.
In an embodiment, at least one sensor unit is defined in the robotic system, wherein the at least one sensor unit is configured to detect orientation of the plurality of first projections with the plurality of second projections, during coupling of the first coupling member with the second coupling member.
In an embodiment, the locking mechanism comprises at least one notch defined on the first mating surface and at least one protrusion defined on the second mating surface. The at least one protrusion is adapted to engage with the at least one notch for coupling the first mating surface with the second mating surface.
In an embodiment, the at least one protrusion is shaped corresponding to the configuration of the at least one notch.
In an embodiment, the locking mechanism comprises at least one notch defined on the second mating surface and at least one protrusion defined on the first mating surface. The at least one protrusion is adapted to engage with the at least one notch for coupling the first mating surface with the second mating surface.
In an embodiment, the at least one notch is shaped in at least one of a triangular shape, a circular shape, and a polygonal shape.
In another non-limiting embodiment of the present disclosure, a coupling device for coupling one or more objects to a robotic system is provided. The coupling device comprises a first coupling member defined onto the robotic system and a second coupling member defined onto the one or more objects, and connectable with the first coupling member. A locking mechanism is defined at an interface of each of the first coupling member and the second coupling member, for coupling the one or more objects with the robotic system. The locking mechanism comprises at least one triangular notch defined on either of the first coupling member and the second coupling member and at least one triangular protrusion defined on the corresponding first coupling member and the second coupling member. The at least one triangular protrusion is adapted to engage with the at least one triangular notch for coupling the first coupling member with the second coupling member.
In another non-limiting embodiment of the present disclosure, a coupling device for coupling one or more objects to a robotic system is provided. The coupling device comprises a first coupling member defined onto the robotic system and a second coupling member defined onto the one or more objects, and connectable with the first coupling member. A locking mechanism is defined at an interface of each of the first coupling member and the second coupling member, for coupling the one or more objects with the robotic system. The locking mechanism comprises at least one circular notch defined on either of the first coupling member and the second coupling member and at least one circular protrusion defined on the corresponding first coupling member and the second coupling member. The at least one circular protrusion is adapted to engage with the at least one circular notch for coupling the first coupling member with the second coupling member.
In another non-limiting embodiment of the present disclosure, a coupling device for coupling one or more objects to a robotic system is provided. The coupling device comprises a first coupling member defined onto the robotic system and a second coupling member defined onto the one or more objects, and connectable with the first coupling member. A locking mechanism is defined at an interface of each of the first coupling member and the second coupling member, for coupling the one or more objects with the robotic system. The locking mechanism comprises at least one notch defined on either of the first coupling member and the second coupling member, wherein each of the at least one notch is configured to receive an electromagnet. Also, at least one protrusion is defined on the corresponding first coupling member and the second coupling member and adapted to engage with the electromagnet in the at least one notch for coupling the first coupling member with the second coupling member.
In an embodiment, the at least one protrusion is made of ferromagnetic material for joining with the electromagnet in the at least one notch.
In an embodiment, the at least one notch includes a groove defined along its periphery.
In an embodiment, the at least one protrusion includes a pin, shaped corresponding to the configuration of the groove in the at least one notch and adapted to engage with the groove for improving stability of the coupling between the first coupling member and the second coupling member.
In an embodiment, a locking mechanism for securing one or more objects to a robotic system is provided. The locking mechanism comprises at least one first locking member fixed on a manipulator of the robotic system and at least one second locking member mounted on the manipulator and adapted to be operable between a first position and a second position. At least one actuator assembly is associated with the at least one second locking member and adapted to operate the at least one second locking member between the first position and the second position. The at least one actuator assembly operates the at least one second locking member from the first position to the second position, to engage each of the one or more objects between the at least one first locking member and the at least one second locking member, thereby securing the one or more objects with the robotic system.
In an embodiment, the at least one first locking member and the at least one second locking member are located in a same plane of the manipulator.
In an embodiment, the at least one actuator assembly is configured on a rear surface of the manipulator.
In an embodiment, the at least one first locking member and the at least one second locking member are located on a front surface of the manipulator.
In an embodiment, each of the one or more objects includes a holding portion, defined with a plurality of slots along its periphery for engaging with the at least one first locking member and the at least one second locking member.
In an embodiment, shape of the plurality of slots corresponds to the configuration of the at least one first locking member and the at least one second locking member.
In an embodiment, the at least one actuator assembly is actuated by the robotic system, to slide the at least one second holding member from the first position to the second position, when the manipulator approaches vicinity of each of the one or more objects.
In an embodiment, the manipulator includes a guideway for guiding each of the at least one second holding member between the first position and the second position.
In an embodiment, each of the at least one first holding member and the at least one second holding member is a hook member.
In an embodiment, the at least one actuator assembly is selected from at least one of a linear actuator and a rotary actuator.
In an embodiment, the at least one actuator assembly comprises a lead screw mounted onto the manipulator, a motor interfaced with the robotic system and coupled to the lead screw, to axially rotate the lead screw and a nut mounted onto the lead screw and engaged with the at least one second holding means. The nut is configured to traverse along the lead screw during its axial rotation, thereby operating the at least one second holding means between the first position and the second position.
In an embodiment, a lead screw holder is provided for mounting the lead screw on the manipulator, such that the lead screw is aligned along a horizontal axis of the manipulator.
In an embodiment, the lead screw includes a plurality of threads with a lead angle ranging from about 6 degrees to about 12 degrees, to restrict movement of the nut, when the motor ceases to operate.
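The reason a small lead angle restricts nut movement once the motor stops can be checked with a short calculation. The sketch below uses the common square-thread approximation, in which a screw resists back-driving when the thread friction coefficient exceeds the tangent of the lead angle. The friction coefficient of 0.25 and the 20 degree comparison angle are assumed values for illustration only; they are not taken from the disclosure.

```python
import math

def is_self_locking(lead_angle_deg, friction_coefficient):
    """Square-thread approximation: no back-driving when mu > tan(lead angle)."""
    return friction_coefficient > math.tan(math.radians(lead_angle_deg))

def nut_travel(lead_mm, revolutions):
    """Axial travel of the nut: one lead per full turn of the screw."""
    return lead_mm * revolutions

ASSUMED_MU = 0.25  # hypothetical thread friction coefficient

# 6 and 12 degrees bound the stated range; 20 degrees is an assumed contrast case.
results = {angle: is_self_locking(angle, ASSUMED_MU) for angle in (6, 12, 20)}
print(results)
print(nut_travel(lead_mm=2.0, revolutions=5))
```

Under these assumptions both ends of the stated range hold the nut in place, while the steeper contrast angle would let it back-drive.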
In an embodiment, the nut is engaged with the at least one second holding means via at least one bracket member.
In an embodiment, the nut is configured to slide the at least one second holding means from the first position to the second position, during clockwise rotation of the lead screw.
In an embodiment, the nut is configured to slide the at least one second holding means from the second position to the first position, during anti-clockwise rotation of the lead screw.
In an embodiment, the nut is configured to slide the at least one second holding means from the first position to the second position, during anti-clockwise rotation of the lead screw.
In an embodiment, the nut is configured to slide the at least one second holding means from the second position to the first position, during clockwise rotation of the lead screw.
In an embodiment, the motor is supported onto the manipulator via a clamp.
In an embodiment, the at least one second holding means extends from the rear surface of the manipulator and protrudes over the front surface of the manipulator to position itself in the same plane as that of the at least one first holding means.
In an embodiment, the at least one actuator assembly comprises a housing mounted onto the manipulator, the housing includes a solenoid coil configured to be energized by a power source. A plunger is accommodated within the housing and suspended concentrically to the solenoid coil, wherein the plunger is adapted to be actuated by the solenoid coil in an energized condition. A frame member is mounted to the plunger and connected to the at least one second holding means. The frame member is configured to transfer actuation of the plunger to the at least one second holding means during the energized condition of the solenoid coil, thereby operating the at least one second holding means between the first position and the second position.
In an embodiment, the power source for energizing the solenoid coil is selected from at least one of an alternating current and a direct current.
In an embodiment, a damper member is provided such that, one end is fixed to the housing and another end connected to the frame member, to control movement of the frame member.
In an embodiment, the frame member includes one or more link members connected to each of the at least one second holding means.
It is to be understood that the aspects and embodiments of the disclosure described above may be used in any combination with each other. Several of the aspects and embodiments may be combined to form a further embodiment of the disclosure.
The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
The novel features and characteristics of the disclosure are set forth in the appended claims. The disclosure itself, however, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings. One or more embodiments are now described, by way of example only, with reference to the accompanying drawings wherein like reference numerals represent like elements and in which:
The figures depict embodiments of the disclosure for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
A description of structural embodiments and methods of the present disclosure is provided with reference to
The following definitions apply to the elements and steps described herein. These terms may likewise be expanded upon.
Abstraction Data—refers to the abstraction recipe of utility for machine-execution, which has many other data-elements that a machine needs to know for proper execution and replication. This so-called meta-data is additional data corresponding to a particular step in the cooking process, whether it be direct sensor-data (clock-time, water-temperature, camera-image, utensil or ingredient used, etc.) or data generated through interpretation or abstraction of larger data-sets (such as a 3-Dimensional range cloud from a laser used to extract the location and types of objects in the image, overlaid with texture and color maps from a camera-picture, etc.). The meta-data is time-stamped and used by the robotic kitchen to set, control, and monitor all processes and associated methods and equipment needed at every point in time as it steps through the sequence of steps in the recipe.
Abstraction Recipe—refers to a representation of a chef's recipe, which a human knows as represented by the use of certain ingredients, in certain sequences, prepared and combined through a sequence of processes and methods, as well as skills of the human chef. An abstraction recipe used by a machine for execution in an automated way requires different types of classifications and sequences. While the overall steps carried out are identical to those of the human chef, the abstraction recipe of utility to the robotic kitchen requires that additional meta-data be a part of every step in the recipe. Such meta-data includes the cooking time and variables, such as temperature (and its variations over time), oven-setting, tool/equipment used, etc. Basically a machine-executable recipe-script needs to have all possible measured variables of import to the cooking process (all measured and stored while the human chef was preparing the recipe in the chef studio) correlated to time, both overall and that within each process-step of the cooking-sequence. Hence, the abstraction recipe is a representation of the cooking steps mapped into a machine-readable representation or domain, which takes the required process from the human-domain to that of the machine-understandable and machine-executable domain through a set of logical abstraction steps.
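A machine-readable recipe step carrying time-stamped meta-data could be represented roughly as follows. This is only a sketch of the idea; the class names `MetaDatum` and `RecipeStep`, the field names, and the example readings are hypothetical and do not reflect an actual file format from the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class MetaDatum:
    timestamp_s: float   # overall recipe time at which the value was measured
    name: str            # e.g. "water_temperature_c" (hypothetical key)
    value: float

@dataclass
class RecipeStep:
    action: str
    start_s: float       # overall recipe time at which the step begins
    duration_s: float
    meta: list = field(default_factory=list)

    def readings(self, name):
        """Return (time within this step, value) pairs for one measured variable."""
        return [(m.timestamp_s - self.start_s, m.value)
                for m in self.meta if m.name == name]

step = RecipeStep(
    action="simmer sauce",
    start_s=120.0,
    duration_s=300.0,
    meta=[
        MetaDatum(120.0, "water_temperature_c", 80.0),
        MetaDatum(180.0, "water_temperature_c", 92.0),
    ],
)
temps = step.readings("water_temperature_c")
print(temps)
```

Storing the overall timestamp and deriving the within-step time keeps both time references described above available from a single record.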
Acceleration—refers to the maximum rate of speed-change at which a robotic arm can accelerate around an axis or along a space-trajectory over a short distance.
Accuracy—refers to how closely a robot can reach a commanded position. Accuracy is determined by the difference between the absolute positions of the robot compared to the commanded position. Accuracy can be improved, adjusted, or calibrated with external sensing, such as sensors on a robotic hand or a real-time three-dimensional model using multiple (multi-mode) sensors.
Action Primitive—in one embodiment, the term refers to an indivisible robotic action, such as moving the robotic apparatus from location X1 to location X2, or sensing the distance from an object (for food preparation) without necessarily obtaining a functional outcome. In another embodiment, the term refers to an indivisible robotic action in a sequence of one or more such units for accomplishing a minimanipulation. These are two aspects of the same definition (the smallest functional subblock, i.e., a lower-level minimanipulation).
Alternative Functional Action Primitive (AFAP)—refers to an alternative functional action primitive, rather than a particular functional action primitive, obtained by changing the initial parameters (including initial position, initial orientation, and/or the way the robot moves in order to obtain a functional result) of the robot relative to an operated object or the operating environment, to accomplish the same functional result as that particular functional action primitive.
Automated Dosage System—refers to dosage containers in a standardized kitchen module where a particular size of food chemical compounds (such as salt, sugar, pepper, spice, any kind of liquids, such as water, oil, essences, ketchup, etc.) is released upon application.
Automated Storage and Delivery System—refers to storage containers in a standardized kitchen module that maintain a specific temperature and humidity for storing food; each storage container is assigned a code (e.g., a bar code) for the robotic kitchen to identify and retrieve where a particular storage container delivers the food contents stored therein.
Coarse—refers to movements whose magnitude is within 75% of the maximum workspace dimension achievable by a particular subsystem. As an example, a coarse movement for a manipulator arm would be any motion that is within 75% of the largest dimension contained within the volume described by the maximum three-dimensional reach of the robot arm itself in all possible directions. Furthermore, the resolution of motion typical (due to many factors such as sensor-resolution, controller discretization, mechanical tolerances, assembly slop, etc.) for such systems is at best 1/100 to 1/200 of said maximum workspace dimension. So if a human-arm sized robot arm can reach anywhere within a 6-foot diameter half-sphere, its maximum resolvable (and thus controllable) motion-increment would lie somewhere between 0.072 in and 0.14 in at full reach.
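The relationship between workspace size and motion increment can be worked through numerically. The sketch below assumes that the 6-foot diameter (72 in) is the maximum workspace dimension; the helper name `increment` is hypothetical. Under that assumption, fractions of 1/100 and 1/200 correspond to 0.72 in and 0.36 in, whereas increments of 0.072 in and 0.14 in correspond to roughly 1/1000 and 1/500 of the same dimension.

```python
MAX_DIMENSION_IN = 6 * 12  # assumed maximum workspace dimension: 6 ft in inches

def increment(denominator):
    """Motion increment for a resolution of 1/denominator of the workspace."""
    return MAX_DIMENSION_IN / denominator

# Increments for the stated 1/200 and 1/100 resolution fractions.
from_fractions = (increment(200), increment(100))

# Fractions implied by the 0.14 in and 0.072 in increments.
implied_denominators = (round(MAX_DIMENSION_IN / 0.14),
                        round(MAX_DIMENSION_IN / 0.072))

print(from_fractions)
print(implied_denominators)
```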
Data Cloud—refers to a collection of sensor or data-based numerical measurement values from a particular space (three-dimensional laser/acoustic range measurement, RGB-values from a camera image, etc.) collected at certain intervals and aggregated based on a multitude of relationships, such as time, location, etc.
Dedicated—refers to hardware elements, such as processors, sensors, actuators, and buses, that are solely used by a particular element or subsystem. In particular, each subsystem within the macro- and micro-manipulation systems contains elements that utilize their own processors, sensors, and actuators that are solely responsible for the movements of the hardware element (shoulder, arm-joint, wrist, finger, etc.) they are associated with.
Degree of Freedom (“DOF”)—refers to a defined mode and/or direction in which a mechanical device or system can move. The number of degrees of freedom is equal to the total number of independent displacements or aspects of motion. The total number of degrees of freedom is doubled for two robotic arms.
Direct Environment—refers to a defined working space that is reachable from the current position of the robot.
Direct Standard Environment—refers to a direct environment that is in a defined and known state.
Edge Detection—refers to a software-based computer program(s) capable of identifying the edges of multiple objects that may be overlapping in a two-dimensional-image of a camera yet successfully identifying their boundaries to aid in object identification and planning for grasping and handling.
Environment—refers to a collection of any kind of physical objects that the robot can interact or collide with, including structures, movable objects, other robots, humans, tools, etc.
Equilibrium Value—refers to the target position of a robotic appendage, such as a robotic arm where the forces acting upon it are in equilibrium, i.e. there is no net force and thus no net movement.
Execution Sequence Planner—refers to a software-based computer program(s) capable of creating a sequence of execution scripts or commands for one or more elements or systems capable of being computer controlled, such as arm(s), dispensers, appliances, etc.
Fine—refers to movements that are within 75% of the largest dimension of the three-dimensional workspace of a micro-manipulation subsystem. As an example, the workspace of a multi-fingered hand could be described as a three-dimensional ellipsoid or sphere; the largest dimension (major axis for an ellipsoid or diameter for a sphere) would represent the largest dimension of a fine motion. Furthermore, the resolution of motion typical for such sized systems (due to many factors such as sensor resolution, controller discretization, mechanical tolerances, assembly slop, etc.) is at best 1/500 to 1/1,000 of said maximum workspace dimension. So if a human-sized robot hand can reach anywhere within a 6-inch diameter half-sphere, its maximum resolvable (and thus controllable) motion increment would lie somewhere between 0.012 in and 0.006 in at full reach.
Food Execution Fidelity—refers to a robotic kitchen, which is intended to replicate the recipe-script generated in the chef studio by watching, measuring, and understanding the steps, variables, methods, and processes of the human chef, thereby trying to emulate his/her techniques and skills. The fidelity of how close the execution of the dish-preparation comes to that of the human-chef is measured by how close the robotically-prepared dish resembles the human-prepared dish as measured by a variety of subjective elements, such as consistency, color, taste, etc. The notion is that the more closely the dish prepared by the robotic kitchen is to that prepared by the human chef, the higher the fidelity of the replication process.
Food Preparation Stage (also referred to as “Cooking Stage”)—refers to a combination, either sequential or in parallel, of one or more minimanipulations including action primitives, and computer instructions for controlling the various kitchen equipment and appliances in the standardized kitchen module. One or more food preparation stages collectively represent the entire food preparation process for a particular recipe.
Functional Action Primitive (FAP)—refers to an indivisible action primitive that obtains a necessary functional outcome.
Functional Action Primitive Subblocks (FAPSBs)—refers to either robot trajectories, vision system commands or appliance commands.
Geometric Reasoning—refers to a software-based computer program(s) capable of using a two-dimensional (2D)/three-dimensional (3D) surface, and/or volumetric data to reason as to the actual shape and size of a particular volume. The ability to determine or utilize boundary information also allows for inferences as to the start and end of a particular geometric element and the number present in an image or model.
Grasp Reasoning—refers to a software-based computer program(s) capable of relying on geometric and physical reasoning to plan a multi-contact (point/area/volume) interaction between a robotic end-effector (gripper, link, etc.), or even tools/utensils held by the end-effector, so as to successfully contact, grasp, and hold the object in order to manipulate it in a three-dimensional space.
Hardware Automation Device—fixed process device capable of executing pre-programmed steps in succession without the ability to modify any of them; such devices are used for repetitive motions that do not need any modulation.
Ingredient Management and Manipulation—refers to defining each ingredient in detail (including size, shape, weight, dimensions, characteristics, and properties), one or more real-time adjustments in the variables associated with the particular ingredient that may differ from the previous stored ingredient details (such as the size of a fish fillet, the dimensions of an egg, etc.), and the process in executing the different stages for the manipulation movements to an ingredient.
Kitchen Module (or Kitchen Volume)—a standardized full-kitchen module with standardized sets of kitchen equipment, standardized sets of kitchen tools, standardized sets of kitchen handles, and standardized sets of kitchen containers, with predefined space and dimensions for storing, accessing, and operating each kitchen element in the standardized full-kitchen module. One objective of a kitchen module is to predefine as much of the kitchen equipment, tools, handles, containers, etc. as possible, so as to provide a relatively fixed kitchen platform for the movements of robotic arms and hands. Both a chef in the chef kitchen studio and a person at home with a robotic kitchen (or a person at a restaurant) use the standardized kitchen module, so as to maximize the predictability of the kitchen hardware, while minimizing the risks of differentiations, variations, and deviations between the chef kitchen studio and a home robotic kitchen. Different embodiments of the kitchen module are possible, including a standalone kitchen module and an integrated kitchen module. The integrated kitchen module is fitted into a conventional kitchen area of a typical house. The kitchen module operates in at least two modes, a robotic mode and a normal (manual) mode.
Live Planning—refers to plans that are created just before execution, usually dependent on the direct environment.
Machine Learning—refers to the technology wherein a software component or program improves its performance based on experience and feedback. One kind of machine learning often used in robotics is reinforcement learning, where desirable actions are rewarded and undesirable ones are penalized. Another kind is case-based learning, where previous solutions, e.g. sequences of actions by a human teacher or by the robot itself are remembered, together with any constraints or reasons for the solutions, and then are applied or reused in new settings. There are also additional kinds of machine learning, such as inductive and transductive methods.
Minimanipulation (MM)—generally, MM refers to one or more behaviors or task-executions in any number or combinations and at various levels of descriptive abstraction, by a robotic apparatus that executes commanded motion-sequences under sensor-driven computer-control, acting through one or more hardware-based elements and guided by one or more software-controllers at multiple levels, to achieve a required task-execution performance level to arrive at an outcome approaching an optimal level within an acceptable execution fidelity threshold. The acceptable fidelity threshold is task-dependent and therefore defined for each task (also referred to as “domain-specific application”). In the absence of a task-specific threshold, a typical threshold would be 0.001 (0.1%) of optimal performance.
Model Elements and Classification—refers to one or more software-based computer program(s) capable of understanding elements in a scene as being items that are used or needed in different parts of a task; such as a bowl for mixing and the need for a spoon to stir, etc. Multiple elements in a scene or a world-model may be classified into groupings allowing for faster planning and task-execution.
Motion Primitives—refers to motion actions that define different levels/domains of detailed action steps, e.g. a high-level motion primitive would be to grab a cup, and a low-level motion primitive would be to rotate a wrist by five degrees.
Multimodal Sensing Unit—refers to a sensing unit comprised of multiple sensors capable of sensing and detecting multiple modes or electromagnetic bands or spectra: particularly, capable of capturing three-dimensional position and/or motion information. The electromagnetic spectrum can range from low to high frequencies and does not need to be limited to that perceived by a human being. Additional modes might include, but are not limited to, other physical senses such as touch, smell, etc.
Number of Axes—three axes are required to reach any point in space. To fully control the orientation of the end of the arm (i.e. the wrist), three additional rotational axes (yaw, pitch, and roll) are required.
Parameters—refers to variables that can take numerical values or ranges of numerical values. Three kinds of parameters are particularly relevant: parameters in the instructions to a robotic device (e.g. the force or distance in an arm movement), user-settable parameters (e.g. prefers meat well done vs. medium), and chef-defined parameters (e.g. set oven temperature to 350 F).
Parameter Adjustment—refers to the process of changing the values of parameters based on inputs. For instance changes in the parameters of instructions to the robotic device can be based on the properties (e.g. size, shape, orientation) of, but not limited to, the ingredients, position/orientation of kitchen tools, equipment, appliances, speed, and time duration of a minimanipulation.
Payload or Carrying Capacity—refers to how much weight a robotic arm can carry and hold (or even accelerate) against the force of gravity as a function of its endpoint location.
Physical Reasoning—refers to a software-based computer program(s) capable of relying on geometrically-reasoned data and using physical information (density, texture, typical geometry, and shape) to assist an inference-engine (program) to better model the object and also predict its behavior in the real world, particularly when grasped and/or manipulated/handled.
Properly Sequenced—refers to a set of consecutive instructions, in our case namely time-based motion instructions that are consecutive in time, issued to one or more robotic actuation elements within each of the manipulation subsystems. The implication of a “properly sequenced” set of instructions, carries with it the knowledge that a high-level planner has created said instructions and concatenated and placed them in a sequence, so as to ensure that each actuated element within each of the addressed subsystems will carry out said instructions, thereby achieving a properly synchronized set of motions that achieve the desired task execution result.
Pre-planning—refers to a type of planning where plans are made in advance of execution in a direct environment, in which the pre-planning data and direct environment data are saved together.
Raw Data—refers to all measured and inferred sensory-data and representation information that is collected as part of the chef-studio recipe-generation process while watching/monitoring a human chef preparing a dish. Raw data can range from a simple data-point such as clock-time, to oven temperature (over time), camera-imagery, three-dimensional laser-generated scene representation data, to appliances/equipment used, tools employed, ingredients (type and amount) dispensed and when, etc. All the information the studio-kitchen collects from its built-in sensors and stores in raw, time-stamped form, is considered raw data. Raw data is then used by other software processes to generate an even higher level of understanding and recipe-process understanding, turning raw data into additional time-stamped processed/interpreted data.
Robotic Apparatus—refers to the set of robotic sensors and effectors. The effectors comprise one or more robotic arms and one or more robotic hands for operation in the standardized robotic kitchen. The sensors comprise cameras, range sensors, and force sensors (haptic sensors) that transmit their information to the processor or set of processors that control the effectors.
Recipe Cooking Process—refers to a robotic script containing abstract and detailed levels of instructions to a collection of programmable and hard-automation devices, to allow computer-controllable devices to execute a sequenced operation within its environment (e.g. a kitchen replete with ingredients, tools, utensils, and appliances).
Recipe Script—refers to a recipe script as a sequence in time containing a structure and a list of commands and execution primitives (simple to complex command software) that, when executed by the robotic kitchen elements (robot-arm, automated equipment, appliances, tools, etc.) in a given sequence, should result in the proper replication and creation of the same dish as prepared by the human chef in the studio-kitchen. Such a script is sequential in time and equivalent to the sequence employed by the human chef to create the dish, albeit in a representation that is suitable and understandable by the computer-controlled elements in the robotic kitchen.
Recipe Speed Execution—refers to managing a timeline in the execution of recipe steps in preparing a food dish by replicating a chef's movements, where the recipe steps include standardized food preparation operations (e.g., standardized cookware, standardized equipment, kitchen processors, etc.), MMs, and cooking of non-standardized objects.
Repeatability—refers to an acceptable preset margin in how accurately the robotic arms/hands can repeatedly return to a programmed position. If the technical specification in a control memory requires the robotic hand to move to a certain X-Y-Z position and within +/−0.1 mm of that position, then the repeatability is measured for the robotic hands to return to within +/−0.1 mm of the taught and desired/commanded position.
Robotic Recipe Script—refers to a computer-generated sequence of machine-understandable instructions related to the proper sequence of robotically/hard-automation execution of steps to mirror the required cooking steps in a recipe to arrive at the same end-product as if cooked by a chef.
Robotic Costume—refers to external instrumented device(s) or clothing, such as gloves, clothing with camera-trackable markers, a jointed exoskeleton, etc., used in the chef studio to monitor and track the movements and activities of the chef during all aspects of the recipe cooking process(es).
Scene Modeling—refers to a software-based computer program(s) capable of viewing a scene in one or more cameras' fields of view and being capable of detecting and identifying objects of importance to a particular task. These objects may be pre-taught and/or be part of a computer library with known physical attributes and usage-intent.
Smart Kitchen Cookware/Equipment—refers to an item of kitchen cookware (e.g., a pot or a pan) or an item of kitchen equipment (e.g., an oven, a grill, or a faucet) with one or more sensors that prepares a food dish based on one or more graphical curves (e.g., a temperature curve, a humidity curve, etc.).
Software Abstraction Food Engine—refers to a software engine that is defined as a collection of software loops or programs, acting in concert to process input data and create a certain desirable set of output data to be used by other software engines or an end-user through some form of textual or graphical output interface. An abstraction software engine is a software program(s) focused on taking a large and vast amount of input data from a known source in a particular domain (such as three-dimensional range measurements that form a data-cloud of three-dimensional measurements as seen by one or more sensors), and then processing the data to arrive at interpretations of the data in a different domain (such as detecting and recognizing a table-surface in a data-cloud based on data having the same vertical data value, etc.), in order to identify, detect, and classify data-readings as pertaining to an object in three-dimensional space (such as a table-top, cooking pot, etc.). The process of abstraction is basically defined as taking a large data set from one domain and inferring structure (such as geometry) in a higher level of space (abstracting data points), and then abstracting the inferences even further and identifying objects (pots, etc.) out of the abstraction data-sets to identify real-world elements in an image, which can then be used by other software engines to make additional decisions (handling/manipulation decisions for key objects, etc.). A synonym for “software abstraction engine” in this application could be also “software interpretation engine” or even “computer-software processing and interpretation algorithm”.
Task Reasoning—refers to a software-based computer program(s) capable of analyzing a task-description and breaking it down into a sequence of multiple machine-executable (robot or hard-automation systems) steps, to achieve a particular end result defined in the task description.
Three-dimensional World Object Modeling and Understanding—refers to a software-based computer program(s) capable of using sensory data to create a time-varying three-dimensional model of all surfaces and volumes, to enable it to detect, identify, and classify objects within the same and understand their usage and intent.
Torque Vector—refers to the torsion force upon a robotic appendage, including its direction and magnitude.
Volumetric Object Inference (Engine)—refers to a software-based computer program(s) capable of using geometric data and edge-information, as well as other sensory data (color, shape, texture, etc.), to allow for identification of three-dimensionality of one or more objects to aid in the object identification and classification process.
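The resolution figures given in the Coarse and Fine definitions above follow from multiplying the maximum workspace dimension by the stated resolution fractions. A minimal sketch of that arithmetic is shown below; the function name `motion_increment_range` and its parameters are hypothetical and are used for illustration only.

```python
def motion_increment_range(max_dimension_in, coarsest_fraction, finest_fraction):
    """Return (coarsest, finest) resolvable motion increments in inches.

    max_dimension_in  -- maximum workspace dimension of the subsystem, in inches
    coarsest_fraction -- larger resolution fraction (e.g., 1/500)
    finest_fraction   -- smaller resolution fraction (e.g., 1/1000)
    """
    if max_dimension_in <= 0:
        raise ValueError("workspace dimension must be positive")
    if not 0 < finest_fraction <= coarsest_fraction < 1:
        raise ValueError("fractions must satisfy 0 < finest <= coarsest < 1")
    return (max_dimension_in * coarsest_fraction,
            max_dimension_in * finest_fraction)


# Fine example from the definitions: a 6-inch workspace at 1/500 to 1/1,000.
fine_coarsest, fine_finest = motion_increment_range(6.0, 1 / 500, 1 / 1000)
print(f"{fine_coarsest:.3f} in to {fine_finest:.3f} in")
```

The same call applies to a coarse subsystem by passing its maximum workspace dimension in inches together with the fractions 1/100 and 1/200.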
Robotic assistants and/or robotic apparatuses, including the interactions or minimanipulations performed thereby, are described in further detail, for example, in the following applications: U.S. patent application Ser. No. 14/627,900 entitled “Methods and Systems for Food Preparation in a Robotic Cooking Kitchen,” filed 20 Feb. 2015; U.S. Provisional Application Ser. No. 62/202,030 entitled “Robotic Manipulation Methods and Systems Based on Electronic Mini-Manipulation Libraries,” filed 6 Aug. 2015; U.S. Provisional Application Ser. No. 62/189,670 entitled “Robotic Manipulation Methods and Systems Based on Electronic Minimanipulation Libraries,” filed 7 Jul. 2015; U.S. Provisional Application Ser. No. 62/166,879 entitled “Robotic Manipulation Methods and Systems Based on Electronic Minimanipulation Libraries,” filed 27 May 2015; U.S. Provisional Application Ser. No. 62/161,125 entitled “Robotic Manipulation Methods and Systems Based on Electronic Minimanipulation Libraries,” filed 13 May 2015; U.S. Provisional Application Ser. No. 62/146,367 entitled “Robotic Manipulation Methods and Systems Based on Electronic Minimanipulation Libraries,” filed 12 Apr. 2015; U.S. Provisional Application Ser. No. 62/116,563 entitled “Method and System for Food Preparation in a Robotic Cooking Kitchen,” filed 16 Feb. 2015; U.S. Provisional Application Ser. No. 62/113,516 entitled “Method and System for Food Preparation in a Robotic Cooking Kitchen,” filed 8 Feb. 2015; U.S. Provisional Application Ser. No. 62/109,051 entitled “Method and System for Food Preparation in a Robotic Cooking Kitchen,” filed 28 Jan. 2015; U.S. Provisional Application Ser. No. 62/104,680 entitled “Method and System for Robotic Cooking Kitchen,” filed 16 Jan. 2015; U.S. Provisional Application Ser. No. 62/090,310 entitled “Method and System for Robotic Cooking Kitchen,” filed 10 Dec. 2014; U.S. Provisional Application Ser. No. 62/083,195 entitled “Method and System for Robotic Cooking Kitchen,” filed 22 Nov. 2014; U.S. Provisional Application Ser. No. 62/073,846 entitled “Method and System for Robotic Cooking Kitchen,” filed 31 Oct. 2014; U.S. Provisional Application Ser. No. 62/055,799 entitled “Method and System for Robotic Cooking Kitchen,” filed 26 Sep. 2014; U.S. Provisional Application Ser. No. 62/044,677 entitled “Method and System for Robotic Cooking Kitchen,” filed 2 Sep. 2014; U.S. Provisional Application Ser. No. 62/024,948 entitled “Method and System for Robotic Cooking Kitchen,” filed 15 Jul. 2014; U.S. Provisional Application Ser. No. 62/013,691 entitled “Method and System for Robotic Cooking Kitchen,” filed 18 Jun. 2014; U.S. Provisional Application Ser. No. 62/013,502 entitled “Method and System for Robotic Cooking Kitchen,” filed 17 Jun. 2014; U.S. Provisional Application Ser. No. 62/013,190 entitled “Method and System for Robotic Cooking Kitchen,” filed 17 Jun. 2014; U.S. Provisional Application Ser. No. 61/990,431 entitled “Method and System for Robotic Cooking Kitchen,” filed 8 May 2014; U.S. Provisional Application Ser. No. 61/987,406 entitled “Method and System for Robotic Cooking Kitchen,” filed 1 May 2014; U.S. Provisional Application Ser. No. 61/953,930 entitled “Method and System for Robotic Cooking Kitchen,” filed 16 Mar. 2014; and U.S. Provisional Application Ser. No. 61/942,559 entitled “Method and System for Robotic Cooking Kitchen,” filed 20 Feb. 2014.
For additional information on replication by a robotic apparatus and minimanipulation library, see the pending U.S. non-provisional patent application Ser. No. 14/627,900, now U.S. Pat. No. 9,815,191, entitled “Methods and Systems for Food Preparation in Robotic Cooking Kitchen,” and the pending U.S. nonprovisional patent application Ser. No. 14/829,579, entitled “Robotic Manipulation Methods and Systems for Executing a Domain-Specific Application in an Instrumented Environment with Electronic Manipulation Libraries,” the disclosures of which are incorporated herein by reference in their entireties. For additional information on containers in a domain-specific application in an instrumented environment, see pending U.S. non-provisional patent application Ser. No. 15/382,369, entitled, “Robotic Manipulation Methods and Systems for Executing a Domain-Specific Application in an Instrumented Environment with Containers and Electronic Manipulation Libraries,” the disclosure of which is incorporated herein by reference in its entirety.
The robotic food preparation software 14 includes the multimodal three-dimensional sensors 20, a capturing module 28, a calibration module 30, a conversion algorithm module 32, a replication module 34, a quality check module 36 with a three-dimensional vision system, a same result module 38, and a learning module 40. The capturing module 28 captures the movements of the chef as the chef prepares a food dish. The calibration module 30 calibrates the robotic arms 22 and robotic hands 24 before, during, and after the cooking process. The conversion algorithm module 32 is configured to convert the recorded data from a chef's movements collected in the chef studio into recipe modified data (or transformed data) for use in a robotic kitchen where robotic hands replicate the food preparation of the chef's dish. The replication module 34 is configured to replicate the chef's movements in a robotic kitchen. The quality check module 36 is configured to perform quality check functions of a food dish prepared by the robotic kitchen during, prior to, or after the food preparation process. The same result module 38 is configured to determine whether the food dish prepared by a pair of robotic arms and hands in the robotic kitchen would taste the same or substantially the same as if prepared by the chef. The learning module 40 is configured to provide learning capabilities to the computer 16 that operates the robotic arms and hands.
The standardized robotic kitchen 50 is designed for detecting, recording, and emulating a chef's cooking movements, controlling significant parameters such as temperature over time, and process execution at robotic kitchen stations with designated appliances, equipment, and tools. The chef kitchen 44 provides a computing kitchen environment 16 with gloves with sensors or a costume with sensors for recording and capturing the movements of a chef 49 in the food preparation for a specific recipe. Upon recording the movements and recipe process of the chef 49 for a particular dish into a software recipe file in memory 52, the software recipe file is transferred from the chef kitchen 44 to the robotic kitchen 48 via a communication network 46, including a wireless network and/or a wired network connected to the Internet, so that the user (optional) 60 can purchase one or more software recipe files or the user can be subscribed to the chef kitchen 44 as a member that receives new software recipe files or periodic updates of existing software recipe files. The household robotic kitchen system 48 serves as a robotic computing kitchen environment at residential homes, restaurants, and other places in which the kitchen is built for the user 60 to prepare food. The household robotic kitchen system 48 includes the robotic cooking engine 56 with one or more robotic arms and hard-automation devices for replicating the chef's cooking actions, processes, and movements based on a received software recipe file from the chef studio system 44.
The chef studio 44 and the robotic kitchen 48 represent an intricately linked teach-playback system, which has multiple levels of fidelity of execution. While the chef studio 44 generates a high-fidelity process model of how to prepare a professionally cooked dish, the robotic kitchen 48 is the execution/replication engine/process for the recipe-script created through the chef working in the chef studio. Standardization of a robotic kitchen module is a means to increase performance fidelity and success/guarantee.
The varying levels of fidelity for recipe execution depend on the correlation of sensors and equipment (besides, of course, the ingredients) between those in the chef studio 44 and those in the robotic kitchen 48. Fidelity can be defined as a dish tasting identical to that prepared by a human chef (indistinguishably so) at one end of the spectrum (perfect replication/execution), while at the opposite end the dish could have one or more substantial or fatal flaws with implications for quality (overcooked meat or pasta), taste (burnt elements), edibility (incorrect consistency), or even health (undercooked meat such as chicken/pork with salmonella exposure, etc.).
A robotic kitchen that has identical hardware and sensors and actuation systems that can replicate the movements and processes akin to those by the chef that were recorded during the chef-studio cooking process is more likely to result in a higher fidelity outcome. The implication here is that the setups need to be identical, and this has a cost and volume implication. The robotic kitchen 48 can, however, still be implemented using more standardized non-computer-controlled or computer-monitored elements (pots with sensors, networked appliances, such as ovens, etc.), requiring more sensor-based understanding to allow for more complex execution monitoring. Since uncertainty has now increased as to key elements (correct amount of ingredients, cooking temperatures, etc.) and processes (use of stirrer/masher in case a blender is not available in a robotic home kitchen), the guarantees of having an identical outcome to that from the chef will undoubtedly be lower.
An emphasis in the present disclosure is that the notion of a chef studio 44 coupled with a robotic kitchen is a generic concept. The level of the robotic kitchen 48 is variable all the way from a home-kitchen outfitted with a set of arms and environmental sensors, all the way to an identical replica of the studio-kitchen, where a set of arms and articulated motions, tools, and appliances and ingredient-supply can replicate the chef's recipe in an almost identical fashion. The only variable to contend with will be the quality-degree of the end-result or dish in terms of quality, looks, taste, edibility, and health.
The correlation between the recipe outcome and the input variables in the robotic kitchen can potentially be described mathematically by the function below:
$$F_{\text{recipe-outcome}} = F_{\text{studio}}(I, E, P, M, V) + F_{\text{RobKit}}(E_f, I, R_e, P_{mf})$$
The above equation relates the degree to which the outcome of a robotically-prepared recipe matches that a human chef would prepare and serve (Frecipe-outcome) to the level that the recipe was properly captured and represented by the chef studio 44 (Fstudio) based on the ingredients (I) used, the equipment (E) available to execute the chef's processes (P) and methods (M) by properly capturing all the key variables (V) during the cooking process; and how the robotic kitchen is able to represent the replication/execution process of the robotic recipe script by a function (FRobKit) that is primarily driven by the use of the proper ingredients (I), the level of equipment fidelity (Ef) in the robotic kitchen compared to that in the chef studio, the level to which the recipe-script can be replicated (Re) in the robotic kitchen, and to what extent there is an ability and need to monitor and execute corrective actions to achieve the highest process monitoring fidelity (Pmf) possible.
The functions (Fstudio) and (FRobKit) can be any combination of linear or non-linear functional formulas with constants, variables, and any form of algorithmic relationships. An example for such algebraic representations for both functions could be:
$$F_{\text{studio}} = I(\text{fct. } \sin(\text{Temp})) + E(\text{fct. Cooktop1} \cdot 5) + P(\text{fct. Circle(spoon)}) + V(\text{fct. } 0.5 \cdot \text{time})$$
This expression indicates that the fidelity of the preparation process is related to the temperature of the ingredient, which varies over time in the refrigerator as a sinusoidal function; to the speed with which an ingredient can be heated on the cooktop at a specific station, at a particular multiplicative rate; and to how well a spoon can be moved in a circular path of a certain amplitude and period. It also indicates that the process needs to be carried out at no less than ½ the speed of the human chef for the fidelity of the preparation process to be maintained.
$$F_{\text{RobKit}} = E_f(\text{Cooktop2}, \text{Size}) + I(1.25 \cdot \text{Size} + \text{Linear}(\text{Temp})) + R_e(\text{Motion-Profile}) + P_{mf}(\text{Sensor-Suite Correspondence})$$
This expression indicates that the fidelity of the replication process in the robotic kitchen is related to the appliance type and layout for a particular cooking-area and the size of the heating-element; to the size and temperature profile of the ingredient being seared and cooked (a thicker steak requiring more cooking time); to how well the motion-profile of any stirring and bathing motions of a particular step, like searing or mousse-beating, is preserved; and to whether the correspondence between sensors in the robotic kitchen and the chef-studio is sufficiently high to trust the monitored sensor data to be accurate and detailed enough to provide a proper monitoring fidelity of the cooking process in the robotic kitchen during all steps in a recipe.
The outcome of a recipe is thus a function not only of the fidelity with which the chef studio captured the human chef's cooking steps/methods/process/skills, but also of the fidelity with which the robotic kitchen can execute them, where each has key elements that affect the performance of its respective subsystem.
The standardized (hard) automation dispenser(s) 82 is a device or a series of devices that is/are programmable and/or controllable via the cooking computer 16 to feed or provide pre-packaged (known) amounts or dedicated feeds of key materials for the cooking process, such as spices (salt, pepper, etc.), liquids (water, oil, etc.), or other dry materials (flour, sugar, etc.). The standardized hard automation dispensers 82 may be located at a specific station or may be able to be robotically accessed and triggered to dispense according to the recipe sequence. In other embodiments, a robotic hard automation module may be combined or sequenced in series or parallel with other modules, robotic arms, or cooking utensils. In this embodiment, the standardized robotic kitchen 50 includes robotic arms 70 and robotic hands 72, which are controlled by the robotic food preparation engine 56 in accordance with a software recipe file stored in the memory 52 to replicate a chef's precise movements in preparing a dish, so as to produce the same tasting dish as if the chef had prepared it himself or herself. The three-dimensional vision sensors 66 provide the capability to enable three-dimensional modeling of objects, providing a visual three-dimensional model of the kitchen activities, and scanning the kitchen volume to assess the dimensions and objects within the standardized robotic kitchen 50. The retractable safety glass 68 comprises a transparent material on the robotic kitchen 50, which when in an ON state extends around the robotic kitchen to protect surrounding human beings from the movements of the robotic arms 70 and hands 72, hot water and other liquids, steam, fire, and other dangerous influences.
The robotic food preparation engine 56 is communicatively coupled to an electronic memory 52 for retrieving a software recipe file previously sent from the chef studio system 44, which the robotic food preparation engine 56 is configured to execute in preparing and replicating the cooking method and processes of a chef as indicated in the software recipe file. The combination of robotic arms 70 and robotic hands 72 serves to replicate the precise movements of the chef in preparing a dish, so that the resulting food dish will taste identical (or substantially identical) to the same food dish prepared by the chef. The standardized cooking equipment 74 includes an assortment of cooking appliances 46 that are incorporated as part of the robotic kitchen 50, including, but not limited to, a stove/induction/cooktop (electric cooktop, gas cooktop, induction cooktop), an oven, a grill, a cooking steamer, and a microwave oven. The standardized cookware and sensors 76 are used as embodiments for the recording of food preparation steps based on the sensors on the cookware and for cooking a food dish based on the cookware with sensors, which include a pot with sensors, a pan with sensors, an oven with sensors, and a charcoal grill with sensors. The standardized cookware 78 includes frying pans, sauté pans, grill pans, multi-pots, roasters, woks, and braisers. The robotic arms 70 and the robotic hands 72 operate the standardized handles and utensils 80 in the cooking process. In one embodiment, one of the robotic hands 72 is fitted with a standardized handle, which is attached to a fork head, a knife head, and a spoon head for selection as required. The standardized hard automation dispensers 82 are incorporated into the robotic kitchen 50 to provide expedient access (for both the robot arms 70 and human users) to key and common/repetitive ingredients that are easily measured/dosed out or pre-packaged.
The standardized containers 86 are storage locations that store food at room temperature. The standardized refrigerator containers 88 refer to, but are not limited to, a refrigerator with identified containers for storing fish, meat, vegetables, fruit, milk, and other perishable items. The containers in the standardized containers 86 or standardized storages 88 can be coded with container identifiers from which the robotic food preparation engine 56 is able to ascertain the type of food in a container based on the container identifier. The standardized containers 86 provide storage space for non-perishable food items such as salt, pepper, sugar, oil, and other spices. Standardized cookware with sensors 76 and the cookware 78 may be stored on a shelf or a cabinet for use by the robotic arms 70 for selecting a cooking tool to prepare a dish. Typically, raw fish, raw meat, and vegetables are pre-cut and stored in the identified standardized storages 88. The kitchen countertop 90 provides a platform for the robotic arms 70 to handle the meat or vegetables as needed, which may or may not include cutting or chopping actions. The kitchen faucet 92 provides a kitchen sink space for washing or cleaning food in preparation for a dish. When the robotic arms 70 have completed the recipe process to prepare a dish and the dish is ready for serving, the dish is placed on a serving counter 90, which further allows for the dining environment to be enhanced by adjusting the ambient setting with the robotic arms 70, such as placement of utensils, wine glasses, and a chosen wine compatible with the meal. One embodiment of the equipment in the standardized robotic kitchen module 50 is a professional series to increase the universal appeal to prepare various types of dishes.
The standardized robotic kitchen module 50 has as one objective the standardization of the kitchen module 50 and of the various components within the kitchen module itself, to ensure consistency in both the chef kitchen 44 and the robotic kitchen 48 and thereby maximize the preciseness of recipe replication while minimizing the risks of deviations from precise replication of a recipe dish between the chef kitchen 44 and the robotic kitchen 48. One main purpose of having the standardization of the kitchen module 50 is to obtain the same result of the cooking process (or the same dish) between a first food dish prepared by the chef and a subsequent replication of the same recipe process via the robotic kitchen. Conceiving a standardized platform in the standardized robotic kitchen module 50 between the chef kitchen 44 and the robotic kitchen 48 has several key considerations: same timeline, same program or mode, and quality check. The same timeline in the standardized robotic kitchen 50, where the chef prepares a food dish at the chef kitchen 44 and the replication process is carried out by the robotic hands in the robotic kitchen 48, refers to the same sequence of manipulations, the same initial and ending time of each manipulation, and the same speed of moving an object between handling operations. The same program or mode in the standardized robotic kitchen 50 refers to the use and operation of standardized equipment during each manipulation recording and execution step. The quality check refers to three-dimensional vision sensors in the standardized robotic kitchen 50, which monitor and adjust in real time each manipulation action during the food preparation process to correct any deviation and avoid a flawed result. The adoption of the standardized robotic kitchen module 50 reduces the risks of not obtaining the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen using robotic arms and hands.
Without the standardization of a robotic kitchen module and the components within the robotic kitchen module, the increased variations between the chef kitchen 44 and the robotic kitchen 48 increase the risks of not being able to obtain the same result between the chef's prepared food dish and the food dish prepared by the robotic kitchen because more elaborate and complex adjustment algorithms will be required with different kitchen modules, different kitchen equipment, different kitchenware, different kitchen tools, and different ingredients between the chef kitchen 44 and the robotic kitchen 48.
The standardized robotic kitchen module 50 includes the standardization of many aspects. First, the standardized robotic kitchen module 50 includes standardized positions and orientations (in the XYZ coordinate space) of any type of kitchenware, kitchen containers, kitchen tools, and kitchen equipment (with standardized fixed holes in the kitchen module and device positions). Second, the standardized robotic kitchen module 50 includes a standardized cooking volume dimension and architecture. Third, the standardized robotic kitchen module 50 includes standardized equipment sets, such as an oven, a stove, a dishwasher, a faucet, etc. Fourth, the standardized robotic kitchen module 50 includes standardized kitchenware, standardized cooking tools, standardized cooking devices, standardized containers, and standardized food storage in a refrigerator, in terms of shape, dimension, structure, material, capabilities, etc. Fifth, in one embodiment, the standardized robotic kitchen module 50 includes a standardized universal handle for handling any kitchenware, tools, instruments, containers, and equipment, which enables a robotic hand to hold the standardized universal handle in only one correct position, while avoiding any improper grasps or incorrect orientations. Sixth, the standardized robotic kitchen module 50 includes standardized robotic arms and hands with a library of manipulations. Seventh, the standardized robotic kitchen module 50 includes a standardized kitchen processor for standardized ingredient manipulations. Eighth, the standardized robotic kitchen module 50 includes standardized three-dimensional vision devices for creating dynamic three-dimensional vision data, as well as other possible standard sensors, for recipe recording, execution tracking, and quality check functions.
Ninth, the standardized robotic kitchen module 50 includes standardized types, standardized volumes, standardized sizes, and standardized weights for each ingredient during a particular recipe execution.
The input module 50 is configured to receive any type of input information, such as software recipe files sent from another computing device. The calibration module 94 is configured to calibrate itself with the robotic arms 70, the robotic hands 72, and other kitchenware and equipment components within the standardized robotic kitchen module 50. The quality check module 96 is configured to determine the quality and freshness of raw meat, raw vegetables, milk-associated ingredients, and other raw foods at the time that the raw food is retrieved for cooking, as well as to check the quality of raw foods when receiving the food into the standardized food storage 88. The quality check module 96 can also be configured to conduct quality testing of an object based on senses, such as the smell of the food, the color of the food, the taste of the food, and the image or appearance of the food. The chef movements recording module 98 is configured to record the sequence and the precise movements of the chef when the chef prepares a food dish. The cookware sensor data recording module 100 is configured to record sensory data from cookware equipped with sensors (such as a pan with sensors, a grill with sensors, or an oven with sensors) placed in different zones within the cookware, thereby producing one or more sensory curves, such as a temperature curve (and/or a humidity curve), that reflect the fluctuation of the cooking appliances over time for a particular dish. The memory module 102 is configured as a storage location for storing software recipe files, for either replication of chef recipe movements or other types of software recipe files including sensory data curves. The recipe abstraction module 104 is configured to use recorded sensor data to generate machine-module specific sequenced operation profiles.
The chef movements replication module 106 is configured to replicate the chef's precise movements in preparing a dish based on the stored software recipe file in the memory 52. The cookware sensory replication module 108 is configured to replicate the preparation of a food dish by following the characteristics of one or more previously recorded sensory curves, which were generated when the chef 49 prepared a dish by using the standardized cookware with sensors 76. The robotic cooking module 110 is configured to control and operate autonomously standardized kitchen operations, minimanipulations, non-standardized objects, and the various kitchen tools and equipment in the standardized robotic kitchen 50. The real time adjustment module 112 is configured to provide real-time adjustments to the variables associated with a particular kitchen operation or a mini operation to produce a resulting process that is a precise replication of the chef movement or a precise replication of the sensory curve. The learning module 114 is configured to provide learning capabilities to the robotic cooking engine 56 to optimize the precise replication in preparing a food dish by robotic arms 70 and the robotic hands 72, as if the food dish was prepared by a chef, using a method such as case-based (robotic) learning. The minimanipulation library database module 116 is configured to store a first database library of minimanipulations. The standardized kitchen operation library database module 117 is configured to store a second database library of standardized kitchenware and information on how to operate this standardized kitchenware. The output module 118 is configured to send output computer files or control signals external to the robotic cooking engine.
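As a rough illustration of how a previously recorded sensory curve might drive a real-time adjustment, the following sketch linearly interpolates a recorded temperature curve and derives a proportional correction from the measured deviation. The function names `target_at` and `heat_correction`, the sample curve, and the gain value are hypothetical and do not appear in the disclosure; the proportional rule merely stands in for whatever adjustment logic a given embodiment employs.

```python
from bisect import bisect_right

# Hypothetical recorded curve: (seconds since start, temperature in deg C).
recorded_curve = [(0.0, 20.0), (60.0, 120.0), (180.0, 180.0), (300.0, 180.0)]


def target_at(curve, t):
    """Linearly interpolate the recorded curve at time t (clamped at both ends)."""
    times = [point[0] for point in curve]
    if t <= times[0]:
        return curve[0][1]
    if t >= times[-1]:
        return curve[-1][1]
    i = bisect_right(times, t)  # index of the first sample strictly after t
    (t0, v0), (t1, v1) = curve[i - 1], curve[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def heat_correction(curve, t, measured, gain=0.5):
    """Proportional correction: positive means add heat, negative means reduce heat."""
    return gain * (target_at(curve, t) - measured)
```

In use, a monitoring loop would call `heat_correction` at each sampling instant with the current sensor reading and apply the result to the appliance set-point.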
These individual software modules generate such information (but are not thereby limited to only these modules) as (i) chef-location and cooking-station ID via a location and configuration module 154, (ii) configuration of arms (via torso), (iii) tools handled, when and how, (iv) utensils used and locations on the station through the hardware and variable abstraction module 156, (v) processes executed with them, and (vi) variables (temperature, lid y/n, stirring, etc.) in need of monitoring through the process module 158, (vii) temporal (start/finish, type) distribution and (viii) types of processes (stir, fold, etc.) being applied, and (ix) ingredients added (type, amount, state of prep, etc.) through the cooking sequence and process abstraction module 160.
All this information is then used to create a machine-specific (not just for the robotic-arms, but also ingredient dispensers, tools, and utensils, etc.) set of recipe instructions through the stand-alone module 162, which are organized as a script of sequential/parallel overlapping tasks to be executed and monitored. This recipe-script is stored 164 alongside the entire raw data set 166 in the data storage module 168 and is made accessible to either a remote robotic cooking station through the robotic kitchen interface module 170 or a human user 172 via a graphical user interface (GUI) 174.
The robotic kitchen execution is dependent on the type of kitchen available to the user. If the robotic kitchen uses the same/identical (at least functionally) equipment as used in the chef studio, the recipe replication process is primarily one of using the raw data and playing it back as part of the recipe-script execution process. Should the kitchen, however, differ from the ideal standardized kitchen, the execution engine(s) will have to rely on the abstraction data to generate kitchen-specific execution sequences to try to achieve a similar step-by-step result.
Since the cooking process is continually monitored by all sensor units in the robotic kitchen via a monitoring process 194, regardless of whether the known studio equipment 196 or the mixed/atypical non-chef studio equipment 198 is being used, the system is able to make modifications as needed depending on a recipe progress check 200. In one embodiment of the standardized kitchen, raw data is typically played back through an execution module 188 using chef-studio type equipment, and the only adjustments that are expected are adaptations 202 in the execution of the script (repeat a certain step, go back to a certain step, slow down the execution, etc.) as there is a one-to-one correspondence between taught and played-back data-sets. However, in the case of the non-standardized kitchen, the chances are very high that the system will have to modify and adapt the actual recipe itself and its execution, via a recipe script modification module 204, to suit the available tools/appliances 192 which differ from those in the chef studio 44 or the measured deviations from the recipe script (meat cooking too slowly, hot-spots in pot burning the roux, etc.). Overall recipe-script progress is monitored using a similar process 206, which differs depending on whether chef-studio equipment 208 or mixed/atypical kitchen equipment 210 is being used.
A non-standardized kitchen is less likely to result in a dish close to one cooked by a human chef, as compared to a standardized robotic kitchen that has equipment and capabilities reflective of those used in the studio-kitchen. The ultimate subjective decision is of course that of the human (or chef) tasting, or a quality evaluation 212, which leads to a (subjective) quality decision 214.
A data process-mapping algorithm 220 uses the simpler (typically single-unit) variables to determine where the process action is taking place (cooktop and/or oven, fridge, etc.) and assigns a usage tag to any item/appliance/equipment being used whether intermittently or continuously. It associates a cooking step (baking, grilling, ingredient-addition, etc.) to a specific time-period and tracks when, where, which, and how much of what ingredient was added. This (time-stamped) information dataset is then made available for the data-melding process during the recipe-script generation process 222.
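The time-stamped dataset described above can be sketched as a simple event log. The schema below (`UsageEvent`, `ProcessMap`, and their field names) is hypothetical, and the rule that an item seen more than once is tagged "continuous" is a deliberately crude placeholder for whatever tagging criterion an embodiment actually applies.

```python
from dataclasses import dataclass, field


@dataclass
class UsageEvent:
    """One time-stamped observation from a single-unit sensor (hypothetical schema)."""
    timestamp: float     # seconds since the start of recording
    location: str        # e.g. "cooktop", "oven", "fridge"
    item: str            # appliance, tool, or ingredient involved
    step: str            # cooking step, e.g. "grilling", "ingredient-addition"
    amount: float = 0.0  # quantity added, if the event is an ingredient addition


@dataclass
class ProcessMap:
    events: list = field(default_factory=list)

    def record(self, event):
        self.events.append(event)

    def usage_tags(self):
        """Tag each item by how often it was observed (placeholder heuristic)."""
        counts = {}
        for e in self.events:
            counts[e.item] = counts.get(e.item, 0) + 1
        return {item: ("continuous" if n > 1 else "intermittent")
                for item, n in counts.items()}

    def steps_between(self, start, end):
        """Return events whose timestamp lies in [start, end], in time order."""
        hits = [e for e in self.events if start <= e.timestamp <= end]
        return sorted(hits, key=lambda e: e.timestamp)
```

A downstream script generator could then query `steps_between` for a given time window to learn which cooking step, location, and ingredient amounts apply to it.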
The data extraction and mapping process 224 is primarily focused on taking two-dimensional information (such as from monocular/single-lensed cameras) and extracting key information from it. In order to extract the important and more abstract descriptive information from each successive image, several algorithmic processes have to be applied to this dataset. Such processing steps can include (but are not limited to) edge-detection and color and texture-mapping, followed by use of the domain-knowledge in the image, coupled with object-matching information (type and size) extracted from the data reduction and abstraction process 226, to allow for the identification and location of the object (whether an item of equipment or ingredient, etc.), likewise drawing on the data reduction and abstraction process 226. This in turn allows one to associate the state (and all associated variables describing the same) and items in an image with a particular process-step (frying, boiling, cutting, etc.). Once this data has been extracted and associated with a particular image at a particular point in time, it can be passed to the recipe-script generation process 222 to formulate the sequence and steps within a recipe.
The data-reduction and abstraction engine (set of software routines) 226 is intended to reduce the larger three-dimensional data sets and extract from them key geometric and associative information. A first step is to extract from the large three-dimensional data point-cloud only the specific workspace area of importance to the recipe at that particular point in time. Once the data set has been trimmed, key geometric features will be identified by a process known as template matching. This allows for the identification of such items as horizontal tabletops, cylindrical pots and pans, arm and hand locations, etc. Once typical known (template) geometric entities are determined in a data-set, a process of object identification and matching proceeds to differentiate all items (pot vs. pan, etc.), associates the proper dimensionality (size of pot or pan, etc.) and orientation of the same, and places them within the three-dimensional world model being assembled by the computer. All of this abstracted/extracted information is then also shared with the data-extraction and mapping engine 224, prior to all being fed to the recipe-script generation engine 222.
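The first trimming step described above can be illustrated with a minimal sketch that keeps only the points of a three-dimensional point-cloud lying inside an axis-aligned workspace box, and then estimates the extent of what remains. The function names and the box representation are assumptions made for illustration; template matching and object identification are not attempted here.

```python
def crop_to_workspace(points, lower, upper):
    """Keep only the (x, y, z) points inside an axis-aligned box [lower, upper]."""
    return [
        p for p in points
        if all(lo <= c <= hi for c, lo, hi in zip(p, lower, upper))
    ]


def bounding_size(points):
    """Axis-aligned extent (dx, dy, dz) of a point set, as a crude size estimate."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))
```

For example, with coordinates in centimeters, cropping to the box from (0, 0, 8) to (10, 10, 12) discards any point whose x value exceeds 10, and `bounding_size` of the remainder gives a first estimate of the dimensionality that a later matching step would refine.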
The recipe-script generation engine process 222 is responsible for melding (blending/combining) all the available data sets into a structured and sequential cooking script with clear process-identifiers (prepping, blanching, frying, washing, plating, etc.) and process-specific steps within each, which can then be translated into robotic-kitchen machine-executable command-scripts that are synchronized based on process-completion and overall cooking time and cooking progress. Data melding will at least involve, but will not solely be limited to, the ability to take each (cooking) process step and populate the sequence of steps to be executed with the properly associated elements (ingredients, equipment, etc.), the methods and processes to be used during the process steps, and the associated key control variables (set oven/cooktop temperatures/settings) and monitoring variables (water or meat temperature, etc.) to be maintained and checked to verify proper progress and execution. The melded data is then combined into a structured sequential cooking script that will resemble a set of minimally descriptive steps (akin to a recipe in a magazine) but with a much larger set of variables associated with each element (equipment, ingredient, process, method, variable, etc.) of the cooking process at any one point in the procedure. The final step is to take this sequential cooking script and transform it into an identically structured sequential script that is translatable by a set of machines/robot/equipment within a robotic kitchen 48. It is this script that the robotic kitchen 48 uses to execute the automated recipe execution and monitoring steps.
All raw (unprocessed) and processed data as well as the associated scripts (both the structured sequential cooking-sequence script and the machine-executable cooking-sequence script) are stored in the data and profile storage unit/process 228 and time-stamped. It is from this database that the user, by way of a GUI, can select and cause the robotic kitchen to execute a desired recipe through the automated execution and monitoring engine 230, which is continually monitored by its own internal automated cooking process, with necessary adaptations and modifications to the script generated by the same and implemented by the robotic-kitchen elements, in order to arrive at a completely plated and served dish.
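One way to picture the two script forms described above is a structured step record that is flattened into ordered machine commands. The `ScriptStep` fields and the SET / RUN / CHECK command vocabulary below are invented for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptStep:
    process: str      # process identifier, e.g. "frying"
    equipment: str    # equipment used in this step
    ingredients: list
    control: dict = field(default_factory=dict)  # set-points, e.g. {"cooktop_temp_c": 180}
    monitor: dict = field(default_factory=dict)  # variables to verify, e.g. {"meat_temp_c": 55}


def to_machine_script(steps):
    """Translate each structured step into flat command tuples, preserving step order.

    The command vocabulary (SET / RUN / CHECK) is hypothetical.
    """
    commands = []
    for index, step in enumerate(steps):
        for name, value in sorted(step.control.items()):
            commands.append((index, "SET", step.equipment, name, value))
        commands.append((index, "RUN", step.equipment, step.process,
                         tuple(step.ingredients)))
        for name, value in sorted(step.monitor.items()):
            commands.append((index, "CHECK", step.equipment, name, value))
    return commands
```

Because each command carries the index of the step that produced it, the flattened script keeps the same structure as the sequential cooking script from which it was derived.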
The minimanipulation library is a command-software repository in which motion behaviors and processes are stored based on an off-line learning process that captures the arm/wrist/finger motions and sequences needed to successfully complete a particular abstract task (grab the knife and then slice; grab the spoon and then stir; grab the pot with one hand and then use the other hand to grab a spatula, get under the meat, and flip it inside the pan; etc.). This repository has been built up to contain the learned sequences of successful sensor-driven motion-profiles and sequenced behaviors for the hand/wrist (and sometimes also arm-position corrections), to ensure successful completions of object (appliance, equipment, tools) and ingredient manipulation tasks that are described in a more abstract language, such as “grab the knife and slice the vegetable”, “crack the egg into the bowl”, “flip the meat over in the pan”, etc. The learning process is iterative and is based on multiple trials of a chef-taught motion-profile from the chef studio, which is then executed and iteratively modified by the offline learning algorithm module, until an acceptable execution-sequence can be shown to have been achieved. The minimanipulation library (command software repository) is intended to have been populated (a-priori and offline) with all the necessary elements to allow the robotic-kitchen system to successfully interact with all equipment (appliances, tools, etc.) and main ingredients that require processing (steps beyond just dispensing) during the cooking process. While the human chef wore gloves with embedded haptic sensors (proximity, touch, contact-location/-force) for the fingers and palm, the robotic hands are outfitted with similar sensor-types in locations that allow their data to be used to create, modify, and adapt motion-profiles so as to execute successfully the desired motion-profiles and handling-commands.
The object-manipulation portion of the robotic-kitchen cooking process (robotic recipe-script execution software module for the interactive manipulation and handling of objects in the kitchen environment) 252 is further elaborated below. Using the robotic recipe-script database 254 (which contains data in raw, abstracted cooking-sequence, and machine-executable script forms), the recipe script executor module 256 steps through a specific recipe execution-step. The configuration playback module 258 selects and passes configuration commands through to the robot arm system (torso, arm, wrist and hands) controller 270, which then controls the physical system to emulate the required configuration (joint-positions/-velocities/-torques, etc.) values.
The notion of being able to carry out proper environment interaction manipulation and handling tasks faithfully is made possible through a real-time process-verification by way of (i) 3D world modeling as well as (ii) minimanipulation. Both the verification and manipulation steps are carried out through the addition of the robot wrist and hand configuration modifier 260. This software module uses data from the 3D world configuration modeler 262, which creates a new 3D world model at every sampling step from sensory data supplied by the multimodal sensor(s) unit(s), in order to ascertain that the configuration of the robotic kitchen systems and process matches that required by the recipe script (database); if not, it enacts modifications to the commanded system-configuration values to ensure the task is completed successfully. Furthermore, the robot wrist and hand configuration modifier 260 also uses configuration-modifying input commands from the minimanipulation motion profile executor 264. The hand/wrist (and potentially also arm) configuration modification data fed to the configuration modifier 260 are based on the minimanipulation motion profile executor 264 knowing what the desired configuration playback should be from 258, but then modifying it based on its 3D object model library 266 and the a-priori learned (and stored) data from the configuration and sequencing library 268 (which was built based on multiple iterative learning steps for all main object handling and processing steps).
While the configuration modifier 260 continually feeds modified commanded configuration data to the robot arm system controller 270, it relies on the handling/manipulation verification software module 272 to verify not only that the operation is proceeding properly but also whether continued manipulation/handling is necessary. If continued manipulation/handling is necessary (answer ‘N’ to the decision), the configuration modifier 260 re-requests configuration-modification (for the wrist, hands/fingers and potentially the arm and possibly even torso) updates from both the world modeler 262 and the minimanipulation profile executor 264. The goal is simply to verify that a manipulation/handling step or sequence has been successfully completed. The handling/manipulation verification software module 272 carries out this check by using the knowledge of the recipe script database F2 and the 3D world configuration modeler 262 to verify the appropriate progress in the cooking step currently being commanded by the recipe script executor 256. Once progress has been deemed successful, the recipe script index increment process 274 notifies the recipe script executor 256 to proceed to the next step in the recipe-script execution.
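The execute, verify, and increment cycle described above can be sketched as a simple loop. The function `run_recipe`, its callback signatures, and the retry limit are assumptions introduced for illustration; the disclosure does not specify a bound on the number of re-requested modifications.

```python
def run_recipe(steps, execute, verify, max_retries=3):
    """Step through a recipe script, advancing only after verification passes.

    `execute(step, attempt)` stands in for commanding the hardware, and
    `verify(step)` returns True once the world model shows the step succeeded.
    Both callbacks and `max_retries` are hypothetical.
    """
    log = []
    index = 0
    while index < len(steps):
        step = steps[index]
        for attempt in range(1, max_retries + 1):
            execute(step, attempt)
            if verify(step):
                log.append((step, attempt))
                break
        else:
            # No attempt passed verification within the allowed retries.
            raise RuntimeError(f"step {step!r} failed after {max_retries} attempts")
        index += 1  # analogous to the recipe script index increment
    return log
```

The returned log records how many attempts each step required, which a learning component could later use to refine the stored motion-profiles.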
The multimodal sensor-unit(s) 302, comprising, but not limited to, video cameras 304, IR cameras and rangefinders 306, stereo (or even trinocular) camera(s) 308 and multi-dimensional scanning lasers 310, provide multi-spectral sensory data to the main software abstraction engines 312 (after being acquired & filtered in the data acquisition and filtering module 314). The data is used in a scene understanding module 316 to carry out multiple steps such as (but not limited to) building high- and lower-resolution (laser: high-resolution; stereo-camera: lower-resolution) three-dimensional surface volumes of the scene, with superimposed visual and IR-spectrum color and texture video information, allowing edge-detection and volumetric object-detection algorithms to infer what elements are in a scene, allowing the use of shape-/color-/texture- and consistency-mapping algorithms to run on the processed data to feed processed information to the Kitchen Cooking Process Equipment Handling Module 318. In the module 318, software-based engines are used for the purpose of identifying and three-dimensionally locating the position and orientation of kitchen tools and utensils and identifying and tagging recognizable food elements (meat, carrots, sauce, liquids, etc.) so as to generate data to let the computer build and understand the complete scene at a particular point in time so as to be used for next-step planning and process monitoring. Engines required to achieve such data and information abstraction include, but are not limited to, grasp reasoning engines, robotic kinematics and geometry reasoning engines, physical reasoning engines and task reasoning engines. Output data from both engines 316 and 318 are then used to feed the scene modeler and content classifier 320, where the 3D world model is created with all the key content required for executing the robotic cooking script executor. 
Once the fully-populated model of the world is understood, it can be used to feed the motion and handling planner 322 (if robotic-arm grasping and handling are necessary, the same data can be used to differentiate and plan for grasping and manipulating food and kitchen items depending on the required grip and placement) to allow for planning motions and trajectories for the arm(s) and attached end-effector(s) (grippers, multi-fingered hands). A follow-on Execution Sequence planner 324 creates the proper sequencing of task-based commands for all individual robotic/automated kitchen elements, which are then used by the robotic kitchen actuation systems 326. The entire sequence above is repeated in a continuous closed loop during the robotic recipe-script execution and monitoring phase.
Some suitable robotic hands that can be modified for use with the robotic kitchen 48 include Shadow Dexterous Hand and Hand-Lite designed by Shadow Robot Company, located in London, the United Kingdom; a servo-electric 5-finger gripping hand SVH designed by SCHUNK GmbH & Co. KG, located in Lauffen/Neckar, Germany; and DLR HIT HAND II designed by DLR Robotics and Mechatronics, located in Cologne, Germany.
Several robotic arms 72 are suitable for modification to operate with the robotic kitchen 48, including the UR3 Robot and UR5 Robot by Universal Robots A/S, located in Odense S, Denmark; industrial robots with various payloads designed by KUKA Robotics, located in Augsburg, Bavaria, Germany; and industrial robot arm models designed by Yaskawa Motoman, located in Kitakyushu, Japan.
In some embodiments, a chef performs the same food preparation operation multiple times, yielding values of the sensor reading, and parameters in the corresponding robotic instructions that vary somewhat from one time to the next. The set of sensor readings for each sensor across multiple repetitions of the preparation of the same food dish provides a distribution with a mean, standard deviation and minimum and maximum values. The corresponding variations on the robotic instructions (also called the effector parameters) across multiple executions of the same food dish by the chef also define distributions with mean, standard deviation, minimum and maximum values. These distributions may be used to determine the fidelity (or accuracy) of subsequent robotic food preparations.
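As a minimal sketch of how such a distribution might be summarized, the following Python fragment computes the mean, standard deviation, minimum and maximum for one parameter across repetitions. The function name `summarize`, the sample readings and the use of the sample (rather than population) standard deviation are assumptions made for illustration only.

```python
from statistics import mean, stdev


def summarize(readings):
    """Summarize one sensor reading or effector parameter across repetitions."""
    return {
        "mean": mean(readings),
        "std": stdev(readings),  # sample standard deviation (an assumption)
        "min": min(readings),
        "max": max(readings),
    }


# Hypothetical pan-temperature readings (deg C) from five repetitions by the chef.
pan_temperature = [178.0, 182.0, 180.0, 181.0, 179.0]
stats = summarize(pan_temperature)
print(stats)
```

The minimum and maximum obtained this way could, for example, supply the maximal difference used to normalize the error when comparing a robotic execution against the chef.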
In one embodiment, the estimated average accuracy of a robotic food preparation operation is given by:
$$A(C, R) = 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{|c_i - r_i|}{\max |c_i - r_i|}$$
where C represents the set of chef parameters (1st through nth) and R represents the set of robotic apparatus parameters (correspondingly 1st through nth), with c_i and r_i denoting the ith chef and robotic parameters. The numerator in the sum represents the difference between the robotic and chef parameters (i.e., the error), and the denominator normalizes for the maximal difference. The sum gives the total normalized cumulative error, and multiplying by 1/n gives the average error. The complement of the average error corresponds to the average accuracy.
Another version of the accuracy calculation weights the parameters for importance, where each coefficient α_i represents the importance of the ith parameter. The normalized cumulative error is
$$\sum_{i=1}^{n} \alpha_i \frac{|c_i - r_i|}{\max |c_i - r_i|}$$
and the estimated average accuracy is given by:
$$A(C, R) = 1 - \frac{\sum_{i=1}^{n} \alpha_i \frac{|c_i - r_i|}{\max |c_i - r_i|}}{\sum_{i=1}^{n} \alpha_i}$$
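The accuracy calculation described above can be sketched in Python as follows. This is an illustrative sketch only: the function name `average_accuracy`, the sample values, and the choice to normalize the weighted error by the sum of the weights (which reduces to the 1/n factor when every weight equals one) are assumptions rather than details taken from the disclosure.

```python
def average_accuracy(chef, robot, max_diff, weights=None):
    """Return 1 minus the (optionally weighted) mean normalized error.

    chef, robot : corresponding chef and robotic parameter values
    max_diff    : maximal difference used to normalize each parameter
    weights     : importance coefficients; all ones if omitted
    """
    if not (len(chef) == len(robot) == len(max_diff)):
        raise ValueError("parameter lists must have equal length")
    if weights is None:
        weights = [1.0] * len(chef)  # unweighted case
    errors = [abs(c - r) / m for c, r, m in zip(chef, robot, max_diff)]
    weighted_error = sum(a * e for a, e in zip(weights, errors))
    return 1.0 - weighted_error / sum(weights)


# Hypothetical values: a temperature (deg C) and a stirring time (s).
chef = [180.0, 30.0]
robot = [175.0, 33.0]
max_diff = [20.0, 10.0]
unweighted = average_accuracy(chef, robot, max_diff)
weighted = average_accuracy(chef, robot, max_diff, weights=[3.0, 1.0])
print(unweighted, weighted)
```

In this example the temperature is treated as three times as important as the stirring time, so its smaller normalized error raises the weighted accuracy slightly above the unweighted one.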
As depicted in
A person of skill in the art will appreciate that the probability of overall success can be low even if the probability of success of individual stages is relatively high. For instance, given 10 stages and a probability of success of 90% for each stage, the probability of overall success is (0.9)^10 ≈ 0.35, or 35%.
A stage in preparing a food dish comprises one or more minimanipulations, where each minimanipulation comprises one or more robotic actions leading to a well-defined intermediate result. For instance, slicing a vegetable can be a minimanipulation comprising grasping the vegetable with one hand, grasping a knife with the other, and applying repeated knife movements until the vegetable is sliced. A stage in preparing a dish can comprise one or multiple slicing minimanipulations.
The probability of success formula applies equally well at the level of stages and at the level of minimanipulations, so long as each minimanipulation is relatively independent of other minimanipulations.
In one embodiment, in order to mitigate the problem of reduced certainty of success due to potential compounding errors, standardized methods for most or all of the minimanipulations in all of the stages are recommended. Standardized operations are ones that can be pre-programmed, pre-tested, and, if necessary, pre-adjusted to select the sequence of operations with the highest probability of success. Hence, if the probability of success of the standardized minimanipulations within the stages is very high, then, owing to the prior work of perfecting and testing all of the steps, so is the overall probability of success of preparing the food dish. For instance, to return to the above example, if each stage utilizes reliable standardized methods and its success probability is 99% (instead of 90% as in the earlier example), then the overall probability of success is (0.99)^10 ≈ 90.4%, assuming there are 10 stages as before. This is clearly better than the roughly 35% probability of an overall correct outcome obtained at 90% per stage.
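The compounding effect discussed above can be checked with a few lines of Python. This is a sketch that assumes the stages succeed or fail independently of one another; the function name `chain_success` is hypothetical.

```python
from math import prod


def chain_success(stage_probabilities):
    """Probability that every stage in a chain succeeds (independence assumed)."""
    return prod(stage_probabilities)


print(round(chain_success([0.90] * 10), 3))  # prints 0.349
print(round(chain_success([0.99] * 10), 3))  # prints 0.904
```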
In another embodiment, more than one alternative method is provided for each stage, wherein, if one alternative fails, another alternative is tried. This requires dynamic monitoring to determine the success or failure of each stage, and the ability to have an alternate plan. The probability of success for that stage is the complement of the probability of failure for all of the alternatives, which mathematically is written as:
$$P(s_i \mid A(s_i)) = 1 - \prod_{a_j \in A(s_i)} \bigl(1 - P(s_i \mid a_j)\bigr)$$
In the above expression, s_i is the stage and A(s_i) is the set of alternatives for accomplishing s_i. The probability of failure for a given alternative a_j is the complement of the probability of success for that alternative, namely 1 − P(s_i | a_j), and the probability of all the alternatives failing is the product in the above formula. Hence, the probability that not all will fail is the complement of the product. Using the method of alternatives, the overall probability of success can be estimated as the product of the success probabilities of each stage with alternatives, namely:
$$P(S) = \prod_{s_i \in S} P(s_i \mid A(s_i))$$
where S is the set of stages.
With this method of alternatives, if each of the 10 stages had 4 alternatives, and the expected success of each alternative for each stage was 90%, then the overall probability of success would be (1 − (1 − 0.9)^4)^10 ≈ 0.999, or about 99.9%, versus just 35% without the alternatives. The method of alternatives transforms the original problem from a chain of stages with multiple single points of failure (if any stage fails) to one without single points of failure, since all the alternatives would need to fail in order for any given stage to fail, providing more robust outcomes.
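The method of alternatives can be sketched in Python as follows, assuming that the alternatives within a stage, and the stages themselves, fail independently. The function names and the mixed example at the end are hypothetical and serve only to illustrate the calculation.

```python
from math import prod


def stage_success(alternative_probabilities):
    """A stage succeeds unless every one of its alternatives fails."""
    return 1.0 - prod(1.0 - p for p in alternative_probabilities)


def overall_success(stages):
    """Overall success is the product of the per-stage success probabilities."""
    return prod(stage_success(alternatives) for alternatives in stages)


# Ten stages, each with four alternatives that succeed 90% of the time.
uniform = overall_success([[0.9] * 4 for _ in range(10)])
print(round(uniform, 4))  # prints 0.999

# Hypothetical mixed case: eight reliable standardized stages without
# alternatives, and two less reliable stages with three alternatives each.
mixed = overall_success([[0.99]] * 8 + [[0.9] * 3] * 2)
print(mixed)
```

Representing each stage as its own list of alternatives also covers the case in which only some stages are given alternatives, since a stage without alternatives is simply a list with one entry.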
In another embodiment, both standardized stages, comprising standardized minimanipulations, and alternate means for the food dish preparation stages are combined, yielding a behavior that is even more robust. In such a case, the corresponding probability of success can be very high, even if alternatives are only present for some of the stages or minimanipulations.
In another embodiment only the stages with lower probability of success are provided alternatives, in case of failure, for instance stages for which there is no very reliable standardized method, or for which there is potential variability, e.g. depending on odd-shaped materials. This embodiment reduces the burden of providing alternatives to all stages.
A predefined minimanipulation is available to achieve each functional result (e.g., the egg is cracked). Each minimanipulation comprises a collection of action primitives which act together to accomplish the functional result. For example, the robot may begin by moving its hand towards the egg, touching the egg to localize its position and verify its size, and executing the movements and sensing actions necessary to grasp and lift the egg into the known and predetermined configuration.
Multiple minimanipulations may be collected into stages such as making a sauce for convenience in understanding and organizing the recipe. The end result of executing all of the minimanipulations to complete all of the stages is that a food dish has been replicated with a consistent result each time.
The robotic hand 72 has the RGB-D sensor 500 placed in or near the middle of the palm for detecting the distance and shape of an object, and for handling a kitchen tool. The RGB-D sensor 500 guides the robotic hand 72 in moving toward the object and in making the adjustments necessary to grab it. In addition, a sonar sensor 502f and/or a tactile pressure sensor are placed near the palm of the robotic hand 72 for detecting the distance and shape of the object and subsequent contact with it. The sonar sensor 502f can also guide the robotic hand 72 to move toward the object. Additional types of sensors in the hand may include ultrasonic sensors, lasers, radio frequency identification (RFID) sensors, and other suitable sensors. The tactile pressure sensor serves as a feedback mechanism for determining whether the robotic hand 72 should continue to exert additional pressure on the object or has reached the point at which there is sufficient pressure to safely lift it. In addition, the sonar sensor 502f in the palm of the robotic hand 72 provides a tactile sensing function for grabbing and handling a kitchen tool. For example, when the robotic hand 72 grabs a knife to cut beef, the tactile sensor can detect the amount of pressure that the robotic hand exerts on the knife and applies to the beef, and can detect when the knife finishes slicing the beef, i.e., when the knife meets no resistance. The same sensing applies when holding an object: the pressure is distributed so as to secure the object without breaking it (e.g., an egg).
Furthermore, each finger on the robotic hand 72 has haptic vibration sensors 502a-e and sonar sensors 504a-e on the respective fingertips, as shown by a first haptic vibration sensor 502a and a first sonar sensor 504a on the fingertip of the thumb, a second haptic vibration sensor 502b and a second sonar sensor 504b on the fingertip of the index finger, a third haptic vibration sensor 502c and a third sonar sensor 504c on the fingertip of the middle finger, a fourth haptic vibration sensor 502d and a fourth sonar sensor 504d on the fingertip of the ring finger, and a fifth haptic vibration sensor 502e and a fifth sonar sensor 504e on the fingertip of the pinky. Each of the haptic vibration sensors 502a, 502b, 502c, 502d and 502e can simulate different surfaces and effects by varying the shape, frequency, amplitude, duration and direction of a vibration. Each of the sonar sensors 504a, 504b, 504c, 504d and 504e provides sensing capability on the distance and shape of the object, sensing capability for the temperature or moisture, as well as feedback capability. Additional sonar sensors 504g and 504h are placed on the wrist of the robotic hand 72.
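The pressure feedback described above for the tactile sensor, in which the hand tightens until there is enough pressure to lift the object but not enough to break it, might be sketched as the following loop. All names, thresholds and the simulated hand are hypothetical; the units are arbitrary and no particular sensor interface is implied.

```python
def grip_until_secure(read_pressure, increase_grip,
                      lift_threshold, break_threshold, max_steps=100):
    """Tighten the grip step by step until the measured pressure suffices to lift.

    Raises RuntimeError if the pressure reaches the level at which the object
    could break, or if the lifting pressure is not reached in max_steps steps.
    """
    for _ in range(max_steps):
        pressure = read_pressure()
        if pressure >= break_threshold:
            raise RuntimeError("pressure reached the safe limit for this object")
        if pressure >= lift_threshold:
            return pressure  # sufficient pressure to lift safely
        increase_grip()
    raise RuntimeError("could not reach lifting pressure")


class SimulatedHand:
    """Stand-in for a hand with a tactile pressure sensor (illustration only)."""

    def __init__(self, step=0.5):
        self.pressure = 0.0
        self.step = step

    def read_pressure(self):
        return self.pressure

    def increase_grip(self):
        self.pressure += self.step


hand = SimulatedHand()
final_pressure = grip_until_secure(hand.read_pressure, hand.increase_grip,
                                   lift_threshold=2.0, break_threshold=5.0)
print(final_pressure)  # prints 2.0
```

For a fragile object such as an egg, the break threshold would be set close to the lift threshold, so a smaller grip step would be needed to stop between the two.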
The shape of the deformable palm will be described using locations of feature points relative to a fixed reference frame, as shown in
Feature points are measured by calibrated cameras mounted in the workspace as the chef performs cooking tasks. Trajectories of feature points in time are used to match the chef motion with the robot motion, including matching the shape of the deformable palm. Trajectories of feature points from the chef's motion may also be used to inform robot deformable palm design, including shape of the deformable palm surface and placement and range of motion of the joints of the robot hand.
As illustrated in
Shape feature point locations are determined based on sensor signals. The sensors provide an output that allows calculation of distance in a reference frame, which is attached to the magnet, which furthermore is attached to the hand of the robot or the chef.
The three-dimensional location of each shape feature point is calculated based on the sensor measurements and known parameters obtained from sensor calibration. The shape of the deformable palm is represented by a vector of three-dimensional shape feature points, all of which are expressed in the reference coordinate frame, which is fixed to the hand of the robot or the chef. For additional information on common contact regions on the human hand and their function in grasping, see Kamakura, Noriko, Michiko Matsuo, Harumi Ishii, Fumiko Mitsuboshi, and Yoriko Miura. “Patterns of static prehension in normal hands.” American Journal of Occupational Therapy 34, no. 7 (1980): 437-445, which is incorporated by reference herein in its entirety.
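As an illustration of expressing shape feature points in a frame fixed to the hand, the following Python sketch converts points measured in a workspace frame into hand-frame coordinates. It assumes, for illustration only, that the hand pose is known as an origin plus three orthonormal axis vectors given in workspace coordinates; the function names and sample values are hypothetical.

```python
def to_hand_frame(point_world, hand_origin, hand_axes):
    """Express a workspace-frame point in the hand-fixed reference frame.

    hand_axes holds the hand frame's x, y and z unit vectors in workspace
    coordinates; projecting the offset onto each axis gives the coordinates.
    """
    offset = [p - o for p, o in zip(point_world, hand_origin)]
    return tuple(sum(a * d for a, d in zip(axis, offset)) for axis in hand_axes)


def palm_shape(points_world, hand_origin, hand_axes):
    """Return the palm shape as a list of hand-frame feature points."""
    return [to_hand_frame(p, hand_origin, hand_axes) for p in points_world]


# Hypothetical hand pose: frame rotated 90 degrees about the workspace z axis.
origin = (1.0, 2.0, 0.0)
axes = ((0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
shape = palm_shape([(1.0, 3.0, 0.5), (0.0, 2.0, 0.0)], origin, axes)
print(shape)
```

Because the points are expressed relative to the hand, the same shape vector is obtained regardless of where the hand is in the workspace, which is what allows the chef's palm shape to be compared with the robot's.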
The robotic hand 72 includes a camera sensor 684, such as an RGB-D sensor, an imaging sensor or a visual sensing device, placed in or near the middle of the palm for detecting the distance and shape of an object, and for handling a kitchen tool. The imaging sensor 684 guides the robotic hand 72 in moving toward the object and in making the adjustments necessary to grab it. In addition, a sonar sensor, such as a tactile pressure sensor, may be placed near the palm of the robotic hand 72 for detecting the distance and shape of the object. The sonar sensor 682f can also guide the robotic hand 72 to move toward the object. Each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g may include ultrasonic sensors, lasers, radio frequency identification (RFID) sensors, and other suitable sensors. In addition, each of the sonar sensors 682a, 682b, 682c, 682d, 682e, 682f, 682g serves as a feedback mechanism for determining whether the robotic hand 72 should continue to exert additional pressure on the object or has reached the point at which there is sufficient pressure to grab and lift it. In addition, the sonar sensor 682f in the palm of the robotic hand 72 provides a tactile sensing function for handling a kitchen tool. For example, when the robotic hand 72 grabs a knife to cut beef, the amount of pressure that the robotic hand 72 exerts on the knife and applies to the beef allows the tactile sensor to detect when the knife finishes slicing the beef, i.e., when the knife meets no resistance. The pressure is distributed not only to secure the object, but also to avoid exerting so much pressure as to damage it (for example, breaking an egg).
Furthermore, each finger on the robotic hand 72 has a sensor on the fingertip, as shown by the first sensor 682a on the fingertip of the thumb, the second sensor 682b on the fingertip of the index finger, the third sensor 682c on the fingertip of the middle finger, the fourth sensor 682d on the fingertip of the ring finger, and the fifth sensor 682e on the fingertip of the pinky. Each of the sensors 682a, 682b, 682c, 682d, 682e provides sensing capability on the distance and shape of the object, sensing capability for temperature or moisture, as well as tactile feedback capability.
The RGB-D sensor 684 and the sonar sensor 682f in the palm, plus the sonar sensors 682a, 682b, 682c, 682d, 682e in the fingertip of each finger, provide a feedback mechanism to the robotic hand 72 as a means to grab a non-standardized object, or a non-standardized kitchen tool. The robotic hands 72 may adjust the pressure to a sufficient degree to grab ahold of the non-standardized object. A program library 690 that stores sample grabbing functions 692, 694, 696 according to a specific time interval for which the robotic hand 72 can draw from in performing a specific grabbing function, is illustrated in
To create the minimanipulation that results in cracking an egg with a knife, multiple parameter combinations must be tested to identify a set of parameters that ensure the desired functional result—that the egg is cracked—is achieved. In this example, parameters are identified to determine how to grasp and hold an egg in such a way so as not to crush it. An appropriate knife is selected through testing, and suitable placements are found for the fingers and palm so that it may be held for striking. A striking motion is identified that will successfully crack an egg. An opening motion and/or force are identified that allows a cracked egg to be opened successfully.
The teaching/learning process for the robotic apparatus 75 involves multiple and repetitive tests to identify the necessary parameters to achieve the desired final functional result.
These tests may be performed over varying scenarios. For example, the size of the egg can vary. The location at which it is to be cracked can vary. The knife may be at different locations. The minimanipulations must be successful in all of these variable circumstances.
Once the learning process has been completed, results are stored as a collection of action primitives that together are known to accomplish the desired functional result.
As an example of the operative relationship between the creation of a minimanipulation in
At step 862, the computer 16 tests and validates the specific successful parameter combination X number of times (such as one hundred times). At step 864, the computer 16 computes the number of failed results during the repeated test of the specific successful parameter combination. At step 866, the computer 16 selects the next one-time successful parameter combination from the temporary library, and returns the process to step 862 for testing the next one-time successful parameter combination X number of times. If no further one-time successful parameter combination remains, the computer 16 stores, at step 868, the test results of one or more sets of parameter combinations that produce a reliable (or guaranteed) result. If there is more than one reliable set of parameter combinations, at step 870, the computer 16 determines the best or optimal set of parameter combinations and stores the optimal set, which is associated with the specific minimanipulation, for use in the minimanipulation library database by the robotic apparatus 75 in the standardized robotic kitchen 50 during the food preparation stages of a recipe.
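Steps 862 through 870 can be sketched in Python as a validation loop. This is a hedged illustration: the function names, the `max_failures` tolerance, the rule that the best set is the one with the fewest failures, and the simulated trial are all assumptions, since the criteria for a reliable or optimal set are not spelled out here.

```python
def validate_combinations(candidates, run_trial, trials=100, max_failures=0):
    """Re-test each one-time successful combination and pick the best one."""
    results = []
    for params in candidates:                      # step 866: next candidate
        failures = sum(1 for _ in range(trials)    # steps 862 and 864
                       if not run_trial(params))
        results.append((params, failures))
    # Step 868: keep the combinations regarded as reliable.
    reliable = [(p, f) for p, f in results if f <= max_failures]
    # Step 870: assume the best set is the one with the fewest failures.
    best = min(reliable, key=lambda pf: pf[1])[0] if reliable else None
    return reliable, best


def make_simulated_trial():
    """Hypothetical stand-in for a physical test of a parameter combination."""
    attempts = {}

    def run_trial(params):
        n = attempts.get(params["name"], 0) + 1
        attempts[params["name"]] = n
        k = params["fail_every"]       # fails on every k-th attempt, if set
        return k is None or n % k != 0

    return run_trial


candidates = [
    {"name": "A", "fail_every": 10},
    {"name": "B", "fail_every": None},
    {"name": "C", "fail_every": 50},
]
reliable, best = validate_combinations(candidates, make_simulated_trial(),
                                       trials=100, max_failures=2)
print([(p["name"], f) for p, f in reliable], best["name"])
```

In practice `run_trial` would drive the robotic apparatus and check whether the functional result was achieved; the simulated version above only makes the example deterministic.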
The standardized robotic kitchen 50 in
The proper placement of the augmented sensor system 1152 somewhere in the robotic kitchen, such as on a computer-controllable railing or on the torso of a robot with arms and hands, allows for 3D-tracking and raw data generation, both during chef-monitoring for machine-specific recipe-script generation and during monitoring of the progress and successful completion of the robotically-executed steps in the stages of the dish replication in the standardized robotic kitchen 50.
The standardized robotic kitchen 50 depicts another possible configuration for the use of one or more augmented sensor systems 20. The standardized robotic kitchen 50 shows a multitude of augmented sensor systems 20 placed in the corners above the kitchen work-surface along the length of the kitchen axis with the intent to effectively cover the complete visible three-dimensional workspace of the standardized robotic kitchen 50.
The proper placement of the augmented sensor system 20 in the standardized robotic kitchen 50 allows for three-dimensional sensing using video-cameras, lasers, sonars and other two- and three-dimensional sensor systems. This enables the collection of raw data to assist in the creation of processed data for real-time dynamic models of shape, location, orientation and activity for robotic arms, hands, tools, equipment and appliances, as they relate to the different steps in the multiple sequential stages of dish replication in the standardized robotic kitchen 50.
Raw data is collected at each point in time so that, in a step 1162, it can be processed to extract the shape, dimension, location and orientation of all objects of importance to the different steps in the multiple sequential stages of dish replication in the standardized robotic kitchen 50. The processed data is further analyzed by the computer system to allow the controller of the standardized robotic kitchen to adjust robotic arm and hand trajectories and minimanipulations by modifying the control signals defined by the robotic script. Adaptations to the recipe-script execution, and thus to the control signals, are essential to successfully completing each stage of the replication for a particular dish, given the potential variability of many variables (ingredients, temperature, etc.). The process of recipe-script execution based on key measurable variables is an essential part of the use of the augmented (also termed multi-modal) sensor system 20 during the execution of the replicating steps for a particular dish in a standardized robotic kitchen 50.
In
The top level 1292-1 contains multiple cabinet-type modules with different units to perform specific kitchen functions by way of built-in appliances and equipment. At the simplest level, this includes a shelf/cabinet storage area 1294, a cabinet volume 1296 used for storing and accessing cooking tools and utensils and other cooking and serving ware (cooking, baking, plating, etc.), a storage ripening cabinet volume 1298 for particular ingredients (e.g., fruit and vegetables), a chilled storage zone 1300 for such items as lettuce and onions, a frozen storage cabinet volume 1302 for deep-frozen items, another storage pantry zone 1304 for other ingredients and rarely used spices, a hard automation ingredient supplier 1305, and others.
The counter level 1292-2 not only houses the robotic arms 70, but also includes a serving counter 1306, a counter area with a sink 1308, another counter area 1310 with removable working surfaces (cutting/chopping board, etc.), a charcoal-based slatted grill 1312 and a multi-purpose area for other cooking appliances 1314, including a stove, cooker, steamer and poacher.
The lower level 1292-3 houses the combination convection oven and microwave 1316, the dish-washer 1318 and a larger cabinet volume 1320 that holds and stores additional frequently used cooking and baking ware, as well as tableware and packing materials and cutlery.
The perspective view of the robotic kitchen 50 clearly identifies one of the many possible layouts and locations for equipment at all three levels, including the top level 1292-1 (storage pantry 1304, standardized cooking tools and ware 1320, storage ripening zone 1298, chilled storage zone 1300, and frozen storage zone 1302), the counter level 1292-2 (robotic arms 70, sink 1308, chopping/cutting area 1310, charcoal grill 1312, cooking appliances 1314 and serving counter 1306) and the lower level 1292-3 (dish-washer 1318 and oven and microwave 1316).
The top level contains multiple cabinet-type modules with different units to perform specific kitchen functions by way of built-in appliances and equipment. At the simplest level this includes a cabinet volume 1296 used for storing and accessing standardized cooking tools and utensils and other cooking and serving ware (cooking, baking, plating, etc.), a storage ripening cabinet volume 1298 for particular ingredients (e.g. fruit and vegetables, etc.), a chilled storage zone 1300 for such items as lettuce and onions, a frozen storage cabinet volume 86 for deep-frozen items, and another storage pantry zone 1294 for other ingredients and rarely used spices, etc. Each of the modules within the top level contains sensor units 1884 providing data to one or more control units 1886, either directly or by way of one or more central or distributed control computers, to allow for computer-controlled operations.
The counter level 1292-2 houses not only monitoring sensors 1884 and control units 1886, but also visual command monitoring devices 1316, and includes a counter area with a sink and electronic faucet 1308, another counter area 1310 with removable working surfaces (cutting/chopping board, etc.), a (smart) charcoal-based slatted grill 1312 and a multi-purpose area for other cooking appliances 1314, including a stove, cooker, steamer and poacher. Each of the modules within the counter level contains sensor units 1884 providing data to one or more control units 1886, either directly or by way of one or more central or distributed control computers, to allow for computer-controlled operations. Additionally, one or more visual command monitoring devices (not shown) are provided within the counter level for monitoring the visual operations of the human chef in the studio kitchen as well as of the robotic arms or human user in the standardized robotic kitchen, where data is fed to one or more central or distributed computers for processing, and subsequent corrective or supportive feedback and commands are sent back to the robotic kitchen for display or script-following execution.
The lower level 1292-3 houses the combination convection oven and microwave as well as steamer, poacher and grill 1316, the dish-washer 1318, the hard automation controlled ingredient dispensers 86 (not shown), and a larger cabinet volume 1309 that holds and stores additional frequently used cooking and baking ware, as well as tableware, flatware, utensils (whisks, knives, etc.) and cutlery. Each of the modules within the lower level contains sensor units 1307 providing data to one or more control units 376, either directly or by way of one or more central or distributed control computers, to allow for computer-controlled operations.
The computer module 2374 includes modules that include, but are not limited to, a robotic painting engine 2376 interfaced to a painting movement emulator 2378, a painting control module 2380 that acts based on visual feedback of the painting execution processes, a memory module 2382 to store painting execution program files, algorithms 2384 for learning the selection and usage of the appropriate drawing tools, as well as an extended simulation validation and calibration module 2386.
One embodiment of the art platform standardization is defined as follows. First, the position and orientation (xyz) of every kind of art tool (brushes, paints, canvas, etc.) are standardized in the art platform. Second, the operation volume dimensions and architecture are standardized in each art platform. Third, a standardized set of art tools is provided in each art platform. Fourth, standardized robotic arms and hands with a library of manipulations are provided in each art platform. Fifth, standardized three-dimensional vision devices are provided in each art platform for creating dynamic three-dimensional vision data for painting recording, execution tracking and quality checking. Sixth, the type, producer and mark of all paints used during a particular painting execution are standardized. Seventh, the type, producer, mark and size of the canvas used during a particular painting execution are standardized.
One main purpose of the standardized art platform is to achieve the same result of the painting process (i.e., the same painting) when executed by the original painter and afterward duplicated by the robotic art platform. Several main points should be emphasized in using the standardized art platform: (1) the painter and the automatic robotic execution have the same timeline (same sequence of manipulations, same initial and ending time of each manipulation, same speed of moving objects between manipulations); and (2) quality checks (3D vision, sensors) are performed after each manipulation during the painting process to avoid any failed result. Therefore, the risk of not having the same result is reduced if the painting is done at the standardized art platform. If a non-standardized art platform is used, the risk of not having the same result (i.e., not the same painting) increases, because adjustment algorithms may be required when the painting is not executed in the same volume, with the same art tools, with the same paint or with the same canvas in the painter's studio as in the robotic art platform.
In the case where the human wants to be intimately involved in the selection of the title/composer, the system provides a list of performers for the selected title to the human on a display in step 2503. In step 2504 the user selects the desired performer, a choice input that the system receives in step 2505. In step 2506, the robotic musician engine generates and uploads the instrument playing execution program files, and proceeds in step 2507 to compare potential limitations between a human and a robotic musician's playing performance on a particular instrument, thereby allowing it to calculate a potential performance gap. A checking step 2508 decides whether there exists a gap. Should there be a gap, the system will suggest other selections based on the user's preference profile in step 2509. Should there be no performance gap, the robotic musician engine will confirm the selection in step 2510 and allow the user to proceed to step 2511, where the user may select the ‘start’ button to play the program file for the selection.
In general terms, there may be considered a method of motion capture and analysis for a robotics system, comprising sensing a sequence of observations of a person's movements by a plurality of robotic sensors as the person prepares a product using working equipment; detecting in the sequence of observations minimanipulations corresponding to a sequence of movements carried out in each stage of preparing the product; transforming the sensed sequence of observations into computer readable instructions for controlling a robotic apparatus capable of performing the sequences of minimanipulations; storing at least the sequence of instructions for minimanipulations to electronic media for the product. This may be repeated for multiple products. The sequence of minimanipulations for the product is preferably stored as an electronic record. The minimanipulations may be abstraction parts of a multi-stage process, such as cutting an object, heating an object (in an oven or on a stove with oil or water), or similar. Then, the method may further comprise transmitting the electronic record for the product to a robotic apparatus capable of replicating the sequence of stored minimanipulations, corresponding to the original actions of the person. Moreover, the method may further comprise executing the sequence of instructions for minimanipulations for the product by the robotic apparatus 75, thereby obtaining substantially the same result as the original product prepared by the person.
In another general aspect, there may be considered a method of operating a robotics apparatus, comprising providing a sequence of pre-programmed instructions for standard minimanipulations, wherein each minimanipulation produces at least one identifiable result in a stage of preparing a product; sensing a sequence of observations corresponding to a person's movements by a plurality of robotic sensors as the person prepares the product using equipment; detecting standard minimanipulations in the sequence of observations, wherein a minimanipulation corresponds to one or more observations, and the sequence of minimanipulations corresponds to the preparation of the product; transforming the sequence of observations into robotic instructions based on software implemented methods for recognizing sequences of pre-programmed standard minimanipulations based on the sensed sequence of person motions, the minimanipulations each comprising a sequence of robotic instructions and the robotic instructions including dynamic sensing operations and robotic action operations; storing the sequence of minimanipulations and their corresponding robotic instructions in electronic media. Preferably, the sequence of instructions and corresponding minimanipulations for the product are stored as an electronic record for preparing the product. This may be repeated for multiple products. The method may further include transmitting the sequence of instructions (preferably in the form of the electronic record) to a robotics apparatus capable of replicating and executing the sequence of robotic instructions. The method may further comprise executing the robotic instructions for the product by the robotics apparatus, thereby obtaining substantially the same result as the original product prepared by the human. 
Where the method is repeated for multiple products, the method may additionally comprise providing a library of electronic descriptions of one or more products, including the name of the product, ingredients of the product and the method (such as a recipe) for making the product from ingredients.
Another generalized aspect provides a method of operating a robotics apparatus comprising receiving an instruction set for making a product, the instruction set comprising a series of indications of minimanipulations corresponding to original actions of a person, each indication comprising a sequence of robotic instructions and the robotic instructions including dynamic sensing operations and robotic action operations; providing the instruction set to a robotic apparatus capable of replicating the sequence of minimanipulations; executing the sequence of instructions for minimanipulations for the product by the robotic apparatus, thereby obtaining substantially the same result as the original product prepared by the person.
A further generalized method of operating a robotic apparatus may be considered in a different aspect, comprising executing a robotic instructions script for duplicating a recipe having a plurality of product preparation movements; determining if each preparation movement is identified as a standard grabbing action of a standard tool or a standard object, a standard hand-manipulation action or object, or a non-standard object; and for each preparation movement, one or more of: instructing the robotic cooking device to access a first database library if the preparation movement involves a standard grabbing action of a standard object; instructing the robotic cooking device to access a second database library if the food preparation movement involves a standard hand-manipulation action or object; and instructing the robotic cooking device to create a three-dimensional model of the non-standard object if the food preparation movement involves a non-standard object. The determining and/or instructing steps may be particularly implemented at or by a computer system. The computing system may have a processor and memory.
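By way of non-limiting illustration, the determining and instructing steps of this aspect may be sketched as a simple dispatch. The names `MovementKind` and `plan_movement`, and the returned strings, are hypothetical and chosen only for this sketch; how a movement is actually classified is outside its scope.

```python
from enum import Enum, auto


class MovementKind(Enum):
    """Hypothetical labels for the three cases named in the method."""
    STANDARD_GRAB = auto()          # standard grabbing action of a standard tool or object
    STANDARD_MANIPULATION = auto()  # standard hand-manipulation action or object
    NON_STANDARD_OBJECT = auto()    # non-standard object


def plan_movement(kind: MovementKind) -> str:
    """Return the instruction issued to the robotic device for one preparation movement."""
    if kind is MovementKind.STANDARD_GRAB:
        return "access first database library"
    if kind is MovementKind.STANDARD_MANIPULATION:
        return "access second database library"
    if kind is MovementKind.NON_STANDARD_OBJECT:
        return "create three-dimensional model"
    raise ValueError(f"unknown movement kind: {kind!r}")


# A two-movement script: grab a standard tool, then handle a non-standard ingredient.
script = [MovementKind.STANDARD_GRAB, MovementKind.NON_STANDARD_OBJECT]
actions = [plan_movement(kind) for kind in script]
```

In this sketch each movement maps to exactly one instruction; the method as described also permits more than one instruction per movement.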
Another aspect may be found in a method for product preparation by robotic apparatus 75, comprising replicating a recipe by preparing a product (such as a food dish) via the robotic apparatus 75, the recipe decomposed into one or more preparation stages, each preparation stage decomposed into a sequence of minimanipulations and action primitives, each minimanipulation decomposed into a sequence of action primitives. Preferably, each minimanipulation has been (successfully) tested to produce an optimal result for that minimanipulation in view of any variations in positions, orientations, shapes of an applicable object, and one or more applicable ingredients.
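The decomposition of a recipe into preparation stages, minimanipulations and action primitives may, for example, be represented as a nested data structure. The following is a minimal sketch only: the class names, the `flatten` helper and the example recipe are assumptions introduced for illustration, and for brevity each stage holds minimanipulations only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActionPrimitive:
    name: str


@dataclass
class Minimanipulation:
    name: str
    primitives: List[ActionPrimitive] = field(default_factory=list)


@dataclass
class PreparationStage:
    name: str
    minimanipulations: List[Minimanipulation] = field(default_factory=list)


@dataclass
class Recipe:
    name: str
    stages: List[PreparationStage] = field(default_factory=list)

    def flatten(self) -> List[str]:
        """Expand the recipe into its ordered action-primitive names."""
        return [
            primitive.name
            for stage in self.stages
            for mm in stage.minimanipulations
            for primitive in mm.primitives
        ]


# Hypothetical example: one stage containing one minimanipulation.
crack_egg = Minimanipulation(
    "crack egg",
    [ActionPrimitive("grasp egg"), ActionPrimitive("strike egg"), ActionPrimitive("open shell")],
)
recipe = Recipe("omelette", [PreparationStage("prepare eggs", [crack_egg])])
primitive_sequence = recipe.flatten()
```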
A further method aspect may be considered in a method for recipe script generation, comprising receiving filtered raw data from sensors in the surroundings of a standardized working environment module, such as a kitchen environment; generating a sequence of script data from the filtered raw data; and transforming the sequence of script data into machine-readable and machine-executable commands for preparing a product, the machine-readable and machine-executable commands including commands for controlling a pair of robotic arms and hands to perform a function. The function may be from the group comprising one or more cooking stages, one or more minimanipulations, and one or more action primitives. A recipe script generation system comprising hardware and/or software features configured to operate in accordance with this method may also be considered.
In any of these aspects, the following may be considered. The preparation of the product normally uses ingredients. Executing the instructions typically includes sensing properties of the ingredients used in preparing the product. The product may be a food dish in accordance with a (food) recipe (which may be held in an electronic description) and the person may be a chef. The working equipment may comprise kitchen equipment. These methods may be used in combination with any one or more of the other features described herein. One, more than one or all of the features of the aspects may be combined, so a feature from one aspect may be combined with another aspect for example. Each aspect may be computer-implemented and there may be provided a computer program configured to perform each method when operated by a computer or processor. Each computer program may be stored on a computer-readable medium. Additionally or alternatively, the programs may be partially or fully hardware-implemented. The aspects may be combined. There may also be provided a robotics system configured to operate in accordance with the method described in respect of any of these aspects.
In another aspect, there may be provided a robotics system, comprising: a multi-modal sensing system capable of observing human motions and generating human motions data in a first instrumented environment; and a processor (which may be a computer), communicatively coupled to the multi-modal sensing system, for recording the human motions data received from the multi-modal sensing system and processing the human motions data to extract motion primitives, preferably such that the motion primitives define operations of a robotics system. The motion primitives may be minimanipulations, as described herein (for example in the immediately preceding paragraphs) and may have a standard format. The motion primitive may define specific types of action and parameters of the type of action, for example a pulling action with a defined starting point, end point, force and grip type. Optionally, there may be further provided a robotics apparatus, communicatively coupled to the processor and/or multi-modal sensing system. The robotics apparatus may be capable of using the motion primitives and/or the human motions data to replicate the observed human motions in a second instrumented environment.
In a further aspect, there may be provided a robotics system, comprising: a processor (which may be a computer), for receiving motion primitives defining operations of a robotics system, the motion primitives being based on human motions data captured from human motions; and a robotics system, communicatively coupled to the processor, capable of using the motion primitives to replicate human motions in an instrumented environment. It will be understood that these aspects may be further combined.
A further aspect may be found in a robotics system comprising: first and second robotic arms; first and second robotic hands, each hand having a wrist coupled to a respective arm, each hand having a palm and multiple articulated fingers, each articulated finger on the respective hand having at least one sensor; and first and second gloves, each glove covering the respective hand having a plurality of embedded sensors. Preferably, the robotics system is a robotic kitchen system.
There may further be provided, in a different but related aspect, a motion capture system, comprising: a standardized working environment module, preferably a kitchen; a plurality of multi-modal sensors having a first type of sensors configured to be physically coupled to a human and a second type of sensors configured to be spaced away from the human. One or more of the following may be the case: the first type of sensors may be for measuring the posture of human appendages and sensing motion data of the human appendages; the second type of sensors may be for determining a spatial registration of the three-dimensional configurations of one or more of the environment, objects, movements, and locations of human appendages; the second type of sensors may be configured to sense activity data; the standardized working environment may have connectors to interface with the second type of sensors; the first type of sensors and the second type of sensors measure motion data and activity data, and send both the motion data and the activity data to a computer for storage and processing for product (such as food) preparation.
An aspect may additionally or alternatively be considered in a robotic hand coated with a sensing glove, comprising: five fingers; and a palm connected to the five fingers, the palm having internal joints and a deformable surface material in three regions; a first deformable region disposed on a radial side of the palm and near the base of the thumb; a second deformable region disposed on an ulnar side of the palm, and spaced apart from the radial side; and a third deformable region disposed on the palm and extending across the base of the fingers. Preferably, the combination of the first deformable region, the second deformable region, the third deformable region, and the internal joints collectively operate to perform a minimanipulation, particularly for food preparation.
In respect of any of the above system, device or apparatus aspects, there may further be provided method aspects comprising steps to carry out the functionality of the system. Additionally or alternatively, optional features may be found based on any one or more of the features described herein with respect to other aspects.
Commercial robotic system 2720 comprises a user 2721, a computer 2722 with a robotic execution engine and a minimanipulation library 2723. The computer 2722 comprises a general or special purpose computer and may be any combination of processors and/or other standard computing devices. Computer 2722 comprises a robotic execution engine for operating robotic elements such as arms/hands or a complete humanoid robot to recreate the movements captured by the recording system. Computer 2722 may also operate standardized objects (e.g., tools and equipment) of the creator 2711 according to the program files or apps captured during the recording process. Computer 2722 may also control and capture 3-D modeling feedback for simulation model calibration and real-time adjustments. Minimanipulation library 2723 stores the captured minimanipulations that have been downloaded from the creator's recording system 2710 to the commercial robotic system 2720 via communications link 2701. Minimanipulation library 2723 may store the minimanipulations locally or remotely and may store them in a predetermined or relational basis. Communications link 2701 conveys program files or apps for the (subject) human skill to the commercial robotic system 2720 on a purchase, download, or subscription basis. In operation, robotic human-skill replication system 2700 allows a creator 2711 to perform a task or series of tasks, which are captured on computer 2712 and stored in memory 2713, creating minimanipulation files or libraries. The minimanipulation files may then be conveyed to the commercial robotic system 2720 via communications link 2701 and executed on computer 2722, causing a set of robotic appendages, such as hands and arms, or a humanoid robot to duplicate the movements of the creator 2711. In this manner, the movements of the creator 2711 are replicated by the robot to complete the required task.
Standard skill movement and object parameters module 2802 may be a module implemented in software or hardware and is intended to define standard movements of objects and/or basic skills. It may comprise subject parameters, which provide the robotic replication engine with information about standard objects that may need to be utilized during a robotic procedure. It may also contain instructions and/or information related to standard skill movements, which are not unique to any one minimanipulation. Maintenance module 2810 may be any routine or hardware that is used to monitor and perform routine maintenance on the system and the robotic replication engine. Maintenance module 2810 may allow for controlling, updating, monitoring, and troubleshooting any other module or system coupled to the robotic human-skill replication engine. Maintenance module 2810 may comprise hardware or software and may be implemented utilizing any number or combination of logic circuits. Output module 2811 allows for communications from the robotic human-skill replication engine 2800 to any other system component or module. Output module 2811 may be used to export or convey the captured minimanipulations to a commercial robotic system 2720 or may be used to convey the information into storage. Output module 2811 may comprise hardware or software and may be implemented utilizing any number or combination of logic circuits. Bus 2812 couples all the modules within the robotic human-skill replication engine and may be a parallel bus, serial bus, synchronous or asynchronous. It may allow for communications in any form using serial data, packetized data, or any other known methods of data communication.
Minimanipulation movement and object parameter module 2809 may be used to store and/or categorize the captured minimanipulations and creator's movements. It may be coupled to the replication engine as well as the robotic system under control of the user.
Computer 2712 comprises robotic human-skill replication engine 2800, movement control module 2820, memory 2821, skills movement emulator 2822, extended simulation validation and calibration module 2823 and standard object algorithms 2824. As described with respect to
Robotic human-skill replication engine 2800 is coupled to movement control module 2820, which may be used to control or configure the movement of various robotic components based on visual, auditory, tactile or other feedback obtained from the robotic components. Memory 2821 may be coupled to computer 2712 and comprises the necessary memory components for storing skill execution program files. A skill execution program file contains the necessary instructions for computer 2712 to execute a series of instructions to cause the robotic components to complete a task or series of tasks. Skill movement emulator 2822 is coupled to the robotic human-skill replication engine 2800 and may be used to emulate creator skills without actual sensor input. Skill movement emulator 2822 provides alternate input to robotic human-skill replication engine 2800 to allow for the creation of a skill execution program without the use of a creator 2711 providing sensor input. Extended simulation validation and calibration module 2823 may be coupled to robotic human-skill replication engine 2800 and provides for extended creator input and provides for real time adjustments to the robotic movements based on 3-D modeling and real time feedback. Computer 2712 comprises standard object algorithms 2824, which are used to control the robotic hands 72/the robotic arms 70 or humanoid robot 2830 to complete tasks using standard objects. Standard objects may include standard tools or utensils or standard equipment, such as a stove or EKG machine. The algorithms in 2824 are precompiled and do not require individual training using robotic human-skills replication.
Computer 2712 is coupled to one or more motion sensing devices 2825. Motion sensing devices 2825 may be visual motion sensors, IR motion sensors, tracking sensors, laser monitored sensors, or any other input or recording device that allows computer 2712 to monitor the position of the tracked device in 3-D space. Motion sensing devices 2825 may comprise a single sensor or a series of sensors that include single point sensors, paired transmitters and receivers, paired markers and sensors or any other type of spatial sensor. Robotic human-skill replication system 2700 may comprise standardized objects 2826. Standardized objects 2826 are any standard objects found in a standard orientation and position within the robotic human-skill replication system 2700. These may include standardized tools or tools with standardized handles or grips 2826-a, standard equipment 2826-b, or a standardized space 2826-c. Standardized tools 2826-a may be those depicted in
Also within the robotic human-skill replication system 2700 may be non-standard objects 2827. Non-standard objects may be, for example, cooking ingredients such as meats and vegetables. These non-standard sized, shaped and proportioned objects may be located in standard positions and orientations, such as within drawers or bins, but the items themselves may vary from item to item.
Visual, audio, and tactile input devices 2829 may be coupled to computer 2712 as part of the robotic human-skill replication system 2700. Visual, audio, and tactile input devices 2829 may be cameras, lasers, 3-D stereoptics, tactile sensors, mass detectors, or any other sensor or input device that allows computer 2712 to determine an object type and position within 3-D space. They may also allow for the detection of the surface of an object and the detection of object properties based on touch, sound, density or weight.
Robotic arms/hands or humanoid robot 2830 may be directly coupled to computer 2712 or may be connected over a wired or wireless network and may communicate with robotic human-skill replication engine 2800. Robotic arms/hands or humanoid robot 2830 is capable of manipulating and replicating any of the movements performed by creator 2711 or any of the algorithms for using a standard object.
In some embodiments a human performs the same skill multiple times, yielding sensor readings and parameters in the corresponding robotic instructions that vary somewhat from one time to the next. The set of sensor readings for each sensor across multiple repetitions of the skill provides a distribution with a mean, standard deviation, and minimum and maximum values. The corresponding variations in the robotic instructions (also called the effector parameters) across multiple executions of the same skill by the human also define distributions with mean, standard deviation, minimum and maximum values. These distributions may be used to determine the fidelity (or accuracy) of subsequent robotic skills.
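As one possible illustration, the per-sensor distribution across repetitions may be summarized with standard statistics. The function name `summarize_readings`, the example readings and the use of the sample (rather than population) standard deviation are assumptions made only for this sketch.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def summarize_readings(readings: Sequence[float]) -> Dict[str, float]:
    """Summarize one sensor's readings across repeated performances of a skill."""
    if len(readings) < 2:
        raise ValueError("at least two repetitions are needed for a standard deviation")
    return {
        "mean": mean(readings),
        "std": stdev(readings),  # sample standard deviation (an assumption of this sketch)
        "min": min(readings),
        "max": max(readings),
    }


# Hypothetical grip-force readings (arbitrary units) from three repetitions.
grip_force_summary = summarize_readings([1.0, 2.0, 3.0])
```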
In one embodiment the estimated average accuracy of a robotic skill operation is given by:
\[ A(C, R) = 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{|c_i - r_i|}{\max(|c_{i,t} - r_{i,t}|)} \]
Where C represents the set of human parameters (1st through nth, written c_i) and R represents the set of the robotic apparatus 75 parameters (correspondingly 1st through nth, written r_i). The numerator in the sum represents the difference between robotic and human parameters (i.e., the error) and the denominator normalizes for the maximal difference. The sum gives the total normalized cumulative error
\[ \sum_{i=1}^{n} \frac{|c_i - r_i|}{\max(|c_{i,t} - r_{i,t}|)} \]
and multiplying by 1/n gives the average error. The complement of the average error corresponds to the average accuracy.
Another version of the accuracy calculation weighs the parameters for importance, where each coefficient (each αi) represents the importance of the ith parameter. The normalized cumulative error is
\[ \sum_{i=1}^{n} \alpha_i \frac{|c_i - r_i|}{\max(|c_{i,t} - r_{i,t}|)} \]
and the estimated average accuracy is given by:
\[ A(C, R) = 1 - \frac{\sum_{i=1}^{n} \alpha_i \dfrac{|c_i - r_i|}{\max(|c_{i,t} - r_{i,t}|)}}{\sum_{i=1}^{n} \alpha_i} \]
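The accuracy estimate described above (a per-parameter error normalized by the maximal difference, averaged and then complemented) may be sketched as follows. The function name `average_accuracy` and its argument names are hypothetical; the maximal differences are assumed to be supplied by the caller; and dividing the weighted error by the sum of the importance coefficients is an assumption of this sketch, chosen so that equal weights reduce to the unweighted average.

```python
from typing import Optional, Sequence


def average_accuracy(
    human: Sequence[float],
    robot: Sequence[float],
    max_diff: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return 1 minus the (optionally weighted) average normalized parameter error."""
    n = len(human)
    if n == 0 or len(robot) != n or len(max_diff) != n:
        raise ValueError("parameter sets must be non-empty and of equal length")
    if weights is None:
        weights = [1.0] * n  # equal importance gives the unweighted average
    if len(weights) != n:
        raise ValueError("one weight per parameter is required")
    # Normalized error of each parameter: |c_i - r_i| / maximal difference.
    errors = [abs(c - r) / m for c, r, m in zip(human, robot, max_diff)]
    weighted_error = sum(w * e for w, e in zip(weights, errors))
    return 1.0 - weighted_error / sum(weights)


# Two hypothetical parameters: the first is off by half of its maximal difference.
unweighted = average_accuracy([1.0, 2.0], [1.5, 2.0], [1.0, 4.0])
weighted = average_accuracy([1.0, 2.0], [1.5, 2.0], [1.0, 4.0], weights=[3.0, 1.0])
```

Here the unweighted accuracy is 0.75, and giving the first parameter three times the importance lowers the estimate to 0.625.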
In order to operate a mechanical robotic mechanism such as the ones described in the embodiments of this disclosure, a skilled artisan realizes that many mechanical and control problems need to be addressed, and the literature in robotics describes methods to do just that. The establishment of static and/or dynamic stability in a robotics system is an important consideration. Especially for robotic manipulation, dynamic stability is a strongly desired property, in order to prevent accidental breakage or movements beyond those desired or programmed.
In addition to the robotic planning, sensing and acting, the robotic control platform can also communicate with humans via icons, language, gestures, etc. through the robot-human interfaces module 3030, and can learn new minimanipulations by observing humans perform building-block tasks corresponding to the minimanipulations and generalizing multiple observations into minimanipulations, i.e., reliable repeatable sensing-action sequences with preconditions and postconditions, by means of a minimanipulation learning module 3032.
The computer architecture 3050 for executing minimanipulations comprises a combination of disclosure of controller algorithms and their associated controller-gain values as well as specified time-profiles for position/velocity and force/torque for any given motion/actuation unit, as well as the low-level (actuator) controller(s) (represented by both hardware and software elements) that implement these control algorithms and use sensory feedback to ensure the fidelity of the prescribed motion/interaction profiles contained within the respective datasets. These are also described in further detail below and so designated with appropriate color-code in the associated
The MML generator 3051 is a software system comprising multiple software engines 3052 that create minimanipulation (MM) data sets 3053, which in turn become part of one or more MML databases 3054.
The MML Generator 3051 contains the aforementioned software engines 3052, which utilize sensory and spatial data and higher-level reasoning software modules to generate parameter-sets that describe the respective manipulation tasks, thereby allowing the system to build a complete MM data set 3053 at multiple levels. A hierarchical MM Library (MML) builder is based on software modules that allow the system to decompose the complete task action set into a sequence of serial and parallel motion-primitives that are categorized from low- to high-level in terms of complexity and abstraction. The hierarchical breakdown is then used by a MML database builder to build a complete MML database 3054.
The previously mentioned parameter sets 3053 comprise multiple forms of input and data (parameters, variables, etc.) and algorithms, including task performance metrics for a successful completion of a particular task, the control algorithms to be used by the humanoid actuation systems, as well as a breakdown of the task-execution sequence and the associated parameter sets, based on the physical entity/subsystem of the humanoid involved as well as the respective manipulation phases required to execute the task successfully. Additionally, a set of humanoid-specific actuator parameters are included in the datasets to specify the controller-gains for the specified control algorithms, as well as the time-history profiles for motion/velocity and force/torque for each actuation device(s) involved in the task execution.
The MML database 3054 comprises multiple low- to higher-level data and software modules necessary for a humanoid to accomplish any specific low- to high-level task. The libraries not only contain MM datasets generated previously, but also other libraries, such as currently-existing controller-functionality relating to dynamic control (KDC), machine-vision (OpenCV) and other interaction/inter-process communication libraries (ROS, etc.). The humanoid controller 3056 is also a software system comprising the high-level controller software engine 3057 that uses high-level task-execution descriptions to feed machine-executable instructions to the low-level controller 3059 for execution on, and with, the humanoid robot platform.
The high-level controller software engine 3057 builds the application-specific task-based robotic instruction-sets, which are in turn fed to a command sequencer software engine that creates machine-understandable command and control sequences for the command executor 3058. The software engine 3052 decomposes the command sequence into motion and action goals and develops execution-plans (both in time and based on performance levels), thereby enabling the generation of time-sequenced motion (positions & velocities) and interaction (forces and torques) profiles, which are then fed to the low-level controller 3059 for execution on the humanoid robot platform by the affected individual actuator controllers 3060, which in turn comprise at least their own respective motor controller and power hardware and software and feedback sensors.
The low-level controller 3059 contains actuator controllers, which use digital controller, electronic power-driver and sensory hardware to feed software algorithms with required set-points for position/velocity and force/torque, which the controller is tasked to faithfully replicate along a time-stamped sequence, relying on feedback sensor signals to ensure the required performance fidelity. The controller remains in a constant loop to ensure all set-points are achieved over time until the required motion/interaction step(s)/profile(s) are completed, while higher-level task-performance fidelity is also being monitored by the high-level task performance monitoring software module in the command executor 3058, leading to potential modifications in the high-to-low motion/interaction profiles fed to the low-level controller to ensure task-outcomes fall within required performance bounds and meet specified performance metrics.
In a teach-playback controller 3061, a robot is led through a set of motion profiles, which are continuously stored in a time-synched fashion, and then ‘played-back’ by the low-level controller by controlling each actuated element to exactly follow the motion profile previously recorded. This type of control and implementation are necessary to control a robot, some of which may be available commercially. While the present described disclosure utilizes a low-level controller to execute machine-readable time-synched motion/interaction profiles on a humanoid robot, embodiments of the present disclosure are directed to techniques that are much more generic than teach-motions, and to a more automated, far more capable and more complex process, allowing one to create and execute a potentially high number of simple to complex tasks in a far more efficient and cost-effective manner.
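The closed-loop behavior of a low-level controller that follows time-stamped set-points using sensor feedback may be illustrated with a deliberately simplified, single-axis sketch. The proportional control law, the idealized actuator response and every name below are assumptions for illustration only and do not represent the controller of any particular embodiment.

```python
from typing import List, Tuple


def track_setpoints(
    setpoints: List[Tuple[float, float]],
    gain: float = 0.5,
    steps_per_setpoint: int = 20,
    tolerance: float = 1e-3,
) -> List[Tuple[float, float]]:
    """Follow (timestamp, target position) set-points with a proportional feedback loop."""
    position = 0.0  # simulated feedback-sensor reading
    log = []
    for timestamp, target in setpoints:
        for _ in range(steps_per_setpoint):
            error = target - position  # compare the set-point with the sensed position
            if abs(error) <= tolerance:
                break  # set-point achieved; move on to the next one
            position += gain * error  # idealized actuator: moves a fraction of the error
        log.append((timestamp, position))
    return log


# Hypothetical profile: move to 1.0 at t = 0.0 s, then to -0.5 at t = 0.5 s.
profile = [(0.0, 1.0), (0.5, -0.5)]
achieved = track_setpoints(profile)
```

A higher-level monitor could compare `achieved` against `profile` and adjust the profile when the deviation exceeds its performance bounds.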
Sensors have been grouped in three categories based on their physical location and portion of a particular interaction that will need to be controlled. Three types of sensors (External 3071, Internal 3073, and Interface 3072) feed their data sets into a data-suite process 3074 that forwards the data over the proper communication link and protocol to the data processing and/or robot-controller engine(s) 3075.
External Sensors 3071 comprise sensors typically located/used external to the dual-arm robot torso/humanoid and tend to model the location and configuration of the individual systems in the world as well as the dual-arm torso/humanoid. Sensor types used for such a suite would include simple contact switches (doors, etc.), electromagnetic (EM) spectrum based sensors for one-dimensional range measurements (IR rangers, etc.), video cameras to generate two-dimensional information (shape, location, etc.), and three-dimensional sensors used to generate spatial location and configuration information (bi-/tri-nocular cameras, scanning lasers, structured light, etc.).
Internal Sensors 3073 are sensors internal to the dual-arm torso/humanoid, mostly measuring internal variables, such as arm/limb/joint positions and velocity, actuator currents and joint- and Cartesian forces and torques, haptic variables (sound, temperature, taste, etc.), binary switches (travel limits, etc.), as well as other equipment-specific presence switches. Additional one-/two- and three-dimensional sensor types (such as in the hands) can measure range/distance, two-dimensional layouts via video camera and even built-in optical trackers (such as in a torso-mounted sensor-head).
Interface-sensors 3072 are those kinds of sensors that are used to provide high-speed contact and interaction movements and forces/torque information when the dual-arm torso/humanoid interacts with the real world during any of its tasks. These are critical sensors as they are integral to the operation of critical MM sub-routine actions such as striking a piano-key in just the right way (duration and force and speed, etc.) or using a particular sequence of finger-motions to achieve a safe grasp of a knife and orient it for a particular task (cut a tomato, strike an egg, crush garlic cloves, etc.). These sensors (in order of proximity) can provide information related to the stand-off/contact distance between the robot appendages and the world, the associated capacitance/inductance between the end-effector and the world measurable immediately prior to contact, the actual contact presence and location and its associated surface properties (conductivity, compliance, etc.), as well as associated interaction properties (force, friction, etc.) and any other haptic variables of importance (sound, heat, smell, etc.).
The interest in grouping the physical layout as shown in FIG. BB is related to the fact that MM actions can readily be split into actions performed mostly by a certain portion of a hand or limb/joint, thereby reducing the parameter-space for control and adaptation/optimization during learning and playback, dramatically. It is a representation of the physical space into which certain subroutine or main minimanipulation (MM) actions can be mapped, with the respective variables/parameters needed to describe each minimanipulation (MM) being both minimal/necessary and sufficient.
A breakdown in the physical space-domain also allows for a simpler breakdown of minimanipulation (MM) actions for a particular task into a set of generic minimanipulation (sub-) routines, dramatically simplifying the building of more complex and higher-level complexity minimanipulation (MM) actions using a combination of serial/parallel generic minimanipulation (MM) (sub-) routines. Note that the physical domain breakdown to readily generate minimanipulation (MM) action primitives (and/or sub-routines) is but one of the two complementary approaches allowing for simplified parametric descriptions of minimanipulation (MM) (sub-) routines to allow one to properly build a set of generic and task-specific minimanipulation (MM) (sub-) routines or motion primitives to build up a complete (set of) motion-library(ies).
Hence in order to build an ever more complex and higher level set of minimanipulation (MM) motion-primitive routines from a set of generic sub-routines, a high-level minimanipulation (MM) can be thought of as a transition between various phases of any manipulation, thereby allowing for a simple concatenation of minimanipulation (MM) sub-routines to develop a higher-level minimanipulation routine (motion-primitive). Note that each phase of a manipulation (approach, grasp, maneuver, etc.) is itself its own low-level minimanipulation described by a set of parameters involved in controlling motions and forces/torques (internal, external as well as interface variables) involving one or more of the physical domain entities [finger(s), palm, wrist, limbs, joints (elbow, shoulder, etc.), torso, etc.].
Arm 1 3131 of a dual-arm system, can be thought of as using external and internal sensors as defined in
Note that should a minimanipulation (MM) sub-routine action fail (such as needing to re-grasp), all the minimanipulation sequencer has to do is to jump backwards to a prior phase and repeat the same actions (possibly with a modified set of parameters to ensure success, if needed). More complex sets of actions, such as playing a sequence of piano-keys with different fingers, involve repetitive jumping-loops between the Approach 3133, 3134 and the Contact 3134, 3144 phases, allowing for different keys to be struck in different intervals and with different effect (soft/hard, short/long, etc.); moving to different octaves on the piano key-scale would simply require a jump backwards to the configuration-phase 3132 to reposition the arm, or possibly even the entire torso 3140 through translation and/or rotation to achieve a different arm and torso orientation 3151.
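The jump-back behavior of the minimanipulation sequencer may be sketched as a small phase loop. The function `run_sequence`, the `execute` callback signature, the retry limit and the phase names used in the example are hypothetical; in particular, jumping back exactly one phase is an assumption, since the sequencer may return to any prior phase.

```python
from typing import Callable, Dict, List


def run_sequence(
    phases: List[str],
    execute: Callable[[str, int], bool],
    max_retries: int = 3,
) -> List[str]:
    """Run phases in order; when a phase fails, jump back one phase and repeat."""
    attempts: Dict[str, int] = {phase: 0 for phase in phases}
    trace: List[str] = []
    index = 0
    while index < len(phases):
        phase = phases[index]
        attempts[phase] += 1
        trace.append(phase)
        if execute(phase, attempts[phase]):
            index += 1  # success: advance to the next phase
        elif attempts[phase] > max_retries:
            raise RuntimeError(f"phase {phase!r} failed after {max_retries} retries")
        else:
            index = max(index - 1, 0)  # failure: jump backwards to the prior phase
    return trace


def flaky_execute(phase: str, attempt: int) -> bool:
    """Hypothetical executor in which the first contact attempt fails (e.g., a slipped grasp)."""
    return not (phase == "contact" and attempt == 1)


trace = run_sequence(["configuration", "approach", "contact"], flaky_execute)
```

In this example the failed contact causes the approach phase to be repeated before contact is attempted again; the `attempt` argument gives the executor a place to apply a modified set of parameters on a retry.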
Arm 2 3140 could perform similar activities in parallel and independent of Arm 3130, or in conjunction and coordination with Arm 3130 and Torso 3150, guided by the movement-coordination phase 315 (such as during the motions of arms and torso of a conductor wielding a baton), and/or the contact and interaction control phase 3153, such as during the actions of dual-arm kneading of dough on a table.
One aspect depicted in
Notice that coupling a minimanipulation (sub-) routine to not only a set of parameters required to be monitored and controlled during a particular phase of a task-motion as depicted in
In a more detailed view, it is shown how sensory data is filtered and input into a sequence of processing engines to arrive at a set of generic and task-specific minimanipulation motion primitive libraries. The processing of the sensory data 3162 identified in
The MM data-processing and structuring engine 3165 creates an interim library of motion-primitives based on identification of motion-sequences 3165-1, segmented groupings of manipulation steps 3165-2 and then an abstraction-step 3165-3 of the same into a dataset of parameter-values for each minimanipulation step, where motion-primitives are associated with a set of pre-defined low- to high-level action-primitives 3165-5 and stored in an interim library 3165-4. As an example, process 3165-1 might identify a motion-sequence through a dataset that indicates object-grasping and repetitive back-and-forth motion related to a studio-chef grabbing a knife and proceeding to cut a food item into slices. The motion-sequence is then broken down in 3165-2 into associated actions of several physical elements (fingers and limbs/joints) shown in
The interim library data 3165-4 is fed into a learning-and-tuning engine 3166, where data from multiple other studio-sessions 3168 is used to extract similar minimanipulation actions and their outcomes 3166-1 and to compare their data sets 3166-2, allowing for parameter-tuning 3166-3 within each minimanipulation group using one or more standard machine-learning/parameter-tuning techniques in an iterative fashion 3166-5. A further level-structuring process 3166-4 decides on breaking the minimanipulation motion-primitives into generic low-level sub-routines and higher-level minimanipulations made up of a sequence (serial and parallel combinations) of sub-routine action-primitives.
A following library builder 3167 then organizes all generic minimanipulation routines into a set of generic multi-level minimanipulation action-primitives with all associated data (commands, parameter-sets and expected/required performance metrics) as part of a single generic minimanipulation library 3167-2. A separate and distinct library is then also built as a task-specific library 3167-1 that allows for assigning any sequence of generic minimanipulation action-primitives to a specific task (cooking, painting, etc.), allowing for the inclusion of task-specific datasets which only pertain to the task (such as kitchen data and parameters, instrument-specific parameters, etc.) which are required to replicate the studio-performance by a remote robotic system.
A separate MM library access manager 3169 is responsible for checking-out proper libraries and their associated datasets (parameters, time-histories, performance metrics, etc.) 3169-1 to pass onto a remote robotic replication system, as well as checking back in updated minimanipulation motion primitives (parameters, performance metrics, etc.) 3169-2 based on learned and optimized minimanipulation executions by one or more same/different remote robotic systems. This ensures the library continually grows and is optimized by a growing number of remote robotic execution platforms.
At a high level, this is achieved by downloading the task-descriptive libraries containing the complete set of minimanipulation datasets required by the robotic system, and providing them to a robot controller for execution. The robot controller generates the required command and motion sequences that the execution module interprets and carries out, while receiving feedback from the entire system to allow it to follow profiles established for joint and limb positions and velocities as well as (internal and external) forces and torques. A parallel performance monitoring process uses task-descriptive functional and performance metrics to track and process the robot's actions to ensure the required task-fidelity. A minimanipulation learning-and-adaptation process is allowed to take any minimanipulation parameter-set and modify it should a particular functional result not be satisfactory, to allow the robot to successfully complete each task or motion-primitive. Updated parameter data is then used to rebuild the modified minimanipulation parameter set for re-execution as well as for updating/rebuilding a particular minimanipulation routine, which is provided back to the original library routines as a modified/re-tuned library for future use by other robotic systems. The system monitors all minimanipulation steps until the final result is achieved and once completed, exits the robotic execution loop to await further commands or human input.
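By way of non-limiting illustration only, the execute, monitor and re-tune loop summarized above may be sketched as follows. The `MiniManipulation` record, the scalar functional result, the tolerance test and the `modify` callable are assumptions made for this sketch; an actual embodiment may use any suitable parameter-optimization technique.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


@dataclass
class MiniManipulation:
    name: str
    params: Params
    target: float     # required functional-result value (assumed scalar)
    tolerance: float  # acceptable deviation from the target


def execute_task(
    steps: List[MiniManipulation],
    execute: Callable[[MiniManipulation], float],
    modify: Callable[[MiniManipulation, float], Params],
    max_attempts: int = 5,
) -> List[Tuple[str, Params]]:
    """Run steps i = 1..N; re-tune and re-execute any step that misses its metric.

    Returns (name, params) for every step whose parameters were rebuilt, i.e.
    the updates that would be checked back into the library.
    """
    updates: List[Tuple[str, Params]] = []
    for step in steps:
        modified = False
        for _ in range(max_attempts):
            result = execute(step)
            if abs(result - step.target) <= step.tolerance:
                break  # functional result within tolerance: continue with i + 1
            step.params = modify(step, result)  # task-modifier stage
            modified = True
        else:
            raise RuntimeError(f"{step.name}: no acceptable result")
        if modified:
            updates.append((step.name, dict(step.params)))
    return updates


def demo_execute(step: MiniManipulation) -> float:
    # Assumption: the measured result is simply proportional to the force.
    return step.params["force"] * 2.0


def demo_modify(step: MiniManipulation, result: float) -> Params:
    return {**step.params, "force": step.params["force"] + (step.target - result) / 2.0}


steps = [
    MiniManipulation("grasp", {"force": 5.0}, target=10.0, tolerance=0.5),
    MiniManipulation("stir", {"force": 3.0}, target=10.0, tolerance=0.5),
]
updates = execute_task(steps, demo_execute, demo_modify)
print(updates)
```

Only the second step misses its metric in this example, so only its rebuilt parameter set is reported for updating the library.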
In specific detail, the process outlined above can be described as the following sequence. The MM library 3170, containing both the generic and task-specific MM-libraries, is accessed via the MM library access manager 3171, which ensures all the task-specific data sets 3172 required for the execution and verification of interim/end-results for a particular task are available. The data set includes at least, but is not limited to, all necessary kinematic/dynamic and control parameters, time-histories of pertinent variables, functional and performance metrics and values for performance validation and all the MM motion libraries relevant to the particular task at hand.
All task-specific datasets 3172 are fed to the robot controller 3173. A command sequencer 3174 creates the proper sequential/parallel motion sequences with an assigned index-value ‘i’, for a total of ‘i=N’ steps, feeding each sequential/parallel motion command (and data) sequence to the command executor 3175. The command executor 3175 takes each motion-sequence and in turn parses it into a set of high-to-low command signals to actuation and sensing systems, allowing the controllers for each of these systems to ensure motion-profiles with required position/velocity and force/torque profiles are correctly executed as a function of time. Sensory feedback data 3176 from the (robotic) dual-arm torso/humanoid system is used by the profile-following function to ensure actual values track desired/commanded values as closely as possible.
A separate and parallel performance monitoring process 3177 measures the functional performance results at all times during the execution of each of the individual minimanipulation actions, and compares these to the performance metrics associated with each minimanipulation action and provided in the task-specific minimanipulation data set provided in 3172. Should the functional result be within acceptable tolerance limits to the required metric value(s), the robotic execution is allowed to continue, by way of incrementing the minimanipulation index value to ‘i++’, and feeding the value and returning control back to the command-sequencer process 3174, allowing the entire process to continue in a repeating loop. Should however the performance metrics differ, resulting in a discrepancy of the functional result value(s), a separate task-modifier process 3178 is enacted.
The minimanipulation task-modifier process 3178 is used to allow for the modification of parameters describing any one task-specific minimanipulation, thereby ensuring that a modification of the task-execution steps will arrive at an acceptable performance and functional result. This is achieved by taking the parameter-set from the ‘offending’ minimanipulation action-step and using one or more of multiple techniques for parameter-optimization common in the field of machine-learning, to rebuild a specific minimanipulation step or sequence MMi into a revised minimanipulation step or sequence MMi*. The revised step or sequence MMi* is then used to rebuild a new command-sequence that is passed back to the command executor 3175 for re-execution. The revised minimanipulation step or sequence MMi* is then fed to a re-build function that re-assembles the final version of the minimanipulation dataset that led to the successful achievement of the required functional result, so it may be passed to the task- and parameter monitoring process 3179.
The task- and parameter monitoring process 3179 is responsible for checking for both the successful completion of each minimanipulation step or sequence, as well as the final/proper minimanipulation dataset considered responsible for achieving the required performance-levels and functional result. As long as the task execution is not completed, control is passed back to the command sequencer 3174. Once the entire sequence has been successfully executed, implying ‘i=N’, the process exits (and presumably awaits further commands or user input). For each sequence-counter value ‘i’, the monitoring task 3179 also forwards the sum of all rebuilt minimanipulation parameter sets Σ(MMi*) back to the MM library access manager 3171 to allow it to update the task-specific library(ies) in the remote MM library 3170 shown in
The example depicted in
The above example illustrates the process of building a minimanipulation routine based on simple sub-routine motions (themselves also minimanipulations) using both a physical entity mapping and a manipulation-phase approach which the computer can readily distinguish and parameterize using external/internal/interface sensory feedback data from the studio-recording process. This minimanipulation library building-process for process-parameters generates ‘parameter-vectors’ which fully describe a (set of) successful minimanipulation action(s), as the parameter vectors include sensory-data, time-histories for key variables as well as performance data and metrics, allowing a remote robotic replication system to faithfully execute the required task(s). The process is also generic in that it is agnostic to the task at hand (cooking, painting, etc.), as it simply builds minimanipulation actions based on a set of generic motion- and action-primitives. Simple user input and other pre-determined action-primitive descriptors can be added at any level to more generically describe a particular motion-sequence and to allow it to be made generic for future use, or task-specific for a particular application. Having minimanipulation datasets comprised of parameter vectors also allows for continuous optimization through learning, where adaptations to parameters are possible to improve the fidelity of a particular minimanipulation based on field-data generated during robotic replication operations involving the application (and evaluation) of minimanipulation routines in one or more generic and/or task-specific libraries.
A working memory 1 3192 contains all the sensor readings for a period of time until the present: a few seconds to a few hours, depending on how much physical memory is available; a typical period would be about 60 seconds. The sensor readings come from the on-board or off-board robotic sensors and may include video from cameras, ladar, sonar, force and pressure sensors (haptic), audio, and/or any other sensors. Sensor readings are implicitly or explicitly time-tagged or sequence-tagged (the latter means the order in which the sensor readings were received).
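By way of non-limiting illustration only, a time-windowed working memory of the kind described above may be sketched as follows. The class and field names are hypothetical, and the sketch assumes that readings arrive with non-decreasing timestamps.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, List


@dataclass
class Reading:
    timestamp: float  # explicit time tag, in seconds
    sequence: int     # sequence tag: order in which the reading was received
    sensor: str
    value: Any


class SensorWorkingMemory:
    """Keeps only the readings that fall within the most recent time window."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self._readings: Deque[Reading] = deque()
        self._next_seq = 0

    def add(self, timestamp: float, sensor: str, value: Any) -> None:
        self._readings.append(Reading(timestamp, self._next_seq, sensor, value))
        self._next_seq += 1
        # Discard readings older than the window, measured from the newest one.
        while timestamp - self._readings[0].timestamp > self.window_s:
            self._readings.popleft()

    def recent(self, sensor: str) -> List[Reading]:
        return [r for r in self._readings if r.sensor == sensor]

    def __len__(self) -> int:
        return len(self._readings)


memory = SensorWorkingMemory(window_s=60.0)
memory.add(0.0, "force", 1.2)
memory.add(30.0, "camera", "frame-a")
memory.add(75.0, "force", 1.5)  # the reading taken at t = 0 s is now out of the window
print(len(memory), [r.value for r in memory.recent("force")])
```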
A working memory 2 3193 contains all of the actuator commands generated by the Central Robotic Control and either passed to the actuators, or queued to be passed to same at a given point in time or based on a triggering event (e.g. the robot completing the previous motion). These include all the necessary parameter values (e.g. how far to move, how much force to apply, etc.).
A first database (database 1) 3194 contains the library of all minimanipulations (MM) known to the robot, including for each MM, a triple <PRE, ACT, POST>, where PRE={s1, s2, . . . , sn} is a set of items in the world state that must be true before the actions ACT=[a1, a2, . . . , ak] can take place, and result in a set of changes to the world state denoted as POST={p1, p2, . . . , pm}. In a preferred embodiment, the MMs are indexed by purpose, by the sensors and actuators they involve, and by any other factor that facilitates access and application. In a preferred embodiment each POST result is associated with a probability of obtaining the desired result if the MM is executed. The Central Robotic Control both accesses the MM library to retrieve and execute MMs and updates it, e.g. in learning mode to add new MMs.
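By way of non-limiting illustration only, the <PRE, ACT, POST> triple and the purpose index described above may be sketched as follows. The fact names, the example minimanipulation and the probability value are hypothetical, and for simplicity POST is modeled here only as facts added to the world state.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass
class MM:
    name: str
    pre: FrozenSet[str]     # PRE: facts that must hold before execution
    act: Tuple[str, ...]    # ACT: ordered actions
    post: Dict[str, float]  # POST: resulting fact -> probability of obtaining it


def applicable(mm: MM, world: Set[str]) -> bool:
    """An MM may run only if every PRE item is true in the world state."""
    return mm.pre <= world


def expected_world(mm: MM, world: Set[str], min_probability: float = 0.5) -> Set[str]:
    """World state expected after execution, keeping only sufficiently likely results."""
    if not applicable(mm, world):
        raise ValueError(f"preconditions of {mm.name} are not satisfied")
    return world | {fact for fact, p in mm.post.items() if p >= min_probability}


grasp_knife = MM(
    name="grasp-knife",
    pre=frozenset({"knife-visible", "knife-in-reach"}),
    act=("approach", "close-fingers"),
    post={"holding-knife": 0.95},  # illustrative probability only
)

# Index by purpose, one of the access keys mentioned in the text.
library: Dict[str, List[MM]] = {"grasp": [grasp_knife]}

world = {"knife-visible", "knife-in-reach"}
candidates = [mm.name for mm in library["grasp"] if applicable(mm, world)]
new_world = expected_world(grasp_knife, world)
print(candidates, sorted(new_world))
```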
A second database (database 2) 3195 contains the case library, each case being a sequence of minimanipulations to perform a given task, such as preparing a given dish, or fetching an item from a different room. Each case contains variables (e.g. what to fetch, how far to travel, etc.) and outcomes (e.g. whether the particular case obtained the desired result and how close to optimal it was: how fast, with or without side-effects, etc.). The Central Robotic Control both accesses the Case Library to determine if it has a known sequence of actions for a current task, and updates the Case Library with outcome information upon executing the task. If in learning mode, the Central Robotic Control adds new cases to the case library, or alternately deletes cases found to be ineffective.
A third database (database 3) 3196 contains the object store, essentially what the robot knows about external objects in the world, listing the objects, their types and their properties. For instance, a knife is of type “tool” and “utensil”, it is typically in a drawer or on a countertop, it has a certain size range, it can tolerate any gripping force, etc. An egg is of type “food”, it has a certain size range, it is typically found in the refrigerator, it can tolerate only a certain amount of force in gripping without breaking, etc. The object information is queried while forming new robotic action plans, to determine properties of objects, to recognize objects, and so on. The object store can also be updated when new objects are introduced, and it can update its information about existing objects and their parameters or parameter ranges.
A fourth database (database 4) 3197 contains information about the environment in which the robot is operating, including the location of the robot, the extent of the environment (e.g. the rooms in a house), their physical layout, and the locations and quantities of specific objects within that environment. Database 4 is queried whenever the robot needs to update object parameters (e.g. locations, orientations), or needs to navigate within the environment. It is updated frequently, as objects are moved, consumed, or new objects are brought in from the outside (e.g. when the human returns from the store or supermarket).
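By way of non-limiting illustration only, an object store of the kind described above may be sketched as follows. The record fields, the size ranges and the numeric force limit for the egg are hypothetical values chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ObjectRecord:
    name: str
    types: Set[str]
    typical_locations: Set[str]
    size_range_cm: Tuple[float, float]
    max_grip_force_n: Optional[float]  # None: tolerates any gripping force


class ObjectStore:
    def __init__(self) -> None:
        self._records: Dict[str, ObjectRecord] = {}

    def upsert(self, record: ObjectRecord) -> None:
        """Add a new object or update what is known about an existing one."""
        self._records[record.name] = record

    def safe_grip_force(self, name: str, requested_n: float) -> float:
        """Clamp a requested gripping force to what the object can tolerate."""
        limit = self._records[name].max_grip_force_n
        return requested_n if limit is None else min(requested_n, limit)

    def of_type(self, type_name: str) -> List[str]:
        return sorted(r.name for r in self._records.values() if type_name in r.types)


store = ObjectStore()
store.upsert(ObjectRecord("knife", {"tool", "utensil"}, {"drawer", "countertop"}, (15.0, 35.0), None))
store.upsert(ObjectRecord("egg", {"food"}, {"refrigerator"}, (4.0, 7.0), 5.0))  # 5 N is illustrative

print(store.safe_grip_force("egg", 20.0), store.safe_grip_force("knife", 20.0))
```

Such a query could be made while forming an action plan, so that a grasp planned for the egg does not exceed the force the egg is recorded as tolerating.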
An example of a very rudimentary behavior might be ‘finger-curl’, with a motion primitive related to ‘grasp’ that has all five fingers curl around an object, with a high-level behavior termed ‘fetch utensil’ that would involve arm movements to the respective location and then grasping the utensil with all five fingers. Each of the elementary behaviors (including the more rudimentary ones as well) has a correlated functional result and associated calibration variables describing and controlling each.
Linking allows for behavioral data to be linked with the physical world data, which includes data related to the physical system (robot parameters and environmental geometry, etc.), the controller (type and gains/parameters) used to effect movements, as well as the sensory-data (vision, dynamic/static measures, etc.) needed for monitoring and control, as well as other software-loop execution-related processes (communications, error-handling, etc.).
Conversion takes all linked MM data from one or more databases and, by way of a software engine termed the Actuator Control Instruction Code Translator & Generator, creates machine-executable (low-level) instruction code for each actuator (A1 thru An) controller (which themselves run a high-bandwidth control loop in position/velocity and/or force/torque) for each time-period (t1 thru tm), allowing the robot system to execute commanded instructions in a continuous set of nested loops.
Additionally, humanoid robot 3220 may have a neck 3230 with a number of DOF for forward/backward, up/down, left/right and rotation movements. It may have shoulders 3232 with a number of DOF for forward/backward and rotation movements, elbows with a number of DOF for forward/backward movements, and wrists 314 with a number of DOF for forward/backward and rotation movements. The humanoid robot 3220 may have hips 3234 with a number of DOF for forward/backward, left/right and rotation movements, knees 3236 with a number of DOF for forward/backward movements, and ankles 3236 with a number of DOF for forward/backward and left/right movements. The humanoid robot 3220 may house a battery 3238 or other power source to allow it to move untethered about its operational space. The battery 3238 may be rechargeable and may be any type of battery or other power source known.
Various possible parameters for each minimanipulation 1.1-1.n are tested to find the best way to execute a specific movement. For example, minimanipulation 1.1 (MM1.1) may be holding an object or playing a chord on a piano. For this step of the overall minimanipulation 3290, all the various sub-minimanipulations for the various parameters that complete step 1.1 are explored. That is, the different positions, orientations, and ways to hold the object are tested to find an optimal way to hold the object, including how the robotic arm, hand or humanoid holds its fingers, palms, legs, or any other robotic part during the operation. All the various holding positions and orientations are tested. Next, the robotic hand, arm, or humanoid may pick up a second object to complete minimanipulation 1.2. The second object, e.g., a knife, may be picked up and all the different positions, orientations, and ways to hold the object may be tested and explored to find the optimal way to handle the object. This continues until minimanipulation 1.n is completed and all the various permutations and combinations for performing the overall minimanipulation have been tested. Consequently, the optimal way to execute the mini-manipulation 3290 is stored in the library database of mini-manipulations, broken down into sub-minimanipulations 1.1-1.n. The saved minimanipulation then comprises the best way to perform the steps of the desired task, e.g., the best way to hold the first object, the best way to hold the second object, the best way to strike the first object with the second object, etc. These top combinations are saved as the best way to perform the overall minimanipulation 3290.
To create the minimanipulation that results in the best way to complete the task, multiple parameter combinations are tested to identify an overall set of parameters that ensure the desired functional result is achieved. The teaching/learning process for the robotic apparatus 75 involves multiple and repetitive tests to identify the necessary parameters to achieve the desired final functional result.
These tests may be performed over varying scenarios. For example, the size of the object can vary. The location at which the object is found within the workspace, can vary. The second object may be at different locations. The mini-manipulation must be successful in all of these variable circumstances.
Once the learning process has been completed, results are stored as a collection of action primitives that together are known to accomplish the desired functional result.
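By way of non-limiting illustration only, the repeated testing of parameter combinations over varying scenarios may be sketched as an exhaustive search that retains only those combinations succeeding in every scenario. The parameter names, the scenario description and the success criterion are hypothetical; exhaustive search is used here merely as the simplest stand-in for the teaching/learning process.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

Params = Dict[str, float]
Scenario = Dict[str, float]


def find_robust_parameters(
    grid: Dict[str, Sequence[float]],
    scenarios: Sequence[Scenario],
    succeeds: Callable[[Params, Scenario], bool],
) -> List[Params]:
    """Return every parameter combination that succeeds in all scenarios."""
    names = list(grid)
    robust: List[Params] = []
    for values in product(*(grid[name] for name in names)):
        candidate = dict(zip(names, values))
        if all(succeeds(candidate, scenario) for scenario in scenarios):
            robust.append(candidate)
    return robust


def demo_succeeds(params: Params, scenario: Scenario) -> bool:
    # Assumption: the grasp works if the hand opens slightly wider than the
    # object (but not more than 2 cm wider) and squeezes with at least 3 N.
    size = scenario["object_size_cm"]
    fits = size <= params["grip_width_cm"] <= size + 2.0
    return fits and params["grip_force_n"] >= 3.0


grid = {"grip_width_cm": [4.0, 6.0, 8.0], "grip_force_n": [2.0, 4.0]}
scenarios = [{"object_size_cm": 5.0}, {"object_size_cm": 5.5}]
robust = find_robust_parameters(grid, scenarios, demo_succeeds)
print(robust)
```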
where G represents the set of objective (or “goal”) parameters (1st through nth) and P represents the set of robotic apparatus 75 parameters (correspondingly, 1st through nth). The numerator in the sum represents the difference between robotic and goal parameters (i.e. the error) and the denominator normalizes for the maximal difference. The sum gives the total normalized cumulative error
and multiplying by 1/n gives the average error. The complement of the average error (i.e. subtracting it from 1) corresponds to the average accuracy.
In another embodiment the accuracy calculation weighs the parameters for their relative importance, where each coefficient (each αi) represents the importance of the ith parameter, the normalized cumulative error is
and the estimated average accuracy is given by:
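By way of non-limiting illustration only, the accuracy calculation described in words above may be sketched as follows. The sketch assumes that each error term is the absolute difference between the goal parameter and the robotic parameter divided by a supplied maximal difference, and that the weighted variant divides the weighted cumulative error by the sum of the coefficients αi; both are assumptions of this sketch, and the function and argument names are hypothetical.

```python
from typing import Optional, Sequence


def average_accuracy(
    goal: Sequence[float],
    actual: Sequence[float],
    max_diff: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return 1 minus the (optionally weighted) average normalized error.

    Unweighted: 1 - (1/n) * sum(|g_i - p_i| / max_diff_i)
    Weighted:   1 - sum(a_i * |g_i - p_i| / max_diff_i) / sum(a_i)
    """
    n = len(goal)
    if n == 0 or len(actual) != n or len(max_diff) != n:
        raise ValueError("goal, actual and max_diff must be non-empty and equally long")
    if weights is None:
        weights = [1.0] * n  # equal importance reduces to the plain average
    if len(weights) != n:
        raise ValueError("weights must have one coefficient per parameter")
    errors = [abs(g - p) / m for g, p, m in zip(goal, actual, max_diff)]
    weighted_error = sum(a * e for a, e in zip(weights, errors))
    return 1.0 - weighted_error / sum(weights)


goal = [10.0, 2.0]
actual = [9.0, 2.5]
max_diff = [10.0, 1.0]

plain = average_accuracy(goal, actual, max_diff)
weighted = average_accuracy(goal, actual, max_diff, weights=[3.0, 1.0])
print(round(plain, 3), round(weighted, 3))
```

With the illustrative values above, the normalized errors are 0.1 and 0.5, giving an average accuracy of about 0.7 unweighted and about 0.8 when the first parameter is weighted three times as heavily as the second.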
In
In an embodiment, in order to complete minimanipulations 1.1-1.3 to yield the functional result, the right hand and left hand must sense and receive feedback on the object and the state change of the object in the hand, palm, or leg. This sensed state change may result in an adjustment to the parameters that comprise the minimanipulation. Each change in one parameter may yield a change to each subsequent parameter and each subsequent required minimanipulation until the desired task result is achieved.
The second sub-minimanipulation in step 3351 may be 3351b. The step 3351b requires positioning the standard knife object in a correct orientation and applying the correct pressure, grasp, and orientation to slice the fish on the board. Simultaneously, the left hand, leg, palm, etc. is required to perform coordinated steps to complement and coordinate the completion of the sub-minimanipulation. All these starting positions, times, and other sensor feedbacks and signals need to be captured and optimized to ensure a successful implementation of the action primitive to complete the sub-minimanipulation.
The minimanipulations required to complete this task may be broken down into a series of techniques for the body and for each hand and foot. For example, there may be a series of right hand minimanipulations that successfully press and hold a series of piano keys according to playing techniques 1-n. Similarly, there may be a series of left hand minimanipulations that successfully press and hold a series of piano keys according to playing techniques 1-n. There may also be a series of minimanipulations identified to successfully press a piano pedal with the right or left foot. As will be understood by one skilled in the art, each minimanipulation for the right and left hands and feet, can be further broken down into sub-minimanipulations to yield the desired functional result, e.g. playing a musical composition on the piano.
One embodiment requires placing a motor 3510 that controls the position of a robotic hand 72 not at the wrist where it would normally be placed in proximity of the hand, but rather further up in the robotic arm 70, preferentially just below the elbow 3212. In that embodiment the advantage of the motor placement closer to the elbow 3212 can be calculated as follows, starting with the original torque on the hand 72 caused by the weight of the hand.
$$T_{\text{original}}(\text{hand}) = (w_{\text{hand}} + w_{\text{motor}})\, d_h(\text{hand}, \text{elbow})$$
where weight $w_i = g m_i$ (gravitational acceleration $g$ times mass of object $i$), and horizontal distance $d_h = \text{length}(\text{hand}, \text{elbow}) \cos\theta_v$ for the vertical angle $\theta_v$. However, if the motor is placed near the joint (a distance $\epsilon$ away from it), then the new torque is:
$$T_{\text{new}}(\text{hand}) = w_{\text{hand}}\, d_h(\text{hand}, \text{elbow}) + w_{\text{motor}}\, \epsilon_h$$
Since the motor 3510 next to the elbow-joint 3212 of the robotic arm contributes only an epsilon-distance to the torque, the torque in the new system is dominated by the weight of the hand, including whatever the hand may be carrying. The advantage of this new configuration is that the hand may lift greater weight with the same motor, since the motor itself contributes very little to the torque.
A skilled artisan will appreciate the advantage of this aspect of the disclosure, and would also realize that a small corrective factor is needed to account for the mass of the device used to transmit the force exerted by the motor to the hand; such a device could be a set of small axles. Hence, the full new torque with this small corrective factor would be:
$$T_{\text{new}}(\text{hand}) = w_{\text{hand}}\, d_h(\text{hand}, \text{elbow}) + w_{\text{motor}}\, \epsilon_h + \tfrac{1}{2}\, w_{\text{axle}}\, d_h(\text{hand}, \text{elbow})$$
where the weight of the axle exerts half-torque since its center of gravity is halfway between the hand and the elbow. Typically the weight of the axles is much less than the weight of the motor.
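By way of non-limiting illustration only, the torque comparison above may be evaluated numerically as follows. The masses, forearm length, angle and epsilon distance are hypothetical values chosen for this sketch and do not describe any particular embodiment.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def weight(mass_kg: float) -> float:
    """w_i = g * m_i"""
    return G * mass_kg


def horizontal_distance(length_m: float, theta_v_rad: float) -> float:
    """d_h = length(hand, elbow) * cos(theta_v)"""
    return length_m * math.cos(theta_v_rad)


def torque_original(m_hand: float, m_motor: float, length_m: float, theta_v: float) -> float:
    """Motor at the wrist: hand and motor both act at the full distance d_h."""
    return (weight(m_hand) + weight(m_motor)) * horizontal_distance(length_m, theta_v)


def torque_new(
    m_hand: float, m_motor: float, m_axle: float,
    length_m: float, theta_v: float, epsilon_h: float,
) -> float:
    """Motor near the elbow, including the half-torque correction for the axle."""
    d_h = horizontal_distance(length_m, theta_v)
    return weight(m_hand) * d_h + weight(m_motor) * epsilon_h + 0.5 * weight(m_axle) * d_h


# Illustrative values only: 0.5 kg hand, 0.4 kg motor, 0.05 kg axle,
# 0.30 m forearm held horizontally, motor 0.02 m from the elbow joint.
t_orig = torque_original(0.5, 0.4, 0.30, 0.0)
t_new = torque_new(0.5, 0.4, 0.05, 0.30, 0.0, 0.02)
print(f"original: {t_orig:.4f} N*m, new: {t_new:.4f} N*m")
```

With these illustrative values the torque falls from roughly 2.65 N·m to roughly 1.62 N·m, the difference being available for lifting additional payload with the same motor.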
One embodiment of the present disclosure illustrates a universal android-type robotic device that comprises the following features or components. A robotic software engine, such as the robotic food preparation engine 56, is configured to replicate any type of human hand movements and products in an instrumented or standardized environment. The resulting product from the robotic replication can be (1) physical, such as a food dish, a painting, a work of art, etc., or (2) non-physical, such as the robotic apparatus playing a musical piece on a musical instrument, a health care assistant procedure, etc.
Several significant elements in the universal android-type (or other software operating systems) robotic device may include some or all of the following, or in combination with other features. First, the robotic operating or instrumented environment operates a robotic device providing standardized (or “standard”) operating volume dimensions and architecture for Creator and Robotic Studios. Second, the robotic operating environment provides standardized position and orientation (xyz) for any standardized objects (tools, equipment, devices, etc.) operating within the environment. Third, the standardized features extend to, but are not limited to, a standardized attendant equipment set, a standardized attendant tools and devices set, two standardized robotic arms, two robotic hands that closely resemble functional human hands with access to one or more libraries of minimanipulations, and standardized three-dimensional (3D) vision devices for creating a dynamic virtual 3D-vision model of the operation volume. This data can be used for hand motion capturing and functional result recognition. Fourth, hand motion gloves with sensors are provided to capture precise movements of a creator. Fifth, the robotic operating environment provides standardized type/volume/size/weight of the required materials and ingredients during each particular (creator) product creation and replication process. Sixth, one or more types of sensors are used to capture and record the process steps for replication.
The software platform in the robotic operating environment includes the following subprograms. The software engine (e.g., robotic food preparation engine 56) captures and records arm and hand motion script subprograms during the creation process as human hands wear gloves with sensors to provide sensory data. One or more minimanipulation functional library subprograms are created. The operating or instrumented environment records a three-dimensional dynamic virtual volume model subprogram based on a timeline of the hand motions by a human (or a robot) during the creation process. The software engine is configured to recognize each functional minimanipulation from the library subprogram during a task creation by human hands. The software engine defines the associated minimanipulation variables (or parameters) for each task creation by human hands for subsequent replication by the robotic apparatus. The software engine records sensor data from the sensors in an operating environment, with which a quality check procedure can be implemented to verify the accuracy of the robotic execution in replicating the creator's hand motions. The software engine includes an adjustment algorithms subprogram for adapting to any non-standardized situations (such as an object, volume, equipment, tools, or dimensions), which converts non-standardized parameters to standardized parameters to facilitate the execution of a task (or product) creation script. The software engine stores a subprogram (or sub software program) of a creator's hand motions (which reflect the intellectual property product of the creator) for generating a software script file for subsequent replication by the robotic apparatus. The software engine includes a product or recipe search engine to locate the desirable product efficiently. Filters to the search engine are provided to personalize the particular requirements of a search.
An e-commerce platform is also provided for exchanging, buying, and selling any IP script (e.g., software recipe files), food ingredients, tools, and equipment to be made available on a designated website for commercial sale. The e-commerce platform also provides a social network page for users to exchange information about a particular product of interest or zone of interest.
One purpose of the robotic apparatus replicating is to produce the same or substantially the same product result, e.g., the same food dish, the same painting, the same music, the same writing, etc. as the original creator through the creator's hands. A high degree of standardization in an operating or instrumented environment provides a framework, while minimizing variance between the creator's operating environment and the robotic apparatus operating environment, within which the robotic apparatus is able to produce substantially the same result as the creator, with some additional factors to consider. The replication process has the same or substantially the same timeline, with preferably the same sequence of minimanipulations, the same initial start time, the same time duration and the same ending time of each minimanipulation, while the robotic apparatus autonomously operates at the same speed of moving an object between minimanipulations. The same task program or mode is used on the standardized kitchen and standardized equipment during the recording and execution of the minimanipulation. A quality check mechanism, such as three-dimensional vision and sensors, can be used to minimize or avoid any failed result, and adjustments to variables or parameters can be made to cater to non-standardized situations. Failure to use a standardized environment (i.e., not the same kitchen volume, not the same kitchen equipment, not the same kitchen tools, and not the same ingredients between the creator's studio and the robotic kitchen) increases the risk of not obtaining the same result when a robotic apparatus attempts to replicate a creator's motions in hopes of obtaining the same result.
The robotic kitchen can operate in at least two modes, a computer mode and a manual mode. During the manual mode, the kitchen equipment includes buttons on an operating console (without the requirement to recognize information from a digital display or to input any control data through a touchscreen, to avoid any entry mistake during either recording or execution). In case of touchscreen operation, the robotic kitchen can provide a three-dimensional vision capturing system for recognizing current information on the screen to avoid an incorrect operation choice. The software engine is operable with different kitchen equipment, different kitchen tools, and different kitchen devices in a standardized kitchen environment. A creator's limitation is to produce hand motions on sensor gloves that are capable of replication by the robotic apparatus in executing mini-manipulations. Thus, in one embodiment, the library (or libraries) of minimanipulations that are capable of execution by the robotic apparatus serves as a functional limitation on the creator's motion movements. The software engine creates an electronic library of three-dimensional standardized objects, including kitchen equipment, kitchen tools, kitchen containers, kitchen devices, etc. The pre-stored dimensions and characteristics of each three-dimensional standardized object conserve resources and reduce the amount of time to generate a three-dimensional model of the object from the electronic library, rather than having to create a three-dimensional model in real time. In one embodiment, the universal android-type robotic device is capable of creating a plurality of functional results.
The functional results represent successful or optimal results from the execution of minimanipulations by the robotic apparatus, such as the humanoid walking, the humanoid running, the humanoid jumping, the humanoid (or robotic apparatus) playing a musical composition, the humanoid (or robotic apparatus) painting a picture, and the humanoid (or robotic apparatus) making a dish. The execution of minimanipulations can occur sequentially, in parallel, or such that one prior minimanipulation must be completed before the start of the next minimanipulation. To make humans more comfortable with a humanoid, the humanoid would make the same motions (or substantially the same) as a human and at a pace comfortable to the surrounding human(s). For example, if a person likes the way that a Hollywood actor or a model walks, the humanoid can operate with minimanipulations that exhibit the motion characteristics of the Hollywood actor (e.g., Angelina Jolie). The humanoid can also be customized with a standardized human type, including skin-looking cover, male humanoid, female humanoid, physical and facial characteristics, and body shape. The humanoid covers can be produced using three-dimensional printing technology at home.
One example operating environment for the humanoid is a person's home; while some environments are fixed, others are not. The more that the environment of the house can be standardized, the less risk there is in operating the humanoid. If the humanoid is instructed to bring a book, a task that does not relate to a creator's intellectual property/intellectual thinking (IP) and thus requires a functional result without the IP, the humanoid would navigate the pre-defined household environment and execute one or more minimanipulations to bring the book and give the book to the person. Some three-dimensional objects, such as a sofa, have been previously created in the standardized household environment when the humanoid conducted its initial scanning or performed a three-dimensional quality check. The humanoid may need to create a three-dimensional model for an object that the humanoid does not recognize or that was not previously defined.
Sample types of kitchen equipment are illustrated as Table A in
While it is possible to conceive of a dynamic approach to define the macro-/micro-manipulation subsystems at any desirable time-domain level (task-based or even at every controller sampling-step) using an intelligent subsystem separation planner, providing a system that operates at a more optimal and minimal computational load level, it would however add substantial complexity to the system. In our embodiment we propose an a-priori logical and physical separation of the entire physical system into its macro- and micro-manipulation subsystems, each carrying out their own dedicated planning and control tasks based on a more real-world and computationally-motivated level. The implemented separation allows for the use of well known and understood real-time inverse kinematic planners for free-space translational and rotational movements in all degrees of freedom, namely 3 translational (XYZ) and 3 rotational (roll, pitch, yaw), adding up to 6 DoFs. Beyond that, we are able to use a separate multi-DoF inverse kinematic planner that addresses the remaining manipulation elements, namely the palm and fingers with their attached tools/utensils and vessels, thereby decoupling the entire inverse kinematic planning into multiple sets of computationally-manageable processes, each capable of providing solutions in real time for each of their respective (sub-)systems. For workspace movements beyond those of the stationary articulation system, namely the articulated arm/hand systems, a separate planner can be used that allows a coarse positioning system, in our case the Cartesian XYZ positioner, to provide an inverse kinematic solution to said system that can re-center the available workspace around that of the arm/hand system (akin to moving the robotic system along rails to reach parts of the workspace that lie outside of the reach of the articulated robot-arm).
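The re-centering role of the coarse Cartesian XYZ positioner can be illustrated with a minimal sketch. The names `Pose6D` and `recenter_positioner`, the spherical reach model, and the straight-line repositioning rule are assumptions made for illustration only; the disclosure does not specify how the positioner solution is computed.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Pose6D:
    """Free-space pose: 3 translational (XYZ) plus 3 rotational DoFs."""
    x: float
    y: float
    z: float
    roll: float = 0.0
    pitch: float = 0.0
    yaw: float = 0.0


def recenter_positioner(
    base_xyz: Tuple[float, float, float],
    target: Pose6D,
    arm_reach: float,
) -> Tuple[float, float, float]:
    """Return a positioner XYZ location from which the arm can reach the target.

    Assumption: the arm workspace is modelled as a sphere of radius
    `arm_reach` centred on the positioner. If the target already lies
    inside that sphere the positioner stays put; otherwise it moves along
    the straight line toward the target until the target sits on the
    boundary of the sphere. Orientation is left to the arm-level planner.
    """
    target_xyz = (target.x, target.y, target.z)
    distance = math.dist(base_xyz, target_xyz)
    if distance <= arm_reach:
        return base_xyz
    # Fraction of the base-to-target vector the positioner must travel.
    scale = (distance - arm_reach) / distance
    return tuple(b + (t - b) * scale for b, t in zip(base_xyz, target_xyz))
```

In this sketch the coarse planner only decides where the arm workspace should be centred; the separate 6-DoF arm planner and the multi-DoF hand planner would then solve their own, smaller problems from that location.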
The robotic system operating in a real-world environment has been split into three (3) separate physical entities, namely the (1) articulated base, which includes the (a) upper-extremity (sensor-head) and torso, and (b) linked appendages, which are typically articulated serial-configuration arms (but need not be) with multiple DoFs of differing types; (2) endeffectors, which include a wrist with a variety of end-of-arm (EoA) tooling such as fingers, docking-fixtures, etc., and (3) the domain-application itself, such as a fully-instrumented laboratory, bathroom or kitchen, where the latter would contain cooking tools, pots/pans, appliances, ingredients, user-interaction devices, etc.
A typical manipulation system, particularly one requiring substantial mobility over larger workspaces while still needing appreciable endpoint motion accuracy, can be physically and logically subdivided into a macro-manipulation subsystem, comprising a large workspace positioner 3540 coupled with an articulated body 3531 that comprises multiple elements 3541 for coarse motion, and a micro-manipulation subsystem 3549 used for fine motions, the two being physically joined and interacting with the environment 3551 in which they operate.
For larger workspace applications, where the workspace exceeds that of a typical articulated robotic system, it is possible to increase the system's reach by adding a positioner, typically in free space, allowing movements in XYZ (three translational coordinates) space, as depicted by 3540, allowing for workspace repositioning 3544. Such a positioner could be a mobile wheeled or legged base, an aerial platform, or simply a gantry-style orthogonal XYZ positioner, capable of positioning an articulated body 3531. For applications where a humanoid-type configuration is one of the possible physical robot instantiations, said articulated body 3531 would describe a physical set of interlinked elements 3541, comprising upper-extremities 3545 and linked appendages 3546. Each of these interlinked elements within the macro-manipulation subsystem 3541 and 3540 would consist of instrumented, articulated and controller-actuated sub-elements, including a head 3542 replete with a variety of environment perception and modelling sensing elements, connected to an instrumented articulated and controller-actuated shouldered torso 3534 and an instrumented articulated and controller-actuated waist 3543. The shoulders in the torso can have attached to them linked appendages 3546, such as one (typically two) or more instrumented articulated and controller-actuated jointed arms 3536, to each of which would be attached an instrumented articulated and controller-actuated wrist 3537. A waist may also have attached to it mobility elements such as one or more legs 3535, in order to allow the robotic system to operate in a much more expanded workspace.
A physically attached micro-manipulation subsystem 3549 is used in applications requiring fine position and/or velocity trajectory-motions and high-fidelity control of interaction forces/torques that a macro-manipulation subsystem 3541, whether coupled to a positioner 3540 or not, would not be able to sense and/or control to the level required for a particular domain-application. The micro-manipulation subsystem 3549 is typically attached to each of the linked appendages 3546 at the interface mounting locations of the instrumented articulated and controller-actuated wrist 3537. It is possible to attach a variety of instrumented articulated and controller-actuated end-of-arm (EoA) tooling 3547 to said mounting interface(s). While a wrist 3537 itself can be an instrumented articulated and controller-actuated multi-degree-of-freedom (DoF) element (such as a typical three-DoF rotation configuration in roll/pitch/yaw), it is also the mounting platform to which one may choose to attach a highly dexterous instrumented articulated and controller-actuated multi-fingered hand including fingers with a palm 3538. Other options could also include a passive or actively controllable fixturing-interface 3539 to allow the grasping of specially designed devices meant to mate with it, often providing a rigid mechanical and also electrical (data, power, etc.) interface between the robot and the device. The depicted concept need not be limited to the ability to attach fingered hands 3538 or fixturing devices 3539, but may extend to other devices 3550, which can include devices for rigidly anchoring to a surface, or even other devices.
The variety of endeffectors 3532 that can form part of the micro-manipulation subsystem 3549 allows for high-fidelity interactions between the robotic system and the environment/world 3548 by way of a variety of devices 3551. The types of interactions depend on the domain application 3533. In the case of the domain application being a robotic kitchen with a robotic cooking system, the interactions would occur with such elements as cooking tools 3556 (whisks, knives, forks, spoons, etc.), vessels including pots and pans 3555 among many others, appliances 3554 such as toasters, electric beaters or electric knives, etc., cooking ingredients 3553 to be handled and dispensed (such as spices, etc.), and even potential live interactions with a user 3552 in case of human-robot interactions called for in the recipe or required by other operational considerations.
The macro-/micro-distinctions provide differentiation in the types of minimanipulation libraries and their relative descriptors, as well as improved and higher-fidelity learning results based on more localized and higher-accuracy sensory elements contained within the endeffectors, rather than relying on sensors that are typically part of (and mounted on) the articulated base (which offer a larger FoV, but thereby also lower resolution and fidelity when it comes to monitoring finer movements at the “product-interface”, where the cooking tasks mostly take place when it comes to decision-making).
The overall structure in
The macro-/micro-level split also allows: (1) presence and integration of sensing systems at the macro (base) and micro (endeffector) levels (not to speak of the varied sensory elements one could list, such as cameras, lasers, haptics, any EM-spectrum based elements, etc.); (2) application of varied learning techniques at the macro- and micro levels to apply to different minimanipulation libraries suitable to different levels of manipulation (such as coarser movements and posturing of the articulated base using macro-minimanipulation databases, and finer and higher-fidelity configurations and interaction forces/torques of the respective endeffectors using micro-minimanipulation databases), and each thus with descriptors and sensors better suited to execute/monitor/optimize said descriptors and their respective databases; (3) need and application of distributed and embedded processors and sensory architecture, as well as the real-time operating system and multi-speed buses and storage elements; (4) use of the “0-Position” method, whether aided by markers or fixtures, to aid in acquiring and handling (reliably and accurately) any needed tool or appliance/pot/pan or other elements; and (5) interfacing of an instrumented inventory system (for tools, ingredients, etc.) and a smart Utensil/Container/Ingredient storage system.
A multi-level robotic operational system, in this case one of a two-level macro- and micro-manipulation subsystem (3541 and 3549, respectively), comprising of a macro-level articulated and instrumented large workspace coarse-motion articulated and instrumented base 3610, connected to a micro-level fine-motion high-fidelity environment interaction instrumented EoA-tooling subsystem 3620, allows for position and velocity motion planners to provide task-specific motion commands through Mini-manipulation libraries 3630 at both the macro- and micro-levels (3631 and 3632, respectively). The ability to share feedback data and send and receive motion commands is only possible through the use of a distributed processor and sensing architecture 3650, implemented via a (distributed) real-time operating system interacting over multiple varied-speed bus interfaces 3640, taking in high-level task-execution commands from a high-level planner 3660, which are in turn broken down into separate yet coordinated trajectories for both the macro and micro manipulation subsystems.
The macro-manipulation subsystem 610 instantiated by an instrumented articulated and controller-actuated articulated instrumented base 3610 requires a multi-element linked set of operational blocks 3611 through 3616 to function properly. Said operational blocks rely on a separate and distinct set of processing and communication bus hardware responsible for the macro-level sensing and control tasks. In a typical macro-level subsystem said operational blocks require the presence of a macro-level command translator 3616, which takes in mini-manipulation commands from a library 3630 and its macro-level mini-manipulation sublibrary 3631, and generates a set of properly sequenced machine-readable commands for a macro-level planning module 3612, where the motions required for each of the instrumented and actuated elements are calculated in at least the joint- and Cartesian-space. Said motion commands are sequentially fed to an execution block 3613, which controls all instrumented articulated and actuated joints in at least joint- or Cartesian space to ensure the movements track the commanded trajectories in position/velocity and/or torque/force. A feedback sensing block 3614 provides feedback data from all sensors to the execution block 3613 as well as to an environment perception block/module 3611 for further processing. Feedback covers not only the internal state variables, but also sensory data from sensors measuring the surrounding environment and geometries. Feedback data from said module 3614 is used by the execution module 3613 to ensure actual values track their commanded setpoints, and by the environment perception module 3611 to image and map, model and identify the state of each articulated element, the overall configuration of the robot, and the state of the surrounding environment in which the robot is operating.
Additionally, said feedback data is also provided to a learning module 3615 responsible for tracking the overall performance of the system and comparing it to known required performance metrics, allowing one or more learning methods to develop a continuously updated set of descriptors that define all mini-manipulations contained within their respective mini-manipulation library 3630, in this case the macro-level mini-manipulation sublibrary 3631.
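How an execution block might use feedback to keep an actual value on its commanded setpoint can be illustrated with a minimal sketch. The proportional update rule, the gain value and the function name `track_setpoint` are illustrative assumptions; the disclosure does not specify a control law.

```python
from typing import List


def track_setpoint(
    setpoint: float,
    initial: float,
    gain: float = 0.5,
    steps: int = 20,
) -> List[float]:
    """Simulate a simple closed loop driving one joint value to a setpoint.

    Each iteration stands in for one controller sampling step: the
    feedback (current value) is compared with the commanded setpoint and
    a correction proportional to the error is applied. This is a
    hypothetical stand-in for the execution/feedback blocks, not the
    controller described in the source.
    """
    value = initial
    history = []
    for _ in range(steps):
        error = setpoint - value   # feedback comparison
        value += gain * error      # proportional correction
        history.append(value)
    return history
```

With a gain between 0 and 1 the remaining error shrinks by a constant factor at every step, so the logged history could also serve as input to a learning module that compares tracking performance against required metrics.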
In the case of the micro-manipulation system 620 instantiated by an instrumented articulated and controller-actuated articulated instrumented EoA-tooling subsystem 3620, the logical operational blocks described above are similar except that operations are targeted and executed only for those elements that form part of the micro-manipulation subsystem 620. Said instrumented articulated and controller-actuated articulated instrumented EoA-tooling subsystem 3620 requires a multi-element linked set of operational blocks 3621 through 3626 to function properly. Said operational blocks rely on a separate and distinct set of processing and communication bus hardware responsible for the micro-level sensing and control tasks. In a typical micro-level subsystem said operational blocks require the presence of a micro-level command translator 3626, which takes in mini-manipulation commands from a library 3630 and its micro-level mini-manipulation sublibrary 3632, and generates a set of properly sequenced machine-readable commands for a micro-level planning module 3622, where the motions required for each of the instrumented and actuated elements are calculated in at least the joint- and Cartesian-space. Said motion commands are sequentially fed to an execution block 3623, which controls all instrumented articulated and actuated joints in at least joint- or Cartesian space to ensure the movements track the commanded trajectories in position/velocity and/or torque/force. A feedback-sensing block 3624 provides feedback data from all sensors to the execution block 3623 as well as to a task perception block/module 3621 for further processing. Feedback covers not only the internal state variables, but also sensory data from sensors measuring the immediate EoA configuration/geometry as well as the measured process and product variables such as contact force, friction, interaction product state, etc.
Feedback data from said module 3624 is used by the execution module 3623 to ensure actual values track their commanded setpoints, and by the task perception module 3621 to image and map, model and identify the state of each articulated element, the overall configuration of the EoA-tooling, the type and state of the environment interaction variables the robot is operating in, and the particular variables of interest of the element/product being interacted with (for example, the bristle width of a paintbrush during painting, the consistency of egg whites being beaten, or the cooking-state of a fried egg). Additionally, said feedback data is also provided to a learning module 3625 responsible for tracking the overall performance of the system and comparing it to known required performance metrics for each task and its associated mini-manipulation commands, allowing one or more learning methods to develop a continuously updated set of descriptors that define all mini-manipulations contained within their respective mini-manipulation library 3630, in this case the micro-level mini-manipulation sublibrary 3632.
In the case of the macro-manipulation subsystem 3710, a connection is made to the world perception and modelling subsystem 3730 through a dedicated sensor bus 3770, with the sensors associated with said subsystem responsible for sensing, modelling and identifying the world around the entire robot system and the latter itself, within said world. The raw and processed macro-manipulation subsystem sensor data is then forwarded over the same sensor bus 3770 to the macro-manipulation planning and execution module 3750, where a set of separate processors are responsible for executing task-commands received from the task mini-manipulation parallel task execution planner 3830, which in turn receives its task commands from the high-level mini-manipulation planner 3870 over a data and controller bus 3780, and controlling the macro-manipulation subsystem 3710 to complete said tasks based on the feedback it receives from the world perception and modelling module 3730, by sending commands over a dedicated controller bus 3760. Commands received through this controller bus 3760, are executed by each of the respective hardware modules within the articulated and instrumented base subsystem 3710, including the positioner system 3713, the repositioning single kinematic chain system 3712, to which are attached the head system 3711 as well as the appendage system 3714 and the thereto attached wrist system 3715.
The positioner system 3713 reacts to repositioning movement commands to its Cartesian XYZ positioner 3713a, where an integral and dedicated processor-based controller executes said commands by controlling actuators in a high-speed closed loop based on feedback data from its integral sensors, allowing for the repositioning of the entire robotic system to the required workspace location. The repositioning single kinematic chain system 3712 attached to the positioner system 3713, with the appendage system 3714 attached to the repositioning single kinematic chain system 3712 and the wrist system 3715 attached to the ends of the arms articulation system 3714a, uses the same architecture described above, where each of their articulation subsystems 3712a, 3714a and 3715a, receive separate commands to their respective dedicated processor-based controllers to command their respective actuators and ensure proper command-following through monitoring built-in integral sensors to ensure tracking fidelity. The head system 3711 receives movement commands to the head articulation subsystem 3711a, where an integral and dedicated processor-based controller executes said commands by controlling actuators in a high-speed closed loop based on feedback data from its integral sensors.
The architecture is similar for the micro-manipulation subsystem. The micro-manipulation subsystem 3720, communicates with the product and process modelling subsystem 3740 through a dedicated sensor bus 3771, with the sensors associated with said subsystem responsible for sensing, modelling and identifying the immediate vicinity at the EoA, including the process of interaction and the state and progression of any product being handled or manipulated. The raw and processed micro-manipulation subsystem sensor data is then forwarded over its own sensor bus 3771 to the micro-manipulation planning and execution module 3751, where a set of separate processors are responsible for executing task-commands received from the mini-manipulation parallel task execution planner 3830, which in turn receives its task commands from the high-level mini-manipulation planner 3870 over a data and controller bus 3780, and controlling the micro-manipulation subsystem 3720 to complete said tasks based on the feedback it receives from the product and process perception and modelling module 3740, by sending commands over a dedicated controller bus 3761. Commands received through this controller bus 3761, are executed by each of the respective hardware modules within the instrumented EoA tooling subsystem 3720, including the hand system 3723 and the cooking-system 3722. The hand system 3723 receives movement commands to its palm and fingers articulation subsystem 3723a with its respective dedicated processor-based controllers commanding their respective actuators to ensure proper command-following through monitoring built-in integral sensors to ensure tracking fidelity. 
The cooking system 3722, which encompasses specialized tooling and utensils 3722a (which may be completely passive and devoid of any sensors or actuators or contain simply sensing elements without any actuation elements), is responsible for executing commands addressed to it, through a similar dedicated processor-based controller executing a high-speed control-loop based on sensor-feedback, by sending motion commands to its integral actuators. Furthermore, a vessel subsystem 3722b representing containers and processing pots/pans, which may be instrumented through built-in dedicated sensors for various purposes, can also be controlled over a common bus spanning between the hand system 3723 and the cooking system 3722.
A high-level task executor 3900 provides a task description to the mini-manipulation sequence selector 3910, which selects candidate action-primitives (elemental motions and controls) separately for the macro- and micro-manipulation subsystems 3810 and 3820, respectively, where said components are processed to yield a separate stack of commands for the mini-manipulation parallel task execution planner 3830, which combines them and checks them for proper functionality and synchronicity through simulation, and then forwards them to the respective macro- and micro-manipulation planner and executor modules 3750 and 3751.
In the case of the macro-manipulation subsystem, input data used to generate the respective mini-manipulation command stack sequence includes raw and processed sensor feedback data 3860 from the instrumented base and environment perception and modelling data 3850 from the world perception modeller 3730. The incoming mini-manipulation component candidates 491 are provided to the macro mini-manipulation database 3811 with its respective integral descriptors, which organizes them by type and sequence 3815, before they are processed further by its dedicated mini-manipulation planner 3812; additional input to said database 3811 occurs by way of mini-manipulation candidate descriptor updates 3814 provided by a separate learning process described later. Said macro manipulation subsystem planner 3812 also receives input from the mini-manipulation progress tracker 3813, which is responsible for providing progress information on task execution variables and status, as well as observed deviations, to said planning system 3812. The progress tracker 3813 carries out its tracking process by comparing, in a comparator, inputs comprising the required baseline performance 3817 for each task-execution element with sensory feedback data 3860 (raw & processed) from the instrumented base as well as environment perception and modelling data 3850; the comparator generates deviation data 3816 and process improvement data 3818, the latter comprising performance increases through descriptor variable and constant modifications developed by an integral learning system, and feeds both back to the planner system 3812.
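The comparator inside the progress tracker can be illustrated with a minimal sketch that subtracts a baseline from measured feedback and flags variables that drift too far. The function name `compare_to_baseline`, the dictionary layout, the variable names and the absolute tolerance are assumptions for illustration; the disclosure does not define the form of the deviation data.

```python
from typing import Dict, List, Tuple


def compare_to_baseline(
    baseline: Dict[str, float],
    feedback: Dict[str, float],
    tolerance: float,
) -> Tuple[Dict[str, float], List[str]]:
    """Compare measured feedback with the required baseline performance.

    Returns a mapping of per-variable deviations (feedback minus
    baseline) and a sorted list of the variables whose absolute
    deviation exceeds `tolerance`. Variables absent from the feedback
    are skipped, since no deviation can be computed for them.
    """
    deviations = {
        name: feedback[name] - required
        for name, required in baseline.items()
        if name in feedback
    }
    flagged = sorted(
        name for name, delta in deviations.items() if abs(delta) > tolerance
    )
    return deviations, flagged
```

A planner receiving this output could, for example, re-plan only the sub-tasks associated with the flagged variables rather than the whole command stack.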
The mini-manipulation planner system 3812 takes in all these input data streams 3816, 3818 and 3815, and performs a series of steps on this data in order to arrive at a set of sequential command stacks for task execution commands 3892 developed for the macro-manipulation subsystem, which are fed to the mini-manipulation parallel task execution planner 3830 for additional checking and combining before being converted into machine-readable mini-manipulation commands 3870 provided to each macro- and micro-manipulation subsystem separately for execution. The mini-manipulation planner system 3812 generates said command sequence 3892 through a set of steps, which are not limited to those listed, need not occur in this sequence, and may involve internal looping, passing the data through: (i) an optimizer to remove any redundant or overlapping task-execution timelines, (ii) a feasibility evaluator to verify that each sub-task is completed according to a given set of metrics associated with each subtask, before proceeding to the next subtask, (iii) a resolver to ensure no gaps in execution-time or task-steps exist, and finally (iv) a combiner to verify proper task execution order and end-result, prior to forwarding all command arguments to (v) the mini-manipulation command generator that maps them to the physical configuration of the macro-manipulation subsystem hardware.
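The five planner stages listed above can be sketched as a small pipeline over timed steps. Everything here is a hypothetical simplification: the `Step` record, the duplicate-removal rule, the duration-based feasibility metric, the gap-closing rule and the `joint_map` lookup are assumptions chosen for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    start: float  # seconds
    end: float    # seconds


def optimize(steps: List[Step]) -> List[Step]:
    """(i) Drop exact duplicate steps and order the rest by start time."""
    unique, seen = [], set()
    for step in sorted(steps, key=lambda s: s.start):
        if step not in seen:
            seen.add(step)
            unique.append(step)
    return unique


def is_feasible(step: Step, max_duration: float) -> bool:
    """(ii) Hypothetical metric: positive duration within a limit."""
    return 0.0 < step.end - step.start <= max_duration


def resolve_gaps(steps: List[Step]) -> List[Step]:
    """(iii) Shift steps so each begins exactly when the previous ends."""
    resolved, clock = [], steps[0].start if steps else 0.0
    for step in steps:
        duration = step.end - step.start
        resolved.append(Step(step.name, clock, clock + duration))
        clock += duration
    return resolved


def verify_order(steps: List[Step]) -> bool:
    """(iv) Check that no step starts before its predecessor ends."""
    return all(a.end <= b.start for a, b in zip(steps, steps[1:]))


def plan(
    steps: List[Step], joint_map: Dict[str, str], max_duration: float
) -> List[Tuple[str, float, float]]:
    """Run stages (i) to (v) and return (hardware, start, end) commands."""
    steps = optimize(steps)
    if not all(is_feasible(s, max_duration) for s in steps):
        raise ValueError("infeasible step in sequence")
    steps = resolve_gaps(steps)
    if not verify_order(steps):
        raise ValueError("steps overlap after resolution")
    # (v) Map each step onto the hardware element that executes it.
    return [(joint_map[s.name], s.start, s.end) for s in steps]
```

The sketch runs the stages once in a fixed order; the text allows other orders and internal looping, which a fuller implementation would need to support.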
The process is similar for the generation of the command-stack sequence of the micro-manipulation subsystem 3820, with a few notable differences identified in the description below. As above, input data used to generate the respective mini-manipulation command stack sequence for the micro-manipulation subsystem includes raw and processed sensor feedback data 3890 from the EoA tooling and product process and modelling data 3880 from the interaction perception modeller 3740. The incoming mini-manipulation component candidates 3892 are provided to the micro mini-manipulation database 3821 with its respective integral descriptors, which organizes them by type and sequence 3825, before they are processed further by its dedicated mini-manipulation planner 3822; additional input to said database 3821 occurs by way of mini-manipulation candidate descriptor updates 3824 provided by a separate learning process described previously and again below. Said micro manipulation subsystem planner 3822 also receives input from the mini-manipulation progress tracker 3823, which is responsible for providing progress information on task execution variables and status, as well as observed deviations, to said planning system 3822. The progress tracker 3823 carries out its tracking process by comparing, in a comparator, inputs comprising the required baseline performance 3827 for each task-execution element with sensory feedback data 3890 (raw & processed) from the instrumented EoA-tooling as well as product and process perception and modelling data 3880; the comparator generates deviation data 3826 and process improvement data 3828, the latter comprising performance increases through descriptor variable and constant modifications developed by an integral learning system, and feeds both back to the planner system 3822.
The mini-manipulation planner system 3822 takes in all these input data streams 3826, 3828 and 3825, and performs a series of steps on this data in order to arrive at a set of sequential command stacks for task execution commands 3893 developed for the micro-manipulation subsystem, which are fed to the mini-manipulation parallel task execution planner 3830 for additional checking and combining before being converted into machine-readable mini-manipulation commands 3870 provided to each macro- and micro-manipulation subsystem separately for execution. As in the macro-manipulation subsystem planning process outlined for 3812 before, the mini-manipulation planner system 3822 generates said command sequence 3893 through a set of steps, which are not limited to those listed, need not occur in this sequence, and may involve internal looping, passing the data through: (i) an optimizer to remove any redundant or overlapping task-execution timelines, (ii) a feasibility evaluator to verify that each sub-task is completed according to a given set of metrics associated with each subtask, before proceeding to the next subtask, (iii) a resolver to ensure no gaps in execution-time or task-steps exist, and finally (iv) a combiner to verify proper task execution order and end-result, prior to forwarding all command arguments to (v) the mini-manipulation command generator that maps them to the physical configuration of the micro-manipulation subsystem hardware.
The AP-repository is akin to a relational database, where each AP described as AP1 through APn (3922, 3923, 3926, 3927) associated with a separate task, regardless of the level of abstraction by which the task is described, consists of a set of elemental APi-subblocks (APSB1 through APSBm; 39221→m, 3923a1→m, 3927a1→m) which can be combined and concatenated in order to satisfy task-performance criteria or metrics describing task-completion in terms of any individual or combination of such physical variables as time, energy, taste, color, consistency, etc. Hence any complexity of task can be described through a combination of any number of AP-alternatives (APAa through APAz; 3921, 3925), which could result in the successful completion of a specific task, well understanding that there is more than a single APAi that satisfies the baseline performance requirements of a task, however they may be described.
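The layered structure of the AP-repository (sub-blocks inside action primitives inside alternatives, grouped per task) can be sketched with plain data classes. The class names, the two metrics `time_s` and `energy_j`, and the `register` helper are hypothetical and chosen only to illustrate the nesting; the disclosure does not prescribe a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubBlock:
    """One elemental AP-subblock (APSB) with illustrative metrics."""
    name: str
    time_s: float
    energy_j: float


@dataclass
class ActionPrimitive:
    """An AP built by concatenating sub-blocks."""
    name: str
    sub_blocks: List[SubBlock] = field(default_factory=list)

    def total(self, metric: str) -> float:
        return sum(getattr(sb, metric) for sb in self.sub_blocks)


@dataclass
class Alternative:
    """An AP-alternative (APA): one sequence of APs that completes a task."""
    name: str
    primitives: List[ActionPrimitive] = field(default_factory=list)

    def total(self, metric: str) -> float:
        return sum(ap.total(metric) for ap in self.primitives)


def register(
    repository: Dict[str, List[Alternative]], task: str, alt: Alternative
) -> None:
    """Store an alternative under its task, allowing several per task."""
    repository.setdefault(task, []).append(alt)
```

Because metrics are summed from the sub-blocks upward, swapping one sub-block changes the totals of every alternative that uses it, which mirrors the relational character described in the text.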
The mini-manipulation AP components sequence selector 3910 hence uses a specific APA selection process 3913 to develop a number of potential APAa through APAz candidates from the AP repository 3920, by taking in the high-level task executor task-directive 3940, processing it to identify a sequence of necessary and sufficient sub-tasks in module 3911, and extracting a set of overall and subtask performance criteria and end-states for each sub-task in step 3912, before forwarding said set of potentially viable APs for evaluation. The evaluation process 3914 compares each APAi for overall performance and end-states along any of multiple stand-alone or combined metrics developed previously in 3912, including such metrics as time required, energy expended, workspace required, component reachability, potential collisions, etc. Only the one APAi that meets a pre-determined set of performance metrics is forwarded to the planner 3915, where the required movement profiles for the macro- and micro-manipulation subsystems are generated in one or more movement spaces, such as joint- or Cartesian-space. Said trajectories are then forwarded to the synchronization module 3916, where they are processed further by concatenating individual trajectories into a single overall movement profile, with each actuated movement synchronized in the overall timeline of execution as well as with its preceding and following movements, and combined further to allow for coordinated movements of multi-arm/-limb robotic appendage architectures. The final set of trajectories is then passed to a final step of mini-manipulation generation 3917, where said movements are transformed into machine-executable command-stack sequences that define the mini-manipulation sequences for a robotic system.
In the case of a physical or logical separation, command-stack sequences are generated for each subsystem separately, such as in this case for the macro-manipulation subsystem command-stack sequence 3891 and the micro-manipulation subsystem command-stack sequence 3892.
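The evaluation step that forwards a single APA to the planner can be sketched as a filter followed by a ranking. The function name `select_apa`, the metric names, the upper-limit form of the criteria and the tie-breaking by one ranking metric are assumptions for illustration; the disclosure states only that the forwarded APA meets a pre-determined set of performance metrics.

```python
from typing import Dict, Optional


def select_apa(
    candidates: Dict[str, Dict[str, float]],
    limits: Dict[str, float],
    rank_by: str = "time_s",
) -> Optional[str]:
    """Pick the one candidate alternative to forward to the planner.

    `candidates` maps an APA name to its predicted metrics and `limits`
    gives an upper bound per metric. A candidate lacking a limited
    metric is treated as failing that limit. Among the candidates that
    satisfy every limit, the one with the lowest `rank_by` value wins;
    None is returned when no candidate qualifies.
    """
    infinity = float("inf")
    viable = {
        name: metrics
        for name, metrics in candidates.items()
        if all(metrics.get(key, infinity) <= bound for key, bound in limits.items())
    }
    if not viable:
        return None
    return min(viable, key=lambda name: viable[name].get(rank_by, infinity))
```

Returning None rather than the least-bad candidate keeps an infeasible task from reaching the trajectory planner, leaving the decision to relax the criteria to a higher-level component.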
The hardware systems innate within each of the macro- and micro-manipulation subsystems are reflected at the macro-manipulation subsystem level through the instrumented articulated and controller-actuated articulated base 4010, and at the micro-manipulation level through the instrumented articulated and controller-actuated humanoid-like appendages 4020 subsystems. Both are connected to their perception and modelling systems 4030 and 4040, respectively.
In the case of the macro-manipulation subsystem 4010, a connection is made to the world perception and modelling subsystem 4030 through a dedicated sensor bus 4070, with the sensors associated with said subsystem responsible for sensing, modelling and identifying the world around the entire robot system and the latter itself, within said world. The raw and processed macro-manipulation subsystem sensor data is then forwarded over the same sensor bus 4070 to the macro-manipulation planning and execution module 4050, where a set of separate processors are responsible for executing task-commands received from the task mini-manipulation parallel task execution planner 3830, which in turn receives its task commands from the high-level mini-manipulation task/action parallel execution planner 3870 over a data and controller bus 4080, and controlling the macro-manipulation subsystem 4010 to complete said tasks based on the feedback it receives from the world perception and modelling module 4030, by sending commands over a dedicated controller bus 4060. Commands received through this controller bus 4060, are executed by each of the respective hardware modules within the articulated and instrumented base subsystem 4010, including the positioner system 4013, the repositioning single kinematic chain system 4012, to which is attached the central control system 4011.
The positioner system 4013 reacts to repositioning movement commands to its Cartesian XYZ positioner 4013a, where an integral and dedicated processor-based controller executes said commands by controlling actuators in a high-speed closed loop based on feedback data from its integral sensors, allowing for the repositioning of the entire robotic system to the required workspace location. The repositioning single kinematic chain system 4012 attached to the positioner system 4013, uses the same architecture described above, where each of their articulation subsystems 4012a and 4013a, receive separate commands to their respective dedicated processor-based controllers to command their respective actuators and ensure proper command-following through monitoring built-in integral sensors to ensure tracking fidelity. The central control system 4011 receives movement commands to the head articulation subsystem 4011a, where an integral and dedicated processor-based controller executes said commands by controlling actuators in a high-speed closed loop based on feedback data from its integral sensors.
The architecture is similar for the micro-manipulation subsystem. The micro-manipulation subsystem 4020 communicates with the interaction perception and modeller subsystem 4040, responsible for product and process perception and modelling, through a dedicated sensor bus 4071, with the sensors associated with said subsystem responsible for sensing, modelling and identifying the immediate vicinity at the end-of-arm (EoA), including the process of interaction and the state and progression of any product being handled or manipulated. The raw and processed micro-manipulation subsystem sensor data is then forwarded over its own sensor bus 4071 to the micro-manipulation planning and execution module 4051, where a set of separate processors is responsible for executing task-commands received from the mini-manipulation parallel task execution planner 3830, which in turn receives its task commands from the high-level mini-manipulation planner 3870 over a data and controller bus 4080. Module 4051 controls the micro-manipulation subsystem 4020 to complete said tasks based on the feedback it receives from the interaction perception and modelling module 4040, by sending commands over a dedicated controller bus 4061. Commands received through this controller bus 4061 are executed by each of the respective hardware modules within the instrumented EoA tooling subsystem 4020, including the one or more single kinematic chain systems 4024, to which is attached the wrist system 4025, to which in turn is attached the hand/end-effector system 4023, allowing for the handling of the thereto attached cooking system 4022. The single kinematic chain system 4024 contains such elements as one or more limbs/legs and/or arms subsystems 4024a, which receive commands to their respective elements, each with its respective dedicated processor-based controller commanding its respective actuators and ensuring proper command-following by monitoring built-in integral sensors for tracking fidelity.
The wrist system 4025 receives commands passed through the single kinematic chain system 4024 which are forwarded to its wrist articulation subsystem 4025a with its respective dedicated processor-based controllers commanding their respective actuators to ensure proper command-following through monitoring built-in integral sensors to ensure tracking fidelity. The hand system 4023 which is attached to the wrist system 4025, receives movement commands to its palm and fingers articulation subsystem 4023a with its respective dedicated processor-based controllers commanding their respective actuators to ensure proper command-following through monitoring built-in integral sensors to ensure tracking fidelity. The cooking system 4022, which encompasses specialized tooling and utensil subsystem 4022a (which may be completely passive and devoid of any sensors or actuators or contain simply sensing elements without any actuation elements), is responsible for executing commands addressed to it, through a similar dedicated processor-based controller executing a high-speed control-loop based on sensor-feedback, by sending motion commands to its integral actuators. Furthermore, a vessel subsystem 4022b representing containers and processing pots/pans, which may be instrumented through built-in dedicated sensors for various purposes, can also be controlled over a common bus spanning from the single kinematic chain system 4024, through the wrist system 4025 and onwards through the hand/effector system 4023, terminating (whether through a hardwired or a wireless connection type) in the operated object system 4022.
As tends to be the case with manipulation systems, particularly those requiring substantial mobility over larger workspaces while still needing appreciable endpoint motion accuracy, as is shown in this alternate embodiment in
For larger workspace applications, where the workspace exceeds that of a typical articulated robotic system, it is possible to increase the system's reach and operational boundaries by adding a positioner, typically capable of movements in free space, allowing movements in XYZ (three translational coordinates) space, as depicted by 4140, allowing for workspace repositioning 4143. Such a positioner could be a mobile wheeled or legged base, an aerial platform, or simply a gantry-style orthogonal XYZ positioner, capable of positioning an articulated body 4142. For applications where a humanoid-type configuration is one of the possible physical robot instantiations, said articulated body 4142 would describe a physical set of interlinked elements 4110, comprising upper extremities 4117 and lower extremities 4117a. Each of these interlinked elements within the macro-manipulation subsystem 4110 and 4140 would consist of instrumented articulated and controller-actuated sub-elements, including a head 4111 replete with a variety of environment perception and modelling sensing elements, connected to an instrumented articulated and controller-actuated shouldered torso 4112 and an instrumented articulated and controller-actuated waist 4113. The waist 4113 may also have attached to it mobility elements such as one or more legs 3535, or even articulated wheels, in order to allow the robotic system to operate in a much more expanded workspace. The shoulders in the torso can have attachment points for mini-manipulation subsystem elements in a kinematic chain described further below.
A micro-manipulation subsystem 4120, physically attached to the macro-manipulation subsystem 4110 and 4140, is used in applications where fine position and/or velocity trajectory-motions and high-fidelity control of interaction forces/torques are required that a macro-manipulation subsystem 4110, whether coupled to a positioner 4140 or not, would not be able to sense and/or control to the level required for a particular domain-application. The micro-manipulation subsystem 4120 consists of shoulder-attached linked appendages 4116, such as one (typically two) or more instrumented articulated and controller-actuated jointed arms 4114, to each of which would be attached an instrumented articulated and controller-actuated wrist 4118. It is possible to attach a variety of instrumented articulated and controller-actuated end-of-arm (EoA) tooling 4125 to said mounting interface(s). While a wrist 4118 itself can be an instrumented articulated and controller-actuated multi-degree-of-freedom (DoF; such as a typical three-DoF rotation configuration in roll/pitch/yaw) element, it is also the mounting platform to which one may choose to attach a highly dexterous instrumented articulated and controller-actuated multi-fingered hand including fingers with a palm 4122. Other options could also include a passive or actively controllable fixturing-interface 4123 to allow the grasping of particularly designed devices meant to mate to the same, in many cases allowing for a rigid mechanical and also electrical (data, power, etc.) interface between the robot and the device. The depicted concept need not be limited to the ability to attach fingered hands 4122 or fixturing devices 4123, but may extend to other devices 4124, through a process which may include rigidly anchoring them to the surface, or even to other devices.
The variety of end effectors 4126 that can form part of the micro-manipulation subsystem 4120 allows for high-fidelity interactions between the robotic system and the environment/world 4138 by way of a variety of devices 4130. The types of interactions depend on the domain application 4139. In the case of the domain application being that of a robotic kitchen with a robotic cooking system, the interactions would occur with such elements as cooking tools 4131 (whisks, knives, forks, spoons, etc.), vessels including pots and pans 4132 among many others, appliances 4133 such as toasters, electric beaters or electric knives, etc., cooking ingredients 4134 to be handled and dispensed (such as spices, etc.), and even potential live interactions with a user 4135 in case of required human-robot interactions called for in the recipe or due to other operational considerations.
In some embodiments, a multi-level robotic system for high speed and high fidelity manipulation operations is segmented into two physical and logical subsystems made up of instrumented, articulated and controller-actuated subsystems, namely a larger- and coarser-motion macro-manipulation system responsible for operations in larger unconstrained environment workspaces at a reduced endpoint accuracy, and a smaller- and finer-motion micro-manipulation system responsible for operations in a smaller workspace and while interacting with tooling and the environment at a higher endpoint motion accuracy, each carrying out mini-manipulation trajectory-following tasks based on mini-manipulation commands provided through a dual-level database specific to the macro- and micro-manipulation subsystems, and each supported by a dedicated and separate distributed processor and sensor architecture operating under an overall real-time operating system communicating with all subsystems over multiple bus interfaces specific to sensor-, command- and database-elements.
In some embodiments, the macro-manipulation subsystem contains dedicated sensors, actuators and processors interconnected over one or more dedicated interface buses, including a sensor suite used for perceiving the surrounding environment, which includes imaging and mapping the same and modeling elements within the environment and identifying said elements, performing macro-manipulation subsystem relevant motion planning in one or more of Joint- and/or Cartesian-space based on mini-manipulation commands provided by a dedicated macro-level mini-manipulation library, executing said commands through position or velocity or joint or force based control at the joint-actuator level, and providing sensory data back to the macro-manipulation control and perception subsystems, while also monitoring all processes to allow for learning algorithms to provide improvements to the mini-manipulation macro-level command-library to improve future performance based on criteria such as execution-time, energy-expended, collision-avoidance, singularity-avoidance and workspace-reachability.
In some embodiments, the micro-manipulation subsystem contains dedicated sensors, actuators and processors interconnected over one or more dedicated interface buses, including a sensor suite used for perceiving the immediate environment, which includes imaging and mapping the same and modeling elements within the environment and identifying said elements, particularly as it relates to interaction variables between the micro-manipulation system and associated tools during contact with the environment itself, performing micro-manipulation subsystem relevant motion planning in one or more of joint- and/or Cartesian-space based on mini-manipulation commands provided by a dedicated micro-level mini-manipulation library, executing said commands through position or velocity or joint or force based control at the joint-actuator level, and providing sensory data back to the micro-manipulation control and perception subsystems, while also monitoring all processes to allow for learning algorithms to provide improvements to the mini-manipulation micro-level command-library to improve future performance based on criteria such as execution-time, energy-expended, collision-avoidance, singularity-avoidance and workspace-reachability.
In some embodiments, a robotic cooking system configured into at least a dual-layer physical and logical macro-manipulation and micro-manipulation system capable of independent and coordinated task-motions by way of instrumented, articulated and controller-actuated subsystems, where the macro-manipulation system is used for coarse positioning of the entire robot assembly in free space using its own dedicated sensing-, positioning and motion execution subsystems, with a thereto attached one or more respective micro-manipulation subsystems for local sensing, fine-positioning and motion execution of the end effectors interacting with the environment, with both of the macro- and micro-manipulation system each configured with their own separate and dedicated buses for sensing, data-communication and control of associated actuators with their associated processors, with each of the macro- and micro-manipulation system receiving motion and behavior commands based on separate mini-manipulation commands from their dedicated planners, and with each planner receiving coordinated time- and process-progress dependent mini-manipulation commands from a central planner.
In some embodiments, the macro-manipulation system comprises a large workspace translational Cartesian-space positioner with an attached body system made up of a sensor-head connected to a shoulder and torso with one or more articulated multi-jointed manipulator arms, each with a thereto attached wrist capable of positioning one or more of the micro-manipulation subsystems via dedicated sensors and actuators interfaced through one or more dedicated controllers.
In some embodiments, the mini-manipulation system comprises at least one thereto attached palm and dexterous multi-fingered end-of-arm end effector for handling utensils and tools, as well as any vessel needed in any stage of dish preparation and cooking, via dedicated sensors and actuators interfaced through one or more controllers.
In some embodiments, a set of legs or wheels is attached to a waist attached to the macro-manipulation system for larger workspace movements.
In some embodiments, providing sensor feedback data to a world perception and modeling system responsible for perceiving the macro-manipulation subsystem free-space environment as well as the entire robotic system pose.
In some embodiments, providing said world perception feedback and model data over one or more dedicated interface buses to a dedicated macro-manipulation planning, execution and tracking module operating on one or more stand-alone and separate processors.
In some embodiments, macro-manipulation motion commands may be provided from a separate stand-alone task-decomposition and planning module.
In some embodiments, a planning system generating mini-manipulation command-stack sequence that is configured to perform planning actions for the entire robot system combining and coordinating separately planned mini-manipulations from the macro- and micro-planners, where the macro-manipulation planner plans and generates time- and process-progress dependent mini-manipulations for the macro-manipulation subsystem, and where the micro-manipulation planner plans and generates time- and process-progress dependent mini-manipulations for the micro-manipulation subsystem.
In some embodiments, each of the subsystem planners may include a task-progress tracking module, a mini-manipulation planning module, and a mini-manipulation database for macro-manipulation tasks.
In some embodiments, the task-progress tracking module may include a progress comparator module that tracks differences between commanded and actual task progress, model and environment data as well as product and process model data combined with all relevant sensor feedback data, and a learning module that creates and tracks variations that impact deviations in the descriptors of said mini-manipulations for potential future upgrades to the respective database.
In some embodiments, the mini-manipulation planning module generates mini-manipulation commands through a set of steps in which mini-manipulation commands from a database are evaluated for applicability, resolved for application to individual movable components, combined in space for a smooth motion profile, optimized for timing, and subsequently translated into a machine-readable set of mini-manipulation commands configured into a command-stack sequence.
In some embodiments, mini-manipulation commands for one or both the macro- or micro-manipulation subsystems may be generated, through a process of receiving a high-level task-execution command, and selecting from an action-primitive repository, a set of alternative action primitives that are evaluated and selected to achieve the commanded task based on a set of pre-determined criteria of importance to the application which describe the required entry boundary conditions as well as the minimum necessary exit boundary conditions defining a successful task-completion state at its start and its completion.
In some embodiments, the process of mini-manipulation command generation for one or both the macro- or micro-manipulation subsystems, comprises receiving a high-level task execution command, identifying individual subtasks which will be mapped to the applicable robotic subsystems, generation of individual performance criteria and measurable success end-state criteria for each of the above subtasks, selection of one or more in either a stand-alone or combination, of the most suitable action primitive candidates, evaluation of these action primitive alternatives for maximizing or minimizing such measures as execution-time, energy expended, robot reachability, collision avoidance or any other task-critical criteria, generation of either or both macro- and/or micro-manipulation subsystem trajectories in one or more motion spaces, including joint- and Cartesian-space, synchronizing said trajectories for path consecutiveness, path-segment smoothness, intra-segment time-stamp synchronization and coordination amongst multi-arm robot subsystems, and generating a machine-executable command-sequence stack for one or both the macro- and/or micro manipulation subsystems.
In some embodiments, the step of receiving mini-manipulation descriptor updates generated during the mini-manipulation progress tracking and performance learning process, may involve extracting relevant constants and variables related to specific mini-manipulations and their associated action primitives, assigning variances for each variable and constant for each affected action primitive, and providing the updates back to the action-primitive repository to allow each of the updates to be logged and implemented within said repository or database.
To do reliable and efficient manipulations in unstructured environments, objects and the robot are brought into standardised positions, then pre-calculated movement plans are executed on the robot. Planning for repeated parts of the manipulation is done off-line to allow for better quality and immediately available plans.
Functional action primitives are the minimal building blocks a MM consists of. The AP data structure is shown in illustration, although not all fields are necessary. They are simple actions of the robotic kitchen that can be reused in multiple minimanipulations and recipes. An AP consists of multiple AP Alternatives (APA) which represent alternative actions that reach the same final state. APAs are prioritized according to their preferability (which can include duration of execution, safety distance from objects, energy usage or others). An APA contains an ordered list of AP Sub Blocks (APSB), which are either robot trajectories, vision system commands or appliance commands. All APSBs in an APA are associated with a start timestamp, which implicitly also defines the order in which they are to be executed, possible breaks in between, and simultaneous execution when this timing overlaps as described later. A robot trajectory can either be a joint space trajectory, which defines the position for all robot joints in time and in the robot's joint space, or a Cartesian trajectory, which defines the position of an object or a robot end effector in the kitchen's Cartesian space.
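The hierarchy described above (an AP holding prioritized APAs, each holding time-stamped APSBs) can be sketched as a small set of data classes. This is only an illustrative sketch: the class and field names (`ActionPrimitive`, `Alternative`, `SubBlock`, `priority`, `payload`) are assumptions chosen for readability and do not reproduce the actual AP data structure, and the convention that a lower priority number is more preferable is likewise assumed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SubBlockKind(Enum):
    """The kinds of AP Sub Block named in the text."""
    JOINT_TRAJECTORY = "joint_trajectory"
    CARTESIAN_TRAJECTORY = "cartesian_trajectory"
    VISION_COMMAND = "vision_command"
    APPLIANCE_COMMAND = "appliance_command"


@dataclass
class SubBlock:
    """One APSB; times are seconds relative to the start of the AP."""
    kind: SubBlockKind
    start: float
    duration: float
    payload: object = None  # trajectory or command data (hypothetical field)

    @property
    def end(self) -> float:
        return self.start + self.duration


@dataclass
class Alternative:
    """One APA; lower priority number means more preferable (assumed)."""
    priority: int
    sub_blocks: List[SubBlock] = field(default_factory=list)

    def ordered(self) -> List[SubBlock]:
        # The start timestamp implicitly defines the execution order.
        return sorted(self.sub_blocks, key=lambda b: b.start)


@dataclass
class ActionPrimitive:
    """One AP; all alternatives reach the same final state."""
    name: str
    alternatives: List[Alternative] = field(default_factory=list)

    def by_preference(self) -> List[Alternative]:
        return sorted(self.alternatives, key=lambda a: a.priority)


# Example: a stirring AP with two alternatives.
stir = ActionPrimitive("stir", [
    Alternative(2, [SubBlock(SubBlockKind.JOINT_TRAJECTORY, 0.0, 3.0)]),
    Alternative(1, [SubBlock(SubBlockKind.CARTESIAN_TRAJECTORY, 1.0, 2.0),
                    SubBlock(SubBlockKind.VISION_COMMAND, 0.0, 0.5)]),
])
preferred = stir.by_preference()[0]
```

Sorting on the start timestamp rather than storing an explicit sequence number mirrors the statement that the timestamp implicitly defines order, breaks and simultaneous execution.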
An example minimanipulation that includes five FAPs with one to three APSBs is displayed in
To select the preferred FAPA, all available FAPAs are prepared and tested for executability according to the following process in the prioritized order, and the first one that is executable will be executed.
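The selection rule above amounts to a first-match search over the alternatives in prioritized order. A minimal sketch follows, assuming the alternatives are already sorted by preference and that executability is decided by a caller-supplied predicate; both `select_alternative` and `is_executable` are hypothetical names.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def select_alternative(alternatives: Iterable[T],
                       is_executable: Callable[[T], bool]) -> Optional[T]:
    """Return the first alternative, in prioritized order, that is executable.

    `alternatives` is assumed to be sorted from most to least preferable.
    Returns None when no alternative passes the executability test.
    """
    for alternative in alternatives:
        if is_executable(alternative):
            return alternative
    return None


# Example with placeholder alternatives: the most preferred one is blocked.
candidates = ["fast", "safe", "fallback"]
executable_now = {"safe", "fallback"}
chosen = select_alternative(candidates, lambda a: a in executable_now)
```

Because the loop stops at the first success, lower-priority alternatives are never prepared or tested unless every more preferable one has failed.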
FAPs are meant to be reusable in different contexts and they possess parameters to make them customisable. Values for these parameters are passed down from the minimanipulation that uses the AP. In preparation for the AP execution process, all parameters are evaluated and a working copy of the AP is modified according to them. One of these parameters is the speed factor. It is a factor that scales the trajectory speed, and is applied by multiplying all timestamps (which are relative to the start of the AP) by it.
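As an illustration of the speed factor, the sketch below multiplies every timestamp in a working copy of an AP by the factor and leaves the stored AP untouched. The dictionary layout (`sub_blocks`, `start`, `waypoints`) is an assumption made for the example. Note that under this convention a factor greater than 1 stretches the timeline, so the motion takes longer.

```python
import copy


def apply_speed_factor(ap: dict, factor: float) -> dict:
    """Return a working copy of `ap` with all timestamps multiplied by `factor`.

    Timestamps are relative to the start of the AP. The layout of `ap`
    (a list of sub blocks, each with a start time and time-stamped
    waypoints) is a hypothetical one chosen for this sketch.
    """
    if factor <= 0:
        raise ValueError("speed factor must be positive")
    working = copy.deepcopy(ap)  # never modify the stored AP itself
    for block in working["sub_blocks"]:
        block["start"] *= factor
        block["waypoints"] = [(t * factor, pose) for t, pose in block["waypoints"]]
    return working


stored_ap = {"sub_blocks": [{"start": 1.0, "waypoints": [(1.0, "a"), (2.0, "b")]}]}
scaled_ap = apply_speed_factor(stored_ap, 2.0)
```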
First, by evaluating the start timestamps and durations of all of the FAP's FAPSBs, overlaps in time are detected, which, in the case of FAPSBs with Cartesian trajectories, mean that multiple arms are supposed to perform a specified action at the same time. Arms that have a specified trajectory are called active, the others passive.
There are two simple possibilities for the passive arm(s) to behave: They can either keep their joints in the same joint space configuration or keep their respective end effectors in their respective Cartesian Poses (position and orientations). If the robot has no joints that are shared amongst the kinematic chains between the base and any end effector, these two possibilities behave the same.
Other than that, the passive arms can be allowed to do any movement as long as it is collision free. Which behavior is executed is specified by the FAP data. An example action that would specify the passive end effector to keep its position is stirring (active arm) while holding the pot (passive arm). By using FAPSB timing information and a simple logical behavior switch for the passive arm(s) as described above, creating FAPs is simplified and their maintainability and reusability are increased.
If there are multiple active arms at any time in the FAP, the trajectories of those arms need to be planned together. In addition, motion plans must be calculated for the transitions between intervals with a different number of active arms.
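The overlap detection and the active/passive distinction described above can be sketched as an interval sweep over the sub block timings. The tuple layout `(arm, start, duration)` and the function name are assumptions for illustration only; the passive arms in each interval are simply the arms without a specified trajectory there.

```python
def active_arms_over_time(blocks, all_arms):
    """Split the AP timeline at every sub block boundary.

    `blocks` is a list of (arm, start, duration) tuples. Returns a list of
    (t0, t1, active, passive) tuples, where `active` holds the arms with a
    specified trajectory during [t0, t1] and `passive` holds the rest.
    """
    boundaries = sorted({s for _, s, _ in blocks} | {s + d for _, s, d in blocks})
    intervals = []
    for t0, t1 in zip(boundaries, boundaries[1:]):
        active = frozenset(arm for arm, s, d in blocks if s < t1 and s + d > t0)
        intervals.append((t0, t1, active, frozenset(all_arms) - active))
    return intervals


# Stirring with the left arm from 0 s to 4 s, the right arm acts from 2 s to 3 s.
timeline = active_arms_over_time(
    [("left", 0.0, 4.0), ("right", 2.0, 1.0)],
    all_arms={"left", "right"},
)
# Intervals with more than one active arm need a joint plan for those arms.
joint_intervals = [(t0, t1) for t0, t1, active, _ in timeline if len(active) > 1]
```

Each change in the size of the active set between consecutive intervals marks a transition for which a connecting motion plan would be needed.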
If the FAP has specified a trajectory for an object that is attached to the robot, the object's trajectory is converted to an end effector trajectory (which is part of the kinematic chain) by using either a saved transformation from the used grasp or the last known geometric relation between the object and the robot end effector. This is to plan a movement trajectory for the robot by using an inverse kinematics solver for chain robots, which requires that a trajectory defines the movement of a part of the kinematic chain.
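The conversion from an object trajectory to an end effector trajectory is a fixed rigid transformation applied to every pose. The sketch below uses planar poses `(x, y, theta)` to stay short; a real system would use full 3D poses. The function names, and the convention that the saved grasp is the pose of the object expressed in the end effector frame, are assumptions made for the example.

```python
import math


def compose(a, b):
    """Pose of frame C in frame A, given B in A (`a`) and C in B (`b`)."""
    ax, ay, ath = a
    bx, by, bth = b
    return (ax + math.cos(ath) * bx - math.sin(ath) * by,
            ay + math.sin(ath) * bx + math.cos(ath) * by,
            ath + bth)


def inverse(a):
    """Inverse of a planar rigid transform (x, y, theta)."""
    ax, ay, ath = a
    return (-(math.cos(ath) * ax + math.sin(ath) * ay),
            math.sin(ath) * ax - math.cos(ath) * ay,
            -ath)


def object_to_end_effector(object_trajectory, grasp):
    """Convert world-frame object poses into world-frame end effector poses.

    `grasp` is the object pose in the end effector frame (assumed to come
    from the saved grasp or the last known object/end effector relation).
    """
    effector_from_object = inverse(grasp)
    return [compose(pose, effector_from_object) for pose in object_trajectory]


grasp = (0.1, 0.0, 0.0)  # object held 0.1 m in front of the end effector
object_path = [(1.0, 0.5, 0.0), (1.0, 0.5, math.pi / 2)]
effector_path = object_to_end_effector(object_path, grasp)
```

Composing each resulting end effector pose with the grasp again recovers the original object pose, which is a convenient sanity check on the frame conventions.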
To simplify models of real world objects, all models are rigid, while real objects deform when in contact with each other due to the contact forces. For other reasons too, real objects are never represented perfectly by models. Therefore, during Motion Planning or Cartesian Planning, if contacts between multiple objects are supposed to happen, for example when grasping an object, the planning algorithm must ignore collisions between the specified objects.
The information when to allow and disallow contacts (which for the planner are collisions) is stored in the FAPSB that contains the respective trajectories and must be forwarded to the planner before and after execution of those trajectories.
If an FAPSB contains information about a change in grasp status, this change is implemented by logically attaching an object to, or detaching it from, the robot's kinematic chain after the APSB has been executed. This change will cause Motion or Cartesian Planners to consider collisions by the attached objects and allow the system to update the object's position based on the movement of the kinematic chain.
An FAPSB can contain information about transitions in dynamic object attributes. This can for example be about a container containing a certain amount of water after executing an APSB that fills it using the tap, or a pot having a lid attached after placing the lid on the pot. This information is used in future actions, for example for collision detection.
Planning is the process of turning a goal or requirement for a movement into a series of joint space configurations that fulfil this goal. Before any planning starts, the internal representations of the world and the robot must be updated to correspond to their real counterparts.
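The bookkeeping around allowed contacts and grasp changes can be sketched with a toy world model. `PlanningScene`, `execute_sub_block` and the dictionary keys `allow_contacts` and `grasp_change` are hypothetical names for this illustration and are not taken from any particular planning library; the sketch assumes contacts are allowed just before the trajectory runs, disallowed again afterwards, and that the grasp change is applied after execution.

```python
class PlanningScene:
    """Toy stand-in for the planner's world model (hypothetical)."""

    def __init__(self):
        self.allowed_contacts = set()  # unordered pairs the planner must ignore
        self.attached = {}             # object name -> end effector holding it

    def allow_contact(self, a, b):
        self.allowed_contacts.add(frozenset((a, b)))

    def disallow_contact(self, a, b):
        self.allowed_contacts.discard(frozenset((a, b)))

    def is_collision(self, a, b):
        """A touching pair is a collision unless contact is explicitly allowed."""
        return frozenset((a, b)) not in self.allowed_contacts


def execute_sub_block(scene, block, run_trajectory):
    """Run one sub block with its contact and grasp bookkeeping."""
    contacts = block.get("allow_contacts", [])
    for a, b in contacts:
        scene.allow_contact(a, b)        # forwarded to the planner before execution
    run_trajectory(block)
    for a, b in contacts:
        scene.disallow_contact(a, b)     # and withdrawn after execution
    grasp_change = block.get("grasp_change")
    if grasp_change is not None:
        action, obj, effector = grasp_change
        if action == "attach":
            scene.attached[obj] = effector   # object now moves with the chain
        else:
            scene.attached.pop(obj, None)


scene = PlanningScene()
observed = []
grasp_block = {"allow_contacts": [("gripper", "pot")],
               "grasp_change": ("attach", "pot", "right_hand")}
execute_sub_block(scene, grasp_block,
                  lambda block: observed.append(scene.is_collision("gripper", "pot")))
```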
The process of finding a joint space path that moves the robot from a start configuration to an end configuration without collisions is called Motion Planning. Start and end configuration always contain all joints of the robot. It is done using an algorithm that samples the joint space looking for collision free configurations, then generating a graph between the samples and connecting it with start and end configurations. Any path between start and end on this graph is a valid motion plan that fulfils the requirements.
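The sample, connect and search procedure described above corresponds to a probabilistic-roadmap style planner. The following sketch runs it in a two-dimensional unit-square joint space; the function name, the connection radius, the sample count and the straight-line edge check are all assumptions chosen to keep the example small, and a real planner would work in the full joint space of the robot.

```python
import math
import random
from collections import deque


def roadmap_plan(start, goal, is_free, n_samples=200, radius=0.3, seed=0):
    """Return a list of collision-free configurations from start to goal, or None."""
    rng = random.Random(seed)
    nodes = [start, goal]
    for _ in range(n_samples):            # sample the joint space
        q = (rng.random(), rng.random())
        if is_free(q):                    # keep only collision-free samples
            nodes.append(q)

    def edge_is_free(a, b, steps=10):
        # Check evenly spaced configurations along the straight segment.
        return all(is_free((a[0] + (b[0] - a[0]) * i / steps,
                            a[1] + (b[1] - a[1]) * i / steps))
                   for i in range(steps + 1))

    neighbours = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):           # build the graph between samples
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) <= radius and edge_is_free(nodes[i], nodes[j]):
                neighbours[i].append(j)
                neighbours[j].append(i)

    parent = {0: None}                    # breadth-first search, start is node 0
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == 1:                        # goal is node 1
            path = []
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
        for j in neighbours[i]:
            if j not in parent:
                parent[j] = i
                queue.append(j)
    return None


def outside_disk(q):
    """Free space is everything outside a disk-shaped obstacle (example only)."""
    return math.dist(q, (0.5, 0.5)) > 0.2


path = roadmap_plan((0.1, 0.1), (0.9, 0.9), outside_disk)
```

Any path returned this way is valid but carries no timing and is generally not the shortest; breadth-first search only minimises the number of roadmap edges.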
Motion Planning can be done while considering various constraints. These constraints can include an end effector orientation constraint in order to hold a container upright so as not to drop its contents. Constraints are implemented in the planner by randomly varying its sample configurations until the constraints are met.
Motion Planning is done without any timing information and its plans are made executable in the postprocessing step.
When a Cartesian movement is to be executed by the robot, Cartesian Planning generates a joint space trajectory that fulfils this movement. The input of Cartesian Planning is a list of Cartesian trajectories, each for a different robot end effector.
Before a set of trajectories is planned for, a feasibility check is done. This check will calculate reachability of the end effector Poses and collisions only by the end effector and its statically connected arm links. This check is a necessary criterion for executability and is a fast way to detect trajectories that cannot be executed, in order for the system to skip the current AP Alternative and continue with the next one.
It is implemented by first trying to find an IK solution for all given trajectory poses without considering collisions. If at least one pose has no IK solution, the trajectory is not executable. If all poses have corresponding IK solutions, the check continues: the trajectory then passes the check if and only if, for all trajectory poses, the end effector and its statically connected arm links are not in collision.
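The two-stage feasibility check can be sketched as follows. The IK solver and the collision test are passed in as callables because the text does not specify them; `solve_ik` and `end_effector_collides` are hypothetical names, and the toy arm and table in the example exist only to exercise the function.

```python
import math


def is_feasible(poses, solve_ik, end_effector_collides):
    """Fast necessary check for executability of a Cartesian trajectory.

    Stage 1: every pose must have an IK solution (collisions ignored).
    Stage 2: for every pose, the end effector and its statically connected
    links must be collision free.
    `solve_ik(pose)` returns a joint configuration or None;
    `end_effector_collides(pose, configuration)` returns a bool.
    """
    solutions = []
    for pose in poses:
        configuration = solve_ik(pose)
        if configuration is None:
            return False          # unreachable pose: skip this alternative
        solutions.append(configuration)
    return not any(end_effector_collides(pose, configuration)
                   for pose, configuration in zip(poses, solutions))


# Toy example: a planar arm with 2.0 m reach above a table at y = 0.
def toy_ik(pose):
    x, y = pose
    return (x, y) if math.hypot(x, y) <= 2.0 else None


def toy_collides(pose, configuration):
    return pose[1] < 0.0          # anything below the table surface collides


reachable_and_clear = is_feasible([(1.0, 0.5), (1.2, 0.5)], toy_ik, toy_collides)
out_of_reach = is_feasible([(1.0, 0.5), (3.0, 0.5)], toy_ik, toy_collides)
below_table = is_feasible([(1.0, 0.5), (1.0, -0.1)], toy_ik, toy_collides)
```

Running the cheaper reachability stage over the whole trajectory before any collision test means an unreachable pose is rejected without paying for collision checking.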
When planning for a Cartesian movement, typically only a subset of the robot's degrees of freedom are used. The redundant degrees of freedom can therefore be used to optimize the movement. As a Cartesian movement is typically executed in the context of a recipe with other movements, all Cartesian plans are done in a similar configuration. Because the movements are recorded by humans for a humanoid robot to perform actions that are typically performed by humans, the preferred configuration which all movements should be similar to is defined as a humanlike standby configuration as depicted in
The Cartesian planner finds similar IK solutions for the whole trajectory. This similarity in joint space ensures minimal movements between the trajectory poses. Similarity is achieved by using an iteratively optimizing IK solver, starting the search for each pose with the solution of the previous one. The first solution is based on the preferred humanlike configuration or another configuration specified by the system that coordinates the planner (see next section). This way, the IK solver will follow the same local optimum while planning for the whole trajectory. Each IK solution must be collision free and must be similar to the last solution. If the IK solver cannot find a similar solution, the search will be restarted with a different configuration for the first pose. Candidates for this configuration are different saved preferred humanlike configurations or random variations of them, or if necessary completely random configurations. For some FAPSBs, it may also be suitable to use a configuration specific to that FAPSB.
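The seeding and restart logic can be illustrated on a planar two-link arm. This sketch deliberately swaps the iteratively optimizing IK solver for a closed-form solver that returns both elbow branches and picks the one nearest the previous solution; it also omits collision checking and angle wrap-around. The link lengths, the `max_jump` threshold and all function names are assumptions.

```python
import math

L1 = L2 = 1.0  # link lengths of the toy arm (assumed)


def ik_candidates(x, y):
    """Both closed-form IK branches (elbow either way) for a planar 2-link arm."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    if abs(c2) > 1.0:
        return []                                  # pose out of reach
    candidates = []
    for sign in (1.0, -1.0):
        q2 = sign * math.acos(c2)
        q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
        candidates.append((q1, q2))
    return candidates


def joint_distance(a, b):
    return max(abs(ai - bi) for ai, bi in zip(a, b))


def plan_cartesian(poses, seeds, max_jump=0.5):
    """Find mutually similar IK solutions for all poses, restarting on failure.

    Each pose is solved starting from the previous solution; the first pose
    starts from a seed configuration. If consecutive solutions differ by more
    than `max_jump` radians, planning restarts with the next seed.
    """
    for seed in seeds:
        previous, plan = seed, []
        for index, (x, y) in enumerate(poses):
            candidates = ik_candidates(x, y)
            if not candidates:
                return None                        # trajectory not executable
            best = min(candidates, key=lambda q: joint_distance(q, previous))
            if index > 0 and joint_distance(best, previous) > max_jump:
                break                              # not similar: try next seed
            plan.append(best)
            previous = best
        else:
            return plan
    return None


line = [(1.0 + 0.1 * i, 0.5) for i in range(6)]    # straight Cartesian line
preferred_seed = (0.0, 1.0)                        # stand-in for the standby pose
joint_plan = plan_cartesian(line, [preferred_seed])
```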
In the case of input trajectories that cover only a part of the robot, a static configuration is specified for the passive arms and kept during the planning process, or the other arm is put into any position that does not collide as described above. The IK solver then only considers the active parts of the robot, but collision checking still needs to be done for the whole robot.
Cartesian Planning is done in the same resolution and with the same time information as the input trajectory and its plans may not be immediately executable because the movements of the input trajectory may be too fast for the robot to follow. Only the postprocessing step (see below) makes it executable.
FAPSBs typically specify only certain parts of the recipe's robot movements, via Cartesian trajectories or Joint space trajectories. The intermediate movements that connect the ones specified by APSBs are implicit and done by Motion Planning. This is to increase flexibility and reusability.
In order to provide smooth transitions between the different kinds of trajectories, the Cartesian Planning is always done first, trying to find a plan with joint space solutions similar to the current state of the robot. Then, Motion Planning is done to move the robot from its current state to the start of the Cartesian trajectory.
As Cartesian trajectories are processed first with resulting joint space trajectories, Motion Planning is always done to connect these joint space trajectories, also generating a joint space trajectory. It can only be skipped if the joint space trajectories from the APs are already connected (the end configuration of the first FAP's joint space trajectory matches the beginning of the second AP's joint space trajectory).
As all plans are done without considering the dynamic properties of the robot, they may be too fast for the robot to execute. Therefore, before execution they are postprocessed, and parts of them may be slowed down using a suitable time parametrization algorithm.
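One simple form of time parametrization is to stretch only those segments whose implied joint velocity exceeds a limit. The sketch below does exactly that and nothing more: it considers per-joint velocity limits only (no acceleration or jerk limits), and is meant as an illustration of the idea rather than the algorithm actually used. The waypoint layout and the function name `retime` are assumptions.

```python
def retime(waypoints, max_velocity):
    """Slow down segments that would exceed per-joint velocity limits.

    `waypoints` is a list of (time, joint_positions) pairs and `max_velocity`
    holds one limit per joint. Segments that are already slow enough keep
    their original duration; later timestamps shift accordingly.
    """
    if not waypoints:
        return []
    retimed = [waypoints[0]]
    for (t0, q0), (t1, q1) in zip(waypoints, waypoints[1:]):
        # Shortest duration in which every joint stays within its limit.
        needed = max(abs(b - a) / v for a, b, v in zip(q0, q1, max_velocity))
        duration = max(t1 - t0, needed)
        retimed.append((retimed[-1][0] + duration, q1))
    return retimed


plan = [(0.0, (0.0, 0.0)), (0.1, (1.0, 0.0)), (1.1, (1.0, 0.5))]
executable_plan = retime(plan, max_velocity=(2.0, 2.0))
# The first segment (1.0 rad in 0.1 s) is stretched to 0.5 s;
# the second segment (0.5 rad in 1.0 s) is already within limits.
```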
In order to speed up future planning, the resulting trajectory is saved together with environmental information and performance metrics. This includes the APSB identifier and version, all object positions and model information and model and state information for the kitchen and robot. Performance metrics can include the trajectory length (in joint space or Cartesian space), energy usage, or humanlikeness. To use a cached trajectory for an APSB, either all of the saved environment needs to be identical with the current environment, or the saved trajectory needs to be checked for collisions in the current environment.
To check the whole saved trajectory for collisions with the environment in an efficient manner, its bounding volume is calculated and saved together with it. This allows testing saved trajectories for validity in real time. A trajectory trail is displayed in
Once a set of cached plans exists, for every new APSB that needs to be planned, the saved trajectories are checked in order of decreasing performance rating, so that the best feasible one is used.
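The cache lookup described above can be sketched as follows. This is a minimal illustration, assuming that the bounding volume is an axis-aligned box, that obstacles are also given as boxes, and that the saved environment is summarized by a single key. The class, field, and function names are hypothetical. The box test is conservative: a trajectory whose box overlaps an obstacle is rejected here, although a finer check might still show it to be collision-free.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # (min corner, max corner)


@dataclass
class CachedTrajectory:
    apsb_id: str
    version: int
    rating: float          # performance rating, higher is better
    bounding_volume: Box   # box enclosing the whole saved trajectory
    environment_key: str   # summary of the saved environment state


def boxes_overlap(a: Box, b: Box) -> bool:
    """Axis-aligned boxes overlap only if they overlap on every axis."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))


def select_cached(
    cache: Sequence[CachedTrajectory],
    apsb_id: str,
    version: int,
    environment_key: str,
    obstacles: Sequence[Box],
) -> Optional[CachedTrajectory]:
    """Return the best-rated cached trajectory that is usable right now."""
    candidates: List[CachedTrajectory] = [
        t for t in cache if t.apsb_id == apsb_id and t.version == version
    ]
    for traj in sorted(candidates, key=lambda t: t.rating, reverse=True):
        # Case 1: the saved environment is identical to the current one.
        if traj.environment_key == environment_key:
            return traj
        # Case 2: the trajectory's bounding volume is free of obstacles.
        if not any(boxes_overlap(traj.bounding_volume, o) for o in obstacles):
            return traj
    return None  # nothing reusable; fall back to live planning
```

Because the candidates are visited in decreasing rating order, the first one that passes either check is the best feasible one.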
As an alternative to alternating between planning and executing, plans can be made as soon as the future environment is known sufficiently well. The system simulates the movement of the robot and objects and the change in the environment caused by a certain APSB. This is possible because all APSBs have predictable outcomes. The planner can then make a plan in this future environment for the APSB that follows. The plan must be invalidated if the environment changes in an unexpected way; otherwise, it can be executed. Using this method, it is possible to execute a sequence of APSBs (and, by extension, APs) without waiting for planning.
Being able to reuse cached trajectories has numerous advantages. As planning can be done ahead of time, more computationally expensive algorithms can be used and enough time can be allowed for calculation, not only to reliably find solutions at all, but also to find optimal solutions according to a performance metric as mentioned above. Also, trajectories can be manually selected according to performance metrics or criteria that are hard to formalize (for example, aesthetics), and it is easier to confirm their reliability when no planning is involved.
Many robotic manipulation actions can be separated into (1) moving objects and the robot into a defined configuration, and (2) performing manipulations using these objects and this configuration. It is therefore worthwhile to also split the planning task into a part that can be solved with optimized pre-planning and a part that is solved with live planning (which is done just before execution).
To be able to use optimized pre-planned trajectories, the direct environment where the manipulation is to be executed must be in a defined state (the standard direct environment). This means that all objects that can possibly collide with the robot, and all objects that are manipulated or are needed for the manipulation, are in positions that match the positions recorded together with the pre-planned trajectory. The positions may be specified with respect to the robot or with respect to the environment. The rest of the environment may be non-standardized. A cached trajectory is always associated with a direct environmental state and can only be reused once the real state matches the saved state.
So before executing a pre-planned manipulation, first all objects that are manipulated or are relevant for the manipulation must be moved to the position and orientation as defined by the pre-planned manipulation. This part of the handling possibly requires live planning to move objects from the non-standard environment to the standard environment.
An example of this in the kitchen context is grasping and moving ingredients and tools from the storage area (cluttered, unpredictable, frequently changing) to defined poses on the worktop surface, then moving the robot to the defined configuration, and then executing a trajectory that grasps and mixes the ingredients using the tools.
With this method, optimal Cartesian and Motion Plans for standard environments are generated off-line in a dedicated, computationally intensive process and then transferred to the robot for use. The data modeling is implemented either by retaining the regular FAP structure and using plan caching, or by replacing some Cartesian trajectories in the FAPs with pre-planned joint space trajectories. These pre-planned trajectories can include joint space trajectories that connect the trajectories for individual APSBs inside the APs, thereby replacing even some parts of live motion planning during manipulations in the standard environment. In this second case, there are two sets of FAPs: one set that has “source” Cartesian trajectories suitable for planning, and one with optimized joint space trajectories.
Tolerances for the differences between real direct environment and direct environment of the saved optimised trajectory, which can be determined using experimental methods, are saved per trajectory or per FAPSB.
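A check of the real direct environment against the saved one might look like the sketch below. It assumes that each relevant object is described only by a 3D position in a common frame and that a single distance tolerance applies to the whole trajectory; orientation is ignored for brevity. All names are hypothetical.

```python
import math
from typing import Dict, Tuple

Position = Tuple[float, float, float]


def matches_standard_environment(
    saved: Dict[str, Position],
    observed: Dict[str, Position],
    tolerance_m: float,
) -> bool:
    """Return True if every saved object is observed within the tolerance.

    saved: object positions recorded with the pre-planned trajectory.
    observed: object positions currently measured in the direct environment.
    tolerance_m: assumed per-trajectory position tolerance, in meters.
    """
    for name, saved_position in saved.items():
        if name not in observed:
            return False  # a required object is missing
        if math.dist(saved_position, observed[name]) > tolerance_m:
            return False  # the object is too far from its recorded position
    return True
```

Only when this check passes would the pre-planned trajectory be reused; otherwise the objects are first moved into place with live planning.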
Using pre-planned manipulations can be extended to include positioning the robot, especially along linear or axes, to be able to execute pre-planned manipulations on a variety of positions. Another application is placing a humanoid robot in a defined relation to other objects (for example a window in a residential house) and then starting a pre-planned manipulation trajectory (for example cleaning the window).
Furthermore, we propose that the time management scheme must not only reduce the average sum of waiting times between the executions of movements but also reduce the variability of the total waiting time. This is particularly important for cooking processes, where the recipes set the required timing for the operations. Thus, we introduce a cost function given by the probability of cooking failure, namely $P(\tau > \tau_{\mathrm{failure}})$, where $\tau$ is the total time of operation execution. Given that the probability distribution $p(\tau)$ is determined by its average $\langle\tau\rangle$ and its variance $\sigma_\tau^2$, and neglecting higher-order moments, the failure probability can be written as $P(\tau > \tau_{\mathrm{failure}}) \approx f\big((\langle\tau\rangle - \tau_{\mathrm{failure}})/\sigma_\tau\big)$, where $f$ is some monotonically increasing function (which is, for example, just the error function $f(x) = \operatorname{erf}(x)$, up to an offset and scaling, if the higher-order moments indeed vanish and $p(\tau)$ is a normal distribution). Therefore, for the time management scheme it is beneficial to reduce both the average time and its variance when the average is below the failure time. Since the total time is the sum of consecutive and independently obtained waiting and execution times, the total average and variance are the sums of the individual averages and variances. Minimizing the time average and variance in each individual scheme improves performance by reducing the probability of cooking failure.
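For the special case of a normally distributed total time, the cost function can be evaluated directly. The sketch below assumes independent stages, each summarized by a mean and a variance, and uses the standard normal tail formula $P(\tau > \tau_{\mathrm{failure}}) = \tfrac{1}{2}\left[1 + \operatorname{erf}\big((\langle\tau\rangle - \tau_{\mathrm{failure}})/(\sqrt{2}\,\sigma_\tau)\big)\right]$. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def total_mean_and_variance(
    stages: Iterable[Tuple[float, float]],
) -> Tuple[float, float]:
    """Sum the (mean, variance) pairs of independent consecutive stages."""
    stages = list(stages)
    mean = sum(m for m, _ in stages)
    variance = sum(v for _, v in stages)
    return mean, variance


def failure_probability(mean: float, variance: float, t_failure: float) -> float:
    """P(total time > t_failure), assuming a normal distribution."""
    if variance == 0.0:
        # A fully deterministic total time either meets the deadline or not.
        return 1.0 if mean > t_failure else 0.0
    z = (mean - t_failure) / math.sqrt(2.0 * variance)
    return 0.5 * (1.0 + math.erf(z))
```

With the mean held below the failure time, lowering the variance lowers the failure probability, which is the behavior the time management scheme relies on. A pre-planned stage with zero variance adds to the mean but not to the spread.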
To reduce the uncertainty, and thus the variance, of the planning times (and therefore the variance of the waiting times), we propose to use data sets of pre-planned and stored sequences that perform typical FAPs. These sequences are optimized beforehand with heavy computational power for the best time performance and any other relevant criteria. Essentially, their uncertainty is reduced to zero, and thus they make zero contribution to the total time variance. So if the time management scheme finds a solution that brings the system to a pre-defined state from which the sequence of actions to reach the target state is known, and does so before the cooking failure time, the probability of cooking failure is reduced to zero, since the remaining sequence has zero estimated time variance. In general, if the pre-defined sequence is just a part of a total AP, it still does not contribute to the total time variance and has a beneficial effect on the uncertainty of the total execution time.
To reduce the complexity and thus the average of the planning times (and therefore the average of the waiting times) we propose to use the data sets of pre-planned and stored configurations for which the number of constraints is minimal. As shown in
The logic of this scheme is as follows: once the timeout for finding a solution is reached (typically set by the execution time of the previous FAPSB) and an executable trajectory has not been found, we perform a transitional FAPSB from the incomplete FAPA, which does not lead to the target state but rather leads to a new IK problem with reduced complexity. In effect, we trade an unknown waiting time with a long-tailed distribution and a high average for a fixed time spent on the additional FAPSB plus an unknown waiting time for the new IK problem with a lower average. The time course of the decisions made in this scheme is shown in
Of the internal and external constraints, the internal constraints are due to the limitations of the robotic arm movements, and their role increases when the joints are in complex positions. Thus, the typical constraint removal APSB is the retraction of the robotic arm to one of the pre-set joint configurations. The external constraints are due to the objects located in the direct environment. Here, the typical constraint removal APSB is the relocation of the object to one of the pre-set locations. The separation of internal and external constraints is used for the selection of the APA from the executable complete and incomplete sets.
To combine the complexity reduction with the uncertainty reduction, and thereby decrease both the average and the variance of the total execution time, the following structure of the pre-planned and stored data sets is proposed. The sequences of IK solutions are stored for the list of manipulations, with each type of object, that are executable in the dedicated area. In this area there are no external objects, and the robotic arm is in one of the pre-defined standard positions. This ensures the minimal number of constraints. So if the direct solution for the FAP is not readily obtained, we find and use the solution for the FAPA that relocates the object under consideration to the dedicated area, where the manipulation is performed. This results in a massive constraint removal and allows the use of pre-computed sequences that minimize the uncertainty of the execution times. After the manipulation is performed in the dedicated area, the object is returned to the working area to complete the FAP.
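The fallback to the dedicated area can be sketched as a small decision function. This is an illustration only: a plan is represented as a list of step names, the direct planner is any callable that returns a plan or `None` within a time budget, and the relocation and return plans are assumed to be already available. All names are hypothetical.

```python
from typing import Callable, List, Optional

Plan = List[str]  # placeholder representation: a plan is a list of step names


def plan_with_fallback(
    plan_direct: Callable[[float], Optional[Plan]],
    timeout_s: float,
    relocate_to_dedicated_area: Plan,
    precomputed_manipulation: Plan,
    return_to_working_area: Plan,
) -> Plan:
    """Use the direct plan if found in time, else go via the dedicated area."""
    direct = plan_direct(timeout_s)
    if direct is not None:
        return direct
    # No direct solution within the timeout: relocate the object, run the
    # stored sequence (zero planning-time variance), then bring it back.
    return (
        relocate_to_dedicated_area
        + precomputed_manipulation
        + return_to_working_area
    )
```

The design trades an unbounded search time for a fixed, known detour, which lowers both the average and the variance of the waiting time when direct planning is hard.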
In some embodiments, a time management system that minimizes the probability of failing to meet temporal deadline requirements by minimizing the average and the variance of waiting times comprises: a pre-defined list of states and a corresponding list of operations; a pre-computed and stored set of optimized sequences of IK solutions to perform the operations in the pre-defined states; parallel search and generation of AP and APAs (Cartesian trajectories or sequences of IK solutions) towards the target state and the set of the pre-defined states; and APSB selection among the executable APAs or AP, based on the performance metrics for the corresponding APA.
In some embodiments, the average and the variance of waiting times may be minimized with the use of pre-defined and pre-calculated states and solutions, which essentially produce zero contribution to the total average and variance when performed in a sequence of actions, from initial state to pre-defined state where the stored sequence is executed and then back to target state.
In some embodiments, the choice of the pre-defined states with minimal number of constraints, the empirically obtained list may include, but not limited to,
In some embodiments, the APSB selection scheme performs the following sequence of choices:
More specifically, as illustrated in
While a robotic kitchen is referenced herein as the type of robotic assisted environment 5002 in connection with some exemplary embodiments, it should be understood that the robotic assisted environment 5002 can be any of a number of other environments known to those of skill in the art. Moreover, as known to those of skill in the art, the functionality of robotic kitchens that are configured as the robotic assisted environment 5002 can be applied to other such environments.
Non-exhaustive illustrative examples of other types of robotic assisted environments 5002 are shown in Table 1 below. Table 1 below also illustrates examples of objectives to be achieved via interactions performed by a robotic assistant 5002r deployed in each corresponding type of robotic assisted environment.
Other examples of robotic assisted environments can include a street, a bedroom, a living room, and the like.
As described above, the robotic assisted environment 5002 is associated with or includes a workspace 5002w. Although only a single workspace 5002w is illustrated in connection with robotic assisted environment 5002, it should be understood that multiple workspaces 5002w can be included in the robotic assisted environment 5002. The workspace 5002w can be any area, section, or part of the robotic assisted environment 5002 in or with which the robotic assistant 5002r can operate (e.g., interact, communicate, etc.). For instance, in some embodiments in which the robotic assisted environment 5002 is a robotic kitchen, the robotic assisted workspace 5002w can be a counter, a cooking surface or module, a washing station, a storage area (e.g., cupboard) and/or the like. That is, a robotic assisted workspace 5002w can refer to individual areas, sections, or parts of the robotic kitchen, or a group thereof that are coupled or decoupled physically or logically to one another. While the robotic assisted workspace 5002w can refer to physical sections or parts of the environment 5002, in some embodiments, the robotic assisted workspace 5002w can refer to and/or include non-tangible parts, such as a space or area between multiple physical parts. In some embodiments, the robotic assisted workspace 5002w (and/or the robotic assisted environment 5002) can include parts, systems, components or the like that are remotely located (e.g., cloud system, remote storage, remote client stations, etc.)
The robotic assisted workspace 5002w is described in further detail below. Nonetheless, in some embodiments, the robotic assisted workspace 5002w (and/or its respective environment 5002) includes and/or is associated with one or more workspace (or environment) objects, which can include physical parts, components, instruments, systems, elements or the like. More specifically, in some embodiments, the objects can include at least one module, sensor, utensil, equipment, stovetop/cooktop, sink, appliance (e.g., dishwasher, refrigerator, blender, etc.), and other objects known to those of skill in the art that can be used by the robotic assistant 5002r to achieve a target objective, as described in further detail herein. It should be understood that the objects can be embedded or built-in the workspace 5002w (e.g., dishwasher) or environment 5002 (e.g., kitchen), can be detachable or separable therefrom (e.g., pan, mixer), or can be fully or partially remotely located from the workspace 5002w and/or environment 5002 (e.g., remote storage). In some embodiments, the robotic assisted workspace 5002w can be or include the objects illustrated in
Objects can be categorized, for example, as static objects, dynamic objects, standard objects and/or non-standard objects. The categorization of each object can impact the manner in which the robotic assistant interacts with the objects. Static objects are objects that can be interacted with but cannot or are typically not moved or physically altered. For example, static objects in a kitchen environment can include an overhead light, a sink, and a shelf. On the other hand, dynamic objects are objects that can be (or are actively) changed or altered (e.g., physically). For example, dynamic objects in a kitchen environment can include a spoon (which can be moved) and a fruit (which can be changed and altered).
Standard objects are those objects that do not typically change in size, material, format, and/or texture; are not typically modifiable; and/or do not typically necessitate any adjustment thereof to be manipulated. Illustrative, non-exhaustive examples of standard objects in a kitchen environment include plates, cups, knives, griddles, lamps, bottles, and the like. Non-standard objects are typically enabled or configured to be modified; and/or typically necessitate detection and/or identification of their characteristics (e.g., size, material, format, texture, etc.) to be optimally manipulated or interacted with. Illustrative, non-exhaustive examples of non-standard objects include hand soap, candles, pencils, ingredients (e.g., sugar, oil), produce and other plants (e.g., herbs, tomatoes), and the like.
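The two categorization axes described above (static or dynamic, standard or non-standard) can be represented as a small data structure. This is a minimal sketch with hypothetical names; the category assigned to each example object on the axis the text does not mention (for instance, whether a plate is static or dynamic) is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mobility(Enum):
    STATIC = "static"    # interacted with, but typically not moved or altered
    DYNAMIC = "dynamic"  # can be moved, changed, or altered


class Standardization(Enum):
    STANDARD = "standard"
    NON_STANDARD = "non-standard"


@dataclass(frozen=True)
class WorkspaceObject:
    name: str
    mobility: Mobility
    standardization: Standardization


def may_be_moved(obj: WorkspaceObject) -> bool:
    """Only dynamic objects are candidates for relocation by the robot."""
    return obj.mobility is Mobility.DYNAMIC


# Example objects; the second category of each is an assumption.
SINK = WorkspaceObject("sink", Mobility.STATIC, Standardization.STANDARD)
PLATE = WorkspaceObject("plate", Mobility.DYNAMIC, Standardization.STANDARD)
TOMATO = WorkspaceObject("tomato", Mobility.DYNAMIC, Standardization.NON_STANDARD)
```

Keeping the two axes as separate fields lets the interaction logic query each one independently.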
As described in further detail herein, the objects of the robotic assisted workspace 5002w are manipulated and/or interacted with by the robotic assistant 5002r to achieve target objectives. In some embodiments, a specific environment or type of environment is defined in part by and/or associated with a set of standard and/or non-standard objects, as well as a set of interactions available to be performed on, to or with those objects within that type of environment. Table 3 below illustrates non-exhaustive examples of objects in and/or defining an environment, and the interactions that can be performed thereon, therewith or thereto.
It should be understood that different and/or additional data regarding the environment, object and interactions, and/or other fields not illustrated in Table 3 above, can be maintained and/or stored. In some embodiments, data (e.g., templates) relating to specific environments or types of environments, including for example the objects (e.g., object templates) and interactions configured therefor, can be stored in a memory of the robotic assisted environment 5002, cloud computing system 5006, the robotic assistant management system 5004, and/or one or more of the third-party systems 5008. Each set of environment data can be stored and/or referred to as an environment library. An environment library can be downloaded to a memory of the robotic assistant 5002r such that, in turn, the processors (e.g., high level processors, low level processors) can control or command the parts (e.g., end effectors) of the robotic assistant 5002r to perform the desired interactions. As described in further detail below, in some embodiments, objects in an environment are detected, identified, classified and/or categorized prior to and/or during interactions to optimize the results of the interactions and overall target objectives. As also described in further detail below, in some embodiments, objects can be provided with one or more markers to enable or optimize interactions therewith by the robotic assistant 5002r.
In some embodiments, to achieve the target objectives, the robotic assistant 5002r (together with the objects, robotic assisted workspace 5002w, and/or robotic assisted environment 5002) can communicate with the cloud computing system 5006, the robotic assistant management system 5004, and/or the third-party systems 5008, via the network 5010, prior to, during or after the execution of the interactions designed to achieve the objective. The network 5010 can include one or more networks. Non-limiting examples of the network 5010 include the Internet, a personal area network (PAN), a local area network (LAN), a wide area network (WAN), an enterprise private network (EPN), a virtual private network (VPN), and the like. Such communications via the network 5010 can be performed using a variety of wired and wireless techniques, standards and protocols, known to those of skill in the art, including Wi-Fi, Bluetooth, cellular or satellite service, and short- and long-range communications known to those of skill in the art.
The cloud computing system 5006 refers to an infrastructure made up of shared computing resources and data that is accessible to other systems or devices of the ecosystem 5000. The shared computing resources of the cloud computing system 5006 can include networks, servers, storage, applications, data, and services. A person of skill in the art will understand that any type of data and devices can be included in the cloud 5006. For example, the cloud 5006 can store recipes or libraries of information (e.g., environments, objects, etc.) that can be made available to and downloaded by or to the robotic assisted environment 5002 (or workspace 5002w, robotic assistant 5002r, and the like). The robotic assisted environment 5002 can request and/or receive the data or information from the cloud computing system 5006. For instance, in one example embodiment in which the robotic assisted environment 5002 is a robotic kitchen, the cloud computing system 5006 can store cooking recipes (e.g., created and uploaded by or from other robotic kitchens) that can in turn be downloaded to the environment 5002 for execution.
The robotic assisted environment 5002 can also communicate, via the network 5010, with the robotic assistant management system 5004. The robotic assistant management system 5004 is a system or set of systems that are controlled or managed by a robotic assistant management entity. Such an entity can be a manufacturer of the robotic assisted environment 5002 or robotic assistant 5002r, or an entity configured to provide supervision or oversight, for instance, by deploying updates, expansions, patches, fixes, and the like.
The third-party systems 5008 can be any system or set of systems managed by third-party entities (or individuals), with which the robotic assisted environment 5002 and/or robotic assistant 5002r can communicate to achieve desired objectives. Non-exhaustive examples of such third-party entities include developers, chefs (e.g., corresponding to chef studio kitchens), social media providers, retail companies, e-commerce providers, and/or the like. These entities and their corresponding systems 5008 can be used to collect, generate, send, and/or store data such as social media feeds, product delivery information, weather, shipment status, plugins, widgets, apps, and other data as known to those of skill in the art.
Robotic Assistant
As discussed above, the robotic assisted environment 5002 includes the robotic assistant 5002r. The robotic assistant 5002r can be deployed in various workspaces 5002w (or environments 5002), and in different structural configurations. For example,
The robotic assistant 5002r can consist of a single, continuous body or structure, or can be made up of detached or detachable components. The robotic assistant 5002r can also include or have portions that are remotely located from other portions or bodies of the robotic assistant 5002r. For example, as shown in
In some embodiments, the robotic assistant 5002r can be configured or programmed to work solely with or within a particular robotic assisted environment 5002 and/or workspace 5002w. For instance, the robotic assistant 5002r can be attached (e.g., fixedly, movably and/or removably) to or within the robotic assisted environment 5002, in part or in whole. In some embodiments, as described herein, portions of the robotic assistant 5002r can be attached or fixed to rails, actuators, or the like at one or more areas of the robotic assisted environment 5002 (e.g.,
Moreover, the anatomy and/or structure of the robotic assistant 5002r can vary and be configured in accordance with its intended purpose, objectives and/or corresponding environment.
The exemplary robot anatomy 5002r-1 includes head 5002r-1a, torso 5002r-1b, end effector 5002r-1c, . . . , and end effector 5002r-1n. Again, it should be understood that the number and types of parts of the robot anatomy 5002r-1, and the physical and/or logical (e.g., communicative, software, non-tangible) connections therebetween, can vary from the illustrated example. In some embodiments, the head 5002r-1a refers to an upper portion of the robotic assistant 5002r; the torso 5002r-1b refers to a portion of the robotic assistant 5002r that extends away from the head 5002r-1a, and to which the end effectors 5002r-1c and 5002r-1n are connected (e.g., directly or through another portion (e.g., arm)).
Each of these parts of the robot anatomy 5002r-1 can have or be connected to (physically and/or logically) one or more of the processors 5002r-2, memories 5002r-3, and/or sensors 5002r-4 (and/or other components, systems, subsystems, as known to those of skill in the art that are not illustrated in
In some embodiments, the end effectors 5002r-1c and/or 5002r-1n (sometimes interchangeably referred to herein as “manipulators”) refer to portions of the robotic assistant 5002r that are configured and/or designed to interact with objects in an environment, as described in further detail below. The end effectors extend and/or are disposed away from the torso 5002r-1b, and are physically connected thereto directly or indirectly. That is, in some embodiments, the end effectors can refer to a grouping of parts (e.g., robotic shoulder, arm, wrist, hand, palm, fingers, grippers, and/or objects connected thereto) or to a single part (e.g., gripper) of the robotic assistant 5002r that are disposed at a distalmost position relative to the torso 5002r-1b of the robotic assistant 5002r. In some embodiments, the end effectors can be connected directly to the torso 5002r-1b, or can be indirectly connected thereto through another part or set of parts, such as a gripper type end effector 5002r-1c that is connected to the torso 5002r-1b through an arm not deemed to be part of the end effector. It should be understood that the configuration (e.g., aspects, design, structure, purpose, materials, size, functions, etc.) of the end effectors (e.g., 5002r-1c and 5002r-1n) can vary from one to the next, as deemed optimal or necessary for each of their respective objectives, such that, for example, one end effector can have five fingers and another end effector can have two fingers. In some embodiments, an end effector refers to a hand having one or more robotic fingers, a palm, and a wrist.
The processors 5002r-2a, 5002r-2b, and 5002r-2n (collectively referred to herein as “5002r-2”) refer to processors that are physically or logically connected to the robotic assistant 5002r. For example, one of the processors 5002r-2 can be located remotely (e.g., server, cloud) relative to the physical robotic assistant 5002r and/or its anatomy 5002r-1, while another one of the processors 5002r-2 can be embedded and/or physically disposed on or in the robotic assistant 5002r and/or its anatomy 5002r-1. It should be understood that although the processors 5002r-2 are illustrated in
In some embodiments, the processors 5002r-2 can include high level processors and/or low-level processors. For illustrative purposes, in
In some embodiments, the high-level processor 5002r-2a can be configured to, for example, receive, generate and/or direct execution of processes or algorithms, such as algorithms (e.g., algorithms of interaction) that can correspond to recipes. For example, in some embodiments, the high-level processor 5002r-2a generates or obtains (e.g., downloads) an algorithm (e.g., algorithm of interaction). The algorithm can correspond to a part or all of a recipe or process. The algorithm is made up of commands (or instructions) to be executed to perform an interaction. Based on the algorithm, the high-level processor 5002r-2a commands or sends the commands to the appropriate low level processor 5002r-2b (or low level processors). The low-level processor 5002r-2b executes the commands received from or commanded by the high level processor 5002r-2a. In some embodiments, to execute the commands, the low-level processor 5002r-2b controls its corresponding part(s) of the anatomy, such as an end effector, for instance, by causing a local driver unit to operate the kinematic chains or other components of the part(s) of the anatomy.
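The division of labor between the high-level and low-level processors can be illustrated as follows. This is a minimal sketch, assuming an algorithm of interaction is a list of (part, command) pairs and that each low-level processor controls exactly one part of the anatomy; the class and method names are hypothetical, and the low-level processor only records commands instead of driving real hardware.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LowLevelProcessor:
    part: str  # part of the anatomy this processor controls
    executed: List[str] = field(default_factory=list)

    def execute(self, command: str) -> None:
        # A real controller would have a local driver unit operate the
        # part's kinematic chain here; this sketch only logs the command.
        self.executed.append(command)


@dataclass
class HighLevelProcessor:
    low_level: Dict[str, LowLevelProcessor]

    def run(self, algorithm: List[Tuple[str, str]]) -> None:
        """Send each command to the low-level processor of its part."""
        for part, command in algorithm:
            if part not in self.low_level:
                raise KeyError(f"no low-level processor for part {part!r}")
            self.low_level[part].execute(command)
```

A usage example: a high-level processor holding low-level processors for "left_hand" and "right_hand" would route `("left_hand", "grasp")` only to the left-hand controller.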
In some embodiments, the execution of the commands can be performed by or in conjunction with the high-level processor 5002r-2a. Thus, other components or subsystems (e.g., memories, sensors) corresponding to or controlled by each of the processors 5002r-2 can be synchronized or communicatively coupled to provide accurate and efficient interaction between the processors 5002r-2. For example, in some embodiments, memories 5002r-3a and 5002r-3b can correspond to the high-level processor 5002r-2a and low-level processor 5002r-2b, respectively. These two memories can thus be mapped to or between each other for enhanced processing. The processors 5002r-2 can also share data, such as information about workspace models that define the robotic assisted workspaces (e.g., robotic assisted workspace 5002w). Such information can include, for example, data about objects in the workspace, including their position, size, types, materials, gravity directions, weights, velocities, expected positions and the like. Using this information, the low-level processor 5002r-2b corresponding to a particular part of the robot anatomy 5002r-1 (e.g., end effector 5002r-1c) can control that part to interact with or manipulate objects more effectively, for instance, by avoiding collisions. Moreover, as described in further detail herein, the workspace data (e.g., workspace models) can be stored and updated to provide reinforcement learning abilities for training of the robotic assistant 5002r to perform optimal movements for each object, interaction, condition, and the like. Workspace data and/or workspace models can be generated, updated, and/or stored (e.g., with data obtained by the sensors 5002r-4), as described in further detail herein, for example,
As described herein, workspace data (e.g., workspace models) and/or environment data (e.g., environment libraries) can be stored in a remote memory (e.g., cloud computing system 5006) and/or in one of the memories 5002r-3 of the robotic assistant 5002r. The memories 5002r-3 of the robotic assistant 5002r include memory 5002r-3a, 5002r-3b, and 5002r-3n (collectively referred to herein as “5002r-3”). These memories 5002r-3 refer to memories that are physically or logically connected to the robotic assistant 5002r. For example, one of the memories 5002r-3 can be located remotely (e.g., server, cloud) relative to the physical robotic assistant 5002r and/or its anatomy 5002r-1, while another one of the memories 5002r-3 can be embedded and/or physically disposed on or in the robotic assistant 5002r and/or its anatomy 5002r-1. It should be understood that although the memories 5002r-3 are illustrated in
Still with reference to
Thus, as illustrated in
In some embodiments, the sensors 5002r-4 can include pressure type sensors.
Although not illustrated in connection with
Moreover, while also not illustrated in connection with
Traditionally humans can readily interact with capacitive screens or surfaces using, for example, a finger, which can transmit a small electrical charge to the touchscreen or surface. Other instruments such as styluses and similar conductive devices can also be configured to pass a charge from a user's hand, through the stylus, to the touchscreen. In some embodiments, the robotic assistant 5002r and/or its end effectors (e.g., 5002r-1c, 5002r-1n) can include end portions such as tips (e.g., fingertips) that are configured to transmit an electrical charge onto a capacitive touchscreen or surface. The electrical charge can be obtained, for example, from a motor electrical terminal, battery or other power source included in or coupled to the robotic assistant 5002r and/or its end effectors. Portions of the end effectors 5002r-1c and/or 5002r-1n can be made of a material, or covered in a material or paint, that has conductive properties that enable the electrical charge to pass from its source, through the end effector and its capacitive portions (e.g., fingertips), onto the touchscreen or surface. By virtue of such a configuration, the robotic assistant 5002r can interact, much like a human, with capacitive touchscreens and surfaces such as mobile devices (e.g., iPhone, iPad), wearable devices (e.g., iWatch), laptops, or any other touchscreen or touch surface provided throughout the environment 5002 in which the robotic assistant 5002r is deployed (e.g., screens provided on drawers in a virtual or computer-controlled kitchen).
It should be understood that the robotic assistant 5002r can execute commands and instructions to perform a target objective of a recipe. The execution of commands and instructions can be performed using the robot anatomy 5002r-1, processors 5002r-2, memories 5002r-3, sensors 5002r-4 and/or other software or hardware components of the robotic assistant 5002r that are not illustrated in
For instance, the robotic assisted environment 5002 and/or the robotic assisted workspace 5002w can include and/or be associated with processors, memories, sensors, and parts or components thereof, which as described above can communicate and/or interact with the robotic assistant 5002r. In some embodiments, the robotic assisted environment 5002 can include as part of its sensors one or more cameras. The cameras can (1) be any type of camera known to those of skill in the art; (2) be mounted or static; and (3) be configured to capture still or moving images during a recipe performance process by the robotic assistant. The captured data (e.g., images) can be shared with the robotic assistant 5002r in real-time, to provide more accurate executing of commands. In one illustrative embodiment, a camera of the robotic assisted environment 5002 can be used to identify the position of a part of an object that cannot be readily or effectively imaged by a camera of the robotic assistant 5002r. In this way, the two cameras can work together to ensure that the position and other characteristics of the object are optimally known and thus perfect or near-perfect interactions with that object can be performed. Other sensors or components of the robotic assisted environment 5002 can be employed to achieve such optimal execution of commands and recipes.
In some embodiments, aspects of the robotic assistant 5002r, robotic assisted workspace 5002w, and/or robotic environment 5002 cooperate to form subsystems or modules configured to, among other things, position the robotic assistant 5002r in a desired location, scan an environment or workspace to detect objects therein and their characteristics, and/or change the environment or workspace, for example, by interacting with or manipulating objects. Examples of these modules and/or subsystems include a navigation module and a vision subsystem (e.g., general, embedded), which are described in further detail below.
Interactions Using the Robotic Assistant System
As described above, the robotic assistant system 5002r can be deployed within the robotic assisted environment 5002 to perform a recipe, which is a series of interactions configured to achieve a desired objective. For instance, a recipe can be a series of interactions in an automobile shop configured to achieve the objective of changing a tire, or can be a series of interactions in a kitchen to achieve the objective of cooking a desired dish. It should be understood that interactions refer to actions or manipulations performed by the robotic assistant 5002r to or with, among other things, the objects in the robotic assisted environment 5002 and/or workspace 5002w.
As shown, at step 6050, the robotic assistant 5002r navigates to a desired or target environment or workspace in which a recipe is to be performed. In the example embodiment described with reference to
Navigating to the target environment 5002 and workspace 5002w can be triggered by a command received by the robotic assistant locally (e.g., via a touchscreen or audio command), received remotely (e.g., from a client system, third party system, etc.), or received from an internal processor of the robotic assistant that identifies the need to perform a recipe (e.g., according to a predetermined schedule). In response to such a trigger, the robotic assistant 5002r moves and/or positions itself at an optimal area within the environment 5002. Such an optimal area can be a predetermined or preconfigured position (e.g., position 0, described in further detail below) that is a default starting point for the robotic assistant. Using a default position enables the robotic assistant 5002r to have a starting point of reference, which can provide more accurate execution of commands.
As described above, the robotic assistant 5002r can be a standalone and independently movable structure (e.g., a body on wheels) or a structure that is movably attached to the environment or workspace (e.g., robotic parts attached to a multi-rail and actuator system). In either structural scenario, the robotic assistant 5002r can navigate to the desired or target environment. In some embodiments, the robotic assistant 5002r includes a navigation module that can be used to navigate to the desired position in the environment 5002 and/or workspace 5002w.
In some embodiments, the navigation module is made up of one or more software and hardware components of the robotic assistant 5002r. For example, the navigation module that can be used to navigate to a position in the environment 5002 or workspace 5002w employs robotic mapping and navigation algorithms, including simultaneous localization and mapping (SLAM) and scene recognition (or classification, categorization) algorithms, among others known to those of skill in the art, that are designed to, among other things, perform or assist with robotic mapping and navigation. At step 6050, for instance, the robotic assistant 5002r navigates to the workspace 5002w in the environment 5002 by executing a SLAM algorithm or the like to generate or approximate a map of the environment 5002, and localize itself (e.g., its position) or plan its position within that map. Moreover, using the SLAM algorithm, the navigation module enables the robotic assistant 5002r to identify its position relative to distinctive visual features within the environment 5002 or workspace 5002w and plan its movement relative to those visual features within the map. Still with reference to step 6050, the robotic assistant 5002r can also employ scene recognition algorithms in addition to or in combination with the navigation and localization algorithms, to identify and/or understand the scenes or views within the environment 5002, and/or to confirm that the robotic assistant 5002r achieved or reached its desired position, by analyzing the detected images of the environment.
In some embodiments, the mapping, localization and scene recognition performed by the navigation module of the robotic assistant can be trained, executed and re-trained using neural networks (e.g., convolutional neural networks). Training of such neural networks can be performed using exemplary or model workspaces or environments corresponding to the workspace 5002w and the environment 5002.
It should be understood that the navigation module of the robotic assistant 5002r can include and/or employ one or more of the sensors 5002r-4 of the robotic assistant 5002r, or sensors of the environment 5002 and/or the workspace 5002w, to allow the robotic assistant 5002r to navigate to the desired or target position. That is, for example, the navigation module can use a position sensor and/or a camera, for example, to identify the position of the robotic assistant 5002r, and can also use a laser and/or camera to capture images of the “scenes” of the environment to perform scene recognition. Using this captured or sensed data, the navigation module of the robotic assistant 5002r can thus execute the requisite algorithms (e.g., SLAM, scene recognition) used to navigate the robotic assistant 5002r to the target location in the workspace 5002w within the environment 5002.
At step 6052, the robotic assistant 5002r identifies the specific instance and/or type of the workspace 5002w and/or environment 5002 to which the robotic assistant navigates at step 6050 to execute a recipe. It should be understood that the identification of step 6052 can occur prior to, simultaneously with, or after the navigation of step 6050. For instance, the robotic assistant 5002r can identify the instance or type of the workspace 5002w and/or environment 5002 using information received or retrieved in order to trigger the navigation of step 6050. Such information, as discussed above, can include a request received from a client, third party system, or the like. Such information can therefore identify the workspace and environment with which a request to execute the recipe is associated. For example, the request can identify that the workspace 5002w is a RoboKitchen model 1000. On the other hand, during or after the navigation of step 6050, the robotic assistant can identify that the environment and workspace through which it is navigating is a RoboKitchen (model 1000), by identifying distinctive features in the images obtained during the navigation. As described below, this information can be used to more effectively and/or efficiently identify the objects therein with which the robotic assistant can interact.
At step 6054, the robotic assistant 5002r identifies the objects in the environment 5002 and/or workspace 5002w and thus with which the robotic assistant 5002r can communicate. Identifying of the objects of step 6054 can be performed either (1) based on the instance or type of environment and workspace identified at step 6052, and/or (2) based on a scan of the workspace 5002w. In some embodiments, identifying the objects at step 6054 is performed using, among other things, a vision subsystem of the robotic assistant 5002r, such as a general-purpose vision subsystem (described in further detail below). As described in further detail below, the general-purpose vision subsystem can include or use one or more of the components of the robotic assistant 5002r illustrated in
Still with reference to
Still with reference to
Moreover, at step 6054, objects can be identified using the general-purpose vision subsystem 5002r-5 of the robotic assistant 5002r, which is used to scan the environment 5002 and/or workspace 5002w and identify the objects that are actually (rather than expectedly) present therein. The objects identified by the general-purpose vision subsystem 5002r-5 can be used to supplement and/or further narrow down the list of “known” objects identified as described above based on the specific instance or type of environment and/or workspace identified at step 6052. That is, the objects recognized by the scan of the general-purpose vision subsystem 5002r-5 can be used to cut down the list of known objects by eliminating therefrom objects that, while known and/or expected to be present in the environment 5002 and/or workspace 5002w, are actually not found therein at the time of the scan. Alternatively, the list of known objects can be supplemented by adding thereto any objects that are identified by the scan of the general-purpose vision subsystem 5002r-5. Such objects can be objects that were not expected to be found in the environment 5002 and/or workspace 5002w, but were indeed identified during the scan (e.g., by being manually inserted or introduced into the environment 5002 and/or workspace 5002w). By combining these two identification techniques, an optimal list of objects with which the robotic assistant 5002r is to interact is generated. Moreover, by referencing a pre-generated list of known objects, errors (e.g., omitted or misidentified objects) due to incomplete or less-than-optimal imaging by the general-purpose vision subsystem 5002r-5 can be avoided or reduced.
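The reconciliation of the expected-object list with the scanned-object list described above can be sketched as follows. This is a minimal illustration only, assuming that objects are represented by plain string identifiers; the function name `reconcile_objects` and the sample object names are hypothetical and are not part of the disclosure.

```python
def reconcile_objects(known, scanned):
    """Combine the list of expected objects with the objects actually detected.

    known   -- identifiers expected for this instance or type of workspace
    scanned -- identifiers detected by the scan of the vision subsystem
    (Both inputs are hypothetical representations chosen for illustration.)
    """
    known, scanned = set(known), set(scanned)
    confirmed = known & scanned    # expected and actually found
    missing = known - scanned      # expected but not found: eliminated
    unexpected = scanned - known   # found but not expected: added
    return {
        "confirmed": sorted(confirmed),
        "missing": sorted(missing),
        "unexpected": sorted(unexpected),
        # Objects with which the robotic assistant can interact.
        "working_list": sorted(confirmed | unexpected),
    }


result = reconcile_objects(
    known=["cup", "knife", "plate"],
    scanned=["cup", "plate", "apple"],
)
```

In this sketch the knife is dropped from the working list because it was not found during the scan, while the apple is added because it was found although not expected.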
As shown in
Still with reference to
In some embodiments, the cameras 5002r-4 illustrated in
The camera system can also be said to include the illustrated structured light and smooth light, which can be built or embedded in the cameras 5002r-4 or separate therefrom. It should be understood that the lights can be embedded in or separate from (e.g., logically connected to) the robotic assistant 5002r. Moreover, the camera system can also be said to include the illustrated camera calibration module 5002r-5-1 and the rectification and stitching module 5002r-5-2.
At step 7050, the cameras 5002r-4 are used to capture images of the workspace 5002w for calibration. Prior to capturing the images to be used for camera calibration, a checkerboard or chessboard pattern (or the like, as known to those of skill in the art) is disposed or provided on predefined positions of the workspace 5002w. The pattern can be formed on patterned markers that are outfitted on the workspace 5002w (e.g., top surface thereof). Moreover, in some embodiments such as the one illustrated in
In turn, at step 7054, calibration of the cameras is performed to provide more accurate imaging such that optimal and/or perfect execution of commands of a recipe can be performed. That is, camera calibration enables more accurate conversion of image coordinates obtained from images captured by the cameras 5002r-4 into real world coordinates of or in the workspace 5002w. In some embodiments, the camera calibration module 5002r-5-1 of the general-purpose vision subsystem 5002r-5 is used to calibrate the cameras 5002r-4. As illustrated, the camera calibration module 5002r-5-1 can be driven by the CPU 5002r-2a.
The cameras 5002r-4, in some embodiments, are calibrated as follows. The CPU 5002r-2a detects the pattern (e.g., checkerboard) in the images of the workspace 5002w captured at step 7050. Moreover, the CPU 5002r-2a locates the internal corners in the detected pattern of the captured images. The internal corners are the corners where four squares of the checkerboard meet and that do not form part of the outside border of the checkerboard pattern disposed on the workspace 5002w. For each of the identified internal corners, the general-purpose vision subsystem 5002r-5 identifies the corresponding pixel coordinates. In some embodiments, the pixel coordinates refer to the coordinate on the captured images at which the pixel corresponding to each of the internal corners is located. In other words, the pixel coordinates indicate where each internal corner of the checkerboard pattern is located in the images captured by the cameras 5002r-4, as measured in an array of pixels.
Still with reference to the calibration of step 7054, real world coordinates are assigned to each of the identified pixel coordinates of the internal corners of the checkerboard pattern of. In some embodiments, the respective real-world coordinates can be received from another system (e.g., library of environments stored in the cloud computing system 5006) and/or can be input to the robotic apparatus 5002r and/or the general-purpose vision subsystem 5002r-5. For example, the respective real-world coordinates can be input by a system administrator or support engineer. The real-world coordinates indicate the real-world position in space of the internal corners of the checkerboard pattern of the markers on the workspace 5002w.
Using the calculated pixel coordinates and real-world coordinates for each internal corner of the checkerboard pattern, the general-purpose vision subsystem 5002r-5 can generate and/or calculate a projection matrix for each of the cameras 5002r-4. The projection matrix thus enables the general-purpose vision subsystem 5002r-5 to convert pixel coordinates into real world coordinates. Thus, the pixel coordinate position and other characteristics of objects, as viewed in the images captured by the cameras 5002r-4, can be translated into real world coordinates in order to identify where in the real world (as opposed to where in the captured image) the objects are positioned.
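The relationship between corner correspondences and a pixel-to-world mapping can be illustrated with the following sketch. It assumes, for simplicity, that the checkerboard lies flat on the workspace surface, so that the mapping reduces to a 3x3 planar homography estimated with the direct linear transform; the disclosure does not specify this particular technique. The function names and the sample corner coordinates are hypothetical.

```python
import numpy as np


def estimate_homography(pixel_pts, world_pts):
    """Estimate a 3x3 matrix H mapping pixel (u, v) to planar world (X, Y).

    Uses the direct linear transform (DLT); at least four correspondences
    with no three points collinear are required.
    """
    rows = []
    for (u, v), (x, y) in zip(pixel_pts, world_pts):
        rows.append([-u, -v, -1, 0, 0, 0, x * u, x * v, x])
        rows.append([0, 0, 0, -u, -v, -1, y * u, y * v, y])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def pixel_to_world(h, u, v):
    """Apply H to one pixel coordinate and return the world (X, Y)."""
    x, y, w = h @ np.array([u, v, 1.0])
    return x / w, y / w


# Hypothetical internal-corner correspondences (pixels -> millimetres).
pixel_corners = [(100, 100), (500, 100), (500, 400), (100, 400)]
world_corners = [(0, 0), (400, 0), (400, 300), (0, 300)]

h = estimate_homography(pixel_corners, world_corners)
x, y = pixel_to_world(h, 300, 250)  # approximately (200, 150) mm
```

A full camera calibration would typically also estimate lens distortion and the camera's intrinsic parameters; this sketch covers only the planar mapping step.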
As described herein, the robotic assistant 5002r can be a standalone and independently movable system or can be a system that is fixed to the workspace 5002w and/or other portion of the environment 5002. In some embodiments, parts of the robotic assistant 5002r can be freely movable while other parts are fixed to (and/or be part of) portions of the workspace 5002w. Nevertheless, in some embodiments in which the camera system of the general-purpose vision subsystem 5002r-5 is fixed, the calibration of the cameras 5002r-4 is performed only once and later reused based on that same calibration. Otherwise, if the robotic assistant 5002r and/or its cameras 5002r-4 are movable, camera calibration is repeated each time that the robotic assistant 5002r and/or any of its cameras 5002r-4 change position.
It should be understood that the checkerboard pattern (or the like) used for camera calibration can be removed from the workspace 5002w once the cameras have been calibrated and/or use of the pattern is no longer needed. Although, in some cases, it may be desirable to remove the checkerboard pattern as soon as the initial camera calibration is performed, in other cases it may be optimal to preserve the checkerboard markers on the workspace 5002w such that subsequent camera calibrations can more readily be performed.
With the cameras 5002r-4 calibrated, the general-purpose vision subsystem 5002r-5 can begin identifying objects with more accuracy. To this end, at step 7056, the cameras 5002r-4 capture images of the workspace 5002w (and/or environment 5002) and transmit those captured images to the CPU 5002r-2a. The images can be still images, and/or video made up of a sequence of continuous images. Although the sequence diagram 7000 of
At step 7058, the captured images received at step 7056 are rectified by the rectification and stitching module 5002r-5-2 using the CPU 5002r-2a. In some example embodiments, rectification of the images captured by each of the cameras 5002r-4 includes removing distortion in the images, compensating each camera's angle, and other rectification techniques known to those of skill in the art. In turn, at step 7060, the rectified images captured from each of the cameras 5002r-4 are stitched together by the rectification and stitching module 5002r-5-2 to generate a combined captured image of the workspace 5002w (e.g., the entire workspace 5002w). The X and Y axes of the combined captured image are then aligned with the real-world X and Y axes of the workspace 5002w. Thus, pixel coordinates (x,y) on the combined image of the workspace 5002w can be transferred or translated into corresponding (x,y) real world coordinates. In some embodiments, such a translation of pixel coordinates to real world coordinates can include performing calculations using a scale or scaling factor calculated by the camera calibration module 5002r-5-1 during the camera calibration process.
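Once the axes of the combined image are aligned with those of the workspace, the translation from pixel coordinates to real world coordinates can reduce to a scaling and an offset, as in the following sketch. The function name, the `mm_per_pixel` scale factor, and the `origin_world` offset are assumptions made for illustration; the sketch further assumes that the image axes point in the same directions as the workspace axes.

```python
def combined_pixel_to_world(px, py, mm_per_pixel, origin_world=(0.0, 0.0)):
    """Convert a pixel (px, py) on the axis-aligned combined image to world (x, y).

    mm_per_pixel -- scale factor assumed to be produced by camera calibration
    origin_world -- assumed world coordinates of image pixel (0, 0)
    """
    return (
        origin_world[0] + px * mm_per_pixel,
        origin_world[1] + py * mm_per_pixel,
    )


# A pixel at (640, 360) with a scale of 0.5 mm per pixel.
world_xy = combined_pixel_to_world(640, 360, mm_per_pixel=0.5)  # (320.0, 180.0)
```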
In turn, at step 7062, the combined (e.g., stitched) image generated by the rectification and stitching module 5002r-5-2 is shared (e.g., transmitted, made available) with other modules, including the object detection module 5002r-5-4, to identify the presence of objects in the workspace 5002w and/or environment 5002 by detecting objects within the captured image. Moreover, at step 7064, the cloud computing system 5006 transmits libraries of known objects and surfaces stored therein to the general-purpose vision subsystem 5002r-5, and in particular to the GPU 5002r-2b. As discussed above, the libraries of known objects and surfaces that are transmitted to the general-purpose vision subsystem 5002r-5 can be specific to the instance or type of the environment 5002 and/or the workspace 5002w, such that only data definitions of objects known or expected to be identified are sent. Transmission of these libraries can be initiated by the cloud computing system 5006 (e.g., pushed), or can be sent in response to a request from the GPU 5002r-2b and/or the general-purpose vision subsystem 5002r-5. It should be understood that transmission of the libraries of known objects can be performed in one or multiple transmissions, each or all of which can occur immediately prior to or at any point before the object detection of step 7068 is initiated.
At step 7066, the GPU 5002r-2b of the general-purpose vision subsystem 5002r-5 of the robotic apparatus 5002r downloads trained neural networks or similar mathematical models (and weights) corresponding to the known objects and surfaces associated with step 7064. These neural networks are used by the general-purpose vision subsystem 5002r-5 to detect or identify objects. As shown in
In turn, at step 7068, the object detection module 5002r-5-4 uses the GPU 5002r-2b to detect objects in the combined image (and therefore implicitly in the real-world workspace 5002w and/or environment 5002) based on or using the received and trained object detection neural networks (e.g., CNN, F-CNN, YOLO, SSD). In some embodiments, object detection includes recognizing, in the combined image, the presence and/or position of objects that match objects included in the libraries of known objects received at step 7064.
Moreover, at step 7070, the segmentation module 5002r-5-5 uses the GPU 5002r-2b to segment portions of the combined image and assign an estimated type or category to each segment based on or using a trained neural network, such as SegNet, received at step 7066. It should be understood that, at step 7070, the combined image of the workspace 5002w is segmented into pixels, though segmentation can be performed using a unit of measurement other than a pixel as known to those of skill in the art. Still with reference to step 7070, each of the segments of the combined image is analyzed by the trained neural network in order to be classified, by determining and/or approximating a type or category to which the contents of each pixel correspond. For example, the contents or characteristics of the data of a pixel can be analyzed to determine if they resemble a known object (e.g., category: “knife”). In some embodiments, pixels that cannot be categorized as corresponding to a known object can be categorized as a “surface,” if the pixel most closely resembles a surface of the workspace, and/or as “unknown,” if the contents of the pixel cannot be accurately classified. It should be understood that the detection and segmentation of steps 7068 and 7070 can be performed simultaneously or sequentially (in any order deemed optimal).
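The per-pixel classification with an “unknown” fallback can be illustrated as follows. This sketch does not run a segmentation network; it assumes that such a network has already produced per-class scores for every pixel, and it treats “surface” as one of the scored categories. The category list, the confidence threshold, and the function name are hypothetical.

```python
import numpy as np

LABELS = ["knife", "cup", "surface"]  # hypothetical categories
UNKNOWN = "unknown"


def classify_pixels(scores, min_confidence=0.5):
    """Assign a category to each pixel from per-class scores.

    scores -- array of shape (H, W, len(LABELS)) holding per-class
              probabilities, in a format assumed for this illustration.
    """
    best = scores.argmax(axis=-1)
    confident = scores.max(axis=-1) >= min_confidence
    names = np.array(LABELS + [UNKNOWN])
    # Pixels without a sufficiently confident class fall back to "unknown".
    return names[np.where(confident, best, len(LABELS))]


scores = np.array([
    [[0.90, 0.05, 0.05], [0.10, 0.10, 0.80]],
    [[0.40, 0.30, 0.30], [0.05, 0.90, 0.05]],
])
labels = classify_pixels(scores)
```

In this 2x2 example, three pixels receive a confident category and the pixel whose best score is only 0.40 is labeled “unknown”, leaving it for any later fallback processing.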
In turn, at step 7072, the results of the object detection of step 7068 and the segmentation results (and corresponding classifications) of step 7070 are transmitted by the GPU 5002r-2b to the CPU 5002r-2a. Based on these, at step 7074, the object analysis is performed by the marker detection module 5002r-5-3 and the contour analysis module 5002r-5-6, using the CPU 5002r-2a, to, among other things, identify markers (described in further detail below) on the detected objects, and calculate (or estimate) the shape and pose of each of the objects.
That is, at step 7074, the marker detection module 5002r-5-3 determines whether the detected objects include or are provided with markers, such as ArUco or checkerboard/chessboard pattern markers. Traditionally, standard objects are provided with markers. As known to those of skill in the art, such markers can be used to more easily determine the pose (e.g., position) of the object and manipulate it using the end effectors of the robotic assistant 5002r. Nonetheless, non-standard objects, when not equipped with markers, can be analyzed to determine their pose in the workspace 5002w using neural networks and/or models trained on that type of non-standard object, which allows the general-purpose vision subsystem 5002r-5 to estimate, among other things, the orientation and/or position of the object. Such neural networks and models can be downloaded and/or otherwise obtained from other systems such as the cloud computing system 5006, as described above in detail. In some embodiments, analysis of the pose of objects, particularly non-standard objects, can be aided by the use of structured lighting. That is, neural networks or models can be trained using structured lighting matching that of the environment 5002 and/or workspace 5002w. The structured lighting highlights aspects or portions of the objects, thereby allowing the module 5002r-5-3 to calculate the object's position (and shape, which is described below) to provide more optimal orientation and positioning of the object for manipulations thereon. Still with reference to step 7074, analysis of the detected objects can also include determining the shape of the objects, for instance, using the contour analysis module 5002r-5-6 of the general-purpose vision subsystem 5002r-5.
In some embodiments, contour analysis includes identifying the exterior outlines or boundaries of the shape of detected objects in the combined image, which can be executed using a variety of contour analysis techniques and algorithms known to those of skill in the art.
At step 7076, a quality check process is performed by the quality check module 5002r-5-7 using the CPU 5002r-2a, to further process segments of the image that were classified as unknown. This further processing by the quality check process serves as a fall back mechanism to provide last minute classification of “unknown” segments.
At step 7078, the results of the analysis of step 7074 and the quality check of step 7076 are used to update and/or generate the workspace model 5002w-1 corresponding to the workspace 5002w. In other words, data identifying the objects, and their shape, position, segment types, and other calculated or determined characteristics thereof are stored in association with the workspace model 5002w-1.
Moreover, with reference to step 6054, the process of identifying objects and downloading or otherwise obtaining information associated with each of the objects into the workspace model 5002w-1 can also include downloading or obtaining interaction data corresponding to each of the objects. That is, as described above in connection with
For example, a recipe to be performed in a kitchen can be to achieve a goal or objective such as cooking a turkey in an oven. Such a recipe can include or be made up of steps for marinating the turkey, moving the turkey to the refrigerator to marinate, moving the turkey to the oven, removing the turkey from the oven, etc. These steps that make up a recipe are made up of a list or set of specifically tailored (e.g., ordered) interactions (also referred to interchangeably as “manipulations”), which can be referred to as an algorithm of interactions. These interactions can include, for example: pressing a button to turn the oven on, turning a knob to increase the temperature of the oven to a desired temperature, opening the oven door, grasping the pan on which the turkey is placed and moving it into the oven, and closing the oven door. Each of these interactions is defined by a list or set of commands (or instructions) that are readable and executable by the robotic assistant 5002r. For instance, an interaction for turning on the oven can include or be made up of the following list of ordered commands or instructions:
Move finger of robotic end effector to real world position (x1, y1), where (x1, y1) are coordinates of a position immediately in front of the oven's “ON” button;
Advance finger of robotic end effector toward the “ON” button until X amount of opposite force is sensed by a pressure sensor of the end effector;
Retract finger of robotic end effector the same distance as in the preceding command.
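The three ordered commands above can be sketched as a small simulation. This is an illustrative sketch only: the function `press_button`, the step size, the force threshold, the travel limit, and the simulated force sensor are all assumptions and do not represent the command format of the robotic assistant 5002r.

```python
def press_button(target_xy, read_force, force_threshold=2.0,
                 step_mm=0.5, max_travel_mm=20.0):
    """Simulate the ordered commands for pressing a button.

    target_xy  -- position immediately in front of the button (x1, y1)
    read_force -- callable returning the opposing force sensed at a given
                  finger travel in millimetres (a stand-in for a pressure sensor)
    Returns the executed commands as a list of (command, value) pairs.
    """
    log = [("move_finger_to", target_xy)]
    travelled = 0.0
    # Advance in small steps until the opposing force reaches the threshold.
    while read_force(travelled) < force_threshold and travelled < max_travel_mm:
        travelled += step_mm
    log.append(("advance_mm", travelled))
    # Retract the same distance as in the preceding command.
    log.append(("retract_mm", travelled))
    return log


def simulated_sensor(travel_mm):
    """Hypothetical sensor: contact at 3 mm, then 1 N per additional mm."""
    return max(0.0, travel_mm - 3.0)


log = press_button((120.0, 45.0), simulated_sensor)
```

With this simulated sensor the finger advances 5.0 mm before the 2.0 N threshold is reached, and the retract command reuses that same distance.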
As discussed in further detail below, the commands can be associated with specific times at which they are to be executed and/or can simply be ordered to indicate the sequence in which they are to be executed, relative to other commands and/or other interactions (and their respective timings). The generation of an algorithm of interaction, and the execution thereof, is described in further detail below with reference to steps 6056 and 6058 of
As described herein, the robotic assistant 5002r can be deployed to execute recipes in order to achieve desired goals or objectives, such as cooking a dish, washing clothes, cleaning a room, placing a box on a shelf, and the like. To execute recipes, the robotic assistant 5002r performs sequences of interactions (also referred to as “manipulations”) using, among other things, its end effectors 5002r-1c and 5002r-1n. In some embodiments, interactions can be classified based on the type of object that is being interacted with (e.g., static object, dynamic object). Moreover, interactions can be classified as grasping interactions and non-grasping interactions.
Non-exhaustive examples of types of grasping interactions include (1) grasping for operating, (2) grasping for manipulating, and (3) grasping for moving. Grasping for operating refers to interactions between one or more of the end effectors of the robotic assistant 5002r and objects in the workspace 5002w (or environment 5002) in which the objective is to perform a function to or on the object. Such functions can include, for example, grasping the object in order to press a button on the object (e.g., ON/OFF power button on a handheld blender, mode/speed button on a handheld blender). Grasping for manipulating refers to interactions between one or more of the end effectors of the robotic assistant 5002r and objects in the workspace 5002w (or environment 5002) in which the objective is to perform a manipulation on or to the object. Such manipulations can include, for example: compressing an object or part thereof; applying axial tension on an X,Y or an X,Y,Z axis; compressing and applying tension; and/or rotating an object. Grasping for moving refers to interactions between one or more of the end effectors of the robotic assistant 5002r and objects in the workspace 5002w (or environment 5002) in which the objective is to change the position of the object. That is, grasping for moving type interactions are intended to move an object from point A to point B (and other points, if needed or desired), or change its direction or velocity.
On the other hand, non-exhaustive examples of types of non-grasping interactions include (1) operating without grasping; (2) manipulating without grasping; and (3) moving without grasping. Operating an object without grasping refers to interactions between one or more of the end effectors of the robotic assistant 5002r and objects in the workspace 5002w (or environment 5002) in which the objective is to perform a function without having to grasp the object. Such functions can include, for example, pressing a button to operate an oven. Manipulating an object without grasping refers to interactions between one or more of the end effectors of the robotic assistant 5002r and objects in the workspace 5002w (or environment 5002) in which the objective is to perform a manipulation without the need to grasp the object. Such functions can include, for example, holding an object back or away from a position or location using the palm of the robotic hand. Moving an object without grasping refers to interactions between one or more of the end effectors of the robotic assistant 5002r and objects in the workspace 5002w (or environment 5002) in which the objective is to move an object from point A to point B (and other points, if needed or desired), or change its direction or velocity, without having to grasp the object. Such non-grasping movement can be performed, for example, using the palm or backside of the robotic hand.
While interactions with dynamic objects can also be classified into grasping and non-grasping interactions, in some embodiments, interactions with dynamic objects can be approached differently by the robotic assistant 5002r, as compared with interactions with static objects. For example, when performing interactions with dynamic objects, the robotic assistant additionally: (1) estimates each object's motion characteristics, such as direction and velocity; (2) calculates each object's expected position at each time instance or moment of an interaction; and (3) preliminarily positions its parts or components (e.g., end effectors, kinematic chains) according to the calculated expected position of each object. Thus, in some embodiments, interactions with dynamic objects can be more complex than interactions with static objects, because, among other reasons, they require synchronization with the dynamically changing position (and other characteristics, such as orientation and state) of the dynamic objects.
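The estimation of motion characteristics and of expected positions can be illustrated with a simple constant-velocity model. The disclosure does not specify a motion model; constant velocity is assumed here only for illustration, and the function names and sample values are hypothetical.

```python
def estimate_velocity(p_prev, p_curr, dt):
    """Estimate (vx, vy) from two observed positions taken dt seconds apart."""
    return ((p_curr[0] - p_prev[0]) / dt, (p_curr[1] - p_prev[1]) / dt)


def predict_positions(position, velocity, times):
    """Extrapolate an object's (x, y) position at each future time offset,
    assuming the velocity stays constant."""
    x0, y0 = position
    vx, vy = velocity
    return [(x0 + vx * t, y0 + vy * t) for t in times]


# Two observations of an object, 0.5 s apart (units: metres, seconds).
velocity = estimate_velocity((0.0, 0.0), (1.0, 0.5), dt=0.5)
expected = predict_positions((1.0, 0.5), velocity, times=[0.5, 1.0])
```

The predicted positions could then be used to pre-position an end effector at the point where the object is expected to be at the moment of interaction.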
Moreover, interactions between end effectors of the robotic assistant 5002r and objects can also or alternatively be classified based on whether the object is a standard or non-standard object. As discussed above in further detail, standard objects are those objects that do not typically have changing characteristics (e.g., size, material, format, texture, etc.) and/or are typically not modifiable. Non-exhaustive, illustrative examples of standard objects include plates, cups, knives, lamps, bottles, and the like. Non-standard objects are those objects that are deemed to be “unknown” (e.g., unrecognized by the robotic assistant 5002r), and/or are typically modifiable, adjustable, or otherwise require identification and detection of their characteristics (e.g., size, material, format, texture, etc.). Non-exhaustive, illustrative examples of non-standard objects include fruits, vegetables, plants, and the like.
Similar to
When the end effector identifies the object and the group of environments applicable to the object is identified, the list of interactions (which is downloaded to the embedded processor that operates the end effector) is narrowed to the particular list of possible operations or interactions with the object that are available in the exact environment in which the end effector is located. When the lists (e.g., all lists) of objects are identified, they (e.g., their libraries) are made ready to be downloaded to the embedded low-level processors that perform or control the actual operations of the end effector in accordance with the commands of the high-level main processor. In some embodiments, every end effector can have its own sensors and cameras, and can receive enough final data to perform the interactions with the objects.
In some embodiments, cameras (e.g., of each end effector) can be located or positioned at (but not limited to):
In some embodiments, cameras such as the camera located on the wrist of the robotic hand are constructed to be able to rotate by up to 360 degrees. This enables the positioning or repositioning of the cameras to achieve the required observation of the interaction areas.
Moreover, the system of cameras located on the hand and wrist enables or facilitates: (1) observing, at any time, the area of interaction between the object and the robotic apparatus or robotic end effector (or an object held in the robotic hand/end effector); and (2) identifying the points of control that indicate how successfully or unsuccessfully the interaction is proceeding, which helps to check and capture the successful final stage of the interaction.
The commands that make up and are configured to perform an interaction are executed in accordance with an algorithm of interaction. With reference to
Still with reference to step 6056, the algorithm of interaction is customized and/or specifically tailored for the specific configuration of the instance or type of environment 5002, workspace 5002w (and identified objects therein), and robotic assistant 5002r (including its parts, components, subsystems, and the like). That is, because each environment, workspace, and robotic apparatus can vary (e.g., in dimensions, arrangements, contents, etc.), a single algorithm of interaction preferably should not be used by every robotic assistant, and/or in every workspace or environment, to perform a given interaction. For instance, moving a cup on a kitchen counter top having a surface slicker than the counter top surface of another kitchen would require that different amounts of force be applied by the robotic hand onto the cup in order to slide the cup to the desired position. Thus, even slight differences in workspace, environment, robotic apparatus and other aspects of an interaction can generate less-than-optimal interaction results, for instance, results that do not perfectly (or substantially perfectly) match the expected results of that interaction. As one illustrative example, in some cases, cooking an ingredient over heat that is even slightly higher or lower than the expected or target heat could result in an entirely unusable (e.g., undercooked, burnt) ingredient.
As described above, the algorithm of interaction can be generated to achieve the goal or objective initially received and/or identified by the robotic assistant 5002r. For instance, the robotic assistant 5002r can be instructed by another system (e.g., client system, third party system, etc.) to execute a recipe, such as cooking a dish. Or, the robotic assistant can be triggered by internal logic (e.g., scheduler) to execute a recipe (e.g., cook a dish at 5 pm every Tuesday; cook a dish when low grocery count is identified in refrigerator and pantry). Having identified the recipe to be executed, the robotic assistant 5002r, at step 6056, can generate the algorithm of interaction, for instance, by customizing a generic recipe algorithm of interaction for the identified environment 5002, workspace 5002w, and robotic assistant 5002r (and its parts (e.g., joints, etc.)). It should be understood that, because the robotic assistant 5002r only downloads data definitions of identified objects and/or interactions capable of being executed in the environment 5002 and/or workspace 5002w, the robotic assistant can minimize its storage and processing burdens by not having to download, store, consider and/or process a vast amount of inapplicable or irrelevant objects and interactions.
An algorithm of interaction is made up of multiple commands, and each command is made up of a sequence of coordinates for each part (e.g., joint) of the robotic assistant 5002r. The coordinates can include three space coordinates (e.g., X, Y and Z axis coordinates) and a time coordinate. In other words, the command definitions indicate where each joint of the robotic assistant 5002r should be at a defined time (e.g., as shown in
As described above, the algorithm of interaction is designed to perform error-free (or substantially error-free) interactions. As stated, because interactions performed by the robotic assistant 5002r are sometimes not aided by supervision or human quality control, it is vital that the algorithm perform each command as precisely as possible. This precision can be achieved by training the robotic assistant to perform each command and/or interaction to perfection (or as perfectly as possible). For instance, when training the robotic assistant 5002r to perform an interaction such as pressing the “ON” button of an oven in the workspace 5002w, the robotic assistant can repeatedly perform (or attempt to perform) each of the commands necessary for that interaction until they are performed with sufficient accuracy. When each command of the interaction is successfully performed by the robotic assistant, the underlying instructions and/or parameters, such as the space coordinates (e.g., X, Y, Z axes) of each part of the robotic assistant at each time period, can be stored and/or used to program the robotic assistant 5002r accordingly. This training process can be referred to as “reinforced training” and/or “reinforced learning.”
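As a non-limiting illustration, the stored parameters of a command (space coordinates of each part at each time) could be represented as sketched below. The names `Waypoint`, `Command`, and `position_at`, the joint label `"wrist"`, and the use of linear interpolation between stored time points are assumptions made only for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Waypoint:
    """One stored sample: where a part should be (x, y, z) at time t."""
    t: float
    x: float
    y: float
    z: float


@dataclass
class Command:
    """A command as a per-joint sequence of time-stamped coordinates."""
    name: str
    # joint identifier -> waypoints sorted by strictly increasing time
    trajectories: Dict[str, List[Waypoint]]

    def position_at(self, joint: str, t: float) -> Tuple[float, float, float]:
        """Return the target position of `joint` at time `t`.

        Assumption: positions between stored samples are linearly
        interpolated, and times outside the stored range are clamped.
        """
        pts = self.trajectories[joint]
        if t <= pts[0].t:
            return (pts[0].x, pts[0].y, pts[0].z)
        if t >= pts[-1].t:
            return (pts[-1].x, pts[-1].y, pts[-1].z)
        for a, b in zip(pts, pts[1:]):
            if a.t <= t <= b.t:
                w = (t - a.t) / (b.t - a.t)
                return (a.x + w * (b.x - a.x),
                        a.y + w * (b.y - a.y),
                        a.z + w * (b.z - a.z))
        raise ValueError("waypoints must be sorted by time")


# Hypothetical command: the wrist moves over two seconds.
press_on = Command(
    name="press_on_button",
    trajectories={"wrist": [Waypoint(0.0, 0.0, 0.0, 0.0),
                            Waypoint(2.0, 0.2, 0.0, 0.4)]},
)
midpoint = press_on.position_at("wrist", 1.0)
```

In such a representation, a successfully trained command is simply the set of waypoint sequences recorded for each joint, which can later be replayed from the same starting position.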
Reinforced learning is a process of training the robotic assistant 5002r (or other systems or processors) to execute interactions without errors (or substantially without errors), through one or more iterations of a testing and learning plan. The testing and learning plan of the reinforced learning process can be performed until the reliability of the interactions is achieved and confirmed. Reinforced learning can be performed by the robotic assistant 5002r at any point prior to executing interactions. For example, the robotic assistant 5002r can be trained when it is manufactured, when aspects of the robotic assistant 5002r have been changed (e.g., system update), and/or when the robotic assistant 5002r is deployed to a new environment 5002 and/or workspace 5002w. Additionally or alternatively, the robotic assistant 5002r can continuously be trained during its lifecycle. It should be understood that, in some embodiments, training of the robotic assistant 5002r can be performed specifically using the robotic assistant 5002r. In other cases, training can be performed on a robotic assistant that is the same as the robotic assistant 5002r. In such cases, training results or instructions customized for the type of the robotic assistant 5002r can be transmitted to the robotic assistant 5002r for execution, such that the robotic assistant 5002r does not have to perform the training itself.
Reinforced training for a particular interaction is performed by executing a number of test interactions until a predetermined threshold number of successful cases or executions of that interaction have been achieved. It should be understood that the data that constitutes a successful case is fixed (e.g., known) and can be predetermined or preset prior to the interaction. For example, a successful case for an interaction of turning on the oven can be defined by the oven having its power state changed from the “OFF” position to the “ON” position. In cases where the training interaction is a grasping interaction for moving, the successful case can be defined by the coordinates of a position, orientation and the like to which the object should be moved. The training of the robotic assistant 5002r is therefore measured against these predetermined successful result criteria.
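By way of a hedged, non-limiting sketch, the repetition of test interactions until a predetermined threshold number of successful cases is reached could look as follows. The function name `train_interaction`, the `max_attempts` safety limit, and counting total (rather than consecutive) successes are assumptions of this sketch; the `attempt` callable stands in for executing one test interaction and checking it against the predetermined success criteria.

```python
from typing import Callable, Dict


def train_interaction(attempt: Callable[[], bool],
                      required_successes: int,
                      max_attempts: int) -> Dict[str, object]:
    """Repeat a test interaction until enough successful cases are recorded.

    `attempt` executes one test interaction and returns True when the
    predetermined success criterion is met (e.g., oven state is "ON").
    `max_attempts` is a hypothetical safety limit added for this sketch.
    """
    successes = 0
    attempts = 0
    while successes < required_successes and attempts < max_attempts:
        attempts += 1
        if attempt():
            successes += 1
    return {
        "attempts": attempts,
        "successes": successes,
        "trained": successes >= required_successes,
    }


# Simulated outcomes of five test interactions (illustrative data only).
outcomes = iter([False, True, True, False, True])
report = train_interaction(lambda: next(outcomes),
                           required_successes=3,
                           max_attempts=10)
```

In this simulated run the threshold of three successful cases is reached on the fifth attempt, at which point training for that interaction stops.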
Training the robot to achieve successful cases (e.g., executions) of the desired interaction can include first imaging the object to be interacted with by one or more of the cameras 5002r-4 of the robotic assistant. In some embodiments, this includes moving the object to a desired position relative to the cameras 5002r-4 and parts (e.g., end effectors) of the robotic assistant 5002r, where the object is imaged. This desired relative position is sometimes referred to as a “gauging point.” When the object and/or robotic assistant are positioned at the gauging point, the cameras 5002r-4 can be used to image the object from various angles. For example, the object can be imaged from the top, front, side and/or bottom to produce various frames of observation (e.g., 5 frames). Thus, the robotic assistant 5002r iteratively attempts to perform an interaction (or portion thereof) until the gauging point is consistently reached as measured by the cameras 5002r-4 and/or other sensors, thereby indicating that the robotic assistant has repeatedly successfully performed that aspect of the interaction.
For example, to train the robotic assistant to grasp an apple, the robotic assistant 5002r attempts to use its end effectors 5002r-1, 5002r-1n to grasp the apple, and move the apple to the gauging point. After each attempt, the robotic assistant 5002r uses its cameras to image the object and determine whether its position, orientation, etc. match those of the successful case. If the apple is deemed to be incorrectly placed, oriented, or otherwise handled, the robotic assistant 5002r modifies its parameters until the apple is moved to a position, orientation, etc. that match the successful case. In some cases, training the robotic assistant 5002r to interact with non-standard objects includes: scanning (e.g., imaging) the object; creating rules for the interaction, including based on the size and other characteristics of the object; and classifying the object to specific types of interactions based on the size of the object.
At step 6058 of
In turn, at step 8052, the embedded processor of the end effector 5002r-1c of the robotic assistant 5002r uses this information to initiate its interaction responsibilities, by positioning the end effector 5002r-1c and/or object to be interacted with at a preliminary position relative to one another. In some embodiments, the preliminary position includes the end effector being placed above the object to be manipulated, based on the imaging of the object performed during the object identification process described above.
At step 8054, the robotic assistant positions the end effector at position 0 (also referred to as the optimal standard position). Prior to positioning the end effector at position 0, the one or more processors (comprising the high-level processor/central processor and low-level processors) detect the environment and objects present in the environment. Therefore, initially, the one or more processors may receive environment data corresponding to a current environment, from one or more sensors configured in the robotic assistant system (also referred to as the robotic assistant). In some embodiments, each of the one or more sensors may be associated with a sensor collector configured in the robotic assistant system. The complete hierarchy/architecture of the robotic system is as shown in the
As shown in the
From the top down, high-level displacement commands in 3-dimensional space are converted into instructions for moving each particular kinematic chain (also referred as an end effector/one or more manipulation devices), each link in the manipulator, eventually turning into motor control signals (objective lenses, light sources, laser). Any subsystem is understood as a hardware-software unit, isolated mechanically, constructively, (optionally) electrically and logically (interface). Each subsystem, in turn, can also contain separate components (boards of connection for external interfaces, hardware acceleration modules, power supplies, etc.)
The purpose is to determine the minimum required set for each subsystem:
Further, the main components of the hierarchy/architecture are described below in more detail.
Actuators & Sensors Group
This subsystem comprising a group of actuators and sensors is a set of mechanical components of a single kinematic chain. It also includes the entire set of physical sensors built into the one or more manipulation devices (also referred as manipulator/s). The set of sensors, the number of links, as well as the electromechanical properties of the subsystem interface are always unique for a certain type of manipulator.
Since different types of manipulators can have different sets of sensors (and the sensors themselves can differ in their interfaces), a component that isolates the features (electrical, mechanical, logical) of the manipulator's parts may be necessary. The Sensors Collector abstracts the operation of higher-level subsystems from the specific sensors and control objects embedded into the manipulator. The Sensors Collector converts the Signals Bus to a “standard” stream for upstream event subsystems. In the opposite direction, it converts the command flow into the corresponding sequence of control signals on the buses of sensors, motors, drives, lamps, etc. The design of the Sensors Collector may always correspond to the types of the specific end effector and/or manipulator.
Kinematic Chain Processor System
The Kinematic Chain Processor System, i.e., a group of processors or the one or more processors, is designed to solve the task of controlling one kinematic chain. The Kinematic Chain Processor System provides the primary image processing from the manipulator cameras and recognizes vector objects within the geometric boundaries specified by the Central Processor (back-end CV). The Kinematic Chain Processor System identifies a vector object as a possible element of the ‘Local’ Workplace using the identifier database received from the Central Processor. Also, the Kinematic Chain Processor System synchronizes its ‘Local’ Workplace with the ‘Robot’ Workplace of its parent subsystem (Central Processor).
Central Processor
The Central Processor provides control of a group of kinematic circuits, either by executing high-level commands from the remote control system, or by autonomously executing a pre-loaded script. One or more Kinematic Chain Processor Systems can be connected to the Central Processor. The Central Processor provides the solution of the kinematic stability problem. The Central Processor solves the front-end tasks of computer vision by synchronizing the ‘Local’ Workplaces from the subordinate Kinematic Chain Processor Systems to its ‘Robot’ Workplace. Further, the Central Processor distributes low-level CV tasks on the connected Chain Controller, providing the necessary coverage of the ‘Robot’ Workplace. Also, the Central Processor includes human-machine interface elements such as recognition of voice commands, gestures, interfaces with user input/output devices (keyboard, touchscreen, etc.)
Standalone Robot
The stand-alone robot is a robot capable of moving that contains a number of manipulators, the corresponding number of kinematic chains (each with a Kinematic Chain Processor System), and one Central Processor System, all placed in a single construction. One variant has three kinematic chains.
The connections between the actuators and sensors group, sensors collector, kinematic chain processor system and the central processor system are as shown in the
Further, a remote control system is a subsystem for managing a group of Standalone robots combined through a local network or the Internet.
At each processing level of the hierarchy/architecture, a particular subsystem not only converts data, reducing the bandwidth requirements of data channels from the bottom to the top, but also, obviously, delays the propagation of information from sensors (or complex events based on a group of sensors) to the general processing path. For the aggregate of data paths in the proposed architecture, the growth of delays is characterized by a decrease in the volume of transmitted data on the way from the sensors to Central Processor System and Remote Control System. At the same time, strict real-time requirements may be put forward for the data streams terminating on the Central Processor, since the task of kinematic stability would be solved at the Central Processor level. A scheme representing connection between the bandwidth and latency in a hard real-time environment is shown in the
Further, data flows from the connected Kinematic Chain Processor Systems to the Central Processor level may be derived from data received from the Sensor Collector, which in turn is derived from physical sensors on the manipulators or end effectors. The delay from the physical sensors to the Central Processor can differ because of the different number of sensors served by the Sensor Collectors, or the different types of interfaces, each of which introduces its own delay. To facilitate associating commands with the feedback data stream, the information from the primary sensors receives timestamps, and the commands themselves are acknowledged. Each local clock in each of the subsystems mentioned as part of the hierarchy/architecture may be synchronized.
Using the architecture/hierarchy as explained above, the one or more processors may receive environment data corresponding to a current environment, from one or more sensors configured in the robotic assistant system via the sensors collector. In some embodiments, the environment data may include, but is not limited to, position data and image data of the current environment, which may be obtained from the navigation systems and one or more image capturing devices configured in the robotic assistant system. The one or more processors may further transmit the environment data to a remote storage associated with the universal robotic assistant systems, wherein the remote storage comprises a library of environment candidates. Thereafter, the one or more processors may receive a type of the current environment, determined based on the environment data from among the library of environment candidates. Further, the one or more processors may detect the one or more objects in the current environment, wherein the one or more objects are associated with the type of the current environment. The one or more objects may be detected based on at least one of the type of the current environment, the environment data corresponding to the current environment, and object data, from a plurality of objects belonging to the current environment. In some embodiments, the plurality of objects may be retrieved from a remote storage associated with the robotic assistant system. The one or more processors may detect the one or more objects by analysing features of the one or more objects, such as, but not limited to, shape, size, texture, colour, state, material and pose of the one or more objects.
Upon detecting the one or more objects and the current environment, the one or more processors may identify one or more interactions associated with each of the one or more objects based on interaction data that is retrieved from the remote storage. Further, the one or more interactions may be executed on the corresponding one or more objects based on the interaction data. However, execution of the one or more interactions requires that a sequence of motions be performed by or on the one or more objects from the optimal standard position, i.e., position 0. Therefore, initially, the one or more processors position the one or more manipulation devices within a proximity of the corresponding one or more objects, and then the optimal standard position of the one or more manipulation devices relative to the corresponding one or more objects is identified, wherein the optimal standard position is selected from the one or more standard positions of the one or more manipulation devices. Upon identifying the optimal standard position, the one or more manipulation devices may be positioned at the identified optimal standard position using one or more positioning techniques, after which the one or more interactions may be executed on/by the corresponding one or more objects using the one or more manipulation devices.
In some embodiments, position 0 is a starting or default position and orientation of the end effector 5002r-1c relative to the object to be interacted with. In some embodiments, position 0 can serve as a basis or reference point from which each command or motion performed during an interaction is measured. Position 0 is used to ensure that interactions being executed based on prior reliable trainings are performed without errors. Position 0 is the standard position and orientation of the object relative to the end effector at the beginning of the interaction and manipulation process. Moving to position 0 means moving to the distance and relative position between the end effector and the object at which the robotic apparatus has been trained, by the reinforced learning or any other learning processor or system, to have an error-free (or substantially error-free) interaction with the object, and at which it has passed through a cyclical testing plan to confirm the reliability of those interactions.
If the speed is the same, the position modification is the same, and the object of the interaction is the same, then the result of the interaction will be the same, and therefore error-free (or substantially error-free), provided that the standard position between the end effector and the object, from which the robotic apparatus is trained to start the interaction, has been achieved. In other words, a successful interaction between the end effector and object depends on achieving the adjustment to the standard position. In order to further facilitate or achieve an error-free (or substantially error-free), reliable and functional interaction between the end effector and object, a camera can be installed on the end effector to identify the position and orientation of the object and compare them with the expected ones (position 0).
As mentioned above, the end effector 5002r-1c (also referred to as the one or more manipulation devices) may be positioned at position 0 using one or more positioning techniques. The one or more positioning techniques comprise at least one of an object template matching technique and a marker-based technique, wherein the object template matching technique is used for standard objects and the marker-based technique is used for standard and non-standard objects. Positioning (e.g., to position 0) can be performed using a 3D object template of the object to be interacted with. An object template is a detailed data description or definition of the 3D object's shape, color, surface, material, and other characteristics known to those of skill in the art. In some embodiments, the object template is used to compare the images of the object obtained from the cameras (e.g., cameras embedded in or of the end effector), as positioned and oriented at step 8052, to the data definition of the object stored in the 3D object template, which indicates the expected characteristics of the object when successfully positioned at position 0. To explain in detail, the one or more processors may retrieve an object template of a target object from a remote storage associated with the universal robotic assistant systems, wherein the target object is an object currently being subjected to one or more interactions, wherein the object template comprises at least one of shape, colour, surface and material characteristics of the target object. Further, the one or more processors may position the one or more manipulation devices to a first position proximal to the target object.
Thereafter, the one or more processors may receive the one or more images, in real-time, of the target object from at least one of image capturing devices associated with the one or more manipulation devices, wherein the one or more images are captured by at least one of the image capturing devices when the one or more manipulation devices are at the first position. Further, the one or more processors may compare the object template of the target object with the one or more images of the target object. Further to comparison, the one or more processors may perform at least one of: adjusting position of the one or more manipulation devices towards the optimal standard position based on position of the one or more manipulation devices in previous iteration and reiterating steps of receiving and comparing, when the comparison results in mismatch; or inferring that the one or more manipulation devices reached the optimal standard position when the comparison results in a match.
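As a non-limiting illustration, the receive, compare, and adjust iteration described above can be sketched as a simple loop. The function `servo_to_position_zero`, the three callables it takes, the iteration limit, and the one-dimensional `ToyEndEffector` used to exercise it are all hypothetical names introduced for this sketch; an actual comparison against a 3D object template would replace the toy `matches_template` test.

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")


def servo_to_position_zero(capture: Callable[[], Image],
                           matches_template: Callable[[Image], bool],
                           adjust: Callable[[Image], None],
                           max_iterations: int = 50) -> bool:
    """Iterate: receive image, compare with template, adjust on mismatch.

    Returns True when a comparison results in a match (position 0 is
    inferred to be reached), or False if the hypothetical iteration
    limit is exhausted first.
    """
    for _ in range(max_iterations):
        image = capture()
        if matches_template(image):
            return True
        adjust(image)
    return False


class ToyEndEffector:
    """One-dimensional stand-in: the 'image' is the remaining offset."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def capture(self) -> float:
        return self.offset

    def adjust(self, observed_offset: float) -> None:
        # Move halfway toward the target, based on the previous observation.
        self.offset -= 0.5 * observed_offset


effector = ToyEndEffector(offset=5.0)
reached = servo_to_position_zero(
    capture=effector.capture,
    matches_template=lambda off: abs(off) < 0.1,  # toy match criterion
    adjust=effector.adjust,
)
```

Each adjustment in the toy example depends on the pose observed in the previous iteration, mirroring the description above; the loop terminates as soon as a comparison results in a match.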
In some embodiments, object templates cannot or are preferably not applied to non-standard objects that are modifiable (and/or that don't have a known shape, size, etc.). Modifiable objects include objects that can change their 3D shape after being interacted with (e.g., opened, squeezed, etc.). Other exceptional objects that cannot or are preferably not defined as object templates include glass or transparent objects, objects with reflective surfaces, objects of non-standardized shape and size, temporary objects, and the like. These objects, whose characteristics and features are not readily known or stored in an object template, can be interacted with (e.g., to be moved to position 0) using real-world, i.e., physical, markers and/or virtual markers, as described below in further detail. In some embodiments, the physical markers may be disposed on the target object, whereas the virtual markers correspond to one or more points on the target object, wherein the one or more markers enable computation of position parameters, comprising distance, orientation, angle, and slope, of the one or more manipulation devices with respect to the target object. In some embodiments, the target object may be an object currently being subjected to one or more interactions. In some embodiments, the one or more markers associated with the target object are physical markers when the target object is a standard object and the one or more markers associated with the target object are virtual markers when the target object is a non-standard object.
The marker-based technique for navigation to position 0 uses one or more markers placed on the object to be interacted with and/or moved to position 0. The marker can be made up of one or more 2D patterns, which are placed on the object. Because markers have a pattern that is known to the end effector 5002r-1c and/or the robotic assistant 5002r, the markers can more easily be detected, such that the estimation of the pose of the object can be more computationally efficient and accurate. That is, markers can be used to more reliably and accurately compute the orientation and distance between the object and end effector, and/or its cameras (which, in some embodiments, make up an embedded vision system (described in further detail below)). Markers placed on the one or more manipulating devices enable estimation of the pose and orientation of the one or more manipulating devices with respect to the General Purpose Vision system and the kitchen surface origin. This can be used for calibration and for checking the positioning accuracy, damage or run-out of the one or more manipulating devices. Markers placed on the one or more objects, by contrast, are used to compute the pose and orientation of the one or more manipulating devices with respect to the one or more objects, or vice versa.
As an example, the one or more markers may include, but are not limited to, Quick Response (QR) codes, Augmented Reality (AR) markers, Infrared (IR) markers, chessboard/checkerboard markers, geometry and color markers, and combinations thereof (e.g., a triangle marker, which consists of three other markers placed at the vertices of an equilateral triangle). Markers are used to compute, more reliably and accurately, the orientation and distance between the object and the dynamic image capturing devices embedded in the one or more manipulating devices, or the static camera that is part of the Global Scene Vision system. The types of markers that may be used depend on, for example, the scene, object type and structure, lighting conditions and the like. Different types of markers enable the computation of distance and orientation of the object with different use of computational resources.
Each type of marker has characteristics that are considered for its selection, which are known by the robotic assistant 5002r and can be considered during a manipulation. These characteristics can include: orientation/symmetry; maximum viewing angle (the maximum angle from which the marker can be detected); tolerance to variations of lighting; ability to encode values; detection accuracy (e.g., in pixels); built-in correction capabilities; computation resources, and the like. Based on these, optimal markers can be selected and used for particular objects and conditions of the environment 5002 and/or workspace 5002w. Each type of marker has different advantages. For example, some markers are oriented and some are not; detection of some markers is computationally more efficient than that of other markers; detection of some kinds of markers is more precise due to their tolerance of angle and lighting conditions. For example, AR markers are oriented, encode integer values, and their detection is computationally efficient. Chessboard markers, on the other hand, are often not oriented and require almost twice as much or more computational resources to be detected. However, chessboard markers can be localized with higher (subpixel) accuracy and have built-in error correction capabilities. Further, IR markers are a special kind of pattern, implemented with small reflective points that are placed at known points on the objects and are visible in infrared lighting. In conjunction with an infrared light source and camera, the reflective points may play the role of marker corners and be used for pose estimation.
To make sure that the system of markers or objects works flawlessly, remote identification technology such as Radio Frequency Identification (RFID) and/or Near Field Communication (NFC) may be implemented in combination with different types of visual markers. An integrated solution combining these two types of technologies increases the reliability of the system and of object identification in different environments.
More details about various types of markers and their particularities are provided below. Markers may vary and may be of different types and configurations to establish the distance to the object and the spatial orientation of the object. Basically, almost every geometric shape with at least 3 sharp corners and some pattern inside can be used as an explicit marker. A calibrated camera system is able to compute the distance to and pose of the marker placed on the object, using the known distances between corners of the markers. The pattern contained in the marker may be used to filter out false detections and to encode an integer identifier (Id) of the object, or a group Id of the object. In addition, the contained geometric pattern plays an important role in the object's design and is typically based on a company logo or symbols. Key features of high-quality marker detection technology include good design of the internal pattern, the information capacity of the internal pattern, and robust detection in various lighting conditions and poses.
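As a hedged illustration of how a known distance between marker corners yields the distance to the marker, the sketch below applies the standard pinhole-camera relation, distance = focal length (in pixels) × real size / observed size (in pixels). It assumes that the marker roughly faces the camera; recovering the full pose of a tilted marker requires more than this single relation. The function name and the focal length and marker size values are hypothetical.

```python
def marker_distance(focal_length_px: float,
                    marker_side_m: float,
                    observed_side_px: float) -> float:
    """Estimate camera-to-marker distance with the pinhole camera model.

    focal_length_px:  camera focal length in pixels (from calibration)
    marker_side_m:    known real distance between two marker corners
    observed_side_px: the same distance as measured in the image
    Assumption: the marker plane is roughly perpendicular to the optical axis.
    """
    if observed_side_px <= 0:
        raise ValueError("observed side length must be positive")
    return focal_length_px * marker_side_m / observed_side_px


# Hypothetical values: 900 px focal length, 5 cm marker side, 225 px in image.
distance_m = marker_distance(focal_length_px=900.0,
                             marker_side_m=0.05,
                             observed_side_px=225.0)
```

With these illustrative numbers the marker would be roughly 0.2 m from the camera; halving the observed side length would double the estimated distance.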
Using markers and marker detection during interactions, particularly interactions with dynamic objects, increases reliability because markers can be detected quickly, and thus pose estimation can be performed in a computationally efficient manner in real time, even with subpar hardware. In some embodiments, using markers during interactions, as described in detail herein, includes: detecting markers on the object in the images obtained from related cameras (e.g., overhead cameras of a general-purpose vision subsystem, described herein, which can be used for global scene monitoring); calculating real world coordinates of the markers at given time periods (e.g., every 10 milliseconds); estimating the trajectory (e.g., velocity and direction) of the object; calculating the expected position and pose of the object for a future moment m; moving the one or more manipulating devices (also referred to as the end effector) to the estimated position in advance; holding the end effector in the required position and pose for the required moment of time; and performing the actual interaction. Marker detection therefore enables the synchronization of positions and poses of the moving object and the end effector.
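The trajectory estimation and expected-position steps listed above can be sketched, in a non-limiting way, as follows. The sketch assumes the object moves with constant velocity over the prediction horizon, which is a simplification chosen for illustration; the function names and the sample data are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Sample = Tuple[float, Vec3]  # (timestamp in seconds, marker position in metres)


def estimate_velocity(samples: List[Sample]) -> Vec3:
    """Average velocity between the first and last marker observations."""
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("samples must be ordered by increasing time")
    return tuple((b - a) / dt for a, b in zip(p0, p1))


def predict_position(samples: List[Sample], t_future: float) -> Vec3:
    """Extrapolate the marker position to a future moment.

    Assumption: constant velocity between the last sample and t_future.
    """
    velocity = estimate_velocity(samples)
    t_last, p_last = samples[-1]
    horizon = t_future - t_last
    return tuple(p + v * horizon for p, v in zip(p_last, velocity))


# Marker coordinates sampled every 10 milliseconds (illustrative data only).
observations = [
    (0.00, (0.000, 0.0, 0.0)),
    (0.01, (0.001, 0.0, 0.0)),
    (0.02, (0.002, 0.0, 0.0)),
]
expected = predict_position(observations, t_future=0.10)
```

The end effector would then be moved to `expected` ahead of time and held there until the future moment arrives, so that its pose is synchronized with that of the moving object.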
As discussed above, one type of marker that can be provided on an object is a triangle marker. A triangle marker includes three detectable markers or patterns placed at the vertices of an equilateral triangle.
In some embodiments, triangle markers are disposed and/or applied to objects at areas where they are to be interacted with—e.g., grasped. For instance, as shown in
Adjusting the end effector (e.g., to position 0) can, in some embodiments, be performed as follows:
The one or more processors may move the one or more manipulation devices towards the triangle-shaped marker until at least one side of the triangle-shaped marker has a preferred length (or range of sizes, threshold size). As an example, the end effector may be moved or positioned toward the triangle-shaped marker, until at least one side of the triangle marker, as viewed through imaging captured by a camera of the end effector, measures for instance 225 pixels.
Further, the one or more processors may rotate the one or more manipulation devices until a bottom vertex of the triangle-shaped marker is disposed in a bottom position of the real-time image of the target object captured by the camera of the one or more manipulation devices.
Thereafter, the one or more processors may shift the one or more manipulation devices along an X-axis and/or Y-axis of the real-time image of the target object until a center of the triangle-shaped marker is positioned at the center of the real-time image of the target object captured by the camera of the one or more manipulation devices.
Finally, a slope of the angle of the camera relative to the triangle-shaped marker is adjusted (e.g., by moving the end effector and/or moving the object) until each angle of the triangle-shaped marker is at least one of equal to approximately 60 degrees or equal to a predetermined maximum difference between the angles that is smaller than their difference prior to initiating the adjustment of the position of the one or more manipulation devices. In some embodiments, achieving at least one of the two conditions mentioned above, indicates that the one or more manipulation devices have reached the optimal standard position.
These above-referenced steps can be iteratively performed until all angles of the triangle are equivalent to 60 degrees and all sides of the triangle are equal to a required or predetermined size, as viewed through the image captured by the camera of the end effector.
In turn, once the camera plane and the triangle are aligned—e.g., such that they are parallel to one another, and the triangle is on the optical axis of the camera, the projected triangle, as seen by the camera of the end effector is also equilateral. This means that all sides of the triangle are equal (or substantially equal) to each other and all angles are equal (or substantially equal) to 60 degrees.
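The stopping condition for the iterative adjustment can be sketched as follows. This is an illustrative check only: it assumes the three vertices have already been detected in pixel coordinates, and the tolerances and the function names are hypothetical. The 225-pixel target is the example value given above.

```python
import math


def triangle_side_lengths(a, b, c):
    """Lengths of the sides opposite vertices a, b and c (pixels)."""
    return (math.dist(b, c), math.dist(a, c), math.dist(a, b))


def triangle_angles(a, b, c):
    """Interior angles in degrees at vertices a, b and c (law of cosines)."""
    la, lb, lc = triangle_side_lengths(a, b, c)

    def angle(opposite, s1, s2):
        cos_v = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_v))))

    return (angle(la, lb, lc), angle(lb, la, lc), angle(lc, la, lb))


def is_at_position_zero(a, b, c, target_side_px=225.0,
                        side_tol_px=2.0, angle_tol_deg=1.0):
    """True when the imaged triangle is equilateral at the target size.

    The tolerances are assumptions made for this sketch.
    """
    sides_ok = all(abs(s - target_side_px) <= side_tol_px
                   for s in triangle_side_lengths(a, b, c))
    angles_ok = all(abs(x - 60.0) <= angle_tol_deg
                    for x in triangle_angles(a, b, c))
    return sides_ok and angles_ok
```

A control loop could call `is_at_position_zero` after each move, rotate, shift and tilt step and stop once it returns true.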
When the end effector identifies a slope between the planes of the camera and the triangle-shaped marker, one of the triangle's angles, as imaged by the camera, is seen as being larger than the other two angles. This angle disparity indicates that the vertex having the apparently larger angle is closer to or farther from the camera than the other two vertices. This can be seen in
In some embodiments, the vertex positions and movements of the triangle marker (e.g., during positioning) can be identified and/or calculated using a mathematical model that receives, as inputs, three axes X, Y and Z. X and Y are angles of the triangle (the third angle is therefore determinable based thereon), and the Z axis is a distance from the camera to the object. Distance is defined and/or calculated by the size of each of the triangle's sides. That is, triangles closer to the camera of the end effector are imaged with longer sides. Using these assumptions and information with the model, the end effector can be moved (e.g., forward, backward, left, right) as needed in order to make the sides equal in the imaging of the marker.
Depending on the slopes of the camera at various positions, altered as the camera is rotated to the right, left, front and/or back, the angles of the triangle-shaped marker, as visualized by the camera, are changed. The bigger the angle is or becomes on the image of the camera, the further it is or moves from the camera. Because the sum of all angles of a triangle always equals 180 degrees, the robotic assistant 5002r can calculate or determine which angles of the triangle are being observed by the camera, and the position in which the angles are positioned in relation to the camera. In this way, the end effector, depending on the images captured by its camera, can calculate or identify how the end effector should move (e.g., to which side, distance and inclination) in order to reach or achieve the position 0. When the imaged triangle-shaped marker is determined to match the angle of two of its vertices, it becomes possible for the robotic assistant to calculate the inclinations and lengths of the sides of the triangle, which thereby also makes it possible to calculate the distance from the camera to the triangle marker. As a result, the robotic assistant can, in turn, move in the opposite direction (e.g., in the direction of decreasing angle) to achieve the position 0 of the end effector.
In some embodiments, the robotic assistant can calculate the required shift or rotation of the end effector to position it at the target position, such as position 0, by using the visible and expected coordinates of a triangle marker. For the calculation, the robotic assistant considers two triangle markers, illustrated in
Using this information about the triangles ΔABC and ΔÁB́Ć, the end effector and/or robotic assistant can perform the affine transformation shown in
To this end, the robotic assistant can calculate the parameters of an affine transformation of the points of the triangle ΔABC of
$$\acute{\vec{x}} = \vec{r} + M\cdot\vec{x}, \qquad \vec{R} = \vec{O} - \acute{\vec{O}}$$
Therein, the center of triangle ΔABC can be represented as:
And the center of triangle ΔÁB́Ć can be represented as:
The centers of the triangles can be assumed to lie on the camera axis.
In turn, a matrix M is calculated as follows:
That is, M can be represented as a composition of rotation and stretching matrices, such that a pair of perpendicular vectors are formed that remain perpendicular after the transformation, as shown below:
$$\vec{F}_0\cdot\vec{F}_1 = 0, \qquad \vec{T}_0 = M\cdot\vec{F}_0, \qquad \vec{T}_1 = M\cdot\vec{F}_1, \qquad \vec{T}_0\cdot\vec{T}_1 = 0$$
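The relations above can be illustrated numerically. The sketch below makes two assumptions that are not stated in the text. First, M is taken as the 2×2 matrix that maps the centred vertices of the expected triangle onto the centred vertices of the visible triangle. Second, the perpendicular pair is found from the eigenvectors of MᵀM, which is one standard way to obtain vectors that stay perpendicular under M. The function names are hypothetical, and the stretch coefficients are taken here as the lengths of the images of unit vectors.

```python
import math


def centroid(tri):
    """Centre of a triangle given as three (x, y) points."""
    return (sum(p[0] for p in tri) / 3.0, sum(p[1] for p in tri) / 3.0)


def affine_matrix(src, dst):
    """2x2 matrix M with M*(src_i - O) = dst_i - O' for the centred vertices."""
    o, o2 = centroid(src), centroid(dst)
    # Two centred vertices determine M; the third follows because the
    # centred vertices of a triangle sum to zero.
    (x0, y0), (x1, y1) = [(p[0] - o[0], p[1] - o[1]) for p in src[:2]]
    (u0, v0), (u1, v1) = [(p[0] - o2[0], p[1] - o2[1]) for p in dst[:2]]
    det = x0 * y1 - x1 * y0
    if abs(det) < 1e-12:
        raise ValueError("degenerate source triangle")
    return (((u0 * y1 - u1 * y0) / det, (u1 * x0 - u0 * x1) / det),
            ((v0 * y1 - v1 * y0) / det, (v1 * x0 - v0 * x1) / det))


def apply(m, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def perpendicular_pair(m):
    """Unit vectors F0, F1 with F0.F1 = 0 and (M F0).(M F1) = 0."""
    (a, b), (c, d) = m
    g00, g01, g11 = a * a + c * c, a * b + c * d, b * b + d * d  # G = M^T M
    theta = 0.5 * math.atan2(2.0 * g01, g00 - g11)  # eigenvector angle of G
    f0 = (math.cos(theta), math.sin(theta))
    f1 = (-math.sin(theta), math.cos(theta))
    return f0, f1


def stretch_coefficients(m):
    """Lengths of T0 = M F0 and T1 = M F1 (assumed meaning of k0 and k1)."""
    f0, f1 = perpendicular_pair(m)
    return math.hypot(*apply(m, f0)), math.hypot(*apply(m, f1))
```

For a pure rotation combined with scaling by 2 along one axis, `stretch_coefficients` returns 2 and 1 regardless of the rotation angle, which matches the decomposition of M into rotation and stretching.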
The perpendicularity of $\vec{F}_0$ and $\vec{F}_1$ indicates that:
Thus, the perpendicularity of $\vec{T}_0$ and $\vec{T}_1$ indicates that:
In turn, after two vectors that remain perpendicular prior to and after the transformation have been identified, the robotic assistant and/or end effector identify or determine parameters of M, including (1) the rotation angle α and (2) the coefficients of stretching k0 and k1, as follows, for example:
For example, let
is the current distance between the camera and the triangle. β is the angle of rotation of the triangle relative to the axis $\vec{T}_j$, namely
In some embodiments, it may not be possible to calculate sin(β) from the initial data, for example, because there are two possible triangle positions that correspond to the same camera image and thus only the absolute value of sin(β) can be found, while its sign is unknown. Therefore, calculating sin(β) can be performed as follows. First, to decrease the angle β, the camera of the end effector must be moved along the axis $\vec{T}_i$, as shown in
The necessary movement of the camera in order to reach the target position of the camera of the end effector relative to the object (e.g., position 0), is calculated as follows:
Let $\vec{P}$ be the then-present position of the camera, directed along $\vec{I}$ and rotated relative to it by angle ω. Further calculations are made in the camera's relative coordinate system, as follows:
and τ indicates a coefficient of proportionality between the actual length of the object and its dimensions in the image from the camera.
Accordingly, as shown in
Because the camera movement is, in some embodiments, calculated based on its own relative coordinate system, it can be necessary to transfer the camera's relative coordinate system to an absolute coordinate system. To do so, matrix A must be calculated as shown below, keeping in mind that an orthogonal transformation can be represented as a composition of three rotations relative to the X and Z axes:
A=Az(ξ)·Ax(ψ)·Az(ω),
where ψ, ξ, ω are Euler angles (e.g., ψ is the precession angle; ξ is the nutation angle; and ω is the intrinsic rotation angle). Matrix A is therefore calculated as follows:
The Euler angles can be calculated from the following equations:
$$\Delta\vec{P}_{\text{absolute}} = A\cdot\Delta\vec{P}, \qquad \Delta\vec{I}_{\text{absolute}} = A\cdot\Delta\vec{I}, \qquad \Delta\omega = \alpha$$
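The composition A = Az(ξ)·Ax(ψ)·Az(ω) and its use for converting a relative displacement into absolute coordinates can be sketched as follows. The sketch assumes right-handed rotation matrices with angles in radians; the helper names are hypothetical.

```python
import math


def rot_z(angle):
    """Rotation about the Z axis (right-handed, radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Rotation about the X axis (right-handed, radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def mat_mul(p, q):
    """Product of two 3x3 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_zxz(xi, psi, omega):
    """A = Az(xi) * Ax(psi) * Az(omega), as in the expression above."""
    return mat_mul(mat_mul(rot_z(xi), rot_x(psi)), rot_z(omega))


def to_absolute(a, delta):
    """Apply A to a displacement expressed in the camera's relative frame."""
    return tuple(sum(a[i][j] * delta[j] for j in range(3)) for i in range(3))
```

For example, `to_absolute(euler_zxz(xi, psi, omega), delta_p)` yields the displacement of the camera in the absolute coordinate system for a relative displacement `delta_p`.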
In the camera or end effector movement calculations described herein (e.g., to move the camera or end effector), in some embodiments, it is assumed that the camera image has no aberration effects and that any movement of the triangle perpendicular to the camera axis does not change its size on the image. The movement calculation algorithm may not, in some cases, calculate the exact movement of the camera, but does so with an appropriate degree of accuracy. Nonetheless, the described algorithms decrease inaccuracies rapidly as the camera approaches the target or desired camera/end effector position, thus minimizing their impact on final positioning.
In some embodiments, a triangle marker can be created for use with an object by using the object's own contour points (or the contour of a portion of the object). To this end, a series of n points $\vec{A}_0, \vec{A}_1, \vec{A}_2, \ldots, \vec{A}_i, \ldots, \vec{A}_n$ is received from a sensor (e.g., camera) of the robotic assistant or end effector. This series of points defines the contour of an object, as imaged.
For example, one technique to calculate a triangular marker from an object's contour assumes that (1) the object is highly planar, and (2) the shape of the object is not round or elliptic. Because the contour of the object has several bends that are distinguishable from various points of view of imaging, these bends can be used as (or as a basis for) finding the vectors of polygon sides and calculating their lengths and the angles between consecutive sides, as illustrated for example by the following equation:
$$\vec{l}_i = \vec{A}_{i+1} - \vec{A}_i, \qquad l_i = |\vec{l}_i|, \qquad \alpha_i = \angle(\vec{l}_i, \vec{l}_{i+1})$$
The parameters of the preceding equation are illustrated in connection with
where α is a parameter that defines the curvature of bends that are sought to be identified when calculating a marker. As bends are found, points $\vec{B}_i$ for the triangle marker are constructed by intersecting sides $\vec{l}_{\text{first}}$ and $\vec{l}_{\text{last}+1}$ in the bend sequence, as shown in
As shown in
In some embodiments, it is preferable to obtain the image of the object, from which contour points are analyzed to create a marker, from a camera perspective in which the camera is looking straight down at the object, that is, with the image plane parallel to the surface on which the object is positioned. Once the camera is so positioned, it is possible to use the sequence of points of the contour to create or identify the triangular marker as described herein.
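The contour quantities used by this technique (side vectors, side lengths, angles between consecutive sides, and the intersection of two sides) can be sketched as follows. The sketch assumes a closed 2D contour given as a list of (x, y) points; the function names are hypothetical, and the bend-detection threshold itself is not shown because its expression is not reproduced above.

```python
import math


def side_vectors(contour):
    """l_i = A_(i+1) - A_i for a closed contour (the last point wraps to the first)."""
    n = len(contour)
    return [(contour[(i + 1) % n][0] - contour[i][0],
             contour[(i + 1) % n][1] - contour[i][1]) for i in range(n)]


def contour_side_lengths(sides):
    """Length of each side vector."""
    return [math.hypot(x, y) for x, y in sides]


def angles_between_sides(sides):
    """Unsigned angle (radians) between each side and the next one."""
    n = len(sides)
    out = []
    for i in range(n):
        (x0, y0), (x1, y1) = sides[i], sides[(i + 1) % n]
        out.append(abs(math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)))
    return out


def intersect_lines(p, d, q, e):
    """Intersection of lines p + s*d and q + t*e, or None if they are parallel."""
    det = d[0] * e[1] - d[1] * e[0]
    if abs(det) < 1e-12:
        return None
    s = ((q[0] - p[0]) * e[1] - (q[1] - p[1]) * e[0]) / det
    return (p[0] + s * d[0], p[1] + s * d[1])
```

A marker vertex for a bend could then be obtained by calling `intersect_lines` on the side entering the bend and the side leaving it.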
In some embodiments, a chessboard marker (also referred to as a chessboard-shaped marker or checkerboard marker) can be used as an alternative to other markers (e.g., as shown in
In some embodiments, chessboard marker-based positioning can be performed as follows, in connection with exemplary
The one or more processors may calibrate the image capturing device (e.g., camera) associated with the one or more manipulation devices using the chessboard marker, i.e., the camera of the end effector is calibrated. In some embodiments, the calibration may include, but is not limited to, estimating the focal length, principal point and distortion coefficients of the image capturing device with respect to the chessboard-shaped marker. In some embodiments, camera calibration may be performed only once.
The one or more processors may identify image coordinates of corners of square slots in the chessboard marker in real-time images of the target object, i.e., the internal corners of the chessboard marker are located in the captured image, which is analyzed to identify points of interest therein such as the white points shown in
Further, the one or more processors may assign real-world coordinates to each internal corner among the corners of the square slots in the real-time image based on the image co-ordinates. Assuming that origin of the real-world coordinate system is in the top-left internal corner of the chessboard marker, then the X and Y coordinates of the top right corner of the chessboard marker can be, for example: (6*cell_size_mm, 0); and the X and Y coordinates of the bottom left corner of the chessboard marker can be, for example: (0, 2*cell_size_mm). Accordingly, real world coordinates are assigned to each internal corner.
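The assignment of real-world coordinates can be sketched as follows. The grid size of 7 by 3 internal corners is inferred from the example coordinates above (top-right at 6 cells, bottom-left at 2 cells) and is an assumption; the function name is hypothetical.

```python
def chessboard_world_coordinates(cols, rows, cell_size_mm):
    """Real-world (X, Y) in mm for each internal corner.

    The origin is the top-left internal corner; X grows to the right and
    Y grows downward. Indexing is grid[row][col].
    """
    return [[(col * cell_size_mm, row * cell_size_mm) for col in range(cols)]
            for row in range(rows)]


# Example: 7 x 3 internal corners with 10 mm cells (assumed values).
grid = chessboard_world_coordinates(cols=7, rows=3, cell_size_mm=10)
top_right = grid[0][6]
bottom_left = grid[2][0]
```

Pairing each of these coordinates with the corresponding on-image corner gives the point correspondences needed for pose estimation.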
Based on the above, the end effector and/or robotic assistant can calculate or identify, among other things, the following information, which can be used with the chessboard marker to determine the position of the one or more manipulation devices and navigate the camera and/or the one or more manipulation devices:
Camera parameters (e.g., focal length, projection center, distortion coefficients);
On-image coordinates of internal corners (e.g., in pixels);
Real-world coordinates of internal corners (e.g., in mm).
In some embodiments, real-world coordinates and on-image coordinates can be calculated or converted from the other. In one example embodiment, this conversion can be expressed as:
In this exemplary expression,
(u,v): refers to on-image coordinates (e.g., computed after the tangential/radial distortion is eliminated);
fx, fy, cx, cy: refer to the camera focal lengths and projection center (principal point) coordinates;
R and T: refer to rotation and translation matrices that can be found by substituting known coordinates into the equations and solving them (e.g., using RANSAC).
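The expression itself is not reproduced in the text above. A form consistent with the symbols just listed is the standard pinhole projection model, shown here as an assumption rather than as the exact expression of the disclosure. The scale factor s and the real-world point (X, Y, Z) are symbols introduced for this sketch:

$$s\begin{bmatrix}u\\ v\\ 1\end{bmatrix} = \begin{bmatrix}f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}R_{11} & R_{12} & R_{13} & T_1\\ R_{21} & R_{22} & R_{23} & T_2\\ R_{31} & R_{32} & R_{33} & T_3\end{bmatrix}\begin{bmatrix}X\\ Y\\ Z\\ 1\end{bmatrix}$$

For a planar chessboard marker, Z = 0 can be assumed for every internal corner, so each corner contributes two equations in the unknown entries of R and T.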
Once R and T are known or identified, it is possible to know or identify the position of the camera with respect to the position of the marker, in which: (1) T1, T2 and T3 are X, Y and Z coordinates of the top-left chessboard corner; and (2) R11 . . . R33 are rotation components, as shown below:
where:
In some embodiments, the processes of calibrating, identifying, assigning and determining are repeated until the position of the one or more manipulation devices is equal to the optimal standard position.
In some embodiments, the same approach can be applied to positioning based on a triangle marker, because its vertices are also at fixed positions relative to each other and so can serve as the basis for the RT matrix calculation.
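Once R and T have been found, projecting a marker point into the image and recovering the camera position in the marker's coordinate system can be sketched as follows. The sketch assumes the standard pinhole model with distortion already removed, and uses the well-known relation that the camera centre in marker coordinates is −Rᵀ·T. The function names are hypothetical.

```python
def to_camera_frame(r, t, point):
    """Transform a marker-frame point into the camera frame: R*p + T."""
    return tuple(sum(r[i][j] * point[j] for j in range(3)) + t[i]
                 for i in range(3))


def project(fx, fy, cx, cy, r, t, point):
    """On-image coordinates (u, v) of a marker-frame point (pinhole model)."""
    x, y, z = to_camera_frame(r, t, point)
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def camera_position_in_marker_frame(r, t):
    """Camera centre expressed in marker coordinates: -R^T * T."""
    return tuple(-sum(r[j][i] * t[j] for j in range(3)) for i in range(3))
```

The difference between this camera position and the position associated with position 0 gives the shift that the end effector still has to perform.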
Still with reference to
In some embodiments, moving the end effector to position 0 at step 8054 can be performed for non-standard objects. In some embodiments, non-standard objects do not or may not have markers placed thereon. Thus, alternatively (or additionally), shape and visual feature analysis techniques are performed. Further, objects that may not be suitable for 3D template matching or marker-based positioning can be processed with intelligent visual feature analysis that produces coordinates for placing “virtual markers”. These virtual markers may then be used for navigating the one or more manipulation devices and/or the camera configured in the one or more manipulation devices to Position 0. This virtual-marker-based technique may be used for objects such as ingredients and other types of objects having no fixed shape and size. Different types of objects have different visual features suitable for detection and positioning, so multiple types of analyses can be used, depending on the type of the object.
In some embodiments, the virtual markers are placed on the target object using at least one of shape analysis technique, particle filtering technique and Convolutional Neural Network (CNN) technique, based on type of the target object.
An algorithm is developed and stored for performing feature analysis techniques on each non-standard object or object type. The algorithms are configured to, among other things, detect capturing points on the images using embedded cameras (e.g., as shown in
To this end, the robotic assistant system can obtain or download the feature (e.g., shape and visual features) analysis algorithm for the respective object, for example, from a library of environments (e.g., stored in the cloud computing system 5006). In turn, the algorithm can be executed on the object using the cameras of the end effector(s) and/or other sensors thereof. Points detected by the end effector using the algorithm can be treated like virtual markers, and used for positioning in the same or similar manner as described above in connection with real-world markers.
In some embodiments, objects such as non-standard objects that are not suitable for 3D template matching or marker-based positioning (e.g., ingredients, objects with non-fixed size and shape) can be processed or interacted with using intelligent feature (e.g., shape, visual) analysis algorithms and techniques that produce coordinates (e.g.,
It should be understood that virtual marker can be made up of multiple points corresponding to an object. The points of the virtual marker are measured to calculate their actual or relative characteristics or features, which can be relative to one another and/or to other systems or components such as a camera through which the virtual markers are imaged. For example, the features or characteristics of the points or coordinates of the virtual marker that can be measured or calculated can include their distance, orientation, angles, slope, and the like. Based on the measurements of the virtual marker features and characteristics, actual or relative characteristics of the object associated with the virtual marker can be calculated. For example, the measured characteristics of the virtual marker enable the detection of the distance, orientation, slope, and other geometric and position data thereof.
One example of performing visual or feature analysis of objects using virtual markers is shape analysis technique, as shown in
Another type of visual feature analysis using virtual markers can include the use of a combination of visual features, such as histograms of gradients, spatial color distributions, texture features, and the like, computed in the neighbourhood of special points. Each special point is considered as a candidate of virtual marker position. For each type of object, there are known or predetermined feature values computed for ideal virtual marker position—also referred to as “ideal values.” Thus, one approach includes identifying or finding the positions on the object that best match the “ideal values.” Initially, the one or more processors may retrieve one or more ideal values corresponding to ideal positions of the target object from the remote storage associated with the robotic assistant system. Further, the one or more processors may receive real-time images of the target object from at least one image capturing device associated with one or more manipulating devices. Thereafter, the one or more processors generate special points within boundaries of the target object using the real-time images. Upon generating the special points, the one or more processors determine an estimated value for combination of visual features in neighbourhood of each special point, wherein the visual features may include at least one of histograms of gradients, spatial colour distributions and texture features. Further, the one or more processors may compare each estimated value with each of the one or more ideal values to identify respective proximal match. Finally, the one or more processors may place the virtual markers at each position on the target object corresponding to each proximal match.
Because in some embodiments marker patterns include a triangle, there can be at least three ideal values and three respective best matching positions, forming a triangle virtual marker.
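The matching of candidate special points against the ideal values can be sketched as follows. The sketch assumes that each special point already has a numeric feature vector and that Euclidean distance is an acceptable similarity measure; both are assumptions, and the function name is hypothetical.

```python
import math


def place_virtual_markers(candidates, ideal_values):
    """Return, for each ideal value, the position of the closest candidate.

    candidates:   list of (position, feature_vector) pairs for special points
    ideal_values: list of feature vectors for the ideal marker positions
    """
    markers = []
    for ideal in ideal_values:
        position, _ = min(candidates, key=lambda c: math.dist(c[1], ideal))
        markers.append(position)
    return markers
```

With three ideal values, the three returned positions form the triangle virtual marker. This simple version does not prevent the same special point from being selected for two ideal values, which a practical system would likely need to handle.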
Another type of visual feature analysis using virtual markers can include using a CNN technique to detect virtual markers on arbitrary or non-standard objects. Because objects can be very different, separate models can be trained for each type of object. At the execution stage, the one or more processors may download a CNN model corresponding to the target object from libraries stored in the remote storage associated with the robotic assistant system. Upon downloading the model, the one or more processors may detect positions on the target object for placing the virtual markers based on the CNN model.
All three methods provide positions that can be treated like virtual markers and used for positioning as if there were real markers on the object.
It should be understood that, in some embodiments, moving the end effector to position 0 is a fundamental task of interaction between the kinematic chains of end effectors and the object. If the object is standard and the environment is standard, then all the processes of interaction can be standard. In this regard, classifications of the environments in which the object is located are identified and stored such that standardization of the environment can be achieved. For example, a blender may be placed on a stand or directly on the table, and the operational finger of the robotic hand (end effector) may need to be placed on the blender's operation button; each of these cases requires a different grasp. In each case, the robot is trained by reinforcement learning to perform the interaction until it grasps the object correctly. As soon as the robotic end effector grasps it correctly, the movement can be recorded as a standard interaction for that particular case.
In some embodiments, the machine learning algorithm is a hypothesis set which is taken before considering the training data and which is used for finding the optimal model. Machine learning algorithms have 3 broad categories as illustrated in
Learning can be performed by a training system that starts by placing the object in the manipulator; the robotic apparatus is then programmed to bring the object to the camera system at a certain point, which can be referred to as the “gauging point”. In some embodiments, it is necessary to bring the object to the gauging point and capture its initial shape from all angles. Next, the robotic assistant system is taught how the particular object should be grasped or operated. The robotic assistant system allows the robotic apparatus to change its grasping algorithms. Each time the robotic assistant system changes the algorithm, it brings the object to the gauging point and checks whether the shape matches, which indicates whether the robotic assistant system has learned to take/grasp the object correctly. Such a robotic assistant system can therefore learn by itself and automatically. Using the gauging point parameter, the reinforcement learning system can repeatedly grasp one object over a long period of time, trying to grasp it correctly from the same initial Position 0 while making various modifications of its movements. In every case, it checks the gauging point to determine whether the attempt was successful.
Further, the non-standard objects are classified by the type of interaction. When the robotic assistant system grasps the target object in order to move it, creation of the successful case utilizes, in some embodiments, the following process: scanning the object; creating the rules (from and to sizes); classifying the object into several sizes depending on the correct grasping process; and establishing joint movements in a training mode (e.g., grasping the object in different ways to understand the mechanics of grasping and to identify exactly where the fingers of the end effector should be located). Accordingly, this process helps to identify which object sizes allow the object to be grasped in one way or another. However, some objects may be too large to grasp with one end effector, meaning that the grasping should be performed by two end effectors in order not to damage the target object. The consistency of the object should preferably be known, as it influences the compression force applied by the end effector when interacting with the object.
Also, upon completion of the reinforcement learning, the standard object may become a known object. A standard object that the robotic assistant system has not learnt to interact with or operate is termed an unknown object. Upon completion of a cycle of the learning process, the standard unknown object becomes a standard known object. The cycle of learning starts with the movement of a human toward the object, in which the human shows the machine how the interaction should be performed with the help of motion capture, vision, vision motion capture, glove motion capture and different types of sensors on the glove.
Moreover, like position 0, the final orientation position of the object is also needed for reinforcement learning. To start a manipulation or interaction with an object, the robotic end effectors should be in the correct position to be able to observe the object at the correct angles. From this final orientation position, the end effectors or manipulators are rotated through certain angles, changing the speed and compression of the fingers to learn how to grasp an object of the particular size. This can be performed, for example, by manual training with the help of a human trainer, in which the form of the object is read from Position 0 and the system then grasps the object exactly as it was taught. However, one obstacle is that the object can differ slightly (e.g., because the object is not identical to the object used for the learning process). In such cases, learning starts from this moment: the parameters of the capture need to be changed, the parameters should be verified within a certain range, and the object should be brought to the gauging point. Finally, if the object is grasped, the case is successful; if the object is not grasped, the case is not successful. By searching through these parameters, the system finds the successful case for each object and the correct algorithm for grasping, moving, operating or manipulating it.
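The parameter search described above can be sketched as follows. This is a plain exhaustive search over a small grid of grasp parameters and is not a reinforcement learning algorithm; it only illustrates the try, check at the gauging point, and record loop. The parameter names and the `attempt_grasp` callback are hypothetical, and the callback stands in for executing the grasp and verifying the result at the gauging point.

```python
from itertools import product


def find_successful_grasp(parameter_ranges, attempt_grasp):
    """Try parameter combinations until one produces a successful grasp.

    parameter_ranges: dict mapping a parameter name to the values to try
    attempt_grasp:    callable taking a dict of parameters and returning
                      True when the gauging-point check passes
    Returns the first successful parameter set, or None if none succeeds.
    """
    names = sorted(parameter_ranges)
    for values in product(*(parameter_ranges[name] for name in names)):
        params = dict(zip(names, values))
        if attempt_grasp(params):
            return params  # this combination is recorded as the successful case
    return None
```

A learning-based system would replace the fixed grid with a policy that proposes new parameters based on previous outcomes, but the success check at the gauging point plays the same role.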
As illustrated in exemplary
The robotic assistant system itself performs the motion planning of the whole kinematic chain of the end effector to achieve the required positions on the object. Here the number of joints is important (2, 3 or 4 fingers, parallel gripper, 3-axis grippers, robotic hands), and the positions of the joints on the object should correspond to the type of the gripper. In practice, the robotic apparatus programs the joint positions on the object depending on the type of the gripper and the number N of interactions. To place the joints on the correct areas of the object while avoiding collisions with the object, the processor generates the motion plan for the kinematic chain.
Still with reference to
Each manipulation or interaction consists of a fixed sequence of motions that leads to the desired result (e.g., the success point), particularly when the initial position of the manipulator relative to the object is the same as it was during training (“position 0”). Since an object can be reached from various directions, there are several variations of the same manipulation that are trained and recorded or stored for different directions. Each variation has its own corresponding position 0. At the execution stage, the central control unit analyses the workspace model and decides which direction is optimal for the object to be manipulated, and based thereon selects the appropriate position 0. Information about the particular object type can be downloaded from the Library of Environments. In turn, the selected position 0, manipulation ID and approximate coordinates of the object, among other data, are transmitted to the universal end effector (manipulator).
Based thereon, the interaction can be performed as follows:
In some embodiments, the embedded vision subsystem assists the end effector or one or more manipulation devices on steps 2 and/or 4 above, as explained above in connection with step 8054 (moving to position 0) and below in connection with step 8060 (validating interaction results).
In turn, at step 8060, once the interaction (e.g., sequence of motions) has been performed, its results are validated. In some embodiments, validation for interactions with standard and non-standard objects can be performed in the same manner. For example, validation can be performed by analysing images obtained from various embedded visual sensors. Such analyses can be performed using convolutional neural networks or the like, trained on examples of successful and failed manipulations. For instance, if the interaction or manipulation is to “turn the blender on” and the expected result (e.g., success point) is “lightbulb lit up,” the convolutional neural network is trained on images of the blender with the lightbulb on and off, together with a corresponding ground truth. During execution of the interaction, the central control unit can download the relevant neural network from the library of environments and apply it to the image obtained from the embedded camera, thus determining whether the success point was achieved (e.g., lightbulb is on). Images of failed and successful manipulations can be continuously obtained and transmitted to a central system (e.g., the central use cases laboratory) for storage, analysis and training. Training neural networks on examples of failed and successful manipulations enables distinctive features to be found automatically for each particular object type and manipulation.
Data from each of the one or more sensors and the success point coordinates are recorded along with the movements of all fingers of the human hand and transmitted to the robotic assistant system, so that the result of the operations performed on the one or more objects serves as the expected success factor. To achieve the success factor, a reinforcement learning system in the robotic assistant system searches for and chooses among different combinations of end-effector movements toward the object, based on raw data from the operator, in order to optimize these movements for the particular model of end effector that is used (depending on the joint positions of the end effector and its number of degrees of freedom). A success point can be the result of the interaction, as well as the position of a finger on the object, depending on the type of the gripper. Success points are used to initiate reinforcement learning, which leads the machine to learn how to achieve the success points in 100 percent of cases.
Further, steps 8056, 8058, and 8060 are iterated until it is determined at step 8056 that no interactions remain to be performed according to the recipe algorithm. At that point, the process ends at step 8062, indicating that the recipe has been completed.
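As a purely illustrative aid, the iteration over steps 8056 to 8062 can be sketched as follows. The sketch assumes that step 8058 is the step at which the interaction is executed, and the names `Interaction` and `run_recipe` are hypothetical; the validation callable stands in for the trained neural network and is not the network itself.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical representation of one interaction: a name, a callable that
# performs the motions, and a callable that validates the success point.
Interaction = Tuple[str, Callable[[], None], Callable[[], bool]]


def run_recipe(interactions: Iterable[Interaction]) -> List[Tuple[str, bool]]:
    """Iterate the perform/validate cycle until no interactions remain."""
    results = []
    for name, perform, validate in interactions:  # step 8056: any interaction left?
        perform()                                 # step 8058 (assumed): execute motions
        results.append((name, validate()))        # step 8060: validate success point
    return results                                # step 8062: recipe completed


# Usage: a simulated blender whose lamp state stands in for the camera image.
blender = {"lamp_on": False}
outcome = run_recipe([
    ("turn the blender on",
     lambda: blender.update(lamp_on=True),
     lambda: blender["lamp_on"]),
])
```

In a real system the validation callable would apply the downloaded classifier to an embedded camera image; here it only reads a simulated flag.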
In some embodiments, the robotic assisted environment 5002 and/or robotic assisted workspace 5002w can include a drawer system that includes drawers for robotic operation.
Further, the one or more sensors configured in the storage unit provide corresponding sensor data associated with position and orientation of each of the one or more objects to at least one of the one or more embedded processors. As an example, the one or more sensors may include, but are not limited to, a temperature sensor 9116, a humidity sensor 9117, a position sensor 9107, an image sensor 9102, an ultrasound sensor, a laser measurement sensor and a SOund Navigation And Ranging (SONAR) sensor.
Further, the one or more light sources may include, but are not limited to, Light Emitting Diodes (LEDs) 9101 and light bulbs. The one or more light sources are configured to assist with illumination in the storage unit, such that the one or more image capturing devices may capture clear images of the one or more objects stored in the storage unit. Based on the one or more images, a user may view what is present inside the storage unit without opening the storage unit or the doors of the storage unit.
Further, the one or more embedded processors configured in the storage unit interact with a central processor of the robotic assistant system through a communication network. In some embodiments, the communication network may be a wired network, a wireless network or a combination of both wired and wireless networks. In some embodiments, the central processor may be remotely located or may be configured within the environment in which the storage unit is configured. The one or more embedded processors 9109 may detect each of the one or more objects stored in the storage unit based on the one or more images and the sensor data. Further, the one or more embedded processors may be configured to transmit the one or more images and the sensor data to the central processor 9104. In some embodiments, the one or more embedded processors 9109 may transmit the sensor data and the one or more images in real-time or periodically.
In some embodiments, detecting each of the one or more objects may mean identifying and locating the one or more objects. Further, detecting each of the one or more objects may include detecting presence/absence of the one or more objects, estimating content stored in the one or more objects, detecting position and orientation of each of the one or more objects, reading at least one of visual markers and radio type markers attached to each of the one or more objects, and reading object Identifiers. In some embodiments, the presence/absence of the one or more objects may be detected based on a CNN or any other machine learning classifier such as decision trees, Support Vector Machine (SVM) and the like, with which the one or more embedded processors 9109 and the central processor 9104 are trained. In some embodiments, the one or more embedded processors 9109 and the central processor 9104 may be trained with example images of a full storage unit, an empty storage unit and the like. Further, using the same CNN or machine learning techniques, the one or more embedded processors 9109 or the central processor 9104 may estimate content stored in each of the one or more objects. Further, the position and orientation of each of the one or more objects may be detected using a region-based CNN trained on example images. Optionally, poses of the one or more objects may be refined using a marker-based detection technique. Furthermore, the one or more embedded processors 9109 may read the visual markers attached to the objects to obtain the defined object ID and the object position/orientation. In some embodiments, the one or more embedded processors 9109 may detect radio type markers attached to each object to obtain the defined object ID and the approximate object position/orientation. A combination of the aforementioned techniques may increase reliability of the electronic inventory system by confirming and double checking the result of one method against the others.
In some embodiments, each of the one or more detected objects are added to an electronic inventory associated with the electronic inventory system, which is in turn shared with the central processor so that the one or more objects can be easily found during execution of recipes or during pre-check.
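The cross-checking of detection techniques and the subsequent update of the electronic inventory could, for example, be sketched as below. This is only an assumption-laden illustration: the `Detection` record, the source labels, the `min_votes` threshold and the simple agreement rule are hypothetical and are not taken from the disclosure.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Detection:
    source: str      # e.g. "cnn", "visual_marker", "radio_marker" (illustrative labels)
    object_id: str   # object identifier reported by that source


def confirm_object(detections: List[Detection], min_votes: int = 2) -> Optional[str]:
    """Return the object ID reported by at least `min_votes` distinct sources."""
    votes = Counter()
    seen = set()
    for d in detections:
        key = (d.source, d.object_id)
        if key not in seen:          # each source counts once per object ID
            seen.add(key)
            votes[d.object_id] += 1
    if not votes:
        return None
    object_id, count = votes.most_common(1)[0]
    return object_id if count >= min_votes else None


def update_inventory(inventory: Dict[str, str], unit_id: str,
                     detections: List[Detection]) -> None:
    """Record a confirmed object as stored in the given storage unit."""
    object_id = confirm_object(detections)
    if object_id is not None:
        inventory[object_id] = unit_id
```

Requiring agreement between at least two independent sources is one simple way to express "confirming and double checking"; a deployed system might instead weight sources by their reliability.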
In some embodiments, the CNN used for object detection inside the storage unit is different from the network used for the general purpose camera system. It is adapted to the (fish-eye) camera and the lighting conditions specific to the storage unit.
Further, in some embodiments, the storage unit may be furnished with a specialized electronic system that allows performing the one or more actions on doors of the storage unit automatically, using servomotors and guiding systems. As an example, the one or more actions may be closing and opening operations. Further, the storage unit may be configured with electronically controlled locks for locking and unlocking the doors of the storage unit, which prevent unauthorized closing or opening of the drawer or enable the user to program access to only permitted ingredients at permitted hours and times of the day.
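A minimal sketch of how programmed access at permitted hours might be checked before an electronically controlled lock is released is given below. The `Schedule` structure and the function `may_unlock` are hypothetical, and the sketch assumes access windows that do not cross midnight.

```python
from datetime import time
from typing import Dict, List, Tuple

# Hypothetical schedule: storage-unit ID -> list of (start, end) access windows.
Schedule = Dict[str, List[Tuple[time, time]]]


def may_unlock(schedule: Schedule, unit_id: str, now: time) -> bool:
    """Return True if `now` lies inside a permitted window for the unit.

    Units without an entry in the schedule stay locked. Windows are
    half-open [start, end) and are assumed not to cross midnight.
    """
    return any(start <= now < end for start, end in schedule.get(unit_id, []))


# Usage: a drawer that may only be opened around lunch time.
lunch_only = {"snack-drawer": [(time(12, 0), time(13, 0))]}
allowed = may_unlock(lunch_only, "snack-drawer", time(12, 30))
```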
In some embodiments, each storage unit in the electronic inventory system is controlled by the central processor 9104 remotely. The central processor 9104 receives real-time data from the one or more embedded processors 9109 configured in the storage unit to locally control the storage unit. Each of the one or more embedded processors 9109 may be integrated in a peer-to-peer network, to which the central processor 9104 is also connected. The central processor 9104 communicates with the one or more embedded processors 9109 via wired or wireless protocols. Each processor system manages at least one box. It is located in the immediate environment of the box. To reduce the number of cables that are carried out inside the furniture or any applicable storage structures, the one or more embedded processors 9109 use a power line and a data line. As an example, one Cat5e cable 9106 from the nearest PoE ethernet switch 9105 may be connected to each embedded system. A group of boxes localized in one place may be controlled by a “local” switch with PoE capability, which in turn may be connected to a higher one, etc., forming a data transmission network of the complex. An example of connecting a group of boxes is shown in
The electronic inventory system (also referred to as a digital inventory system) is a system that allows structuring data on the inventory of various storage units (including boxes, drawer-boxes, open storage shelves, closed storage shelves, containers, safes, etc.) for storing any kind of objects, and includes the following set of modules:
Complementary Metal-Oxide-Semiconductor (CMOS) or Charge-Coupled Device (CCD), combined with the lens. The one or more image capturing devices are connected to the extension board 9103 by a cable that provides power and control.
A computer controlled kitchen, such as those examples illustrated in
In some embodiments, storage is designed or configured for optimal use by the robotic apparatus (e.g., rather than for use by humans). The storage can have different ways to access it, e.g., from the side of the cooking volume to be accessed by the robot and from the front side of the kitchen to be accessed by human.
As shown in
The cover door of the dishwasher can be opened as traditional kitchen door in an external direction or by sliding it (e.g., up, down, sideways) using an electronic mechanism. The cover door of the dishwasher is furnished specifically and will be adjacent to the storage dishwasher block to prevent the water leaks.
In some embodiments, the robotic assistant system may be configured to interact with touchscreens and other similar technologies such as trackpads, touch surfaces and the like. The touchscreens and similar touch surfaces may refer to interfaces that allow the robotic assistant system to interact with a computer through touch or contact operations performed thereon. These interfaces can be display screens that, in addition to being configured to receive inputs via touches or contacts, can also display or output information as explained under the display screen of the storage unit. In some embodiments, touchscreens or touch surfaces can be capacitive, meaning that they rely on electrical properties to detect a contact or touch thereon. Therefore, to detect a contact, a capacitive touchscreen or surface recognizes or senses voltage changes (e.g., drops) occurring at areas (e.g., coordinates) thereon. A computing device and/or processor recognizes the contact with the touchscreen or touch surface and can execute an appropriate corresponding action. Further, a touchscreen may have the ability to detect a touch within the given display area. The touchscreen may be made up of three basic elements, i.e., a sensor, a controller and a software driver. Each variant of the touchscreen technology carries its own distinctive characteristics, with individual benefits. As an example, different variants of the touchscreen may include, but are not limited to, a resistive touchscreen, a capacitive touchscreen, a Surface Acoustic Wave (SAW) touchscreen, an infrared touchscreen, an optical imaging touchscreen and an acoustic pulse recognition touchscreen. Further, the electrical charge can be obtained from a motor electrical terminal, battery or other power source included in the robotic system and/or end effector.
Portions of the end effectors can be made of a material, or covered in a material or paint, that has conductive properties that enable the electrical charge to pass from its source, through the end effector and its capacitive portions (e.g., fingertips), onto the touchscreen or surface.
A main processor of a higher level (also referred to as the central processor) may send commands to the processors of lower levels (also referred to as the one or more embedded processors in the kinematic chain). Movements of joints of the end effector are performed precisely by the low-level embedded processor. The final command for interaction is thus generated by the central processor, and direct execution of the commands is performed by the one or more embedded processors. The number of low-level processors may vary and is not limited to an exact number. The direct operation and algorithm execution by the end effector is performed by the embedded processors and kinematic chains. The kinematic chains are operated by local driver units, which can be switched, if needed, to the driver unit of the main processor. Memories of the embedded processors and the main processor can be mapped between each other as well. Detailed exemplary system diagrams illustrating the processors, memories, and other hardware, together with a detailed explanation, are provided herewith. The one or more embedded processors may update the workspace model in real-time with information about the status of the currently executed command and all caused changes. The information may include, but is not limited to, states, poses (positions and orientations) and velocities of the one or more objects visible to the embedded camera or reached by sensors, success check results of current and previous manipulations, and safety check results (presence of unexpected objects, smoke and fire detection, etc.). The updated workspace model is shared with the central processor via memory mapping or another mechanism.
Further, in the illustrated architecture, the robotized complex is viewed as a collection of subsystems hierarchically interconnected. At each level of the hierarchy, the corresponding subsystem processes the input stream from the underlying subsystems and transfers the results of processing to the higher subsystem. Thus, a high-level pipeline is formed, at each stage of which the data is transformed, ideally with some reduction of the flow. From the point of view of the data paths on the diagram, one way is shown—from the manipulator sensors to some control center of a group of robots. The schema can be considered as a two-way data pipeline. Further, from bottom to top, from the sensors of the physical environment such as video stream, temperature, humidity, accelerometers, feedback values for the forces of electric motors, etc., the data may be converted into vector object descriptors, trajectories and current manipulator positions. The one or more objects may be identified as participants in some Workplace: ‘Local’, ‘Robot’, ‘Global’. Also, there exists a reduction of pixel information into a vector, vectors into objects, and objects into workplace elements and the like.
Further, in addition to the shared commands, the central processor and the one or more embedded processors may share, among other things workspace models, which may contain information about positions, sizes and types of each of the one or more objects, surface materials, gravity directions, object weights, directions, velocities, expected positions and the like. This may enable the one or more embedded processors to plan manipulations with the current object in accordance with the positions and sizes of neighboring objects (e.g., to avoid collisions), gravity direction, weight, surface characteristics and the like.
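To make the idea of a shared workspace model more concrete, the following sketch shows one possible in-memory representation and a simple neighbour-clearance query that an embedded processor might use when planning a manipulation. All class names, field names, units and the bounding-sphere simplification are assumptions introduced for illustration only.

```python
from dataclasses import dataclass, field
from math import dist
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class WorkspaceObject:
    object_type: str
    position: Vec3            # metres in an assumed workspace frame
    radius: float             # bounding-sphere radius standing in for "size"
    weight_kg: float = 0.0
    surface_material: str = "unknown"


@dataclass
class WorkspaceModel:
    gravity_direction: Vec3 = (0.0, 0.0, -1.0)
    objects: Dict[str, WorkspaceObject] = field(default_factory=dict)

    def neighbours_too_close(self, target_id: str, clearance: float) -> List[str]:
        """List objects whose surface gap to the target is below `clearance`."""
        target = self.objects[target_id]
        return [
            oid for oid, obj in self.objects.items()
            if oid != target_id
            and dist(obj.position, target.position) - obj.radius - target.radius < clearance
        ]


# Usage: check which neighbours constrain a grasp of the cup.
model = WorkspaceModel()
model.objects["cup"] = WorkspaceObject("cup", (0.0, 0.0, 0.0), 0.05)
model.objects["pan"] = WorkspaceObject("pan", (0.2, 0.0, 0.0), 0.10)
blocking = model.neighbours_too_close("cup", clearance=0.10)
```

A bounding sphere is a deliberately coarse stand-in for object size; the disclosure does not specify how sizes are represented.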
All of this information of the workspace models (e.g., gravity, object's weight, surface and the like) can be used for reinforcement learning, which is a part of the training stage, in which optimal robotic arm and end-effector movements are learned for any kind of objects, interactions and initial conditions. In this way, the embedded processor can choose and execute the optimal kinematic chains and end-effector movements, based on the current workspace model. The central processor may maintain an up-to-date workspace model using a central camera system or by collecting the information from embedded cameras.
Further, the
Further, fullform of all the abbreviations used in the
The first coupling member 9303a may include a first connection surface 9303b, for connecting the first coupling member 9303a with the robotic system. The first connection surface 9303b may be configured with a first attachment means [not shown in Figures] for connecting the first coupling member 9303a with the robotic system. The first attachment means may ensure that, the robotic system upon coupling with the one or more objects 9302 via the coupling device 9300 is capable of holding and manipulating the one or more objects 9302 efficiently. The first coupling member 9303a also includes a first mating surface, having a plurality of first projections defined on its periphery. The plurality of first projections are configured to engage with the second coupling member 9304a, for coupling the first coupling member 9303a with the second coupling member 9304a.
Further, the second coupling member 9304a may include a second connection surface 9304b, for connecting the second coupling member 9304a with the one or more objects 9302 [as shown in
In an embodiment, the plurality of first projections may be shaped such that, the periphery of the first mating surface is machined to form a crest and trough profile. In another embodiment, the plurality of first projections may be shaped such that, the periphery of the first mating surface is machined to form at least one of a convex shape or a concave shape. In another embodiment, the plurality of first projections may be shaped to form a wavy profile or undulations along the periphery.
In an embodiment, the plurality of second projections 9304d may be shaped such that, the periphery of the second mating surface 9304c is machined to form a crest and trough profile corresponding to the configuration of the plurality of first projections. In other words, the plurality of second projections 9304d include a trough region in the corresponding crest region in the plurality of first projections and vice versa, to facilitate engagement. In another embodiment, the plurality of second projections 9304d may be shaped such that, the periphery of the second mating surface 9304c is machined to form at least one of a convex shape or a concave shape, corresponding to the configuration of the plurality of first projections. In another embodiment, the plurality of second projections 9304d may be shaped such that, the periphery of the second mating surface 9304c may be shaped to form a wavy profile or undulations, corresponding to the configuration of the plurality of first projections.
In an embodiment, the first connection surface 9303b is connected to the robotic system by means such as, but not limited to, mechanical means and non-mechanical means. In an embodiment, the means of connection for the first connection surface 9303b and the second connection surface 9304b is selected from at least one of a magnetic means, a snap-fit arrangement, a screw-nut arrangement, a plug-socket 1006b means, a vacuum actuation means or any other means, as per feasibility and requirement.
In an embodiment, the first connection surface 9303b may be configured corresponding to the configuration of the robotic system, so that upon connection, the first connection surface 9303b is flush with the robotic system.
In an embodiment, the first connection surface 9303b is a rear surface of the first coupling member 9303a and the first mating surface is a front surface of the first coupling member 9303a.
In an embodiment, the first connection surface 9303b is configured to connect with a robotic arm of the robotic system.
In an embodiment, the first connection surface 9303b is connected to the robotic system via at least one of a mechanical means or an electro-mechanical means or any other means as per design feasibility and requirement.
In an embodiment, the first coupling member 9303a and the second coupling member 9304a are configured with at least one of a cylindrical profile, a rectangular profile, a circular profile, a curved profile or any other profile as per design feasibility and requirement.
In an embodiment, cross-section of the first coupling member 9303a and the second coupling member 9304a are selected from at least one of a rectangular cross-section, a circular cross-section, a square cross-section or any other cross-section, as per design feasibility and requirement.
Referring to
The locking mechanism 9305 includes at least one notch 9305a [shown in
In an embodiment, the at least one notch 9305a is configurable on either of the first coupling member 9303a and the second coupling member 9304a at a predetermined location, as per design feasibility and requirement. In other words, the at least one notch 9305a may be provided at any location on the first coupling member 9303a or the second coupling member 9304a, without hindering accessibility of the at least one notch 9305a with the at least one protrusion 9305b. In another embodiment, shape of the at least one notch 9305a is selected from at least one of a triangular shape, a circular shape, a rectangular shape or any other geometric shape, as per design feasibility and requirement.
In an embodiment, the at least one notch 9305a of a predetermined geometric shape may be machined on either of the first coupling member 9303a and the second coupling member 9304a. In another embodiment, a secondary attachment with at least one notch 9305a may be attached to the first coupling member 9303a or the second coupling member 9304a, for configuring the at least one notch 9305a on either of the first coupling member 9303a and the second coupling member 9304a.
In an embodiment, the at least one protrusion 9305b is configurable on either of the first coupling member 9303a and the second coupling member 9304a at a predetermined location, as per design feasibility and requirement. In other words, the at least one protrusion 9305b may be provided at any location on the first coupling member 9303a or the second coupling member 9304a, without hindering accessibility of the at least one protrusion 9305b with the at least one notch 9305a. In another embodiment, shape of the at least one protrusion 9305b is corresponding to the configuration of the at least one notch 9305a. In another embodiment, shape of the at least one protrusion 9305b is selected from at least one of a triangular shape, a circular shape, a rectangular shape or any other geometric shape, as per design feasibility and requirement.
In an embodiment, the at least one protrusion 9305b of a predetermined geometric shape may be machined on either of the first coupling member 9303a and the second coupling member 9304a. In another embodiment, a secondary attachment with at least one protrusion 9305b may be attached to the first coupling member 9303a or the second coupling member 9304a, for configuring the at least one protrusion 9305b on either of the first coupling member 9303a and the second coupling member 9304a.
In an embodiment, a combination of the at least one notch 9305a and the at least one protrusion 9305b may be provided on both the first coupling member 9303a and the second coupling member 9304a, as per design feasibility and requirement [as shown in
Referring back to
In an embodiment, the at least one sensor 9306 may be interfaced with one or more processors or control units, for transmitting signals relating to the orientation of the first coupling member 9303a with respect to the second coupling member 9304a, during engagement. The one or more processors or control units, may operate the first coupling member 9303a suitably, based on the feedback received from the at least one sensor 9306 for appropriate orientation and position. In an embodiment, the one or more processors or control units may rotate or actuate the first coupling member 9303a along different axis or axes, as per feasibility and requirement.
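The feedback-based orientation of the first coupling member can be illustrated with a simple proportional correction loop. This is a sketch under stated assumptions rather than the disclosed control law: the callables `read_offset_deg` and `rotate_by_deg`, the gain, the tolerance and the single rotation axis are all hypothetical.

```python
from typing import Callable


def align_coupling(read_offset_deg: Callable[[], float],
                   rotate_by_deg: Callable[[float], None],
                   tolerance_deg: float = 0.5,
                   gain: float = 0.5,
                   max_steps: int = 50) -> bool:
    """Rotate the first coupling member until the sensed offset is small.

    `read_offset_deg` returns the angular misalignment reported by the
    sensor; `rotate_by_deg` commands a relative rotation of the member.
    Returns True once the offset is within tolerance, False otherwise.
    """
    for _ in range(max_steps):
        offset = read_offset_deg()
        if abs(offset) <= tolerance_deg:
            return True
        rotate_by_deg(-gain * offset)   # proportional correction toward zero
    return False


# Usage with a simulated sensor: the member starts 20 degrees out of alignment.
sim = {"offset": 20.0}
aligned = align_coupling(lambda: sim["offset"],
                         lambda d: sim.update(offset=sim["offset"] + d))
```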
In an embodiment, the at least one sensor 9306 is selected from a group comprising piezoelectric sensors, hall-effect sensors, infrared sensors or any other sensors which serve the purpose of determining the position and orientation of the first coupling member 9303a with respect to the second coupling member 9304a.
In an embodiment, the first coupling member 9303a may be connected to the robotic system such as a robotic arm via at least one of an electro-mechanical means, a mechanical means, a vacuum means or a magnetic means. In an exemplary embodiment, the mechanical means of connection between the first coupling member 9303a and the robotic system may be selected from at least one of a snap-fit arrangement, a screw-thread arrangement, a twist-lock arrangement or any other means which serves the design feasibility and requirement.
In an embodiment, the second coupling member 9304a may be connected to the one or more objects 9302 via at least one of an electro-mechanical means, a mechanical means, a vacuum means or a magnetic means. In an exemplary embodiment, the mechanical means of connection between the second coupling member 9304a and the one or more objects 9302 may be selected from at least one of a snap-fit arrangement, a screw-thread arrangement, a twist-lock arrangement or any other means which serves the design feasibility and requirement.
In an embodiment, an interface port 9307 [as shown in
In an embodiment, the one or more objects 9302 may be selected from at least one of a kitchen appliance, a house-hold appliance, a shop-floor appliance, an industrial appliance or any other appliance which serves the user's requirement.
In an embodiment, the robotic system may be selected from at least one of a commercial robotic system and an industrial robotic system. In another embodiment, the commercial robotic system may be selected from at least one of a house-hold robotic system, a field robotic system, a medical robotic system and an autonomous robotic system.
As illustrated in
Referring to
In an embodiment, the extension of the at least one protrusion 9305b is less than the extension of the plurality of first projections, thus ensuring that the at least one protrusion 9305b engages with the at least one notch 9305a only upon engagement of the plurality of first projections with the plurality of second projections 9304d.
In an embodiment, the first coupling member 9303a is made of a material selected from a group comprising ferromagnetic materials or non-ferromagnetic materials.
In an embodiment, the second coupling member 9304a is made of a material selected from a group comprising ferromagnetic materials or non-ferromagnetic materials.
Referring to
Referring to
The forces on the one or more objects 9302 are applied, as per the directions mentioned in
Based on the above-mentioned direction of forces, table 7 below represents the force test with the locking mechanism 9305 in triangular configuration.
The numerals mentioned in the above table 7 correspond to the directions of forces on the one or more objects 9302 which the robotic system failed to hold. However, table 8 below illustrates the results of the force test on the one or more objects 9302 connected to the robotic system via a conventional coupling device 9300.
From the table 8, it is evident that providing the locking mechanism 9305 in the triangular configuration has significantly improved the stability and strength of the coupling device 9300 in comparison with the conventional coupling devices. Hence, the one or more objects 9302 can be handled more effectively by incorporating the locking mechanism 9305 in the triangular configuration.
As illustrated in
Referring to
Referring to
Further, the load test is carried out for the one or more objects 9302 [as shown in
Table 9 below represents the force test with the locking mechanism 9305 in circular configuration.
The numerals mentioned in the above table 9 correspond to the directions of forces on the one or more objects 9302 which the robotic system failed to hold.
From the table 9, it is evident that providing the locking mechanism 9305 in the circular configuration has significantly improved the stability and strength of the coupling device 9300 as compared to the conventional coupling devices. Hence, the one or more objects 9302 can be handled more effectively by incorporating the locking mechanism 9305 in the circular configuration. However, the locking mechanism 9305 in the circular configuration may be less stable and rigid as compared to the locking mechanism 9305 in triangular configuration.
As illustrated in
Referring to
Referring to
Referring to
Further, the load test is carried out for the one or more objects 9302 [as shown in
Table 10 below represents the force test with the locking mechanism 9305 in electromagnet configuration.
The numerals mentioned in the above table 10 correspond to the directions of forces on the one or more objects 9302 which the robotic system failed to hold.
From the table 10, it is evident that providing the locking mechanism 9305 in electromagnet configuration has significantly improved the stability and strength of the coupling device 9300 as compared to the conventional coupling devices. Moreover, this configuration of the coupling device 9300 has higher reliability than that of the circular and triangular configurations.
As illustrated in the drawings, the second coupling member 9304a may be located on the one or more objects 9302 such that, it does not hinder manual usage of the one or more objects 9302 to the user. At the same time, the location of the second coupling member 9304a may also enable the robotic system for effective manipulation of the one or more objects 9302, once coupled. Thus, location of the second coupling member 9304a on the one or more objects 9302, plays a crucial role in its utility both for the robotic system and the user.
In an exemplary embodiment, as shown in
In an embodiment, the at least one second locking member 9402 is actuated to slide towards the at least one first locking member 9401, while operating from the first position 9402a to the second position 9402b.
In an embodiment, the at least one actuator 9403a assembly may be associated with the one or more processors. The one or more processors may be configured to actuate the at least one actuator 9403a, for operating the at least one second locking member 9402 between the first position 9402a and the second position 9402b. In another embodiment, the one or more processors may be configured to actuate the at least one actuator 9403a, when the manipulator approaches vicinity of the one or more objects 9302 to be secured.
In an embodiment, the plurality of slots 9403c defined on the one or more slots 9403c are selected corresponding to the configuration of the at least one first locking member 9401 and the second locking member.
In an embodiment, the plurality of slots 9403c are configured such that, the frictional forces acting on the at least one first locking member 9401 and the second locking member during engagement is minimal. For the frictional forces to be minimal, the plurality of slots 9403c may be provided with features such as fillets or chamfers or any other features, which mitigate frictional forces for engagement.
In an embodiment, the plurality of slots 9403c may be configured with a predetermined sliding angle, based on the strength and smoothness of travel of the at least one first locking member 9401 and the second locking member.
In an embodiment, the predetermined sliding angle for the plurality of slots 9403c is determined based on various factors, as collated in table 11 below.
In an embodiment, the at least one actuator 9403a assembly comprises a lead screw operated by a motor 9403d and a nut mounted on the lead screw. A motor 9403d supported by a clamp, is coupled to the lead screw, and thus upon rotation of the lead screw, the nut is configured to slide along the lead screw. The nut is fixedly connected to the at least one second locking member 9402, so that the at least one second locking member 9402 is operated along with the nut, when the motor 9403d rotates the lead screw. The motor 9403d may be associated with the one or more processors of the robotic system. The one or more processors, thus, control the rotation of the lead screw as per requirement of movement of the at least one second locking member 9402. Accordingly, the at least one second locking member 9402 is operated between the first position 9402a and the second position 9402b.
In an embodiment, the lead screw may be mounted co-axial to a horizontal axis of the manipulator. In another embodiment, the lead screw may be mounted on the manipulator as per requirement of sliding movement of the at least one second locking member 9402 between the first position 9402a and the second position 9402b for securing the one or more objects 9302.
In an embodiment, a lead screw holder may be provided for supporting the lead screw on the manipulator.
In an embodiment, the lead screw holder includes a plurality of threads with a lead angle selected to restrict movement of the nut, when the motor 9403d ceases to operate.
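One common way to reason about a lead angle that holds the nut in place when the motor stops is the textbook self-locking condition for a square-thread screw, in which the tangent of the lead angle is smaller than the coefficient of friction. The sketch below applies that general condition; it is an assumption that this is the criterion intended here, and the numeric values in the usage lines are hypothetical.

```python
import math


def lead_angle_rad(lead_mm: float, mean_diameter_mm: float) -> float:
    """Lead angle from tan(lambda) = lead / (pi * mean diameter)."""
    return math.atan(lead_mm / (math.pi * mean_diameter_mm))


def is_self_locking(lead_mm: float, mean_diameter_mm: float, mu: float) -> bool:
    """Textbook square-thread condition: tan(lead angle) < friction coefficient."""
    return math.tan(lead_angle_rad(lead_mm, mean_diameter_mm)) < mu


# Usage with assumed values: 1 mm lead on a 6 mm screw.
holds_with_dry_friction = is_self_locking(1.0, 6.0, mu=0.15)
holds_with_low_friction = is_self_locking(1.0, 6.0, mu=0.03)
```

With these assumed inputs the tangent of the lead angle is roughly 0.053, so the screw holds for the higher friction value and back-drives for the lower one.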
In an embodiment, the one or more processors are configured to actuate the motor 9403d upon alignment of either of the at least one first locking member 9401 and the at least one second locking member 9402. In an embodiment, the plurality of slots 9403c may include the indication device or the marker for determining position of the one or more objects 9302 during engagement. In another embodiment, the indication device or the marker in the plurality of slots 9403c may enable the one or more processors to align either of the at least one first locking member 9401 and the at least one second locking member 9402 with the plurality of slots 9403c prior to engagement.
In an embodiment, the motor 9403d is powered by a power source selected from at least one of an Alternating-Current power source or a Direct-Current power source, for rotating the lead screw.
In an exemplary embodiment, the torque required to operate the lead screw is calculated as follows:
Based on the required force, friction of the screw and lead/pitch of the thread, the torque was calculated for the 6 mm screw:
The equation used to calculate the required torque:
T=torque
F=load on the screw
dm=screw diameter
μ=coefficient of friction (according to the table 1 below)
l=lead/pitch
Φ=angle of friction
λ=lead angle
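One standard relation that uses exactly these symbols is the square-thread power-screw torque formula. It is given here only as an assumption about the intended equation; it neglects collar friction and any thread-form correction, and it writes the lead/pitch as l:

```latex
T = \frac{F\, d_m}{2}\cdot\frac{l + \pi \mu d_m}{\pi d_m - \mu l}
  = \frac{F\, d_m}{2}\,\tan(\Phi + \lambda),
\qquad \tan\lambda = \frac{l}{\pi d_m}, \quad \tan\Phi = \mu
```

Under the same assumption, the screw holds its position without motor torque when Φ > λ, that is, when μ > l/(π dm), which is the kind of lead-angle selection that restricts movement of the nut once the motor stops.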
In another embodiment, the material used for the lead screw and the nut is selected from a group comprising steel, stainless steel, bronze, brass, cast iron and the like, as per design feasibility and requirement.
In an exemplary embodiment, the lead screw rotation is calculated as follows:
For a speed of 10 mm per second of the at least one second locking member 9402, with a thread lead/pitch of 1 mm, the rotation of the lead screw may be calculated via:
V=L*Rps (Eq. 2)
V—speed of the hooks
L—lead/pitch of the thread
Rps—Revolutions per second (of the screw)
Therefore: 10 mm/sec=1 mm*Rps
Rps=(10 mm/sec)/1 mm
In other words, 10 revolutions per second of the lead screw is required to move the at least one second locking member 9402 at 10 mm per second.
Therefore, the motor 9403d will have to rotate the screw at 600 rpm (10 rps*60 sec/min) to achieve a hook speed of 10 mm per second.
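The arithmetic of Eq. 2 can be checked with a few lines of Python. This is a minimal sketch; the function name `lead_screw_speed` and its parameter names are hypothetical and are not part of the disclosure.

```python
def lead_screw_speed(linear_speed_mm_s: float, lead_mm: float) -> tuple[float, float]:
    """Return (rev/s, rev/min) of the screw for a given nut speed, from V = L * Rps."""
    if lead_mm <= 0:
        raise ValueError("lead must be positive")
    rps = linear_speed_mm_s / lead_mm   # Rps = V / L
    return rps, rps * 60.0              # 60 seconds per minute

# Values from the text: 10 mm/s hook speed, 1 mm lead.
rps, rpm = lead_screw_speed(10.0, 1.0)
print(rps, rpm)  # 10.0 600.0
```

The same function shows how a coarser thread lowers the required motor speed: doubling the lead halves the rpm for the same hook speed.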
In an embodiment, the at least one actuator 9403a assembly may comprise a housing 9404a mounted on the manipulator. The housing 9404a includes a solenoid coil 9404b powered by the power source. A plunger 9404c is accommodated within the housing 9404a and is suspended concentrically with the solenoid coil 9404b. The plunger 9404c is adapted to be actuated by the solenoid coil 9404b in an energized condition. Further, a frame member 9404d extends from the plunger 9404c to connect to the at least one second locking member 9402. The frame member 9404d is thus configured to transfer actuation of the plunger 9404c to the at least one second locking member 9402, thereby operating the at least one second locking member 9402 between the first position 9402a and the second position 9402b.
In an embodiment, a damper member may be provided such that one end of the damper member is fixed to the housing 9404a and the other end is connected to the frame member 9404d. This configuration ensures that the frame member 9404d returns to an initial position when the solenoid coil 9404b is de-energized, and thus enables smooth translation of the frame member 9404d and the at least one second locking member 9402 between the first position 9402a and the second position 9402b.
In an embodiment, the frame member 9404d includes one or more link members 9405 which are connected to the at least one second locking member 9402 for transferring actuation of the plunger 9404c to the at least one second locking member 9402.
In an embodiment, the solenoid coil 9404b is energized by the power source selected from at least one of an Alternating-Current power source or a Direct-Current power source, for actuating the plunger 9404c.
In an embodiment, the at least one first locking member 9401 and the at least one second locking member 9402 are hook members.
In an embodiment, the at least one sensor 6 provided in the manipulator is configured to detect secured and unsecured condition of the one or more objects 9302 with the manipulator.
In an embodiment, both the at least one first locking member 9401 and the at least one second locking member 9402 can be slidable relative to one another for securing the one or more objects 9302. In other words, instead of moving only the at least one second locking member 9402, the at least one first locking member 9401 may also be slidably operated for securing the one or more objects 9302.
In an embodiment, the plurality of slots 9403c are provided on the one or more objects 9302 corresponding to the location and position in which the one or more objects 9302 need to be secured with the manipulator. In an exemplary embodiment, if the one or more objects 9302 need to be positioned vertically [as shown in
In an embodiment, the solenoid assembly may be selected from at least one of pull-type solenoid actuator or a push-type solenoid actuator, as per design feasibility and requirement.
In an embodiment, the plurality of slots 9403c are located such that, manual operation of the one or more objects 9302 is not restricted.
In an embodiment, the calculations pertaining to the force generated by the solenoid coil 9404b are set out below:
The DC solenoid's force and response time are both directly affected by Wattage. Since DC Wattage=Voltage×Current, increasing or decreasing either voltage or current (amperage) will increase or decrease force and response time.
However, increasing the current raises the coil temperature, which significantly lowers the magnetic force (an effect similar to increasing the duty cycle).
The magnetic force is very low at the start of the stroke and much higher at the end of the stroke (as presented below in the examples of the solenoids' characteristics). For instance, at 50% duty cycle, the magnetic force could be 5-20 times lower at the stroke equal to 5 mm than the magnetic force at the end of the stroke [as shown in
The solenoid's force is inversely proportional to the stroke (air gap “g”) squared. When the air gap is doubled, the force will quarter.
According to the equation below, the DC solenoid's magnetic field decays with the square of distance. Therefore, the solenoid's force is not linear; it increases from small at the beginning of the stroke to very high at the end of the stroke.
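A form consistent with the symbols defined here and with the worked values given further on (about 1.2 N at a 10 mm stroke and 0.3 N at a 20 mm stroke) is the standard air-gap force relation. It is stated as an assumption, with I the coil current in amperes and μ0 = 4π×10⁻⁷ the magnetic constant:

```latex
F = \frac{(N I)^2 \,\mu_0\, A}{2\, g^2}
```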
where:
F=Solenoid's force in Newtons
I=Current in Amperes
N=Number of turns (wiring)
g=Stroke/Length of the air gap in meters
A=Area in square meters (m²)
4×π×10⁻⁷=Magnetic constant
That means if current (I), area (A) and number of turns (N) are constant, then the force (F) is inversely proportional to the gap squared.
In an exemplary embodiment, the current (I=4 Amperes), the number of turns (N=500) and the area (A=49 mm²=7 mm*7 mm) are constant. The table 1 on the right shows the results for different strokes/air gaps “g” (which in fact are the plunger 9404c's positions). The force was calculated with the equation below:
The force (F) quickly decreases when stroke increases. The force (F) is inversely proportional to the gap squared. It is displayed as a hyperbola in the force/stroke chart [as shown in
When the stroke is doubled, the force quarters:
10 mm stroke=>1.2 Newton force
20 mm stroke=>0.3 Newton force
The force is close to zero at the long stroke.
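The stroke values above can be reproduced numerically. The sketch below assumes the standard air-gap relation F = (N·I)²·μ0·A/(2·g²), which matches the quoted 1.2 N and 0.3 N figures for I = 4 A, N = 500 and A = 49 mm²; the function and constant names are hypothetical.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # magnetic constant, H/m


def solenoid_force(current_a, turns, area_m2, gap_m):
    """Assumed relation: F = (N*I)^2 * mu0 * A / (2 * g^2), result in newtons."""
    if gap_m <= 0:
        raise ValueError("air gap must be positive")
    return (turns * current_a) ** 2 * MU_0 * area_m2 / (2 * gap_m ** 2)


area = 7e-3 * 7e-3  # 7 mm x 7 mm expressed in square meters
for gap_mm in (5, 10, 20, 40):
    force = solenoid_force(4.0, 500, area, gap_mm / 1000)
    print(f"{gap_mm:>2} mm stroke -> {force:.2f} N")
```

Doubling the gap in this model always divides the force by four, which is the inverse-square behaviour described in the text.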
In an embodiment, for storing the one or more objects 9302, the robotic system is adapted to approach the wall locking mechanism 9406 and orient the one or more objects 9302 at a predetermined angle for inserting the wall mount bracket 9407 of the one or more objects 9302. At this stage, the robotic system tilts the one or more objects 9302 suitably, to lock the wall mount bracket 9407 into the opening 9406a.
In an embodiment, the opening 9406a, the socket 9406b and the stopper may be configured corresponding to the configuration of the wall mount bracket 9407 provided on the one or more objects 9302.
In an embodiment, the wall locking mechanism 9406 may be configured to directly receive and store the one or more objects 9302.
In an embodiment, a magnet may be provided in the socket 9406b, for providing extra locking force to the one or more objects 9302.
In an embodiment, the magnet may be provided to the wall mount bracket 9407 or may be directly mounted to the one or more objects 9302 for fixing onto the wall locking mechanism 9406.
In an embodiment, the wall mount mechanism is provided in at least one of a kitchen environment, a structured environment or an unstructured environment.
The disk drive unit 4340 includes a machine-readable medium 244 on which is stored one or more sets of instructions (e.g., software 4346) embodying any one or more of the methodologies or functions described herein. The software 4346 may also reside, completely or at least partially, within the main memory 3644 and/or within the processor 4326 during execution thereof by the computer system 4324, the main memory 4328 and the instruction-storing portions of processor 4326 also constituting machine-readable media. The software 4346 may further be transmitted or received over a network 4350 via the network interface device 248.
While the machine-readable medium 3644 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any tangible medium that is capable of storing a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media.
In general, a robotic control platform comprises one or more robotic sensors; one or more robotic actuators; a mechanical robotic structure including at least a robotic head with mounted sensors on an articulated neck, two robotic arms with actuators and force sensors; an electronic library database, communicatively coupled to the mechanical robotic structure, of minimanipulations, each including a sequence of steps to achieve a predefined functional result, each step comprising a sensing operation or a parameterized actuator operation; and a robotic planning module, communicatively coupled to the mechanical robotic structure and the electronic library database, configured for combining a plurality of minimanipulations to achieve one or more domain-specific applications; a robotic interpreter module, communicatively coupled to the mechanical robotic structure and the electronic library database, configured for reading the minimanipulation steps from the minimanipulation library and converting to a machine code; and a robotic execution module, communicatively coupled to the mechanical robotic structure and the electronic library database, configured for executing the minimanipulation steps by the robotic platform to accomplish a functional result associated with the minimanipulation steps.
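The relationship between the library, the planning module, the interpreter module and the execution module can be illustrated with a small sketch. All class, function and step names below (`Step`, `MiniManipulation`, `plan`, `interpret`, `execute`, `grasp_spoon`, and so on) are hypothetical, and the "machine code" is reduced to a flat tuple purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    kind: str                      # "sense" or "actuate"
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class MiniManipulation:
    name: str
    functional_result: str
    steps: list


def plan(library, names):
    """Planning: concatenate the steps of the chosen minimanipulations."""
    return [step for n in names for step in library[n].steps]


def interpret(step):
    """Interpretation: turn one step into a flat command tuple."""
    return (step.kind, step.name, tuple(sorted(step.params.items())))


def execute(commands, handlers):
    """Execution: dispatch each command to the handler for its kind."""
    return [handlers[kind](name, dict(params)) for kind, name, params in commands]


library = {
    "grasp_spoon": MiniManipulation("grasp_spoon", "spoon held", [
        Step("sense", "locate_spoon"),
        Step("actuate", "close_fingers", {"force_n": 2.0}),
    ]),
    "stir": MiniManipulation("stir", "contents mixed", [
        Step("actuate", "rotate_wrist", {"turns": 3}),
    ]),
}
commands = [interpret(s) for s in plan(library, ["grasp_spoon", "stir"])]
handlers = {
    "sense": lambda name, p: f"sensed:{name}",
    "actuate": lambda name, p: f"moved:{name}",
}
print(execute(commands, handlers))
```

Keeping each step tagged as either a sensing operation or a parameterized actuator operation lets the executor stay a simple dispatcher, while the domain knowledge stays in the library entries.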
Another generalized aspect provides a humanoid having a robot computer controller operated by robot operating system (ROS) with robotic instructions comprises a database having a plurality of electronic minimanipulation libraries, each electronic minimanipulation library including a plurality of minimanipulation elements, the plurality of electronic minimanipulation libraries can be combined to create one or more machine executable application-specific instruction sets, the plurality of minimanipulation elements within an electronic minimanipulation library can be combined to create one or more machine executable application-specific instruction sets; a robotic structure having an upper body and a lower body connected to a head through an articulated neck, the upper body including torso, shoulder, arms and hands; and a control system, communicatively coupled to the database, a sensory system, a sensor data interpretation system, a motion planner, and actuators and associated controllers, the control system executing application-specific instruction sets to operate the robotic structure.
A further generalized computer-implemented method for operating a robotic structure through the use of one or more controllers, one or more sensors, and one or more actuators to accomplish one or more tasks comprises providing a database having a plurality of electronic minimanipulation libraries, each electronic minimanipulation library including a plurality of minimanipulation elements, the plurality of electronic minimanipulation libraries can be combined to create one or more machine executable task-specific instruction sets, the plurality of minimanipulation elements within an electronic minimanipulation library can be combined to create one or more machine executable task-specific instruction sets; executing task-specific instruction sets to cause the robotic structure to perform a commanded task, the robotic structure having an upper body connected to a head through an articulated neck, the upper body including torso, shoulder, arms and hands; sending time-indexed high-level commands for position, velocity, force, and torque to the one or more physical portions of the robotic structure; and receiving sensory data from one or more sensors for factoring with the time-indexed high-level commands to generate low-level commands to control the one or more physical portions of the robotic structure.
Another generalized computer-implemented method for generating and executing a robotic task of a robot comprises generating a plurality of minimanipulations in combination with parametric minimanipulation (MM) data sets, each minimanipulation being associated with at least one particular parametric MM data set which defines the required constants, variables and time-sequence profile associated with each minimanipulation; generating a database having a plurality of electronic minimanipulation libraries, the plurality of electronic minimanipulation libraries having MM data sets, MM command sequencing, one or more control libraries, one or more machine-vision libraries, and one or more inter-process communication libraries; executing high-level robotic instructions by a high-level controller for performing a specific robotic task by selecting, grouping and organizing the plurality of electronic minimanipulation libraries from the database thereby generating a task-specific command instruction set, the executing step including decomposing high-level command sequences, associated with the task-specific command instruction set, into one or more individual machine-executable command sequences for each actuator of a robot; and executing low-level robotic instructions, by a low-level controller, for executing individual machine-executable command sequences for each actuator of a robot, the individual machine-executable command sequences collectively operating the actuators on the robot to carry out the specific robot task.
A generalized computer-implemented method for controlling a robotic apparatus comprises composing one or more minimanipulation behavior data, each minimanipulation behavior data including one or more elementary minimanipulation primitives for building one or more ever-more complex behaviors, each minimanipulation behavior data having a correlated functional result and associated calibration variables for describing and controlling each minimanipulation behavior data; linking one or more behavior data to a physical environment data from one or more databases to generate a linked minimanipulation data, the physical environment data including physical system data, controller data to effect robotic movements, and sensory data for monitoring and controlling the robotic apparatus 75; and converting the linked minimanipulation (high-level) data from the one or more databases to a machine-executable (low-level) instruction code for each actuator (A1 thru An) controller for each time-period (t1 thru tm) to send commands to the robotic apparatus for executing one or more commanded instructions in a continuous set of nested loops.
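The conversion from time-indexed high-level data to per-actuator low-level commands in nested loops can be sketched as follows. This is an illustration only: the function `to_low_level`, the dictionary layout of `targets`, and the proportional correction with a fixed `gain` are assumptions, not the control law of the disclosure.

```python
def to_low_level(targets, read_position, gain=0.5):
    """Convert {time_period: {actuator: desired_position}} into a flat list of
    (time_period, actuator, command) tuples, using sensed positions as feedback."""
    commands = []
    for t in sorted(targets):                  # outer loop: time periods t1..tm
        for actuator in sorted(targets[t]):    # inner loop: actuators A1..An
            desired = targets[t][actuator]
            measured = read_position(actuator)  # sensory data for this actuator
            commands.append((t, actuator, gain * (desired - measured)))
    return commands


# Hypothetical high-level targets for two actuators over two time periods.
targets = {1: {"A1": 1.0, "A2": 2.0}, 2: {"A1": 0.0}}
print(to_low_level(targets, read_position=lambda actuator: 0.0))
# [(1, 'A1', 0.5), (1, 'A2', 1.0), (2, 'A1', 0.0)]
```

In a running system the outer loop would repeat continuously and `read_position` would query live sensors; here it is a stub so the example stays self-contained.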
In any of these aspects, the following may be considered. The preparation of the product normally uses ingredients. Executing the instructions typically includes sensing properties of the ingredients used in preparing the product. The product may be a food dish in accordance with a (food) recipe (which may be held in an electronic description) and the person may be a chef. The working equipment may comprise kitchen equipment. These methods may be used in combination with any one or more of the other features described herein. One, more than one, or all of the features of the aspects may be combined, so a feature from one aspect may be combined with another aspect for example. Each aspect may be computer-implemented and there may be provided a computer program configured to perform each method when operated by a computer or processor. Each computer program may be stored on a computer-readable medium. Additionally or alternatively, the programs may be partially or fully hardware-implemented. The aspects may be combined. There may also be provided a robotics system configured to operate in accordance with the method described in respect of any of these aspects.
In another aspect, there may be provided a robotics system, comprising: a multi-modal sensing system capable of observing human motions and generating human motions data in a first instrumented environment; and a processor (which may be a computer), communicatively coupled to the multi-modal sensing system, for recording the human motions data received from the multi-modal sensing system and processing the human motions data to extract motion primitives, preferably such that the motion primitives define operations of a robotics system. The motion primitives may be minimanipulations, as described herein (for example in the immediately preceding paragraphs) and may have a standard format. The motion primitive may define specific types of action and parameters of the type of action, for example a pulling action with a defined starting point, end point, force and grip type. Optionally, there may be further provided a robotics apparatus, communicatively coupled to the processor and/or multi-modal sensing system. The robotics apparatus may be capable of using the motion primitives and/or the human motions data to replicate the observed human motions in a second instrumented environment.
In a further aspect, there may be provided a robotics system, comprising: a processor (which may be a computer), for receiving motion primitives defining operations of a robotics system, the motion primitives being based on human motions data captured from human motions; and a robotics system, communicatively coupled to the processor, capable of using the motion primitives to replicate human motions in an instrumented environment. It will be understood that these aspects may be further combined.
A further aspect may be found in a robotics system comprising: first and second robotic arms; first and second robotic hands, each hand having a wrist coupled to a respective arm, each hand having a palm and multiple articulated fingers, each articulated finger on the respective hand having at least one sensor; and first and second gloves, each glove covering the respective hand having a plurality of embedded sensors. Preferably, the robotics system is a robotic kitchen system.
There may further be provided, in a different but related aspect, a motion capture system, comprising: a standardized working environment module, preferably a kitchen; and a plurality of multi-modal sensors having a first type of sensors configured to be physically coupled to a human and a second type of sensors configured to be spaced away from the human. One or more of the following may be the case: the first type of sensors may be for measuring the posture of human appendages and sensing motion data of the human appendages; the second type of sensors may be for determining a spatial registration of the three-dimensional configurations of one or more of the environment, objects, movements, and locations of human appendages; the second type of sensors may be configured to sense activity data; the standardized working environment may have connectors to interface with the second type of sensors; the first type of sensors and the second type of sensors measure motion data and activity data, and send both the motion data and the activity data to a computer for storage and processing for product (such as food) preparation.
An aspect may additionally or alternatively be considered in a robotic hand coated with a sensing glove, comprising: five fingers; and a palm connected to the five fingers, the palm having internal joints and a deformable surface material in three regions; a first deformable region disposed on a radial side of the palm and near the base of the thumb; a second deformable region disposed on an ulnar side of the palm, and spaced apart from the radial side; and a third deformable region disposed on the palm and extending across the base of the fingers. Preferably, the combination of the first deformable region, the second deformable region, the third deformable region, and the internal joints collectively operate to perform a mini manipulation, particularly for food preparation.
A multi-level robotic system for high speed and high fidelity manipulation operations segmented into two physical and logical subsystems made up of instrumented, articulated and controller-actuated subsystems, each comprising a larger- and coarser-motion macro-manipulation system responsible for operations in larger unconstrained environment workspaces at a reduced endpoint accuracy, and a smaller- and finer-motion micro-manipulation system responsible for operations in a smaller workspace and while interacting with tooling and the environment at a higher endpoint motion accuracy, each carrying out mini-manipulation trajectory-following tasks based on mini-manipulation commands provided through a dual-level database specific to the macro-manipulation and micro-manipulation subsystems, each supported by a dedicated and separate distributed processor and sensor architecture operating under an overall real-time operating system communicating with all subsystems over multiple bus interfaces specific to sensor, command and database-elements. 
In the robotic system of the present disclosure, the macro-manipulation subsystem contains its dedicated sensors, actuators and processors interconnected over one or more dedicated interface buses, including a sensor suite used for perceiving the surrounding environment, which includes imaging and mapping the same and modeling elements within the environment and identifying said elements, performing macro-manipulation subsystem relevant motion planning in one or more of joint- and/or Cartesian-space based on mini-manipulation commands provided by a dedicated macro-level mini-manipulation library, executing said commands through position or velocity or joint or force based control at the joint-actuator level, and providing sensory data back to the macro-manipulation control and perception subsystems, while also monitoring all processes to allow for learning algorithms to provide improvements to the mini-manipulation macro-level command-library to improve future performance based on criteria such as execution-time, energy-expended, collision-avoidance, singularity-avoidance and workspace-reachability.
In the robotic system of the present disclosure, the micro-manipulation subsystem contains its dedicated sensors, actuators and processors interconnected over one or more dedicated interface buses, including a sensor suite used for perceiving the immediate environment, which includes imaging and mapping the same and modeling elements within the environment and identifying said elements, particularly as it relates to interaction variables between the micro-manipulation system and associated tools during contact with the environment itself, performing micro-manipulation subsystem relevant motion planning in one or more of joint- and/or Cartesian-space based on mini-manipulation commands provided by a dedicated micro-level mini-manipulation library, executing said commands through position or velocity or joint or force based control at the joint-actuator level, and providing sensory data back to the micro-manipulation control and perception subsystems, while also monitoring all processes to allow for learning algorithms to provide improvements to the mini-manipulation micro-level command-library to improve future performance based on criteria such as execution-time, energy-expended, collision-avoidance, singularity-avoidance and workspace-reachability.
A robotic cooking system configured into at least a dual-layer physical and logical macro-manipulation and micro-manipulation system capable of independent and coordinated task-motions by way of instrumented, articulated and controller-actuated subsystems, where the macro-manipulation system is used for coarse positioning of the entire robot assembly in free space using its own dedicated sensing, positioning and motion execution subsystems, with one or more respective micro-manipulation subsystems attached thereto for local sensing, fine-positioning and motion execution of the end effectors interacting with the environment, with each of the macro- and micro-manipulation systems configured with its own separate and dedicated buses for sensing, data-communication and control of associated actuators with their associated processors, with each of the macro- and micro-manipulation systems receiving motion and behavior commands based on separate mini-manipulation commands from its dedicated planner, with each planner receiving coordinated time- and process-progress dependent mini-manipulation commands from a central planner. The macro-manipulation system comprises a large workspace translational Cartesian-space positioner with an attached body system made up of a sensor-head connected to a shoulder and torso with one or more articulated multi-jointed manipulator arms, each with a wrist attached thereto capable of positioning one or more of the micro-manipulation subsystems via dedicated sensors and actuators interfaced through at least one or more dedicated controllers. The micro-manipulation system comprises at least one palm attached thereto and a dexterous multi-fingered end-of-arm end effector for handling utensils and tools, as well as any vessel needed in any stages of dish preparation cooking, via dedicated sensors and actuators interfaced through at least one or more controllers.
The system relates to where a set of legs or wheels is attached to a waist attached to the macro-manipulation system for larger workspace movements. The system provides sensor feedback data to a world perception and modeling system responsible for perceiving the macro-manipulation subsystem free-space environment as well as the entire robotic system pose. The system provides the world perception feedback and model data over one or more dedicated interface buses to a dedicated macro-manipulation planning, execution and tracking module operating on one or more stand-alone and separate processors. The planning system pertains to macro-manipulation motion commands provided to it from a separate stand-alone task-decomposition and planning module. The system provides sensor feedback data to a world perception and modeling system responsible for perceiving the micro-manipulation subsystem free-space environment as well as the entire robotic system pose. The system provides the world perception feedback and model data over one or more dedicated interface buses to a dedicated micro-manipulation planning, execution and tracking module operating on one or more stand-alone and separate processors.
The planning system provides micro-manipulation motion commands provided to it from a separate stand-alone task-decomposition and planning module.
A planning system generates a mini-manipulation command-stack sequence and is configured to perform planning actions for the entire robot system by combining and coordinating separately planned mini-manipulations from the macro- and micro-planners, where the macro-manipulation planner plans and generates time- and process-progress dependent mini-manipulations for the macro-manipulation subsystem, and where the micro-manipulation planner plans and generates time- and process-progress dependent mini-manipulations for the micro-manipulation subsystem. Each of the subsystem planners comprises a task-progress tracking module, a mini-manipulation planning module, and a mini-manipulation database for macro-manipulation tasks. The task-progress tracking module includes a progress comparator module that tracks differences between commanded and actual task progress, model and environment data as well as product and process model data combined with all relevant sensor feedback data, and a learning module that creates and tracks variations that impact deviations in the descriptors of said mini-manipulations for potential future upgrades to the respective database. The mini-manipulation planning module generates mini-manipulation commands based on a set of steps that use mini-manipulation commands from a database, which are subsequently evaluated for applicability, resolved for application to individual movable components, combined in space for a smooth motion profile, optimized for timing, and then translated into a machine-readable set of mini-manipulation commands configured into a command-stack sequence.
A method for generating mini-manipulation commands for one or both the macro-manipulation or micro-manipulation subsystems, through a process of receiving a high-level task-execution command, comprises selecting from an action-primitive repository, a set of alternative action primitives that are evaluated and selected to achieve the commanded task based on a set of pre-determined criteria of importance to the application which describe the required entry boundary conditions as well as the minimum necessary exit boundary conditions defining a successful task-completion state at its start and its completion. The method of mini-manipulation command generation for one or both the macro- or micro-manipulation subsystems, comprises receiving a high-level task execution command, identifying individual subtasks which will be mapped to the applicable robotic subsystems, generation of individual performance criteria and measurable success end-state criteria for each of the above subtasks, selection of one or more in either a stand-alone or combination, of the most suitable action primitive candidates, evaluation of these action primitive alternatives for maximizing or minimizing such measures as execution-time, energy expended, robot reachability, collision avoidance or any other task-critical criteria, generation of either or both macro- and/or micro-manipulation subsystem trajectories in one or more motion spaces, including joint- and Cartesian-space, synchronizing said trajectories for path consecutiveness, path-segment smoothness, intra-segment time-stamp synchronization and coordination amongst multi-arm robot subsystems, and generating a machine-executable command-sequence stack for one or both the macro- and/or micro manipulation subsystems. 
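The evaluation and selection of action-primitive candidates can be pictured as a filter followed by a cost minimization. The sketch below is an assumption about one simple way to do this: the `Candidate` fields, the feasibility test and the weighted sum of execution time and energy are all hypothetical choices, not criteria fixed by the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    exec_time_s: float
    energy_j: float
    reachable: bool
    collision_free: bool


def select_primitive(candidates, w_time=1.0, w_energy=0.1):
    """Drop infeasible candidates, then return the one with the lowest weighted cost."""
    feasible = [c for c in candidates if c.reachable and c.collision_free]
    if not feasible:
        return None  # no primitive satisfies the boundary conditions
    return min(feasible, key=lambda c: w_time * c.exec_time_s + w_energy * c.energy_j)


candidates = [
    Candidate("reach_over_counter", 2.0, 30.0, True, True),    # cost 5.0
    Candidate("reach_around_pot", 3.5, 10.0, True, True),      # cost 4.5
    Candidate("reach_through_shelf", 1.0, 5.0, True, False),   # infeasible
]
best = select_primitive(candidates)
print(best.name)  # reach_around_pot
```

Changing the weights shifts the trade-off between speed and energy without touching the feasibility filter, which mirrors the separation between boundary conditions and optimization measures described above.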
The method includes the step of receiving mini-manipulation descriptor updates generated during the mini-manipulation progress tracking and performance learning process, involving extracting relevant constants and variables related to specific mini-manipulations and their associated action primitives, assigning variances for each variable and constant for each affected action primitive, and providing the updates back to the action-primitive repository to allow each of the updates to be logged and implemented within said repository or database.
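A descriptor update of this kind can be sketched as computing, for each tracked variable, a mean and a variance over logged executions and writing the result back to the repository. The function name `descriptor_update`, the repository layout and the use of a population variance are assumptions made for illustration.

```python
from statistics import mean, pvariance


def descriptor_update(observations):
    """Map {variable_name: [values logged across executions]} to per-variable
    mean and population variance, skipping variables with no observations."""
    return {
        name: {"mean": mean(values), "variance": pvariance(values)}
        for name, values in observations.items()
        if values
    }


# Hypothetical repository keyed by action primitive, updated in place.
repository = {"grasp_handle": {}}
logged = {"grip_force_n": [2.0, 4.0], "approach_speed_mm_s": [10.0, 10.0, 10.0]}
repository["grasp_handle"].update(descriptor_update(logged))
print(repository["grasp_handle"]["grip_force_n"])  # {'mean': 3.0, 'variance': 1.0}
```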
In respect of any of the above system, device or apparatus aspects there may further be provided method aspects comprising steps to carry out the functionality of the system. Additionally or alternatively, optional features may be found based on any one or more of the features described herein with respect to other aspects.
The present disclosure has been described in particular detail with respect to possible embodiments. Those skilled in the art will appreciate that the disclosure may be practiced in other embodiments. The particular naming of the components, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the disclosure or its features may have different names, formats, or protocols. The system may be implemented via a combination of hardware and software, as described, or entirely in hardware elements, or entirely in software elements. The particular division of functionality between the various system components described herein is merely an example and is not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component.
In various embodiments, the present disclosure can be implemented as a system or a method for performing the above-described techniques, either singly or in any combination. The combination of any specific features described herein is also provided, even if that combination is not explicitly described. In another embodiment, the present disclosure can be implemented as a computer program product comprising a computer-readable storage medium and computer program code, encoded on the medium, for causing a processor in a computing device or other electronic device to perform the above-described techniques.
As used herein, any reference to “one embodiment” or to “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
Some portions of the above are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to convey most effectively the substance of their work to others skilled in the art. An algorithm is generally perceived to be a self-consistent sequence of steps (instructions) leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, transformed, and otherwise manipulated. It is convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
Furthermore, it is also convenient at times to refer to certain arrangements of steps requiring physical manipulations of physical quantities as modules or code devices, without loss of generality.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that, throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “displaying” or “determining” or the like refer to the action and processes of a computer system, or similar electronic computing module and/or device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission, or display devices.
Certain aspects of the present disclosure include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present disclosure could be embodied in software, firmware, and/or hardware, and, when embodied in software, it can be downloaded to reside on, and operated from, different platforms used by a variety of operating systems.
The present disclosure also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer-readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each coupled to a computer system bus. Furthermore, the computers and/or other electronic devices referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
The algorithms and displays presented herein are not inherently related to any particular computer, virtualized system, or other apparatus. Various general-purpose systems may also be used with programs, in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will be apparent from the description provided herein. In addition, the present disclosure is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present disclosure as described herein, and any references above to specific languages are provided for disclosure of enablement and best mode of the present disclosure.
In various embodiments, the present disclosure can be implemented as software, hardware, and/or other elements for controlling a computer system, computing device, or other electronic device, or any combination or plurality thereof. Such an electronic device can include, for example, a processor, an input device (such as a keyboard, mouse, touchpad, trackpad, joystick, trackball, microphone, and/or any combination thereof), an output device (such as a screen, speaker, and/or the like), memory, long-term storage (such as magnetic storage, optical storage, and/or the like), and/or network connectivity, according to techniques that are well known in the art. Such an electronic device may be portable or non-portable. Examples of electronic devices that may be used for implementing the disclosure include a mobile phone, personal digital assistant, smartphone, kiosk, desktop computer, laptop computer, consumer electronic device, television, set-top box, or the like. An electronic device for implementing the present disclosure may use an operating system such as, for example, iOS available from Apple Inc. of Cupertino, Calif., Android available from Google Inc. of Mountain View, Calif., Microsoft Windows 7 available from Microsoft Corporation of Redmond, Wash., webOS available from Palm, Inc. of Sunnyvale, Calif., or any other operating system that is adapted for use on the device. In some embodiments, the electronic device for implementing the present disclosure includes functionality for communication over one or more networks, including for example a cellular telephone network, wireless network, and/or computer network such as the Internet.
Some embodiments may be described using the expressions “coupled” and “connected” along with their derivatives. It should be understood that these terms are not intended as synonyms for each other. For example, some embodiments may be described using the term “connected” to indicate that two or more elements are in direct physical or electrical contact with each other. In another example, some embodiments may be described using the term “coupled” to indicate that two or more elements are in direct physical or electrical contact. The term “coupled,” however, may also mean that two or more elements are not in direct contact with each other, but yet still co-operate or interact with each other. The embodiments are not limited in this context.
As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).
The terms “a” or “an,” as used herein, are defined as one or more than one. The term “plurality,” as used herein, is defined as two or more than two. The term “another,” as used herein, is defined as at least a second or more.
An ordinary artisan should require no additional explanation in developing the methods and systems described herein but may find some possibly helpful guidance in the preparation of these methods and systems by examining standardized reference works in the relevant art.
While the disclosure has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments may be devised which do not depart from the scope of the present disclosure as described herein. It should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. The terms used should not be construed to limit the disclosure to the specific embodiments disclosed in the specification and the claims, but the terms should be construed to include all methods and systems that operate under the claims set forth herein below. Accordingly, the scope of the disclosure is not limited by the foregoing description, but is instead to be determined entirely by the following claims.
This application claims priority to and the benefit of U.S. Provisional Application Ser. No. 62/536,625 entitled “Systems and Methods for Operating Robotic End Effectors,” filed on 25 Jul. 2017, U.S. Provisional Application Ser. No. 62/546,022 entitled “Systems and Methods for Operating Robotic End Effectors,” filed on 16 Aug. 2017, U.S. Provisional Application Ser. No. 62/597,449 entitled “Systems and Methods for Operating Robotic End Effectors,” filed on 12 Dec. 2017, U.S. Provisional Application Ser. No. 62/648,711 entitled “Systems and Methods for Managing the Operation of Robotic Assistants and End Effectors and Executing Robotic Interactions,” filed on 27 Mar. 2018, and U.S. Provisional Application Ser. No. 62/678,456 entitled “Systems and Methods for Operating Robotic End Effectors,” filed on 31 May 2018, the disclosures of which are incorporated herein by reference in their entireties.
Number | Date | Country
---|---|---
62536625 | Jul 2017 | US
62546022 | Aug 2017 | US
62597449 | Dec 2017 | US
62648711 | Mar 2018 | US
62678456 | May 2018 | US