POTENTIOMETERS AS POSITION SENSOR IN DEXTEROUS ROBOTICS FINGERS

BACKGROUND
1. Field

The present disclosure relates generally to robotics and, more specifically, to sensing position of joints.

2. Description of the Related Art

Dynamic mechanical systems are often controlled with computational processes. Examples include robots, industrial processes, life support systems, and medical devices. Generally, such a process takes input from sensors indicative of state of the dynamic mechanical system and its environment and determines outputs that serve to control various types of actuators within the dynamic mechanical system, thereby changing the state of the system and potentially its environment. In recent years, control of dynamic mechanical systems has been improved using machine learning, and potential applications for dynamic mechanical systems, like robots, are numerous.

SUMMARY

The following is a non-exhaustive listing of some aspects of the present techniques. These and other aspects are described in the following disclosure.

Some applications include, in robotic systems that operate under tight volumetric constraints at a point of articulation, a compact force transmission means and a compact sensing means. Examples of a compact force transmission means may include, but are not limited to, a tendon, like a cable, and compact sensing means may include, but are not limited to, a position sensor.

An example embodiment of a tendon may couple a member having a point of articulation, such as at a joint, to an actuator at a point of actuation that drives the tendon (e.g., pulls on the tendon). The actuator (e.g., point of actuation) may be disparately located from the member and the point of articulation.

An example embodiment of a sensor may be positioned at or coupled to a point of articulation, such as at or coupled to a joint from which a member articulates. Example embodiments of a sensor may generate a feedback signal indicative of movement or position of the member coupled to the joint. Some embodiments of a sensor may be housed within the joint and detect rotation of the member about the joint (e.g., with a single degree of freedom). Some embodiments of a sensor may be housed within the joint and detect rotation of the member about the joint (e.g., with multiple degrees of freedom).

Some embodiments implement a process to control an actuator disparately located from a point of articulation. While actuation may be effectively physically separated from the point of articulation, such as by a tendon or other linkage, a machine learning model, like a control model, may rely on precise knowledge of position parameters corresponding to the point of articulation. In some embodiments, an encoder obtains feedback data from a sensor coupled to the point of articulation, from which the encoder may determine a state vector including information indicative of position of a joint or member corresponding to the point of articulation. A control model may output one or more values by which to adjust the actuator based on the state vector and compare a resulting state based on updated feedback data relative to a desired state to determine an amount of position change caused by the one or more output values.

Some aspects include a tangible, non-transitory, machine-readable medium storing instructions that when executed by a data processing apparatus cause the data processing apparatus to perform operations including the above-mentioned applications.

Some aspects include a system, including: one or more processors; one or more inertial measurement units; and memory storing instructions that when executed by the processors cause the processors to effectuate operations of the above-mentioned applications.

BRIEF DESCRIPTION OF THE DRAWINGS

The above-mentioned aspects and other aspects of the present techniques will be better understood when the present application is read in view of the following figures in which like numbers indicate similar or identical elements:

FIG. 1A shows an example hand of a robot system with a view of an S-bar and examples of space-constrained joints.

FIG. 1B shows an example bottom-up view of an S-bar of a robot hand and examples of space-constrained joints.

FIG. 1C shows an example side view of an S-bar of a robot hand.

FIG. 1D shows an example palm view of a hand of a robot system with S-bars.

FIG. 1E shows an example view of a thumb of a robot system with an S-bar.

FIG. 1F shows an example view of a hand of a robot system with S-bars.

FIG. 1G shows an example view of a hand of a robot system with an S-bar.

FIG. 1H shows an example view of a hand of a robot system with an S-bar.

FIG. 1I shows an example view of a hand of a robot system with S-bars.

FIG. 1K shows an example view of a joint and position sensor by which example techniques for determining position of space constrained joints may be implemented in accordance with some example embodiments.

FIG. 1L shows an example view of a position sensor for determining position of space constrained joints in accordance with some example embodiments.

FIG. 1M shows an example view of a position sensor for determining position of space constrained joints in accordance with some example embodiments.

FIG. 2A shows an example computing system for training robots to perform tasks.

FIG. 2B shows an example machine learning model that may be used in accordance with some embodiments.

FIG. 3 shows an example computing system that uses machine learning and teleoperation to train robots, in accordance with some embodiments.

FIG. 4 shows an example computing system that may be used in accordance with some embodiments.

FIGS. 5-11 depict the hand of the robot system in greyscale from various perspectives.

While the present techniques are susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. The drawings may not be to scale. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the present techniques to the particular form disclosed, but to the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present techniques as defined by the appended claims.

DETAILED DESCRIPTION

To mitigate the problems described herein, the inventors had to both invent solutions and, in some cases just as importantly, recognize problems overlooked (or not yet foreseen) by others in the fields of robotics and controls. Indeed, the inventors wish to emphasize the difficulty of recognizing those problems that are nascent and will become much more apparent in the future should trends in industry continue as the inventors expect. Further, because multiple problems are addressed, it should be understood that some embodiments are problem-specific, and not all embodiments address every problem with traditional systems described herein or provide every benefit described herein. That said, improvements that solve various permutations of these problems are described below.

Many dynamic mechanical systems are subject to tight volumetric constraints, for example, at the point of actuation. As robots become more feature rich and capable of more complex tasks, these improvements are often achieved by virtue of increased complexity and number of joints included in a robot. Increases in numbers of components and features for performing more varied tasks, even in a relatively large robot, are expected to further crowd the scarce space on robots and other dynamic mechanical systems. Further, positioning heavy actuators on distal portions of a kinematic chain can increase stress, add inertia to moving parts, and make it harder to dampen undesirable oscillations.

Robotic controls often rely on feedback indicative of robot state (e.g., joint positions) to control actuators, and in many cases, that feedback comes from the actuator itself, e.g., a step count of a stepper motor or a position reading from a potentiometer integrated into a servo motor. Stepper motors, servo motors, and other actuators that are capable of providing precise and consistent feedback data are often too large or costly to incorporate within one or more joints, members, or other components at a point of articulation due to tight volumetric constraints. Even where a stepper motor or other motor or actuator could be produced to satisfy tight volumetric constraints, this is typically achieved at the expense of other performance metrics, such as maximum torque or durability, not to mention cost of bespoke designs.

However, when the actuator is shifted up the kinematic chain (e.g., all the way to a base) to mitigate some of the issues noted above, this can make the feedback from the actuator indicative of joint position less accurate. Error compounds down the kinematic chain, making feedback from the actuator a poor proxy for direct measurement of joint position at the joint itself. (Again, none of which is to suggest that these or any other approaches are disclaimed.) For example, while a stepper motor or other actuation component traditionally used in such applications may be driven with a high degree of precision based on similarly precise feedback data from a stepper motor (e.g., which typically justifies their utilization despite their expense relative to other components), moving the actuation point disparate from the articulation point may drastically increase error in feedback data, often both initially and over time, e.g., due to wear, losses, tolerance, or other issues. Lack of precision and changes in feedback data can cause difficulties in determining (e.g., during training) associated parameters of control models or cause associated parameters of trained control models to be or become suboptimal (e.g., causing errors).
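The way error compounds down a kinematic chain can be illustrated with a short planar forward-kinematics sketch. This is only an illustration under stated assumptions: the link lengths, joint angles, and the uniform per-joint error are made-up values, and the function names are hypothetical rather than part of any described embodiment.

```python
import math


def fingertip_position(link_lengths, joint_angles):
    """Planar forward kinematics: return (x, y) of the tip of the last link.

    Each joint angle is measured relative to the previous link, so an
    error at a proximal joint rotates every link that follows it.
    """
    x = y = 0.0
    heading = 0.0
    for length, angle in zip(link_lengths, joint_angles):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y


def tip_error(link_lengths, true_angles, per_joint_error):
    """Distance between the true tip and the tip computed from joint angles
    that are each off by per_joint_error radians (hypothetical error model)."""
    estimated = [a + per_joint_error for a in true_angles]
    tx, ty = fingertip_position(link_lengths, true_angles)
    ex, ey = fingertip_position(link_lengths, estimated)
    return math.hypot(ex - tx, ey - ty)


# Illustrative values only: three 30 mm links, each joint flexed 20 degrees,
# and a 2 degree error in every joint angle inferred from the actuator side.
links_mm = [30.0, 30.0, 30.0]
angles = [math.radians(20.0)] * 3
err = math.radians(2.0)

one_joint = tip_error(links_mm[:1], angles[:1], err)
three_joints = tip_error(links_mm, angles, err)
```

In this sketch the tip error for the three-link chain is several times the error for a single link, because each angular error is carried through every subsequent link.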

Some embodiments mitigate these and other issues with shifting actuators up the kinematic chain by co-locating rotary potentiometers (or other position sensors) at the joint axes and integrating their output into the control loop for driving an actuator (e.g., in some cases, integrated potentiometers in servos or step counters in steppers may be omitted, disregarded, or supplemented with the on-joint measurements). Indeed, some embodiments may use a motor without integrated position sensing (e.g., non-servo, non-stepper motor) with a control circuit taking the potentiometer position as its control signal. By spatially decoupling the encoder from the driving motor, some embodiments effectively create a physically distributed servo device, which is expected to be particularly well suited for (but not limited to, which is not to suggest other descriptions are limiting) cable driven control systems. In some cases, such “distributed servos” are expected to be less expensive and more precise than systems exclusively using integrated servos (e.g., with feedback sensing, motor control, and the motor in a single housing dedicated to the servo). (Again, none of which is to suggest that these or any other approaches are disclaimed.)

As discussed in more detail below in connection with FIGS. 2-4, a robot system (e.g., the robot system 302 (FIG. 3), the robot system 219 (FIG. 2A), etc.) may be trained using machine learning (e.g., reinforcement learning) to perform tasks. Performing a task in the real world presents a challenge to reinforcement learning because of the large state space (e.g., the large number of actions that a robot can perform, the many positions or locations a robot can be in may be too numerous, etc.). To reduce the state space (e.g., which may make it easier to train a robot system), two portions of a robot hand may be joined such that joint motion is coupled together (e.g., one portion of a finger moves when a second portion of a finger moves). A physical mechanism may be used to mechanically couple distal finger joint motion together. For example, an s-bar may be used to join two joints together. Joining two joints using a mechanism (e.g., an s-bar) may allow the finger to still curl around an object (e.g., which may allow the robot to grasp objects), while removing the need for independent actuation (e.g., due to volumetric constraints or other factors, such as cost or complexity). This may reduce the cost and complexity for machine learning training (e.g., as described in connection with FIGS. 2-4). In some embodiments, alternative implementations such as a rubberized, or flexible, linkage may be used, for example, to allow for selective compliance at one or more joints (e.g., the outermost joints of the robot hand). A rubberized, or other flexible linkage may allow the robot to create a more robust grasp.
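As a rough illustration of how coupling two joints may shrink the state space, the following sketch treats the distal joint angle as a fixed function of the intermediate joint angle and counts the states of a coarsely discretized finger. The linear coupling ratio, the discretization, and all names are assumptions made for illustration; an actual s-bar linkage need not produce a linear relationship.

```python
def finger_joint_angles(proximal, intermediate, coupling_ratio=1.0):
    """Return (proximal, intermediate, distal) joint angles in radians.

    The distal angle is not commanded independently; it follows the
    intermediate angle through a hypothetical linear coupling ratio.
    """
    distal = coupling_ratio * intermediate
    return proximal, intermediate, distal


def grid_state_count(independent_joints, steps_per_joint):
    """Number of distinct poses when each independent joint is discretized
    into steps_per_joint positions."""
    return steps_per_joint ** independent_joints


# Three independently driven joints versus two (distal coupled to intermediate).
uncoupled_states = grid_state_count(3, 10)
coupled_states = grid_state_count(2, 10)
```

With ten positions per joint, one finger drops from 1000 discretized poses to 100 when the distal joint is coupled, and the reduction multiplies across fingers.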

FIG. 1A shows a cross-sectional view of an example robot hand 100 (e.g., the pinky of the robot hand 100). An S-bar 102 may be used to mechanically couple a distal phalange 109 with an intermediate phalange 108. The s-bar may be attached to the distal phalange 109 at location 101 and may be attached to the intermediate phalange 108 at location 103. A thumb 107 and a wrist 140 are shown for reference. The s-bar 102 is attached to the pinky of the robot hand 100. The s-bar 102 may be made out of a rubberized material (or other flexible material), metal, plastic, fiberglass, or a variety of other materials.

FIG. 1B shows an additional view of the robot hand 100. The s-bar 102 may be attached to a finger of the hand at location 101 and location 103.

FIG. 1C shows an additional view of an index finger 141 of the robot hand 100. The s-bar 105 may be attached to the index finger 141 of the hand at location 104 and location 106.

FIG. 1D shows an additional view of the robot hand 100 with the palm 142 of the robot facing up. S-bar mechanisms 110-113 may be used to mechanically couple corresponding intermediate and distal phalanges on fingers of the hand.

FIG. 1E shows an additional view of an s-bar mechanism 114 which may be used to mechanically couple an intermediate and distal phalange of a thumb of the robot hand 100.

FIG. 1F shows a zoomed-in view of the hand 100. An s-bar 110 may be used to mechanically couple finger joint motion of the hand 100.

FIG. 1G shows a zoomed-out view of the robot hand 100. The joint motion of one or more fingers may be mechanically coupled using an s-bar mechanism (e.g., such as the s-bar 120).

FIG. 1H shows an angled view of the pinky of the robot hand 100. An s-bar 121 may be used to mechanically couple joint motion in the pinky. The s-bar 121 may be attached to the pinky of the robot hand 100 at a location that is behind a potentiometer 122.

FIG. 1I shows a top-down view of the robot hand 100. A portion of s-bars 130-133 can be seen via the top-down view shown.

In connection with or separate from the above aspects pertaining to an s-bar, an actuator may be coupled to a member or a joint (or joints) like that described above, or another joint, to actuate one or more members. Due to volumetric space constraints, the actuator (e.g., like a motor) may be located disparately from a point of articulation (e.g., a joint) and coupling may be provided via a linkage, like a tendon, such as a cable. Other example linkages may include one or more rigid bars or one or more gears. As a result, the point of actuation (e.g., location of the actuator) may be disparately located from the point of articulation (e.g., location of the joint that is actuated). In order to address issues like those noted above, among others, example embodiments disclosed herein provide a sensor located at the point of articulation to provide feedback data corresponding to the actuator. In other words, embodiments spatially decouple an actuator from a sensor measurement point corresponding to the actuator (e.g., for encoding and processing within a control loop).

FIG. 1A, as mentioned above, shows a robot hand of a robotic system. The robot hand may have human-like proportions, and thus may be representative of an application in which one or more components of a robotic system operate under tight volumetric constraints at the point of actuation (e.g., joints of a biomimetic humanoid robot hand). In many cases, it is infeasible to include servos or motors directly at the final joints (e.g., either within the joints or members) of a kinematic chain. Example embodiments may locate servos or motors (e.g., actuators) spatially separate from the points of actuation, such as via means of compact force transmission, such as a cable-driven tendon.

For example, a tendon may be coupled to member 108 (e.g., like a component of a finger) to cause the member 108 to rotate via joint 170A relative to another member (e.g., 167, corresponding to a hand/palm), such as to grasp an object. The tendon may be coupled to the member 108 at a point along its length, or at the joint 170A. Example embodiments may include a plurality of joints, e.g., 170A, 170B, 170C to which members in a chain of members are coupled. Tasks assigned to a robot may require actuation of one or more members in a chain.

While actuation can be effectively physically separated from the joint in question, such as via one or more tendons, machine learning algorithms that control the actions by which a robot performs a task may require precise knowledge of physical joint position (e.g., to determine information about the members coupled by the joint). Traditionally, by employing a motor at the joint or member coupled to the joint, an in-servo encoder of the motor at the driven joint may provide precise feedback data. Relocation of the motor to a spatially distanced location from the driven joint, as explained above, may diminish the precision (or accuracy) of feedback data.

Some example embodiments may implement a sensor, like a position sensor, within or coupled to the joint by which amount of rotation and thus position of a joint or member coupled to the joint may be determined. One example position sensor may be a rotary potentiometer disposed at a joint axis. Other examples of position sensors may include stretch potentiometers, capacitive-based position sensors, or optical position sensors.

In some example embodiments, a position sensor may output a signal or reading indicative of a given position or orientation or by which a given position or orientation may be determined. For example, a sensor may output signals (e.g., a voltage indicative of position) that correspond to joint or member position measurements, and those measurements may be provided as feedback data into the control loop for driving an actuator spatially distanced from the articulation point.

Some embodiments may implement an actuator, e.g., a motor, without a servo, or that is otherwise less precise than those previously employed (e.g., to reduce cost), because the output of the sensor at the joint may be obtained as a measurement of position from which a control signal for a control circuit of the motor may be determined. In other words, the sensor positioned at the point of articulation may provide an encoder with signals by which control signals for driving a motor may be determined. By replacing high-precision, pre-made servos with an additional layer of control logic, implemented above the motor/drive circuitry and based on measurement signals at the joint, system cost may be reduced while the physically separate sensor device may maintain high-precision measurements at the point of articulation for system control.
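A minimal sketch of such a control layer is given below, assuming a simple proportional loop closed on the joint-side reading. The simulated joint, its transmission loss factor, the gain, and the time step are all hypothetical values chosen for illustration, not parameters of any described embodiment.

```python
class SimulatedTendonJoint:
    """Toy plant standing in for a tendon-driven joint (hypothetical model).

    Only a fraction of the commanded motion reaches the joint, mimicking
    losses in the tendon path that the controller does not know about.
    """

    def __init__(self, transmission=0.8, travel_deg=90.0):
        self.transmission = transmission
        self.travel_deg = travel_deg
        self.angle_deg = 0.0

    def drive(self, rate_deg_per_s, dt):
        """Apply a motor command for dt seconds, clamped to joint travel."""
        moved = self.transmission * rate_deg_per_s * dt
        self.angle_deg = min(max(self.angle_deg + moved, 0.0), self.travel_deg)

    def read_joint_sensor(self):
        """Angle reported by the sensor located at the joint, in degrees."""
        return self.angle_deg


def servo_to(joint, target_deg, gain=5.0, dt=0.01, steps=500):
    """Proportional loop closed on the joint-side reading rather than on
    any position estimate from the motor itself."""
    for _ in range(steps):
        error = target_deg - joint.read_joint_sensor()
        joint.drive(gain * error, dt)
    return joint.read_joint_sensor()


joint = SimulatedTendonJoint()
final_angle = servo_to(joint, target_deg=45.0)
```

Because the loop is closed on the measurement at the joint, the unknown transmission loss only slows convergence in this sketch; it does not leave a steady-state position error.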

FIG. 1B, as mentioned above, shows an additional view of the robot hand of a robotic system. Also shown is an example joint 170. The example joint 170 may be subject to relatively tight volumetric space constraints and driven via a tendon coupled to a disparately located actuator. The actuator (not shown) may drive (e.g., pull on) a tendon coupled to member 108 (or component of the joint 170 coupled to member 108). Driving a tendon may thus cause the member 108 to move, such as by rotation 181 around the joint.

Some example embodiments of a joint 170 may include a housing 171 for a sensor. Thus, for example, a sensor, like a position sensor, may determine a position of a member 108 as it rotates 181 in relation to the joint 170 (or another member, e.g., 167). Some example embodiments of a housing 171 of a joint 170 for a sensor may include a shape corresponding to that of a body of the sensor or an index point 177 by which the orientation of a sensor may be fixed within the housing 171. Some embodiments of a housing 171 may include one or more channels 179 by which sensor leads (e.g., like wires) may be guided out from the joint 170. In some example embodiments, a member 108 may be coupled to or include a shaft interface 175 by which it is coupled to and rotates within the joint 170. The shaft may be supported within the joint 170 by one or more bushings. In some alternative embodiments, the member 108 may be coupled to the bushings and the shaft may be coupled to another member 169 to which member 108 rotates in relation.

FIG. 1J shows an example view of a finger of a robot system and joints upon which example techniques for determining position of space constrained joints may be implemented in accordance with some example embodiments. As shown, a finger (or other appendage) of a robot may have a number of joints 171A-171C having respective members that may be driven to rotate 181A-181C around their respective joint axis. Control of the finger (or other appendage) may rely on accurate position information corresponding to the joints 171A-171C for various tasks, such as grabbing or otherwise manipulating an object. Example joints 171A-171C may be subject to tight space constraints that are prohibitive to the inclusion of actuators at respective points of joint articulation.

Member 108 may rotate 181 in relation to joint 170 or another member 167. Member 108 may be coupled to a shaft interface 175 which rotates with the member 108. The shaft interface 175 may include splines, or a cut face, to which a sensor component may be coupled. In other embodiments, the shaft interface 175 may be a component of the sensor and coupled to the member 108, such as via one or more splines or a cut face. In either example, the member 108 and the shaft interface 175 may rotate relative to a sensor housing 171.

A body 191 of the sensor may be disposed within the sensor housing 171. In some examples, the sensor housing 171 is shaped or includes an index 177 to retain the body 191 of the sensor in position when the shaft interface 175 rotates.

Some example embodiments of a sensor may include an arm 193 coupled to the shaft interface 175. The arm 193 may be conductive (e.g., efficiently convey an electrical current) or include a conductive portion at an interface 194 that engages a track 192 of resistive material (e.g., resists an electrical current relative to the conductive material). The track 192 of resistive material may be a carbon-based or other semi-resistive material.

Considering a track 192 of resistive material having a resistance R between a first lead 196A and a second lead 196B (e.g., like a V+ voltage and a V ground, respectively), interface 194 may intersect with track 192 at a given position based on a position (e.g., rotation) of member 108 to provide an output voltage tap (e.g., measurement) based on input voltage and the RA and RB values (e.g., where RA+RB=R of the track 192) resulting from the position of the interaction.

Interface 194 of the arm may be coupled to a third lead 195, which may be an output, such as an output indicative of a position of the interface 194 along the track. For example, a Vout of the sensor may be measured at lead 195 based on a Vin of the voltage across leads 196A and 196B and the position of the conductive interface 194 along the track 192. E.g.:

Vout=Vin*Rb/(Rtrack)

where the resistance value Rb changes based on position of the conductive interface 194 because of rotation of the member 108 and shaft interface 175. Rb may change linearly in accordance with a ratio of resistance to rotation (although logarithmic or other scaling could be utilized). Thus, different positions (e.g., rotation) of the shaft interface 175 may be related to each other based on their respective Vout values.
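The relation above can be sketched in code as follows, assuming a linear taper so that Rb/Rtrack equals the fraction of rotational travel. The supply voltage, track resistance, travel range, and function names are illustrative assumptions only.

```python
def wiper_voltage(v_in, r_b, r_track):
    """Voltage divider output: Vout = Vin * Rb / Rtrack."""
    return v_in * r_b / r_track


def angle_from_voltage(v_out, v_in, travel_deg):
    """Recover rotation from Vout, assuming Rb changes linearly with rotation.

    The ratio is clamped to [0, 1] so that small measurement noise at the
    ends of the track cannot produce an angle outside the travel range.
    """
    if v_in <= 0:
        raise ValueError("v_in must be positive")
    ratio = min(max(v_out / v_in, 0.0), 1.0)
    return ratio * travel_deg


# Illustrative values only: 3.3 V supply, 10 kOhm track, wiper at 2.5 kOhm,
# and 270 degrees of electrical travel.
v_out = wiper_voltage(3.3, 2500.0, 10000.0)
angle = angle_from_voltage(v_out, 3.3, 270.0)
```

A logarithmic or other taper would replace the linear mapping in angle_from_voltage with the corresponding inverse function or a calibration table.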

FIG. 1L shows an example view of a position sensor for determining position of space constrained joints in accordance with some example embodiments.

Other example sensor types may be utilized to output a position measurement. FIG. 1K illustrates an example member 154B and shaft interface 154A, which may rotate relative to a sensor 152 coupled to a joint 151. Rotation of the shaft interface 154A may cause a corresponding rotation of a dial 153. Sensor 152 may read, e.g., optically, magnetically, capacitively or via a conductive interface, a value indicative of a position of the dial 153 and thus the shaft interface 154A and corresponding member 154B based on their rotation 182 relative to the joint 151.

In some example embodiments, the dial 153 may include a code (or codes) or pattern that may be read by a sensor 154 to determine a position of the dial. For example, the dial 153 may include a pattern of lines corresponding to copper tracks etched in a PCB strip. The sensor 154 may also include a pattern of lines corresponding to copper tracks etched in a PCB. The sensor 154 may be positioned proximate to the dial 153 and the patterns may form a variable capacitor. As the dial 153 moves relative to the sensor 154, the sensor 154 may detect changes in capacitance to determine a measurement indicative of the position of the dial 153 relative to the sensor 154.

In another example embodiment, the dial 153 may include a pattern of lines or dots or other markings that may be read optically. For example, the sensor 154 may be an optical sensor and track movement of the dial 153 or read a pattern to determine a position of the dial 153 relative to the sensor 154.
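One common way a coded dial can be read as an absolute position is a Gray code, in which adjacent sectors differ by a single bit. The disclosure does not specify the code used, so the following is only a sketch under that assumption, with hypothetical function names and a made-up 4-bit dial.

```python
def gray_to_binary(gray):
    """Convert a reflected Gray code value to its plain binary index."""
    binary = 0
    while gray:
        binary ^= gray
        gray >>= 1
    return binary


def dial_angle_deg(gray_code, bits):
    """Map a Gray-coded sector reading to an angle in degrees, assuming the
    dial is divided into 2**bits equal sectors."""
    sectors = 1 << bits
    return gray_to_binary(gray_code) * 360.0 / sectors


# A 4-bit dial has 16 sectors of 22.5 degrees each; code 0b0110 is sector 4.
angle = dial_angle_deg(0b0110, bits=4)
```

The single-bit transitions mean a reading taken while the sensor straddles two sectors is off by at most one sector.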

FIG. 1M shows an example view of a position sensor for determining position of space constrained joints in accordance with some example embodiments.

In some examples, one or more sensors 152 may be employed to track movement of a member 154B within a joint 151 with multiple degrees of freedom. Rather than a shaft/bushing type interface, an example member 154B may include a ball 154A interface with a joint 151 and rotate with multiple degrees of freedom within the single joint. In some examples, the ball 154A may be engraved with a pattern (e.g., on its surface) by which one or more sensors 152 may optically, capacitively, or magnetically track its position with multiple degrees of freedom. In some examples, such as for optical sensors 152, a position and orientation of one or more points of a pattern on the ball detected by one or more sensors may be read to determine position and orientation of the member 154B. For example, a pattern of three or more points, like a constellation, may be analyzed to determine position and orientation information.
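As one possible illustration of recovering orientation from a three-point constellation, the sketch below builds an orthonormal frame from the three points in a reference pose and in the observed pose, then composes the two frames into a rotation matrix (a TRIAD-style construction). The choice of this method, the point coordinates, and all names are assumptions; the disclosure does not state how the pattern is analyzed, and the sketch assumes noise-free, non-collinear points.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(a):
    n = math.sqrt(sum(c * c for c in a))
    return [c / n for c in a]


def _triad(points):
    """Orthonormal frame built from three non-collinear points."""
    p1, p2, p3 = points
    t1 = _unit(_sub(p2, p1))
    t2 = _unit(_cross(t1, _sub(p3, p1)))
    t3 = _cross(t1, t2)
    return t1, t2, t3


def rotation_from_constellation(reference_points, observed_points):
    """Rotation matrix R such that observed = R * reference for each point."""
    ref = _triad(reference_points)
    obs = _triad(observed_points)
    return [[sum(obs[k][i] * ref[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Demo: rotate a made-up constellation 30 degrees about z and recover it.
c, s = math.cos(math.radians(30.0)), math.sin(math.radians(30.0))
true_rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
reference = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
observed = [[sum(true_rotation[i][j] * p[j] for j in range(3)) for i in range(3)]
            for p in reference]
estimated = rotation_from_constellation(reference, observed)
```

With more than three points or noisy detections, a least-squares fit over all points would typically be preferred over this exact construction.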

FIG. 2A shows an example computing system for training robots to perform tasks. The system 200 may include a robot 216. The robot 216 may include any component of the robot system 302 discussed below in connection with FIG. 3. The robot 216 may include a hand such as the robot hand 100 or fingers discussed above in connection with FIGS. 1A-1M. In some example embodiments, S-bars discussed herein (e.g., with reference to FIGS. 1A-1I) may be used to reduce state space or increase the efficiency of training one or more machine learning models discussed in connection with FIGS. 2-3. An encoder which determines vectors corresponding to robot state within a state space may take input from sensors (e.g., as discussed with reference to FIGS. 1A, 1B and 1J-1M and elsewhere herein) that are disposed at points of articulation that are physically distanced from the actuators that drive the articulated components. The robot 216 may be an anthropomorphic robot (e.g., with legs, arms, hands, or other parts), like those described in the application incorporated by reference. 
The robot may be an articulated robot (e.g., an arm having two, six, or ten degrees of freedom, etc.), a Cartesian robot (e.g., rectilinear or gantry robots, robots having three prismatic joints, etc.), Selective Compliance Assembly Robot Arm (SCARA) robots (e.g., with a donut shaped work envelope, with two parallel joints that provide compliance in one selected plane, with rotary shafts positioned vertically, with an end effector attached to an arm, etc.), delta robots (e.g., parallel link robots with parallel joint linkages connected with a common base, having direct control of each joint over the end effector, which may be used for pick-and-place or product transfer applications, etc.), polar robots (e.g., with a twisting joint connecting the arm with the base and a combination of two rotary joints and one linear joint connecting the links, having a centrally pivoting shaft and an extendable rotating arm, spherical robots, etc.), cylindrical robots (e.g., with at least one rotary joint at the base and at least one prismatic joint connecting the links, with a pivoting shaft and extendable arm that moves vertically and by sliding, with a cylindrical configuration that offers vertical and horizontal linear movement along with rotary movement about the vertical axis, etc.), a self-driving car, a kitchen appliance, construction equipment, or a variety of other types of robots. The robot 216 may include one or more cameras, joints, servomotors, stepper motors, pneumatic actuators, or any other component discussed in U.S. patent application Ser. No. 16/918,999, filed 1 Jul. 2020, titled “Artificial Intelligence-Actuated Robot,” which is incorporated by reference in its entirety. The robot 216 may communicate with the agent 215, and the agent 215 may be configured to send actions determined via the policy 222. The policy 222 may take as input the state (e.g., a vector representation generated by the encoder model 203) and return an action to perform.

The robot 216 may send sensor data to the encoder model 203, e.g., via the agent 215. The encoder model 203 may take as input the sensor data from the robot 216. The encoder model 203 may use the sensor data to generate a vector representation (e.g., a space embedding) indicating the state of the robot. The encoder model 203 may be trained via the encoder trainer 204. The encoder model may use the sensor data to generate a space embedding (e.g., a vector representation) indicating the state of the robot or the environment around the robot periodically (e.g., 30 times per second, 10 times per second, every two seconds, etc.). A space embedding may indicate a current position or state of the robot (e.g., the state of the robot after performing an action to turn a door handle). A space embedding may reduce the dimensionality of data received from sensors. For example, if the robot has multiple color 1080p cameras, touch sensors, motor sensors, or a variety of other sensors, then input to an encoder model for a given state of the robot (e.g., output from the sensors for a given time slice) may be tens of millions of dimensions. The encoder model may reduce the sensor data to a space embedding in an embedding space (e.g., a space between 10 and 2000 dimensions in some embodiments). Distance between a first space embedding and a second space embedding may preserve the relative dissimilarity between the state of a robot associated with the first space embedding and the state of a robot (which may be the same or a different robot) associated with the second space embedding.
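The dimensionality figures above can be checked with simple arithmetic, and the notion of distance between space embeddings can be illustrated with a Euclidean distance. The camera count, the use of Euclidean distance, and the function names below are assumptions for illustration; the disclosure does not fix a particular distance measure.

```python
import math


def raw_dimensions(num_cameras, width=1920, height=1080, channels=3, other=0):
    """Count of scalar inputs per time slice from color cameras plus any
    other sensor channels (e.g., joint position readings)."""
    return num_cameras * width * height * channels + other


def embedding_distance(a, b):
    """Euclidean distance between two space embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Four color 1080p cameras alone contribute about 24.9 million dimensions.
camera_dims = raw_dimensions(num_cameras=4)

# Two toy 2-dimensional embeddings; real embeddings would be much larger.
d = embedding_distance([0.0, 0.0], [3.0, 4.0])
```

Under these assumptions, an encoder mapping roughly 25 million inputs to, say, a few hundred dimensions reduces the input size by about five orders of magnitude.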

The anomaly detection model 209 may receive vector representations from the encoder model 203 and determine whether each vector representation is anomalous or not. Although only one encoder model 203 is shown in FIG. 2A, there may be multiple encoder models. A first encoder model may send space embeddings to the anomaly detection model 209 and a second encoder model may send space embeddings to other components of the system 200.

The dynamics model 212 may be trained by the dynamics trainer 213 to predict a next state given a current state and action that will be performed in the current state. The dynamics model may be trained by the dynamics trainer 213 based on data from expert demonstrations (e.g., performed by the teleoperator).
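As a minimal sketch of predicting a next state from a current state and an action, the following assumes a scalar state, a scalar action, and a linear model fit by per-sample gradient steps on demonstration data. The linear form, the function names, and the synthetic demonstrations are assumptions made for illustration and are not a description of the dynamics trainer 213.

```python
def train_dynamics(demonstrations, epochs=200, learning_rate=0.1):
    """Fit next_state ~ a * state + b * action from (state, action, next_state) tuples."""
    a, b = 0.0, 0.0
    for _ in range(epochs):
        for state, action, next_state in demonstrations:
            error = (a * state + b * action) - next_state
            a -= learning_rate * error * state
            b -= learning_rate * error * action
    return a, b


def predict_next_state(params, state, action):
    """Predict the next state given the current state and the action to be performed."""
    a, b = params
    return a * state + b * action


# Hypothetical demonstrations generated from next_state = 0.9 * state + 0.5 * action.
demonstrations = [(s, u, 0.9 * s + 0.5 * u)
                  for s in (-1.0, 0.0, 1.0)
                  for u in (-1.0, 0.0, 1.0)]
params = train_dynamics(demonstrations)
print(predict_next_state(params, state=1.0, action=1.0))
```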

The actor-critic model 206 may be a reinforcement learning model. The actor-critic model 206 may be trained by the actor-critic trainer 207. The actor-critic model 206 may be used to determine actions for the robot 216 to perform. For example, the actor-critic model 206 may be used to adjust the policy by changing what actions are performed given an input state.

The actor-critic model 206 and the encoder model 203 may be configured to train based on outputs generated by each model 206 and model 203. For example, the system 200 may adjust a first weight of the encoder model 203 based on an action determined by a reinforcement learning model (e.g., the actor-critic model 206). Additionally or alternatively, the system 200 may adjust a second weight of the reinforcement learning model (e.g., the actor-critic model 206) based on the state (e.g., a space embedding) generated via the encoder model 203.

The reward model 223 may take as input a state of the robot 216 (e.g., the state may be generated by the encoder model 203) and output a reward. The robot 216 may receive a reward for completing a task or for making progress towards completing the task. The output from the reward model 223 may be used by the actor-critic trainer 207 and actor-critic model 206 to improve the ability of the model 206 to determine actions that will lead to the completion of a task assigned to the robot 216. The reward trainer 224 may train the reward model 223 using data received via the teleoperation system 219 or via sampling data stored in the experience buffers 226. The teleoperation system 219 may be the teleoperation system 304 discussed below in connection with FIG. 3. In some embodiments, the system 200 may adjust a weight or bias of the reinforcement learning model (e.g., the actor-critic model 206), such as a deep reinforcement learning model, in response to determining that a space embedding (e.g., generated by the encoder model 203) corresponds to an anomaly. Adjusting a weight of the reinforcement learning model may reduce a likelihood of the robot performing an action that leads to an anomalous state.

The experience buffers 226 may store data corresponding to actions taken by the robot 216 (e.g., actions, observations, and states resulting from the actions). The data may be used to determine rewards and train the reward model 223. Additionally or alternatively, the data stored by the experience buffers 226 may be used by the actor-critic trainer to train the actor-critic model 206 to determine actions for the robot 216 to perform. The teleoperation system 219 may be used by the teleoperator 220 to control the robot 216. The teleoperation system 219 may be used to record demonstrations of the robot performing the task. The demonstrations may be used to train the robot 216 and may include sequences of observations generated via the robot 216 (e.g., cameras, touch sensors, sensors in servomechanisms, or other parts of the robot 216).
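One possible arrangement of an experience buffer is a fixed-capacity queue of transitions from which training batches are sampled. The sketch below assumes such an arrangement; the Transition fields, the ExperienceBuffer class, and the capacity are hypothetical and are not drawn from the experience buffers 226 as implemented.

```python
import random
from collections import deque, namedtuple

# Hypothetical record of one step taken by the robot.
Transition = namedtuple(
    "Transition", ["observation", "action", "reward", "next_observation"])


class ExperienceBuffer:
    """Fixed-capacity store of transitions; the oldest entries are evicted first."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def add(self, transition):
        self._items.append(transition)

    def sample(self, batch_size, rng=random):
        """Return up to batch_size transitions chosen without replacement."""
        items = list(self._items)
        return rng.sample(items, min(batch_size, len(items)))

    def __len__(self):
        return len(self._items)


buffer = ExperienceBuffer(capacity=3)
for step in range(5):
    buffer.add(Transition(observation=step, action="grip", reward=0.0,
                          next_observation=step + 1))
batch = buffer.sample(batch_size=2)
print(len(buffer), len(batch))
```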

One or more machine learning models discussed herein may be implemented (e.g., in part), for example, as described in connection with the machine learning model 242 of FIG. 2B. With respect to FIG. 2B, machine learning model 242 may take inputs 244 and provide outputs 246. In one use case, outputs 246 may be fed back to machine learning model 242 as input to train machine learning model 242 (e.g., alone or in conjunction with user indications of the accuracy of outputs 246, labels associated with the inputs, or with other reference feedback and/or performance metric information). In another use case, machine learning model 242 may update its configurations (e.g., weights, biases, or other parameters) based on its assessment of its prediction (e.g., outputs 246) and reference feedback information (e.g., user indication of accuracy, reference labels, or other information). In another example use case, where machine learning model 242 is a neural network, connection weights may be adjusted to reconcile differences between the neural network's prediction and the reference feedback. In a further use case, one or more neurons (or nodes) of the neural network may require that their respective errors are sent backward through the neural network to them to facilitate the update process (e.g., backpropagation of error). Updates to the connection weights may, for example, be reflective of the magnitude of error propagated backward after a forward pass has been completed. In this way, for example, the machine learning model 242 may be trained to generate results (e.g., response time predictions, sentiment identifiers, urgency levels, etc.) with better recall, accuracy, and/or precision.
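The backward propagation of error described above may be illustrated, under simplifying assumptions, with a network of one hidden unit and one output unit. The tanh activation, the squared-error loss, the learning rate, and all function names below are assumptions chosen for brevity rather than features of the machine learning model 242.

```python
import math


def forward(w1, w2, x):
    """Forward pass: one tanh hidden unit feeding one linear output unit."""
    hidden = math.tanh(w1 * x)
    return hidden, w2 * hidden


def loss(w1, w2, x, target):
    """Squared-error loss between the prediction and the reference feedback."""
    return 0.5 * (forward(w1, w2, x)[1] - target) ** 2


def backprop_step(w1, w2, x, target, learning_rate=0.1):
    """Propagate the output error backward and update both connection weights."""
    hidden, output = forward(w1, w2, x)
    error = output - target                         # gradient of the loss w.r.t. the output
    grad_w2 = error * hidden                        # output-layer weight gradient
    grad_w1 = error * w2 * (1.0 - hidden ** 2) * x  # error sent back through tanh
    return w1 - learning_rate * grad_w1, w2 - learning_rate * grad_w2


w1, w2 = 0.3, -0.2
before = loss(w1, w2, x=1.0, target=0.5)
w1, w2 = backprop_step(w1, w2, x=1.0, target=0.5)
after = loss(w1, w2, x=1.0, target=0.5)
print(before, after)
```

Each weight update is proportional to the error propagated back to that weight, so a larger discrepancy between prediction and reference feedback yields a larger adjustment.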

In some embodiments, the machine learning model 242 may include an artificial neural network. In such embodiments, machine learning model 242 may include an input layer and one or more hidden layers. Each neural unit of the machine learning model may be connected with one or more other neural units of the machine learning model 242. Such connections can be enforcing or inhibitory in their effect on the activation state of connected neural units. Each individual neural unit may have a summation function which combines the values of one or more of its inputs together. Each connection (or the neural unit itself) may have a threshold function that a signal must surpass before it propagates to other neural units. The machine learning model 242 may be self-learning or trained, rather than explicitly programmed, and may perform significantly better in certain areas of problem solving, as compared to computer programs that do not use machine learning. During training, an output layer of the machine learning model 242 may correspond to a classification, and an input known to correspond to that classification may be input into an input layer of the machine learning model 242. During testing, an input without a known classification may be input into the input layer, and a determined classification may be output. For example, the classification may be an indication of whether an action is predicted to be completed by a corresponding deadline or not. The machine learning model 242 trained by the ML subsystem 314 may include one or more embedding layers at which information or data (e.g., any data or information discussed above in connection with FIGS. 1-3) is converted into one or more vector representations. The one or more vector representations may be pooled at one or more subsequent layers to convert the one or more vector representations into a single vector representation.
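A single neural unit with a summation function and a threshold function, as described above, might be sketched as follows. The function name neural_unit and the particular weights and threshold are hypothetical; positive weights stand in for enforcing connections and negative weights for inhibitory connections.

```python
def neural_unit(inputs, weights, threshold):
    """Sum weighted inputs; propagate the sum only if it surpasses the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return total if total > threshold else 0.0


# The first connection is enforcing (positive), the second inhibitory (negative).
print(neural_unit([1.0, 1.0], [0.8, -0.5], threshold=0.2))  # sum surpasses the threshold
print(neural_unit([1.0, 1.0], [0.8, -0.7], threshold=0.2))  # sum is suppressed
```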

The machine learning model 242 may be structured as a factorization machine model. The machine learning model 242 may be a non-linear model and/or supervised learning model that can perform classification and/or regression. For example, the machine learning model 242 may be a general-purpose supervised learning algorithm that the system uses for both classification and regression tasks. Alternatively, the machine learning model 242 may include a Bayesian model configured to perform variational inference, for example, to predict whether an action will be completed by the deadline. The machine learning model 242 may be implemented as a decision tree and/or as an ensemble model (e.g., using random forest, bagging, adaptive booster, gradient boost, XGBoost, etc.).

FIG. 3 shows an example computing system 300 for using machine learning to train robots (e.g., the robot system 302, the robot 216, etc.) to perform tasks. The computing system 300 may include a robot system 302, a teleoperation system 304, or a server 306. The robot system 302 may include a communication subsystem 312, a machine learning (ML) subsystem 314, and sensors 316.

At least some of the sensors 316 may have an architecture like that of example sensors described herein, may provide position information corresponding to joints of the robot, and may be spatially decoupled from the actuators that control movement of the joints.

The ML subsystem 314 may include a plurality of machine learning models. For example, the ML subsystem 314 may pipeline an encoder and a reinforcement learning model that are collectively trained with end-to-end learning, the encoder being operative to transform relatively high-dimensional outputs of a robot's sensor suite into lower-dimensional vector representations of each time slice in an embedding space, and the reinforcement learning model being configured to update setpoints for robot actuators based on those vectors. Some embodiments of the ML subsystem 314 may include an encoder model, a dynamics model, an actor-critic model, a reward model, an anomaly detection model, or a variety of other machine learning models (e.g., any model described in connection with FIGS. 2A-2B, or ensembles thereof). One or more portions of the ML subsystem 314 may be implemented on the robot system 302, the server 306, or the teleoperation system 304. Although shown as distinct objects in FIG. 3, functionality described below in connection with the robot system 302, the server 306, or the teleoperation system 304 may be performed by any one of the robot system 302, the server 306, or the teleoperation system 304. The robot system 302, the server 306, or the teleoperation system 304 may communicate with each other via the network 350.

FIG. 4 is a physical architecture block diagram that shows an example of a computing device (or data processing system) by which some aspects of the above techniques may be implemented. Various portions of systems and methods described herein may include or be executed on one or more computer systems similar to computing system 1000. Further, processes and modules described herein may be executed by one or more processing systems similar to that of computing system 1000.

Computing system 1000 may include one or more processors (e.g., processors 1010a-1010n) coupled to system memory 1020, an input/output (I/O) device interface 1030, and a network interface 1040 via an input/output (I/O) interface 1050. A processor may include a single processor or a plurality of processors (e.g., distributed processors). A processor may be any suitable processor capable of executing or otherwise performing instructions. A processor may include a central processing unit (CPU) that carries out program instructions to perform the arithmetical, logical, and input/output operations of computing system 1000. A processor may execute code (e.g., processor firmware, a protocol stack, a database management system, an operating system, or a combination thereof) that creates an execution environment for program instructions. A processor may include a programmable processor. A processor may include general or special purpose microprocessors. A processor may receive instructions and data from a memory (e.g., system memory 1020). Computing system 1000 may be a uni-processor system including one processor (e.g., processor 1010a), or a multi-processor system including any number of suitable processors (e.g., 1010a-1010n). Multiple processors may be employed to provide for parallel or sequential execution of one or more portions of the techniques described herein. Processes, such as logic flows, described herein may be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating corresponding output. Processes described herein may be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit). Computing system 1000 may include a plurality of computing devices (e.g., distributed computer systems) to implement various processing functions.

I/O device interface 1030 may provide an interface for connection of one or more I/O devices 1060 to computer system 1000. I/O devices may include devices that receive input (e.g., from a user) or output information (e.g., to a user). I/O devices 1060 may include, for example, a graphical user interface presented on displays (e.g., a cathode ray tube (CRT) or liquid crystal display (LCD) monitor), pointing devices (e.g., a computer mouse or trackball), keyboards, keypads, touchpads, scanning devices, voice recognition devices, gesture recognition devices, printers, audio speakers, microphones, cameras, or the like. I/O devices 1060 may be connected to computer system 1000 through a wired or wireless connection. I/O devices 1060 may be connected to computer system 1000 from a remote location. I/O devices 1060 located on a remote computer system, for example, may be connected to computer system 1000 via a network and network interface 1040.

Network interface 1040 may include a network adapter that provides for connection of computer system 1000 to a network. Network interface 1040 may facilitate data exchange between computer system 1000 and other devices connected to the network. Network interface 1040 may support wired or wireless communication. The network may include an electronic communication network, such as the Internet, a local area network (LAN), a wide area network (WAN), a cellular communications network, or the like.

System memory 1020 may be configured to store program instructions 1100 or data 1110. Program instructions 1100 may be executable by a processor (e.g., one or more of processors 1010a-1010n) to implement one or more embodiments of the present techniques. Instructions 1100 may include modules of computer program instructions for implementing one or more techniques described herein with regard to various processing modules. Program instructions may include a computer program (which in certain forms is known as a program, software, software application, script, or code). A computer program may be written in a programming language, including compiled or interpreted languages, or declarative or procedural languages. A computer program may include a unit suitable for use in a computing environment, including as a stand-alone program, a module, a component, or a subroutine. A computer program may or may not correspond to a file in a file system. A program may be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program may be deployed to be executed on one or more computer processors located locally at one site or distributed across multiple remote sites and interconnected by a communication network.

System memory 1020 may include a tangible program carrier having program instructions stored thereon. A tangible program carrier may include a non-transitory computer readable storage medium. A non-transitory computer readable storage medium may include a machine readable storage device, a machine readable storage substrate, a memory device, or any combination thereof. Non-transitory computer readable storage medium may include non-volatile memory (e.g., flash memory, ROM, PROM, EPROM, EEPROM memory), volatile memory (e.g., random access memory (RAM), static random access memory (SRAM), synchronous dynamic RAM (SDRAM)), bulk storage memory (e.g., CD-ROM and/or DVD-ROM, hard-drives), or the like. System memory 1020 may include a non-transitory computer readable storage medium that may have program instructions stored thereon that are executable by a computer processor (e.g., one or more of processors 1010a-1010n) to cause the subject matter and the functional operations described herein. A memory (e.g., system memory 1020) may include a single memory device and/or a plurality of memory devices (e.g., distributed memory devices). Instructions or other program code to provide the functionality described herein may be stored on tangible, non-transitory computer readable media. In some cases, the entire set of instructions may be stored concurrently on the media, or in some cases, different parts of the instructions may be stored on the same media at different times.

I/O interface 1050 may be configured to coordinate I/O traffic between processors 1010a-1010n, system memory 1020, network interface 1040, I/O devices 1060, and/or other peripheral devices. I/O interface 1050 may perform protocol, timing, or other data transformations to convert data signals from one component (e.g., system memory 1020) into a format suitable for use by another component (e.g., processors 1010a-1010n). I/O interface 1050 may include support for devices attached through various types of peripheral buses, such as a variant of the Peripheral Component Interconnect (PCI) bus standard or the Universal Serial Bus (USB) standard.

Embodiments of the techniques described herein may be implemented using a single instance of computer system 1000 or multiple computer systems 1000 configured to host different portions or instances of embodiments. Multiple computer systems 1000 may provide for parallel or sequential processing/execution of one or more portions of the techniques described herein.

Those skilled in the art will appreciate that computer system 1000 is merely illustrative and is not intended to limit the scope of the techniques described herein. Computer system 1000 may include any combination of devices or software that may perform or otherwise provide for the performance of the techniques described herein. For example, computer system 1000 may include or be a combination of a cloud-computing system, a data center, a server rack, a server, a virtual server, a desktop computer, a laptop computer, a tablet computer, a server device, a client device, a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a vehicle-mounted computer, or a Global Positioning System (GPS), or the like. Computer system 1000 may also be connected to other devices that are not illustrated, or may operate as a stand-alone system. In addition, the functionality provided by the illustrated components may in some embodiments be combined in fewer components or distributed in additional components. Similarly, in some embodiments, the functionality of some of the illustrated components may not be provided or other additional functionality may be available.

Those skilled in the art will also appreciate that while various items are illustrated as being stored in memory or on storage while being used, these items or portions of them may be transferred between memory and other storage devices for purposes of memory management and data integrity. Alternatively, in other embodiments some or all of the software components may execute in memory on another device and communicate with the illustrated computer system via inter-computer communication. Some or all of the system components or data structures may also be stored (e.g., as instructions or structured data) on a computer-accessible medium or a portable article to be read by an appropriate drive, various examples of which are described above. In some embodiments, instructions stored on a computer-accessible medium separate from computer system 1000 may be transmitted to computer system 1000 via transmission media or signals such as electrical, electromagnetic, or digital signals, conveyed via a communication medium such as a network or a wireless link. Various embodiments may further include receiving, sending, or storing instructions or data implemented in accordance with the foregoing description upon a computer-accessible medium. Accordingly, the present techniques may be practiced with other computer system configurations.

In block diagrams, illustrated components are depicted as discrete functional blocks, but embodiments are not limited to systems in which the functionality described herein is organized as illustrated. The functionality provided by each of the components may be provided by software or hardware modules that are differently organized than is presently depicted, for example such software or hardware may be intermingled, conjoined, replicated, broken up, distributed (e.g. within a data center or geographically), or otherwise differently organized. The functionality described herein may be provided by one or more processors of one or more computers executing code stored on a tangible, non-transitory, machine readable medium. In some cases, notwithstanding use of the singular term “medium,” the instructions may be distributed on different storage devices associated with different computing devices, for instance, with each computing device having a different subset of the instructions, an implementation consistent with usage of the singular term “medium” herein. In some cases, third party content delivery networks may host some or all of the information conveyed over networks, in which case, to the extent information (e.g., content) can be said to be supplied or otherwise provided, the information may be provided by sending instructions to retrieve that information from a content delivery network.

The reader should appreciate that the present application describes several independently useful techniques. Rather than separating those techniques into multiple isolated patent applications, applicants have grouped these techniques into a single document because their related subject matter lends itself to economies in the application process. But the distinct advantages and aspects of such techniques should not be conflated. In some cases, embodiments address all of the deficiencies noted herein, but it should be understood that the techniques are independently useful, and some embodiments address only a subset of such problems or offer other, unmentioned benefits that will be apparent to those of skill in the art reviewing the present disclosure. Due to cost constraints, some techniques disclosed herein may not be presently claimed and may be claimed in later filings, such as continuation applications or by amending the present claims. Similarly, due to space constraints, neither the Abstract nor the Summary of the Invention sections of the present document should be taken as containing a comprehensive listing of all such techniques or all aspects of such techniques.

It should be understood that the description is not intended to limit the present techniques to the particular form disclosed, but to the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present techniques as defined by the appended claims. Further modifications and alternative embodiments of various aspects of the techniques will be apparent to those skilled in the art in view of this description. Accordingly, this description and the drawings are to be construed as illustrative only and are for the purpose of teaching those skilled in the art the general manner of carrying out the present techniques. It is to be understood that the forms of the present techniques shown and described herein are to be taken as examples of embodiments. Elements and materials may be substituted for those illustrated and described herein, parts and processes may be reversed or omitted, and certain features of the present techniques may be utilized independently, all as would be apparent to one skilled in the art after having the benefit of this description of the present techniques. Changes may be made in the elements described herein without departing from the spirit and scope of the present techniques as described in the following claims. Headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description.

As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). The words “include”, “including”, and “includes” and the like mean including, but not limited to. As used throughout this application, the singular forms “a,” “an,” and “the” include plural referents unless the content explicitly indicates otherwise. Thus, for example, reference to “an element” or “a element” includes a combination of two or more elements, notwithstanding use of other terms and phrases for one or more elements, such as “one or more.” The term “or” is, unless indicated otherwise, non-exclusive, i.e., encompassing both “and” and “or.” Terms describing conditional relationships, e.g., “in response to X, Y,” “upon X, Y,” “if X, Y,” “when X, Y,” and the like, encompass causal relationships in which the antecedent is a necessary causal condition, the antecedent is a sufficient causal condition, or the antecedent is a contributory causal condition of the consequent, e.g., “state X occurs upon condition Y obtaining” is generic to “X occurs solely upon Y” and “X occurs upon Y and Z.” Such conditional relationships are not limited to consequences that instantly follow the antecedent obtaining, as some consequences may be delayed, and in conditional statements, antecedents are connected to their consequents, e.g., the antecedent is relevant to the likelihood of the consequent occurring.
Statements in which a plurality of attributes or functions are mapped to a plurality of objects (e.g., one or more processors performing steps A, B, C, and D) encompasses both all such attributes or functions being mapped to all such objects and subsets of the attributes or functions being mapped to subsets of the attributes or functions (e.g., both all processors each performing steps A-D, and a case in which processor 1 performs step A, processor 2 performs step B and part of step C, and processor 3 performs part of step C and step D), unless otherwise indicated. Similarly, reference to “a computer system” performing step A and “the computer system” performing step B can include the same computing device within the computer system performing both steps or different computing devices within the computer system performing steps A and B. Further, unless otherwise indicated, statements that one value or action is “based on” another condition or value encompass both instances in which the condition or value is the sole factor and instances in which the condition or value is one factor among a plurality of factors. Unless otherwise indicated, statements that “each” instance of some collection have some property should not be read to exclude cases where some otherwise identical or similar members of a larger collection do not have the property, i.e., each does not necessarily mean each and every. Limitations as to sequence of recited steps should not be read into the claims unless explicitly specified, e.g., with explicit language like “after performing X, performing Y,” in contrast to statements that might be improperly argued to imply sequence limitations, like “performing X on items, performing Y on the X'ed items,” used for purposes of making claims more readable rather than specifying sequence. 
Statements referring to “at least Z of A, B, and C,” and the like (e.g., “at least Z of A, B, or C”), refer to at least Z of the listed categories (A, B, and C) and do not require at least Z units in each category. Unless specifically stated otherwise, as apparent from the discussion, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining” or the like refer to actions or processes of a specific apparatus, such as a special purpose computer or a similar special purpose electronic processing/computing device. Features described with reference to geometric constructs, like “parallel,” “perpendicular/orthogonal,” “square”, “cylindrical,” and the like, should be construed as encompassing items that substantially embody the properties of the geometric construct, e.g., reference to “parallel” surfaces encompasses substantially parallel surfaces. The permitted range of deviation from Platonic ideals of these geometric constructs is to be determined with reference to ranges in the specification, and where such ranges are not stated, with reference to industry norms in the field of use, and where such ranges are not defined, with reference to industry norms in the field of manufacturing of the designated feature, and where such ranges are not defined, features substantially embodying a geometric construct should be construed to include those features within 15% of the defining attributes of that geometric construct. The terms “first”, “second”, “third,” “given” and so on, if used in the claims, are used to distinguish or otherwise identify, and not to show a sequential or numerical limitation. 
As is the case in ordinary usage in the field, data structures and formats described with reference to uses salient to a human need not be presented in a human-intelligible format to constitute the described data structure or format, e.g., text need not be rendered or even encoded in Unicode or ASCII to constitute text; images, maps, and data-visualizations need not be displayed or decoded to constitute images, maps, and data-visualizations, respectively; speech, music, and other audio need not be emitted through a speaker or decoded to constitute speech, music, or other audio, respectively. Computer implemented instructions, commands, and the like are not limited to executable code and can be implemented in the form of data that causes functionality to be invoked, e.g., in the form of arguments of a function or API call. To the extent bespoke noun phrases (and other coined terms) are used in the claims and lack a self-evident construction, the definition of such phrases may be recited in the claim itself, in which case, the use of such bespoke noun phrases should not be taken as invitation to impart additional limitations by looking to the specification or extrinsic evidence.

In this patent, to the extent any U.S. patents, U.S. patent applications, or other materials (e.g., articles) have been incorporated by reference, the text of such materials is only incorporated by reference to the extent that no conflict exists between such material and the statements and drawings set forth herein. In the event of such conflict, the text of the present document governs, and terms in this document should not be given a narrower reading in virtue of the way in which those terms are used in other materials incorporated by reference.
