The disclosure relates to a robotic device and a method for controlling a robotic device to approach an unidentified object and autonomously identify one or more properties of the object without human interaction by learning active tactile perception through belief-space control.
Robots operating in an open world may encounter many unknown and/or unidentified objects and may be expected to manipulate them effectively. To achieve this, it may be useful for robots to infer the physical properties of unknown objects through physical interactions. The ability to measure these properties online may enable robots to operate robustly in the real world with open-ended object categories. A human might identify properties of an object by performing exploratory procedures such as pressing on objects to test for object hardness and lifting objects to estimate object mass. These exploratory procedures may be challenging to hand-engineer and may vary based on the type of object.
Provided are a robotic device and a method for controlling a robotic device to approach an unidentified object and autonomously identify one or more properties of the object, without human interaction, by learning active tactile perception through belief-space control.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments.
In accordance with an aspect of the disclosure, there is provided a method for identifying a property of an object including: obtaining sensor data from at least one sensor; identifying, using the sensor data, a property of interest of an object; training, using one or more neural networks, a model to predict a next uncertainty about a state of the object based on an action; and based on identifying the next uncertainty about the state of the object, controlling a movement of a robotic element to perform the action.
The training may include repeatedly performing the training until a convergence is identified based on a reduced training error.
The training may include minimizing a training loss by approximating a belief state.
The action may include pressing the object with the robotic element and obtaining readings from the at least one sensor.
The identifying the property of interest of the object may include pressing the object with the robotic element at multiple points of the object and obtaining readings from the at least one sensor.
The identifying the property of interest may include lifting the object with the robotic element.
The model may include a dynamics model and an observation model.
According to an aspect of the disclosure, there is provided an electronic device for identifying a property of an object including: at least one memory storing instructions; and at least one processor configured to execute the instructions to: obtain sensor data from at least one sensor; identify, using the sensor data, a property of interest of an object; train, using one or more neural networks, a model to predict a next state and observation of a system based on an action; and based on identifying a next uncertainty about the property of interest of the object, control a movement of a robotic element to perform the action.
The at least one processor may be further configured to repeatedly perform the training until a convergence is identified based on a reduced training error.
The at least one processor may be further configured to minimize a training loss by approximating a belief state.
The action may include pressing the object with the robotic element and obtaining readings from the at least one sensor.
The at least one processor may be further configured to identify the property of interest of the object by pressing the object with the robotic element at multiple points of the object and obtaining readings from the at least one sensor.
The at least one processor may be further configured to identify the property of interest by lifting the object with the robotic element.
The model may include a dynamics model and an observation model.
According to an aspect of the disclosure, there is provided a non-transitory computer readable storage medium that stores instructions to be executed by at least one processor to perform a method for identifying a property of an object including: obtaining sensor data from at least one sensor; identifying, using the sensor data, a property of interest of an object; training, using one or more neural networks, a model to predict a next state of the object based on an action; and based on identifying the next state of the object, controlling a movement of a robotic element to perform the action.
The training may include repeatedly performing the training until a convergence is identified based on a reduced training error.
The training may include minimizing a training loss by approximating a belief state.
The action may include pressing the object with the robotic element and obtaining readings from the at least one sensor.
The identifying the property of interest of the object may include pressing the object with the robotic element at multiple points of the object and obtaining readings from the at least one sensor.
The identifying the property of interest may include lifting the object with the robotic element.
The above and other aspects, features, and advantages of embodiments of the disclosure will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:
Embodiments of the present disclosure provide a robotic device and a method for controlling a robotic device for autonomously identifying one or more properties of an object.
As the disclosure allows for various changes and numerous examples, one or more embodiments will be illustrated in the drawings and described in detail in the written description. However, this is not intended to limit the disclosure to modes of practice, and it will be understood that all changes, equivalents, and substitutes that do not depart from the spirit and technical scope of the disclosure are encompassed in the disclosure.
In the description of the embodiments, detailed explanations of related art are omitted when it is deemed that they may unnecessarily obscure the essence of the disclosure. Also, numbers (for example, a first, a second, and the like) used in the description of the specification are identifier codes for distinguishing one element from another.
Also, in the present specification, it will be understood that when elements are “connected” or “coupled” to each other, the elements may be directly connected or coupled to each other, but may alternatively be connected or coupled to each other with an intervening element therebetween, unless specified otherwise.
Throughout the disclosure, it should be understood that when an element is referred to as “including” an element, the element may further include another element, rather than excluding the other element, unless mentioned otherwise.
In the present specification, regarding an element represented as a “unit,” “processor,” “controller,” or a “module,” two or more elements may be combined into one element or one element may be divided into two or more elements according to subdivided functions. This may be implemented by hardware, software, or a combination of hardware and software. In addition, each element described hereinafter may additionally perform some or all of functions performed by another element, in addition to main functions of itself, and some of the main functions of each element may be performed entirely by another component.
Throughout the disclosure, the expression “at least one of a, b or c” indicates only a, only b, only c, both a and b, both a and c, both b and c, all of a, b, and c, or variations thereof.
Embodiments may relate to a robotic device and a method for controlling a robotic device for autonomously identifying one or more properties of an object.
According to one or more embodiments, a method for autonomously learning active tactile perception policies, by learning a generative world model leveraging a differentiable Bayesian filtering algorithm, and designing an information-gathering model predictive controller is described herein.
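As a non-limiting illustration of the differentiable Bayesian filtering underlying the learned world model, the sketch below applies one predict/update step of a scalar filter over a latent object property (e.g., mass). The parameters A, B, C, q, and r are illustrative stand-ins for the learned dynamics and observation models, not part of the disclosure.

```python
def kalman_step(mu, var, action, y, A=1.0, B=0.0, C=1.0, q=1e-3, r=1e-2):
    """One predict/update step of a scalar Bayesian filter over a latent
    object property.  A, B, C, q, r stand in for learned models and are
    illustrative assumptions."""
    # Predict: propagate the belief (mean, variance) through the dynamics model
    mu_p = A * mu + B * action
    var_p = A * A * var + q
    # Update: fold in the new observation y through the observation model
    k = var_p * C / (C * C * var_p + r)
    mu_n = mu_p + k * (y - C * mu_p)
    var_n = (1.0 - k * C) * var_p
    return mu_n, var_n
```

Repeated updates with consistent observations concentrate the belief around the true property value while the belief variance shrinks, which is the quantity the information-gathering controller seeks to reduce.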
According to one or more embodiments, exploratory procedures are learned to estimate object properties through belief-space control. Using a combination of 1) learning-based state estimation to infer the property from a sequence of observations and actions, and 2) information-gathering model predictive control (MPC), a robot may learn to execute actions that are informative about the property of interest and to discover exploratory procedures without any human priors. According to one or more embodiments, a method may use three simulated tasks: mass estimation, height estimation, and toppling height estimation.
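The information-gathering controller may be illustrated with a minimal sketch, under the simplifying assumptions of a scalar belief and an observation whose informativeness scales with the magnitude of the action (e.g., a harder push yields a stronger force signal). The rollout and gain model below are illustrative, not the disclosure's exact controller.

```python
def expected_posterior_var(prior_var, obs_noise_var, signal_gain):
    """Scalar Bayesian update: belief variance after one observation whose
    informativeness about the property depends on the action via signal_gain."""
    if signal_gain == 0.0:
        return prior_var  # an uninformative action leaves the belief unchanged
    k = prior_var * signal_gain / (signal_gain**2 * prior_var + obs_noise_var)
    return (1.0 - k * signal_gain) * prior_var

def info_gathering_mpc(prior_var, candidate_actions, obs_noise_var, horizon=3):
    """Pick the action whose simulated rollout most reduces belief variance
    (i.e., is most informative about the property of interest)."""
    best_action, best_var = None, float("inf")
    for a in candidate_actions:
        var = prior_var
        for _ in range(horizon):
            # Assumption: informativeness scales with push magnitude |a|
            var = expected_posterior_var(var, obs_noise_var, signal_gain=abs(a))
        if var < best_var:
            best_action, best_var = a, var
    return best_action, best_var
```

For example, among candidate pushes of magnitude 0.0, 0.5, and 1.0, the controller selects the strongest push, since it is predicted to shrink the belief variance the most.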
For example, a mass of a cube may be estimated. The cube has a constant size and friction coefficient, but its mass changes randomly between 1 kg and 2 kg between episodes. A robot should be able to push the cube and extract its mass from the force and torque readings generated by the push. A height of an object may also be estimated. In this scenario, a force-torque sensor may act as a contact detector. An expected behavior may be to move down until contact is made, at which point the height may be extracted from forward kinematics. A minimum toppling height may also be estimated. A minimum toppling height refers to a height at which an object will topple, rather than slide, when pushed.
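The mass-estimation task can be sketched as follows, under the illustrative assumption that the push is friction-free so that the readings obey F = m·a; a least-squares fit over several noisy force/acceleration pairs then recovers the mass. This is a simplification for exposition, not the disclosure's learned estimator.

```python
import numpy as np

def estimate_mass(forces, accelerations):
    """Least-squares fit of F = m * a over force-torque readings from a push.
    Assumes a friction-free model; m minimizes sum((f - m*a)^2)."""
    f = np.asarray(forces, dtype=float)
    a = np.asarray(accelerations, dtype=float)
    # Closed-form solution: m = (a . f) / (a . a)
    return float(np.dot(a, f) / np.dot(a, a))
```

Given readings consistent with a 1.5 kg cube, the fit returns 1.5 kg; with sensor noise, more pushes tighten the estimate, which is exactly the information the exploratory procedure gathers.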
At operation S203, the process may include running a controller and a state estimator. The controller may be an information-gathering model predictive controller. The state estimator evaluates a current state and a current action and predicts a next state based on the current action. A state of a system may refer to the elements that are useful for predicting a future of the system. At operation S205, the process may include adding the interaction with the object to a dataset and training the state estimator. According to an embodiment, the training phase may be performed for a fixed number of steps or based on a convergence criterion. For example, at operation S207, it may be determined whether there is a convergence. The determining whether there is a convergence may include comparing a current error value with a known error value; when the current error value no longer decreases, a convergence is identified. If there is a convergence (S207—Y), then the training phase is complete and the deployment process may be initiated (S209), which will be described with respect to
According to an embodiment, environment 303 may refer to robot pose and velocity, object pose and velocity, object properties, and any properties that describe an environment and are subject to change either during or in between episodes. The force-torque reading proprioception refers to identifying how force-torque sensors react when performing an action on an object (e.g., pressing an object, grabbing an object, etc.). The object property estimate 304 refers to an estimated property as identified by the learning-based state estimator (e.g., mass, height, friction).
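The sensor data consumed by the state estimator at each time step can be sketched as a simple container; the 7-joint proprioception and 6-D wrench dimensions below are illustrative assumptions (e.g., a 7-DoF arm with a wrist-mounted force-torque sensor), not requirements of the disclosure.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Observation:
    """One time step of sensor data fed to the state estimator."""
    joint_positions: np.ndarray  # proprioception (assumed 7-DoF arm)
    force_torque: np.ndarray     # 6-D wrench from the wrist sensor

def to_estimator_input(obs: Observation) -> np.ndarray:
    """Concatenate the readings into the flat vector the estimator consumes."""
    return np.concatenate([obs.joint_positions, obs.force_torque])
```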
The electronic device 1000 includes a bus 1010, a processor 1020, a memory 1030, an interface 1040, and a display 1050.
The bus 1010 includes a circuit for connecting the components 1020 to 1050 with one another. The bus 1010 functions as a communication system for transferring data between the components 1020 to 1050 or between electronic devices.
The processor 1020 includes one or more of a central processing unit (CPU), a graphics processor unit (GPU), an accelerated processing unit (APU), a many integrated core (MIC), a field-programmable gate array (FPGA), or a digital signal processor (DSP). The processor 1020 is able to perform control of any one or any combination of the other components of the electronic device 1000, and/or perform an operation or data processing relating to communication. For example, the processor 1020 may perform the methods illustrated in
The memory 1030 may include a volatile and/or non-volatile memory. The memory 1030 stores information, such as one or more of commands, data, programs (one or more instructions), applications 1034, etc., which are related to at least one other component of the electronic device 1000 and for driving and controlling the electronic device 1000. For example, commands and/or data may formulate an operating system (OS) 1032. Information stored in the memory 1030 may be executed by the processor 1020.
The applications 1034 include the above-discussed embodiments. These functions can be performed by a single application or by multiple applications that each carry out one or more of these functions. For example, the applications 1034 may include an artificial intelligence (AI) model for performing the methods illustrated in
The display 1050 includes, for example, a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a quantum-dot light emitting diode (QLED) display, a microelectromechanical systems (MEMS) display, or an electronic paper display. The display 1050 can also be a depth-aware display, such as a multi-focal display. The display 1050 is able to present, for example, various contents, such as text, images, videos, icons, and symbols.
The interface 1040 includes input/output (I/O) interface 1042, communication interface 1044, and/or one or more sensors 1046. The I/O interface 1042 serves as an interface that can, for example, transfer commands and/or data between a user and/or other external devices and other component(s) of the electronic device 1000.
The communication interface 1044 may enable communication between the electronic device 1000 and other external devices, via a wired connection, a wireless connection, or a combination of wired and wireless connections. The communication interface 1044 may permit the electronic device 1000 to receive information from another device and/or provide information to another device. For example, the communication interface 1044 may include an Ethernet interface, an optical interface, a coaxial interface, an infrared interface, a radio frequency (RF) interface, a universal serial bus (USB) interface, a Wi-Fi interface, a cellular network interface, or the like. The communication interface 1044 may receive videos and/or video frames from an external device, such as a server.
The sensor(s) 1046 of the interface 1040 can meter a physical quantity or detect an activation state of the electronic device 1000 and convert metered or detected information into an electrical signal. For example, the sensor(s) 1046 can include one or more cameras or other imaging sensors for capturing images of scenes. The sensor(s) 1046 can also include any one or any combination of a microphone, a keyboard, a mouse, and one or more buttons for touch input. The sensor(s) 1046 can further include an inertial measurement unit. The sensor(s) 1046 can further include force-torque sensors. In addition, the sensor(s) 1046 can include a control circuit for controlling at least one of the sensors included therein. Any of these sensor(s) 1046 can be located within or coupled to the electronic device 1000. The sensor(s) 1046 may receive a text and/or a voice signal that contains one or more queries.
According to one or more embodiments, provided is a method for autonomously learning active tactile perception policies, by learning a generative world model leveraging a differentiable Bayesian filtering algorithm, and designing an information-gathering model predictive controller.
While the embodiments of the disclosure have been described with reference to the figures, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope as defined by the following claims.
This application is based on and claims priority under 35 U.S.C. § 119 from U.S. Provisional Application No. 63/336,921 filed on Apr. 29, 2022, in the U.S. Patent & Trademark Office, the disclosure of which is incorporated by reference herein in its entirety.
| Number | Date | Country |
| --- | --- | --- |
| 63336921 | Apr 2022 | US |