This application is a national phase entry under 35 U.S.C. § 371 of International Patent Application PCT/FR2019/050983, filed Apr. 26, 2019, designating the United States of America and published as International Patent Publication WO 2019/211552 A1 on Nov. 7, 2019, which claims the benefit under Article 8 of the Patent Cooperation Treaty to French Patent Application Serial No. 1853868, filed May 4, 2018.
The present disclosure concerns the field of communication robots. More precisely, it concerns the management of the resources of the robot in order to control its functioning so as to simulate an empathic intelligence, allowing interaction with one or more humans by actions that are not simply slave actions (the term “robot” having been coined in 1920 by the Czech writer Karel Capek, inspired by the word robota, meaning “slave” in Slavic languages).
Such a robot may take the form of a humanoid robot, an autonomous car or more simply an apparatus with a communication interface allowing bidirectional interaction with one or more humans via multimodal messages (tactile, visual or audible) emitted and received by the robot.
In the article “Measuring empathy for human and robot hand pain using electroencephalography” by Yutaka Suzuki, Lisa Galli, Ayaka Ikeda, Shoji Itakura and Michiteru Kitazaki, in Scientific Reports 5, Article number 15924 (2015), the authors describe the functioning of the human brain, and specifically its empathic response, when it perceives a robot in a situation that evokes an emotion such as pain. The authors of this study present volunteers with a photograph of either a human hand or a robotic hand, each in a situation that may or may not evoke pain (in the image presented to the study participants, an open pair of scissors is shown potentially cutting a finger of the hand in the image).
Traditionally one-way, the relationship between the robot and the human is becoming reciprocal. Some roboticists are tempted to present robots as “emo robots” (having emotions, a “heart”), which is already the case in Japan with the Pepper robot (trade name) or in France with the Nao robot (trade name). While artificial empathy is always a simulation, robot manufacturers are attempting to present us with machines that would be capable of having real emotions.
In order to meet this need, European patent EP1486300 B1 proposes a behavior control system for a robot that operates autonomously, comprising:
The internal state management section manages emotions, each of which is an index to the internal state, in a hierarchical structure having a plurality of layers, and uses the emotions in a layer of primary emotions necessary for individual preservation and in another layer of secondary emotions, which vary depending upon the excess or deficiency of the primary emotions, and further divides the primary emotions into layers including an innate, reflexive or physiological layer and an associative layer based on dimensions.
Also known is European patent EP1494210, describing a speech communication system with a function for having a conversation with a conversation partner, comprising:
The conversation control means continues the conversation when the speech content of the conversation partner, obtained as the recognition result by the speech recognition means, is identical to the expected response content, even if the search by the search control means fails.
The solutions of the prior art dissociate, on the one hand, the technical resources intended for language processing and, on the other hand, the technical resources intended for the recognition processing of the robot's environment by cameras and possibly other sensors, as well as the control of the robot's movements.
As a result, the solutions of the prior art do not allow for enriched learning tasks, which merge semantic inputs and outputs with perceived data and actuations.
In particular, patent EP1486300 does not provide for taking into account the verbal dimension of the interaction between the human and the robot, and thus does not allow, for example, for the robot to learn naturally during interaction with the human.
As for patent EP1494210, it provides only for verbal interaction between the human and the robot to control artificial conversations.
The juxtaposition of the teaching of these two documents would not allow a satisfactory solution to be obtained because the two solutions do not use the same technical grammars and do not allow for verbal information and information perceived by the robot's sensors to be merged into the same semantic set.
Moreover, the internal state of the robot corresponding to an “emotion” is exclusively translated by the selection of an action chosen among a collection of actions associated with emotions, which considerably limits the possibilities of empathic intelligence in real time with the human.
The solutions of the prior art lead to fixed emotional appearances, in a weakly responsive manner with respect to the attitude of the human in interaction with the robot. The credibility of the emotional appearance is mediocre due to the absence of real-time modifications, which prevents the cognitive load of the human interacting with the robot from being reduced.
The object of the present disclosure is to respond to these drawbacks by proposing, in the most general sense of the present disclosure, a method for controlling a plurality of effectors of a robot by a plurality of primitives made up of parameterizable coded functions:
wherein the method is based on associating, at every step, coded objects with a sequence of characters corresponding to their semantic description, in particular, with:
Advantageously, the action selection system classifies the rules based on the proximity of the context of each of these rules to the context computed from the content of the objects contained in the memory and selects the actions associated with the rules relevant to the context.
According to the embodiments:
The present disclosure also relates to a robot comprising a plurality of effectors controlled by a controller executing a process according to the present disclosure.
This present disclosure will be better understood by reading the following detailed description of a non-limiting example of the present disclosure, with reference to the accompanying drawings wherein:
Hardware Architecture
The robot, according to the embodiment described in
Each sensor (2 to 5) is associated with a physical driver circuit (12 to 15), integrated in the sensor or provided on the communication interface circuit (1).
The communication interface circuit (1) brings together the circuits for pre-processing (22 to 25) the signals supplied by the sensors (2 to 5) to transmit the information to the main computer (30) or to dedicated computers (31 to 33). Some pre-processing circuits (22 to 25) may be made up of specialized circuits, such as image analysis circuits or speech recognition circuits.
The dedicated computers (31 to 33) receive signals from certain sensors (2 to 5), as well as commands from the main computer (30), to compute instructions, which are transmitted to pre-processing circuits (51 to 53) for the computation of the action parameters of an effector (41 to 43). These parameters are exploited by interface circuits (61 to 63) to provide the effectors (41 to 43) with control signals, for example, in the form of pulse-width modulated electrical signals PWM.
Moreover, the main computer (30) is associated with a communication module (36) for data exchange with external resources, e.g., the internet.
The effectors (41 to 43) are, for example, made up of:
Functional Architecture
Description of the Perception Functions
The sensors (2 to 5) provide information allowing the computation, via perception functions (72 to 75), of digital metadata reflecting the semantic representation that the robot makes of the world.
Each of these perception functions (72 to 75) receives pre-processed data from one or more sensors (2 to 5) to compute the metadata corresponding to a perception type.
For example,
These digital metadata are made up of objects coded in an object-oriented language, for example, C# or C++. These metadata comprise all or part of the following elements:
The set of coded objects corresponding to the perception functions is stored in a memory (80), the content of which expresses the representation of the world as perceived by the robot.
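By way of purely illustrative example, such a coded object may be sketched as follows in C++; the class and field names (PerceptionObject, WorldMemory, etc.) are hypothetical and serve only to show the association of a coded object with the character sequence of its semantic description:

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of a coded perception object: each object carries, among
// other fields, the character sequence holding its semantic description.
struct PerceptionObject {
    std::string semanticDescription;  // e.g., "I SEE A MAN 1 METER AWAY AND 20 DEGREES TO THE RIGHT"
    std::string objectType;           // e.g., "HUMAN", "SOUND"
    double distanceMeters = 0.0;      // position data computed by the perception function
    double bearingDegrees = 0.0;
    std::chrono::system_clock::time_point timestamp;  // time of perception
};

// The memory (80) holding the robot's representation of the world, sketched
// here as a simple collection of coded objects.
struct WorldMemory {
    std::vector<PerceptionObject> objects;
    void add(PerceptionObject object) { objects.push_back(std::move(object)); }
};
```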
Description of the Condensation Functions
Processing (91 to 98) is applied to this set of coded objects in the background by condensation functions, which extract characteristic data for each of the objects and group together coded objects sharing the same characteristic data.
For example, a first condensation function performs face recognition processing on human-type detected objects, based on the following objects:
A second condensation function performs the association processing of a person with a recognized sound, based on the following objects:
A third condensation function performs the association processing of the two objects, from the following recomputed objects:
“I SEE PAUL 1 METER AWAY AND 20 DEGREES TO THE RIGHT”
“I SEE A MAN 1 METER AWAY AND 20 DEGREES TO THE RIGHT WHO SAYS ‘HELLO’ TO ME”
to modify the object 1 as follows: “I SEE PAUL 1 METER AWAY AND 20 DEGREES TO THE RIGHT WHO SAYS ‘HELLO’ TO ME”
The condensation processing (91 to 98) is applied recursively to the contents of the memory (80) containing the robot's representation of the world.
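As a purely illustrative sketch of such a condensation function (the matching criterion, thresholds and names are assumptions, not the claimed implementation), two coded objects sharing approximately the same position data may be grouped by combining their semantic descriptions:

```cpp
#include <cmath>
#include <string>
#include <vector>

struct CodedObject {
    std::string semanticDescription;
    double distanceMeters;
    double bearingDegrees;
};

// Hypothetical condensation step: group coded objects sharing the same
// characteristic data (here, approximately the same position) and merge their
// semantic descriptions into a single coded object.
std::vector<CodedObject> condense(const std::vector<CodedObject>& input) {
    std::vector<CodedObject> output;
    for (const auto& object : input) {
        bool merged = false;
        for (auto& kept : output) {
            if (std::abs(kept.distanceMeters - object.distanceMeters) < 0.1 &&
                std::abs(kept.bearingDegrees - object.bearingDegrees) < 5.0) {
                // Combine the semantic descriptions of the two matching objects.
                kept.semanticDescription += " / " + object.semanticDescription;
                merged = true;
                break;
            }
        }
        if (!merged) output.push_back(object);
    }
    return output;
}
```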
Description of the Selection of Rules and Actions
The system further comprises an action selection system (“rule manager”) (100), which comprises a declarative memory (101) in which is stored a library of rules (102 to 106) associating a context (112 to 116) with an action (122 to 126).
For example, a first rule R1 is made up of a numerical sequence of the type:
“IF YOU HEAR A MAN SAY HELLO TO YOU, ANSWER ‘HELLO SIR’”
and a second rule R2 is made up of a numerical sequence of the type:
“IF YOU HEAR A HUMAN SAY HELLO TO YOU, AND THAT HUMAN'S NAME IS #FIRSTNAME, SAY ‘HELLO #FIRSTNAME’”
and a third rule R3 is made up of a numerical sequence of the type:
“IF YOU HEAR A NOISE AND IT IS BETWEEN 2:00 A.M. AND 4:00 A.M., SEND YOUR OWNER A PICTURE OF WHAT YOU SEE.”
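Such a rule may be sketched, purely for illustration, as a coded pair associating the character sequence of the context with that of the action; the C++ names below are hypothetical, and the '#' prefix marking a variable such as #FIRSTNAME is kept as in the examples above:

```cpp
#include <string>
#include <vector>

// Hypothetical sketch of a rule (102 to 106) of the declarative memory (101):
// a context character sequence associated with an action character sequence.
struct Rule {
    std::string context;  // e.g., "YOU HEAR A HUMAN SAY HELLO TO YOU AND THAT HUMAN'S NAME IS #FIRSTNAME"
    std::string action;   // e.g., "SAY 'HELLO #FIRSTNAME'"
    double is = 0.5;      // Satisfaction Indicator, recomputed after each execution of the action
};

// The declarative memory (101), sketched as a simple library of rules.
using DeclarativeMemory = std::vector<Rule>;
```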
An action (122 to 126) is materialized by a numerical sequence designating a tree of primitives (131 to 134) executed by the robot's effectors, such as:
Action 1 “SEND A MESSAGE TO YOUR OWNER” corresponds to a unitary sequence made up of a single primitive (131 to 134):
Action 2 “SMILE” corresponds to a composite sequence comprising several primitives (131 to 134):
Action 3 “EXPRESS TERROR” corresponds to a composite sequence comprising several primitives (131 to 134):
Action 4 “LIFT ARM” corresponds to a composite sequence comprising a single primitive (131 to 134):
The action selection system (100) classifies the rules (102 to 105) based on the proximity of the context (112 to 115) of each of these rules to the context computed from the content of the objects (81 to 88) contained in the memory (80), by a computation of distance in the N-dimensional space of the context.
This computation periodically provides a subset (110) of rules (102 to 104) associated with actions (122 to 124), in the form of a list ordered by relevance based on the aforementioned distance computation. The list is optionally filtered based on a threshold value to form a subset of rules the distance of which from the current context is less than this threshold value.
This subset (110) is then ordered based on a Satisfaction Indicator (IS). For this purpose, each rule is assigned a variable numerical parameter IS upon its execution. How this numerical parameter IS is determined will be explained hereinafter.
The subset (110) thus determined defines a set of ordered actions (122-124) associated with the rules (102-104) and used to control the operation of the robot's effectors.
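This selection may be sketched as follows, reusing the hypothetical Rule and DeclarativeMemory types introduced above; the distance function over the N-dimensional context space is left abstract, and the threshold value is an arbitrary illustrative parameter:

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

struct RankedRule {
    const Rule* rule;  // rule retained in the subset (110)
    double distance;   // distance of the rule's context to the current context
};

// Hypothetical sketch of the action selection system (100): keep the rules
// whose context is closer than a threshold to the context computed from the
// memory (80), then order the retained subset by decreasing IS parameter.
std::vector<RankedRule> selectRules(
        const DeclarativeMemory& rules,
        const std::function<double(const std::string&)>& distanceToCurrentContext,
        double threshold) {
    std::vector<RankedRule> subset;
    for (const auto& rule : rules) {
        double distance = distanceToCurrentContext(rule.context);
        if (distance < threshold) subset.push_back({&rule, distance});
    }
    std::sort(subset.begin(), subset.end(),
              [](const RankedRule& a, const RankedRule& b) { return a.rule->is > b.rule->is; });
    return subset;
}
```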
Description of the Execution of Actions
The execution of actions (122 to 124) is carried out via the activation of primitives (131 to 134) the parameterization of which is determined by the content of the actions (amplitude of the movement, address of the sound sequence, intensity of the sound sequence, etc.).
The primitives (131 to 134) designate meta-functions, resulting in a computer code the execution of which is carried out by a set of commands (201 to 205) transmitted to software applications (211 to 215) or directly to effectors (41 to 43), optionally with parameters, for example:
Each primitive (131 to 134) is parameterized, if necessary, with:
The activation of the primitives (131 to 134) is filtered by a resource manager (200) the object of which is to prevent commands that are contradictory or impossible to perform from being sent to the same effector (41). This filtering prioritizes the activation of the primitives associated with the actions (122 to 124) having the highest IS parameter.
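A minimal sketch of this filtering, assuming a hypothetical command type addressed to an effector, is given below; only the prioritization by highest IS parameter is represented:

```cpp
#include <map>
#include <string>
#include <vector>

struct EffectorCommand {
    std::string effectorId;  // e.g., the arm motor among the effectors (41 to 43)
    std::string primitive;   // primitive to be activated
    double parameter;        // e.g., amplitude of the movement
    double is;               // IS parameter of the action that produced the command
};

// Hypothetical resource manager (200): when several commands target the same
// effector, keep only the command coming from the action with the highest IS.
std::vector<EffectorCommand> filterCommands(const std::vector<EffectorCommand>& commands) {
    std::map<std::string, EffectorCommand> bestPerEffector;
    for (const auto& command : commands) {
        auto it = bestPerEffector.find(command.effectorId);
        if (it == bestPerEffector.end() || command.is > it->second.is)
            bestPerEffector[command.effectorId] = command;
    }
    std::vector<EffectorCommand> retained;
    for (const auto& entry : bestPerEffector) retained.push_back(entry.second);
    return retained;
}
```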
For example, if among the list of selected actions (122 to 124) there is:
then the parameters applied to this same primitive managing the movement of the arm are incompatible; the resource manager (200) inhibits the action having the lowest IS parameter, and the parameterized primitive actually executed is the one resulting from the action having the highest IS parameter.
In another example, the resource manager (200) computes a new primitive from two primitives deemed incompatible, the parameterization of which is computed by weighting the parameterizations of the two incompatible primitives according to the IS parameter associated with each of them and thus corresponds to a compromise.
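This compromise may be sketched by a simple IS-weighted average of the two parameterizations; the formula below is one possible weighting, given only as an illustration:

```cpp
// Hypothetical sketch of the compromise computed by the resource manager (200):
// the parameter of the new primitive is the IS-weighted average of the
// parameters of the two incompatible primitives.
double compromiseParameter(double parameter1, double is1, double parameter2, double is2) {
    return (parameter1 * is1 + parameter2 * is2) / (is1 + is2);
}
// Example: an arm-lift amplitude of 0.8 (IS = 0.9) and of 0.2 (IS = 0.3)
// yield a compromise amplitude of (0.8 * 0.9 + 0.2 * 0.3) / 1.2 = 0.65.
```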
Learning of the Rules
According to an embodiment of the present disclosure, the recording of the rules (102 to 106) is carried out by speech learning.
For this purpose, a learning module (400) comprising a speech recognition and semantic analysis module analyzes the sentences pronounced by an operator to extract actions and contexts defining a new rule (106).
For example, if the human says a sentence such as “if you hear a loud noise, activate an audible signal,” the module builds and saves a rule associating the action “ACTIVATE AN AUDIBLE SIGNAL” with the context “YOU HEAR A NOISE EXCEEDING A LEVEL X.”
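A minimal sketch of this construction, assuming that the speech recognition and semantic analysis have already produced a normalized sentence of the form "if <context>, <action>" (the splitting heuristic is an assumption, not the analysis actually performed by the module (400)):

```cpp
#include <optional>
#include <string>

struct LearnedRule {
    std::string context;  // e.g., "YOU HEAR A NOISE EXCEEDING A LEVEL X"
    std::string action;   // e.g., "ACTIVATE AN AUDIBLE SIGNAL"
};

// Hypothetical sketch: split a sentence of the form "if <context>, <action>"
// into the context/action pair of a new rule (106).
std::optional<LearnedRule> buildRuleFromSentence(const std::string& sentence) {
    const std::string prefix = "if ";
    if (sentence.rfind(prefix, 0) != 0) return std::nullopt;  // not a conditional sentence
    const auto comma = sentence.find(',');
    if (comma == std::string::npos) return std::nullopt;
    LearnedRule rule;
    rule.context = sentence.substr(prefix.size(), comma - prefix.size());
    std::size_t actionStart = comma + 1;
    while (actionStart < sentence.size() && sentence[actionStart] == ' ') ++actionStart;
    rule.action = sentence.substr(actionStart);
    return rule;
}
```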
In another example, the learning module (400) also comprises means for kinetic recognition, to recognize a gesture, e.g., indicating an object. In this case, the learning module (400) further comprises an image analysis module, to allow learning by gestures combined with spoken words.
For example, the human indicates an object in the field of vision of the robot's camera and pronounces the sentence “IF YOU SEE THIS #OBJECT, GRASP IT,” which leads to the recording of the rule made up of the action “GRASP THE OBJECT” associated with the context “SEE OBJECT ‘#OBJECT.’”
In a third example, the learning module (400) comprises means for learning by mimicry or by recurrence: when the robot records a same action associated with a same context repeatedly, it triggers the recording of a new rule associating this action with this context.
For example, when the human systematically answers “You're welcome” to “Thank you” stated by the robot, the robot creates the rule associating the action “SAY YOU'RE WELCOME” with the context “A HUMAN SAYS ‘THANK YOU.’”
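This learning by recurrence may be sketched by counting how many times the same context/action pair has been observed before recording it as a rule; the repetition threshold below is an arbitrary illustrative value:

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of learning by recurrence: when the same action has been
// observed in the same context a given number of times, record a new rule.
class RecurrenceLearner {
public:
    explicit RecurrenceLearner(int threshold) : threshold_(threshold) {}

    // Returns true when the observation triggers the recording of a new rule.
    bool observe(const std::string& context, const std::string& action,
                 std::vector<std::pair<std::string, std::string>>& rules) {
        const int count = ++counts_[{context, action}];
        if (count == threshold_) {
            rules.emplace_back(context, action);  // e.g., ("A HUMAN SAYS 'THANK YOU'", "SAY YOU'RE WELCOME")
            return true;
        }
        return false;
    }

private:
    int threshold_;
    std::map<std::pair<std::string, std::string>, int> counts_;
};
```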
The combinations of actions and contexts are constructed based on the value of the rapport parameter VEI4 computed for each pair of an action and a context, during the interactions between the human and the robot.
Obviously, the rules (102 to 106) may also be recorded by programming.
Computation of the IS Parameter
Each rule (102 to 106) in the declarative memory (101) is associated with an IS (Satisfaction Indicator) parameter, which is recomputed after each execution of the associated action (122 to 126), based on the current value of the parameter VEI3.
The value of the IS parameter is used to order the actions (122 to 126) and to give the actions associated with the highest IS values priority of access to the resources allocated by the resource manager (200).
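The present description does not fix a particular update formula; purely as an illustrative assumption, the IS parameter of a rule could be moved toward the current value of VEI3 each time its associated action is executed:

```cpp
// Purely illustrative assumption: after each execution of the action associated
// with a rule, move its Satisfaction Indicator toward the current value of the
// apparent-satisfaction parameter VEI3 (the learning rate is arbitrary).
double updateIS(double currentIS, double vei3, double learningRate = 0.3) {
    return currentIS + learningRate * (vei3 - currentIS);
}
```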
Knowledge Pooling
In the aforementioned examples, the rules (102 to 106) are stored in the local memory of a robot.
According to one embodiment, several robots share a common memory for storing the robots' rules, with each robot maintaining its own IS parameter.
For example, the declarative memory (101) of the action selection system (100) of a robot is periodically synchronized with the content of a server shared between a plurality of robots via the communication module (36).
The learning of a new rule (106) in the action selection system (100) of a particular robot thus enriches the behavior of the action selection systems (100) of all the robots accessing the server, while each robot maintains its own internal state values VEI and its own rule priority order (102 to 106), based on its own IS indicators.
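This pooling may be sketched as follows, assuming a shared store of rule texts keyed by an identifier while the IS values remain local to each robot; the synchronization transport itself, via the communication module (36), is not represented:

```cpp
#include <map>
#include <string>

struct SharedRule {
    std::string context;
    std::string action;
};

// Hypothetical sketch: the rule texts are shared between robots, but each robot
// keeps its own IS value per rule, so its own priority order is preserved.
class PooledDeclarativeMemory {
public:
    // Merge the rules downloaded from the shared server into the local memory.
    void synchronize(const std::map<std::string, SharedRule>& serverRules) {
        for (const auto& entry : serverRules) {
            rules_[entry.first] = entry.second;
            localIS_.emplace(entry.first, 0.5);  // default IS for a rule never executed locally
        }
    }

    double isOf(const std::string& ruleId) const { return localIS_.at(ruleId); }

private:
    std::map<std::string, SharedRule> rules_;
    std::map<std::string, double> localIS_;  // IS values are never uploaded nor overwritten
};
```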
Description of the Computation of the VEIX Parameters
The VEIX (Internal State Variables) parameters are variables between 0 and 1 representative of the internal state of the robot:
VEI1 represents the state of waking or activation of the robot.
VEI2 represents the robot's state of surprise.
VEI3 represents the robot's apparent state of satisfaction.
VEI4 represents the state of rapport between the robot and the humans with whom it interacts.
This parameter may be computed through steps of controlling coded stimulus functions directed at the user and of acquiring the effect induced on the user, in order to deduce the level of empathy from the conformity between the stimulus and the acquired image.
VEI5 represents the robot's state of joy.
In general, the VEIX parameters are computed by modules (181 to 185) from the data supplied by the memory (80), which determines the robot's representation of the world based on the data acquired by the set of sensors (2 to 5), and from the information representative of the state of the robot, for the determination of the synchronisms between the robot and the human.
The VEI1 to VEI5 parameters are also computed on the basis of internal criteria, independent of the robot's external environment. These criteria take into account, for example:
The result of the processing periodically provides updated values for the VEIX parameters, which modulate the perception functions (71 to 75) and the primitives (131 to 134), as well as the action selection module (100).
For example:
The functions for computing the VEIX parameters may be determined by an initial characterization step involving the recording of experimental data obtained by a process including subjecting persons to a plurality of stimuli and associating with each of the stimuli, corresponding to the perception functions, a level of perception (pleasure, satisfaction, alertness, surprise, etc.).
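As a final, purely illustrative sketch (the modulation laws and coefficients are assumptions, not those claimed), the VEIX values between 0 and 1 may modulate the parameterization of the primitives, for example, the amplitude of a movement or the volume of the synthesized voice:

```cpp
#include <algorithm>

struct VEIState {
    double vei1;  // waking or activation
    double vei2;  // surprise
    double vei3;  // apparent satisfaction
    double vei4;  // rapport with the human
    double vei5;  // joy
};

// Purely illustrative modulation of primitive parameters by the internal state:
// a more activated robot moves with a larger amplitude, a more joyful robot
// speaks slightly louder. The coefficients are arbitrary.
double modulatedAmplitude(double nominalAmplitude, const VEIState& vei) {
    return std::clamp(nominalAmplitude * (0.5 + 0.5 * vei.vei1), 0.0, 1.0);
}

double modulatedVolume(double nominalVolume, const VEIState& vei) {
    return std::clamp(nominalVolume * (0.8 + 0.2 * vei.vei5), 0.0, 1.0);
}
```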
Foreign Application Priority Data
French Patent Application No. 1853868 (FR), filed May 2018, national.

PCT Filing Data
International Application No. PCT/FR2019/050983 (WO), filed Apr. 26, 2019.

International Publication Data
International Publication No. WO 2019/211552 A1 (WO), published Nov. 7, 2019.
References Cited

U.S. Patent Application Publications
US 2003/0187653 A1, Okubo, Oct. 2003
US 2011/0178619 A1, Jung, Jul. 2011
US 2014/0347265 A1, Aimone, Nov. 2014
US 2015/0362988 A1, Yamamoto, Dec. 2015
US 2015/0378444 A1, Yin, Dec. 2015
US 2016/0140963 A1, Connell, May 2016
US 2017/0148434 A1, Monceaux, May 2017
US 2018/0326583 A1, Baroudi, Nov. 2018
US 2018/0336000 A1, Vaughn, Nov. 2018
US 2019/0172448 A1, Monceaux, Jun. 2019
US 2019/0291277 A1, Oleynik, Sep. 2019
US 2019/0313058 A1, Harrison, Oct. 2019
US 2020/0016745 A1, Tang, Jan. 2020

Foreign Patent Documents
EP 1494210, Jan. 2007
EP 1486300, Aug. 2011
WO 2013/150076, Oct. 2013

Other Publications
International Search Report for International Application No. PCT/FR2019/050983, dated Sep. 23, 2019, 2 pages.
International Written Opinion for International Application No. PCT/FR2019/050983, dated Sep. 23, 2019, 7 pages.
Suzuki et al., "Measuring Empathy for Human and Robot Hand Pain using Electroencephalography," www.nature.com/scientificreports, published Nov. 3, 2015, 9 pages.
Number | Date | Country | |
---|---|---|---|
20220009082 A1 | Jan 2022 | US |