The disclosure of Japanese Patent Application No. 2007-317453 filed on Dec. 7, 2007 including the specification, drawings and abstract is incorporated herein by reference in its entirety.
1. Field of the Invention
The invention relates to a behavior determination apparatus and method, a behavior learning apparatus and method, a robot apparatus, and a medium recorded with a program, and more particularly to a behavior determination apparatus and method, a behavior learning apparatus and method, and a medium recorded with a program that are suitable for installation in a robot apparatus capable of autonomous action, and to a robot apparatus installed therewith.
2. Description of the Related Art
A technique in which actions are modeled and stored as action models and a subsequent action is determined from an action model based on past experience is available as a conventional method for determining an action of a robot (see Japanese Patent Application Publication No. 2005-297105 (JP-A-2005-297105), for example).
The robot apparatus described in JP-A-2005-297105 includes state input means for inputting an external or internal state of the robot apparatus, internal state managing means for managing internal state vectors, associative storage means for calculating a predicted internal state variation vector on the basis of a state vector corresponding to the external or internal state, and information generating means for generating information relating to the robot apparatus on the basis of a current internal state vector managed by the internal state managing means and the predicted internal state variation vector calculated by the associative storage means.
The associative storage means is formed from a neural network having a state vector, which is constituted by a person ID obtained from a face/person recognition device, an object ID obtained from an object recognition device, other information from various sensors, and so on, as input and the predicted internal state variation vector as output, and learns a set of the state vector and the actual internal state variation vector at that time as a learning sample. When a similar state vector is obtained, a predicted internal state variation vector based on past experience is supplied to the emotion generating means from the associative storage means. The emotion generating means generates an emotion on the basis of the predicted internal state variation vector and the internal state vector. Behavior selecting means selects an action corresponding to the emotion.
However, with the technique described in JP-A-2005-297105, only behavior within the range of the stored action models can be generated. A target of the related art is to generate robot movement that corresponds to the surrounding conditions. Therefore, the focus of the related art has been directed toward experience extraction, in which an identical taught movement is performed under identical conditions to those of the past, rather than task execution.
More specifically, in the intelligence storage space of the robot (machine) according to the related art, only determined actions are performed in relation to limited information, such as environmental variation caused by a movement of the robot itself and behavior to be taken in relation to static object information. However, a technique enabling an autonomous robot to classify and associate large amounts of information obtained over time is required. If this information can be modularized and gathered in a single information space, an action (reaction) employing complicated information can be generated.
The invention provides a behavior learning apparatus and method capable of constructing an intelligent determination system in which input information is modularized into a single information space, a behavior determination apparatus and method using a learning result generated by the behavior learning apparatus, a robot apparatus installed with these apparatuses, and a medium recorded with a program.
A behavior determination apparatus according to a first aspect of the invention includes: a word extraction unit for extracting words from external instruction information; and a behavior determination unit for determining a behavior on the basis of a word network in which relationships between words are weighted on a network, and the words extracted by the word extraction unit.
In the first aspect of the invention, a behavior is determined on the basis of the word network, in which relationships between words are weighted on a network, and therefore an action constituted by words having a close relationship to the external instruction information can be determined.
A behavior learning apparatus according to a second aspect of the invention includes: a word extraction unit for extracting words from external instruction information; and a network construction unit for constructing a network in which words are associated by weightings on the basis of the words extracted by the word extraction unit and relationships therebetween, and updating the weightings between the words on the basis of the instruction information.
In the second aspect of the invention, a network in which weightings between words are defined using external instruction information can be generated and learned.
A behavior determination method according to a third aspect of the invention includes: a word extraction step of extracting words from external instruction information; and a behavior determination step of determining a behavior on the basis of a word network in which relationships between words are weighted on a network, and the words extracted in the word extraction step.
A behavior learning method according to a fourth aspect of the invention includes: a word extraction step of extracting words from external instruction information; and a network construction step of constructing a network in which words are associated by weightings on the basis of the words extracted in the word extraction step and relationships therebetween, and updating the weightings between the words on the basis of the instruction information.
A robot apparatus according to a fifth aspect of the invention is a robot apparatus for expressing an action in accordance with an external instruction, including: a word extraction unit for extracting words from external instruction information; and a behavior determination unit for determining a behavior on the basis of a word network in which relationships between words are weighted on a network, and the words extracted by the word extraction unit.
Another robot apparatus according to a sixth aspect of the invention includes: a word extraction unit for extracting words from external instruction information; and a network construction unit for constructing a network in which words are associated by weightings on the basis of the words extracted by the word extraction unit and relationships therebetween, and updating the weightings between the words on the basis of the instruction information.
Further, in a medium recorded with a program for causing a computer to execute a predetermined operation according to seventh and eighth aspects of the invention, the program includes the behavior determination processing or the behavior learning processing described above.
According to the invention, a behavior learning apparatus and method capable of constructing an intelligent determination system in which input information is modularized into a single information space, a behavior determination apparatus and method using a learning result generated by the behavior learning apparatus, and a robot apparatus installed with these apparatuses can be provided.
The foregoing and further features and advantages of the invention will become apparent from the following description of example embodiments with reference to the accompanying drawings, wherein like numerals are used to represent like elements, and wherein:
A specific embodiment to which the invention is applied will be described in detail below with reference to the drawings.
The input/output unit 102 includes a camera 121 constituted by a Charge Coupled Device (CCD) or the like for obtaining images of the peripheral area, one or a plurality of built-in microphones 122 for collecting peripheral sounds, a speaker 123 for outputting a voice in order to interact with a user or the like, a Light Emitting Diode (LED) 124 for expressing responses to the user, emotions, and so on, a sensor unit 125 constituted by a touch sensor or the like, and so on.
The driving unit 103 includes a motor 131, a driver 132 for driving the motor, and so on, and operates the leg portion unit 1d and the arm portion unit 1b in accordance with user instructions or the like. The power supply unit 104 includes a battery 141 and a battery control unit 142 for controlling charge/discharge thereof, and supplies power to each unit.
The external storage unit 105 is constituted by a detachable Hard Disk Drive (HDD), an optical disk, a magneto-optical disk, or the like, which stores various programs, control parameters, and so on, and supplies these programs and data to internal memory (not shown) or the like of the control unit 101 as needed.
The control unit 101 includes a Central Processing Unit (CPU), Read Only Memory (ROM), Random Access Memory (RAM), a wireless communication interface, and so on, and controls various actions of the robot 1. The control unit 101 also includes modules that operate, for example, in accordance with a control program stored in the ROM, such as an image recognition module 12 for analyzing an image obtained by the camera 121, a route search module 13 for performing a route search on the basis of an image recognition result, a behavior determination module 14 for selecting a behavior to be taken on the basis of various recognition results, a voice recognition module 15 for performing voice recognition, and a tag information recognition module 16 for recognizing tag information and the like. In this embodiment in particular, the behavior determination module 14 generates a word network (knowledge network) by modularizing input information into a single information space using information from the tag information recognition module 16, the image recognition module 12, and so on. By generating a behavior using the word network, actions can be expressed in a more natural manner.
The robot 1 according to this embodiment is installed with a behavior determination module (behavior determination apparatus) which, upon reception of instruction information from a human, determines a behavior by extracting words having a close relationship to the instruction information from past instruction information. Thus, even when the instruction information from the person is insufficient, the behavior intended by the instructor can be determined accurately. The behavior determination apparatus installed in the robot apparatus will now be described in detail.
Note that here, a behavior determination module is described as the behavior determination apparatus, but the processing of each block may be realized by causing the CPU to execute a computer program. In this case, the computer program may be stored on a recording medium or transmitted via the Internet or another transmission medium. Further, the latest version of the knowledge network, after it has been enlarged, updated, and so on, may be obtained over the Internet or the like.
The knowledge acquisition unit 21 extracts words included in instruction information from an external source such as a user (knowledge acquisition function). The instruction information may take the form of voice information such as “wash the towel”, an image recognition result of a towel held in the hand of the user, and so on.
The network construction unit 22 stores/accumulates the words (accumulation function) and weights the relationships between the words on a network. Hereafter, this network will be referred to as a word network. In this case, for example, correct cases and incorrect cases are input from the exterior, and the relationships between the words are learned. The word weighting of the word network in a correct case is increased by 0.01 to 0.1, for example, every time the word is input, and the word weighting of the word network in an incorrect case is reduced by 0.01 to 0.1, for example. An initial value of the word weighting in the word network may be set at 0.5 or the like, for example.
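The weighting rule described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the class name `WordNetwork`, the fixed step of 0.05 (chosen from the stated range of 0.01 to 0.1), the treatment of every pair of words in an instruction as related, and the clamping of weightings to the range 0 to 1 are all assumptions.

```python
from itertools import combinations

INITIAL_WEIGHT = 0.5  # initial value given as an example in the text
STEP = 0.05           # assumed; the text only gives a range of 0.01 to 0.1


class WordNetwork:
    """Hypothetical store of undirected word-to-word weightings."""

    def __init__(self):
        self.weights = {}

    @staticmethod
    def _key(a, b):
        # Undirected edge: order the pair so (a, b) and (b, a) coincide.
        return tuple(sorted((a, b)))

    def weight(self, a, b):
        return self.weights.get(self._key(a, b), INITIAL_WEIGHT)

    def teach(self, words, correct):
        """Raise (correct case) or lower (incorrect case) every pairwise weighting."""
        delta = STEP if correct else -STEP
        for a, b in combinations(sorted(set(words)), 2):
            new = self.weight(a, b) + delta
            # Clamping to [0, 1] is an assumption, not stated in the text.
            self.weights[self._key(a, b)] = min(1.0, max(0.0, new))


net = WordNetwork()
net.teach(["towel", "washing machine"], correct=True)
net.teach(["shoe", "washing machine"], correct=False)
```

After these two taught cases, the towel/washing machine weighting sits one step above the initial value and the shoe/washing machine weighting one step below it, while untaught pairs remain at the initial value.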
Upon reception of instruction information from a user, the behavior determination unit 23 determines a behavior by extracting words having a close relationship to the instruction information from the word network, which serves as past information. Since the behavior is determined by extracting words having a close relationship from the word network, the behavior intended by the instructor can be determined accurately even when the instruction information provided by the user is insufficient.
To be able to execute various tasks, the robot must learn the various tasks and accumulate experience thereof autonomously. In this embodiment, similarly to a human developmental process, the robot is exposed to various experiences and associates knowledge autonomously using the knowledge accumulated from these experiences. Thus, the robot is able to make inferences in relation to tasks that it has not experienced.
The behavior determination apparatus according to this embodiment will now be described in further detail. In the cognitive development of a human being, the accumulation and storage of experience and the referencing of this accumulated experience from storage is the most basic form of intelligent reasoning capacity. The intelligent determination system of the robot is constructed by emulating this structure. The behavior determination apparatus (behavior determination module) is installed with the following functions.
(1) Brief knowledge/experience acquisition function: the knowledge acquisition unit 21 obtains brief (word-level) knowledge from conditions. In this case, the human teaches the robot correct cases and incorrect cases, whereby knowledge is acquired. Teaching both correct and incorrect cases allows the word network to converge more quickly on the taught experience. For example, “towel”, “washing machine”, “shoe”, and so on are obtained on an index level (without including concepts) in the following form, where Y marks a correct case and N an incorrect case: put the towel into the washing machine→Y; wipe it with the towel→Y; wash the towel→Y; put the shoes into the washing machine→N; put the towel into the shoes→N. The relationships between the words are expressed in this way.
(2) Network construction function: an accumulated knowledge base is constructed, and a categorized knowledge base is learned (experience base). In this embodiment, it is assumed that the movement of the robot is symbolized (abstracted). Moreover, it is assumed that appearing objects have been recognized and indexed.
The brief knowledge acquired in (1) is accumulated, and a network is constructed on the basis of connections between the accumulated brief knowledge. By inputting similar instruction information repeatedly and so on, the network is converged to a certain level. For example, when a weighting of the word network reaches or exceeds 0.99, updating of the weightings between the corresponding words is stopped. Further, when a weighting falls to or below 0.1, updating of the weightings between the corresponding words is stopped. As a result, the network is categorized using experience information collected without meaning and the connections therebetween. Further, newly added information is converged into the actual word network using the categories as a base. As a result, an instruction from the user to “put the towel into the washing machine and wash it with detergent” generates a new connection (washing machine—detergent) that was not present in (1).
(3) Knowledge extraction: the behavior determination unit 23 has an inference (determination) function for solving problems. An inference relating to insufficient information for executing a task is made from the word network, which is converged continuously through the learning and experience accumulation of (2). Ex. 1) Inference function: in relation to an instruction from a human to “wash the towel”, the robot infers information that is missing from the instruction, such as “washing machine” and “detergent”, and outputs the behavior “wash the towel in the washing machine using detergent” as a result. Ex. 2) Error prevention function: in relation to the instruction “wash the leather shoes in the washing machine”, the weighting between the words “shoe” and “washing machine” is small, and therefore the fact that shoes must not be washed in the washing machine is held in the knowledge base. Hence, the robot does not carry out the behavior immediately, and instead checks with the human whether or not the leather shoes are to be washed in the washing machine, for example.
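The inference function and the error prevention function of (3) can be illustrated with the following sketch. The example weightings, the thresholds `LOW_WEIGHT` and `HIGH_WEIGHT`, and the function name `decide` are hypothetical; the text does not specify how small a weighting must be before the robot checks with the human, nor how large it must be before a word is inferred.

```python
LOW_WEIGHT = 0.3   # assumed: below this, a word pair is treated as doubtful
HIGH_WEIGHT = 0.7  # assumed: at or above this, a missing word is inferred

# Hypothetical converged weightings between word pairs.
WEIGHTS = {
    frozenset(("towel", "wash")): 0.9,
    frozenset(("towel", "washing machine")): 0.9,
    frozenset(("washing machine", "detergent")): 0.8,
    frozenset(("shoe", "washing machine")): 0.1,
}


def decide(instruction_words, weights):
    """Return ("confirm", pair) for a doubtful instruction,
    otherwise ("execute", words) with strongly related words added."""
    words = set(instruction_words)

    # Error prevention: a weakly weighted pair inside the instruction
    # means the robot should ask the human before acting.
    for pair, w in weights.items():
        if pair <= words and w < LOW_WEIGHT:
            return "confirm", sorted(pair)

    # Inference: repeatedly add words strongly linked to the current set.
    changed = True
    while changed:
        changed = False
        for pair, w in weights.items():
            if w >= HIGH_WEIGHT and len(pair & words) == 1:
                words |= pair
                changed = True
    return "execute", sorted(words)


inferred = decide(["wash", "towel"], WEIGHTS)
doubtful = decide(["wash", "shoe", "washing machine"], WEIGHTS)
```

With these assumed weightings, “wash the towel” is completed with “washing machine” and “detergent”, while the shoe instruction is returned for confirmation instead of being executed.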
Next, an operation of the behavior determination apparatus according to this embodiment will be described.
To construct this type of word network, the knowledge acquisition unit 21 in the behavior determination apparatus 20 of the robot 1 obtains common sense information from a human (step S1). As noted above, this information includes “put the towel in the washing machine”, “wipe it with the towel”, “wash the towel”, “put the shoes into the washing machine”, “put the towel into the shoes”, and so on, and accordingly, words such as “towel”, “washing machine”, “wipe”, “wash”, and “shoe” are extracted.
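The word extraction of step S1 might look like the sketch below. It assumes a fixed vocabulary of known index words and simple substring matching, with longer entries matched first so that “wash” is not picked out of “washing machine”; the text does not describe how extraction is actually performed, and `extract_words` and `VOCABULARY` are illustrative names.

```python
VOCABULARY = ["towel", "washing machine", "wipe", "wash", "shoe"]


def extract_words(instruction, vocabulary):
    """Return the vocabulary words found in an instruction, sorted alphabetically."""
    text = instruction.lower()
    found = []
    # Match longer entries first and blank them out, so that a shorter
    # entry contained in a longer one is not counted a second time.
    for term in sorted(vocabulary, key=len, reverse=True):
        if term in text:
            found.append(term)
            text = text.replace(term, " ")
    return sorted(found)


# Taught cases from the text, paired with the correct (True) / incorrect (False) label.
CASES = [
    ("put the towel in the washing machine", True),
    ("wipe it with the towel", True),
    ("wash the towel", True),
    ("put the shoes into the washing machine", False),
    ("put the towel into the shoes", False),
]

extracted = [(extract_words(text, VOCABULARY), label) for text, label in CASES]
```

Substring matching is deliberately crude (it finds “shoe” in “shoes” but would also find “wash” in “washed”); a real system would more plausibly rely on the voice and image recognition results mentioned earlier.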
Next, the network construction unit 22 searches the current word network to determine whether or not the corresponding words are included (step S2). In the example shown in
Next, behavior generation will be described. First, a task instruction is received from the user (step S11). The behavior determination unit 23 then searches the word network for words that match or are similar to the words included in the task (step S12). When a matching word is found (step S13: Yes), the weightings between the word and adjacent nodes are checked (step S14). The adjacent word having the largest weighting is then selected, followed by the word having the largest weighting among the words connected to the selected word. By repeating this processing, words are extracted step by step so that a maximum total task execution weighting is finally obtained (step S15).
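Steps S12 to S15 can be approximated by a greedy walk over the word network, as in the sketch below. The edge weightings, the `min_weight` cut-off, and the stopping rule (stop when no unvisited neighbor is weighted at or above the cut-off) are assumptions; a greedy walk also does not guarantee the maximum total weighting in general, so this is only one plausible reading of the procedure.

```python
# Hypothetical word network as (word, word, weighting) edges.
EDGES = [
    ("towel", "washing machine", 0.9),
    ("towel", "wipe", 0.6),
    ("washing machine", "detergent", 0.8),
    ("detergent", "wash", 0.7),
    ("shoe", "washing machine", 0.1),
    ("towel", "shoe", 0.1),
]


def build_adjacency(edges):
    """Turn an undirected edge list into a nested dict: word -> {neighbor: weighting}."""
    adjacency = {}
    for a, b, w in edges:
        adjacency.setdefault(a, {})[b] = w
        adjacency.setdefault(b, {})[a] = w
    return adjacency


def extract_task_words(start, adjacency, min_weight=0.5):
    """Follow the heaviest unvisited edge from each selected word.

    Returns None when the start word is not in the network (step S13: No).
    """
    if start not in adjacency:
        return None
    path = [start]
    current = start
    while True:
        candidates = {
            word: w
            for word, w in adjacency[current].items()
            if word not in path and w >= min_weight
        }
        if not candidates:
            return path
        current = max(candidates, key=candidates.get)
        path.append(current)


ADJACENCY = build_adjacency(EDGES)
task_words = extract_task_words("towel", ADJACENCY)
unknown = extract_task_words("bet", ADJACENCY)
```

With these assumed weightings, starting from “towel” yields the chain towel, washing machine, detergent, wash, while an unknown word such as “bet” yields no path, which corresponds to the case in which the robot asks the user what to do.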
For example, when the user says “towel”, the words towel-washing machine-detergent-wash are extracted from the word network shown in
On the other hand, when a provided word is not included in the word network, for example when the user says “bet”, the robot responds to the user with a question such as “What should I do?” (step S16), and thus incorrect actions are prevented. In other words, the robot may include a behavior confirmation unit for confirming with the user that the behavior determined by the behavior determination unit 23 is correct after the behavior is output. Thus, the user can teach the robot the correct behavior. In this case, the network construction unit 22 updates the weightings of the word network on the basis of the confirmation result obtained by the behavior confirmation unit. Thus, the word network can be updated in accordance with the teaching of the user. The network construction unit 22 is also capable of updating or enlarging the word network on the basis of recognition information provided externally or detected by the sensor unit 125. The network construction unit 22 is capable of generating a comprehensive intelligence network by connecting a network constructed from the recognition information (sensor information) to the word network, for example. As a result, the robot can express behavior that is even more varied.
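The update performed after a behavior confirmation could be sketched as follows. The sketch assumes that every pair of words in the confirmed or rejected behavior is adjusted by a fixed step, and it applies the stop conditions given for the network construction function (no further updating once a weighting is at or above 0.99, or at or below 0.1). The function name, the step size, and the pairwise treatment are assumptions.

```python
from itertools import combinations

UPPER_STOP = 0.99  # from the text: updating stops at or above this value
LOWER_STOP = 0.1   # from the text: updating stops at or below this value
STEP = 0.05        # assumed; the text gives a range of 0.01 to 0.1
INITIAL_WEIGHT = 0.5


def apply_confirmation(weights, behavior_words, confirmed):
    """Adjust pairwise weightings after the user confirms or rejects a behavior."""
    for pair in combinations(sorted(set(behavior_words)), 2):
        w = weights.get(pair, INITIAL_WEIGHT)
        if w >= UPPER_STOP or w <= LOWER_STOP:
            continue  # converged pair: weighting is no longer updated
        weights[pair] = w + STEP if confirmed else w - STEP
    return weights


weights = {
    ("shoe", "washing machine"): 0.1,   # already converged as an incorrect case
    ("towel", "washing machine"): 0.6,
}
apply_confirmation(weights, ["shoe", "washing machine"], confirmed=True)
apply_confirmation(weights, ["towel", "washing machine"], confirmed=True)
```

Here the shoe/washing machine weighting stays where it is because it has already reached the lower stop value, whereas the towel/washing machine weighting rises by one step.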
Further, “towel”, “detergent”, and so on are used as the nodes of the word network, but similar words to the respective nodes may be registered in each node, and a thesaurus or the like may be prepared. For example, assuming that “handkerchief” is a similar word to “towel”, when the user presents a handkerchief, the robot can use the network shown in
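Registering similar words in each node can be sketched as a lookup that maps an input word onto an existing node. The data layout and the name `resolve_node` are illustrative assumptions; only the towel/handkerchief pairing comes from the text.

```python
NODES = {"towel", "washing machine", "detergent", "wash"}

# Similar words registered per node (only the example from the text).
SIMILAR_WORDS = {"towel": {"handkerchief"}}


def resolve_node(word, nodes, similar_words):
    """Map a word to a network node, directly or through its registered similar words."""
    if word in nodes:
        return word
    for node, alternatives in similar_words.items():
        if word in alternatives:
            return node
    return None  # unknown word: the robot would ask the user


resolved = resolve_node("handkerchief", NODES, SIMILAR_WORDS)
```

A word that resolves to a node can then be handled exactly as that node would be, so no separate weightings need to be learned for it.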
In this embodiment, a word network is constructed from correct cases and incorrect cases taught by the user, and the word network is used to determine a behavior. When incorrect cases are taught in addition to correct cases, the word network converges more quickly on the taught experience. Further, when the word network is used, an action not included in the user instruction can be inferred. Moreover, by storing a large amount of associated information in a network, large numbers of words and behaviors can be gathered in a single information space, and therefore actions and reactions employing complicated information can be generated. Furthermore, the robot autonomously collects the information needed to execute a task in response to an incomplete task instruction from a human, and therefore the task instructions issued to the robot by the human can be simplified.
Note that the invention is not limited to the above embodiment alone, and may of course be subjected to various modifications within a scope that does not depart from the spirit of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2007-317453 | Dec 2007 | JP | national |