ACTION CONTROL DEVICE, ACTION CONTROL METHOD, AND RECORDING MEDIUM

Information

  • Patent Application
  • 20240173636
  • Publication Number
    20240173636
  • Date Filed
    October 16, 2023
  • Date Published
    May 30, 2024
Abstract
An action control device that controls the actions of a robot acquires an external stimulus; and in a case of executing an action corresponding to the external stimulus, controls so as to execute, based on an intimacy between a subject applying the external stimulus and the robot, different action content.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of Japanese Patent Application No. 2022-187973, filed on Nov. 25, 2022, the entire disclosure of which is incorporated by reference herein.


FIELD OF THE INVENTION

The present disclosure relates to an action control device, an action control method, and a recording medium.


BACKGROUND OF THE INVENTION

In the related art, various types of robots have been developed but, in recent years, advancements have been made in the development of not only industrial robots, but also of consumer robots such as pet robots. For example, Unexamined Japanese Patent Application Publication No. 2001-157985 describes a robot device, provided with a pressure sensor, that determines, by a pattern of a detected pressure detection signal, whether a person that contacts the robot device is a user that is registered in advance.


SUMMARY OF THE INVENTION

One aspect of an action control device according to the present disclosure is

    • an action control device that controls an action of a control target device, the action control device comprising:
    • a controller that
      • acquires an external stimulus, and
      • in a case where the controller executes an action corresponding to the external stimulus, controls so as to execute, based on an intimacy between a subject applying the external stimulus and the control target device, different action content.





BRIEF DESCRIPTION OF DRAWINGS

A more complete understanding of this application can be obtained when the following detailed description is considered in conjunction with the following drawings, in which:



FIG. 1 is a drawing illustrating the appearance of a robot according to Embodiment 1;



FIG. 2 is a cross-sectional view of the robot according to Embodiment 1, viewed from a side surface;



FIG. 3 is a drawing for explaining a housing of the robot according to Embodiment 1;



FIG. 4 is a block diagram illustrating the functional configuration of the robot according to Embodiment 1;



FIG. 5 is a drawing for explaining an example of an action mode setting table according to Embodiment 1;



FIG. 6 is a drawing for explaining an example of an emotion map according to Embodiment 1;



FIG. 7 is a drawing for explaining an example of a growth table according to Embodiment 1;



FIG. 8 is a drawing for explaining an example of an action content table according to Embodiment 1;



FIG. 9 is a flowchart of action control processing according to Embodiment 1;



FIG. 10 is a flowchart of microphone input processing according to Embodiment 1;



FIG. 11 is a drawing illustrating an example of a sound buffer according to Embodiment 1;



FIG. 12 is a flowchart of similarity with voice history determination processing according to Embodiment 1;



FIG. 13 is a flowchart of action mode setting processing according to Embodiment 1;



FIG. 14 is a flowchart of normal action mode processing according to Embodiment 1;



FIG. 15 is a flowchart of familiar action mode processing according to Embodiment 1;



FIG. 16 is a flowchart of touch response familiar action processing according to Embodiment 1;



FIG. 17 is a flowchart of sound response familiar action processing according to Embodiment 1; and



FIG. 18 is a flowchart of loud sound response familiar action processing according to Embodiment 1.





DETAILED DESCRIPTION OF THE INVENTION

Hereinafter, embodiments of the present disclosure are described while referencing the drawings. Note that, in the drawings, identical or corresponding components are denoted with the same reference numerals.


Embodiment 1

An embodiment in which an action control device according to Embodiment 1 is applied to a robot 200 illustrated in FIG. 1 is described while referencing the drawings. As illustrated in FIG. 1, the robot 200 according to the embodiment is a pet robot that resembles a small animal. The robot 200 is covered with an exterior 201 provided with bushy fur 203 and decorative parts 202 resembling eyes. A housing 207 of the robot 200 is accommodated in the exterior 201. As illustrated in FIG. 2, the housing 207 of the robot 200 includes a head 204, a coupler 205, and a torso 206. The head 204 and the torso 206 are coupled by the coupler 205.


Regarding the torso 206, as illustrated in FIG. 2, a twist motor 221 is provided at a front end of the torso 206, and the head 204 is coupled to the front end of the torso 206 via the coupler 205. The coupler 205 is provided with a vertical motor 222. Note that, in FIG. 2, the twist motor 221 is provided on the torso 206, but may be provided on the coupler 205 or on the head 204.


The coupler 205 couples the torso 206 and the head 204 so as to enable rotation (by the twist motor 221) around a first rotational axis that passes through the coupler 205 and extends in a front-back direction of the torso 206. The twist motor 221 rotates the head 204, with respect to the torso 206, clockwise (right rotation) within a forward rotation angle range around the first rotational axis (forward rotation), counter-clockwise (left rotation) within a reverse rotation angle range around the first rotational axis (reverse rotation), and the like. Note that, in this description, the term “clockwise” refers to clockwise when viewing the direction of the head 204 from the torso 206. A maximum value of the angle of twist rotation to the right (right rotation) or the left (left rotation) can be set as desired, and the angle of the head 204 in a state, as illustrated in FIG. 3, in which the head 204 is not twisted to the right or the left is referred to as a “twist reference angle.”


The coupler 205 couples the torso 206 and the head 204 so as to enable rotation (by the vertical motor 222) around a second rotational axis that passes through the coupler 205 and extends in a width direction of the torso 206. The vertical motor 222 rotates the head 204 upward (forward rotation) within a forward rotation angle range around the second rotational axis, downward (reverse rotation) within a reverse rotation angle range around the second rotational axis, and the like. A maximum value of the angle of rotation upward or downward can be set as desired, and the angle of the head 204 in a state, as illustrated in FIG. 3, in which the head 204 is not rotated upward or downward is referred to as a “vertical reference angle.” Note that, in FIG. 2, an example is illustrated in which the first rotational axis and the second rotational axis are orthogonal to each other, but a configuration is possible in which the first and second rotational axes are not orthogonal to each other.


The robot 200 includes a touch sensor 211 that can detect petting or striking of the robot 200 by a user. More specifically, as illustrated in FIG. 2, the robot 200 includes a touch sensor 211H on the head 204. The touch sensor 211H can detect petting or striking of the head 204 by the user. Additionally, as illustrated in FIGS. 2 and 3, the robot 200 includes a touch sensor 211LF and a touch sensor 211LR respectively on the front and rear of a left-side surface of the torso 206, and a touch sensor 211RF and a touch sensor 211RR respectively on the front and rear of a right-side surface of the torso 206. These touch sensors 211LF, 211LR, 211RF, 211RR can detect petting or striking of the torso 206 by the user.


The robot 200 includes an acceleration sensor 212 on the torso 206. The acceleration sensor 212 can detect an attitude (orientation) of the robot 200, and can detect being picked up, the orientation being changed, being thrown, and the like by the user. The robot 200 includes a gyrosensor 213 on the torso 206. The gyrosensor 213 can detect vibrating, rolling, rotating, and the like of the robot 200.


The robot 200 includes a microphone 214 on the torso 206. The microphone 214 can detect external sounds. Furthermore, the robot 200 includes a speaker 231 on the torso 206. The speaker 231 can be used to emit animal sounds, sing songs, and the like.


Note that, in the present embodiment, the acceleration sensor 212, the gyrosensor 213, the microphone 214, and the speaker 231 are provided on the torso 206, but a configuration is possible in which all or a portion of these components are provided on the head 204. Note that a configuration is possible in which, in addition to the acceleration sensor 212, the gyrosensor 213, the microphone 214, and the speaker 231 provided on the torso 206, all or a portion of these components are also provided on the head 204. The touch sensor 211 is provided on each of the head 204 and the torso 206, but a configuration is possible in which the touch sensor 211 is provided on only one of the head 204 and the torso 206. Moreover, a configuration is possible in which a plurality of any of these components is provided.


Next, the functional configuration of the robot 200 is described. As illustrated in FIG. 4, the robot 200 includes an action control device 100, a sensor 210, a driver 220, a sound outputter 230, and an operation inputter 240. Moreover, the action control device 100 includes a controller 110, a storage 120, and a communicator 130. In FIG. 4, the action control device 100, and the sensor 210, the driver 220, the sound outputter 230, and the operation inputter 240 are connected to each other via a bus line BL, but this is merely an example. A configuration is possible in which the action control device 100, and the sensor 210, the driver 220, the sound outputter 230, and the operation inputter 240 are connected by a wired interface such as a universal serial bus (USB) cable or the like, or by a wireless interface such as Bluetooth (registered trademark) or the like. Additionally, a configuration is possible in which the controller 110, and the storage 120 and the communicator 130 are connected via the bus line BL.


The action control device 100 controls, by the controller 110 and the storage 120, actions of the robot 200. Note that the robot 200 is a device that is controlled by the action control device 100 and, as such, is also called a “control target device.”


In one example, the controller 110 is configured from a central processing unit (CPU) or the like, and executes various processings described later using programs stored in the storage 120. Note that the controller 110 is compatible with multithreading functionality, in which a plurality of processings are executed in parallel. As such, the controller 110 can execute the various processings described below in parallel. Additionally, the controller 110 is provided with a clock function and a timer function, and can measure the date and time, and the like.


The storage 120 is configured from read-only memory (ROM), flash memory, random access memory (RAM), or the like. Programs to be executed by the CPU of the controller 110, and data needed in advance to execute these programs are stored in the ROM. The flash memory is writable non-volatile memory, and stores data that is desired to be retained even after the power is turned OFF. Data that is created or modified during the execution of the programs is stored in the RAM. In one example, the storage 120 stores a voice history, emotion data 121, emotion change data 122, a growth table 123, an action mode setting table 126, a sound buffer 127, and the like, all described hereinafter.


The communicator 130 includes a communication module compatible with a wireless local area network (LAN), Bluetooth (registered trademark), or the like, and carries out data communication with a smartphone or similar external device.


The sensor 210 includes the touch sensor 211, the acceleration sensor 212, the gyrosensor 213, and the microphone 214 described above. The controller 110 acquires, as external stimulus data, detection values detected by the various sensors of the sensor 210. The external stimulus data expresses an external stimulus acting on the robot 200. Note that a configuration is possible in which the sensor 210 includes sensors other than the touch sensor 211, the acceleration sensor 212, the gyrosensor 213, and the microphone 214. The types of external stimuli acquirable by the controller 110 can be increased by increasing the types of sensors of the sensor 210.


The touch sensor 211 detects contact by an object. The touch sensor 211 is configured from a pressure sensor or a capacitance sensor, for example. The controller 110 acquires a contact strength and/or a contact time on the basis of the detection values from the touch sensor 211 and, on the basis of these values, can detect an external stimulus such as the robot 200 being petted or struck by the user (for example, see Unexamined Japanese Patent Application Publication No. 2019-217122). Note that a configuration is possible in which the controller 110 detects these external stimuli by a sensor other than the touch sensor 211 (for example, see Japanese Patent No. 6575637).


The acceleration sensor 212 detects acceleration in three axial directions, namely the front-back direction (X-axis direction), the width (left-right) direction (Y-axis direction), and the vertical direction (Z-axis direction) of the torso 206 of the robot 200. The acceleration sensor 212 detects gravitational acceleration when the robot 200 is stopped and, as such, the controller 110 can detect a current attitude of the robot 200 on the basis of the gravitational acceleration detected by the acceleration sensor 212. Additionally, when, for example, the user picks up or throws the robot 200, the acceleration sensor 212 detects, in addition to the gravitational acceleration, acceleration caused by the movement of the robot 200. Accordingly, the controller 110 can detect the movement of the robot 200 by removing the gravitational acceleration component from the detection value detected by the acceleration sensor 212.
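As a minimal sketch of removing the gravitational acceleration component, the following assumes that gravity is tracked with an exponential low-pass filter and subtracted from each detection value. The filter, the class name GravityFilter, and the smoothing factor alpha are illustrative assumptions; the present disclosure does not specify how the component is removed.

```python
from dataclasses import dataclass


@dataclass
class GravityFilter:
    """Tracks a slowly varying gravity estimate (assumed low-pass approach)."""

    alpha: float = 0.9  # hypothetical smoothing factor, not from the disclosure
    gx: float = 0.0     # current gravity estimate along the X axis
    gy: float = 0.0     # current gravity estimate along the Y axis
    gz: float = 0.0     # current gravity estimate along the Z axis

    def update(self, ax: float, ay: float, az: float):
        """Update the gravity estimate and return the movement component."""
        self.gx = self.alpha * self.gx + (1.0 - self.alpha) * ax
        self.gy = self.alpha * self.gy + (1.0 - self.alpha) * ay
        self.gz = self.alpha * self.gz + (1.0 - self.alpha) * az
        # Movement = detected acceleration minus the estimated gravity.
        return (ax - self.gx, ay - self.gy, az - self.gz)
```

While the robot is stopped, the returned movement component stays near zero; a sudden change in the detection value, such as being picked up, appears almost entirely in the returned tuple.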


The gyrosensor 213 detects angular velocity of the three axes of the robot 200. The controller 110 can determine a rotation state of the robot 200 on the basis of the angular velocities of the three axes. Additionally, the controller 110 can determine a vibration state of the robot 200 on the basis of the maximum values of the angular velocities of the three axes.


The microphone 214 detects ambient sound of the robot 200. The controller 110 can, for example, detect, on the basis of a component of the sound detected by the microphone 214, that the user is speaking to the robot 200, that the user is clapping their hands, and the like.


Specifically, the controller 110 samples, at a prescribed sampling frequency (16,384 Hz in the present embodiment) and number of quantization bits (16 bits in the present embodiment), sound data acquired from the microphone 214, and stores the sampled sound data in the sound buffer 127 of the storage 120. In the present embodiment, the sound buffer 127 includes 16 consecutive buffers (storage regions) that each contain 512 samples of sampling data. Specifically, as illustrated in FIG. 11, voice similarity is determined with the 16 consecutive buffers (storage regions) 1270 to 1285 as one unit. In the present embodiment, the 16 consecutive buffers are expressed as array variables. For example, buffer 1270 is expressed as buf[0] and buffer 1285 is expressed as buf[15]. Together, the 16 buffers store 512 samples × 16 buffers / 16,384 Hz = 0.5 seconds of sound data.
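The layout of the sound buffer 127 can be sketched as follows. The sampling frequency, the buffer count, and the samples per buffer are taken from the present embodiment; the class name SoundBuffer and the policy of discarding the oldest buffer when a new one arrives are assumptions made for illustration.

```python
SAMPLING_HZ = 16_384        # sampling frequency of the present embodiment
SAMPLES_PER_BUFFER = 512    # samples held by one buffer
NUM_BUFFERS = 16            # buffers 1270 to 1285, i.e. buf[0] to buf[15]


class SoundBuffer:
    """Sixteen consecutive 512-sample buffers addressed as buf[0]..buf[15]."""

    def __init__(self):
        self.buf = [[0] * SAMPLES_PER_BUFFER for _ in range(NUM_BUFFERS)]

    def push(self, block):
        """Store the newest block in buf[15]; the shift policy is an assumption."""
        if len(block) != SAMPLES_PER_BUFFER:
            raise ValueError("one block must contain exactly 512 samples")
        self.buf.pop(0)               # discard the oldest buffer
        self.buf.append(list(block))  # newest data ends up in buf[15]

    def duration_seconds(self):
        """Length of sound held by all buffers together."""
        return NUM_BUFFERS * SAMPLES_PER_BUFFER / SAMPLING_HZ
```

With these constants, duration_seconds() evaluates to 0.5, matching the half second of sound data held by the 16 buffers.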


Note that processing in which the controller 110 stores the sound data acquired from the microphone 214 in the sound buffer 127 is executed in parallel with other processings as a sound buffer storage thread. Additionally, in the present embodiment, in voice characteristic parameter calculation processing, described later, the controller 110 performs, for the 16 buffers 1270 to 1285, processing for calculating three pieces of Cepstrum information from the 512 samples of sampling data in one buffer. The controller 110 treats the 48 (=3×16) pieces of data obtained thereby as a 48-dimension voice characteristic parameter.
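One way to picture the 48-dimension voice characteristic parameter is the following sketch, which computes a real cepstrum per buffer in pure Python. The present disclosure states only that three pieces of Cepstrum information are calculated per buffer; the choice of coefficients c[1] to c[3], the small constant added before the logarithm, and all function names are assumptions.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; the length of x must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def cepstrum_coefficients(samples, count=3):
    """Return real-cepstrum coefficients c[1]..c[count] (assumed selection)."""
    n = len(samples)
    # Log-magnitude spectrum; 1e-12 avoids log(0) and is an assumption.
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(samples)]
    # The log-magnitude spectrum of a real signal is symmetric, so its
    # inverse DFT is real and reduces to a cosine sum.
    return [
        sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
        for q in range(1, count + 1)
    ]


def voice_characteristic_parameter(buffers):
    """Concatenate three coefficients from each of the 16 buffers (48 values)."""
    parameter = []
    for samples in buffers:
        parameter.extend(cepstrum_coefficients(samples, 3))
    return parameter
```

Only the three required quefrencies are evaluated in the inverse transform, which keeps the per-buffer cost small compared with a full inverse FFT.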


Up to a history storage number (for example, 256) of these voice characteristic parameters are stored in the storage 120 on a first-in first-out (FIFO) basis. In the present embodiment, the FIFO storing the voice characteristic parameters is called “VFIFO,” and the number of voice characteristic parameters stored in the VFIFO is stored in a variable called “VFIFO_SIZE.” A history of the voice characteristic parameter is stored in the VFIFO and, as such, the VFIFO is also called “voice history.”
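The VFIFO can be sketched with a bounded double-ended queue, as follows. The history storage number of 256 and the 48-dimension parameter come from the present embodiment; the class name VoiceHistory and its methods are hypothetical.

```python
from collections import deque

HISTORY_STORAGE_NUMBER = 256   # example value from the present embodiment
PARAMETER_DIMENSIONS = 48      # 3 pieces of Cepstrum information x 16 buffers


class VoiceHistory:
    """First-in first-out store of voice characteristic parameters (VFIFO)."""

    def __init__(self, capacity=HISTORY_STORAGE_NUMBER):
        # deque(maxlen=...) silently discards the oldest entry when full.
        self._fifo = deque(maxlen=capacity)

    @property
    def vfifo_size(self):
        """Number of stored parameters, corresponding to VFIFO_SIZE."""
        return len(self._fifo)

    def store(self, parameter):
        if len(parameter) != PARAMETER_DIMENSIONS:
            raise ValueError("expected a 48-dimension parameter")
        self._fifo.append(tuple(parameter))

    def oldest(self):
        return self._fifo[0]
```

Once 256 parameters are stored, each new parameter displaces the oldest one, so vfifo_size stays at 256.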


Returning to FIG. 4, the driver 220 includes the twist motor 221 and the vertical motor 222, and is driven by the controller 110. The controller 110 controls the driver 220 and, as a result, the robot 200 can express actions such as, for example, lifting the head 204 up (rotating upward around the second rotational axis), twisting the head 204 sideways (twisting/rotating to the right or to the left around the first rotational axis), and the like. Action control data for performing these actions are stored in the storage 120, and the actions of the robot 200 are controlled on the basis of the detected external stimulus, a growth value described later, and the like.


The sound outputter 230 includes the speaker 231, and sound is output from the speaker 231 as a result of sound data being input into the sound outputter 230 by the controller 110. For example, the robot 200 emits a pseudo-animal sound as a result of the controller 110 inputting animal sound data of the robot 200 into the sound outputter 230. This animal sound data is also stored in the storage 120, and an animal sound is selected on the basis of the detected external stimulus, a growth value described later, and the like.


In one example, the operation inputter 240 is configured from an operation button, a volume knob, or the like. The operation inputter 240 is an interface for receiving operations performed by the user (owner or borrower) such as, for example, turning the power ON/OFF, adjusting the volume of the output sound, and the like. Note that a configuration is possible in which, in order to further enhance a sense of lifelikeness, the robot 200 includes only a power switch as the operation inputter 240 on the inside of the exterior 201, and does not include other operation buttons, the volume knob, and the like. In such a case as well, operations such as adjusting the volume of the robot 200 can be performed using an external smartphone or the like connected via the communicator 130.


The functional configuration of the robot 200 is described above. Next, action modes of the robot 200 set by the controller 110 of the action control device 100 are described. In the present embodiment, the robot 200 has two action modes, namely a normal action mode and a familiar action mode. Typically, the robot 200 operates in the normal action mode but, when a person that has high intimacy with the robot 200 (a person intimate with the robot 200, for example, the owner, a person who always cares for the robot 200, or the like) speaks to the robot 200, the robot 200 transitions from the normal action mode to the familiar action mode and operates in the familiar action mode for a certain amount of time. Note that the familiar action mode is an action mode that is transitioned to when the person intimate with the robot 200 is near and, as such, is also called an “intimate action mode.”


The normal action mode is an action mode in which an action prepared in advance is performed on the basis of an externally-received stimulus (sound, touch, or the like), an emotion at that time, or the like, regardless of the intimacy between the robot 200 and the user near the robot 200. For example, in the normal action mode, the robot 200 performs a surprised action when the robot 200 hears a loud sound, and performs a happy action when petted.


The familiar action mode is an action mode that is transitioned to from the normal action mode when a determination is made, on the basis of a likelihood (certainty) between the robot 200 and the user near the robot 200, that the user near the robot 200 is a person with high intimacy with the robot 200. The familiar action mode is set for only a certain amount of time. In the familiar action mode, the robot 200 performs an action of playing (playing around) with the user near the robot 200 in accordance with the intimacy.
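The time-limited nature of the familiar action mode can be sketched as a mode holder with an expiry time. The class and method names are hypothetical, and the use of a monotonic clock is an assumption; the present disclosure states only that the controller 110 has a timer function.

```python
import time
from enum import Enum


class ActionMode(Enum):
    NORMAL = "normal"
    FAMILIAR = "familiar"


class ActionModeState:
    """Holds the current action mode and reverts to normal after a timeout."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock            # injectable clock, for testing
        self._familiar_until = None    # expiry time of the familiar mode

    def enter_familiar(self, duration_seconds):
        """Set the familiar action mode for the given amount of time."""
        self._familiar_until = self._clock() + duration_seconds

    def current(self):
        """Return the mode in effect now, expiring the familiar mode if due."""
        if self._familiar_until is not None and self._clock() < self._familiar_until:
            return ActionMode.FAMILIAR
        self._familiar_until = None
        return ActionMode.NORMAL
```

Passing the clock as a parameter keeps the timing logic testable without waiting for minutes to elapse.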


Specifically, for an action mode setting based on voice, a recognition level is determined in accordance with the action mode setting table 126 illustrated in FIG. 5 and on the basis of the similarity between the voice characteristic parameter of the acquired voice and the voice history, and one action mode, namely the normal action mode or the familiar action mode (three minutes, four minutes, five minutes), is set. When the similarity between the voice characteristic parameter of the acquired voice and the voice history is low (lower than a predetermined threshold), the controller 110 determines, in accordance with the action mode setting table 126, that the intimacy between the robot 200 and the person speaking to the robot 200 is low (that is, that the person is not a “person that always cares for the robot 200”), and sets the action mode to the normal action mode.


When the similarity between the voice characteristic parameter of the acquired voice and the voice history is high (higher than the predetermined threshold), the controller 110 determines, in accordance with the action mode setting table 126, that the intimacy between the robot 200 and the person speaking to the robot 200 is high (that is, that the person is “a person that always cares for the robot 200”). Moreover, the controller 110 recognizes, on the basis of the likelihood (the certainty of “definitely” or “probably” or “maybe”) corresponding to the level of the similarity, that person as a “person that always cares for the robot 200”, and sets the action mode to the familiar action mode for a familiar amount of time corresponding to the likelihood. For example, when the similarity is very high, a first familiar amount of time (for example, five minutes) is set as the familiar amount of time, when the similarity is high, a second familiar amount of time (for example, four minutes) is set as the familiar amount of time, and when the similarity is medium, a third familiar amount of time (for example, three minutes) is set as the familiar amount of time.
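A lookup in the spirit of the action mode setting table 126 can be sketched as follows. The familiar amounts of time (five, four, and three minutes) come from the present embodiment; the similarity scale of 0 to 1, the threshold values, and the pairing of “definitely,” “probably,” and “maybe” with the very high, high, and medium levels are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class ModeSetting(NamedTuple):
    mode: str                   # "normal" or "familiar"
    likelihood: Optional[str]   # certainty label, None in the normal mode
    familiar_minutes: int       # familiar amount of time, 0 in the normal mode


# (similarity threshold, likelihood, familiar minutes), highest level first.
# The threshold values are hypothetical; only the minutes are from the text.
ACTION_MODE_SETTING_TABLE = [
    (0.90, "definitely", 5),  # similarity very high
    (0.75, "probably", 4),    # similarity high
    (0.60, "maybe", 3),       # similarity medium
]


def set_action_mode(similarity: float) -> ModeSetting:
    """Map a voice similarity to an action mode and familiar amount of time."""
    for threshold, likelihood, minutes in ACTION_MODE_SETTING_TABLE:
        if similarity > threshold:
            return ModeSetting("familiar", likelihood, minutes)
    # Similarity at or below the lowest threshold: intimacy is judged low.
    return ModeSetting("normal", None, 0)
```

For example, under these assumed thresholds a similarity of 0.8 yields the familiar action mode for four minutes, while a similarity of 0.2 leaves the robot in the normal action mode.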


In the present embodiment, the setting of the action mode is performed on the basis of voice similarity, but the setting of the action mode is not limited to being performed on the basis of voice similarity. For example, a configuration is possible in which the action mode is set to the familiar action mode when the manner of petting is similar to the past history. Additionally, a configuration is possible in which both the voice and the manner of petting are used to define each of a familiar action mode for when the similarity of both is high, a familiar action mode for when only the similarity of the voice history is high, and a familiar action mode for when only the similarity of a touch history is high (for example, see Japanese Patent Application No. 2021-158663 for a method for determining whether the method of petting is similar to the past history).


Additionally, a configuration is possible in which, instead of absolutely setting the robot 200 to the familiar action mode when the similarity to the history is high, the controller 110 sets the robot 200 to the familiar action mode on the basis of a certain probability (for example, a probability corresponding to an amount of growth (growth value, described later) of the robot 200). Moreover, a configuration is possible in which, when the robot 200 is not set to the familiar action mode regardless of the similarity to the history being high, a familiar action (action when it is recognized that the person near the robot 200 is the owner or a person that always cares for the robot 200) described in, for example, Japanese Patent Application No. 2021-158663, is set on the basis of the certain probability.


Next, of the data stored in the storage 120, the emotion data 121, the emotion change data 122, the growth table 123, the action content table 124, and the growth days count data 125, which are pieces of data required to determine general actions determined on the basis of the growth value and the like, are described in order. The herein described general actions are performed in the normal action mode of the present embodiment.


The emotion data 121 is data for imparting pseudo-emotions to the robot 200, and is data (X, Y) that represents coordinates on an emotion map 300. As illustrated in FIG. 6, the emotion map 300 is expressed by a two-dimensional coordinate system with a degree of relaxation (degree of worry) axis as an X axis 311, and a degree of excitement (degree of disinterest) axis as a Y axis 312. An origin 310 (0, 0) on the emotion map 300 represents an emotion when normal. Moreover, as the value of the X coordinate (X value) is positive and the absolute value thereof increases, emotions for which the degree of relaxation is high are expressed and, as the value of the Y coordinate (Y value) is positive and the absolute value thereof increases, emotions for which the degree of excitement is high are expressed. Additionally, as the X value is negative and the absolute value thereof increases, emotions for which the degree of worry is high are expressed and, as the Y value is negative and the absolute value thereof increases, emotions for which the degree of disinterest is high are expressed. Note that, in FIG. 6, the emotion map 300 is expressed as a two-dimensional coordinate system, but the number of dimensions of the emotion map 300 may be set as desired.


In the present embodiment, regarding the size of the emotion map 300 as the initial value, as illustrated by frame 301 of FIG. 6, a maximum value of both the X value and the Y value is 100 and a minimum value is −100. Moreover, during a first period, each time the pseudo growth days count of the robot 200 increases one day, the maximum value of the emotion map 300 increases by two and the minimum value decreases by two. Here, the first period is a period in which the robot 200 grows in a pseudo manner, and is, for example, a period of 50 days from a pseudo birth of the robot 200. Note that the pseudo birth of the robot 200 is the time of the first start up by the user of the robot 200 after shipping from the factory. When the growth days count is 25 days, as illustrated by frame 302 of FIG. 6, the maximum value of the X value and the Y value is 150 and the minimum value is −150. Moreover, when the first period (in this example, 50 days) elapses, the pseudo growth of the robot 200 ends and, as illustrated in frame 303 of FIG. 6, the maximum value of the X value and the Y value is 200, the minimum value is −200, and the size of the emotion map 300 is fixed.
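The size of the emotion map 300 as a function of the growth days count can be written as a short worked example. The initial value, the daily change, and the length of the first period come from the present embodiment; the function names are hypothetical.

```python
INITIAL_LIMIT = 100      # initial maximum value (minimum value is -100)
DAILY_INCREASE = 2       # change in magnitude per pseudo growth day
FIRST_PERIOD_DAYS = 50   # length of the first period in the example


def emotion_map_limit(growth_days: int) -> int:
    """Return the maximum X/Y value of the emotion map for a growth days count."""
    # Growth stops when the first period elapses, so the day count is capped.
    days = max(0, min(growth_days, FIRST_PERIOD_DAYS))
    return INITIAL_LIMIT + DAILY_INCREASE * days


def emotion_map_range(growth_days: int):
    """Return (minimum value, maximum value) of the emotion map."""
    limit = emotion_map_limit(growth_days)
    return (-limit, limit)
```

This reproduces the three frames of FIG. 6: (−100, 100) at day 0, (−150, 150) at day 25, and (−200, 200) from day 50 onward.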


The emotion change data 122 is data that sets an amount of change by which each of an X value and a Y value of the emotion data 121 is increased or decreased. In the present embodiment, as emotion change data 122 corresponding to the X value of the emotion data 121, DXP that increases the X value and DXM that decreases the X value are provided and, as emotion change data 122 corresponding to the Y value of the emotion data 121, DYP that increases the Y value and DYM that decreases the Y value are provided. Specifically, the emotion change data 122 includes the following four variables, and is data expressing degrees to which the pseudo emotions of the robot 200 are changed.

    • DXP: Tendency to relax (tendency to change in the positive value direction of the X value on the emotion map)
    • DXM: Tendency to worry (tendency to change in the negative value direction of the X value on the emotion map)
    • DYP: Tendency to be excited (tendency to change in the positive value direction of the Y value on the emotion map)
    • DYM: Tendency to be disinterested (tendency to change in the negative value direction of the Y value on the emotion map)


In the present embodiment, an example is described in which the initial value of each of these variables is set to 10, and the value increases to a maximum of 20 by processing for learning emotion change data 122 in action control processing, described later. Due to this learning processing, the emotion change data 122, that is, the degree of change of emotion changes and, as such, the robot 200 assumes various personalities in accordance with the manner in which the user interacts with the robot 200. That is, the personality of each individual robot 200 is formed differently on the basis of the manner in which the user interacts with the robot 200.
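How the emotion change data 122 might act on the emotion data 121 can be sketched as follows. The initial value of 10 and the maximum of 20 come from the present embodiment; the learning step of 1, the signed relax and excite inputs, and the clamping to the emotion map limit are assumptions, since the learning processing itself is described later in the disclosure.

```python
from dataclasses import dataclass

EMOTION_CHANGE_INITIAL = 10   # initial value of DXP, DXM, DYP, DYM
EMOTION_CHANGE_MAX = 20       # upper limit reached through learning


@dataclass
class EmotionChangeData:
    dxp: int = EMOTION_CHANGE_INITIAL  # tendency to relax
    dxm: int = EMOTION_CHANGE_INITIAL  # tendency to worry
    dyp: int = EMOTION_CHANGE_INITIAL  # tendency to be excited
    dym: int = EMOTION_CHANGE_INITIAL  # tendency to be disinterested

    def learn(self, name: str, step: int = 1) -> None:
        """Increase one variable, capped at 20 (the step size is hypothetical)."""
        setattr(self, name, min(EMOTION_CHANGE_MAX, getattr(self, name) + step))


def clamp(value: int, limit: int) -> int:
    return max(-limit, min(limit, value))


def apply_change(x, y, change, relax=0, excite=0, limit=100):
    """Move (X, Y) on the emotion map; relax and excite are -1, 0, or +1."""
    if relax > 0:
        x += change.dxp
    elif relax < 0:
        x -= change.dxm
    if excite > 0:
        y += change.dyp
    elif excite < 0:
        y -= change.dym
    # Keep the emotion data inside the current size of the emotion map.
    return clamp(x, limit), clamp(y, limit)
```

A robot whose DXP has grown to 20 moves twice as far toward relaxation per stimulus as a newly started robot, which is one way the personality differences described here could arise.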


In the present embodiment, each piece of personality data (personality value) is derived by subtracting 10 from each piece of emotion change data 122. Specifically, a value obtained by subtracting 10 from DXP that expresses a tendency to be relaxed is set as a personality value (chipper), a value obtained by subtracting 10 from DXM that expresses a tendency to be worried is set as a personality value (shy), a value obtained by subtracting 10 from DYP that expresses a tendency to be excited is set as a personality value (active), and a value obtained by subtracting 10 from DYM that expresses a tendency to be disinterested is set as a personality value (spoiled).


The initial value of each personality value is 0 and, as the robot 200 grows, each personality value changes, with an upper limit of 10, due to external stimuli and the like (manner in which the user interacts with the robot 200) detected by the sensor 210. In a case in which, as in the present embodiment, four personality values change from 0 to 10, it is possible to express 14,641 types of personalities (11 to the 4th power).


In the present embodiment, the greatest value among these four personality values is used as growth level data (the growth value) that expresses a pseudo growth level of the robot 200. Moreover, the controller 110 controls so that variation is introduced into the action content of the robot 200 in accordance with the pseudo growth of the robot 200 (as the growth value increases). The data used by the controller 110 for this purpose is the growth table 123.
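The derivation of the personality values and the growth value can be expressed as a short worked example. The offset of 10 and the mapping of DXP, DXM, DYP, and DYM to the four personalities come from the present embodiment; the function names are hypothetical.

```python
PERSONALITY_OFFSET = 10   # subtracted from each piece of emotion change data


def personality_values(dxp: int, dxm: int, dyp: int, dym: int) -> dict:
    """Derive the four personality values from the emotion change data."""
    return {
        "chipper": dxp - PERSONALITY_OFFSET,  # from the tendency to relax
        "shy": dxm - PERSONALITY_OFFSET,      # from the tendency to worry
        "active": dyp - PERSONALITY_OFFSET,   # from the tendency to be excited
        "spoiled": dym - PERSONALITY_OFFSET,  # from the tendency to be disinterested
    }


def growth_value(personality: dict) -> int:
    """The growth value is the greatest of the four personality values."""
    return max(personality.values())


# Each value ranges over 0..10 (11 levels), so 11 ** 4 personalities exist.
NUMBER_OF_PERSONALITIES = 11 ** 4
```

With DXP = 13, DXM = 15, DYP = 18, and DYM = 14, the personality values are chipper 3, shy 5, active 8, and spoiled 4, and the growth value is 8.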


As illustrated in FIG. 7, types of actions to be performed by the robot 200 in response to an action trigger such as the external stimulus detected by the sensor 210 or the like, and a probability of each action being selected in accordance with the growth value (hereinafter referred to as “action selection probability”) are stored in the growth table 123. Note that the action trigger is information about the external stimulus or the like that triggers the performance of some sort of action by the robot 200.


For example, a case is assumed in which, as a current personality value of the robot 200, the personality value (chipper) is 3, the personality value (active) is 8, the personality value (shy) is 5, and the personality value (spoiled) is 4, and a loud sound is detected by the microphone 214. In this case, the growth value is 8, which is the maximum value of the four personality values, and the action trigger is “heard a loud sound.” Moreover, in the growth table 123 illustrated in FIG. 7, when referencing the entry for when the action trigger is “heard a loud sound” and the growth value is 8, it is clear that the action selection probability of “basic action 2-0” is 20%, the action selection probability of “basic action 2-1” is 20%, the action selection probability of “basic action 2-2” is 40%, and the action selection probability of “personality action 2-0” is 20%.


That is, in this case, the “basic action 2-0” is selected at a probability of 20%, the “basic action 2-1” is selected at a probability of 20%, the “basic action 2-2” is selected at a probability of 40%, and the “personality action 2-0” is selected at a probability of 20%. Moreover, when the “personality action 2-0” is selected, selection according to the four personality values of one of four types of personality actions such as those illustrated in FIG. 8 is further performed. Then, the robot 200 executes the selected action.
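
The selection described above can be illustrated with a minimal Python sketch. Only the single table row quoted in the text (action trigger “heard a loud sound”, growth value 8) is taken from the description; the dictionary layout and the names GROWTH_TABLE, growth_value, and select_action are assumptions made for illustration and are not part of the embodiment.

```python
import random

# Hypothetical in-memory form of one row of the growth table 123.
# Only this row (trigger "heard a loud sound", growth value 8) is quoted
# in the description; the layout itself is an assumption.
GROWTH_TABLE = {
    ("heard a loud sound", 8): {
        "basic action 2-0": 20,
        "basic action 2-1": 20,
        "basic action 2-2": 40,
        "personality action 2-0": 20,
    },
}


def growth_value(personality):
    """Return the greatest of the four personality values."""
    return max(personality.values())


def select_action(trigger, personality, rng=random):
    """Pick one action type using the action selection probabilities."""
    probabilities = GROWTH_TABLE[(trigger, growth_value(personality))]
    actions = list(probabilities)
    weights = list(probabilities.values())
    # random.choices draws one item with probability proportional to its weight.
    return rng.choices(actions, weights=weights, k=1)[0]


personality = {"chipper": 3, "active": 8, "shy": 5, "spoiled": 4}
selected = select_action("heard a loud sound", personality)
```

When “personality action 2-0” is returned, a further selection among the personality actions of FIG. 8 would follow; that step is not sketched here.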


Note that, in the growth table 123 (FIG. 7) of the present embodiment, one personality action is selected for each action trigger but, as with the basic actions, a configuration is possible in which the types of selected personality actions are increased in accordance with an increase in the personality values. Additionally, in the present embodiment, only the growth table 123 (FIG. 7) for defining actions when in the normal action mode is defined, but a configuration is possible in which a growth table for defining actions when in the familiar action mode is separately defined. Moreover, a configuration is possible in which the content of FIG. 5 is also incorporated into the content of the growth table 123 (FIG. 7) and a growth table is set that defines, in the action type, not only actions for when in the normal action mode but also actions for when in the familiar action mode.


Provided that the growth table 123 can, for each action trigger, define a function (growth function) that returns, with the growth value as an argument, the action selection probability of each action type, any form may be used for the growth table 123, and the growth table 123 need not necessarily be in the form of tabular data such as illustrated in FIG. 7.


As illustrated in FIG. 8, the action content table 124 is a table in which specific action content of the various action types defined in the growth table 123 is stored and, for the personality actions, action content is defined for every type of personality. Note that the action content table 124 is not essential data. For example, the action content table 124 is unnecessary in a case in which the growth table 123 is constructed such that specific action content is directly recorded in the action type field of the growth table 123.


The growth days count data 125 has an initial value of 1, and 1 is added for each passing day. The growth days count data 125 represents a pseudo growth days count (number of days from a pseudo birth) of the robot 200. In the present embodiment, a period of the growth days count expressed by the growth days count data 125 is called a “second period.”


Next, the action control processing executed by the controller 110 of the action control device 100 is described while referencing the flowchart illustrated in FIG. 9. The action control processing is processing in which the controller 110 controls the actions (motion, animal sound, or the like) of the robot 200 on the basis of detection values from the sensor 210 or the like. When the user turns ON the power of the robot 200, execution of a thread of this action control processing is started in parallel with other required processings. As a result of the action control processing, the driver 220 and the sound outputter 230 are controlled, the motion of the robot 200 is expressed, sounds such as animal sounds and the like are output, and the like.


Firstly, the controller 110 initializes the various types of data such as the emotion data 121, the emotion change data 122, the growth days count data 125, and the like (step S101). The various variables used in the present embodiment (BigSound_Flag, TalkSound_Flag, Talkdefinitely_Flag, Talkprobably_Flag, Talkmaybe_Flag, Touch_Flag, and the like) are also initialized to OFF or 0 in step S101. Additionally, the controller 110 sets the action mode to the normal action mode in step S101.


Next, the controller 110 executes microphone input processing for acquiring the external stimulus (voice) of the subject (the user) from the microphone 214 (step S102). Next, the controller 110 executes action mode setting processing for setting the action mode (step S103). Details of the action mode setting processing are described later but, mainly, the action mode setting processing is processing for setting, on the basis of the similarity between the external stimulus acquired in step S102 and the past history, the action mode to the normal action mode or the familiar action mode presented in the action mode setting table 126 illustrated in FIG. 5.


Next, the controller 110 executes touch input processing for acquiring the external stimulus from the touch sensor 211 and/or the acceleration sensor 212 (step S104). In the touch input processing, when touched or when there is a change in acceleration or angular velocity, the controller 110 sets the Touch_Flag to ON, calculates a touch characteristic parameter, and determines, on the basis of the similarity between the calculated touch characteristic parameter and a touch history, which is the history of past touch characteristic parameters, the intimacy with the subject (the user) applying the external stimulus (see Japanese Patent Application No. 2021-158663 for details about the touch input processing).


Note that, in the present embodiment, to facilitate comprehension, the microphone input processing and the touch input processing are described as separate processings, but a configuration is possible in which processing for acquiring the external stimulus from the various types of sensors of the sensor 210 and determining the intimacy with the subject (the user) applying the external stimulus is executed as a single processing (external input processing). Additionally, in the present embodiment, the action mode setting processing is executed in step S103, but a configuration is possible in which the action mode is set in consideration of an external input other than voice by executing the action mode setting processing after the touch input processing or after the external input processing.


Next, the controller 110 determines whether the external stimulus is acquired by the sensor 210 (step S105). For example, when a sound-based external stimulus is detected, as a result of the microphone input processing described above, the BigSound_Flag (flag that turns ON when a loud sound is detected) or the TalkSound_Flag (flag that turns ON when the voice of a person is detected) is set to ON and, as such, the controller 110 can determine, on the basis of the values of these flag variables, whether the external stimulus is acquired in step S105.


When a determination is made that the external stimulus is acquired (step S105; Yes), the controller 110 acquires, in accordance with the external stimulus acquired in the microphone input processing and the touch input processing, the emotion change data 122 to be added to or subtracted from the emotion data 121 (step S106). When, for example, petting of the head 204 is detected as the external stimulus, the robot 200 obtains a pseudo sense of relaxation and, as such, the controller 110 acquires DXP as the emotion change data 122 to be added to the X value of the emotion data 121.


Next, the controller 110 sets the emotion data 121 in accordance with the emotion change data 122 acquired in step S106 (step S107). When, for example, DXP is acquired as the emotion change data 122 in step S106, the controller 110 adds the DXP of the emotion change data 122 to the X value of the emotion data 121. However, in a case in which a value (X value, Y value) of the emotion data 121 exceeds the maximum value of the emotion map 300 when adding the emotion change data 122, that value of the emotion data 121 is set to the maximum value of the emotion map 300. In addition, in a case in which a value of the emotion data 121 is less than the minimum value of the emotion map 300 when subtracting the emotion change data 122, that value of the emotion data 121 is set to the minimum value of the emotion map 300.


In steps S106 and S107, any type of settings are possible for the type of emotion change data 122 acquired and the emotion data 121 set for each individual external stimulus. Examples are described below.

    • The head 204 is petted (relax): X=X+DXP
    • The head 204 is struck (worry): X=X−DXM


      (these external stimuli can be detected by the touch sensor 211H of the head 204)
    • The torso 206 is petted (excite): Y=Y+DYP
    • The torso 206 is struck (disinterest): Y=Y−DYM


      (these external stimuli can be detected by the touch sensor 211 of the torso 206)
    • Held with head upward (happy): X=X+DXP and Y=Y+DYP
    • Suspended with head downward (sad): X=X−DXM and Y=Y−DYM


      (these external stimuli can be detected by the touch sensor 211 and the acceleration sensor 212)
    • Spoken to in kind voice (peaceful): X=X+DXP and Y=Y−DYM
    • Yelled at in loud voice (upset): X=X−DXM and Y=Y+DYP


      (these external stimuli can be detected by the microphone 214)
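
The examples listed above, together with the limiting to the maximum and minimum values of the emotion map 300 described for step S107, can be sketched as follows. The numeric bounds (−100 and 100), the starting magnitudes of DXP, DXM, DYP, and DYM (10 each), and all identifiers are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class EmotionState:
    x: int = 0
    y: int = 0
    # Assumed bounds of the emotion map 300 (illustrative values).
    map_min: int = -100
    map_max: int = 100


# Assumed magnitudes of the emotion change data 122 (illustrative values).
CHANGE = {"DXP": 10, "DXM": 10, "DYP": 10, "DYM": 10}

# (direction on X, direction on Y) for each external stimulus listed above.
STIMULUS_EFFECT = {
    "head petted": (+1, 0),
    "head struck": (-1, 0),
    "torso petted": (0, +1),
    "torso struck": (0, -1),
    "held head upward": (+1, +1),
    "suspended head downward": (-1, -1),
    "kind voice": (+1, -1),
    "yelled at": (-1, +1),
}


def clamp(value, low, high):
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


def apply_stimulus(state, stimulus, change=CHANGE):
    """Add or subtract the emotion change data, then limit to the map."""
    sx, sy = STIMULUS_EFFECT[stimulus]
    dx = change["DXP"] if sx > 0 else -change["DXM"] if sx < 0 else 0
    dy = change["DYP"] if sy > 0 else -change["DYM"] if sy < 0 else 0
    state.x = clamp(state.x + dx, state.map_min, state.map_max)
    state.y = clamp(state.y + dy, state.map_min, state.map_max)
    return state
```

For instance, under these assumed values, applying "kind voice" to a state at X=95, Y=−95 leaves the state at the corner X=100, Y=−100 of the map rather than beyond it.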


Next, the controller 110 determines whether the current action mode is the normal action mode or the familiar action mode (step S108). When a determination is made that the current action mode is the normal action mode (step S108: Normal action mode), the controller 110 executes normal action mode processing, described later (step S112), and executes step S115.


When a determination is made that the current action mode is the familiar action mode (step S108: Familiar action mode), the controller 110 executes familiar action mode processing described later (step S109). Then, the controller 110 determines whether a familiar action amount of time (predetermined amount of time set from the start of the familiar action mode) set in the action mode setting processing of step S103 has elapsed (step S110). When the familiar action amount of time has not elapsed (step S110; No), the controller 110 executes step S115.


When a determination is made that the familiar action amount of time has elapsed (step S110; Yes), the controller 110 sets the action mode to the normal action mode (step S111), and executes step S115.


Meanwhile, when a determination is made in step S105 that the external stimulus is not acquired (step S105; No), the controller 110 determines whether to perform a spontaneous action such as a breathing action that creates the impression that the robot 200 is breathing, or the like, by periodically driving the twist motor 221 and the vertical motor 222 at a certain rhythm (step S113). Any method may be used as the method for determining whether to perform the spontaneous action and, in the present embodiment, it is assumed that the determination of step S113 is “Yes” and the breathing action is performed every breathing cycle (for example, two seconds).


When a determination is made to perform the spontaneous action (step S113; Yes), the controller 110 executes the spontaneous action (for example, the breathing action) (step S114), and executes step S115.


When a determination is made to not perform the spontaneous action (step S113; No), the controller 110 uses a built-in clock function to determine whether a date has changed (step S115). When a determination is made that the date has not changed (step S115: No), the controller 110 executes step S102.


Meanwhile, when a determination is made that the date has changed (step S115; Yes), the controller 110 determines whether it is in a first period (step S116). When the first period is, for example, a period 50 days from the pseudo birth (for example, the first startup by the user after purchase) of the robot 200, the controller 110 determines that it is in the first period when the growth days count data 125 is 50 or less. When a determination is made that it is not in the first period (step S116; No), the controller 110 executes step S118.


When a determination is made that it is in the first period (step S116; Yes), the controller 110 executes learning processing of the emotion change data 122, and expands the emotion map (step S117). The learning processing of the emotion change data 122 is, specifically, processing for updating the emotion change data 122 by adding 1 to the DXP of the emotion change data 122 when the X value of the emotion data 121 is set to the maximum value of the emotion map 300 even once in step S107 of that day, adding 1 to the DYP of the emotion change data 122 when the Y value of the emotion data 121 is set to the maximum value of the emotion map 300 even once in step S107 of that day, adding 1 to the DXM of the emotion change data 122 when the X value of the emotion data 121 is set to the minimum value of the emotion map 300 even once in step S107 of that day, and adding 1 to the DYM of the emotion change data 122 when the Y value of the emotion data 121 is set to the minimum value of the emotion map 300 even once in step S107 of that day.


However, when the various values of the emotion change data 122 become exceedingly large, the amount of change of one time of the emotion data 121 becomes exceedingly large and, as such, the maximum value of the various values of the emotion change data 122 is set to 20, for example, and the various values are limited to that maximum value or less. Here, 1 is added to each piece of the emotion change data 122, but the value to be added is not limited to 1. For example, a configuration is possible in which a number of times at which the various values of the emotion data 121 are set to the maximum value or the minimum value of the emotion map 300 is counted and, when that number of times is great, the numerical value to be added to the emotion change data 122 is increased.
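
A minimal sketch of this learning processing, assuming the fixed increment of 1 and the example upper limit of 20, is given below. How the controller 110 records that a bound of the emotion map 300 was reached during the day is not specified here, so the hit_today dictionary is a hypothetical bookkeeping structure.

```python
MAX_CHANGE = 20  # example upper limit given in the description


def learn_emotion_change(change, hit_today, step=1, cap=MAX_CHANGE):
    """Return the emotion change data after one day of learning.

    change:    dict with keys "DXP", "DXM", "DYP", "DYM".
    hit_today: hypothetical dict with keys "x_max", "x_min", "y_max",
               "y_min"; True when that bound of the emotion map was
               reached at least once in step S107 of that day.
    """
    bound_for = {"DXP": "x_max", "DXM": "x_min", "DYP": "y_max", "DYM": "y_min"}
    return {
        # Increase only the entries whose bound was reached, never past cap.
        key: min(cap, value + step) if hit_today[bound_for[key]] else value
        for key, value in change.items()
    }
```

The variation in which the increment grows with the number of times a bound is reached could be expressed by passing a larger step for that day.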


Expanding the emotion map 300 in step S117 of FIG. 9 is, specifically, processing in which the controller 110 expands both the maximum value and the minimum value of emotion map 300 by 2. However, the numerical value “2” to be expanded is merely an example, and the emotion map 300 may be expanded by 3 or greater, or be expanded by 1. Additionally, the numerical values that the emotion map 300 is expanded by for the maximum value and the minimum value need not be the same.


Then, the controller 110 adds 1 to the growth days count data 125, initializes both the X value and the Y value of the emotion data 121 to 0 (step S118), and executes step S102.
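
The date-change handling of steps S116 to S118 can be sketched as below, with the learning processing of the emotion change data 122 omitted for brevity. The dictionary keys are hypothetical, and the sketch assumes that expanding the minimum value means lowering it by the expansion amount.

```python
FIRST_PERIOD_DAYS = 50  # example length of the first period
MAP_EXPANSION = 2       # example expansion amount


def on_date_change(robot):
    """Apply steps S116 to S118 to a dict holding the robot state."""
    if robot["growth_days"] <= FIRST_PERIOD_DAYS:  # step S116
        # Step S117 (expansion only): widen the emotion map in both directions.
        robot["map_max"] += MAP_EXPANSION
        robot["map_min"] -= MAP_EXPANSION
    robot["growth_days"] += 1                      # step S118
    robot["x"], robot["y"] = 0, 0                  # reset the emotion data
    return robot
```

Under these assumptions, a robot whose growth days count is 50 still has its map expanded on that date change, and no further expansion occurs once the count is 51 or greater.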


Next, the microphone input processing executed in step S102 of the action control processing is described while referencing FIGS. 10 and 11.


Firstly, the controller 110 substitutes, for a variable ML, a maximum level of the sampling data of the voice that is acquired by the microphone input processing and stored in the sound buffer 127 (step S201). Next, the controller 110 determines whether the value of the variable ML is greater than a BigSoundTh (step S202). Note that the BigSoundTh is a value (loud sound threshold), and the robot 200 performs a surprised action in response to sounds louder than the BigSoundTh. When a determination is made that the variable ML is greater than the BigSoundTh (step S202: Yes), the controller 110 sets a variable BigSound_Flag, indicating that a loud sound has been input, to ON (step S203), ends the microphone input processing, and executes step S103 of the action control processing.


Meanwhile, when a determination is made that the variable ML is not greater than the BigSoundTh (step S202; No), the controller 110 determines whether the value of the variable ML is greater than a TalkSoundTh (step S204). Note that the TalkSoundTh is a value (talking voice threshold), and the robot 200 cannot hear, as a talking voice, sounds that are quieter than or equal to the TalkSoundTh. When a determination is made that the variable ML is not greater than the TalkSoundTh (step S204: No), the controller 110 ends the microphone input processing, and executes step S103 of the action control processing.
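
The two threshold comparisons of steps S202 and S204 can be summarized in a short sketch. The threshold values and the returned labels are hypothetical, and taking the absolute value of each sample as its level is an assumption about how the maximum level is obtained.

```python
BIG_SOUND_TH = 20000   # hypothetical loud sound threshold
TALK_SOUND_TH = 2000   # hypothetical talking voice threshold


def classify_level(samples):
    """Classify sampling data by its maximum level (steps S201 to S204)."""
    ml = max(abs(s) for s in samples)  # step S201 (absolute level assumed)
    if ml > BIG_SOUND_TH:              # step S202
        return "big_sound"
    if ml > TALK_SOUND_TH:             # step S204
        return "talk_candidate"
    return "ignored"
```

A "talk_candidate" result only means that the level is sufficient; whether the sound is actually treated as a talking voice depends on the subsequent processing.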


Meanwhile, when a determination is made that the variable ML is greater than the TalkSoundTh (step S204: Yes), the controller 110 determines whether a number of buffers storing the sound data in the sound buffer 127 is less than a reference number (here, the 16 buffers 1270 to 1285) (step S205). When a determination is made that the number of buffers is less than the reference number (step S205; Yes), the controller 110 executes step S205, and continues the storing of the reference number of buffers.


Meanwhile, when a determination is made that the number of buffers storing the sound data has reached the reference number (step S205; No), the controller 110 determines whether the sound stored in the reference number of buffers is noise (step S206). As an example of a method for determining whether the sound is noise, when the sound stored in the buffers is a talking voice that is not noise, a sound of a level greater than TalkSoundTh occurs for a certain amount of time (for example, 0.1 seconds or longer). Meanwhile, when the sound stored in the buffers is noise, there is a high possibility that the sound is a single, momentary sound. The controller 110 uses such sound characteristics to determine whether the sound stored in each buffer is noise.


Firstly, for a predetermined number of buffers (in the present embodiment, three sound buffers, namely, buffer 1270, buffer 1271, and buffer 1272) from the beginning (the buffer 1270) among the reference number of buffers, the controller 110 investigates the number of buffers in which, of the sampling data stored in each buffer, sampling data having a maximum value greater than the TalkSoundTh is stored. When a determination is made that there is even one buffer in which sampling data having a maximum value less than or equal to the TalkSoundTh is stored, the sampling data of the reference number of buffers stored this time is determined to be noise. Meanwhile, when a determination is made that the maximum level of the sampling data stored in all of the buffers is greater than the TalkSoundTh, the sampling data is determined to be not noise.
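
The noise determination described above reduces to checking that each of the leading buffers contains at least one sample above the TalkSoundTh. A minimal sketch follows; the threshold value is hypothetical, and each buffer is assumed to be a non-empty list of sample levels.

```python
TALK_SOUND_TH = 2000   # hypothetical talking voice threshold
LEADING_BUFFERS = 3    # buffers 1270, 1271, and 1272 in the present embodiment


def is_noise(buffers, threshold=TALK_SOUND_TH, leading=LEADING_BUFFERS):
    """Return True when any leading buffer never exceeds the threshold."""
    return any(max(buf) <= threshold for buf in buffers[:leading])
```

With these assumed values, a sustained voice such as [[3000, 100], [2500], [2100]] is not noise, whereas a momentary click such as [[3000], [100], [50]] is.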


When a determination is made that the sound stored in the reference number of buffers is noise (step S206; Yes), the controller 110 disregards the sampling data stored in the current reference number of buffers (that is, determines that there are no sound external stimuli that constitute an action trigger), ends the microphone input processing, and executes step S103 of the action control processing.


Meanwhile, when a determination is made that the sound stored in the reference number of buffers is not noise (step S206; No), the controller 110 determines that the sampling data is a talking voice, substitutes ON for the variable TalkSound_Flag that indicates that a talking voice is inputted (step S207), and performs voice characteristic parameter calculation processing (step S208). The voice characteristic parameter calculation processing is processing for calculating the voice characteristic parameter by calculating a Cepstrum from the sampling data stored in the sound buffer 127 (for details, see Japanese Patent Application No. 2021-158663).


Next, the controller 110 performs similarity with voice history determination processing (step S209). The similarity with voice history determination processing is processing for calculating a similarity by comparing the voice characteristic parameter calculated by the voice characteristic parameter calculation processing and the voice history, and outputting from return=0 to return=3 in accordance with the similarity (0=not similar, 1=medium similarity, 2=high similarity, and 3=very high similarity).


Then, the controller 110 determines an output result of the similarity with voice history determination processing (step S210). When the determination result of the similarity with voice history determination processing is return=3 (step S210; Yes), the controller 110 substitutes ON for a variable Talkdefinitely_Flag indicating that the robot 200 recognizes that the voice definitely is a person that always cares for the robot 200 (step S211), and executes step S212.


When a determination is made that the determination result of the similarity with voice history determination processing is not return=3 (step S210; No), the controller 110 determines whether the determination result of the similarity with voice history determination processing is return=2 (that is, high similarity) (step S213). When a determination is made that the determination result is return=2 (step S213; Yes), the controller 110 substitutes ON for a variable Talkprobably_Flag indicating that the robot 200 recognizes that the voice probably is a person that always cares for the robot 200 (step S214), and executes step S212.


When a determination is made that the determination result of the similarity with voice history determination processing is not return=2 (step S213; No), the controller 110 determines whether the determination result of the similarity with voice history determination processing is return=1 (that is, medium similarity) (step S215). When a determination is made that the determination result is return=1 (step S215; Yes), the controller 110 substitutes ON for a variable Talkmaybe_Flag indicating that the robot 200 recognizes that the voice maybe is a person that always cares for the robot 200 (step S216), and executes step S212.


When a determination is made that the determination result of the similarity with voice history determination processing is not return=1 (step S215; No), the controller 110 substitutes ON for a variable Talkgeneralaction_Flag indicating that a general action is to be performed (step S217), and executes step S212.


Next, in step S212, the controller 110 stores, in the voice history (VFIFO), the voice characteristic parameter calculated in step S208 (step S212). Then, the controller 110 ends the microphone input processing, and executes step S103 of the action control processing.
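
Steps S210 to S217 and step S212 amount to mapping the returned similarity level to one flag and then appending the voice characteristic parameter to the voice history. The sketch below assumes that the flags are held in a dictionary and that the voice history is a bounded FIFO; the capacity of 256 is a hypothetical value.

```python
from collections import deque

FLAG_FOR_RESULT = {
    3: "Talkdefinitely_Flag",  # very high similarity
    2: "Talkprobably_Flag",    # high similarity
    1: "Talkmaybe_Flag",       # medium similarity
}


def record_talking_voice(result, parameter, flags, vfifo):
    """Set the flag for the similarity result, then store the parameter."""
    flag = FLAG_FOR_RESULT.get(result, "Talkgeneralaction_Flag")
    flags[flag] = True        # steps S211, S214, S216, or S217
    vfifo.append(parameter)   # step S212
    return flag


flags = {}
vfifo = deque(maxlen=256)  # hypothetical capacity of the voice history
```

Using a deque with maxlen discards the oldest entry automatically once the history is full, which is one possible reading of the name VFIFO.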


Next, the similarity with voice history determination processing executed in step S209 of the microphone input processing is described while referencing FIG. 12.


Firstly, the controller 110 determines whether the number of voice histories stored in the buffer (the variable VFIFO_Size) is greater than a minimum voice reference number (in the present embodiment, 32) (step S251). When a determination is made that the stored number is less than or equal to the minimum voice reference number (step S251; No), the controller 110 outputs “return=0” (expressing not similar), ends the similarity with voice history determination processing, and executes step S210 of the microphone input processing.


When a determination is made that the stored number is greater than the minimum voice reference number (step S251; Yes), the controller 110 initializes a variable abssimCnt for counting the number of voice histories for which the similarity is very high, a variable simCnt for counting the number of voice histories for which the similarity is high, a variable maysimCnt for counting the number of voice histories for which the similarity is medium, and a variable i for stipulating the various elements (VFIFO[0] to VFIFO[VFIFO_Size−1]), as array variables, of the voice history VFIFO to 0 (step S252).


Next, the controller 110 calculates a distance (L2 norm) between the voice characteristic parameter calculated in step S208 and the VFIFO[i], and substitutes this distance for a variable d[i] (step S253). Next, the controller 110 determines whether the value of the variable d[i] is less than a VAbsSimTh (voice extremely highly similar threshold) (step S254). Note that a value less than a VSimTh (voice similar threshold), described later, is set in advance as the VAbsSimTh (voice extremely highly similar threshold). When a determination is made that the variable d[i] is less than the VAbsSimTh (step S254; Yes), the controller 110 adds 1 to the variable abssimCnt (step S255), and executes step S256. When a determination is made that the variable d[i] is greater than or equal to the VAbsSimTh (step S254; No), the controller 110 executes step S256.


Next, the controller 110 determines whether the value of the variable d[i] is less than the VSimTh (set in advance as the voice similar threshold) (step S256). When a determination is made that the variable d[i] is less than the VSimTh (step S256; Yes), the controller 110 adds 1 to the variable simCnt (step S257), and executes step S258. When a determination is made that the variable d[i] is greater than or equal to the VSimTh (step S256; No), the controller 110 executes step S258.


Next, in step S258, the controller 110 determines whether the value of the variable d[i] is less than a VMaySimTh (voice medium similar threshold). Note that a value greater than the VSimTh (voice similar threshold) is set in advance as the VMaySimTh (voice medium similar threshold). When a determination is made that the variable d[i] is less than the VMaySimTh (step S258: Yes), the controller 110 adds 1 to the variable maysimCnt (step S259), and executes step S260. When a determination is made that the variable d[i] is greater than or equal to the VMaySimTh (step S258; No), the controller 110 executes step S260.


In step S260, the controller 110 adds 1 to the variable i. Next, the controller 110 determines whether the value of the variable i is less than the variable VFIFO_Size (step S261). When a determination is made that the variable i is less than the variable VFIFO_Size (step S261; Yes), the controller 110 executes step S253.


When a determination is made that the variable i is greater than or equal to the variable VFIFO_Size (step S261; No), the controller 110 determines whether a ratio of the variable abssimCnt to the variable VFIFO_Size exceeds 20% (step S262). When a determination is made that the ratio of the variable abssimCnt to the variable VFIFO_Size exceeds 20% (step S262: Yes), the similarity between the voice characteristic parameter calculated in step S208 and the voice history is very high and, as such, the controller 110 outputs “return=3”, ends the similarity with voice history determination processing, and executes step S210 of the microphone input processing.


Meanwhile, when a determination is made that the ratio of the variable abssimCnt to the variable VFIFO_Size is less than or equal to 20% (step S262: No), the controller 110 determines whether a ratio of the variable simCnt to the variable VFIFO_Size exceeds 20% (step S263). When a determination is made that the ratio of the variable simCnt to the variable VFIFO_Size exceeds 20% (step S263; Yes), the similarity between the voice characteristic parameter calculated in step S208 and the voice history is high and, as such, the controller 110 outputs “return=2”, ends the similarity with voice history determination processing, and executes step S210 of the microphone input processing.


When a determination is made that the ratio of the variable simCnt to the variable VFIFO_Size is less than or equal to 20% (step S263; No), the controller 110 determines whether a ratio of the variable maysimCnt to the variable VFIFO_Size exceeds 30% (step S264). When a determination is made that the ratio of the variable maysimCnt to the variable VFIFO_Size exceeds 30% (step S264; Yes), the similarity between the voice characteristic parameter calculated in step S208 and the voice history is medium and, as such, the controller 110 outputs “return=1”, ends the similarity with voice history determination processing, and executes step S210 of the microphone input processing.


Meanwhile, when a determination is made that the ratio of the variable maysimCnt to the variable VFIFO_Size is less than or equal to 30% (step S264; No), the voice characteristic parameter calculated in step S208 and the voice history are not similar and, as such, the controller 110 outputs “return=0”, ends the similarity with voice history determination processing, and executes step S210 of the microphone input processing. Note that the comparison values “20%” and “30%” in the determinations described above are merely examples, and can be changed as needed together with the VAbsSimTh, the VSimTh, and the VMaySimTh.
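
The flow of steps S251 to S264 can be condensed into the following sketch. The three distance thresholds are hypothetical values chosen only to satisfy VAbsSimTh < VSimTh < VMaySimTh, and the voice characteristic parameters are assumed to be equal-length sequences of numbers.

```python
import math

MIN_VOICE_REFERENCE = 32  # minimum voice reference number of the embodiment
V_ABS_SIM_TH = 1.0        # hypothetical VAbsSimTh
V_SIM_TH = 2.0            # hypothetical VSimTh
V_MAY_SIM_TH = 3.0        # hypothetical VMaySimTh


def similarity_with_voice_history(parameter, vfifo):
    """Return 0 (not similar) to 3 (very high similarity)."""
    size = len(vfifo)
    if size <= MIN_VOICE_REFERENCE:          # step S251
        return 0
    abssim_cnt = sim_cnt = maysim_cnt = 0    # step S252
    for past in vfifo:                       # steps S253 to S261
        d = math.dist(parameter, past)       # L2 norm of the difference
        if d < V_ABS_SIM_TH:
            abssim_cnt += 1
        if d < V_SIM_TH:
            sim_cnt += 1
        if d < V_MAY_SIM_TH:
            maysim_cnt += 1
    if abssim_cnt / size > 0.20:             # step S262
        return 3
    if sim_cnt / size > 0.20:                # step S263
        return 2
    if maysim_cnt / size > 0.30:             # step S264
        return 1
    return 0
```

Because the three counters are nested (a history entry counted in abssimCnt is also counted in simCnt and maysimCnt), the checks are made from the strictest level downward.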


Next, the action mode setting processing that is executed in step S103 of the action control processing (FIG. 9) is described while referencing FIG. 13.


Firstly, the controller 110 determines whether the subject (the user or the like) applying the external stimulus definitely is a person that always cares for the robot 200 (that is, whether the Talkdefinitely_Flag is ON) (step S131). When a determination is made that the subject definitely is a person that always cares for the robot 200 (step S131; Yes), the controller 110 sets the action mode to the familiar action mode, sets the familiar action amount of time to a first familiar action amount of time (for example, five minutes) (step S132), ends the action mode setting processing, and executes step S104 of the action control processing.


When a determination is made that the subject (the user or the like) applying the external stimulus is not definitely a person that always cares for the robot 200 (that is, when the Talkdefinitely_Flag is not ON) (step S131; No), the controller 110 determines whether the subject (the user or the like) applying the external stimulus probably is a person that always cares for the robot 200 (that is, whether the Talkprobably_Flag is ON) (step S133). When a determination is made that the subject probably is a person that always cares for the robot 200 (step S133: Yes), the controller 110 sets the action mode to the familiar action mode, sets the familiar action amount of time to a second familiar action amount of time (for example, four minutes) (step S134), ends the action mode setting processing, and executes step S104 of the action control processing.


When a determination is made that the subject (the user or the like) applying the external stimulus is not probably a person that always cares for the robot 200 (that is, when the Talkprobably_Flag is not ON) (step S133; No), the controller 110 determines whether the subject (the user or the like) applying the external stimulus “maybe is a person that always cares for the robot 200” (that is, whether the Talkmaybe_Flag is ON) (step S135). When a determination is made that the subject “maybe is a person that always cares for the robot 200” (step S135; Yes), the controller 110 sets the action mode to the familiar action mode, sets the familiar action amount of time to a third familiar action amount of time (for example, three minutes) (step S136), ends the action mode setting processing, and executes step S104 of the action control processing.


When a determination is made that the subject (the user or the like) applying the external stimulus is not “maybe a person that always cares for the robot 200” (that is, when the Talkmaybe_Flag is not ON) (step S135: No), the controller 110 sets the action mode to the normal action mode (step S137), ends the action mode setting processing, and executes step S104 of the action control processing.


As a result of the action mode setting processing described above, the controller 110 sets the action mode on the basis of the likelihood obtained from the level of similarity between the subject (the user or the like) applying the external stimulus to the robot 200 and the past voice history. In cases in which the action mode is set to the familiar action mode, the action mode is returned to the normal action mode when the predetermined familiar action amount of time elapses from the start of the familiar action mode. When setting the familiar action mode again during the period in which the familiar action mode is set, the familiar action amount of time is re-set (updated) on the basis of a degree of confidence in the intimacy. Accordingly, a user that always cares for the robot 200 can extend the amount of time that the action mode is set to the familiar action mode by occasionally speaking to the robot 200 when in the familiar action mode.
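
A minimal sketch of steps S131 to S137, together with the elapsed-time check of steps S110 and S111, is shown below. The example durations of five, four, and three minutes come from the description; representing time as seconds, returning a deadline, and the function names are assumptions. When and where the flags are reset is not covered by this sketch.

```python
# Example familiar action amounts of time, in seconds, checked in order of
# decreasing confidence.
FAMILIAR_SECONDS = (
    ("Talkdefinitely_Flag", 5 * 60),  # first familiar action amount of time
    ("Talkprobably_Flag", 4 * 60),    # second familiar action amount of time
    ("Talkmaybe_Flag", 3 * 60),       # third familiar action amount of time
)


def set_action_mode(flags, now):
    """Return (mode, deadline) following steps S131 to S137."""
    for name, seconds in FAMILIAR_SECONDS:
        if flags.get(name):
            # Entering (or re-entering) the familiar action mode restarts
            # the familiar action amount of time from the current moment.
            return "familiar", now + seconds
    return "normal", None


def expire_if_elapsed(mode, deadline, now):
    """Return to the normal action mode once the deadline has passed."""
    if mode == "familiar" and now >= deadline:
        return "normal", None
    return mode, deadline
```

Calling set_action_mode again while the familiar action mode is active replaces the deadline, which corresponds to the re-setting (updating) of the familiar action amount of time described above.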


Next, the normal action mode processing that is executed in step S112 of the action control processing (FIG. 9) is described while referencing FIG. 14.


Firstly, the controller 110 determines whether there is an external stimulus such as a touch or the like in the touch input processing (step S151). Specifically, it is sufficient that the controller 110 determines whether the Touch_Flag is ON. When there is a touch or the like (step S151; Yes), the controller 110 performs a touch general action (step S152). The touch general action is a general action performed when the user pets the body of the robot 200, holds the robot 200, or the like, and specifically is an action set in the action type field of the growth table 123, with “the body is petted or held” as the action trigger (in FIG. 7, the basic action 0-0 and the like). Next, the controller 110 substitutes OFF for the variable Touch_Flag (step S153), ends the normal action mode processing, and executes step S115 of the action control processing (FIG. 9).


Meanwhile, when there is not an external stimulus such as a touch or the like in the touch input processing (step S151; No), the controller 110 determines whether there is a sound as the external stimulus in the microphone input processing (step S154). Specifically, it is sufficient that the controller 110 determines whether the TalkSound_Flag is ON. When there is a sound (step S154; Yes), the controller 110 performs a “Talk general action” (step S155). The “Talk general action” is a general action performed when the user speaks to the robot 200, and specifically is an action set in the action type field of the growth table 123, with “the robot 200 is spoken to” as the action trigger (in FIG. 7, the basic action 1-0 and the like). Next, the controller 110 substitutes OFF for the variable TalkSound_Flag (step S156), ends the normal action mode processing, and executes step S115 of the action control processing (FIG. 9).


Meanwhile, when there is not a sound as the external stimulus in the microphone input processing (step S154; No), the controller 110 determines whether there is a loud sound as the external stimulus in the microphone input processing (step S157). Specifically, it is sufficient that the controller 110 determines whether the BigSound_Flag is ON. When there is a loud sound (step S157; Yes), the controller 110 executes an action of reacting to the loud sound (step S158). That is, the controller 110 executes an action (the basic action 2-0 or the like) corresponding to “heard a loud sound” as the action trigger of the growth table 123 illustrated in FIG. 7. Then, the controller 110 substitutes OFF for the variable BigSound_Flag (step S159), ends the normal action mode processing, and executes step S115 of the action control processing (FIG. 9).


Meanwhile, when there is not a loud sound as the external stimulus (step S157; No), the controller 110 executes an action corresponding to another external stimulus (when an action trigger corresponding to the external stimulus acquired in the microphone input processing and/or the touch input processing exists in the growth table 123, an action corresponding to that action trigger) (step S160), ends the normal action mode processing, and executes step S115 of the action control processing (FIG. 9).
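The priority order of the normal action mode processing (touch first, then speech, then loud sound, then any other stimulus) can be sketched as a single pass over the flags. This is an illustrative sketch only: the function name normal_action_mode, the use of a dictionary for the flags, and the returned action labels are assumptions, and the lookup in the growth table 123 is replaced by plain strings.

```python
def normal_action_mode(flags: dict) -> str:
    """Choose the action for one pass and turn OFF the flag that triggered it."""
    # Flags are checked in the order given in the text.
    priority = (
        ("Touch_Flag", "touch general action"),      # steps S151 to S153
        ("TalkSound_Flag", "Talk general action"),   # steps S154 to S156
        ("BigSound_Flag", "loud sound reaction"),    # steps S157 to S159
    )
    for flag_name, action in priority:
        if flags.get(flag_name, False):
            flags[flag_name] = False   # "substitutes OFF" for the variable
            return action
    return "other stimulus action"     # step S160
```

Because only the first flag that is ON is handled and cleared, a lower-priority stimulus that arrived at the same time stays pending for the next pass of this sketch.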


Next, the familiar action mode processing that is executed in step S109 of the action control processing (FIG. 9) is described while referencing FIG. 15.


Firstly, the controller 110 determines whether there is an external stimulus such as a touch or the like in the touch input processing (step S171). Specifically, it is sufficient that the controller 110 determines whether the Touch_Flag is ON. When there is a touch or the like (step S171; Yes), the controller 110 executes touch response familiar action processing (step S172). The touch response familiar action processing is described later. Next, the controller 110 substitutes OFF for the variable Touch_Flag (step S173), ends the familiar action mode processing, and executes step S110 of the action control processing (FIG. 9).


Meanwhile, when there is not an external stimulus such as a touch or the like in the touch input processing (step S171; No), the controller 110 determines whether there is a sound as the external stimulus in the microphone input processing (step S174). Specifically, it is sufficient that the controller 110 determines whether the TalkSound_Flag is ON. When there is a sound (step S174; Yes), the controller 110 executes sound response familiar action processing (step S175). The sound response familiar action processing is described later. Next, the controller 110 substitutes OFF for the variable TalkSound_Flag (step S176), ends the familiar action mode processing, and executes step S110 of the action control processing (FIG. 9).


Meanwhile, when there is not a sound as the external stimulus in the microphone input processing (step S174; No), the controller 110 determines whether there is a loud sound as the external stimulus in the microphone input processing (step S177). Specifically, it is sufficient that the controller 110 determines whether the BigSound_Flag is ON. When there is a loud sound (step S177; Yes), the controller 110 executes loud sound response familiar action processing (step S178). The loud sound response familiar action processing is described later. Then, the controller 110 substitutes OFF for the variable BigSound_Flag (step S179), ends the familiar action mode processing, and executes step S110 of the action control processing (FIG. 9).


Meanwhile, when there is not a loud sound as the external stimulus (step S177; No), the controller 110 executes an action corresponding to another external stimulus (when an action trigger corresponding to the external stimulus acquired in the microphone input processing and/or the touch input processing exists in the growth table 123, an action corresponding to that action trigger) (step S180), ends the familiar action mode processing, and executes step S110 of the action control processing (FIG. 9).


Next, the touch response familiar action processing that is executed in step S172 of the familiar action mode processing (FIG. 15) is described while referencing FIG. 16.


Firstly, the controller 110 determines, by the touch sensor 211H, whether the head 204 is being held down (step S301). This can be determined on the basis of whether the pressure acquired by the touch sensor 211H is greater than or equal to a predetermined threshold. When the head 204 is not being held down (step S301; No), the controller 110 ends the touch response familiar action processing, and executes step S173 of the familiar action mode processing (FIG. 15).


When the head 204 is being held down (step S301; Yes), the controller 110 performs an action of raising the torso 206 (step S302). Specifically, the controller 110 raises the head 204 using the vertical motor 222. Since the user is holding the head 204 down, the torso 206 is raised by raising the head 204 using the vertical motor 222. Note that when the force of the user holding the head 204 down is weak, the head 204 is expected to rise without the torso 206 rising. As such, the predetermined threshold used for the determination in step S301 is set to a value at which the torso 206 rises when step S302 is executed.


Next, the controller 110 determines whether the head 204 is still being held down (step S303). When the head 204 is still being held down (step S303; Yes), the controller 110 executes step S302, and repeats the action of raising the torso 206.


When the head 204 is not being held down (step S303; No), the controller 110 returns the robot 200 to the original state (typically, a cyclical breathing action) (step S304), ends the touch response familiar action processing, and executes step S173 of the familiar action mode processing (FIG. 15).


As a result of this touch response familiar action processing, when the user holds down the head 204 of the robot 200, the robot 200 raises the torso 206 in response thereto and, as such, the user can be given the impression of playing with the robot 200.
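The loop formed by steps S301 to S304 can be sketched as below. This is a minimal sketch under stated assumptions: the sensor and motor are represented by caller-supplied callables (read_head_pressure, raise_torso, return_to_original_state), all of which are hypothetical names, and the threshold value is left to the caller because the text does not give one.

```python
from typing import Callable


def touch_response_familiar_action(
    read_head_pressure: Callable[[], float],
    raise_torso: Callable[[], None],
    return_to_original_state: Callable[[], None],
    threshold: float,
) -> int:
    """Run steps S301 to S304 once and return how many times the torso was raised."""

    def head_held_down() -> bool:
        # Held down means the pressure is greater than or equal to the threshold.
        return read_head_pressure() >= threshold

    if not head_held_down():         # step S301; No
        return 0

    raises = 0
    while True:
        raise_torso()                # step S302
        raises += 1
        if not head_held_down():     # step S303; No
            break

    return_to_original_state()       # step S304
    return raises
```

Passing the hardware access in as callables keeps the control flow testable without a robot; a real controller would presumably also pace the loop, which this sketch omits.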


Next, the sound response familiar action processing that is executed in step S175 of the familiar action mode processing (FIG. 15) is described while referencing FIG. 17.


Firstly, the controller 110 determines, by the touch sensor 211 of the torso 206, whether the torso 206 is being touched (step S311). When the torso 206 is not being touched (step S311; No), the controller 110 ends the sound response familiar action processing, and executes step S176 of the familiar action mode processing (FIG. 15).


When the torso 206 is being touched (step S311; Yes), the controller 110 performs a trembling action (step S312). Specifically, the controller 110 causes the robot 200 to tremble by moving the head 204 left and right at small increments (for details of this processing, see Unexamined Japanese Patent Application Publication No. 2022-142113).


Next, the controller 110 determines whether the torso 206 is still being touched (step S313). When the torso 206 is still being touched (step S313; Yes), the controller 110 executes step S312 and repeats the trembling action.


When the torso 206 is not being touched (step S313; No), the controller 110 performs an action of raising the head 204 and looking around (step S314). Specifically, the controller 110 uses the vertical motor 222 to raise the head 204, and uses the twist motor 221 to rotate the head 204 to the left and right.


Then, the controller 110 returns the robot 200 to the original state (typically, a cyclical breathing action) (step S315), ends the sound response familiar action processing, and executes step S176 of the familiar action mode processing (FIG. 15).


As a result of this sound response familiar action processing, when the user holds down the torso 206 of the robot 200, the robot 200 trembles and, as such, the robot 200 can give the impression of being frightened due to the body being held down. Moreover, when the hand of the user is removed from the robot 200, the robot 200 raises the head 204 and looks around as if to say “the danger has passed.” As a result, the user can feel that the robot 200 is more adorable.
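The sequence of actions produced by steps S311 to S315 can be traced with a small function that consumes successive torso touch readings. This is an illustrative sketch, not the disclosed implementation: the function name, the action labels, and the representation of the touch sensor 211 as an iterable of booleans are assumptions, and a reading sequence that runs out is treated as "not touched".

```python
from typing import Iterable, List


def sound_response_familiar_actions(torso_touched_samples: Iterable[bool]) -> List[str]:
    """Return the ordered list of actions for one run of steps S311 to S315."""
    samples = iter(torso_touched_samples)
    actions: List[str] = []

    if not next(samples, False):                    # step S311; No
        return actions

    while True:
        actions.append("tremble")                   # step S312
        if not next(samples, False):                # step S313; No
            break

    actions.append("raise head and look around")    # step S314
    actions.append("return to original state")      # step S315
    return actions
```

For example, readings of touched, touched, released yield two trembling actions followed by the looking-around action and the return to the original state.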


Next, the loud sound response familiar action processing that is executed in step S178 of the familiar action mode processing (FIG. 15) is described while referencing FIG. 18.


Firstly, the controller 110 generates a random number, namely an integer from 0 to 2 (step S321). Next, the controller 110 determines whether the generated random number is 0 (step S322). When the generated random number is 0 (step S322; Yes), the controller 110 performs an action of tilting the robot 200 to the left (step S323). Specifically, the controller 110 uses the vertical motor 222 to lower the head 204, and uses the twist motor 221 to rotate the head 204 to the right. As a result, the body of the robot 200 tilts diagonally to the left. Then, the controller 110 ends the loud sound response familiar action processing, and executes step S179 of the familiar action mode processing (FIG. 15).


When the generated random number is not 0 (step S322; No), the controller 110 determines whether the generated random number is 1 (step S324). When the generated random number is 1 (step S324; Yes), the controller 110 performs an action of tilting the robot 200 to the right (step S325). Specifically, the controller 110 uses the vertical motor 222 to lower the head 204, and uses the twist motor 221 to rotate the head 204 to the left. As a result, the body of the robot 200 tilts diagonally to the right. Then, the controller 110 ends the loud sound response familiar action processing, and executes step S179 of the familiar action mode processing (FIG. 15).


When the generated random number is not 1 (step S324; No), the controller 110 determines whether the generated random number is 2 (step S326). When the generated random number is 2 (step S326; Yes), the controller 110 causes the robot 200 to perform a swing action (step S327). Specifically, the controller 110 repeatedly performs the action of tilting to the left and the action of tilting to the right to give the impression that the robot 200 is swinging. Then, the controller 110 ends the loud sound response familiar action processing, and executes step S179 of the familiar action mode processing (FIG. 15).


Note that a configuration is possible in which, in step S321 described above, the controller 110 generates numbers in a regular order of, for example, 0, 1, 2, 0, 1, 2, and so on, instead of generating a random number.


As a result of this loud sound response familiar action processing, when the user makes a loud sound, the body of the robot 200 moves so as to tilt and swing and, as such, the user can be given the impression of playing with the robot 200 by making sounds.
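The selection among the three reactions, together with the regular-order variant mentioned for step S321, can be sketched as follows. The names ACTIONS, loud_sound_familiar_action, and cyclic_selector are hypothetical, the action labels stand in for the motor commands, and the use of Python's random module is an assumption about how the random number might be generated.

```python
import itertools
import random
from typing import Iterator, Optional

# Mapping from the generated number to the action described in the text.
ACTIONS = {0: "tilt left", 1: "tilt right", 2: "swing"}


def loud_sound_familiar_action(rng: Optional[random.Random] = None) -> str:
    """Step S321 onward: draw an integer from 0 to 2 and pick the matching action."""
    if rng is None:
        rng = random.Random()
    number = rng.randint(0, 2)   # randint includes both end points
    return ACTIONS[number]


def cyclic_selector() -> Iterator[str]:
    """Variant: generate 0, 1, 2, 0, 1, 2, ... instead of a random number."""
    for number in itertools.cycle((0, 1, 2)):
        yield ACTIONS[number]
```

Passing a seeded random.Random instance makes the random variant reproducible in tests, while cyclic_selector gives the deterministic ordering.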


Note that the various actions described above of the familiar action mode are merely examples, and the action content may be changed on the basis of the emotion data 121 at each point in time and/or the emotion change data 122.


Additionally, in the action mode setting processing (FIG. 13) described above, the controller 110 sets the action mode to the familiar action mode only when a determination is made that the intimacy with the user applying the external stimulus is high. However, the present disclosure need not be limited to this setting method. For example, a configuration is possible in which, when a determination is made that the robot 200 is spoken to, the controller 110 sets the action mode to the familiar action mode in accordance with a number of days from the pseudo-birth of the robot 200 (for example, the birthday of the robot 200), regardless of the similarity with the history. Additionally, a configuration is possible in which the controller 110 occasionally (for example, about one time per day) sets the action mode to the familiar action mode by a random number or the like, regardless of the similarity with the history. The familiar action amount of time in such cases can also be set as desired, and a configuration is possible in which the controller 110 sets the familiar action amount of time to, for example, the comparatively short third familiar action amount of time (for example, three minutes).


As a result of the action control processing described above, the controller 110 acquires the external stimulus acting on the robot 200 (the control target device); sets, on the basis of the likelihood (owner, person that always cares for the robot 200, person that cares little for the robot 200, and the like) corresponding to the level of intimacy between the robot 200 and the subject (the user or the like) applying the external stimulus, the action mode in which the action content that the robot 200 performs to the subject is defined; and controls the action of the robot 200 on the basis of the external stimulus and the action mode. As such, in accordance with the similarity between a characteristic quantity of the manner of speaking and an external stimulus characteristic quantity stored in the storage 120 as the history, the robot 200 can determine that the person applying the external stimulus is “probably a person that always cares for the robot 200” and perform an action in the familiar action mode, and the controller 110 can cause an action to be performed that takes the relationship between the robot 200 and the subject into consideration.


Modified Examples

The present disclosure is not limited to the embodiment described above, and various modifications and uses are possible. For example, a configuration is possible in which, as when in the normal action mode, the action content when in the familiar action mode changes in accordance with the growth value and/or the personality.


The actions of the robot 200 are not limited to the actions by the driver 220 and the outputting of sounds from the sound outputter 230. A configuration is possible in which, in cases in which the robot 200 includes other controlled components (for example, an LED, a display, or the like), as the action of the robot 200, the controller 110 controls a color and/or a brightness of an LED that is turned ON. It is sufficient that the controlled components to be controlled by the controller 110 include at least one of the driver 220 and the sound outputter 230.


The configuration of the emotion map 300, and the setting methods of the emotion data 121, the emotion change data 122, the personality data, the growth value, and the like in the embodiment described above are merely examples. For example, a configuration is possible in which a numerical value (when exceeding 10, always set to 10) obtained by dividing the growth days count data 125 by a certain number is set as the growth value.
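The alternative growth value described above amounts to a division capped at 10. A minimal sketch follows, assuming that the growth days count data 125 is an integer number of days, that the "certain number" is a positive divisor chosen by the designer, and that the result is not rounded (the text does not say whether it is); the function name growth_value is hypothetical.

```python
def growth_value(growth_days_count: int, divisor: float, cap: float = 10.0) -> float:
    """Divide the growth days count by a fixed number and cap the result at 10."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return min(cap, growth_days_count / divisor)
```

With a divisor of 3, for instance, day 15 gives a growth value of 5.0, and any day count of 30 or more gives the capped value of 10.0.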


In the embodiment described above, the action control device 100 for controlling the robot 200 is built into the robot 200, but the action control device 100 for controlling the robot 200 need not necessarily be built into the robot 200. For example, a configuration is possible in which the action control device 100 is configured as a device separate from the robot 200, and the robot 200 includes a controller 250 and a communicator 260 separate from the controller 110 and the communicator 130 of the action control device 100. In such a case, the communicator 260 and the communicator 130 are configured so as to send and receive data to and from each other, and the controller 110 acquires the external stimulus detected by the sensor 210, controls the driver 220 and the sound outputter 230, and the like via the communicator 130 and the communicator 260.


In the embodiments described above, a description is given in which the action programs executed by the CPU of the controller 110 are stored in advance in the ROM or the like of the storage 120. However, the present disclosure is not limited thereto, and a configuration is possible in which the action programs for executing the various processings described above are installed on an existing general-purpose computer or the like, thereby causing that computer to function as a device corresponding to the action control device 100 according to the embodiments described above.


Any method can be used to provide such programs. For example, the programs may be stored and distributed on a non-transitory computer-readable recording medium (flexible disc, Compact Disc (CD)-ROM, Digital Versatile Disc (DVD)-ROM, Magneto Optical (MO) disc, memory card, USB memory, or the like), or may be provided by storing the programs in a storage on a network such as the internet, and causing these programs to be downloaded.


Additionally, in cases in which the processings described above are realized by being divided between an operating system (OS) and an application/program, or are realized by cooperation between an OS and an application/program, it is possible to store only the portion of the application/program on the non-transitory recording medium or in the storage. Additionally, the programs can be piggybacked on carrier waves and distributed via a network. For example, the programs may be posted to a bulletin board system (BBS) on a network, and distributed via the network. Moreover, a configuration is possible in which the processings described above are executed by starting these programs and, under the control of the operating system (OS), executing the programs in the same manner as other applications/programs.


Additionally, a configuration is possible in which the controller 110 is constituted by a desired processor unit such as a single processor, a multiprocessor, a multi-core processor, or the like, or by combining these desired processors with processing circuitry such as an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or the like.


The foregoing describes some example embodiments for explanatory purposes. Although the foregoing discussion has presented specific embodiments, persons skilled in the art will recognize that changes may be made in form and detail without departing from the broader spirit and scope of the invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. This detailed description, therefore, is not to be taken in a limiting sense, and the scope of the invention is defined only by the included claims, along with the full range of equivalents to which such claims are entitled.

Claims
  • 1. An action control device that controls an action of a control target device, the action control device comprising: a controller that acquires an external stimulus, and in a case where the controller executes an action corresponding to the external stimulus, controls so as to execute, based on an intimacy between a subject applying the external stimulus and the control target device, different action content.
  • 2. The action control device according to claim 1, wherein the controller controls so as to determine, based on a result of comparing the acquired external stimulus and the external stimulus acquired previously, the intimacy between the control target device and the subject applying the external stimulus.
  • 3. The action control device according to claim 2, wherein the controller acquires a sound as the external stimulus, and controls so as to determine, based on a level of similarity between a characteristic quantity of a predetermined parameter included in the acquired sound and a characteristic quantity of the predetermined parameter included in sound acquired previously, the intimacy with the subject applying the external stimulus.
  • 4. The action control device according to claim 3, wherein the controller sets, based on the level of the similarity, an action mode in which the action content is defined, and the action mode includes a normal action mode in which action content executed regardless of the intimacy with the subject is defined, and an intimate action mode in which action content executed based on the intimacy with the subject is defined.
  • 5. The action control device according to claim 4, wherein the controller controls so that, after the action mode is set to the intimate action mode, when a determination is made that a predetermined amount of time set in accordance with the level of similarity has elapsed, the action mode is transitioned to the normal action mode.
  • 6. The action control device according to claim 5, wherein the predetermined amount of time differs in accordance with the determined intimacy.
  • 7. The action control device according to claim 4, wherein the controller controls so as to continue the intimate action mode when, for an external stimulus acquired when set to the intimate action mode, the similarity determined from the external stimulus is a predetermined level.
  • 8. An action control method comprising: acquiring, by a controller, an external stimulus; and in a case of executing an action corresponding to the external stimulus, controlling so as to execute, by the controller and based on an intimacy between a subject applying the external stimulus and the control target device, different action content.
  • 9. The action control method according to claim 8, wherein the controller controls so as to determine, based on a result of comparing the acquired external stimulus and the external stimulus acquired previously, the intimacy between the control target device and the subject applying the external stimulus.
  • 10. The action control method according to claim 9, wherein the controller acquires a sound as the external stimulus, and controls so as to determine, based on a level of similarity between a characteristic quantity of a predetermined parameter included in the acquired sound and a characteristic quantity of the predetermined parameter included in sound acquired previously, the intimacy with the subject applying the external stimulus.
  • 11. The action control method according to claim 10, wherein the controller sets, based on the level of the similarity, an action mode in which action content is defined, and the action mode includes a normal action mode in which action content executed regardless of the intimacy with the subject is defined, and an intimate action mode in which action content executed based on the intimacy with the subject is defined.
  • 12. A non-transitory computer-readable recording medium storing a program, the program causing a computer to: acquire an external stimulus; and in a case of executing an action corresponding to the external stimulus, control so as to execute, based on an intimacy between a subject applying the external stimulus and the control target device, different action content.
  • 13. The non-transitory computer-readable recording medium according to claim 12, wherein the program causes the computer to control so as to determine, based on a result of comparing the acquired external stimulus and the external stimulus acquired previously, the intimacy between the control target device and the subject applying the external stimulus.
  • 14. The non-transitory computer-readable recording medium according to claim 13, wherein the program causes the computer to: acquire a sound as the external stimulus, and control so as to determine, based on a level of similarity between a characteristic quantity of a predetermined parameter included in the acquired sound and a characteristic quantity of the predetermined parameter included in sound acquired previously, the intimacy with the subject applying the external stimulus.
  • 15. The non-transitory computer-readable recording medium according to claim 14, wherein the program causes the computer to set, based on the level of the similarity, an action mode for which the action content is defined, and the action mode includes a normal action mode in which action content executed regardless of the intimacy with the subject is defined, and an intimate action mode in which action content executed based on the intimacy with the subject is defined.
Priority Claims (1)
Number Date Country Kind
2022-187973 Nov 2022 JP national