The present invention relates to a technique for inferring thoughts such as desires of people with speech difficulties.
Electrical appliances such as air conditioners, televisions, lighting equipment, and microwave ovens controlled with a user's voice have attained widespread use. Electrical appliances that can respond to only very limited commands were already in widespread use at the end of the 20th century. Recent years have seen the use of electrical appliances controllable with a user's voice more humanly or interactively using technologies of artificial intelligence (AI), Internet of things (IoT), and information and communications technology (ICT). Electrical appliances controllable with a voice using the technologies are called "smart home appliances", "IoT home appliances", "AI home appliances", or the like.
Further, smart speakers use the technologies of AI, IoT, and ICT to search for information and present the information found by the search or to control electrical appliances based on a user's voice. Such functions are available also on mobile devices such as smartphones, tablet computers, and smart watches.
Hereinafter, such smart home appliances, smart speakers, and mobile devices are referred to as “ICT devices”.
As described above, a user can use an ICT device more humanly or interactively. Accordingly, also for people with physical disabilities, the ICT device is easier to use than conventional devices, so that the quality of life (QOL) of people with physical disabilities can be improved.
However, it is still difficult for the ICT device to directly improve the QOL of people with speech difficulties, e.g., people with severe motor and intellectual disabilities or dementia patients. This is because the speech recognition function employed in the ICT device still has difficulty recognizing the voice of people with speech difficulties. In view of this, the ICT device has been used through a caregiver such as a family member or a healthcare worker.
Patent Literature 1: Japanese Patent No. 6315744
In the meantime, a conversation assistance terminal described in Patent Literature 1 includes an input device and a display device. In a case where an image displayed in the display device is selected, the conversation assistance terminal outputs, by voice, a message corresponding to the selected image. A user of the conversation assistance terminal can select a displayed image to cause the terminal to utter a message on behalf of the user. This can cause an ICT device to hear the voice.
However, in a conventional device such as the conversation assistance terminal, it is necessary to select a displayed image; therefore, such a conventional device is sometimes difficult for a person with severe motor and intellectual disabilities or a dementia patient to use without assistance of a caregiver. In other words, such a conventional device places a burden on the caregiver.
The present invention has been achieved in light of such a problem, and therefore, an object of the present invention is to reduce the burden on the caregiver in a case where a person with severe motor and intellectual disabilities or a dementia patient uses the ICT device.
A thought inference system according to an aspect of the present invention includes a data set acquisition module configured to acquire a plurality of data sets, each of the data sets indicating a first condition for a case where a first physical reaction is seen in a person with speech difficulties and a first thought of the person with speech difficulties for a case where the first physical reaction is seen; an inference model generation module configured to generate, for each of a plurality of combinations of a time frame and a location, an inference model by machine learning in which the first condition indicated in the data set acquired in a time frame and a location of the subject combination is used as an explanatory variable and the first thought indicated in the data set is used as an objective variable; an inference module configured to infer a second thought for a case where a second reaction is seen in the person with speech difficulties by inputting input data indicating a second condition for a case where the second reaction is seen to the inference model that is generated, among the plurality of combinations, for a combination of a time frame and a location in which the second reaction is seen; and an output module configured to output the second thought.
The present invention makes it possible to reduce the burden on the caregiver in a case where a person with severe motor and intellectual disabilities or a dementia patient uses the ICT device.
The thought inference system 1 illustrated in
The thought inference system 1 includes a care support server 2, a plurality of caregiver terminals 3, a plurality of video cameras 41, a plurality of indoor measuring instruments 42, a plurality of smart speakers 43, a plurality of biometric devices 44, and a communication line 5.
The care support server 2 and the individual caregiver terminals 3 can exchange data via the communication line 5. Examples of the communication line 5 include the Internet and a public line.
The care support server 2 generates an inference model by machine learning and infers a thought of the care recipient 81 based on the inference model. Examples of the care support server 2 include a personal computer, a workstation, and a cloud server. The following describes an example in which the care support server 2 is a personal computer.
As illustrated in
The ROM 22 or the auxiliary storage device 23 stores, therein, an operating system and computer programs such as an inference service program 2P. The inference service program 2P is a program for implementing the functions of a learning module 201, an inference model storage module 202, an inference module 203, and so on, which are illustrated in
The RAM 21 is a main memory of the care support server 2. Computer programs such as the inference service program 2P are appropriately loaded into the RAM 21.
The processor 20 executes the computer programs loaded into the RAM 21. Examples of the processor 20 include a graphics processing unit (GPU) and a central processing unit (CPU).
The network adapter 24 performs communication with a device, e.g., the caregiver terminal 3 or a web server, using protocols such as transmission control protocol/internet protocol (TCP/IP). As the network adapter 24, a network interface card (NIC) or a communication device for Wi-Fi is used.
The keyboard 25 and the pointing device 26 are input devices for an operator to input commands or data.
The display 27 serves to display a screen with which to input commands or data, a screen showing the result of calculations by the processor 20, or the like.
Referring back to
As illustrated in
The ROM 32 or the flash memory 33 stores, therein, an operating system and computer programs such as a client program 3P. The client program 3P is a program for implementing the functions of a data providing module 301, a client module 302, and so on illustrated in
The RAM 31 is a main memory of the caregiver terminal 3. Computer programs such as the client program 3P are appropriately loaded into the RAM 31. The processor 30 executes the computer programs loaded into the RAM 31.
The touch panel display 34 includes a touch panel for the caregiver 82 to input commands or data and a display on which to display a screen.
The network adapter 35 performs communication with another device, e.g., the care support server 2, using protocols such as TCP/IP. Examples of the network adapter 35 include a communication device for Wi-Fi.
The short-range wireless board 36 performs communication with a device within a few meters from the caregiver terminal 3. In particular, in the present embodiment, the short-range wireless board 36 performs communication with the video camera 41, the indoor measuring instrument 42, and the biometric device 44. Examples of the short-range wireless board 36 include a communication device for Bluetooth.
The video camera 37 captures a moving image to generate moving image data. In particular, in the present embodiment, the video camera 37 is used to generate moving image data on the care recipient 81.
The wired input/output board 38 is a device performing communication with peripheral devices by wire. The wired input/output board 38 has a connection port to which the peripheral devices are connected directly or via a cable. Examples of the wired input/output board 38 include an input/output device of a universal serial bus (USB) standard.
The speech processing module 39 includes a speech processing device, a microphone, and a speaker. A sound captured by the microphone is encoded into audio data by the speech processing device, and audio data sent from another device is decoded to a sound by the speech processing device and the sound is reproduced by the speaker.
Referring back to
In order to reduce a burden on the caregiver 82, it is desirable that one video camera 41 is given to each of the care recipients 81 and the video camera 41 is placed in advance so that the care recipient 81 is within an image-capturing range. It is of course possible that the video camera 37 of the caregiver terminal 3 is used to capture images and the microphone of the speech processing module 39 is used to pick up sounds. For example, in a case where the care recipient 81 and the caregiver 82 are out of the house, the caregiver terminal 3 may be used, instead of the video camera 41, to capture an image and pick up sounds. The following description takes an example in which the video camera 41 is used for image capturing and sound pickup.
The indoor measuring instrument 42 measures the state of a room where the care recipient 81 is present. Specifically, the indoor measuring instrument 42 measures the temperature, humidity, air pressure, illuminance, amount of ultraviolet (UV) radiation, pollen level, and the like inside the room. Examples of the indoor measuring instrument 42 include a commercially available small measuring instrument for environmental measurement having a Bluetooth, Wi-Fi, or USB communication function. A device in which a plurality of commercially available sensors is combined may be used as the indoor measuring instrument 42, for example, a device in which various sensors such as a temperature/humidity sensor, an air pressure sensor, an illuminance sensor, and a UV sensor provided by ALPS ALPINE CO., LTD. are combined with one another. Incidentally, the pollen level is the amount of pollen per unit volume.
The indoor measuring instrument 42 is installed for each room where the care recipient 81 is present. In a case where a plurality of care recipients 81 are present in one room, only one indoor measuring instrument 42 may be installed in the room and shared among the care recipients 81, or, alternatively, one indoor measuring instrument 42 may be installed near each of the care recipients 81. Further, when going out, the care recipient 81 or the caregiver 82 may carry the indoor measuring instrument 42.
The smart speaker 43 recognizes a sound to perform tasks based on the recognition result, for example, to search for information or control electrical appliances. In the present embodiment, particularly, the smart speaker 43 performs the tasks for the care recipient 81. Examples of the smart speaker 43 include a commercially available smart speaker having a Bluetooth, Wi-Fi, or USB communication function.
The biometric device 44 measures biometric information such as heart rate, pulse wave, blood oxygen level, blood pressure, electrocardiographic potential, body temperature, myoelectric potential, and amount of sweating of the care recipient 81. Examples of the biometric device 44 include a device that has a Bluetooth, Wi-Fi, or USB communication function and a function to measure biometric information, for example, a general wearable terminal such as Apple Watch provided by Apple Inc., a wearable terminal with a health function, or a wearable terminal for medical use.
The functions illustrated in
An operator of the care support server 2 needs to collect a lot of data regarding the care recipient 81 in order to cause the care support server 2 to implement machine learning and generate an inference model. Accordingly, the data is collected with the cooperation of the caregiver 82, with the care recipient 81 regarded as a target for data sampling. Incidentally, each care recipient 81 (81a, 81b, . . . ) is given a unique user code in advance.
One inference model can be shared with a plurality of care recipients 81, but in order to generate a highly accurate inference model with less data, it is desirable to generate an inference model dedicated for each of the care recipients 81. The description goes on to a method for collecting data, taking an example of collecting data for generating an inference model dedicated for the care recipient 81a for a case where a caregiver 82a cares for the care recipient 81a.
The caregiver 82a launches the client program 3P on his/her caregiver terminal 3 and sets the mode to a data collection mode in advance. The data providing module 301 (see
Meanwhile, in general, in a case where a thought such as a desire or a feeling arises, a human sometimes expresses the thought in an action or facial expression. Such a phenomenon is sometimes called a “desire expressive reaction”. The following description takes an example in which the desire expressive reaction appears as a hand gesture.
When the care recipient 81a would like to express his/her thought such as a desire, the care recipient 81a moves his/her hand according to the thought. Then, the video camera 41 installed near the care recipient 81a captures the gesture of the care recipient 81a and also collects sounds around the care recipient 81a. In a case where the care recipient 81a speaks with the gesture, the collected sounds include the voice of the care recipient 81a. The video camera 41 then outputs moving image audio data 601 including a moving image (video) of the captured gesture and the collected sounds to the caregiver terminal 3 of the caregiver 82a. The format of the moving image audio data 601 is, for example, MP4 or MOV.
In a case where the caregiver terminal 3 is put in the data collection mode, in response to the moving image audio data 601 input from the video camera 41, the moving image data extraction module 30A, the environmental data acquisition module 30B, the biometric data acquisition module 30C, the thought data generation module 30D, and the raw data transmission module 30E perform processing as follows.
The moving image data extraction module 30A extracts, from the moving image audio data 601, data corresponding to the moving image as moving image data 611. In other words, the data corresponding to the sounds is removed.
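As an illustration only (the embodiment does not prescribe a particular tool), the extraction of the moving image data 611 can be performed by copying the video stream and dropping the audio track, for example with the ffmpeg command-line tool invoked from Python; the file names below are hypothetical.

```python
import subprocess

def extract_video_only(src_path: str, dst_path: str) -> None:
    """Copy the video stream of an MP4/MOV file and drop the audio track."""
    subprocess.run(
        [
            "ffmpeg",
            "-i", src_path,   # moving image audio data 601 (e.g., MP4 or MOV)
            "-an",            # discard the audio streams
            "-c:v", "copy",   # copy the video stream without re-encoding
            dst_path,         # moving image data 611
        ],
        check=True,
    )

# Example: extract_video_only("reaction_601.mp4", "reaction_611.mp4")
```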
The environmental data acquisition module 30B performs processing for acquiring data on the environment around the care recipient 81a as described below. The environmental data acquisition module 30B instructs the indoor measuring instrument 42 near the care recipient 81a to measure the current temperature and so on.
In response to the instructions, the indoor measuring instrument 42 measures the temperature, humidity, air pressure, illuminance, amount of UV radiation, pollen level, and so on, and transmits spatial condition data 612 indicating the measurement result to the caregiver terminal 3.
Further, the environmental data acquisition module 30B accesses an application program interface (API) of a website offering weather information, e.g., OpenWeatherMap API, via the Internet, and acquires weather data 613 indicating information regarding the current weather, cloud cover, wind direction, and wind speed in an area where the care recipient 81a is currently in, and information regarding the maximum temperature, minimum temperature, sunshine duration, and the like in the area of the day. The maximum temperature, the minimum temperature, and the sunshine duration may be actually observed or predicted. The sunshine duration may be calculated based on the sunrise time and the sunset time by the environmental data acquisition module 30B.
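By way of a non-limiting sketch, the weather data 613 can be assembled from the current-weather endpoint of the OpenWeatherMap API roughly as follows; the mapping of response fields to the items above is an assumption, and the sunshine duration is approximated from the sunrise and sunset times as mentioned above.

```python
import requests

OWM_URL = "https://api.openweathermap.org/data/2.5/weather"  # current-weather endpoint

def fetch_weather_data(lat: float, lon: float, api_key: str) -> dict:
    """Acquire weather data 613 for the area where the care recipient is currently in."""
    resp = requests.get(
        OWM_URL,
        params={"lat": lat, "lon": lon, "units": "metric", "appid": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    body = resp.json()
    return {
        "weather": body["weather"][0]["main"],      # e.g., "Clear", "Rain"
        "cloud_cover": body["clouds"]["all"],        # percentage
        "wind_direction": body["wind"].get("deg"),   # degrees (may be absent when calm)
        "wind_speed": body["wind"]["speed"],         # m/s
        "temp_max": body["main"]["temp_max"],        # maximum temperature (approx.)
        "temp_min": body["main"]["temp_min"],        # minimum temperature (approx.)
        # Sunshine duration approximated from the sunrise and sunset times.
        "sunshine_duration_s": body["sys"]["sunset"] - body["sys"]["sunrise"],
    }
```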
The environmental data acquisition module 30B also acquires current location data 614 indicating the current location of the care recipient 81a. Specifically, the latitude and longitude of the caregiver terminal 3 itself is indicated in the current location data 614 as the latitude and longitude of the current location of the care recipient 81a. The latitude and longitude can be acquired with a global positioning system (GPS) function of the caregiver terminal 3 itself.
The current location data 614 also indicates a location attribute of the care recipient 81a. The “location attribute” of the care recipient 81a is an attribute of a facility where the care recipient 81a is currently present. Examples of the attribute include “home”, “classroom”, “dining room”, “bathroom”, “workplace”, “convenience store”, “hospital” and “park”. The location attribute is preferably acquired in the following manner.
For example, for each facility, a table in which latitude and longitude of the facility are correlated with an attribute of the facility is prepared in the client program 3P. The environmental data acquisition module 30B checks the latitude and longitude acquired using the GPS function with the table, and acquires an attribute matching the latitude and longitude as the location attribute of the care recipient 81a.
Alternatively, for each facility, an oscillator for emitting a radio wave indicating a unique identifier is installed in advance and a table in which an identifier of the facility is correlated with an attribute of the facility is prepared in the client program 3P. The environmental data acquisition module 30B checks the identifier indicated in the radio wave received by the short-range wireless board 36 with the table, and acquires an attribute matching the identifier as the location attribute of the care recipient 81a. Examples of such an oscillator include an iBeacon oscillator. “iBeacon” is a registered trademark.
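The first of the two approaches above (checking the GPS latitude and longitude against a prepared table) can be sketched as follows; the facility coordinates, radii, and attribute names are hypothetical placeholders for the table prepared in the client program 3P.

```python
import math

# Hypothetical facility table prepared in the client program 3P:
# (latitude, longitude, radius in meters, location attribute)
FACILITY_TABLE = [
    (35.6812, 139.7671, 50.0, "home"),
    (35.6895, 139.6917, 80.0, "classroom"),
    (35.6586, 139.7454, 60.0, "hospital"),
]

def location_attribute(lat: float, lon: float, default: str = "unknown") -> str:
    """Return the attribute of the facility whose registered position matches
    the GPS fix, i.e., lies within the facility's radius."""
    for f_lat, f_lon, radius_m, attribute in FACILITY_TABLE:
        # Rough equirectangular distance; adequate for distances of tens of meters.
        d_lat = math.radians(lat - f_lat)
        d_lon = math.radians(lon - f_lon) * math.cos(math.radians(f_lat))
        dist_m = 6_371_000 * math.hypot(d_lat, d_lon)
        if dist_m <= radius_m:
            return attribute
    return default
```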
The environmental data acquisition module 30B also acquires time data 615 indicating the current time from a simple network time protocol (SNTP) server. The environmental data acquisition module 30B may acquire the time data 615 from an operating system (OS) or a clock of the caregiver terminal 3 itself. It can be said that the time is a time at which a desire expressive reaction has been observed.
The biometric data acquisition module 30C performs processing for acquiring the current biometric information of the care recipient 81a as follows. The biometric data acquisition module 30C instructs the biometric device 44 of the care recipient 81a to measure the current biometric information. In response to the instructions, the biometric device 44 of the care recipient 81a measures biometric information such as heart rate, pulse wave, blood oxygen level, blood pressure, body temperature, myoelectric potential, amount of sweating, and electrocardiographic potential, and then transmits biometric data 616 indicating the measurement result to the caregiver terminal 3.
The thought data generation module 30D generates thought data 617 indicating, as a correct thought, a thought that the care recipient 81a would like to express, for example, in the following manner.
The thought data generation module 30D displays the thought input screen 71 as illustrated in
The caregiver 82a determines, based on the motion of the care recipient 81a and the current context (environment such as a time, location, and weather), which thought option matches or is closest to a thought that the care recipient 81a would like to express. The caregiver 82a makes an annotation by scrolling through the option list 712 to touch and select, from the option list 712, a thought option that is determined to match or be closest to the thought that the care recipient 81a would like to express, and touches a complete button 713.
In response to the operation, the thought data generation module 30D generates thought data 617 indicating the selected thought option as a correct thought.
Note that, in a case where the caregiver 82a does not understand or is uncertain about the thought that the care recipient 81a would like to express, the caregiver 82a may make the selection after checking with the care recipient 81a. Even in a case where the caregiver 82a understands the thought that the care recipient 81a would like to express, the caregiver 82a may make the selection after checking with the care recipient 81a just in case. If there are no suitable thought options, then the caregiver 82a may add a thought that the care recipient 81a would like to express as a new thought option, and the thought data generation module 30D may generate thought data 617 indicating the new thought option as a correct thought.
Alternatively, in a case where the care recipient 81a makes a motion, the caregiver 82a may utter a thought option that the caregiver 82a determines to match or be closest to the thought that the care recipient 81a would like to express. In response to the utterance, the voice of the caregiver 82a is recorded in the moving image audio data 601. The thought data generation module 30D then extracts the voice of the caregiver 82a from the moving image audio data 601, and determines the uttered thought option with the speech recognition function. The thought data generation module 30D then generates thought data 617 indicating the determined thought option as a correct thought. Incidentally, in a case where the uttered content (thought) does not correspond to any of the existing thought options, the uttered thought may be added as a new thought option and thought data 617 indicating the added thought option as a correct thought may be generated. Alternatively, AI may classify the uttered thought as any of the existing thought options, and generate, as the thought data 617, data indicating the thought option into which the uttered thought has been classified.
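A minimal sketch of this voice-based annotation path, assuming the third-party SpeechRecognition package for transcription and a fuzzy string match against the existing thought options; the option list and function names are hypothetical, and extracting only the caregiver's voice from the recording is omitted here.

```python
import difflib
from typing import Optional

import speech_recognition as sr  # third-party SpeechRecognition package (assumption)

THOUGHT_OPTIONS = ["hot", "cold", "hungry", "thirsty", "want to go to the bathroom"]

def recognize_thought_option(wav_path: str) -> Optional[str]:
    """Transcribe the caregiver's utterance and map it to the closest
    existing thought option; return None if nothing matches."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    uttered = recognizer.recognize_google(audio)  # any speech-to-text engine would do
    matches = difflib.get_close_matches(uttered.lower(), THOUGHT_OPTIONS, n=1, cutoff=0.6)
    return matches[0] if matches else None        # None -> add as a new thought option
```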
The raw data transmission module 30E correlates the moving image data 611, the spatial condition data 612, the weather data 613, the current location data 614, the time data 615, the biometric data 616, and the thought data 617 with a user code of the care recipient 81a, and transmits, to the care support server 2, the resultant as one set of raw data 61 as illustrated in
In the care support server 2, the learning module 201 includes a motion pattern determination module 20A, a data set generation module 20B, a data set storage module 20C, a machine learning processing module 20D, and an accuracy test module 20E as illustrated in
In a case where the raw data 61 is transmitted from the caregiver terminal 3, the motion pattern determination module 20A determines which of predefined patterns the motion of the care recipient 81a seen in a moving image of the moving image data 611 of the raw data 61 corresponds to, for example, in the following manner.
The motion pattern determination module 20A extracts the position of a hand from each frame of the moving image in the moving image data 611 to identify a change in position of the hand, namely, a trajectory of the hand. The motion pattern determination module 20A then determines that, from among the plurality of patterns, a pattern most similar to the identified trajectory corresponds to a pattern of the motion of the care recipient 81a this time.
The trajectory can be identified using a known technique. For example, the trajectory may be identified using Kinovea that is motion analysis software.
Which pattern the identified trajectory corresponds to can also be determined by known techniques. For example, pattern matching may be used for the determination. Alternatively, AI may be used for the determination. Specifically, many trajectories are prepared in advance, and data sets are prepared by an operator labeling which pattern each trajectory corresponds to. Supervised learning is employed to generate a classifier for classifying trajectories. The motion pattern determination module 20A then uses the classifier to determine which pattern the trajectory identified from the moving image data 611 corresponds to.
The predefined patterns are, for example, as follows. “One direction” corresponds to a pattern of moving a hand linearly from a position to another position only once. “Circle” corresponds to a pattern of moving a hand in a circular motion. “Reciprocation” corresponds to a pattern of reciprocating a hand linearly between a position and another position only once. “Weak” corresponds to a pattern of moving a hand in a trembling manner. “Repetition” corresponds to a pattern of making an identical motion twice or more in succession. However, in order to distinguish “repetition” from “weak”, a condition applicable to “repetition” is so set that a hand moves by a certain distance or more (20 cm or more, for example) in one motion. “Random” corresponds to a pattern of moving a hand randomly, which does not correspond to any of the five patterns described above.
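The embodiment leaves the classification method open (pattern matching or a trained classifier); the following is a minimal heuristic sketch over a hand trajectory expressed in centimeters, in which all thresholds other than the 20 cm distance mentioned above are illustrative assumptions.

```python
import numpy as np

def classify_hand_trajectory(points_cm: np.ndarray) -> str:
    """Classify an (N, 2) hand trajectory in centimeters into one of the
    predefined motion patterns using simple geometric heuristics."""
    disp = points_cm[-1] - points_cm[0]              # net displacement
    steps = np.diff(points_cm, axis=0)
    path_len = np.linalg.norm(steps, axis=1).sum()   # total distance travelled
    net = np.linalg.norm(disp)
    span = np.linalg.norm(points_cm.max(0) - points_cm.min(0))

    if path_len < 5.0:                               # barely moves -> trembling
        return "weak"
    if net / max(path_len, 1e-6) > 0.9:              # nearly straight, one way
        return "one direction"
    if net < 0.15 * path_len and path_len <= 2.5 * span:
        # ends near the start after roughly one out-and-back or one loop
        return "reciprocation" if span > 0.45 * path_len else "circle"
    if path_len > 2.5 * span and span >= 20.0:       # same stroke repeated, >= 20 cm per motion
        return "repetition"
    return "random"
```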
The data set generation module 20B generates motion pattern data 618 indicating the determination result by the motion pattern determination module 20A, i.e., the pattern determined, replaces the moving image data 611 of the raw data 61 with the generated motion pattern data 618, and thereby generates the data set 62 as illustrated in
As described above, the motion of the care recipient 81a is triggered to perform the foregoing processing and operation to generate one set of the data set 62, and the data set 62 is stored into the data set storage module 20C. In the preparation phase, every time the care recipient 81a makes a motion, one set of data set 62 is generated by the foregoing processing and operation and is stored into the data set storage module 20C.
Incidentally, as with the case where the motion of the care recipient 81 is classified as any of the plurality of patterns, another piece of information may be classified as any of a plurality of classes and indicated in the data set 62.
For example, it is possible that the data set generation module 20B classifies the temperature indicated in the spatial condition data 612 as any of five categories in accordance with the category table 69A of
Alternatively, it is possible that the data set generation module 20B calculates a discomfort index based on the temperature and the humidity indicated in the spatial condition data 612 and classifies the discomfort index as any of five categories in accordance with the category table 69B of
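The embodiment does not specify the formula; one commonly used temperature-humidity (discomfort) index, together with a hypothetical five-category binning standing in for category table 69B, is sketched below.

```python
def discomfort_index(temp_c: float, humidity_pct: float) -> float:
    """Commonly used temperature-humidity (discomfort) index."""
    return 0.81 * temp_c + 0.01 * humidity_pct * (0.99 * temp_c - 14.3) + 46.3

# Hypothetical five-category binning standing in for category table 69B.
def discomfort_category(di: float) -> str:
    if di < 60:
        return "cold"
    if di < 70:
        return "comfortable"
    if di < 75:
        return "slightly hot"
    if di < 80:
        return "hot"
    return "very hot"
```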
Alternatively, it is possible that the data set generation module 20B classifies the time indicated in the time data 615, according to the lifestyle of the care recipient 81a, as any of time frames such as “sleep time frame”, “wake-up time frame”, “breakfast time frame”, “class time frame”, “lunch time frame”, “break time frame”, “free time frame”, “dinner time frame”, and “bedtime frame” based on a timetable prepared in advance. The time may be replaced with the classified time frame. Alternatively, the classified time frame may be added to the time data 615. For further accurate classification, a timetable may be prepared for each day of the week and the time may be classified based on the timetable according to the day of the week at which the time data 615 is acquired.
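A sketch of the time-frame classification under an assumed weekday timetable (the actual timetable is prepared per care recipient and, optionally, per day of the week):

```python
from datetime import time

# Hypothetical weekday timetable for one care recipient.
WEEKDAY_TIMETABLE = [
    (time(0, 0),  time(6, 30),  "sleep time frame"),
    (time(6, 30), time(7, 30),  "wake-up time frame"),
    (time(7, 30), time(8, 30),  "breakfast time frame"),
    (time(8, 30), time(12, 0),  "class time frame"),
    (time(12, 0), time(13, 0),  "lunch time frame"),
    (time(13, 0), time(15, 0),  "class time frame"),
    (time(15, 0), time(18, 0),  "free time frame"),
    (time(18, 0), time(19, 0),  "dinner time frame"),
    (time(19, 0), time(21, 0),  "free time frame"),
    (time(21, 0), time(23, 59, 59), "bedtime frame"),
]

def classify_time_frame(t: time, timetable=WEEKDAY_TIMETABLE) -> str:
    """Map the time indicated in the time data 615 to a lifestyle time frame."""
    for start, end, frame in timetable:
        if start <= t < end:
            return frame
    return "sleep time frame"   # late night, after the last boundary
```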
Yet alternatively, it is possible that the data set generation module 20B calculates a temperature difference between the maximum temperature and the minimum temperature indicated in the weather data 613 and adds the temperature difference to the weather data 613.
Further, in a case where the care recipient 81 (81b, . . . ) other than the care recipient 81a makes a motion, the foregoing processing and operation are also performed to generate each data set 62 for the care recipient 81b, . . . , and each data set 62 is correlated with the corresponding user code and the resultant is stored into the data set storage module 20C.
In a case where a certain number of data sets 62 (1,000 sets, for example) or more for a certain care recipient 81 is stored into the data set storage module 20C, the machine learning processing module 20D implements machine learning based on the data sets 62 to generate an inference model for the care recipient 81. The following describes an example in which an inference model for the care recipient 81a is generated.
The machine learning processing module 20D reads data sets 62 correlated with the user code of the care recipient 81a out of the data set storage module 20C. A predetermined ratio (70%, for example) of the data sets 62 is used as training data, and the remaining data sets 62 (30%, for example) are used as test data. Hereinafter, the data set 62 used as the training data is referred to as “training data 6A” and the data set 62 used as the test data is referred to as “test data 6B”.
The machine learning processing module 20D uses, as an objective variable (correct data, label), the thought data 617 of data constituting each set of the training data 6A and uses, as an explanatory variable (input data), the residual data, namely, the spatial condition data 612, the weather data 613, the current location data 614, the time data 615, the biometric data 616, and the motion pattern data 618 to generate the inference model 68 based on a known machine learning algorithm. For example, the machine learning processing module 20D uses a machine learning algorithm such as support vector machine (SVM), random forest, or deep learning to generate the inference model 68. Alternatively, both SVM and BORUTA may be used to generate the inference model 68.
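As one concrete, non-limiting realization using scikit-learn (an assumption; the embodiment only names SVM, random forest, deep learning, and BORUTA as options), the 70/30 split, the training on the explanatory variables, and the accuracy certification described below could look as follows; the column names are hypothetical stand-ins for the items of the data set 62.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Hypothetical column names; the actual data set 62 carries the items of the spatial
# condition, weather, current location, time, biometric, and motion pattern data.
CATEGORICAL = ["weather", "location_attribute", "time_frame", "motion_pattern"]
NUMERIC = ["temperature", "humidity", "illuminance", "heart_rate", "body_temperature"]

def train_and_certify(datasets: pd.DataFrame, threshold: float = 0.90):
    """Train an inference model 68 (here an SVM) and certify it only if the
    accuracy on the held-out test data 6B reaches the predetermined ratio."""
    X = datasets[CATEGORICAL + NUMERIC]          # explanatory variables
    y = datasets["thought"]                      # objective variable from thought data 617
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=0
    )

    preprocess = ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ("num", StandardScaler(), NUMERIC),
    ])
    model = make_pipeline(preprocess, SVC(kernel="rbf"))
    model.fit(X_train, y_train)

    accuracy = model.score(X_test, y_test)       # fraction of correctly inferred thoughts
    return (model, accuracy) if accuracy >= threshold else (None, accuracy)
```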
In response to the inference model 68 generated, the accuracy test module 20E uses the test data 6B to test the accuracy of the inference model 68. To be specific, as illustrated in
The accuracy test module 20E determines whether the inferred thought (thought option) matches the correct thought indicated in the thought data 617 (#803). Other test data 6B also undergoes the processing of steps #801 to #803.
If the ratio of the number of times the inferred thought matches the correct thought in step #803 to the number of times the processing of steps #801 to #803 has been performed is a predetermined ratio (for example, 90%) or more, then the accuracy test module 20E certifies that the inference model 68 is acceptable, and the inference model 68 is correlated with the user code of the care recipient 81a and the resultant is stored into the inference model storage module 202. Thereby, preparation for inferring the thought of the care recipient 81a is completed.
On the other hand, if the ratio is smaller than the predetermined ratio, then the accuracy test module 20E discards the inference model 68 and the collection of the data set 62 is continued. Alternatively, an administrator may adjust the target data by changing the value of the weight parameter of any of the items (for example, weather of the weather data 613, or location attribute of the current location data 614) constituting each of the target data (spatial condition data 612, weather data 613, current location data 614, time data 615, biometric data 616, and motion pattern data 618), or, alternatively, by deleting any of the items (for example, cloud cover of the weather data 613, or latitude and longitude of the current location data 614), and may conduct a test again.
An inference model 68 for the care recipient 81 (81b, . . . ) other than the care recipient 81a is generated in the similar method, and the inference model 68 is correlated with the corresponding user code and the resultant is stored in the inference model storage module 202.
In a case where an inference model 68 for a certain care recipient 81 is stored into the inference model storage module 202, the inference module 203 of the care support server 2 and the client module 302 of the caregiver terminal 3 are ready to infer a thought of the care recipient 81. The description goes on to an inference method by taking an example in which the video camera 41 captures an image of the care recipient 81a to infer a thought of the care recipient 81a in a case where the caregiver 82a cares for the care recipient 81a.
The caregiver 82a launches the client program 3P on his/her caregiver terminal 3 and sets the mode to an inference mode in advance. The client module 302 is then activated. As illustrated in
When the care recipient 81a makes a gesture in order to express his/her thought to the caregiver 82a, the video camera 41 installed near the care recipient 81a captures the gesture of the care recipient 81a and also collects sounds around the care recipient 81a. Thereby, the moving image audio data 601 is generated. Then, the moving image audio data 601 is output from the video camera 41 to be input to the caregiver terminal 3 of the caregiver 82a.
In a case where the caregiver terminal 3 is put in the inference mode, in response to the moving image audio data 601 input from the video camera 41, the moving image data extraction module 30F, the environmental data acquisition module 30G, the biometric data acquisition module 30H, and the raw data transmission module 30J perform processing as follows.
The moving image data extraction module 30F extracts, from the moving image audio data 601 input this time, data corresponding to the moving image as moving image data 631. In other words, as with the moving image data extraction module 30A of the data providing module 301, the moving image data extraction module 30F cuts data corresponding to the sounds.
The environmental data acquisition module 30G acquires the following data in a manner similar to that of the processing of the environmental data acquisition module 30B. The environmental data acquisition module 30G instructs the indoor measuring instrument 42 near the care recipient 81a to measure the current temperature and so on, so that spatial condition data 632 indicating the current temperature, humidity, air pressure, illuminance, amount of UV radiation, and pollen level is acquired. The environmental data acquisition module 30G acquires, from a website, weather data 633 indicating information regarding the current weather, cloud cover, wind direction, and wind speed in an area where the care recipient 81a is currently in, and information regarding the maximum temperature, minimum temperature, and the like in the area of the day. The environmental data acquisition module 30G acquires current location data 634 indicating the latitude and longitude of the current location of the care recipient 81a and the location attribute by using the GPS function or iBeacon. The environmental data acquisition module 30G acquires time data 635 indicating the current time from the SNTP server or the like.
The biometric data acquisition module 30H acquires, from the biometric device 44 of the care recipient 81a, biometric data 636 indicating biometric information such as the current heart rate, pulse wave, blood oxygen level, blood pressure, electrocardiographic potential, body temperature, myoelectric potential, and amount of sweating in a manner similar to that of the processing of the biometric data acquisition module 30C.
The raw data transmission module 30J correlates the moving image data 631, the spatial condition data 632, the weather data 633, the current location data 634, the time data 635, and the biometric data 636 with the user code of the care recipient 81a to transmit the resultant as one set of raw data 63 illustrated in
In the care support server 2, the inference module 203 includes a motion pattern determination module 20F, a target data generation module 20G, an inference processing module 20H, and an inference result replying module 20J as illustrated in
In a case where the raw data 63 is transmitted from the caregiver terminal 3, the motion pattern determination module 20F determines which of the predefined patterns the current motion of the care recipient 81a seen in a moving image of the moving image data 631 of the raw data 63 corresponds to. The determination method is similar to the determination method of the motion pattern determination module 20A.
The target data generation module 20G generates motion pattern data 637 indicating the determination result by the motion pattern determination module 20F, i.e., the pattern determined, replaces the moving image data 631 of the raw data 63 with the generated motion pattern data 637, and thereby generates target data 64.
The inference processing module 20H infers, as illustrated in
The inference result replying module 20J transmits inference result data 65 indicating the thought of the care recipient 81a inferred by the inference processing module 20H to the caregiver terminal 3 of the caregiver 82a.
In the caregiver terminal 3, in response to the inference result data 65 received, the inference result output module 30K (see
Further, it is possible to cause the smart speaker 43 to hear the voice reproduced by the speech processing module 39 to deal with a desire of the care recipient 81a. For example, in a case where the voice “hot” is input, the smart speaker 43 turns on an air conditioner of a room where the care recipient 81a is present, or lowers the set temperature of the air conditioner. Further, in a case where the voice “Where is the teacher?” is input, the smart speaker 43 generates an email with content of “Where is the teacher?” and sends the email to a school teacher of the care recipient 81a.
A thought of the care recipient 81 (81b, . . . ) other than the care recipient 81a is also inferred based on the corresponding raw data 63 and inference model 68 in the similar method, and the inference result is sent to the caregiver terminal 3 of the caregiver 82 who cares for the corresponding care recipient 81 (81b, . . . ).
The description goes on to the flow of the entire processing in the care support server 2 with reference to the flowchart.
The care support server 2 executes the processing based on the inference service program 2P in the steps depicted in the flowchart of
In the data set preparation phase, the care support server 2 receives the raw data 61 (see
The care support server 2 then generates the motion pattern data 618 indicating the identified pattern (#814), replaces the moving image data 611 with the motion pattern data 618 to generate the data set 62 (see
Every time the raw data 61 is received, the care support server 2 executes the processing of steps #812 to #816.
In the machine learning phase, when the number of data sets 62 correlated with an identical user code reaches a predetermined number of sets, the care support server 2 generates an inference model 68 by implementing machine learning based on these data sets 62 (#821). In a case where a test confirms that the inference model 68 has a certain degree of accuracy, the inference model 68 is correlated with the user code and stored (#822).
In the inference phase, when the raw data 63 (see
The care support server 2 generates the motion pattern data 637 indicating the identified pattern (#834), replaces the moving image data 631 with the motion pattern data 637 to generate the target data 64 (#835).
The care support server 2 then inputs the target data 64 to an inference model 68 correlated with the user code to infer the thought of the care recipient 81 (#836), and transmits the inference result data 65 indicating the inference result to the caregiver terminal 3 (#837). In response to the operation, the inference result is output in the form of a character string or a voice in the caregiver terminal 3.
Every time the raw data 63 is received, the care support server 2 executes the processing of steps #832 to #837.
According to the present embodiment, in a case where the care recipient 81 has a thought such as a desire or something that the care recipient 81 would like to express, the thought can be inferred and output even when the caregiver 82 is not present. This makes it possible for the care recipient 81 to use an ICT device while the burden on the caregiver 82 is reduced as compared with the conventional cases. Even in a case where the care recipient 81 is a person with severe motor and intellectual disabilities, a dementia patient, or the like, the ICT device can be used more easily than conventionally possible.
In the present embodiment, the care support server 2 implements machine learning by using the data set 62, specifically, the data on the items illustrated in
For example, the care support server 2 may implement machine learning and inference by adding data indicating 3-axis geomagnetism and 3-axis acceleration of the current location of the care recipient 81 to the data set 62 and the target data 64, respectively.
In the present embodiment, the data set 62 and the target data 64 include, as data indicating a desire expressive reaction of the care recipient 81, the motion pattern data 618 and the motion pattern data 637, respectively. Specifically, data indicating a pattern of a hand gesture of the care recipient 81 is included in the data set 62 and the target data 64, and the care support server 2 implements machine learning and inference based on the pattern. Instead of this, however, the care support server 2 may implement machine learning and inference based on another desire expressive reaction. For example, the care support server 2 may implement machine learning and inference based on a pattern of eye movements, a pattern of eyelid movements (open/close), a pattern of change in voice pitch, presence/absence of uttering, a pattern of facial expressions, or the like of the care recipient 81. Alternatively, the care support server 2 may implement machine learning and inference based on a plurality of desire expressive reactions (a pattern of hand gestures and a pattern of facial expressions) arbitrarily selected from among the desire expressive reactions described above. The same applies to a modification described later with reference to
Incidentally, the eye movement, the eyelid movement, and the facial expression are preferably extracted from a moving image captured by the video camera 41 or the video camera 37 of the caregiver terminal 3. Alternatively, the care recipient 81 may wear a glasses-type wearable terminal to capture the eye movement and the eyelid movement with the wearable terminal. The voice pitch and the presence/absence of uttering are preferably identified based on audio data included in the moving image audio data 601.
In the present embodiment, the care support server 2 identifies the trajectory of a hand of the care recipient 81 based on a moving image. Instead of this, however, the care support server 2 may identify the trajectory based on data on a posture, angular velocity, or angular acceleration acquired by a gyro sensor of the biometric device 44 attached to a wrist of the care recipient 81. The same applies to the modification described later with reference to
The care support server 2 may include, in the data set 62 (see
In the present embodiment, the care support server 2 generates one inference model 68 for one care recipient 81. Instead of this, however, the care support server 2 may generate one inference model 68 for one care recipient 81 for each time frame and for each location. The following description takes an example in which an inference model 68 for the care recipient 81a is generated.
The care support server 2 has installed, thereon, an inference service program 2Q instead of the inference service program 2P. The inference service program 2Q enables implementing the functions of a learning module 211, an inference model storage module 212, and an inference module 213 illustrated in
The learning module 211 includes a motion pattern determination module 21A, a data set generation module 21B, a data set storage module 21C, a machine learning processing module 21D, and an accuracy test module 21E, and executes processing for generating an inference model.
The inference module 213 includes a motion pattern determination module 21F, a target data generation module 21G, an inference processing module 21H, and an inference result replying module 21J, and executes processing for inferring a thought.
The caregiver terminal 3 has installed, thereon, a client program 3Q instead of the client program 3P. The client program 3Q enables implementing the functions of a data providing module 311, a client module 312, and so on illustrated in
The data providing module 311 includes a desire expression detection module 31A, an environmental data acquisition module 31B, a biometric data acquisition module 31C, a thought data generation module 31D, and a raw data transmission module 31E, and provides the care support server 2 with data for learning.
The client module 312 includes a desire expression detection module 31F, an environmental data acquisition module 31G, a biometric data acquisition module 31H, a raw data transmission module 31J, and an inference result output module 31K, and causes the care support server 2 to implement inference.
When the care recipient 81a moves his/her hand in order to express his/her thought such as a desire, the video camera 41 near the care recipient 81a captures the motion of the care recipient 81a and collects sounds, and the moving image audio data 601 is output to the caregiver terminal 3 of the caregiver 82a.
In the caregiver terminal 3 placed in the data collection mode, in a case where the moving image audio data 601 is input from the video camera 41, the desire expression detection module 31A determines whether a desire expressive reaction is seen based on the moving image audio data 601, for example, in the following manner.
If the hand gesture indicated in the moving image audio data 601 is a predetermined gesture (for example, a motion of reciprocating the hand between two points in succession), then it is determined that a desire expressive reaction is seen. Alternatively, if the voice indicated in the moving image audio data 601 continues for a predetermined time (two seconds, for example) or more, it may be determined that a desire expressive reaction is seen.
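A minimal sketch of this detection, assuming a per-frame hand position (in centimeters) and per-frame voiced/unvoiced flags obtained from the moving image audio data 601; apart from the two-second voice duration, the thresholds are illustrative assumptions.

```python
import numpy as np

def desire_expression_seen(hand_xy_cm: np.ndarray,
                           voiced_flags: np.ndarray,
                           frame_rate: float = 30.0,
                           min_voice_s: float = 2.0) -> bool:
    """Return True if the predetermined gesture (repeated reciprocation of the
    hand) or a sufficiently long utterance is detected."""
    # Gesture check: count direction reversals along the dominant movement axis.
    steps = np.diff(hand_xy_cm, axis=0)
    axis = int(np.argmax(np.abs(steps).sum(axis=0)))   # axis with the larger travel
    direction = np.sign(steps[:, axis])
    direction = direction[direction != 0]
    reversals = int(np.count_nonzero(np.diff(direction) != 0))
    if reversals >= 3:                                 # out-back-out-back = repeated reciprocation
        return True

    # Voice check: longest run of voiced frames, converted to seconds.
    longest, run = 0, 0
    for voiced in voiced_flags:
        run = run + 1 if voiced else 0
        longest = max(longest, run)
    return longest / frame_rate >= min_voice_s
```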
In a case where the desire expression detection module 31A determines that a desire expressive reaction is seen, the environmental data acquisition module 31B and the biometric data acquisition module 31C perform the following processing.
The environmental data acquisition module 31B acquires data regarding the environment around the care recipient 81a, namely, the spatial condition data 612, the weather data 613, the current location data 614, and the time data 615. The acquisition method is similar to the method for acquiring these sets of data by the environmental data acquisition module 30B (see
The biometric data acquisition module 31C acquires data on the current biometric information of the care recipient 81a, namely, the biometric data 616. The acquisition method is similar to the method for acquiring the data by the biometric data acquisition module 30C.
The desire expression detection module 31A extracts the moving image data 611 from the moving image audio data 601, as with the moving image data extraction module 30A.
The thought data generation module 31D has the event table 605 for the care recipient 81a. As illustrated in
The thought data generation module 31D generates thought data 617 indicating, as a correct thought, a thought that the care recipient 81a would like to express based on the event table 605 and so on, for example, in the following manner.
The thought data generation module 31D determines, based on the event table 605, an event that corresponds to the current location indicated in the current location data 614 and the time indicated in the time data 615. For example, in a case where the current location data 614 indicates “classroom” and the time data 615 indicates “11:50”, the thought data generation module 31D determines that the event is “class”.
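A sketch of this lookup with a hypothetical event table 605; apart from the "classroom"/"class" example used in this description, the entries, event codes, and time ranges are illustrative assumptions.

```python
from datetime import time

# Hypothetical event table 605 entries:
# (location attribute, start, end, event code, event)
EVENT_TABLE = [
    ("classroom",   time(8, 40),  time(11, 55), "E01", "class"),
    ("classroom",   time(11, 55), time(12, 40), "E02", "break"),
    ("dining room", time(12, 40), time(13, 30), "E03", "lunch"),
    ("home",        time(18, 0),  time(19, 0),  "E04", "dinner"),
]

def determine_event(location_attribute: str, now: time):
    """Determine the current event from the current location data 614 and the
    time data 615, as the thought data generation module 31D does."""
    for loc, start, end, code, event in EVENT_TABLE:
        if location_attribute == loc and start <= now < end:
            return code, event
    return None, "unscheduled"

# Example: determine_event("classroom", time(11, 50)) -> ("E01", "class")
```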
In response to the event determined, the thought data generation module 31D displays the thought input screen 71 (see
The caregiver 82a determines, based on the motion, the state, and so on of the care recipient 81a, which thought option matches or is closest to a thought that the care recipient 81a would like to express. The caregiver 82a then touches and selects, from the option list 712, a thought option that is determined to match or be closest to the thought that the care recipient 81a would like to express, and touches the complete button 713.
In response to the operation, the thought data generation module 31D generates thought data 617 indicating the selected thought option as a correct thought. Note that if there are no suitable thought options, a new thought option may be added as with the thought data generation module 30D.
The raw data transmission module 31E correlates the moving image data 611, the spatial condition data 612, the weather data 613, the current location data 614, the time data 615, the biometric data 616, and the thought data 617 with an event code of the event this time and the user code of the care recipient 81a, and transmits, to the care support server 2, the resultant as one set of raw data 61.
In the care support server 2, in a case where the raw data 61 is transmitted from the caregiver terminal 3, the motion pattern determination module 21A determines which of the predefined patterns the motion of the care recipient 81a seen in a moving image of the moving image data 611 of the raw data 61 corresponds to. The determination method is similar to the determination method of the motion pattern determination module 20A illustrated in
The data set generation module 21B generates the data set 62 (see
The machine learning processing module 21D has the explanatory variable table 606 for the care recipient 81a. As illustrated in
In a case where a certain number of data sets 62 (100 sets, for example) or more that is correlated with the user code of the care recipient 81a and correlated with an identical event code is stored into the data set storage module 21C, the machine learning processing module 21D reads the data sets 62 out of the data set storage module 21C. A predetermined ratio of the data sets 62 is used as the training data 6A and the remaining data sets 62 are used as the test data 6B.
The machine learning processing module 21D uses the training data 6A to generate an inference model 68 for the event of the care recipient 81a based on a known machine learning algorithm. The point that the thought data 617 of the data constituting each set of the training data 6A is used as the objective variable is similar to machine learning by the machine learning processing module 20D. However, among variables included in each set of the spatial condition data 612, the weather data 613, the current location data 614, the time data 615, the biometric data 616, and the motion pattern data 618, a variable correlated with the event in the explanatory variable table 606 of the care recipient 81a is selected and used as the explanatory variable. In other words, the machine learning processing module 21D implements machine learning with the thought data 617 and the data indicating the selected explanatory variable used as the data set. Further, at this time, a weight coefficient of each variable (explanatory variable) indicated in the explanatory variable table 606 is used.
Incidentally, the explanatory variable used in machine learning may be selected by not the machine learning processing module 21D but the data set generation module 21B. To be specific, in response to the raw data 61 transmitted from the caregiver terminal 3, the data set generation module 21B selects an explanatory variable corresponding to the event code of the raw data 61 based on the explanatory variable table 606 for the care recipient 81a. The data set generation module 21B then combines the thought data 617 with data on the selected explanatory variable to store the resultant as the data set 62 into the data set storage module 21C.
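Whichever module performs the selection, it can be reduced to filtering the raw variables by the event code, as sketched below with a hypothetical explanatory variable table 606; scaling numeric values by the weight coefficients is only one possible interpretation of how the weights are applied.

```python
# Hypothetical explanatory variable table 606: for each event code, the variables
# used as explanatory variables and their weight coefficients.
EXPLANATORY_VARIABLE_TABLE = {
    "E01": {"motion_pattern": 1.0, "temperature": 0.8, "heart_rate": 0.6, "illuminance": 0.4},
    "E02": {"motion_pattern": 1.0, "time_frame": 0.7, "amount_of_sweating": 0.5},
}

def select_explanatory_variables(event_code: str, raw_record: dict) -> dict:
    """Keep only the variables correlated with the event and scale numeric
    values by the weight coefficients before machine learning or inference."""
    selected = {}
    for name, weight in EXPLANATORY_VARIABLE_TABLE.get(event_code, {}).items():
        value = raw_record.get(name)
        if value is None:
            continue
        selected[name] = value * weight if isinstance(value, (int, float)) else value
    return selected
```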
In response to the inference model 68 generated, the accuracy test module 21E uses the test data 6B to test the accuracy of the inference model 68. The test method is basically similar to the test method by the accuracy test module 20E. It should be noted that the explanatory variable indicated in the explanatory variable table 606 is used.
In a case where the accuracy test module 21E certifies that the inference model 68 is acceptable, the inference model 68 is correlated with the user code of the care recipient 81a and the event code of the corresponding event, and the resultant is stored into the inference model storage module 212.
The individual modules of the inference module 213 of the care support server 2 and the individual modules of the client module 312 of the caregiver terminal 3 perform processing for inferring a thought of the care recipient 81. Hereinafter, the processing is described by taking an example of inferring a thought of the care recipient 81a.
In the caregiver terminal 3 placed in the inference mode, in a case where the moving image audio data 601 is input from the video camera 41, the desire expression detection module 31F determines whether a desire expressive reaction is seen based on the moving image audio data 601. The determination method is similar to the determination method by the desire expression detection module 31A. The desire expression detection module 31F also extracts the moving image data 631 as with the moving image data extraction module 30F (see
If it is determined that a desire expressive reaction is seen, then the environmental data acquisition module 31G and the biometric data acquisition module 31H perform the following processing.
The environmental data acquisition module 31G acquires the spatial condition data 632, the weather data 633, the current location data 634, and the time data 635 in a method similar to the processing method of the environmental data acquisition module 30B.
The biometric data acquisition module 31H acquires the biometric data 636 in a method similar to the processing method of the biometric data acquisition module 30C.
The raw data transmission module 31J correlates the moving image data 631, the spatial condition data 632, the weather data 633, the current location data 634, the time data 635, and the biometric data 636 with the user code of the care recipient 81a, and transmits, to the care support server 2, the resultant as one set of raw data 61.
In the care support server 2, the motion pattern determination module 21F determines which of the patterns a motion of the care recipient 81a corresponds to, in a method similar to the method of the motion pattern determination module 20A.
The target data generation module 21G generates the motion pattern data 637 indicating the determination result of the motion pattern determination module 21F, and replaces the moving image data 631 of the raw data 61 with the generated motion pattern data 637, and thereby generates the target data 64.
As with the machine learning processing module 21D, the inference processing module 21H includes the explanatory variable table 606 (see
The inference processing module 21H determines, based on the explanatory variable table 606, an event that corresponds to a current location indicated in the current location data 634 and corresponds to a time indicated in the time data 635 of the target data 64. For example, in a case where the current location data 634 indicates “classroom” and the time data 635 indicates “11:55”, the inference processing module 21H determines that the event is “break”.
The inference processing module 21H inputs, to an inference model 68 corresponding to the user code of the care recipient 81a and the event code of the determined event, the values of the explanatory variables corresponding to the event out of the target data 64, acquires an output value from the inference model 68, and infers the thought of the care recipient 81a.
The inference result replying module 21J transmits, as the inference result data 65, data indicating the thought of the care recipient 81a inferred by the inference processing module 21H to the caregiver terminal 3 of the caregiver 82a.
In the caregiver terminal 3, in response to the inference result data 65 received, the inference result output module 31K displays a character string representing the thought indicated in the inference result data 65 in the form of the inference result screen 72 (see
In the caregiver terminal 3, in response to the inference result data 65 received, the inference result output module 31K (see
A thought of the care recipient 81 (81b, . . . ) other than the care recipient 81a is also inferred based on the corresponding target data 64 and inference model 68 in the similar method, and the inference result is sent to the caregiver terminal 3 of the caregiver 82 who cares for the corresponding care recipient 81 (81b, . . . ).
According to the method for implementing machine learning by preselecting, for each event, an explanatory variable to be used as described above, it is possible to generate a well-fitted inference model 68 for each event, which increases the accuracy of inference. One of the main factors that enabled the inventor to devise this method is that variables highly correlated with the desire expressive reaction were identified from among the foregoing various variables by experiments.
The inventor calculated, by experiments, correlations between thoughts of the care recipient 81 and various variables regarding the environment or body of the care recipient 81 (for example, temperature, humidity, air pressure, illuminance, amount of UV radiation, pollen level, weather, cloud cover, wind direction, wind speed, maximum temperature, minimum temperature, sunshine duration, temperature difference, current location, time, time frame, 3-axis geomagnetism, 3-axis acceleration, heart rate, pulse wave, blood oxygen level, blood pressure, body temperature, amount of sweating, electrocardiographic potential, myoelectric potential, motion pattern, eye movement, eyelid movement, change in voice pitch, presence/absence of uttering, pattern of facial expressions, and so on). The experiments have shown that thoughts of the care recipient 81 tend to change, in particular, depending on the current location, i.e., the place, and further, depending on the time or time frame. In short, the experiments have shown that the location and the time are highly correlated with the desire expressive reaction.
Further, the inventor identified that a combination of location and time corresponds to an event in the daily life of the care recipient 81. In view of this, the inference model 68 was generated for each combination of location and time, namely, for each event, as described in the modification.
Further, the experiments by the inventor have shown that a variable highly correlated with the desire expressive reaction among the various variables is different depending on location or time. As examples of the explanatory variable used in machine learning, the event table 605 (see
In the examples of
In the examples of
In the examples of
In the present embodiment, the inference model 68 is generated and used separately for each care recipient 81. Instead of this, a common inference model 68 may be generated and shared among a plurality of care recipients 81.
The configuration, the processing contents, the processing sequence, the configuration of the data set, and so on of the entire or each part of the thought inference system 1, the care support server 2, and the caregiver terminal 3 can be changed appropriately according to the gist of the present invention.
Foreign Application Priority Data: 2020-158438, September 2020, JP (national).
Related Application Data: parent application PCT/JP2021/034867, September 2021, US; child application 18178922, US.