This disclosure relates to an information processing system, an information processing device, an information processing method, and an information processing program (“information processing system and the like”) which analyze and visualize various kinds of data such as a medical questionnaire, voice, a facial expression, and a pulse wave/heart rate of a subject to contribute towards solving social problems by realizing early detection of a mental illness.
Specifically, this disclosure relates to an information processing system and the like which, by having a subject (user) use a questioning/examination site on a network such as the Internet to fill in a questionnaire (stress check) related to stress and subsequently engage in a video chat with a counselor or use a pulse wave meter, acquire questionnaire data, facial expression image data, voice data, pulse wave/heart rate data and the like of the subject, calculate values of a stress level, a brain fatigue level, and a mood level based on the various kinds of acquired data and, by plotting the calculated values in a three-dimensional space defined by an X-axis, a Y-axis, and a Z-axis, visualize the subject's emotion (mental status) and the like to support diagnosis and treatment.
In recent years, while the popularization of information technology (IT) has led to rapid simplification and improved convenience in communication, there is an upward trend in the number of people with mental illnesses caused by a lack of communication, work-related stress, fatigue and the like. Therefore, diagnostic systems are being proposed which, by quantitatively measuring and objectively assessing data related to a stressed state or a fatigued state of a person (subject), enable the subject himself/herself to readily comprehend fatigue/stress.
For example, a fatigue/stress examination system described in Japanese Patent Laid-Open No. 2015-054002 is capable of analyzing, using a cloud-side analysis server, electrocardiogram/pulse wave data measured by an electrocardiogram monitor/pulse wave meter, comprehending a state of stress in terms of a numerical value from a balance and strength of autonomic nerves, transmitting the analyzed data to a client terminal, and visually displaying the analyzed data on the client terminal.
In addition, a health value estimation system described in Japanese Patent Laid-Open No. 2018-181004 constructs an estimation model by classifying characteristic behavior (behavioral features) that appears under stress into a plurality of clusters, converting the behavioral features into numerical values from action history including position information and movement information acquired from various sensors included in a mobile terminal such as a smartphone, power on/off events, logs related to activation of applications, the number of times of telephone use and the like, and learning, by machine learning, a relationship with a stressed state based on heart rate data measured in advance. Furthermore, by collating a numerical value of a behavioral feature newly acquired using the mobile terminal such as a smartphone with the constructed estimation model, a health value indicating a state of health of the subject himself/herself can be estimated.
The fatigue/stress examination system described in JP '002 simultaneously measures an electrocardiogram and a pulse wave of a subject, measures a state of autonomic nerves of the subject from the electrocardiogram/pulse wave data, and provides unified management of fatigue/analysis result data so that a degree of fatigue and a stress tendency are visualized as numerical values. However, since the fatigue/analysis result data does not reflect subjective determination results based on voice and facial expressions of the subject himself/herself that are obtained during questioning or an interview by a doctor, an industrial physician, a health nurse or the like, there is a possibility that an analysis of emotion of the subject cannot be realized with high accuracy.
In addition, the health value estimation system described in JP '004 is capable of constructing an estimation model using accurately quantified data which can become more suitable training data in supervised machine learning and capable of estimating a health value of the subject himself/herself. However, since the health value estimation system cannot accurately quantify a depressive mood when the subject himself/herself is unaware of being melancholic, subjective determination results based on voice and facial expressions of the subject himself/herself that are obtained during questioning or an interview by a doctor, an industrial physician, a health nurse or the like cannot be used as training data. As a result, since an estimation model that sufficiently reflects an emotion (mental status) of the subject cannot be constructed, there is a possibility that the health value estimation system described in JP '004 is also unable to realize an analysis of emotion of the subject with high accuracy.
It could therefore be helpful to provide an information processing system and the like which acquire not only quantitative data (a pulse wave/heart rate and the like) of the subject measured by measuring instruments but also data of a stress check result based on a stress check questionnaire (Stress Check System Introduction Manual—Ministry of Health, Labour and Welfare (URL https://www.mhlw.go.jp/bunya/roudokijun/anzeneisei12/pdf/150709-1.pdf) (Retrieved Feb. 15, 2021)) and data such as voice, a facial expression image or the like of the subject during counseling using a communication tool such as a video chat (video call), calculate values of a stress level, a brain fatigue level, and a mood level from the data and, by plotting the calculated values in a three-dimensional space defined by an X-axis, a Y-axis, and a Z-axis, visualize the subject's emotion (mental status) and the like and enable diagnosis and treatment to be assisted.
We thus provide:
An information processing device that is connected to a terminal device of a subject and visualizes emotion of the subject includes:
Preferably, the data managing section associates the data related to the voice, the facial expression image, and the pulse wave of the subject with dates and times of acquisition of the data and stores the data in storage means of the information processing device, and
Preferably, the three-dimensional space is divided into a plurality of per-type classification categories, and
Preferably, an improvement plan to be proposed to the subject is determined for each of the plurality of per-type categories, and
Preferably, the data related to the voice is data acquired by making a continuous audio recording of the voice of the subject reading out loud predetermined fixed phrases displayed on the terminal device at least until a predetermined audio recording time is reached during a video call with the subject via the terminal device.
Preferably, the emotion expression engine section executes a cerebral activity index measurement algorithm for measuring CEM values each representing a cerebral activity index to acquire one or more of the CEM values for each subject from the data related to the voice, and the brain fatigue level is an average value of the one or more CEM values.
Preferably, the data related to the pulse wave is data acquired by dividing a pulse wave measured by the pulse wave meter into sections, each section being a predetermined time interval.
Preferably, the emotion expression engine section divides, for each section of the pulse wave, the pulse wave in the section into Hamming windows and calculates, with respect to the pulse wave in each of the Hamming windows, a pulse interval PPI being an interval from a peak to a next peak of the pulse wave of one heartbeat and a time of day,
Preferably, the low-frequency section is 0.04 Hz or higher and lower than 0.15 Hz, and
Preferably, the data related to the facial expression image is data acquired by making a continuous video recording of a moving image of a facial expression of the subject until at least a predetermined video recording time is reached during a video call with the subject via the terminal device.
Preferably, the emotion expression engine section executes a facial expression recognition algorithm to count each of a plurality of emotion expressions recognized from a moving image of a facial expression of the subject included in the data related to the facial expression image,
Preferably, the plurality of emotion expressions are happy, surprise, neutral, fear, angry, disgust, and sad in Russell's circumplex model of affect.
Preferably, the data managing section acquires environmental data at least including air temperature and humidity in addition to the data related to the voice, the facial expression image, and the pulse wave of the subject, and
Preferably, the data managing section acquires questionnaire data including a score of a stress check result of the subject in addition to the data related to the voice, the facial expression image, and the pulse wave of the subject, and
An information processing method is executed in a server connectable to a terminal device of a subject via a network, the information processing method including the steps of:
An information processing system includes:
When executed by a computer, the program causes the computer to function as each section of the information processing device.
When executed by a computer, the program causes the computer to execute each step of the information processing method.
We thus provide an information processing system and the like capable of acquiring not only quantitative data such as a pulse wave/heart rate acquired from a pulse wave meter but also data of a stress check result and data such as voice, a facial expression image or the like of the subject during counseling using a video call, calculating values of a stress level, a brain fatigue level, and a mood level from the pieces of data and, by plotting the calculated values in a three-dimensional space defined by an X-axis, a Y-axis, and a Z-axis, visualizing the subject's emotion and the like and enabling diagnosis and treatment to be assisted, thereby realizing analysis of emotion of the subject with high accuracy, realizing early detection of mental illness, and contributing towards solving social problems.
Hereinafter, examples will be described with reference to the accompanying drawings. The following examples describe our devices, systems, and methods and are not intended to limit this disclosure solely to these examples. In addition, various modifications may be made without departing from the scope thereof. Furthermore, the same constituent elements in the drawings will be denoted by the same reference signs whenever possible and redundant descriptions will not be repeated.
For example, the information processing device 10 is a computer such as a server that is connectable to a network N. In addition, for example, the terminal device 20 is a terminal connectable to the network N such as a personal computer, a notebook personal computer, a smartphone, or a mobile phone.
For example, the network N may be an open network such as the Internet or a closed network such as an intranet that is connected by a dedicated line. The network N is not limited thereto and, when appropriate, a closed network and an open network may be used in combination in accordance with a required security level or the like.
The information processing device 10 and the terminal device 20 are connected to the network N and are capable of communicating with each other. Using the terminal device 20, a subject (user) can access the information processing device 10 and transmit an answered questionnaire (medical questionnaire) of a stress check to the information processing device 10. The questionnaire of the stress check is, for example, a questionnaire of a stress check in the Stress Check Implementation Program issued by the Ministry of Health, Labour and Welfare (Stress Check System Introduction Manual—Ministry of Health, Labour and Welfare (URL https://www.mhlw.go.jp/bunya/roudokijun/anzeneisei12/pdf/150709-1.pdf) (Retrieved February 2021)).
In addition, to receive counseling from a counselor, the subject can perform a video call (video chat) with the counselor via the terminal device 20. Furthermore, the terminal device 20 can transmit data related to a pulse wave of the subject having been measured using a pulse wave meter to the information processing device 10. As the pulse wave meter, for example, a device that measures a pulse wave from a fingertip of the subject can be used (Checking Corona-related Stress by a Fingertip, Jointly-developed by Yamagata University, Jul. 18, 2020, Asahi Shimbun Digital (URL https://www.asahi.com/articles/ASN7K6V99N78UZHB00M.html) (Retrieved Feb. 15, 2021)).
The information processing device 10 can acquire at least data related to voice, a facial expression image, and a pulse wave of the subject from the terminal device 20 and can calculate, based on the pieces of data, indexes representing an emotion (mental status) of the subject such as a brain fatigue level, a mood level, and a stress level to be described later.
For example, the information processing device 10 is a server (computer) and, illustratively, the information processing device 10 includes a CPU (Central Processing Unit) 11, a memory 12 constituted of a ROM (Read Only Memory), a RAM (Random Access Memory) and the like, a bus 13, an input/output interface 14, an input section 15, an output section 16, a storage section 17, and a communicating section 18.
The CPU 11 executes various kinds of processing in accordance with a program recorded in the memory 12 or a program loaded to the memory 12 from the storage section 17. For example, the CPU 11 can execute a program that causes the server (computer) to function as an information processing device capable of visualizing emotion of the subject and assisting diagnosis and treatment. In addition, at least a part of the functions of the information processing device can be implemented by hardware with an application specific integrated circuit (ASIC) or the like.
The memory 12 also stores, when appropriate, data and the like necessary for the CPU 11 to execute the various kinds of processing. The CPU 11 and the memory 12 are connected to each other via the bus 13. The input/output interface 14 is also connected to the bus 13. The input section 15, the output section 16, the storage section 17, and the communicating section 18 are connected to the input/output interface 14.
The input section 15 can be realized by an input device such as a keyboard or a mouse independent of a main body that houses other sections of the information processing device 10 and various kinds of information can be input in accordance with an instruction operation by a user (manager) or the like of the information processing device 10. The input section 15 may be constituted of various buttons, a touch panel, a microphone or the like.
The output section 16 is constituted of a display, a speaker or the like and outputs data related to a text, a still image, a moving image, voice or the like. The text data, still image data, moving image data, voice data or the like outputted by the output section 16 is outputted from the display, the speaker or the like to be recognizable by the user as characters, an image, video, or voice.
The storage section 17 is constituted of a storage device such as a DRAM (Dynamic Random Access Memory) or another semiconductor memory, a solid state drive (SSD), or a hard disk and is capable of storing various kinds of data.
The communicating section 18 realizes communication with other devices. For example, the communicating section 18 is capable of communicating with other devices (for example, the terminal devices 20-1, 20-2 to 20-n) via the network N.
Although not illustrated, the information processing device 10 is provided with a drive when necessary. For example, a removable medium constituted of a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like is appropriately mounted to the drive. The removable medium stores a program for realizing a function of visualizing emotion and the like of the subject by calculating values of a stress level, a brain fatigue level, and a mood level of the subject and plotting the values in a three-dimensional space defined by an X-axis, a Y-axis, and a Z-axis, as well as various kinds of data such as text data and image data. The program and the various kinds of data read from the removable medium by the drive are installed in the storage section 17 when necessary.
Next, a configuration of hardware of the terminal device 20 will be described. As shown in
A functional configuration of the information processing device 10 included in the information processing system to visualize emotion of the subject will be described with reference to
In addition, using a partial storage area of the storage section 17, the storage section 17 can be caused to function as a user information database 171. As another example, the user information database 171 can be constituted of an external storage device separate from the information processing device 10 and, for example, a cloud storage can be used as the external storage device. While the user information database 171 is configured as a single storage device in these examples, the user information database 171 may be stored divided into two or more storage devices.
The emotion expression engine section 111 can calculate a brain fatigue level, a mood level, and a stress level as indexes that represent emotion of the subject based on data related to voice, a facial expression image, and a pulse wave of the subject acquired from the terminal device 20. For example, the emotion expression engine section 111 can calculate a brain fatigue level based on a frequency of the voice, calculate a mood level by extracting an emotion of the subject from the facial expression image, and calculate a stress level by performing a frequency analysis of the pulse wave by fast Fourier transform and extracting a high-frequency section and a low-frequency section.
The three-axes processing section 112 can generate a graph of points plotted at coordinates corresponding to the brain fatigue level, the mood level, and the stress level calculated by the emotion expression engine section 111 and display the graph on the information processing device 10 or the terminal device 20. In addition, the three-axes processing section 112 can generate a graph of points plotted according to a time series at coordinates corresponding to the brain fatigue level, the mood level, and the stress level of the subject for each of the dates and times at which data related to voice, a facial expression image, and a pulse wave of the subject has been acquired in a three-dimensional space defined by an X-axis, a Y-axis, and a Z-axis and display the graph on the information processing device 10 or the terminal device 20.
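As a minimal sketch of the three-axes processing described above, generating the time-series graph reduces to producing one (X, Y, Z) coordinate per acquisition date and time, sorted chronologically, with X, Y, and Z corresponding to the stress level, the brain fatigue level, and the mood level. The function name, record layout, and sample values below are illustrative assumptions, not taken from the actual implementation.

```python
from datetime import datetime

def to_plot_points(records):
    """Convert per-acquisition records into time-ordered (x, y, z) points
    for the three-dimensional graph: X = stress level, Y = brain fatigue
    level, Z = mood level. Each record is (acquired_at, stress, fatigue, mood)."""
    ordered = sorted(records, key=lambda r: r[0])  # chronological order
    return [(stress, fatigue, mood) for _, stress, fatigue, mood in ordered]

# Hypothetical measurements for one subject on three dates.
records = [
    (datetime(2021, 2, 16), 0.8, 0.5, 0.6),
    (datetime(2021, 2, 14), 0.9, 0.4, 0.7),
    (datetime(2021, 2, 15), 0.7, 0.6, 0.5),
]
points = to_plot_points(records)
# The earliest acquisition (Feb. 14) comes first in the plotted series.
```

The resulting list can then be handed to any 3-D plotting facility to draw the points as a trajectory over time.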
The data managing section 113 can acquire at least data related to voice, a facial expression image, and a pulse wave of the subject from the terminal device 20 and store the data in storage means (for example, the user information database 171) of the information processing device 10. In addition, the data managing section 113 can associate the data related to the voice, the facial expression image, and the pulse wave of the subject with a date and time at which the data has been acquired and store the associated data in the storage means (the user information database 171) of the information processing device.
In addition, the table R2 stores, in association with a date and time, voice data, facial expression image data, pulse wave data, questionnaire data including answers to a stress check by the subject, life log data that records behavior and the like of the subject, and environmental data including air temperature and humidity received from the terminal device 20. The date and time included in the table R2 is, for example, the date and time at which the data related to the voice, the facial expression image, and the pulse wave of the subject was received from the terminal device 20, the date and time at which the subject accessed the information processing device 10 using the terminal device 20, and the like.
Furthermore, in association with time obtained by normalizing a date and time stored in the table R2, the stress level, the brain fatigue level, and the mood level of the subject calculated by the emotion expression engine section 111 are stored as values of an X-axis, a Y-axis, and a Z-axis in a three-dimensional space. Normalization of a date and time can be realized by, for example, converting the date and time into a UNIX (registered trademark) time.
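The normalization into a UNIX time described above can be sketched as follows; the function name is an illustrative assumption, and the stored dates and times are treated as UTC for the conversion.

```python
from datetime import datetime, timezone

def normalize_datetime(dt):
    """Normalize an acquisition date and time into a UNIX time
    (seconds elapsed since 1970-01-01 00:00:00 UTC)."""
    return dt.replace(tzinfo=timezone.utc).timestamp()

# Feb. 15, 2021 00:00 UTC as a UNIX time.
t = normalize_datetime(datetime(2021, 2, 15, 0, 0, 0))  # → 1613347200.0
```

Normalized times of this kind give the three-dimensional plot a uniform, monotonically increasing time key regardless of the original date/time format.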
Subsequently, the terminal device 20 displays a questioning result on a screen (step S2). Screen displays of the terminal device 20 when executing processing from step S1 to step S2 are shown in
After the subject answers all of the items of the stress check questionnaire, contents such as those shown in
After registration, for example, contents such as those shown in
In addition, the terminal device 20 can display a message such as “A limit of registrants for step S2 in which chat consulting with an expert and stress measurement by a stress meter can be performed has been reached. If you wish to use step S2, please register on the registration page via the link provided below” in a lower half of the screen and can prompt the subject to register more detailed personal information by performing a selection operation such as a click or a tap of “To user registration” at the bottom of the screen. The personal information of the subject having been transmitted from the terminal device 20 to the information processing device 10 is stored by the data managing section 113 in, for example, the user information database.
Once again referring to the flow chart shown in
The terminal device 20 can communicably connect to the pulse wave meter 30 and receive pulse wave data of the subject from the pulse wave meter 30.
Referring to the flow chart shown in
The weight multiplication unit section 111B can adjust each value of the brain fatigue level, the mood level, and the stress level calculated by the emotion expression core section 111A by respectively multiplying the brain fatigue level, the mood level, and the stress level by a weight coefficient determined based on a discomfort index calculated from the air temperature and the humidity included in the environmental data. When the discomfort index is denoted by DI, the air temperature by T, and the humidity by H, for example, the discomfort index can be obtained by a formula expressed as DI=0.81T+0.01H×(0.99T−14.3)+46.3. In addition, although not illustrated in
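The discomfort index formula given above can be written directly in code. The mapping from the discomfort index DI to a weight coefficient is not specified in the text, so the linear ramp in `weight_from_di` below is purely an illustrative assumption.

```python
def discomfort_index(temp_c, humidity_pct):
    """Discomfort index DI from air temperature T (deg C) and humidity H (%),
    per the formula above: DI = 0.81T + 0.01H x (0.99T - 14.3) + 46.3."""
    return 0.81 * temp_c + 0.01 * humidity_pct * (0.99 * temp_c - 14.3) + 46.3

def weight_from_di(di, comfortable=70.0, scale=0.01):
    """Hypothetical weight coefficient: 1.0 at a comfortable DI, growing
    linearly as conditions become less comfortable (assumed mapping)."""
    return 1.0 + scale * (di - comfortable)

di = discomfort_index(25.0, 50.0)  # → 71.775
w = weight_from_di(di)
```

For example, at 25 °C and 50% humidity the index evaluates to 0.81×25 + 0.01×50×(0.99×25 − 14.3) + 46.3 = 20.25 + 5.225 + 46.3 = 71.775.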
The emotion expression engine section 111 calculates a brain fatigue level in the emotion expression core section 111A from voice data of the subject acquired as input data. For example, the brain fatigue level can be obtained by calculating a CEM (Cerebral Exponent Macro) value that represents a cerebral activity index. A cerebral activity index measurement algorithm (SiCECA algorithm, Yuki Aoki et al., Development of Fatigue Degree Estimation System for Smartphone, E-037 FIT2013) developed by the Electronic Navigation Research Institute enables a cerebral activity index (CEM value) to be calculated from voice. By executing the cerebral activity index measurement algorithm, the emotion expression engine section 111 can acquire one or more (for example, around two to five) CEM values for each subject from data related to the voice of the subject. For example, the brain fatigue level corresponds to a value obtained by calculating an average value of the one or more CEM values.
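Once CEM values have been obtained, deriving the brain fatigue level is a simple average. The SiCECA algorithm that produces each CEM value is proprietary and is not reproduced here; the values below are hypothetical.

```python
def brain_fatigue_level(cem_values):
    """Average the one or more CEM values acquired for a subject.
    Each CEM value is assumed to come from the cerebral activity
    index measurement algorithm applied to the subject's voice data."""
    return sum(cem_values) / len(cem_values)

# e.g. three hypothetical CEM values from one subject's voice recording
level = brain_fatigue_level([62.0, 70.0, 78.0])  # → 70.0
```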
Once again referring to
By executing the facial expression recognition algorithm, the emotion expression engine section 111 can recognize a plurality of emotion expressions from a moving image of a facial expression of the subject included in the data related to the facial expression image. For example, the plurality of emotion expressions may be the seven kinds, namely, happy, surprise, neutral, fear, angry, disgust, and sad in Russell's circumplex model of affect (J. A. Russell et al., Core affect, prototypical emotional episodes, and other things called emotion: Dissecting the elephant, Journal of Personality and Social Psychology, 76(5), 805-819)).
By executing the facial expression recognition algorithm (for example, the open source software “Face classification and detection”), the emotion expression engine section 111 counts each of the plurality of emotion expressions recognized from a moving image of a facial expression of the subject included in the data related to the facial expression image. The emotion expression engine section 111 calculates a proportion for each of the plurality of emotion expressions and calculates, for each of the plurality of emotion expressions, a mood index by multiplying the proportion of the emotion expression by a predetermined weight determined for the emotion expression. The predetermined weight with respect to each of the plurality of emotion expressions can be determined based on Russell's circumplex model of affect as shown in
In the example shown in
The emotion expression engine section 111 can adopt a value obtained by dividing a maximum mood index being a largest mood index among mood indexes of the emotion expressions by a total value of the mood indexes of the emotion expressions as the mood level. In the example shown in
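The count-to-mood-level pipeline described above can be sketched as follows. The per-expression weights are described as being derived from Russell's circumplex model, but no numeric values are given in the text, so the weights and frame counts below are illustrative assumptions.

```python
from collections import Counter

# Hypothetical weights per emotion expression (assumed, not from the text).
WEIGHTS = {"happy": 1.0, "surprise": 0.8, "neutral": 0.6,
           "fear": 0.3, "angry": 0.2, "disgust": 0.2, "sad": 0.1}

def mood_level(recognized_expressions):
    """Count the recognized expressions, convert counts to proportions,
    weight each proportion into a per-expression mood index, and return
    (largest mood index) / (sum of all mood indexes)."""
    counts = Counter(recognized_expressions)
    total_frames = sum(counts.values())
    indexes = {e: (n / total_frames) * WEIGHTS[e] for e, n in counts.items()}
    return max(indexes.values()) / sum(indexes.values())

# Hypothetical per-frame recognition results from one video call.
frames = ["happy"] * 6 + ["neutral"] * 3 + ["sad"] * 1
mood = mood_level(frames)
# mood indexes: happy 0.6, neutral 0.18, sad 0.01 → 0.6 / 0.79 ≈ 0.759
```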
Once again referring to
The emotion expression engine section 111 divides, for each section of the pulse wave, the pulse wave in the section into Hamming windows and calculates, with respect to the pulse wave in each of the Hamming windows, a pulse interval PPI being an interval from a peak to a next peak of the pulse wave of one heartbeat and a time of day. The emotion expression engine section 111 generates, for each section of the pulse wave, a time-PPI graph which plots a point at coordinates corresponding to the pulse interval PPI and the time of day in a two-dimensional space defined by time of day as an axis of abscissa and PPI as an axis of ordinate. By performing interpolation such as linear interpolation or cubic spline interpolation between discrete values in the time-PPI graph, subsequently applying a fast Fourier transform FFT, and respectively integrating a power spectral density PSD of a result of the FFT in a low-frequency section and in a high-frequency section, the emotion expression engine section 111 can calculate an LF value corresponding to a low-frequency component, an HF value corresponding to a high-frequency component, and an LF/HF value. Known stressed state estimation methods for calculating the LF value, the HF value, and the LF/HF value from the pulse interval PPI can be used for this purpose.
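A condensed sketch of this PPI-to-LF/HF pipeline is shown below: the discrete time-PPI points are linearly interpolated onto a uniform grid, an FFT is applied, and the power spectral density is integrated over each band. Hamming windowing, cubic spline interpolation, and per-section processing are omitted for brevity; the resampling rate and the synthetic test signal are illustrative assumptions.

```python
import numpy as np

def lf_hf_from_ppi(ppi_times, ppi_values, fs=4.0,
                   lf_band=(0.04, 0.15), hf_band=(0.15, 0.40)):
    """Interpolate a discrete time-PPI series onto a uniform grid, apply an
    FFT, and integrate the power spectral density over the LF and HF bands.
    Returns (LF value, HF value, LF/HF value)."""
    grid = np.arange(ppi_times[0], ppi_times[-1], 1.0 / fs)
    ppi = np.interp(grid, ppi_times, ppi_values)   # linear interpolation
    ppi = ppi - ppi.mean()                         # drop the DC component
    n = len(ppi)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(ppi)) ** 2 / (n * fs)  # one-sided PSD estimate
    df = freqs[1] - freqs[0]
    lf = psd[(freqs >= lf_band[0]) & (freqs < lf_band[1])].sum() * df
    hf = psd[(freqs >= hf_band[0]) & (freqs < hf_band[1])].sum() * df
    return lf, hf, lf / hf

# Synthetic 5-minute PPI series: a strong 0.10 Hz (LF-band) rhythm plus a
# weaker 0.25 Hz (HF-band) rhythm, so LF should dominate and LF/HF > 1.
beats = np.arange(0.0, 300.0, 0.8)                        # beat times (s)
ppi = 800 + 50 * np.sin(2 * np.pi * 0.10 * beats) \
          + 10 * np.sin(2 * np.pi * 0.25 * beats)         # PPI in ms
lf, hf, ratio = lf_hf_from_ppi(beats, ppi)
```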
The emotion expression engine section 111 can adopt the LF/HF value (sympathetic nervous system index) as the stress level. Alternatively, the stress level can be a level based on at least one value among the LF value, the HF value, and the LF/HF value. For example, referring to previous data values, normalization can be performed by setting a maximum value of LF/HF to 2 and, when also using HF (parasympathetic nervous system index), setting a maximum value of HF to 900. When there are a plurality of pieces of normalized data in each window section, an average value is calculated. In the example shown in
When only an LF/HF value is used as a stress level, inversion is performed so that a maximum value (MAX) of stress becomes a minimum (MIN), and (MAX2 − standard value) is converted into one axis. For example, in the example shown in
The low-frequency section can be set from 0.04 Hz or higher to lower than 0.15 Hz and the high-frequency section can be set from 0.15 Hz or higher to lower than 0.4 Hz.
As shown in
Such a graph display in a three-dimensional space enables multi-dimensional analysis by integrating the three axes of the brain fatigue level, the mood level, and the stress level with a time axis, and is displayed in a manner readily interpretable by the subject, experts, and other users. While a diagnosis result is depicted by a radar chart or the like in conventional stress checks, correspondences between factors are hardly represented. As shown in
For example, the per-type classification categories defined in the three-dimensional space shown in
As shown in
In addition, an improvement plan to be proposed to the subject is determined for each of the plurality of per-type categories, and the emotion expression engine section 111 can notify the terminal device 20 of the subject, the information processing device 10 or the like of the improvement plan with respect to the category to which a point of coordinates corresponding to the brain fatigue level, the mood level, and the stress level of the subject belongs in the three-dimensional space.
Examples of improvement plans are shown in Table 2.
For example, when a given subject belongs to the per-type category A, the emotion expression engine section 111 can notify the subject of jogging, stretching, trekking, mindfulness, and yoga, marked by circles in Table 2, as improvement plans. In addition, when another subject belongs to the per-type category B, the emotion expression engine section 111 can notify the subject of stretching, yoga, and cognitive behavioral therapy, marked by circles, as improvement plans. In this manner, the emotion expression engine section 111 can propose suitable improvement plans in accordance with the per-type category to which a subject belongs.
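The category-to-plan notification above amounts to a simple lookup. The mapping below reproduces only the two example categories given in the text; entries for further categories would be filled in the same way, and the function name is an illustrative assumption.

```python
# Category → improvement-plan mapping taken from the examples above
# (per-type categories A and B of Table 2).
IMPROVEMENT_PLANS = {
    "A": ["jogging", "stretching", "trekking", "mindfulness", "yoga"],
    "B": ["stretching", "yoga", "cognitive behavioral therapy"],
}

def plans_for(category):
    """Return the improvement plans to notify for a subject's
    per-type category (empty list if the category is unknown)."""
    return IMPROVEMENT_PLANS.get(category, [])
```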
As described above, we provide an information processing system and the like which are capable of acquiring not only quantitative data such as a pulse wave/heart rate acquired from a pulse wave meter but also data of a stress check result and data such as voice, a facial expression image or the like of the subject during counseling using a video call, calculating values of a stress level, a brain fatigue level, and a mood level from the pieces of data and, by plotting the calculated values in a three-dimensional space defined by an X-axis, a Y-axis, and a Z-axis, visualizing the subject's emotion and the like and enabling diagnosis and treatment to be assisted, thereby realizing analysis of emotion of the subject with high accuracy, realizing early detection of mental illness, and contributing towards solving social problems.
The information processing system and the like are applicable to a wide range of applications including stress checks at businesses, by individuals, at educational establishments and the like, mental training in sports, improving concentration during learning, measuring mentality during employment examinations and the like, for example.
Number | Date | Country | Kind |
---|---|---|---|
2021-023407 | Feb 2021 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2022/005712 | 2/14/2022 | WO |