This patent application is based on and claims priority pursuant to 35 U.S.C. § 119(a) to Japanese Patent Application Nos. 2022-185030, filed on Nov. 18, 2022, and 2023-170784, filed on Sep. 29, 2023, in the Japan Patent Office, the entire disclosure of each of which is hereby incorporated by reference herein.
Embodiments of the present disclosure relate to an information processing system, an activity sensor, and a non-transitory recording medium.
Telework and remote work are now widespread, and more efficient ways of working and more creative output are desired for collaborative work performed by employees in an office. One approach to promoting interaction among employees in, for example, an office so as to efficiently produce more creative output is to detect and analyze activity information of the employees in the office and feed the result back to the environment.
Some technologies to support communication between multiple persons in the same space are known in the art. For example, an information processing apparatus detects the positions of users in a space and displays an image to support the communication between the users.
In an embodiment, an information processing system includes circuitry to receive position information and direction information of a user. The circuitry further controls an environment of a space in which the user performs an activity, in accordance with the position information and the direction information of the user.
In another embodiment, an activity sensor includes a sensor to detect a position and a direction of a user, and circuitry configured to transmit, to an information processing system, position information indicating the position of the user detected by the sensor and direction information indicating the direction of the user detected by the sensor. The position information and the direction information of the user are to be used by the information processing system to control an environment of a space in which the user performs an activity. In another aspect, a non-transitory recording medium carries computer readable codes which, when executed by a computer system, cause the computer system to carry out a method for controlling an environment of a space in which a user performs an activity. The method includes acquiring position information and direction information of the user in the space in which the user performs the activity, and controlling the environment of the space in accordance with the position information and the direction information of the user.
A more complete appreciation of embodiments of the present disclosure and many of the attendant advantages and features thereof can be readily obtained and understood from the following detailed description with reference to the accompanying drawings, wherein:
The accompanying drawings are intended to depict embodiments of the present disclosure and should not be interpreted to limit the scope thereof. The accompanying drawings are not to be considered as drawn to scale unless explicitly noted. Also, identical or similar reference numerals designate identical or similar components throughout the several views.
In describing embodiments illustrated in the drawings, specific terminology is employed for the sake of clarity. However, the disclosure of this specification is not intended to be limited to the specific terminology so selected and it is to be understood that each specific element includes all technical equivalents that have a similar function, operate in a similar manner, and achieve a similar result.
Referring now to the drawings, embodiments of the present disclosure are described below. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.
A description is given below of an environmental control system and an environmental control method performed by the environmental control system according to embodiments of the present disclosure, with reference to the drawings.
Outline of Operation or Processing
Referring to
In the case illustrated in
In the case illustrated in
In the related art, environmental control is not performed in accordance with the positions and directions of the users to, for example, help a user working in a space to efficiently communicate with another user in the space or help one or more users in a space to work efficiently. Examples of environmental control include adjusting tone and outputting images.
By contrast, as described above, the environmental control system according to the present embodiment controls the environment such as sound and images in accordance with the position and the direction of the user. The present embodiment provides, for example, environmental control to promote conversation leading to creative output, or environmental control to help a user concentrate on work.
Terminology
An “activity” refers to performing a certain movement or work. An “activity” is not limited to bodily movement but also includes mental activity. “Activity information” refers to information acquired by the activity sensor 9. However, “activity information” may also be detected by, for example, a camera or a microphone disposed in a space.
A “space” refers to a place having a certain extent where one or more users can be present. A space is not limited to an indoor space but may be an outdoor space.
An “environment” refers to something that affects a user in some way in a space. An “environment” is something at least perceived by the user in five senses. In the present embodiment, a tone and an image are described as examples of the environment. Additionally or alternatively, the information processing system may perform the environmental control by vibration or smell.
“Environmental control” includes controlling a certain device so as to output something (such as video, image, sound, vibration, or smell) that affects any of the five senses of a user in a space.
“Position information” refers to information indicating the position of a user in a space. The position may be, for example, coordinates with respect to a reference point in the space. In the present embodiment, the position information is detected by ultra wideband (UWB). The position information may be represented by a latitude and a longitude detected by, for example, an indoor messaging system (IMES) or a global navigation satellite system (GNSS). In the present embodiment, the position information may be referred to simply as “position.”
“Direction information” refers to information indicating the direction in which the user faces in a space. The direction information is specified, for example, in a range of 0 to 360 degrees in the horizontal direction with respect to the north direction and in a range of 0 to 180 degrees in the elevation angle direction with respect to the zenith direction. The reference direction may be any direction. In the present embodiment, the direction information may be referred to simply as “direction.”
System Configuration
The meeting room is provided with the image display device 12, the sensor 14, the speaker 16, the camera 18, the microphone 20, and the information processing terminal 22. The meeting room may be provided with, for example, a temperature sensor, a humidity sensor, or an illuminance sensor that acquires at least a part of environment-dependent information and transmits the acquired information to the information processing system 10. Although
For example, a user who enters a meeting room carries the activity sensor 9 that transmits radio waves such as a beacon. The sensor 14 in the meeting room receives the radio waves transmitted from the activity sensor 9 of the user in the meeting room as a signal for detecting the position information of the user and transmits the signal to the information processing system 10. The sensor 14 can be any sensor having a positioning system that outputs a signal for detecting the position information of the user.
The activity sensor 9, which is attached to a target (i.e., the user) to be measured, includes, for example, two acceleration and angular velocity sensors, a microphone, and a vital sensor, and has a shape like a necklace to be worn around the neck of the user. The microphone of the activity sensor 9 receives the voice of the user wearing the activity sensor 9. The vital sensor acquires vital data from the user. The activity sensor 9 determines the behavior of the user based on the signals detected by the two acceleration and angular velocity sensors and the values of the signals relative to a reference value.
The activity sensor 9 may be a dedicated sensor, a smart watch, a smartphone, or, for example, any of various types of BLUETOOTH LOW ENERGY (BLE) sensors. The information processing system 10 detects the position information of the users in the meeting room based on the signals for detecting the position information of the users transmitted from one or more sensors 14. The activity sensor 9 described above serves as a transmitter. The transmitter is not limited to the activity sensor 9 and may be any device that transmits a signal for detecting the position information of the user.
The information processing terminal 22 is a device operated by the user in the meeting room. Examples of the information processing terminal 22 include a laptop personal computer (PC), a mobile phone, a smartphone, a tablet communication terminal, a game console, a personal digital assistant (PDA), a digital camera, a wearable PC, a desktop PC, and a device dedicated to the meeting room. The information processing terminal 22 may be carried in the meeting room by the user. Alternatively, the meeting room may be provided with the information processing terminal 22.
The information processing terminal 22 may be a target to be measured by a positioning system. For example, the sensor 14 in the meeting room may receive radio waves transmitted from the activity sensor 9 of the information processing terminal 22 and transmit the received radio waves to the information processing system 10. For example, the sensor 14 transmits, to the information processing system 10, the signal for detecting the position information indicating the relative positions in the meeting room as illustrated in
The camera 18 in the meeting room captures an image of the inside of the meeting room and transmits the captured image data to the information processing system 10 as an output signal. For example, the camera 18 is a KINECT video camera, which includes a range image sensor, an infrared sensor, and an array microphone. When such a video camera including a range image sensor, an infrared sensor, and an array microphone is used, the motion and the posture of the user can be recognized.
The microphone 20 in the meeting room converts the voice of each user into an electrical signal. The microphone 20 transmits the electrical signal converted from the voice of the user to the information processing system 10 as an output signal. As an alternative to or in addition to the microphone 20 in the meeting room, a microphone of the information processing terminal 22 may be used.
The speaker 16 in the meeting room converts an electrical signal into a physical signal and outputs sound such as ambient sound. The speaker 16 outputs the sound such as the ambient sound under the control of the information processing system 10. As an alternative to or in addition to the speaker 16 in the meeting room, a speaker of the information processing terminal 22 may be used. The microphone 20 in the meeting room and the microphone of the information processing terminal 22 serve as input devices. The speaker 16 in the meeting room and the speaker of the information processing terminal 22 serve as output devices.
In addition to the above-described devices illustrated in
Two or more of the devices illustrated in
The meeting room is provided with the multiple image display devices 12, such as a projector. The image display devices 12 can display images on the sides partitioning the meeting room as illustrated in
The shape of the meeting room illustrated in
The information processing system 10 includes one or more information processing apparatuses. As will be described later, the information processing system 10 outputs the ambient sound or image suitable for interaction (for example, conversations and meetings) between the users in the meeting room, based on, for example, the position information of the users detected by the signal transmitted from the sensor 14, the output signal from the camera 18, and the output signal from the microphone 20.
The configuration of the information processing system 10 according to the present embodiment is not limited to that illustrated in
Hardware Configuration
Hardware Configuration of Computer
The information processing system 10 is implemented by, for example, a computer 500 having a hardware configuration illustrated in
The CPU 501 controls the entire operation of the computer 500. The ROM 502 stores programs, such as an initial program loader (IPL), for driving the CPU 501. The RAM 503 is used as a work area for the CPU 501. The HD 504 stores various kinds of data such as a program. The HDD controller 505 controls the reading and writing of various data from and to the HD 504 under the control of the CPU 501.
The display 506 displays various information such as a cursor, a menu, a window, a character, and an image. The external device I/F 508 is an interface for connecting various external devices. Examples of the external devices in this case include, but are not limited to, a universal serial bus (USB) memory and a printer. The network I/F 509 is an interface for performing data communication via the network N. Examples of the data bus 510 include, but are not limited to, an address bus and a data bus that electrically connect the components such as the CPU 501 with one another.
The keyboard 511 is a kind of input device provided with multiple keys for inputting, for example, characters, numerals, and various instructions. The pointing device 512 is a kind of input device used to, for example, select various instructions, execute various instructions, select a target for processing, and move a cursor. The DVD-RW drive 514 reads and writes various data from and to a DVD-RW 513 serving as a removable recording medium according to the present embodiment. The removable recording medium is not limited to the DVD-RW 513 and may be, for example, a digital versatile disc recordable (DVD-R). The media I/F 516 controls the reading or writing (storing) of data from or to a recording medium 515 such as a flash memory.
Smartphone
The information processing terminal 22 is implemented by, for example, a smartphone 600 having a hardware configuration illustrated in
The CPU 601 controls the entire operation of the smartphone 600. The ROM 602 stores, for example, a program used by the CPU 601 and a program such as an IPL for driving the CPU 601. The RAM 603 is used as a work area for the CPU 601. The EEPROM 604 reads or writes various data such as a control program for the smartphone under the control of the CPU 601.
The CMOS sensor 605 is a kind of built-in image-capturing device to capture an image of an object (for example, a self-image of a user operating the smartphone 600) under the control of the CPU 601, to obtain image data. The CMOS sensor 605 may be an image-capturing device such as a charge-coupled device (CCD) sensor. The imaging element I/F 606 is a circuit that controls the driving of the CMOS sensor 605. Examples of the acceleration and orientation sensor 607 include various types of sensors such as an electromagnetic compass for detecting geomagnetism, a gyrocompass, and an accelerometer.
The media I/F 609 controls the reading or writing (storing) of data from or to a recording medium 608 such as a flash memory. The GPS receiver 611 receives a GPS signal from a GPS satellite.
The smartphone 600 further includes a long-range communication circuit 612, a CMOS sensor 613, an imaging element I/F 614, a microphone 615, a speaker 616, a sound input/output I/F 617, a display 618, an external device I/F 619, a short-range communication circuit 620, an antenna 620a for the short-range communication circuit 620, and a touch panel 621.
The long-range communication circuit 612 is a circuit for communicating with other devices through the network N. The CMOS sensor 613 is a kind of built-in image-capturing device to capture an image of a subject under the control of the CPU 601, to obtain image data. The imaging element I/F 614 is a circuit that controls the driving of the CMOS sensor 613. The microphone 615 is a built-in circuit that converts sound including voice into an electrical signal. The speaker 616 is a built-in circuit that generates sound such as ambient sound, music, or voice by converting an electrical signal into physical vibration.
The sound input/output I/F 617 is a circuit that processes the input and output of sound signals between the microphone 615 and the speaker 616 under the control of the CPU 601. The display 618 is a kind of display device to display, for example, an image of a subject and various icons. Examples of the display 618 include a liquid crystal display and an organic EL display.
The external device I/F 619 is an interface for connecting various external devices. The short-range communication circuit 620 is a communication circuit that is compliant with Near Field Communication (NFC) or BLUETOOTH, for example. The touch panel 621 is a kind of input device to enable a user to operate the smartphone 600 by touching or pressing a screen of the display 618.
The smartphone 600 further includes a bus line 610. The bus line 610 is, for example, an address bus or a data bus for electrically connecting the components such as the CPU 601 illustrated in
Activity Sensor
The microcomputer 701 has functions of components, such as a CPU, a ROM, a RAM, and an EEPROM, of a general-purpose computer. The microphone 702 is a built-in circuit that converts sound including voice into an electrical signal. The UWB module 703 periodically transmits radio waves having short and sharp rectangular waveforms (pulses). Although the communication range thereof is as short as about 10 meters, the UWB module 703 enables high-speed communication with reduced power consumption. By receiving the radio waves of the rectangular waveform (pulse), the sensor 14 detects the direction and the distance of the activity sensor 9 with high accuracy, with an error of several centimeters. The vital sensor 704 detects vital data of the user. The vital data is an index, such as a heartbeat, a pulse, or blood pressure, which indicates that a person is alive. The acceleration and angular velocity sensor 705 is also referred to as an inertial sensor, and detects accelerations in three axial directions generated in the user and angular velocities of rotation about the three axes. The acceleration and angular velocity sensors 705 are disposed one on each of the front side and the rear side of the user's neck. The microcomputer 701 determines the direction of the user, for example, by integrating the angular velocity. Further, the microcomputer 701 determines the behavior of the user based on the acceleration or the angular velocity detected by the two acceleration and angular velocity sensors 705. The communication I/F 706 connects to a network such as a wireless LAN by, for example, BLUETOOTH, to transmit the activity information detected by the activity sensor 9 to the sensor 14 or the information processing system 10.
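As a non-limiting illustration of the direction estimation mentioned above, the following sketch integrates yaw angular velocity samples into a heading. The sample period, the axis convention, and the initial heading are assumptions for illustration and are not taken from the embodiment.

```python
# Sketch only: estimating the user's horizontal direction by integrating yaw
# angular velocity samples (assumed 100 Hz sample rate and degree units).

DT = 0.01  # assumed sample period in seconds

def integrate_heading(yaw_rates_deg_per_s, initial_heading_deg=0.0):
    """Integrate yaw angular velocity into a heading in the range [0, 360)."""
    heading = initial_heading_deg
    for rate in yaw_rates_deg_per_s:
        heading = (heading + rate * DT) % 360.0
    return heading

# Example: the user turns at 90 degrees per second for one second starting from
# north (0 degrees); the result is approximately 90 degrees.
print(integrate_heading([90.0] * 100))
```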
Functional Configuration
The environmental control system 100 according to the present embodiment is implemented, for example, by a functional configuration illustrated in
Activity Sensor
The activity sensor 9 includes a signal transmission unit 61, a direction transmission unit 62, a behavior information transmission unit 63, a sound data transmission unit 64, and a vital data transmission unit 65. These functional units of the activity sensor 9 implement functions or means as the microcomputer 701 (illustrated in
The signal transmission unit 61 periodically transmits a signal to the sensor 14 by radio waves called UWB. The signal includes a user identifier (ID), so that the sensor 14 can detect the position of the user with the user identified. The position information of the user is represented by two-dimensional coordinates (x, y) based on a predetermined position in the meeting room. The position information of the user is specified by the coordinates with respect to a reference point (origin). For example, the reference point is a point of contact between the end of a wall and the end of another wall of the meeting room or the center of the meeting room.
The direction transmission unit 62 transmits the direction of the user detected by the microcomputer 701 to the sensor 14. The direction is specified, for example, in a range of 0 to 360 degrees in the horizontal direction with respect to the north direction and in a range of 0 to 180 degrees in the elevation angle direction with respect to the zenith direction. The direction may be calculated from the angular velocity by the information processing system 10 instead of being detected by the microcomputer 701.
The behavior information transmission unit 63 transmits the behavior information of the user detected by the microcomputer 701 to the sensor 14. Examples of the behavior include a nod, a head tilt, a face-up posture, an inclined posture, the direction of a face, and the stationary degree of a face. It is assumed that the correspondence between the behavior information and the acceleration and the angular velocity is obtained in advance by machine learning, for example. In other words, the behavior information transmission unit 63 outputs the behavior information corresponding to the acceleration and the angular velocity included in the activity information. The behavior information may be calculated by the information processing system 10 from acceleration and the angular velocity instead of being detected by the microcomputer 701.
Machine learning is a technique that enables a computer to acquire human-like learning ability. Machine learning refers to a technology in which a computer autonomously generates an algorithm to be used for determination, such as data identification, from training data loaded in advance and applies the generated algorithm to new data to make a prediction. The method for machine learning may be any suitable method such as supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, or deep learning, and two or more of these learning methods may be combined. Examples of machine learning techniques include perceptron, deep learning, support-vector machine, logistic regression, naive Bayes, decision tree, and random forests, and the machine learning technique is not limited to the techniques described in the present embodiment.
For example, deep learning is an algorithm in which XYZ is predicted based on input data ABC, and then the weights of the neural network are adjusted by back propagation so as to reduce the error from the teacher data. Gradient boosting using decision trees is an algorithm that causes multiple weak prediction models to independently perform learning using a gradient method, integrates the prediction results of the multiple weak prediction models using, for example, a majority vote or an average, and outputs the integrated prediction result as the prediction result of the whole (a strong prediction model).
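As a non-limiting illustration of the pre-learned correspondence between acceleration/angular velocity and behavior information described above, the following sketch trains a random forest classifier on invented feature vectors and labels. The features, the labels, and the training data are assumptions for illustration; any supervised method could be substituted.

```python
# Sketch only: mapping acceleration/angular-velocity features to behavior labels
# with a supervised classifier. All data below are invented for illustration.
from sklearn.ensemble import RandomForestClassifier

# Each row: [mean vertical acceleration, peak pitch angular velocity]
X_train = [[0.10, 40.0], [0.20, 55.0], [0.00, 5.0], [0.05, 8.0]]
y_train = ["nod", "nod", "still", "still"]

clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X_train, y_train)

print(clf.predict([[0.15, 50.0]]))  # -> ['nod']
```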
The sound data transmission unit 64 transmits sound data received by the microphone 702 to the sensor 14.
The sound data is converted into digital signals. The information processing system 10 performs speech recognition on the sound data and converts the sound data into speech data (text data).
The vital data transmission unit 65 transmits the vital data acquired by the vital sensor 704 to the sensor 14. Examples of the vital data include the heartbeat, the pulse, the upper and lower blood pressures, the body temperature, the saturated oxygen concentration, the amount of perspiration, and the subjective symptom (consciousness level) of the user. The vital data may include any information detectable from the user.
Although the activity sensor 9 transmits the activity information to the sensor 14 in the description given above, alternatively, the activity sensor 9 may directly transmit the activity information to the information processing system 10.
Input Device and Output Device
The sensor 14 includes an output signal transmission unit 70. The camera 18 includes an output signal transmission unit 80. The microphone 20 includes an output signal transmission unit 90. The information processing terminal 22 includes an output signal transmission unit 91 and an output unit 92. The speaker 16 includes an output unit 93.
The output signal transmission unit 70 of the sensor 14 transmits the activity information received from the activity sensor 9 to the information processing system 10. In other words, the output signal transmission unit 70 transmits the respective position information of the multiple users in the meeting room to the information processing system 10 together with the user ID. Further, the sensor 14 transmits the direction, the behavior information, the speech data converted from the sound data, and the vital data received from the activity sensor 9 to the information processing system 10 as output signals.
The output signal transmission unit 80 of the camera 18 transmits to the information processing system 10 an image-capturing result obtained by capturing an image of the inside of the meeting room as an output signal. The output signal transmission unit 90 of the microphone 20 transmits to the information processing system 10 electrical signals converted from the voices of the multiple users in the meeting room as output signals.
The output signal transmission unit 91 of the information processing terminal 22 transmits to the information processing system 10 an electrical signal converted by the microphone 615 from the voice of the user operating the information processing terminal 22 as an output signal. The output unit 92 of the information processing terminal 22 outputs sound such as an ambient sound in accordance with the sound data received from the information processing system 10. The output unit 93 of the speaker 16 outputs sound such as an ambient sound in accordance with the sound data received from the information processing system 10.
The output signal transmission units 70, 80, 90, and 91 illustrated in
Information Processing System
The information processing system 10 includes a communication unit 31, a user group determination unit 32, a user guide unit 33, a user information acquisition unit 34, a generation unit 35, a first environmental control unit 36, a second environmental control unit 37, a psychological safety estimation unit 38, a minutes creation unit 39, an individuality evaluation unit 44, and a storage unit 50. The storage unit 50 stores activity information 51, psychological safety information 52, and tone and image information 53, which will be described later. The functional units of the information processing system 10 are implemented as the CPU 501 illustrated in
The communication unit 31 of the information processing system 10 receives the activity information detected by the activity sensor 9 from the output signal transmission unit 70 of the sensor 14. The communication unit 31 receives the image-capturing result obtained by capturing the image of the inside of the meeting room as the output signal from the output signal transmission unit 80 of the camera 18. The communication unit 31 receives the electrical signals converted from the voices of the multiple users in the meeting room as the output signals from the output signal transmission unit 90 of the microphone 20. The communication unit 31 receives the electrical signal converted by the microphone 615 from the voice of the user operating the information processing terminal 22 as the output signal from the output signal transmission unit 91 of the information processing terminal 22. The communication unit 31 further receives an operation signal received by the information processing terminal 22 according to an operation performed by the user.
The user group determination unit 32 determines a group of two or more users on the basis of the position information and the directions of the users. The user group determination unit 32 determines two or more users facing each other or talking to each other, and determines that these users belong to a group.
The user guide unit 33 determines which of the image display devices 12 in the meeting room the user faces based on the position information and the direction of the user, and generates a work area of the user on the image display device 12. In the case where the image display devices 12 are projectors, the user guide unit 33 determines which of the image display devices 12 is projecting the screen that the user faces, and causes the determined image display device 12 to display a work area of the user. In other words, a user-dedicated work area is automatically prepared in a portion out of the entire screen of the image display device 12 in accordance with the position and the direction of the user.
The user information acquisition unit 34 acquires the activity information of the user received by the communication unit 31 from the sensor 14 and stores the activity information in time series for each user.
The generation unit 35 generates the sound data and the image data based on the behavior information of the multiple users in the meeting room as described later. The sound data and the image data generated by the generation unit 35 include data read from the storage unit 50.
The first environmental control unit 36 controls the output unit 92 of the information processing terminal 22 or the output unit 93 of the speaker 16 to output the tone (ambient sound) based on the sound data generated by the generation unit 35.
The second environmental control unit 37 controls the output unit 92 of the information processing terminal 22 or the image display device 12 to output the image or video based on the image data generated by the generation unit 35.
The psychological safety estimation unit 38 estimates or determines a psychological safety level based on the face-to-face time of the multiple users determined as a group. The psychological safety level is an example of the state of users, and the psychological safety estimation unit 38 serves as a user state determination unit. Alternatively, the psychological safety estimation unit 38 may estimate a psychological safety level according to the vital data of the multiple users determined as a group. Both the face-to-face time and the vital data may be used. Yet alternatively, the psychological safety estimation unit 38 may estimate a psychological safety level according to the speech data (e.g., voice volume) of the multiple users determined as a group. All the face-to-face time, the vital data, and the speech data may be used.
The minutes creation unit 39 creates minutes using the speech data of the users. The speech data (utterance) is constantly recorded as the activity information. The minutes are speech data recorded when two or more users are grouped.
The individuality evaluation unit 44 evaluates the diversity of individualities of users in a team to which the users belong. The team may be an organization team, such as a development team or a sales planning team, to which users are assigned, or a group into which users are grouped by some criterion.
The storage unit 50 stores the activity information 51, the psychological safety information 52, and the tone and image information 53, for example, in a table format as illustrated in
For example, the individual relaxation level during an activity of a team is one index representing the level of psychological safety of the team. It is known in the art that the relaxation level can be estimated by measuring the balance of the autonomic nervous system from the fluctuations in heartbeat. The information processing system 10 measures the individual activity information by the activity sensors 9 and presents a report on the relaxation levels to the team. Further, the information processing system 10 outputs in real time, for example, a calm video or background music (BGM), or conversely an exciting video or BGM in accordance with the relaxation level.
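The following is a minimal sketch of one common way to estimate a relaxation-related index from heartbeat fluctuations, namely RMSSD computed from RR intervals; the embodiment does not specify a particular index, and the sample data and the threshold used to switch between a calm and an exciting output are assumptions.

```python
# Sketch only: RMSSD (root mean square of successive RR-interval differences)
# as a relaxation-related heart-rate-variability index. Data are invented.
import math

def rmssd(rr_intervals_ms):
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

rr = [812, 830, 845, 820, 805, 840]  # invented RR intervals in milliseconds
score = rmssd(rr)
# Assumed rule: when the relaxation index is low, output a calm video or BGM;
# otherwise output an exciting video or BGM.
print("calm video/BGM" if score < 30 else "exciting video/BGM")
```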
The item of time is the Japan Standard Time on the day. The activity information may be recorded by day.
In the item of position information, the position (coordinates) of the user detected by communication between the UWB module 703 and the sensor 14 is stored.
In the item of direction, the direction in which the user faces is stored. Although the direction in
In the item of behavior, the behavior information representing the behavior of the user is stored.
In the item of speech, text data recognized from the sound data representing an utterance by the user is stored as speech data.
The position information and the direction are detected successively (for example, 10 times or more per second). By contrast, the behavior information and the speech data are event data detected only when a behavior and an utterance are detected, respectively. For this reason, in
The item of face-to-face represents the time during which the user group determination unit 32 determines that the users are facing each other.
The psychological safety information 52 represents the level of psychological safety corresponding to face-to-face time. In other words, the level of psychological safety is estimated based on the knowledge that the longer the face-to-face time is, the higher the level of psychological safety is.
Although psychological safety is divided into three levels in
The item of psychological safety is the same as or similar to that in
The tone represents music associated with the level of psychological safety. For example, when the level of psychological safety is low, music for increasing the level of psychological safety is used, and when the level of psychological safety is high, music for obtaining more creative output is used. The tone is not limited to that stored in the tone and image information 53, but the information processing system 10 may download suitable music from the Internet. Further, the beats per minute (BPM) of, for example, music stored in the tone and image information 53 or acquired from the outside may be changed, the number of musical instruments may be increased, or the tone may be effectively changed.
The image represents an image or video associated with the level of psychological safety. For example, when the level of psychological safety is low, an image for increasing the level of psychological safety is displayed, and when the level of psychological safety is high, an image for obtaining a more creative output is displayed. The image is not limited to the images stored in the tone and image information 53, and the information processing system 10 may download images from the Internet. The image may be a still image or a moving image (video). Further, the image may be an illumination pattern in which illumination is brightened, darkened, or blinked; or a color scheme is rhythmically changed.
In
In
Determination of Group
The direction of the user will be described with reference to
A description is given below of the grouping of users with reference to
The user group determination unit 32 can determine whether two or more users facing each other or two or more users having a face-to-face conversation correspond to close contact with an infected person, in a manner similar to the manner of the determination of a group. The term “close contact” refers to, for example, a person who touches an infected person with his/her hand without taking necessary measures to prevent infection, or a person who stays for 15 minutes or longer within such a distance (about 1 m) that his/her hand would touch a hand of the infected person if their hands were extended.
The center of the sector 211 is a direction 213, and the center of the sector 212 is a direction 214. The angle that defines the spread of the sectors 211 and 212 is determined in advance. The angle defining the spread is set to such an angle that the users are presumed to be facing each other. A description is given of determining whether the users face each other according to the present embodiment, assuming that the angle defining the spread is 30 degrees in the forward and backward directions with respect to the directions 213 and 214. When the direction 213 is at 90 degrees, the direction opposite thereto (different by 180 degrees) is at 270 degrees by adding 90 degrees and 180 degrees. Accordingly, when the direction 214 is within a range of 240 (=270-30) degrees to 300 (=270+30) degrees, the user group determination unit 32 determines that the user A faces the user B.
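The facing check described above can be expressed, for example, as follows. The 30-degree spread follows the example in the text, while the wraparound arithmetic at 360 degrees is an implementation assumption.

```python
# Sketch only: user A is determined to face user B when B's direction falls
# within +/-SPREAD degrees of the direction opposite to A's direction.
SPREAD = 30.0  # degrees, per the example above

def faces(direction_a_deg, direction_b_deg, spread=SPREAD):
    opposite = (direction_a_deg + 180.0) % 360.0
    diff = abs((direction_b_deg - opposite + 180.0) % 360.0 - 180.0)
    return diff <= spread

# Example from the text: when direction 213 is 90 degrees, user A faces user B
# if direction 214 is within 240 to 300 degrees.
print(faces(90.0, 270.0))  # True
print(faces(90.0, 300.0))  # True (boundary)
print(faces(90.0, 310.0))  # False
```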
Although the case of three users is described in
It is not necessary to determine that the sectors of the three users overlap each other. For example, in the case illustrated in
Further, as illustrated in
Output of Tone or Image corresponding to Psychological Safety Level
A supplemental description is given of psychological safety levels. Psychological safety refers to an environment having a gentle atmosphere in which members are encouraged to act naturally without being afraid of the reactions of others or feeling embarrassed. The psychological safety level is the degree of such safety. The gentle atmosphere refers to an atmosphere in which members can naturally convey their thoughts and feelings and creative output is expected.
In the present embodiment, the psychological safety estimation unit 38 estimates the psychological safety level based on, for example, the time during which the users face each other, the vital data such as heartbeat, or the voice volume. As a result, the information processing system 10 can output a tone or an image corresponding to the psychological safety level, to promote creative output.
The user information acquisition unit 34 acquires the activity information of the user repeatedly transmitted from the sensor 14 via the communication unit 31, and stores the activity information in time series on an individual user basis. The same applies to the flowchart of
Subsequently, the user group determination unit 32 extracts a group of users whose directions are directed toward each other among the users within the threshold distance from each other (S2).
The user group determination unit 32 starts measuring the face-to-face time of the group of users extracted in step S2 (S3).
The psychological safety estimation unit 38 acquires the psychological safety level corresponding to the time during which the users face each other, the vital data, or the speech data from the psychological safety information 52, and determines whether the estimated psychological safety level has changed (S4).
When the determination in step S4 is Yes, the first environmental control unit 36 outputs a tone corresponding to the psychological safety level, and the second environmental control unit 37 outputs an image corresponding to the psychological safety level (S5). When the determination in step S4 is No, the process proceeds to step S6.
The user group determination unit 32 determines whether the face-to-face time ends based on the positions and the directions of the users (S6).
In a case where the determination in step S6 is Yes, the user group determination unit 32 starts subtracting the face-to-face time (S7). Gradually reducing the face-to-face time after the users stop facing each other is advantageous in that, when the users face each other again, the user group determination unit 32 can resume the measurement from the face-to-face time remaining after the subtraction. When the determination in step S6 is No, the process proceeds to step S4 while the measurement of the face-to-face time is continued.
In this way, the information processing system 10 can output a tone or an image corresponding to the psychological safety level, and can promote creative output from the users facing each other.
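A simplified sketch of steps S1 to S7 is given below under assumed data structures. The face-to-face time thresholds, the one-second sampling, and the subtraction rate are assumptions for illustration; the embodiment does not fix these values.

```python
# Sketch only: accumulate face-to-face time, look up a psychological safety
# level, and change the tone/image output only when the level changes.
LEVELS = [(0, "low"), (300, "middle"), (900, "high")]  # assumed thresholds in seconds

def level_for(face_to_face_s):
    level = LEVELS[0][1]
    for threshold, name in LEVELS:
        if face_to_face_s >= threshold:
            level = name
    return level

face_to_face_s = 0.0
previous_level = None
for facing_now in [True] * 400 + [False] * 100:          # one sample per second (assumed)
    if facing_now:
        face_to_face_s += 1.0                             # S3: measure face-to-face time
    else:
        face_to_face_s = max(0.0, face_to_face_s - 1.0)   # S7: subtract after facing ends
    level = level_for(face_to_face_s)                     # S4: estimate the level
    if level != previous_level:
        print(f"output tone and image for level '{level}'")  # S5
        previous_level = level
```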
In
As illustrated in
In
Further, multiple pairs of users facing each other may be in one meeting room. For example, the users A and B face each other, and the users C and D face each other. In this case, the psychological safety estimation unit 38 may calculate, as the psychological safety level, the average of the psychological safety level of the users A and B and the psychological safety level of the users C and D. Alternatively, the first environmental control unit 36 may use acoustic technologies to output tones individually corresponding to the psychological safety levels of the user pairs. In other words, the first environmental control unit 36 outputs different tones to the pair of users A and B and the pair of users C and D.
Similarly, the second environmental control unit 37 may output an image corresponding to the psychological safety level of the pair of users A and B to a wall (or a part of the wall) close thereto and an image corresponding to the psychological safety level of the pair of users C and D to a wall (or a part of the wall) close thereto.
Scoring Likelihood of Occurrence of Psychological Safety
The likelihood of occurrence of psychological safety may be scored with respect to the activity in the team of users. It is known in the art that behaviors in communication contribute to, for example, human relationships, psychological safety, and reliability, which are necessary for team creativity (quality for producing creative results). In particular, for example, a human relationship appears in how a listener listens to a speaking person. Nodding and chiming in are typical examples of the manner of active listening. Based on such knowledge, the psychological safety estimation unit 38 detects the directions and nods (or reactions) of the users from the activity information, to measure the level of active listening of listeners around a certain speaker. The psychological safety estimation unit 38 collects and analyzes the active listening information on a long-term basis, periodically returns a report of the collected information to the team, and provides the team with materials for team building. The term “team building” used in this disclosure refers to organizing teams by considering to which team the user is to be assigned. Using the active listening information, for example, when the level of active listening to a certain utterance is low, the environmental control system 100 can perform the environmental control to increase the concentration of surrounding listeners. For example, the first environmental control unit 36 reduces the volume of BGM, and the second environmental control unit 37 increases the illuminance of the image or the lighting.
Note that whether certain users belong to the same team may be determined using team information that is registered in advance, for example, in human resources information and that indicates the members belonging to the team. Alternatively, users who have been grouped a threshold number of times or more in the past may be regarded as a team.
The psychological safety estimation unit 38 calculates the average value and the standard deviation of all the percentages in the team. It is preferable for the team to have a high average value. However, when the standard deviation is large, the level of active listening differs among the team members, and it is presumed that psychological safety is unlikely to be established even if the average value is high. The psychological safety estimation unit 38 scores the standard deviation of the percentages in
The psychological safety estimation unit 38 measures the time of a conversation between the speaker and the listener grouped (S401). Alternatively, simply the time during which multiple users are grouped may be measured regardless of whether the users are talking.
The psychological safety estimation unit 38 measures the time during which the listener faces the speaker (S402).
The psychological safety estimation unit 38 calculates the percentage of the time during which the listener faces the speaker (S403).
The psychological safety estimation unit 38 calculates the average value and the standard deviation of the percentages of all the listeners in the team (S404). The average value and the standard deviation of the percentages of all the listeners in the team may be displayed on the information processing terminal 22 or may be output by voice.
The psychological safety estimation unit 38 scores the likelihood of occurrence of psychological safety of the team in a range of, for example, 0 to 100 (S405). After the scoring, in a case where at least a part of the team members starts a conversation again and the score of the team is equal to or lower than a threshold, the environmental control system 100 can perform the environmental control to increase the concentration of surrounding listeners. For example, the first environmental control unit 36 reduces the volume of BGM, and the second environmental control unit 37 increases the illuminance of the image or the lighting.
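A minimal sketch of steps S401 to S405 follows. The formula that combines the average and the standard deviation into a score of 0 to 100 is an assumption, since the embodiment does not fix a particular formula.

```python
# Sketch only: score the team from the listeners' facing-time percentages.
import statistics

def team_score(facing_s_by_listener, conversation_s):
    percentages = [100.0 * t / conversation_s for t in facing_s_by_listener]  # S403
    mean = statistics.mean(percentages)                                        # S404
    stdev = statistics.pstdev(percentages)
    return max(0.0, min(100.0, mean - stdev))                                  # S405 (assumed formula)

# Invented example: three listeners in a 600-second conversation.
print(team_score([480.0, 540.0, 180.0], 600.0))
```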
The psychological safety estimation unit 38 calculates the average value and the standard deviation of all the frequencies of nodding in the team. It is preferable for the team to have a high average value. However, when the standard deviation is large, the level of active listening differs among the team members, and it is presumed that psychological safety is unlikely to be established even if the average value is high. Since whether a person nods varies greatly from person to person, it is preferable that a certain upper limit be used for frequencies of nodding equal to or greater than a threshold. Such an upper limit can prevent an increase in the standard deviation in a case where psychological safety is likely to be established (i.e., the human relationships are good). The psychological safety estimation unit 38 scores the standard deviation of frequencies of nodding in
The psychological safety estimation unit 38 measures the time of a conversation between the speaker and the listener grouped (S501). Alternatively, simply the time during which multiple users are grouped may be measured regardless of whether the users are talking.
The psychological safety estimation unit 38 measures the number of times of nodding of the listener in the time measured in step S501 (S502).
From the time measured in step S501 and the number of times measured in S502, the psychological safety estimation unit 38 calculates the frequency of nodding of the listener in a unit time (S503).
The psychological safety estimation unit 38 calculates the average value and the standard deviation of the frequencies of all the listeners in the team (S504). The average value and the standard deviation of the frequencies of all the listeners in the team may be displayed on the information processing terminal 22 or may be output by voice.
The psychological safety estimation unit 38 scores the likelihood of occurrence of psychological safety of the team in a range of, for example, 0 to 100 (S505). After the scoring, in a case where at least a part of the team members starts a conversation again and the score of the team is equal to or lower than a threshold, the environmental control system 100 can perform the environmental control to increase the concentration of surrounding listeners. For example, the first environmental control unit 36 reduces the volume of BGM, and the second environmental control unit 37 increases the illuminance of the image or the lighting.
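Similarly, steps S501 to S505, including the upper limit applied to high nodding frequencies, may be sketched as follows. The clamp value, the unit time, and the scoring formula are assumptions for illustration.

```python
# Sketch only: score the team from nodding frequencies with an assumed upper limit.
import statistics

NOD_CAP = 6.0  # assumed upper limit in nods per minute

def nod_score(nod_counts, conversation_s):
    per_minute = [min(NOD_CAP, 60.0 * n / conversation_s) for n in nod_counts]  # S503 with clamp
    mean = statistics.mean(per_minute)                                           # S504
    stdev = statistics.pstdev(per_minute)
    return max(0.0, min(100.0, 100.0 * mean / NOD_CAP - 10.0 * stdev))           # S505 (assumed formula)

# Invented example: nod counts of three listeners during a 300-second conversation.
print(nod_score([20, 25, 5], 300.0))
```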
The psychological safety estimation unit 38 may weight the average value and the standard deviation of the percentages in
Creation of Minutes in which Conversation Partner is Identified
A description is given below of the creation of minutes of conversations among grouped users, with reference to, for example,
In
When three or more users are grouped, the minutes creation unit 39 may determine that the conversation is performed between the users whose directions are directed most toward each other. This is because even in a group of three or more users, a speaker (speech source) often directs his/her body toward listeners (speech destinations).
When three or more users are grouped, the minutes creation unit 39 preferably records group identification information identifying the groups 225 and 226 to which the grouped users belong, respectively. Such a manner of recording facilitates extracting only the speech data of the users who uttered words in the same group from the minutes when the minutes are referred to later.
Further, the minutes creation unit 39 preferably records current psychological safety levels 227 and 228 in speech data, respectively. Such a manner of recording facilitates extracting only the speech data having psychological safety levels equal to or higher (or equal to or lower) than a threshold from the minutes when the minutes are referred to later. How much the psychological safety level affects the utterance can be analyzed, for example, by the manager of the organization.
In the process of
The minutes creation unit 39 determines that one of the users facing each other is the speaker and the other is the listener based on the speech data (S11). In other words, of the two users facing each other, the user who utters words is the speaker (speech source), and the other is the listener (speech destination).
Then, the minutes creation unit 39 records, in the minutes 220, the speech data representing the utterance from the speaker to the listener (S12). The minutes creation unit 39 records the current psychological safety level in the minutes 220 in association with the speech data.
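A minimal sketch of the recording performed in steps S11 and S12 is given below under assumed data structures; the field names mirror the description above (speaker, listener, group identification information, psychological safety level) but are otherwise illustrative.

```python
# Sketch only: a minutes entry that records who spoke to whom, together with
# the group ID and the current psychological safety level.
from dataclasses import dataclass, field
from typing import List

@dataclass
class MinutesEntry:
    speaker: str
    listener: str
    text: str
    group_id: str
    psychological_safety: str

@dataclass
class Minutes:
    entries: List[MinutesEntry] = field(default_factory=list)

    def record(self, speaker, listener, text, group_id, safety_level):
        self.entries.append(MinutesEntry(speaker, listener, text, group_id, safety_level))

minutes = Minutes()
minutes.record("A", "B", "Let's try the new layout.", "group-225", "high")  # S12
print(minutes.entries[0])
```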
Guidance Based on User Position and Direction
A description is given below of guidance to an appropriate image display device 12 based on the position and the direction of the user. When there are multiple image display devices 12 in the meeting room, the user guide unit 33 determines the image display device 12 (or a partial region of the image display device 12) to be used by the user according to which image display device 12 the user faces, as described below with references to
When the user A is within a threshold distance from the image display device 12a and faces the image display device 12a, the user guide unit 33 determines that the image display device 12a is to be used by the user A and guides the user A to the image display device 12a. For example, the user guide unit 33 displays a message 231 “work area of the user A” on the image display device 12a. When the user B is within the threshold distance from the image display device 12c and faces the image display device 12c, the user guide unit 33 determines that the image display device 12c is to be used by the user B and guides the user B to the image display device 12c. For example, the user guide unit 33 displays a message 232 “work area of the user B” on the image display device 12c. In the messages, “A” and “B” are, for example, names, identifiers (IDs), or nicknames of the users. In the information processing system 10, the name, the ID, or the nickname of the user is associated with the user ID set in the activity sensor 9. The information processing system 10 identifies, for example, the name associated with the user ID transmitted from the activity sensor 9.
The messages 231 and 232 are erased in response to a user operation or the elapse of certain time, and a work area dedicated to the user, which will be described later, is displayed. As described above, when the user approaches the image display device 12, the user can use the whole or a part of the image display device 12 as his/her work area.
With reference to
For example, every time the position of the user is detected, the user guide unit 33 determines whether there is the image display device 12 within the threshold distance from the user (S21). The distance between the user and the image display device 12 may be the shortest distance.
If the determination in step S21 is Yes, the user guide unit 33 determines whether the direction of the user is directed to the image display device 12 (S22). When the result of the determination of S21 is No, the process of
If the determination in step S22 is Yes, the second environmental control unit 37 displays a guidance message on the image display device 12 which the user faces (S23). When the result of the determination of S22 is No, the process of
After displaying the message, the user guide unit 33 erases the message in response to a user operation or the elapse of a certain time, and displays a work area dedicated to the user. When the user guide unit 33 determines that the user has confirmed the message based on the face direction of the user, the second environmental control unit 37 may erase the message and display the work area.
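Steps S21 to S23 may be sketched as follows. The threshold distance, the angular tolerance, the display coordinates, and the angle convention (counterclockwise from the x axis rather than from north) are assumptions for illustration.

```python
# Sketch only: find a display within the threshold distance that the user faces.
import math

THRESHOLD_M = 1.5      # assumed threshold distance in meters
ANGLE_TOL_DEG = 45.0   # assumed angular tolerance

def guide(user_pos, user_dir_deg, displays):
    """Return the ID of the display the user should be guided to, or None."""
    for display_id, display_pos in displays.items():
        dx = display_pos[0] - user_pos[0]
        dy = display_pos[1] - user_pos[1]
        if math.hypot(dx, dy) > THRESHOLD_M:           # S21: within the threshold distance?
            continue
        bearing = math.degrees(math.atan2(dy, dx)) % 360.0
        diff = abs((user_dir_deg - bearing + 180.0) % 360.0 - 180.0)
        if diff <= ANGLE_TOL_DEG:                      # S22: does the user face the display?
            return display_id                          # S23: display the guidance message here
    return None

displays = {"12a": (0.0, 2.0), "12c": (4.0, 0.0)}      # invented display positions
print(guide((0.0, 1.0), 90.0, displays))               # -> "12a"
```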
When the user guide unit 33 guides the user, the second environmental control unit 37 controls the image display device 12 to display a work area for the user and displays the user's speech data on the work area. In other words, the image display device 12 displays data input by voice or handwriting (or hand drafting). Handwritten data may be subjected to handwriting recognition in real time. The user can operate a Web browser in the work area to display a Web page or a desktop screen of his/her PC.
The work areas 201 and 241 each have a fixed shape and size in the initial state. The shape does not need to be circular as illustrated in the figure, but may be, for example, rectangular or polygonal. When the amount of speech data input by the user increases, the second environmental control unit 37 automatically widens the work areas 201 and 241. When the speech data further increases, the second environmental control unit 37 may display a scroll bar. The second environmental control unit 37 may change the shape or size of the work areas 201 and 241 according to a user operation. The second environmental control unit 37 may move the work areas 201 and 241 as desired according to dragging or a voice input by the user.
When the user guide unit 33 allocates, as work areas, areas of one image display device 12 to multiple users, the second environmental control unit 37 adjusts the positions, sizes, and shapes thereof so that the multiple work areas do not overlap.
Displaying the work areas 201 and 241 in this manner is applicable to the following scenes.
(i) The users individually concentrate on thinking up their ideas.
In this case, the speech data of the users A and B are sequentially displayed in the work areas 201 and 241.
(ii) One user presents his/her thoughts to another user.
In this case, the user can explain the content of the speech data with the speech data of the users A and B respectively displayed in the work areas 201 and 241.
(iii) Multiple users have a meeting (discussion).
In this case, the speech data of the users are displayed in their respective work areas.
The users can move to the work area of another user or copy and paste the speech data from and to the work area of another user, to gather ideas. Note that the second environmental control unit 37 displays their respective work areas at the positions close to the users so as not to overlap. When multiple users are grouped, the minutes are also recorded.
The second environmental control unit 37 determines whether the user has uttered words (S31). Which user has uttered words is identified by the user ID transmitted by the activity sensor 9. The user's utterance is constantly recorded.
When the user utters words (Yes in S31), the second environmental control unit 37 displays the speech data in his/her work area (S32).
In this way, the second environmental control unit 37 displays the speech data on the image display device 12 near the user who has uttered words. The user can input ideas by voice or handwriting at any time. When the user group determination unit 32 determines that multiple users face each other near the image display device 12, the users can have a discussion using the image display device 12.
The work area can be displayed not only on a wall but also on a floor or a ceiling provided with the image display device 12.
Team Individuality Evaluation Based on Utterance Content
It is known in the art that individuality is reflected in utterance contents. The expression and diversity of individualities are desirable for the creativity of a team such as a development team. A description will be given of a method in which the individuality evaluation unit 44 quantifies variations in the individuality of team members from the utterance contents (keywords).
The individuality evaluation unit 44 determines, for example, favorite fields, fields of interest, and specialty fields of the individual team members based on the keywords extracted from utterance contents. The individuality evaluation unit 44 may change the weight of the keyword in accordance with the volume or speed of utterance of the keyword. For example, the weight is larger as the voice is louder or the speed is higher. The individuality evaluation unit 44 infers the level of expression of individuality and the diversity of individuality of the team, using the variations (standard deviation) of the determined fields among the team members.
A description is given of the evaluation of individuality based on a keyword according to the present embodiment, with reference to
Similarly,
When the graphs of the user A, the user B, and the user C are compared with each other, it can be seen that the percentage of the field of science is greatly different, but the percentage of the field of art is similar. In order to quantify such differences, the individuality evaluation unit 44 calculates, for each field, a standard deviation of the percentages among the speakers. A large standard deviation means that the individuality of the team is diverse.
The individuality evaluation unit 44 extracts the speech data of the individual users in a certain past period from the activity information, and classifies keywords included in the speech data by field (S101).
The individuality evaluation unit 44 calculates, for each user, the number of keywords by field, and calculates the percentage of the number of keywords by field relative to the total number of keywords (S102). At this time, the individuality evaluation unit 44 may increase the number of keywords in accordance with the volume or speed of utterance of the keyword, and increase the total number by the same number.
The individuality evaluation unit 44 calculates, for each field, a standard deviation among members belonging to the same team, and calculates the total or average of the standard deviations of the fields as a variation score (S103). The individuality evaluation unit 44 may display the score of the team when a team member approaches the image display device 12, or may output the score to team members or the human resources department by, for example, email.
In this way, the individuality evaluation unit 44 determines whether the individualities of members belonging to the same team are diversified.
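The following Python sketch illustrates one possible implementation of steps S101 to S103 described above. The keyword-to-field dictionary, the field names, and the use of the population standard deviation are illustrative assumptions.

```python
# A minimal sketch of the variation score in steps S101 to S103, assuming a
# hypothetical keyword-to-field dictionary; the field names are illustrative.
from collections import Counter
from statistics import pstdev

FIELD_OF = {"tensor": "science", "fresco": "art", "sonata": "music"}  # hypothetical

def field_percentages(keywords):
    counts = Counter(FIELD_OF.get(k, "other") for k in keywords)
    total = sum(counts.values()) or 1
    return {f: counts[f] / total * 100 for f in counts}

def variation_score(team_keywords):
    # S101/S102: percentage of keywords by field for each team member.
    per_user = {u: field_percentages(kws) for u, kws in team_keywords.items()}
    fields = {f for p in per_user.values() for f in p}
    # S103: standard deviation of each field across members, averaged.
    deviations = [pstdev([p.get(f, 0.0) for p in per_user.values()]) for f in fields]
    return sum(deviations) / len(deviations)

team = {"A": ["tensor", "tensor", "fresco"], "B": ["fresco", "sonata"], "C": ["tensor"]}
print(round(variation_score(team), 2))  # larger value: more diverse individualities
```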
The environmental control system 100 of the present embodiment can control the environment of the meeting room by using the position information and the direction information of the user or the position information, the direction information, and the speech data of the user.
In the present embodiment, environmental control using user behavior information will be described.
Functions
The information processing system 10 according to the present embodiment includes a user state determination unit 41, a cursor position determination unit 42, a convenience presentation unit 45, and a communicative level presentation unit 46. The user state determination unit 41 determines, for example, the understanding level and concentration level of the user in a conversation in consideration of the behavior information.
The cursor position determination unit 42 determines the cursor position on the image display device 12 based on the direction of the user. The cursor is a small shape or symbol (for example, a mouse pointer) indicating the current input position on the operation screen of a computer.
The convenience presentation unit 45 evaluates the convenience of a place in an office and enables optimization of the layout of the office. The convenience evaluated by the convenience presentation unit 45 is the quantification, for each place in an office, of the usage rate, the concentration levels of users, the psychological safety levels (relaxation levels), and the conversation volumes.
The communicative level presentation unit 46 presents a place where meaningful communications are performed in the office. In a place where certain persons have meaningful communication, it is easy for other users to have meaningful communications. Accordingly, the user can select such a place to have communication.
User State Determination
A description is given of a method of determining a user state using the behavior information with reference to
For example, the user state determination unit 41 counts the number of times of nodding and head tilt of the users facing each other. The user state determination unit 41 subtracts the number of times of head tilt from the number of times of nodding in a certain past time and converts the calculated value into the understanding level 250. For example, the understanding level 250 is determined in proportion to the value obtained by subtracting the number of times of head tilt from the number of times of nodding. The understanding level 250 may be a numerical value such as 0 to 100% or may be, for example, three levels of large, medium, and small.
The user state determination unit 41 may determine the understanding level by using speech data. For example, in response to detecting utterances such as “I see” and “uh-huh,” the user state determination unit 41 determines that the understanding level has enhanced. The user state determination unit 41 may determine the understanding level from the speech data by using a model obtained by machine learning of the correspondence between speech data and understanding levels.
Preferably, the first environmental control unit 36 outputs a tone corresponding to the understanding level of the user, and the second environmental control unit 37 outputs an image corresponding to the understanding level of the user. Accordingly, the tone and image information 53 includes tones and images corresponding to the understanding levels of users.
The item of understanding level represents the understanding level determined by the user state determination unit 41.
The item of tone represents tone associated with the understanding level. For example, when the understanding level is low, a tone that increases the concentration is used, and when the understanding level is high, a tone that promotes more creative output is used.
The item of image represents an image or video associated with the understanding level. For example, when the understanding level is low, an image that increases the concentration is used, and when the understanding level is high, an image that promotes more creative output is used.
The second environmental control unit 37 may display the understanding level 250 on the image display device 12 closest to the two users A and B. Alternatively, the second environmental control unit 37 may display the understanding level 250 on the image display device 12 facing the user who has uttered words.
The user state determination unit 41 determines whether the user has uttered words (S41). Which user has uttered words is identified by the user ID transmitted by the activity sensor 9. Further, another user facing the user who has uttered words is identified as the listener.
When the user utters words, the user state determination unit 41 checks the behavior information of the user being the listener with respect to the utterance (S42). It is preferable that the user state determination unit 41 focuses on the behavior information of the listener from the start to the end of the speech data and within a certain time from the end. The user state determination unit 41 records the behavior information together with time.
The user state determination unit 41 subtracts the number of times of head tilt from the number of times of nodding in a certain past time and updates the understanding level based on the calculated value (S43). The user state determination unit 41 may weight the number of times of nodding or head tilt immediately after the utterance. For example, each nodding or head tilt immediately after the utterance is counted as more than one. This is because the nodding or head tilt immediately after the utterance is highly likely to reflect the understanding level.
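The following Python sketch illustrates one way the understanding level might be computed from the counts of nodding and head tilt, with additional weight given to events immediately after the utterance. The time window, the weight, and the scaling to a 0 to 100% value are illustrative assumptions.

```python
# A minimal sketch of the understanding-level update in step S43. The window
# length, weight, and scaling to 0-100% are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class HeadEvent:
    kind: str      # "nod" or "tilt"
    time: float    # seconds

def understanding_level(events, now, utterance_end,
                        window=60.0, prompt_weight=2.0):
    score = 0.0
    for e in events:
        if now - e.time > window:
            continue  # only the most recent period counts
        # Events right after the utterance are weighted more heavily.
        w = prompt_weight if 0.0 <= e.time - utterance_end <= 3.0 else 1.0
        score += w if e.kind == "nod" else -w
    # Map the nod-minus-tilt score onto a 0-100% scale.
    return max(0.0, min(100.0, 50.0 + 10.0 * score))

events = [HeadEvent("nod", 100.5), HeadEvent("nod", 101.0), HeadEvent("tilt", 110.0)]
print(understanding_level(events, now=115.0, utterance_end=100.0))
```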
The second environmental control unit 37 displays the understanding level of the user on the image display device 12, the first environmental control unit 36 outputs a tone corresponding to the understanding level, and the second environmental control unit 37 outputs an image corresponding to the understanding level (S44). Further, the second environmental control unit 37 may display the history of the understanding level of each user.
Displaying the understanding level of the user in this manner helps the speaker speak so as to be easily understood by the listener. Further, the information processing system 10 outputs a tone or an image to increase the understanding level.
With reference to
In a case where the position and the direction hardly change and the user A is in a predetermined posture, the user state determination unit 41 determines that the user A is concentrating and determines the concentration level. Examples of the predetermined posture include sitting positions such as full-lotus sitting, cross-legged sitting, and seiza. Seiza is kneeling with the legs folded underneath the thighs and the buttocks resting on the heels, with the ankles turned outward. For example, when the user state determination unit 41 determines, based on the behavior information, that the user does not move but his/her head is inclined down, the user state determination unit 41 determines that the user is drowsy and the concentration level is low.
The user state determination unit 41 determines a user state other than the understanding level and the concentration level. For example, the user state determination unit 41 determines the strength of the impact received by the user based on a face-up posture. The user state determination unit 41 determines a state of anxiety or depression based on a face-down posture. The first environmental control unit 36 and the second environmental control unit 37 perform appropriate environmental control according to the state of the user.
The user state determination unit 41 determines whether the position and the direction of the user remain unchanged (S51). In a case where the user is standing, the user may move slightly even if the user concentrates on work. Accordingly, the user state determination unit 41 may ignore changes within a certain degree of the position or the direction.
When the determination in step S51 is Yes, the user state determination unit 41 determines whether the user is in a predetermined posture on the basis of the behavior information of the user (S52). For example, when the behavior information indicates that the user is sleeping or doing exercises, the user state determination unit 41 determines that the user does not concentrate on work. When the determination of S51 is No, the process of
In a case where the determination in step S52 is Yes, the user state determination unit 41 updates the concentration level in accordance with the elapsed time from when the user took the predetermined posture without changing the position and the direction (S53). When the determination of S52 is No, the process of
The first environmental control unit 36 outputs a tone corresponding to the concentration level, and the second environmental control unit 37 outputs an image corresponding to the concentration level (S54). Further, the second environmental control unit 37 may display the history of the concentration level of each user.
The tone or image corresponding to the concentration level is a tone or image for increasing (and maintaining) the concentration level. Alternatively, the tone or image corresponding to the concentration level is a tone or image for reporting a decrease in the concentration level when the concentration level decreases. The second environmental control unit 37 may display the image on the image display device 12 closest to the user A or the image display device 12 positioned in the face direction of the user A. Alternatively, the second environmental control unit 37 may display the image corresponding to the concentration level on all the image display devices 12.
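The following Python sketch illustrates one possible way to update the concentration level in steps S51 to S53. The treatment of a head inclined down as drowsiness and the linear growth of the concentration level over time, saturating after 30 minutes, are illustrative assumptions.

```python
# A minimal sketch of steps S51 to S53: the thresholds and the linear growth
# of the concentration level over time are illustrative assumptions.
def concentration_level(elapsed_still_s, in_predetermined_posture, head_inclined_down):
    if not in_predetermined_posture:
        return 0.0                     # S52 No: not concentrating on work
    if head_inclined_down:
        return 10.0                    # treated as drowsy, low concentration
    # S53: concentration grows with the time spent still in the posture,
    # saturating at 100% after 30 minutes (assumed saturation time).
    return min(100.0, elapsed_still_s / (30 * 60) * 100.0)

print(concentration_level(elapsed_still_s=900, in_predetermined_posture=True,
                          head_inclined_down=False))  # 50.0 after 15 minutes
```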
In this way, the information processing system 10 outputs a tone and an image to increase and maintain the concentration level.
Display of Cursor
A description is given below of the display of a cursor using the behavior information, with reference to
The second environmental control unit 37 preferably places a cursor 262 on, for example, a button 261 included in the image in the direction of the user's face. In other words, even if the direction of the user's face is slightly deviated from the position of the button 261, the second environmental control unit 37 can enhance the operability by forcibly positioning the cursor 262 on the button 261. When the user utters words such as “press” or “on,” under such conditions, the information processing system 10 detects the pressing of the button 261 from the speech data. Accordingly, the information processing system 10 allows the user to manipulate the image based on the direction of the user's face and voice.
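The following Python sketch illustrates one possible way to snap the cursor 262 to the button 261 based on the direction of the user's face. The two-dimensional gaze-ray intersection with the display plane and the snap radius are illustrative assumptions.

```python
# A minimal sketch of snapping the cursor to the nearest button when the
# user's face direction points close to it; the geometry is an assumption.
import math

def face_point_on_display(user_xy, face_angle_rad, display_x):
    # Intersect a 2-D gaze ray with a vertical display plane at x = display_x.
    ux, uy = user_xy
    dx = display_x - ux
    if dx <= 0 or abs(math.cos(face_angle_rad)) < 1e-6:
        return None  # the user does not face the display
    t = dx / math.cos(face_angle_rad)
    if t < 0:
        return None
    return uy + t * math.sin(face_angle_rad)

def snap_cursor(point_y, buttons, snap_radius=0.3):
    # Place the cursor on the closest button within the snap radius,
    # otherwise leave it at the raw gaze position.
    if point_y is None:
        return None
    nearest = min(buttons, key=lambda b: abs(b["y"] - point_y))
    return nearest["y"] if abs(nearest["y"] - point_y) <= snap_radius else point_y

buttons = [{"id": "on", "y": 1.2}, {"id": "off", "y": 2.0}]
y = face_point_on_display((0.0, 1.0), math.radians(5), display_x=3.0)
print(snap_cursor(y, buttons))  # snaps to the "on" button at y = 1.2
```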
In
The users B and C are guided to the image display device 12b. Accordingly, the second environmental control unit 37 displays work areas 263 and 264 of the users B and C on the image display device 12b. In such a case, the user C can add speech data to the work area of the user B based on his/her face direction. Specifically, the information processing system 10 adds speech data, obtained by speech recognition on signals acquired via the communication unit 31 from, for example, the microphone 20, to the work area determined from the face direction included in the behavior information. The cursor position determination unit 42 detects the work area 263 in the face direction of the user C and adds speech data 267 uttered by the user C to speech data 266 in the work area 263. The speech data 267 added to the work area 263 may be highlighted with, for example, a different color.
In
Needless to say, the user B can also add speech data to the work area 264 of the user C. Since the information processing system 10 allows the addition of the speech data only among the grouped users, an unrelated user is prevented from adding speech data.
The cursor position determination unit 42 determines whether or not the user faces the image display device 12 (S61).
When the determination in step S61 is Yes, the cursor position determination unit 42 determines whether or not there is a button in the direction of the user's face (S62). When the determination of S61 is No, the process of
If the determination in step S62 is Yes, the cursor position determination unit 42 determines to align the cursor position with the button (S63). The second environmental control unit 37 displays a cursor superimposed on the button.
Subsequently, the cursor position determination unit 42 determines whether or not the user has spoken while facing the work area (S64). This work area may be the user's own work area or another user's work area.
When the determination in step S64 is Yes, the second environmental control unit 37 additionally displays the speech data in the user's own work area or another user's work area based on the direction in which the user has uttered words (S65).
In this way, the information processing system 10 determines the face direction of the user relative to the image display device 12 and controls the environment such as the image.
Office Optimization
The tendencies of the levels of concentration and relaxation of users and the conversation volume may depend on the location in the office. In other words, it is possible that, in a certain place, the usage rate is high and the levels of concentration and relaxation of users and the conversation volume are high, while in another place, the usage rate is low and the levels of concentration and relaxation of users and the conversation volume are low. It can be said that the convenience of the former place is high but the convenience of the latter place is low. However, for the office as a whole, it is preferable that the convenience is high at every place. A description is given of a method of summarizing the degrees of convenience of places determined based on these indices and displaying the degree of convenience in association with the place. This allows the manager of the organization or the administrator of the office to optimize the layout of the office to increase the convenience.
In
The convenience for each place in
The convenience presentation unit 45 calculates the usage rate of each place; and the concentration level, the relaxation level, and the conversation volume in each place (S201). These indices are calculated based on the activity information of, for example, a most recent period (for example, one month). Examples of the place include meeting rooms, seats, and tables. The convenience presentation unit 45 acquires the position information of users in these places from the activity information, and calculates the usage rate, the concentration level, the relaxation level, and the conversation volume.
The usage rate is obtained by averaging, over a certain period, the percentage of time per day during which the place is used. The concentration level is obtained by calculating the percentage of the time during which the user is determined to be concentrating relative to the time during which the user is in the place, and averaging the calculated percentages over a certain period. The relaxation level is obtained by calculating the percentage of the time during which the user is determined to be relaxed relative to the time during which the user is in the place, and averaging the calculated percentages over a certain period. Whether or not the user is relaxed is determined from the heartbeat detected by the activity sensor 9. The time-series data of heartbeat interval variation is represented by R-R interval (RRI) signals, and attention is paid to a high-frequency component (HF) and a low-frequency component (LF) of the RRI signals. It is known in the art that the HF is an activity index of the parasympathetic nerve and thus increases when the subject is relaxed, and the LF is an activity index of the sympathetic nerve and thus increases when the strain of the subject increases. Accordingly, a method of calculating the relaxation level from the expression HF/(HF+LF) is known in the art. The conversation volume is obtained by calculating, for example, the total time during which users are determined to have conversations in one day at the place, and averaging the total over a certain period.
The convenience presentation unit 45 calculates the convenience of each location (S202). The convenience presentation unit 45 calculates the degree of convenience by, for example, weighting the usage rate, the concentration level, the relaxation level, and the conversation volume.
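The following Python sketch illustrates one possible calculation corresponding to steps S201 and S202, using the relaxation formula HF/(HF+LF) described above. The equal weights and the assumption that each index is normalized to the range 0 to 1 are illustrative.

```python
# A minimal sketch of steps S201 and S202. The relaxation formula follows the
# description above; the weight values and index ranges are assumptions.
def relaxation_level(hf, lf):
    # High-frequency (parasympathetic) share of the RRI spectral power.
    return hf / (hf + lf) if (hf + lf) > 0 else 0.0

def convenience(usage_rate, concentration, relaxation, conversation,
                weights=(0.25, 0.25, 0.25, 0.25)):
    # Weighted sum of the four indices, each normalized to the range 0 to 1.
    w_u, w_c, w_r, w_v = weights
    return (w_u * usage_rate + w_c * concentration +
            w_r * relaxation + w_v * conversation)

relax = relaxation_level(hf=0.6, lf=0.4)        # 0.6
print(round(convenience(0.8, 0.5, relax, 0.7), 3))
```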
The convenience presentation unit 45 transmits, to the information processing terminal 22, screen information for a screen in which places of the office are highlighted with differences in color or shading pattern density according to the degree of convenience (S203). The information processing terminal 22 displays a screen on which the convenience is presented for each office place as illustrated in
As described above, the convenience presentation unit 45 summarizes the degree of convenience based on the activity information, to optimize the office design. This helps the user to select and use a highly convenient place.
Further, using the activity information, the convenience presentation unit 45 can determine a place where meaningful communications are to be performed in the office. The place where meaningful communications are to be performed is a place where communications are easily performed, which is described below with references to
Whether or not meaningful communications are performed may be determined from, for example, the number of nods, the voice volume of speech data, and heartbeat. Not all of these indices are necessarily used. These indices may be replaced with equivalent indices. For example, the number of nods may be substituted with positive speech, and the heartbeat may be substituted with pulse.
The number of nods is obtained by averaging, for example, the frequency (the number of times per unit time) of nods in communication at the place over a certain period. Since communication is performed among multiple users, the number of times of nodding by all users in the communication is counted. The voice volume of the speech data is obtained by, for example, averaging the voice volume of speech of the user at the place over a certain period. The heartbeat is obtained by, for example, averaging the heartbeat of the user at the place over a certain period. It can be determined that the larger the number of nods, the voice volume of the speech data, and the heartbeat, the more meaningful the communications performed at the place. The communicative level presentation unit 46 weights the number of nods, the voice volume of the speech data, and the heartbeat to quantify the ease of communication.
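The following Python sketch illustrates one possible way to quantify the ease of communication from these three indices. The normalization ranges and the weights are illustrative assumptions.

```python
# A minimal sketch of how the communicative level presentation unit 46 might
# quantify the ease of communication; normalization and weights are assumed.
def ease_of_communication(nods_per_min, voice_volume_db, heartbeat_bpm,
                          weights=(0.5, 0.3, 0.2)):
    # Normalize each index to roughly 0 to 1 before weighting (assumed ranges).
    nods = min(nods_per_min / 10.0, 1.0)
    volume = min(max(voice_volume_db - 40.0, 0.0) / 40.0, 1.0)
    heart = min(max(heartbeat_bpm - 60.0, 0.0) / 60.0, 1.0)
    w_n, w_v, w_h = weights
    return w_n * nods + w_v * volume + w_h * heart

print(round(ease_of_communication(6.0, 65.0, 80.0), 3))
```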
The communicative level presentation unit 46 calculates the number of nods, the sound volume, and heartbeat at each place (S301). These indices are calculated based on the activity information of, for example, a most recent period (for example, one month). The place is specified based on the position information of two or more users who have communicated with each other. In other words, the communicative level presentation unit 46 regards the grouped users as being communicating, and specifies, as the place, for example, a circular or rectangular range surrounding the position information of these users or a circle having a predetermined radius from the center of gravity of the position information of these users. The place may be specified by the unit of, for example, meeting room, seat, and table.
The communicative level presentation unit 46 calculates the ease of communication at each place (S302). For example, the communicative level presentation unit 46 calculates the ease of communication by weighting the number of nods, the voice volume, and the heartbeat.
The communicative level presentation unit 46 transmits, to the information processing terminal 22, a screen that displays a mark having a color or density corresponding to the ease of communication, superimposed on the layout of the office (S303). The information processing terminal 22 displays a screen on which the ease of communication is presented for each office place as illustrated in
As described above, since the communicative level presentation unit 46 presents a place where meaningful communications are performed, the user can select the place where meaningful communications are performed to have a communication.
The environmental control system 100 of the present embodiment may control the environment of the meeting room by using the position information, the direction information, and the behavior information of the user.
In the present embodiment, a description is given of environmental control in a virtual meeting on the assumption that a user carrying the activity sensor 9 participates in the virtual meeting.
Some technologies are known in the art to display a screen of an electronic whiteboard or a PC screen in a virtual space, and enable users to hold, for example, a meeting while viewing a screen displaying a virtual space with virtual reality (VR) goggles. Some technologies are known in the art to hold, for example, online meetings in which a PC monitor displays the faces of the participants or a screen of an electronic whiteboard. Compared with such online meetings, in the virtual space (within the field of view of the VR goggles), the participants and the screen of the electronic whiteboard or PC are displayed as if the participants and the screen are in front of the eyes. Accordingly, the participants can feel the realism of the meeting.
Regarding the environmental control according to Embodiments 1 or 2, since an image can be displayed on, for example, any wall, ceiling, or floor in the virtual space, the degree of freedom of the environmental control is expected to increase.
As described in Embodiments 1 and 2, the information processing system 10 communicates with the activity sensor 9 of each user. In
In the present embodiment, the position information of the users in the meeting room 322 transmitted to the information processing system 10 may indicate the respective locations of the users in the meeting room 322.
However, when the virtual space 320 is the virtual meeting room 321 as illustrated in
The user operates the controller 331 carried by the user for VR operation to move the avatar in the virtual space 320. In the initial state, the direction of the avatar in the virtual space 320 matches the direction of the seat in the virtual space 320. The direction of the avatar is directed to the direction instructed by the operation of the controller 331 or the direction detected by the activity sensor 9. Thus, the face direction in the virtual space 320 is specified.
In the virtual meeting room 321, the user participating from the meeting room 322 or the remote environment 323 can view other participants and the screens of the devices installed in the meeting room as if the user is in a real meeting room.
In the case of the system configuration illustrated in
Generally, it is difficult for a user wearing VR goggles to perform an activity while freely walking in a real space with many obstacles since the field of view is obstructed by the VR goggles. In the related art, if the obstacle is a fixture, which does not move, the three-dimensional data of the obstacle relative to the space is obtained in advance and an obstacle having the same size is reproduced at the same position in a virtual space based on the three-dimensional data, to prevent collision in the real space. However, this method is not applicable for an object such as a person that moves freely. By contrast, in the present embodiment, since the position information of a person is acquired and processed by the information processing system 10, the information processing system 10 reproduces the person in a virtual space in the same or substantially same positional relation as in a real space. Accordingly, the collision between persons in the virtual space can be prevented. As a result, even when the fields of view are fully obstructed by the VR goggles, the persons can freely move around.
When the virtual space 320 is the virtual office 324 as illustrated in
In
Hardware Configuration
VR Goggles
The CPU 130 executes an operating system (OS) and a control program read from the ROM 132 to the main memory 131, to perform various types of processing. The main memory 131 includes a dynamic RAM (DRAM), and is used as, for example, the work area of the CPU 130.
In the ROM 132, the OS, a system program at power on, and a program for controlling the display terminal 330 are written in advance.
To the CPU 130, a universal asynchronous receiver-transmitter (UART) 135 is connected. The UART 135 is an interface for serial data transmission and reception between the CPU 130 and a BLUETOOTH module 136, and includes, for example, a first-in first-out (FIFO) memory and a shift register.
The BLUETOOTH module 136 includes a radio frequency (RF) unit and a baseband unit and is connected to an antenna 137. The BLUETOOTH module 136 performs wireless communication conforming to BLUETOOTH protocols.
The display controller 133 performs digital-to-analog (D/A) conversion on, for example, text, graphics, and image data, and performs control for displaying these data on a liquid crystal display (LCD) 134.
The wireless LAN controller 139 executes a communication protocol conforming to the Institute of Electrical and Electronics Engineers (IEEE) 802.11ax, and controls communication with other devices by transmitting and receiving radio waves via the antenna 138.
A sound signal received from the microphone 142 is converted into sound data by an analog-to-digital (A/D) conversion circuit, and the sound data is encoded by the audio codec 140 using an advanced audio coding (AAC) method. AAC-encoded data received from an external device is decoded by the audio codec 140, converted into an analog signal by a D/A conversion circuit, and output from a speaker 143. A video codec 141 decodes compressed video data received from an external device. The compressed video data is in a format in conformity with, for example, the International Telecommunication Union (ITU)-T Recommendation H.264. Data is exchanged between the above-described components via the bus 144.
Controller for Operating VR
The CPU 110 executes a control program read from the ROM 112 to the main memory 111, to perform control processing. The main memory 111 includes a DRAM and is used as, for example, a work area of the CPU 110.
In the ROM 112, a system program at the time of power-on and a program for transmitting information on pressing of the menu display button 114, the pointer display button 115, and the confirmation button 116 by BLUETOOTH are written in advance.
The six-axis acceleration and angular velocity sensor 113 outputs measurement data of acceleration and angular velocity. The UART 117 is an interface for serial data transmission and reception between the CPU 110 and a BLUETOOTH module 118, and includes, for example, a FIFO memory and a shift register. The BLUETOOTH module 118 includes an RF unit and a baseband unit and is connected to an antenna 119. The BLUETOOTH module 118 performs wireless communication conforming to BLUETOOTH protocols.
Functions
The information processing system 10 of
The first environmental control unit 36 of the present embodiment outputs a tone (ambient sound) corresponding to the sound data generated by the generation unit 35 in the virtual space 320. There are two manners of output. One is outputting the tone from all the display terminals 330 so that all avatars in the virtual space 320 hear the tone. The other is outputting the tone from only the display terminal 330 of a specific avatar so that only the specific avatar hears the tone.
The second environmental control unit 37 outputs an image corresponding to the image data generated by the generation unit 35 to the virtual space 320. There are two manners of output. One is outputting the image to all the display terminals 330 so that all avatars in the virtual space 320 can view the image. The other is outputting the image to only the display terminal 330 of a specific avatar so that only the specific avatar can view the image.
Display Terminal
With reference to
The terminal communication unit 343 communicates with the controller 331 by wireless communication such as BLUETOOTH and receives operation information from the controller 331. The communication unit 344 transmits, for example, the operation information received from the controller 331 to the information processing system 10 via an input device or directly. Further, the communication unit 344 receives, from the information processing system 10, an image of the entire virtual space 320 or a part thereof corresponding to a partial field of view range. The display control unit 345 displays an image of the virtual space 320 in the field of view range corresponding to the position and the direction of the avatar.
Controller
A description is given of a functional configuration of the controller 331 according to the present embodiment. The controller 331 includes a terminal communication unit 341 and an operation receiving unit 342. The operation receiving unit 342 receives a user's operation on the controller 331 (such as pressing of a button for instructing a position or a direction, or pressing of a button for selecting a menu). The terminal communication unit 341 communicates with the display terminal 330 by wireless communication such as BLUETOOTH and transmits the operation information to the display terminal 330.
Virtual Space Control Unit
The user information acquisition unit 34 converts the activity information of the user received by the communication unit 31 from the sensor 14 to the activity information in the virtual space 320, and stores the activity information in time series for each user. The term “conversion” used here relates to position information. In the system configuration of
The activity information has a data structure similar to that in
Further, in the present embodiment, the following activity information can also be acquired.
Avatar information: The avatar information is information of the avatar used by the user in the virtual space 320. The avatar information includes information on the position and the direction of the avatar in the virtual space 320. The avatar information further includes information on, for example, appearance, clothes, and accessories of the avatar.
Action history: The action history is a history of actions performed by the avatar in a metaverse. For example, the action history includes places the avatar has visited and the activities (e.g., a meeting or a conversation) the avatar has performed.
Communication data: The communication data is data of, for example, text chats, voice chats, and gestures with which the avatar communicates with another avatar.
Interests and hobbies information: The interests and hobbies information is information related to hobbies and preferences of the avatar and indicates, for example, a content or activity in which the avatar is interested and places the avatar frequently visits.
Social graph: A social graph is information that visually represents the relation of the avatar with another avatar (friendship or belonging to a group) by, for example, connection lines or color coding.
Sensor data: Sensor data is information related to a physical motion or state of an avatar obtained using a sensor (such as a position sensor or an acceleration sensor) mounted on the display terminal 330.
Advertisement or personalization: Advertisement or personalization is information for individually customizing or recommending an advertisement or content based on the action history or the interests of the avatar.
Environmental Control in Virtual Space
Descriptions are given below of environmental control in the virtual space 320 based on the flowcharts of
Output of Tone or Image corresponding to Psychological Safety Level
In step S1A, the user information acquisition unit 34 obtains the activity information of the avatars in the virtual space 320, and stores the activity information in time series for each avatar. The user group determination unit 32 extracts a group of avatars within a threshold distance from each other in the virtual space 320 (S1A).
The user group determination unit 32 extracts a group of avatars whose directions are directed toward each other among the avatars within the threshold distance from each other (S2A).
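The following Python sketch illustrates one possible implementation of steps S1A and S2A, in which avatars within a threshold distance whose directions point toward each other are grouped. The data layout, the threshold distance, and the 45-degree facing tolerance are illustrative assumptions.

```python
# A minimal sketch of steps S1A and S2A: pairs of avatars within a threshold
# distance whose directions point toward each other are grouped.
import math

def facing(a_pos, a_dir_rad, b_pos, tol_rad=math.radians(45)):
    # True if avatar A's direction points toward avatar B within a tolerance.
    to_b = math.atan2(b_pos[1] - a_pos[1], b_pos[0] - a_pos[0])
    diff = abs((a_dir_rad - to_b + math.pi) % (2 * math.pi) - math.pi)
    return diff <= tol_rad

def grouped_pairs(avatars, threshold=2.0):
    pairs = []
    ids = list(avatars)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            pa, pb = avatars[a]["pos"], avatars[b]["pos"]
            if math.dist(pa, pb) > threshold:
                continue                      # S1A: distance check
            if facing(pa, avatars[a]["dir"], pb) and facing(pb, avatars[b]["dir"], pa):
                pairs.append((a, b))          # S2A: mutual facing check
    return pairs

avatars = {"A": {"pos": (0.0, 0.0), "dir": 0.0},
           "B": {"pos": (1.5, 0.0), "dir": math.pi}}
print(grouped_pairs(avatars))  # [('A', 'B')]
```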
Subsequent process of steps S3A to S7A can be similar to the process of steps S3 to S7 in
Recording the transition of the psychological safety level between users (avatars) as illustrated in
Creation of Minutes in which Conversation Partner is Identified
The minutes creation unit 39 determines that one of the avatars facing each other is the speaker and the other is the listener, based on speech data in the virtual space 320 (S11A). An utterance of an avatar is an utterance of a user in the real space. Since the activity sensor 9 transmits the speech data to the information processing system 10, the virtual space control unit 43 treats the speech data as being uttered by the avatar in the virtual space 320. The utterance may be transmission and reception of, for example, a chat message or a stamp in the virtual space 320.
Then, the minutes creation unit 39 records, in the minutes 220, the speech data representing the utterance from the speaker to the listener (S12A). The minutes creation unit 39 records the current psychological safety level in the minutes 220 in association with the speech data.
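The following Python sketch illustrates one possible structure of a minutes entry recorded in steps S11A and S12A, in which the speaker, the listener, the speech data, and the current psychological safety level are associated with each other. The MinutesEntry structure and its fields are illustrative assumptions.

```python
# A minimal sketch of a minutes entry for steps S11A and S12A; the structure
# and field names are assumptions, not the recorded format of the minutes 220.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class MinutesEntry:
    timestamp: datetime
    speaker_id: str
    listener_id: str
    speech_text: str
    psychological_safety: float  # current level associated with the utterance

minutes_220 = []

def record_utterance(speaker_id, listener_id, speech_text, psychological_safety):
    minutes_220.append(MinutesEntry(datetime.now(), speaker_id, listener_id,
                                    speech_text, psychological_safety))

record_utterance("avatar_A", "avatar_B", "Let's review the draft.", 0.8)
print(minutes_220[0].speaker_id, "->", minutes_220[0].listener_id)
```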
Guidance Based on Avatar Position and Direction
For example, every time the position of the avatar is detected in the virtual space 320, the user guide unit 33 determines whether there is the image display device 12 within the threshold distance from the avatar (S21A). In the system configuration of
If the determination in step S21A is Yes, the user guide unit 33 determines whether the direction of the avatar is directed to the image display device 12 in the virtual space 320 (S22A).
In the system configuration of
If the determination in step S22A is Yes, the second environmental control unit 37 displays a guidance message on the image display device 12 which the avatar faces (S23A). In the system configuration illustrated in
The guidance message is displayed, as illustrated in
The second environmental control unit 37 determines whether the avatar has uttered words (S31A). Which avatar has uttered words is identified by the user ID transmitted by the activity sensor 9. The avatar's utterance is constantly recorded.
When the avatar utters words (Yes in S31A), the second environmental control unit 37 displays the speech data in the avatar's work area (S32A). As illustrated in
In the system configuration illustrated in
Alternatively, speech data may be displayed in any desirable space by utilizing the merit of the virtual space 320. In either case, the speech data may or may not be visible to other avatars.
User State Determination
The user state determination unit 41 determines whether the avatar has uttered a word (S41A). Which avatar has uttered words is identified by the user ID transmitted by the activity sensor 9. Further, another avatar facing the avatar who has uttered words is identified as the listener.
When the avatar utters words, the user state determination unit 41 checks the behavior information of the avatar being the listener with respect to the utterance (S42A). It is preferable that the user state determination unit 41 focuses on the behavior information of the avatar being the listener from the start to the end of the speech data and within a certain time from the end. The user state determination unit 41 records the behavior information together with time. The behavior information of the avatar is detected by the activity sensor 9. For example, when nodding or head tilt is detected as a behavior, the activity sensor 9 transmits the information on the detected behavior to the information processing system 10, and the virtual space control unit 43 treats the behavior as the behavior of the avatar in the virtual space 320. The behavior includes, in addition to the behavior detected by the activity sensor 9, behaviors (such as nodding and head tilt) input from the controller 331 operated by the user. Subsequent process of S43A and S44A can be similar to the process of S43 and S44 in
In step S44A, in the system configuration illustrated in
Alternatively, the understanding level may be displayed in any desirable space by utilizing the merit of the virtual space 320. In the system configuration of
Determination of Concentration Level
The user state determination unit 41 determines whether the position and the direction of the avatar remain unchanged in the virtual space 320 (S51A).
When the determination in step S51A is Yes, the user state determination unit 41 determines whether the avatar is in a predetermined posture on the basis of the behavior information of the avatar (S52A). For example, when the behavior information indicates that the avatar is sleeping or doing exercises, the user state determination unit 41 determines that the avatar does not concentrate on work. The behavior information of the avatar is detected by the activity sensor 9. For example, when nodding or head tilt is detected as a behavior, the activity sensor 9 transmits the information on the detected behavior to the information processing system 10, and the virtual space control unit 43 treats the behavior as the behavior of the avatar in the virtual space 320. The behavior includes, in addition to the behavior detected by the activity sensor 9, behaviors (such as being interested and being concentrated) input from the controller 331 operated by the user. Subsequent process of S53A and S54A can be similar to the process of S53 and S54 in
In step S54A, in the system configuration illustrated in
Display of Cursor
The cursor position determination unit 42 determines whether the avatar faces the image display device 12 in the virtual space 320 (S61A). In the system configuration of
When the determination in step S61A is Yes, the cursor position determination unit 42 determines whether there is a button in the direction of the avatar's face (S62A).
If the determination in step S62A is Yes, the cursor position determination unit 42 determines to align the cursor position with the button (S63A). The second environmental control unit 37 displays a cursor superimposed on the button. The button may be in an area preset by the user. The button may be a button that is directly pressed by the user in the real space. By selecting or pressing the button, it is determined whether the avatar speaks toward the work area. Subsequent process of S64A and S65A can be similar to the process of S64 and S65 in
Team Evaluation Based on Utterance Content
Also in the virtual space 320, since the speech data of the avatar is the speech data of the user, the process of S101A to S103A in
Office Optimization
The process of S201A to S203A can be similar to the process of S201 to S203 in
The process of S301A to S303A can be similar to the process of S301 to S303 in
Applied Cases
The above-described embodiments are illustrative and do not limit the present disclosure. Thus, numerous additional modifications and variations are possible in light of the above teachings without deviating from the scope of the present disclosure. The information processing system 10 described in the above embodiments of the present disclosure is one example, and the system configuration may vary depending on applications or purposes.
For example, one user may carry multiple activity sensors 9. This makes it possible to detect the direction and behavior of the user more accurately.
The information processing system 10 may determine the direction and the behavior of the user by combining the direction and the behavior information from the activity sensor 9 and the movement of the user detected by the camera 18.
The information processing system 10 desirably performs active noise canceling on the sound data obtained by the activity sensor 9 using the sound data obtained by the microphone 20. Active noise canceling is to cancel ambient noise, and thus the user's voice can be obtained with reduced noise.
The functional configurations of, for example,
The functionality of the elements disclosed herein may be implemented using circuitry or processing circuitry which includes general purpose processors, special purpose processors, integrated circuits, application specific integrated circuits (ASICs), digital signal processors (DSPs), field programmable gate arrays (FPGAs), conventional circuitry and/or combinations thereof which are configured or programmed to perform the disclosed functionality. Processors are considered processing circuitry or circuitry as they include transistors and other circuitry therein. In the disclosure, the circuitry, units, or means are hardware that carry out or are programmed to perform the recited functionality. The hardware may be any hardware disclosed herein or otherwise known which is programmed or configured to carry out the recited functionality. When the hardware is a processor which may be considered a type of circuitry, the circuitry, means, or units are a combination of hardware and software, the software being used to configure the hardware and/or processor.
The “processing circuit or circuitry” in the present specification includes a programmed processor to execute each function by software, such as a processor implemented by an electronic circuit, and devices, such as an application specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable gate array (FPGA), and conventional circuit modules arranged to perform the recited functions.
The group of apparatuses or devices described in the embodiments of the present disclosure is one of multiple computing environments to implement the embodiments disclosed in the present disclosure. In some embodiments, the information processing system 10 includes multiple computing devices such as a server cluster. The multiple computing devices communicate with one another through any type of communication link including, for example, a network and a shared memory, and perform the processes disclosed in the present disclosure.
Further, the information processing system 10 can combine disclosed processing steps in various ways. The components of the information processing system 10 may be combined into a single apparatus or may be divided into multiple apparatuses. One or more of the processes performed by the information processing system 10 may be performed by the information processing terminal 22. Any one of the above-described operations may be performed in various other ways, for example, in an order different from the one described above.
The above-described embodiments are illustrative and do not limit the present invention. Thus, numerous additional modifications and variations are possible in light of the above teachings. For example, elements and/or features of different illustrative embodiments may be combined with each other and/or substituted for each other within the scope of the present invention. Any one of the above-described operations may be performed in various other ways, for example, in an order different from the one described above.
The present disclosure further includes the following aspects.
In an information processing system of one aspect, in a case where the cursor is displayed on a button, the environmental control unit controls the environment of the space in accordance with speech data of the user who presses the button.
In the information processing system of another aspect, the environmental control unit determines an object to be used for environmental control of the space in accordance with the position information and the direction information of the user, and displays a work area of the user on the determined object. In a case where a work area is present at the position determined as the position of the cursor by the cursor position determination unit, the environmental control unit additionally displays speech data of the user in the work area.
In another aspect, the information processing system further includes a psychological safety estimation unit to calculate a percentage of facing time to time during which a speaker and a listener belonging to a team and grouped by the user group determination unit have a conversation. The facing time is the duration in which the listener faces the speaker. The psychological safety estimation unit calculates a standard deviation of the percentage of the facing time in the team, and the environmental control unit controls the environment of the space in accordance with the standard deviation.
In another aspect, the information processing system further includes a psychological safety estimation unit to calculate a frequency of nodding of a listener to a speaker in time during which the speaker and the listener belonging to a team and grouped by the user group determination unit have a conversation; and calculate a standard deviation of the frequency of nodding in the team. The environmental control unit controls the environment of the space in accordance with the standard deviation.
In another aspect, the information processing system further includes an individuality evaluation unit to determine fields of keywords extracted from the speech data; calculate, for each speaker, percentages of the fields to which the keywords belong, and calculate a sum or an average of standard deviations of the percentages of the fields among speakers, as an individuality variation score of a team to which the speakers belong. The environmental control unit outputs the individuality variation score.
In another aspect, the information processing system further includes a convenience presentation unit to calculate, for each of multiple places in the space, a usage rate of the place where the user performs an activity, a concentration level of the user, a relaxation level of the user obtained from heartbeat intervals, and a conversation volume of the user obtained from speech data of the user, and present convenience of the place based on the usage rate of the place and the concentration level, the relaxation level, and the conversation volume of the user.
In another aspect, the information processing system further includes a communicative level presentation unit to calculate, for each of multiple places in the space, an amount of nodding in an activity performed by the user at the place, a voice volume of utterances of the user at the place, and a heartbeat of the user at the place, and present ease of communication at the place calculated based on the amount of nodding, the voice volume, and the heartbeat of the user.
In another aspect, in the information processing system, the space includes a virtual space.
In another aspect, an environmental control system for controlling an environment of a space in which a user performs an activity includes an information processing apparatus, an activity sensor, and an output device. The information processing apparatus includes a user information acquisition unit to receive position information and direction information of a user, and an environmental control unit to control an environment of a space in which the user performs an activity, in accordance with the position information and the direction information of the user. The activity sensor is carried by the user and configured to transmit the position information and the direction information of the user to the information processing apparatus. The output device includes an output unit to output a tone or an image under control of the information processing apparatus.