The present disclosure relates to an information processing apparatus, an information processing method, and an information processing program.
There is a technique of detecting a user's position from the user's footsteps collected by microphones installed in the user's home, and, according to the user position, outputting a sound from an arbitrary speaker or changing the brightness of a light.
Patent Literature 1: JP 2006-148357 A
However, such position detection requires the microphones to be precisely aligned in advance, down to the level of coordinates. Furthermore, the detected position is, for example, a two-dimensional position such as 2 m north and 3 m east of an origin. Detection in this manner is not very convenient for either the system or the user.
Therefore, the present disclosure proposes an information processing apparatus, an information processing method, and an information processing program capable of detecting a user position in a manner more convenient for both the system and the user, without requiring precise alignment of the microphones in the user's home.
The present disclosure proposes an information processing apparatus including: an acquisition unit configured to acquire sound data recorded by a plurality of microphones installed in an arbitrary place, and to acquire a relative position, from each of the plurality of microphones, of a footstep included in the sound data; and a learning unit configured to generate a learning model by learning training data that includes the sound data as input and the relative position as a correct answer.
The present disclosure also proposes an information processing method implemented by an information processing apparatus, the method including: acquiring sound data recorded by a plurality of microphones installed in an arbitrary place; acquiring a relative position, from each of the plurality of microphones, of a footstep included in the sound data; and generating a learning model by learning training data that includes the sound data as input and the relative position as a correct answer.
The present disclosure further proposes an information processing program causing an information processing apparatus to execute: acquiring sound data recorded by a plurality of microphones installed in an arbitrary place; acquiring a relative position, from each of the plurality of microphones, of a footstep included in the sound data; and generating a learning model by learning training data that includes the sound data as input and the relative position as a correct answer.
Hereinafter, the present embodiment will be described in detail with reference to the drawings. Note that, in the present specification and the drawings, substantially the same components are denoted by the same reference signs, and duplicate description is omitted.
The description will be given in the following order.
1. Embodiment
2. Hardware configuration example
3. Summary
<1. Embodiment>
<<1.1. Functional Configuration Example>>
First, an information processing system according to the present embodiment will be described.
The information processing apparatus 100 is a server apparatus managed by a provider or the like of a service that uses the information processing system.
The information processing apparatus 100 estimates a user position from the user's footsteps detected in the user's home 10 using a learning model, and provides a necessary service to the user according to the user position. Specifically, the service to be provided is, for example, a service that outputs a sound or music from a speaker in the room where the user is present, or changes the brightness of a light according to the user position in the home. Other conceivable services include, for example, a service that outputs a warning sound or voice from a predetermined speaker in a case where a housekeeper approaches a preset entry-prohibited area while the user is using a housekeeping service, a service that notifies the user terminal 200 or the like in a case where a suspicious person intrudes into the user's home 10 while the user is away, and a watching service for an elderly user. Note that, by registering the footsteps of each user (e.g., each family member) in the information processing apparatus 100, the information processing apparatus 100 may also estimate the position of each user and provide a service suitable for each user.
A functional configuration of the information processing apparatus 100 will be described later.
The user terminal 200 is a terminal used by the user who uses various services. The user terminal 200 may be a mobile terminal such as a smartphone or a tablet PC, or may be a stationary terminal installed in the user's home 10 or the like. An application including a user interface (UI) for using various services may be installed in the user terminal 200. Alternatively, the application may be a web application provided by the information processing apparatus 100 or the like.
The user can receive various services from the information processing apparatus 100 or the like via the user terminal 200, and can also receive, from the information processing apparatus 100, an instruction for having the user himself/herself assist relearning of a user position estimation model. Furthermore, in accordance with the instruction from the information processing apparatus 100, the user performs movement accompanied by the sound of footsteps for relearning of the user position estimation model, and, at the start or end of the movement, transmits instruction data indicating the start or end of the movement to the information processing apparatus 100 via the user terminal 200.
Furthermore, the user can input text via the user terminal 200 to set a label for each of the microphones 20 (e.g., the microphone 20 installed in the kitchen is labeled “kitchen”).
The microphones 20 are microphone devices installed at arbitrary places in the user's home 10 in order to record the user's footsteps and the like.
The learning microphones 60 are microphone devices installed in an arbitrary place such as the development facility 50 for preliminary learning of the user position estimation model performed by a developer or the like. Note that the learning microphones 60 do not need to be special microphones; “learning” is added merely to distinguish them from the microphones 20.
In addition, although not illustrated, a human motion detector that detects a person and his/her motion in the user's home 10, a camera that detects persons and objects, a time of flight (ToF) sensor, an infrared (IR) sensor, an acceleration sensor, a radar (radio wave) sensor, and the like may be further installed in the user's home 10. A service can be provided to the user based on the situation in the user's home 10 detected by these various sensors.
Next, a functional configuration of the information processing apparatus 100 according to the present embodiment will be described.
(Storage Unit 110)
The storage unit 110 according to the present embodiment is a storage area for temporarily or permanently storing various programs and data. For example, the storage unit 110 can store programs and data for the information processing apparatus 100 to execute various functions. As a specific example, the storage unit 110 may store a program and data (including the learning model) for estimating the user position, and management data for managing various settings. Obviously, the above is merely an example, and the type of data stored in the storage unit 110 is not particularly limited.
(Acquisition Unit 120)
The acquisition unit 120 according to the present embodiment acquires sound data recorded by the plurality of learning microphones 60 in the learning phase of the learning model for estimating the user position (hereinafter referred to as the “user position estimation model”). Furthermore, the acquisition unit 120 acquires relative positions of the footsteps from the learning microphones 60.
The acquisition unit 120 also acquires sound data (corresponding to second sound data) recorded by the microphones 20 installed in the user's home 10. In addition, the acquisition unit 120 acquires the label set to each of the microphones 20.
In addition, the acquisition unit 120 receives and acquires, from the user terminal 200, the instruction data indicating start or end of movement (corresponding to first instruction data and third instruction data, respectively) for relearning of the user position estimation model.
(Learning Unit 130)
The learning unit 130 according to the present embodiment generates the user position estimation model, which is the learning model, by receiving as input the sound data recorded by the plurality of learning microphones 60 and learning training data that includes the relative positions of the footsteps from the learning microphones 60 as correct answers. Furthermore, the learning unit 130 performs relearning of the user position estimation model using the sound data recorded by the microphones 20 installed in the user's home 10.
Note that the learning model of the present embodiment includes an input layer to which the sound data including the footsteps is input, an output layer, a first element belonging to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and a weight of the first element. The learning model causes the information processing apparatus 100 to function so as to output, from the output layer, the relative positions of the footsteps from the microphones 20 or the learning microphones 60 according to the sound data input to the input layer.
Note that a generation device (e.g., the information processing apparatus 100 such as a server device) that generates the learning model of the present embodiment may generate the above-described learning model using any learning algorithm. For example, the generation device may generate the learning model of the present embodiment using a learning algorithm such as a neural network (NN), a support vector machine (SVM), clustering, or reinforcement learning. As an example, it is assumed that the generation device generates the learning model of the present embodiment using an NN. In this case, the learning model may have an input layer including one or more neurons, an intermediate layer including one or more neurons, and an output layer including one or more neurons.
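As a concrete, non-limiting illustration of such generation, the following Python sketch assumes PyTorch as the learning framework; the feature dimension, the number of microphones, and the network shape are arbitrary assumptions not found in the present disclosure, with sound features as input and relative positions as correct answers.

    # Minimal sketch: learn relative footstep positions from sound features.
    # Assumes PyTorch; all shapes and names are illustrative only.
    import torch
    import torch.nn as nn

    N_MICS = 4          # number of (learning) microphones
    FEAT_DIM = 128      # e.g., flattened log-mel spectrogram features

    # Input layer -> intermediate layer -> output layer (x, y per microphone).
    model = nn.Sequential(
        nn.Linear(FEAT_DIM, 64),
        nn.ReLU(),
        nn.Linear(64, N_MICS * 2),
    )
    loss_fn = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Dummy training batch: sound features as input, relative positions as
    # correct answers (random stand-ins for actually recorded data).
    features = torch.randn(32, FEAT_DIM)
    positions = torch.randn(32, N_MICS * 2)

    for _ in range(100):
        optimizer.zero_grad()
        pred = model(features)
        loss = loss_fn(pred, positions)   # error vs. correct answers
        loss.backward()
        optimizer.step()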
Here, it is assumed that the learning model according to the present embodiment is realized by a regression model expressed as “y = a1*x1 + a2*x2 + ... + an*xn”. In this case, the first elements included in the learning model correspond to the input data x1, x2, ..., xn, and the weight of the first element xi corresponds to the coefficient ai. Here, the regression model can be regarded as a simple perceptron having an input layer and an output layer. When the model is regarded as a simple perceptron, each first element can be regarded as a node included in the input layer, and the second element can be regarded as a node included in the output layer.
Furthermore, it is assumed that the learning model according to the present embodiment is realized by an NN including one or more intermediate layers, such as a deep neural network (DNN). In this case, the first element included in the learning model corresponds to any node included in the input layer or an intermediate layer. The second element corresponds to a node of the next stage, that is, a node to which a value is transmitted from the node corresponding to the first element. The weight of the first element corresponds to a connection coefficient, that is, a weight applied to the value transmitted from the node corresponding to the first element to the node corresponding to the second element.
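To make this correspondence concrete, the following fragment (an illustrative sketch that does not appear in the disclosure; NumPy is assumed, and all dimensions are arbitrary) computes one forward pass: input and hidden node values play the role of first elements, the node values they feed play the role of second elements, and the entries of the weight matrices are the connection coefficients.

    # Illustrative forward pass: nodes and connection coefficients.
    import numpy as np

    x = np.array([0.2, -1.0, 0.5])      # input nodes ("first elements")
    W1 = np.random.randn(4, 3)          # connection coefficients, input->hidden
    W2 = np.random.randn(2, 4)          # connection coefficients, hidden->output

    h = np.maximum(W1 @ x, 0.0)         # hidden nodes ("second elements")
    y = W2 @ h                          # output nodes: relative positions
    print(y)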
The relative positions of the footsteps from the microphones 20 or the learning microphones 60 are calculated using a learning model having an arbitrary structure, such as the regression model or the NN described above. More specifically, the coefficients of the learning model are set such that, when the sound data including the footsteps is input, the model outputs the relative positions of the footsteps from each of the microphones. The learning model according to the present embodiment may also be a model generated based on a result obtained by repeating input and output of data.
Note that the above example describes a case where the learning model according to the present embodiment is a model (referred to as model A) that outputs the relative positions of the footsteps from each of the microphones when the sound data including the footsteps is input. However, the learning model according to the present embodiment may be a model generated based on a result obtained by repeating input and output of data to and from model A. For example, the learning model according to the present embodiment may be a learning model (referred to as model B) that receives the sound data including the footsteps as an input and outputs the relative positions of the footsteps from each of the microphones that model A outputs. Alternatively, the learning model according to the present embodiment may be a learning model that receives the sound data including the footsteps as an input and outputs the relative positions of the footsteps from each of the microphones that model B outputs.
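The model-A/model-B relationship can be read as a distillation-style setup, in which a second model learns from the outputs of the first. The following sketch is only an illustration of that reading, assuming PyTorch; model_a is a random-weight stand-in for a trained model A, and all shapes are arbitrary assumptions.

    # Sketch: model B learns to reproduce the outputs of model A.
    import torch
    import torch.nn as nn

    model_a = nn.Linear(128, 8)   # stand-in for a trained model A
    model_b = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 8))
    optimizer = torch.optim.Adam(model_b.parameters(), lr=1e-3)

    sound_features = torch.randn(64, 128)
    with torch.no_grad():
        target = model_a(sound_features)      # positions output by model A

    for _ in range(200):
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model_b(sound_features), target)
        loss.backward()
        optimizer.step()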
(Estimation Unit 140)
The estimation unit 140 according to the present embodiment estimates the user position based on a result output by inputting, to the user position estimation model, the sound data recorded by the microphones 20 installed in the user's home 10. In this case, the estimation unit 140 can estimate a relative position of the user from the label given to each of the microphones 20 and acquired by the acquisition unit 120 (e.g., the user is between the microphone labeled “living room” and the microphone labeled “kitchen”).
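For illustration only (the disclosure does not prescribe this logic), such a label-based description could be derived by taking the two microphones with the smallest estimated distances; the labels and distances below are hypothetical.

    # Sketch: describe the user position using microphone labels.
    # Estimated distances (m) from each labeled microphone to the footsteps.
    estimated = {"living room": 1.2, "kitchen": 1.8, "western-style room": 4.5}

    nearest = sorted(estimated, key=estimated.get)[:2]
    print(f"The user is between the '{nearest[0]}' and the '{nearest[1]}'.")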
Furthermore, the estimation unit 140 estimates the position of a wall based on the user's footsteps included in the sound data recorded by the microphones 20. Furthermore, the estimation unit 140 estimates the distance between the microphones 20 and the presence or absence of a wall based on sound output from each of the microphones 20. Furthermore, the estimation unit 140 estimates the room layout of the user's home 10 based on at least one of the estimated position or presence of the wall and the estimated distance between the microphones 20.
(Setting Unit 150)
The setting unit 150 according to the present embodiment sets a label for each of the microphones 20. Note that the setting unit 150 can set a label according to at least one of the user's voice and an environmental sound included in the sound data recorded by the microphones 20. Alternatively, the setting unit 150 can label the microphones 20 according to text data input by the user via the user terminal 200.
(Generation Unit 160)
The generation unit 160 according to the present embodiment generates a user movement route for relearning of the user position estimation model. In this case, the generation unit 160 can generate the user movement route based on, for example, the installation positions of the microphones 20 and the distance between the microphones 20.
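The disclosure does not fix a route-generation algorithm; as one plausible sketch, a nearest-neighbor ordering over hypothetical microphone installation positions yields a route that passes every microphone without long detours.

    # Sketch: nearest-neighbor ordering of microphones as a movement route.
    import math

    mics = {"kitchen": (0, 0), "living room": (3, 1), "western-style room": (1, 4)}

    def route(start="kitchen"):
        remaining = dict(mics)
        current = start
        order = [current]
        remaining.pop(current)
        while remaining:
            cx, cy = mics[current]
            # Next stop: the closest microphone not yet visited.
            current = min(remaining, key=lambda m: math.dist((cx, cy), remaining[m]))
            order.append(current)
            remaining.pop(current)
        return order

    print(" -> ".join(route()))   # e.g., kitchen -> living room -> ...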
(Transmission Unit 170)
The transmission unit 170 according to the present embodiment transmits a movement instruction that prompts the user to move to a predetermined position based on the movement route generated by the generation unit 160. In this case, the transmission unit 170 can transmit the movement instruction to the user terminal 200 in order to display the movement instruction on the UI of the user terminal 200 or to output the movement instruction by voice from the user terminal 200. Alternatively, in order to cause another device such as the microphone 20 to output the movement instruction by voice, the transmission unit 170 may transmit the movement instruction to the microphone 20 or the like.
Furthermore, for relearning of the user position estimation model, the transmission unit 170 transmits, to the microphones 20, the instruction data for start or end of recording (corresponding to second instruction data and fourth instruction data, respectively) in response to the instruction data for start or end of the movement acquired by the acquisition unit 120.
Furthermore, in a case where the user position has moved, for example, to within or outside a predetermined range of a predetermined position indicated by the labels set on the microphones 20, the transmission unit 170 transmits, to the user terminal 200, the microphone 20, or another terminal, instruction data for providing an arbitrary service to the user. More specifically, for example, in order to reproduce music suitable for the user when the user moves to the living room, the transmission unit 170 transmits the instruction data to music reproduction equipment, such as a smart speaker, installed in the living room.
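A minimal sketch of such a range-based trigger follows; the threshold, the labels, and the send_instruction helper are hypothetical stand-ins for the actual instruction data transmission.

    # Sketch: trigger a service when the user enters a labeled area.
    RANGE_M = 2.0   # illustrative threshold (m)

    def send_instruction(device, command):
        print(f"-> {device}: {command}")   # stand-in for real transmission

    def on_position_update(distances):
        """distances: estimated distance (m) from each labeled microphone."""
        if distances.get("living room", float("inf")) < RANGE_M:
            send_instruction("living-room-speaker", "play_music")  # hypothetical

    on_position_update({"living room": 1.1, "kitchen": 5.0})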
(Control Unit 180)
The control unit 180 according to the present embodiment controls each component included in the information processing apparatus 100. Furthermore, in addition to the control of each component, the control unit 180 can control various terminals in the user's home 10 based on, for example, the situation in the user's home 10 detected by the various sensors in the user's home 10.
The functional configuration example of the information processing apparatus 100 according to the present embodiment has been described above. Note that the functional configuration described above is merely an example, and the functional configuration of the information processing apparatus 100 according to the present embodiment is not limited to this example.
In addition, the function of each component may be realized by an arithmetic device such as a central processing unit (CPU) reading a control program, in which the process procedure for realizing these functions is described, from a storage medium such as a read only memory (ROM) or a random access memory (RAM), and interpreting and executing the program. Therefore, the configuration to be used can be appropriately changed according to the technical level at the time of carrying out the present embodiment. Furthermore, an example of a hardware configuration of the information processing apparatus 100 will be described later.
<<1.2. Details of Functions>>
Next, the functions of the information processing apparatus 100 according to the present embodiment will be described in detail. One of the features of the information processing apparatus 100 according to the present embodiment is to generate the user position estimation model by receiving, as input, the sound data including the footsteps recorded by the learning microphones 60 and learning training data that includes the relative positions of the footsteps from the learning microphones 60 as correct answers. First, a method of training the user position estimation model in the development facility 50 or the like will be described.
In the development facility 50, the person 70 walks around while the plurality of learning microphones 60 record the footsteps, and the relative position of the footsteps from each of the learning microphones 60 is measured and recorded as correct answer data.
Note that, in order to improve the learning accuracy of the user position estimation model, various pieces of sound data serving as training data can be collected. For example, sound data is collected by recording footsteps with the learning microphones 60 while changing conditions of the person 70 such as gender, age group, weight, and whether or not footwear such as socks or slippers is worn. Furthermore, sound data can also be obtained by simulating, using a sound transfer function of the environment, the difference in sound echoes of the same user's footsteps depending on the floor material and wall positions, or by generating, for example, pseudo data of a child's footsteps from data of an adult's footsteps using a conversion filter prepared in advance.
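The following sketch illustrates these two augmentations under stated assumptions: a synthetic exponentially decaying impulse response stands in for a measured transfer function, and a crude resampling-and-attenuation step stands in for the prepared conversion filter; all parameters are illustrative.

    # Sketch: augment footstep audio for training.
    import numpy as np

    fs = 16000
    footsteps = np.random.randn(fs)            # stand-in for recorded footsteps

    # (1) Simulate different room acoustics: convolve with a synthetic,
    # exponentially decaying impulse response (stand-in for a measured one).
    t = np.arange(int(0.3 * fs)) / fs
    rir = np.random.randn(t.size) * np.exp(-t / 0.07)
    echoed = np.convolve(footsteps, rir)[: footsteps.size]

    # (2) Pseudo "child" footsteps from adult data: a crude conversion that
    # shortens and lightens the signal by resampling and attenuating it.
    idx = np.linspace(0, footsteps.size - 1, int(footsteps.size * 0.8))
    childlike = 0.5 * np.interp(idx, np.arange(footsteps.size), footsteps)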
Learning of the user position estimation model is performed using the sound data and the relative positions obtained as described above.
Next, a method of performing relearning, at the user's home 10, of the user position estimation model trained in the development facility 50 or the like will be described.
For the relearning, the user moves between the microphones 20 installed in the user's home 10 along an instructed movement route (indicated by a broken line in the corresponding drawing).
When the user starts moving, the instruction data indicating that the user starts to move is transmitted from the user terminal 200 to the information processing apparatus 100, for example, by pressing a “movement start” button displayed on the user terminal 200. Thereafter, the instruction data for starting recording is transmitted from the information processing apparatus 100 to each of the microphones 20, and each of the microphones 20 starts recording. Note that, in a case where the instruction to the user is given via the microphones 20, the recording may start when the user utters a phrase such as “Starting to move” toward the microphones 20. Alternatively, the recording may start when a button provided on the microphone 20 is pressed. The same applies to the end of the movement. Regarding the stop of the recording, however, the recording by the microphones 20 may be stopped by transmitting detection information to the information processing apparatus 100 when sufficient footsteps, or the stoppage of footsteps, are detected by the microphone 20 at the movement destination, when the stoppage of movement is detected by an inertial measurement unit (IMU) mounted on the user terminal 200, or when the arrival of the user at the microphone 20 at the movement destination is detected by various sensors, such as the human motion detector and the camera, installed near that microphone 20 or in the same casing.
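Stop detection from the IMU could be as simple as thresholding the variance of recent acceleration magnitudes, as in this sketch (the sampling rate, window, and threshold are illustrative assumptions):

    # Sketch: detect that the user stopped moving from IMU samples.
    import numpy as np

    def is_stopped(accel, fs=100, window_s=1.0, threshold=0.05):
        """accel: (n, 3) accelerometer samples in m/s^2."""
        n = int(fs * window_s)
        if len(accel) < n:
            return False
        magnitude = np.linalg.norm(accel[-n:], axis=1)
        return magnitude.std() < threshold   # nearly constant => stationary

    samples = np.tile([0.0, 0.0, 9.81], (200, 1))   # user at rest (gravity only)
    print(is_stopped(np.asarray(samples)))          # True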
As described above, when the user walks between the microphones 20, the user position is estimated using the sound data recorded by each of the microphones 20, and relearning of the user position estimation model is performed.
Next, relearning of the user position estimation model is performed using the error between the output result data and the correct answer data.
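Relearning can reuse the same kind of training loop with the home-recorded data and the known route positions as correct answers, typically at a smaller learning rate so that the pretrained model is only adjusted; the following is a hedged PyTorch sketch of a single relearning step, with all shapes and values illustrative.

    # Sketch: one relearning (fine-tuning) step on home-recorded data.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 8))
    # model.load_state_dict(...)  # pretrained weights from the development facility
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # smaller LR

    home_features = torch.randn(16, 128)   # stand-in for home recordings
    route_positions = torch.randn(16, 8)   # correct answers from the known route

    optimizer.zero_grad()
    error = nn.functional.mse_loss(model(home_features), route_positions)
    error.backward()                       # error between output and correct answers
    optimizer.step()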
Furthermore, in a case where the user movement route instructed at the time of relearning of the user position estimation model is, for example, a route as indicated by the broken lines in the corresponding drawing, the information processing apparatus 100 can also estimate the position of a wall of the user's home 10 based on the user's footsteps recorded while the user moves along the route.
Furthermore, the information processing apparatus 100 can estimate the room layout of the user's home 10 based on the estimated position or presence of the wall and the distance between the microphones 20.
Furthermore, the information processing apparatus 100 can also estimate the room layout of the user's home 10 based on sounds that the microphones 20 output to and record from one another.
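For example, if one microphone emits a known test sound and another records it, the distance between them follows from the propagation delay, and attenuation beyond what the distance predicts can hint at an intervening wall. This NumPy sketch (with a simulated recording standing in for real capture) illustrates the delay-based distance estimate.

    # Sketch: estimate the distance between two microphones from a test sound.
    import numpy as np

    fs = 48000
    c = 343.0                          # speed of sound (m/s)
    played = np.random.randn(fs // 10) # known test signal from mic A's speaker

    delay_samples = 300                # simulate ~2.1 m of propagation
    recorded = np.concatenate([np.zeros(delay_samples), 0.3 * played])

    corr = np.correlate(recorded, played, mode="full")
    lag = corr.argmax() - (played.size - 1)           # delay in samples
    print(f"estimated distance: {c * lag / fs:.2f} m")  # ~2.14 m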
As another room layout estimation method, for example, the user uses a camera function mounted on the user terminal 200 to perform panoramic photographing of each room or photographing while walking around the user's home 10, so that the information processing apparatus 100 can recognize the three-dimensional space from the captured video or images and estimate the room layout. In this case, by further turning on a light mounted on each microphone 20, the information processing apparatus 100 can also estimate the positions of the microphones 20 based on the positions of the lights appearing in the captured video or images. Alternatively, the user can register the room layout in the user terminal 200 via a graphical user interface (GUI) of an application displayed on the user terminal 200.
<<1.3. Functional Flow>>
Next, a procedure for the relearning process of the user position estimation model will be described.
First, the generation unit 160 of the information processing apparatus 100 generates the user movement route for relearning of the user position estimation model (Step S101).
Next, the transmission unit 170 of the information processing apparatus 100 transmits, to the user terminal 200, the movement instruction that prompts the user to move to a predetermined position based on the movement route generated by the generation unit 160 (Step S102). The movement instruction is generated using, for example, the labels of the microphones 20 at the movement source and the movement destination, such as “Walk from the western-style room to the kitchen”, so that the user can easily understand it.
Next, the user terminal 200 determines whether or not the movement instruction transmitted from the information processing apparatus 100 has been received (Step S103). When it is determined that the movement instruction has not been received (Step S103: No), the user terminal 200 waits for reception of the movement instruction. When it is determined that the movement instruction has been received (Step S103: Yes), the user terminal 200 displays the movement instruction on the display (Step S104).
Next, the user presses the “movement start” button displayed on the display of the user terminal 200 (Step S105) and starts moving to the movement destination. In response to the pressing of the “movement start” button, the user terminal 200 transmits a movement start instruction to the information processing apparatus 100.
Next, the acquisition unit 120 of the information processing apparatus 100 determines whether or not the movement start instruction transmitted from the user terminal 200 has been received (Step S106). When it is determined that the movement start instruction has not been received (Step S106: No), the acquisition unit 120 waits for reception of the movement start instruction. When it is determined that the movement start instruction has been received (Step S106: Yes), the acquisition unit 120 acquires the movement start instruction. Then, in response to the acquisition of the movement start instruction, the transmission unit 170 of the information processing apparatus 100 transmits a recording start instruction to each of the microphones 20 (Step S107).
Next, the microphones 20 determine whether or not the recording start instruction transmitted from the information processing apparatus 100 has been received (Step S108). When it is determined that the recording start instruction has not been received (Step S108: No), the microphones 20 wait for reception of the recording start instruction. When it is determined that the recording start instruction has been received (Step S108: Yes), the microphones 20 start recording (Step S109).
Next, the user moves in the user's home 10 according to the movement instruction, and upon arriving at the movement destination, the user presses the “movement end” button displayed on the display of the user terminal 200 (Step S110). In response to the pressing of the “movement end” button, the user terminal 200 transmits a movement end instruction to the information processing apparatus 100.
Next, the acquisition unit 120 of the information processing apparatus 100 determines whether or not the movement end instruction transmitted from the user terminal 200 has been received (Step S111). When it is determined that the movement end instruction has not been received (Step S111: No), the acquisition unit 120 waits for reception of the movement end instruction. When it is determined that the movement end instruction has been received (Step S111: Yes), the acquisition unit 120 acquires the movement end instruction. Then, in response to the acquisition of the movement end instruction, the transmission unit 170 of the information processing apparatus 100 transmits the recording end instruction to each of the microphones 20 (Step S112).
Next, the microphones 20 determine whether or not the recording end instruction transmitted from the information processing apparatus 100 has been received (Step S113). When it is determined that the recording end instruction has not been received (Step S113: No), the microphones 20 wait for reception of the recording end instruction. When it is determined that the recording end instruction has been received (Step S113: Yes), the microphones 20 end the recording (Step S114) and transmit the recorded sound data to the information processing apparatus 100 (Step S115).
Next, the acquisition unit 120 of the information processing apparatus 100 determines whether or not the sound data transmitted from the microphone 20 has been received (Step S116). When it is determined that the sound data has not been received (Step S116: No), the acquisition unit 120 waits for reception of the sound data. When it is determined that the sound data has been received (Step S116: Yes), the acquisition unit 120 acquires the sound data. Then, in response to the acquisition of the sound data, the learning unit 130 of the information processing apparatus 100 performs relearning of the user position estimation model, using the sound data (Step S117).
Next, when there is a next movement destination (Step S118: Yes), a movement instruction to the next movement destination is displayed on the user terminal 200 (Step S104), and the process is repeated. Strictly speaking, after the movement end button is pressed (Step S110), the movement instruction to the next movement destination is displayed on the user terminal 200 (Step S104). On the other hand, when there is no next movement destination (Step S118: No), this process ends.
Next, a procedure for the user position estimation process by the user position estimation model will be described.
First, the user moves in the user's home 10 (Step S201), and the microphones 20 detect sound including the user's footsteps (Step S202).
Next, the microphones 20 transmit the sound data including the detected sound to the information processing apparatus 100 (Step S203).
Next, the acquisition unit 120 of the information processing apparatus 100 determines whether or not the sound data transmitted from the microphone 20 has been received (Step S204). When it is determined that the sound data has not been received (Step S204: No), the acquisition unit 120 waits for reception of the sound data. When it is determined that the sound data has been received (Step S204: Yes), the acquisition unit 120 acquires the sound data. Then, in response to the acquisition of the sound data, the estimation unit 140 of the information processing apparatus 100 estimates the user position based on a result output by inputting the sound data to the user position estimation model (Step S205).
Then, the information processing apparatus 100 provides a service corresponding to the estimated user position to the user (Step S206). The service corresponding to the user position is, for example, turning on the light of the room where the user is present, playing music from the nearest speaker, or issuing a warning or notifying the user terminal 200 when another person approaches the entry-prohibited area. After Step S206, this process ends.
<2. Hardware Configuration Example>
Next, a hardware configuration example of the information processing apparatus 100 according to the present embodiment will be described.
(Processor 801)
The processor 801 functions as, for example, an arithmetic processor or a controller, and controls the overall operation of each component or a part thereof based on various programs recorded in the ROM 802, the RAM 803, the storage 810, or a removable recording medium 901.
(ROM 802 and RAM 803)
The ROM 802 is a unit that stores a program read by the processor 801, data used for calculation, and the like. The RAM 803 temporarily or permanently stores, for example, a program read by the processor 801, various parameters that appropriately change when the program is executed, and the like.
(Host Bus 804, Bridge 805, External Bus 806, and Interface 807)
The processor 801, the ROM 802, and the RAM 803 are mutually connected via, for example, the host bus 804 capable of high-speed data transmission. On the other hand, the host bus 804 is connected to the external bus 806 having a relatively low data transmission speed via, for example, the bridge 805. In addition, the external bus 806 is connected to various components via the interface 807.
(Input Device 808)
As the input device 808, for example, a mouse, a keyboard, a touch panel, a button, a switch, a lever, and the like are used. Furthermore, as the input device 808, a remote controller (hereinafter, remote control) capable of transmitting a control signal using infrared rays or other radio waves may be used. Furthermore, the input device 808 includes a voice input device such as a microphone.
(Output Device 809)
The output device 809 is a device capable of visually or audibly notifying a user of acquired information, and includes a display device such as a cathode ray tube (CRT) display, a liquid crystal display (LCD), or an organic electroluminescence (EL) display, an audio output device such as a speaker or headphones, a printer, a mobile phone, or a facsimile. Furthermore, the output device 809 according to the present embodiment includes various vibrating devices capable of outputting tactile stimulation. In addition, the output device 809 may be a device that exclusively outputs sound, such as a smart speaker, and may have a text-to-speech (TTS) function of reading out a character string.
(Storage 810)
The storage 810 is a device for storing various types of data. As the storage 810, for example, a magnetic storage device such as a hard disk drive (HDD), a semiconductor storage device, an optical storage device, or a magneto-optical storage device is used.
(Drive 811)
The drive 811 is, for example, a device that reads information recorded on the removable recording medium 901 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, or writes information to the removable recording medium 901.
(Connection Port 812)
The connection port 812 is a port, such as a universal serial bus (USB) port, an IEEE 1394 port, a small computer system interface (SCSI) port, an RS-232C port, or an optical audio terminal, for connecting an external connection device 902.
(Communication Device 813)
The communication device 813 is a communication device for connecting to various communication networks including the Internet and a mobile network such as a mobile telephone network, and is, for example, a communication card for wired or wireless LAN, Bluetooth (registered trademark), or wireless USB (WUSB), a router for optical communication, a router for asymmetric digital subscriber line (ADSL), or a modem for various communications.
(Removable Recording Medium 901)
The removable recording medium 901 is, for example, a DVD medium, a Blu-ray (registered trademark) medium, an HD DVD medium, or various semiconductor storage media. It is obvious that the removable recording medium 901 may also be, for example, an IC card on which a non-contact IC chip is mounted or an electronic device.
(External Connection Device 902)
The external connection device 902 is, for example, a printer, a portable music player, a digital camera, a digital video camera, or an IC recorder.
<3. Summary>
As described above, the information processing apparatus 100 includes the acquisition unit 120 that acquires the sound data recorded by the plurality of microphones installed in an arbitrary place and acquires the relative positions, from the microphones, of the footsteps included in the sound data, and the learning unit 130 that generates the learning model by receiving the sound data as input and learning the training data that includes the relative positions as correct answers.
As a result, it is not necessary to precisely align the microphones in the user's home 10, and the user position can be detected in a manner more convenient for both the system and the user.
Although the preferred embodiments of the present disclosure have been described in detail with reference to the accompanying drawings, the technical scope of the present disclosure is not limited to the embodiments. It is obvious that a person having ordinary knowledge in the technical field of the present disclosure can conceive various changes or modifications within the scope of the technical idea described in the claims, and it is naturally understood that these also belong to the technical scope of the present disclosure.
Furthermore, the effects described in the present specification are merely illustrative or exemplary, and are not restrictive. In other words, the technology according to the present disclosure can exhibit other effects obvious to those skilled in the art from the description of the present specification in addition to or instead of the above effects.
Note that the present technology can also have the following configurations.
10 USER'S HOME
20 MICROPHONE
50 DEVELOPMENT FACILITY
60 LEARNING MICROPHONE
70 PERSON
100 INFORMATION PROCESSING APPARATUS
110 STORAGE UNIT
120 ACQUISITION UNIT
130 LEARNING UNIT
140 ESTIMATION UNIT
150 SETTING UNIT
160 GENERATION UNIT
170 TRANSMISSION UNIT
200 USER TERMINAL