The present disclosure relates to an information processing apparatus, an information processing method, and a program.
For example, for robots and agents that operate in an environment such as a house, techniques for recognizing environmental information on the inside of the house and on the periphery of the device have been developed and introduced.
Mobile robots and the like that operate in a predetermined environment, however, encounter visual information that differs greatly between environments. In addition, the visual information changes from moment to moment. Moreover, the spatial region that a mobile robot can observe is limited to a part of the environment by objects existing in the predetermined environment. For these reasons, there has been a demand for a recognition system that is not affected by differences or changes between environments or by visual occlusion in a predetermined environment such as a house.
Therefore, the present disclosure proposes an information processing apparatus, an information processing method, and a program capable of recognizing a specific event from partial information in a predetermined environment.
An information processing apparatus according to the present disclosure includes: a sensor unit that senses environmental information on a predetermined area; a storage unit that stores event information including event feature data on a predetermined event and meta information including space information on the predetermined event associated with the event feature data; and a control unit that retrieves the event information from the storage unit and acquires the space information included in the event information based on a sensing result from the sensor unit.
According to one aspect of an embodiment, a specific event can be recognized from partial information in a predetermined environment. Note that the effects described here are not necessarily limited, and any effect described in the present disclosure may be exhibited. The effects set forth in the specification are merely examples and not limitations; other effects may be exhibited.
An embodiment of the present disclosure will be described in detail below with reference to the drawings. Note that, in the following embodiment, the same reference signs are attached to the same parts to omit duplicate description.
[Configuration of System According to Embodiment]
First, an embodiment of the present disclosure will be outlined. As described above, in recent years, a technique for recognizing predetermined environmental information on the inside of a house and the periphery of a device with a mobile robot such as a pet-type robot has been developed.
In contrast, visual information varies greatly depending on the environment, such as a house, and changes from moment to moment. That is, in recognition of routine events in predetermined environments such as a house, there are great differences between environments, and the features of an event differ between environments. For that reason, the mobile robot needs to learn an event for each environment. Moreover, the above-described mobile robot can only partially observe the space at any one time; that is, a sensor in the mobile robot observes different space at different times. The mobile robot thus needs to recognize an event by integrating features from the inputs of a plurality of sensors and complementing incomplete information even when the observation is incomplete. For that reason, there has been a demand for a recognition system that is not affected by differences or changes between environments such as the insides of houses or by visual occlusion.
Accordingly, in the present disclosure, a routine event is defined by event feature data and event meta information, which are obtained based on inputs from a plurality of sensors, and is three-dimensionally mapped. This enables event recognition processing that is more robust against differences between predetermined environments, such as the inside of a house H, and against visual occlusion. Here, the event feature data is visual information and auditory information that characterize the event itself. Specifically, the event feature data includes object feature data, which indicates the feature amount of an object, and voice feature data, which indicates the feature amount of voice. The event meta information includes position information indicating a predetermined position as space information.
First, a learning phase of an information processing apparatus according to the present embodiment will be described.
In the example of
In the example in
Furthermore, when simultaneously acquiring voice data and video data in the above-described learning phase, the information processing apparatus 1 can gradually refine the event information by sequentially updating the event DB. Therefore, according to the information processing apparatus 1, a user does not need to set detailed event information in advance; the event information can be optimized by simple operation, which facilitates its optimization.
[Configuration of Information Processing Apparatus According to Embodiment]
Next, a configuration example of the information processing apparatus 1 according to the embodiment will be described.
The sensor unit 2 includes a sensor that senses environmental information in a predetermined area (inside of house H). In the example in
The communication unit 3 is a communication module that transmits and receives data to and from another communicable device via a predetermined network. The communication unit 3 includes a reception unit 31 and a transmission unit 32. The reception unit 31 receives predetermined information from another device, and outputs the information to the control unit 5. The transmission unit 32 transmits predetermined information to another device via the network.
The storage unit 4 is a storage device for recording at least event information. The storage unit 4 stores a voice feature database (DB) 41, an object mask DB 42, an object feature DB 43, an event meta information DB 44, an event feature DB 45, an event DB 46, a threshold DB 47, and evoking event meta information 48.
Voice feature data stored in the voice feature DB 41 is information on the feature amount of voice data acquired by the information processing apparatus 1. The voice feature data corresponds to, for example, a feature amount extracted by the control unit 5, described later, based on the voice data acquired by the microphone sensor 21.
Returning to
Furthermore,
Furthermore, the event meta information includes at least two-dimensional or three-dimensional position information. The event meta information may include time information. In the present embodiment, the event meta information is meta information including position information and time information related to a predetermined event. The event meta information may further include information necessary for an action of a mobile robot. The information necessary for an action of a mobile robot is, for example, category information, occurrence frequency information, occurrence date and time information, and the like related to an event.
In the event feature DB 45, the above-described voice feature data and object feature data are associated with each other, and stored as the event feature data.
Moreover, in the event DB 46, the above-described event feature data and event meta information are associated with each other, and stored as the event information.
The threshold DB 47 includes information on a threshold for determining the coincidence level between the voice data acquired by the microphone sensor 21 and the video data acquired by the camera sensor 22. This threshold is referred to as a coincidence threshold in the present specification, and relates to the coincidence level between the voice feature data obtained from the voice data and the object feature data obtained from the video data. The coincidence threshold is used for determining whether or not to enter a learning phase, in other words, whether or not the event is to be registered. When the coincidence level calculated for the input voice data and video data exceeds the coincidence threshold, the processing enters the learning phase; when the coincidence level is equal to or less than the coincidence threshold, the processing enters the evoking phase. Here, in the learning phase, the control unit 5 changes the event DB 46 by registration processing or update processing. In the evoking phase, the control unit 5 performs processing of outputting the event meta information included in predetermined event information from the event DB 46 under a predetermined condition.
The threshold DB 47 also includes information on a threshold for determining the similarity level between the voice feature data registered in the voice feature DB 41 and the voice data acquired by the microphone sensor 21, and a threshold for determining the similarity level between the object feature data registered in the object feature DB 43 and the video data acquired by the camera sensor 22. These thresholds are referred to as evoking thresholds in the present specification. In other words, an evoking threshold is used for determining whether or not event feature data including voice feature data or object feature data similar to the feature amount of the input voice data or video data exists in the event information stored in the event DB 46.
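As a non-limiting illustration of how the two kinds of thresholds could be applied, the following sketch (in Python, with hypothetical threshold values; the actual values held in the threshold DB 47 are not specified here) separates the phase selection based on the coincidence level from the evoking decision based on the similarity level.

```python
# Minimal sketch of the threshold logic described above. The numeric values
# and function names are assumptions for illustration only.
COINCIDENCE_THRESHOLD = 0.7   # hypothetical value held in the threshold DB 47
EVOKING_THRESHOLD = 0.8       # hypothetical value held in the threshold DB 47

def select_phase(coincidence_level: float) -> str:
    """Enter the learning phase when the voice and object features coincide,
    otherwise fall back to the evoking phase."""
    return "learning" if coincidence_level > COINCIDENCE_THRESHOLD else "evoking"

def is_evocable(similarity_level: float) -> bool:
    """Decide whether stored event feature data is similar enough to the
    input feature amount for the corresponding event to be evoked."""
    return similarity_level > EVOKING_THRESHOLD
```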
The evoking event meta information 48 is included in the event information retrieved from the event DB 46. The information processing apparatus 1 makes an action plan based on the evoking event meta information 48.
Next, the control unit 5 will be described. The control unit 5 has a function of controlling each configuration of the information processing apparatus 1. As illustrated in
The voice feature extraction unit 51 extracts a feature amount having a high abstraction level from the voice data input from the microphone sensor 21, and converts the feature amount into voice feature data. Here, the processing of conversion from voice data to voice feature data can be achieved by, for example, a technique such as Fourier transform processing.
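As one concrete but purely illustrative realization of such a conversion, the sketch below turns raw waveform samples into a fixed-length spectral vector by averaging windowed FFT magnitudes; the frame length, hop size, and log compression are assumptions of the sketch rather than requirements of the embodiment.

```python
import numpy as np

def extract_voice_feature(waveform: np.ndarray,
                          frame_len: int = 1024,
                          hop: int = 512) -> np.ndarray:
    """Convert raw audio samples into a coarse spectral feature vector.

    Each frame is windowed and transformed with a real FFT; the magnitude
    spectra are then averaged over time so that the result is a fixed-length
    vector that can be stored in, or compared against, a voice feature DB.
    """
    frames = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len] * np.hanning(frame_len)
        frames.append(np.abs(np.fft.rfft(frame)))
    if not frames:  # input shorter than a single frame
        return np.zeros(frame_len // 2 + 1)
    return np.log1p(np.mean(frames, axis=0))  # log-compressed average spectrum

# Example: a one-second 440 Hz tone sampled at 16 kHz
feature = extract_voice_feature(np.sin(2 * np.pi * 440 * np.arange(16000) / 16000))
```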
The object region estimation unit 52 estimates a region where the object 101 exists as illustrated in
Returning to
The microphone sensor 21, the camera sensor 22, the voice feature extraction unit 51, the object region estimation unit 52, the object feature extraction unit 53, the sound source object estimation unit 54, the voice feature DB 41, the object mask DB 42, and the object feature DB 43 described above constitute a feature extraction unit 70. The feature extraction unit 70 generally extracts event feature data from various pieces of data such as input voice data and video data. In addition, the feature extraction unit 70 calculates the coincidence level between the object feature data and the voice feature data, that is, determines whether or not the video data includes an object serving as the sound source.
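Assuming, purely for illustration, that the voice feature data and the object feature data are projected into numeric vectors of the same dimensionality, the coincidence level could be computed as a normalized cosine similarity, as in the sketch below.

```python
import numpy as np

def coincidence_level(voice_feature: np.ndarray,
                      object_feature: np.ndarray) -> float:
    """Cosine similarity between a voice feature vector and an object feature
    vector, rescaled to the range [0, 1]. Higher values suggest that the
    imaged object is the source of the acquired voice data."""
    denom = np.linalg.norm(voice_feature) * np.linalg.norm(object_feature)
    if denom == 0.0:
        return 0.0
    return float((voice_feature @ object_feature) / denom * 0.5 + 0.5)
```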
The spatial position information acquisition unit 55 creates a map of a predetermined area (inside of house H) based on the depth information detected by the depth sensor 23, and stores the map in the storage unit 4 as map information serving as a base of the event meta information. The spatial position information acquisition unit 55 can generate the map information by simultaneous localization and mapping (SLAM). Note that the spatial position information acquisition unit 55 may update the map information at a predetermined cycle on the assumption that the furniture in the house H is rearranged, or may generate a map every time the information processing apparatus 1 moves. Furthermore, the information processing apparatus 1 may store a map generated by another device as the map information. The spatial position information acquisition unit 55 can calculate specific position information by comparing the depth information obtained by the depth sensor 23 with the map information stored in the storage unit 4. Examples of methods by which the spatial position information acquisition unit 55 acquires the predetermined position information include processing of acquiring coordinate information on the earth by using a positioning system such as a global positioning system (GPS), and self-position estimation processing, such as Visual SLAM, of acquiring a relative position from a predetermined starting point by using video data.
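As a simplified, two-dimensional illustration of how a single depth measurement could be combined with the self-position estimated by SLAM to obtain position information in the map (world) coordinate system, consider the following sketch; the pose representation and function name are assumptions, and the actual mapping and localization machinery is omitted.

```python
import math

def observation_to_world(robot_x: float, robot_y: float, robot_yaw: float,
                         depth: float, bearing: float) -> tuple:
    """Project one depth measurement, taken at `bearing` radians relative to
    the robot's heading, into world (map) coordinates using the robot pose
    estimated by self-position estimation such as Visual SLAM."""
    world_angle = robot_yaw + bearing
    return (robot_x + depth * math.cos(world_angle),
            robot_y + depth * math.sin(world_angle))

# Example: robot at (2.0, 1.0) facing +x, object 1.5 m away, 30 degrees to the left
print(observation_to_world(2.0, 1.0, 0.0, 1.5, math.radians(30)))
```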
The time information acquisition unit 56 is, for example, a timer such as a clock or a time information receiving mechanism. The time information receiving mechanism receives time information from a server that outputs time information via a predetermined network.
The spatial position information acquisition unit 55 outputs the position information associated with an observed event as a part of the event meta information. The event meta information is stored in an event meta information database of the storage unit 4. The event meta information includes at least event position information. Here, the event position information refers to a coordinate representation that uses any position as an origin and that includes two or more numerical values. The position information can be expressed by, for example, space information such as a relative position from a predetermined starting point in the map of an environment, that is, an XYZ position in a world coordinate system, and coordinate information of a world geodetic system obtained from a GPS satellite. Furthermore, the time information acquired by the time information acquisition unit 56 may be associated with the position information calculated by the spatial position information acquisition unit 55 as time information on the time when an event has occurred, and used as a part of the event meta information.
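A compact container for the event meta information described here might look like the sketch below; the field names are assumptions, and only the position (and optionally the time) is required by the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class EventMetaInformation:
    """Meta information associated with one observed event."""
    position: Tuple[float, float, float]   # XYZ position in the world coordinate system
    timestamp: Optional[float] = None      # time the event occurred (e.g., epoch seconds)
    category: Optional[str] = None         # optional category information
    occurrence_count: int = 0              # optional occurrence frequency information

# Example record for an event observed near the entrance
meta = EventMetaInformation(position=(3.2, 0.8, 0.0), timestamp=1_600_000_000.0,
                            category="coming_home", occurrence_count=1)
```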
The depth sensor 23, the spatial position information acquisition unit 55, the time information acquisition unit 56, and the event meta information DB 44 described above constitute an event meta information acquisition unit 80. The event meta information acquisition unit 80 generally outputs information necessary for retrieval of event information and action of a mobile robot as the event meta information based on input from the depth sensor 23, and stores the information in the storage unit 4.
The learning evoking unit 57 serving as a part of a generation unit associates the event feature data obtained by the feature extraction unit 70 and the event meta information obtained by the event meta information acquisition unit 80 with each other to generate event information, and stores the event information in the event DB 46 of the storage unit 4. Note that, although the event feature data is stored in the event feature DB 45 and the event information is stored in the event DB 46 in the present embodiment, these are not limitations. That is, a system capable of outputting related information from specific input, such as a Boltzmann machine and a self-organizing map, may be used instead of using a database.
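As a minimal sketch of the association performed by the learning evoking unit 57, a plain dictionary below stands in for the event DB 46 (a Boltzmann machine or self-organizing map could equally serve, as noted above); the ID format and field names are assumptions.

```python
import itertools

_event_ids = itertools.count(1)  # simple source of new event IDs for the sketch

def generate_event_information(event_db: dict, event_feature: dict,
                               event_meta: dict) -> str:
    """Associate event feature data with event meta information, attach a new
    event ID, and store the pair in the event DB as one event information record."""
    event_id = f"EVT{next(_event_ids):03d}"
    event_db[event_id] = {"feature": event_feature, "meta": event_meta}
    return event_id

event_db = {}
generate_event_information(event_db,
                           {"voice": "EA0016", "object": "EV0002"},  # hypothetical feature IDs
                           {"position": (3.2, 0.8, 0.0), "time": "18:30"})
```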
The learning evoking unit 57 determines which one of the registration processing, the update processing, and the evoking processing on the event information is to be executed based on the event feature data output from the feature extraction unit 70 and a coincidence threshold or an evoking threshold stored in the threshold DB 47.
The learning evoking unit 57, the event feature DB 45, the event DB 46, and the threshold DB 47 described above constitute an event memory unit 90. The event memory unit 90 generally selects any processing of registration, update, and evoking for the event information, while generating event information and storing the event information in the storage unit 4.
The action plan control unit 58 has a function of planning an action of the information processing apparatus 1 based on information acquired by the sensor unit 2 and various pieces of data stored in the storage unit 4. The action plan control unit 58 according to the present embodiment first retrieves, based on the voice data acquired by the microphone sensor 21, the event meta information that corresponds to the voice data and is stored in the event meta information DB 44. The action plan control unit 58 subsequently determines to execute an action of moving to the position specified by the position information included in the retrieved event meta information.
Furthermore, the action plan control unit 58 has a function of controlling the operation of a drive unit 6. The drive unit 6 has a function of driving a physical configuration of the information processing apparatus 1 and of moving the position of the information processing apparatus 1, and is, for example, an actuator driven by a motor 61. For example, the action plan control unit 58 controls the motor 61 of the drive unit 6 based on the above-described action plan to drive an actuator of each joint unit provided in the drive unit 6. Note that the drive unit 6 may have any configuration as long as the information processing apparatus 1 can achieve a desired operation and the position of the information processing apparatus 1 can be moved. When the information processing apparatus 1 includes a movement mechanism such as caterpillar tracks or tires, the drive unit 6 drives the caterpillar tracks, the tires, and the like. The drive unit 6 may further include sensors necessary for controlling a mobile robot, such as a GPS reception unit and an acceleration sensor.
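The sketch below illustrates one possible form of the resulting motion control: given the target position taken from the retrieved event meta information, the robot turns toward it and moves forward until it arrives. The `motor_command(v, w)` callable stands in for the motor 61 of the drive unit 6 and is an assumed interface, not one defined by the embodiment.

```python
import math

def drive_toward(current_pose, target_xy, motor_command, stop_radius=0.3):
    """Very small action-plan step: steer toward `target_xy` (taken from the
    event meta information) and move forward; stop within `stop_radius` metres.

    current_pose: (x, y, yaw) of the robot in world coordinates.
    motor_command: callable taking (forward_speed, turn_rate).
    Returns True once the target has been reached."""
    x, y, yaw = current_pose
    dx, dy = target_xy[0] - x, target_xy[1] - y
    if math.hypot(dx, dy) < stop_radius:
        motor_command(0.0, 0.0)                      # arrived: stop the drive unit
        return True
    heading_error = math.atan2(dy, dx) - yaw
    heading_error = math.atan2(math.sin(heading_error), math.cos(heading_error))
    motor_command(0.2, 1.0 * heading_error)          # forward speed, proportional turn
    return False
```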
[Information Processing Method According to Embodiment]
Next, a processing procedure executed by the information processing apparatus 1 according to the present embodiment will be described.
As illustrated in
Next, when the processing proceeds to Step ST2, the event memory unit 90 of the information processing apparatus 1 determines whether or not the generated event feature data exceeds a coincidence threshold. Specifically, the sound source object estimation unit 54 first calculates the coincidence level between the voice feature data and the object feature data included in the event feature data, and outputs the coincidence level to the learning evoking unit 57. When the learning evoking unit 57 determines that the input coincidence level exceeds the coincidence threshold (Step ST2: Yes), the processing proceeds to Step ST3. A high coincidence level between the voice feature data and the object feature data indicates that the camera sensor 22 has imaged the object that output the voice data at substantially the same time as the microphone sensor 21 acquired the voice data. In this case, as described above, the processing of the information processing apparatus 1 enters a learning phase.
Next, in Step ST3 serving as the learning phase, the control unit 5 of the information processing apparatus 1 evokes an event based on the event feature data. Specifically, the learning evoking unit 57 of the control unit 5 retrieves the event information stored in the event DB 46 based on the acquired event feature data. For example, the event DB 46 stores an event feature ID and event meta information associated with an event ID as illustrated in
The processing subsequently proceeds to Step ST4. The learning evoking unit 57 determines whether or not there is event information having event feature data in which the similarity level to the acquired event feature data exceeds a predetermined evoking threshold. Note that, in addition to the threshold of the similarity level regarding the event feature data, the learning evoking unit 57 may use a threshold based on other information included in the event meta information or a threshold based on an occurrence frequency or occurrence date and time as the evoking threshold regarding the similarity level. When the learning evoking unit 57 determines that there is event information including event feature data that exceeds the predetermined evoking threshold (Step ST4: Yes), the processing proceeds to Step ST5. Note that the description will be given on the assumption that the event feature data included in the retrieved event information is the event feature data of the event feature ID “E001” in
In Step ST5, the learning evoking unit 57 updates the retrieved event feature data. Specifically, the learning evoking unit 57 updates the event feature data included in the retrieved event information to the acquired event feature data. That is, for example, the voice feature data among the event feature data with the event feature ID “E001” is updated from the voice feature data “EA0015” in
Furthermore, when the learning evoking unit 57 determines that there is no event information including event feature data that exceeds the predetermined evoking threshold in Step ST4 (Step ST4: No), the processing proceeds to Step ST6. In Step ST6, the control unit 5 registers an event. Specifically, the learning evoking unit 57 generates event feature data from the voice feature data and the object feature data output from the feature extraction unit 70. Meanwhile, the learning evoking unit 57 acquires the event meta information output from the event meta information acquisition unit 80. The learning evoking unit 57 associates the event feature data and the event meta information with each other, attaches an event ID, and stores the event feature data and the event meta information in the event DB 46. Then, the learning phase executed by the information processing apparatus 1 ends.
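Steps ST3 to ST6 of the learning phase can be summarized in the following sketch; the threshold value, similarity function, and ID format are assumptions, with the in-place update corresponding to Step ST5 and the registration branch to Step ST6.

```python
def learning_phase(event_db: dict, new_feature, new_meta,
                   similarity_fn, evoking_threshold: float = 0.8) -> str:
    """Update the stored event whose feature data is sufficiently similar to
    the newly acquired one (Step ST5); otherwise register a new event (Step ST6)."""
    for event_id, record in event_db.items():
        if similarity_fn(record["feature"], new_feature) > evoking_threshold:
            record["feature"] = new_feature                  # Step ST5: update
            return event_id
    new_id = f"EVT{len(event_db) + 1:03d}"                   # Step ST6: register
    event_db[new_id] = {"feature": new_feature, "meta": new_meta}
    return new_id
```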
Furthermore, when the learning evoking unit 57 determines in Step ST2 that the calculated coincidence level is equal to or less than the coincidence threshold (Step ST2: No), the processing proceeds to Step ST7. A coincidence level equal to or less than the coincidence threshold indicates that the camera sensor 22 had not imaged the object that output the voice data at the time when the microphone sensor 21 acquired the voice data. In this case, as described above, the processing of the information processing apparatus 1 enters an evoking phase.
Next, in Step ST7 serving as the evoking phase, the control unit 5 of the information processing apparatus 1 evokes an event based on the voice feature data. Specifically, the learning evoking unit 57 of the control unit 5 retrieves the event information stored in the event DB 46 based on the acquired voice feature data. Note that the learning evoking unit 57 may retrieve the event information based on the acquired object feature data. For example, the event DB 46 stores an event feature ID and event meta information associated with an event ID as illustrated in
In Step ST8, the learning evoking unit 57 subsequently determines whether or not there is event information in which the similarity level between the voice feature data included in the retrieved event information and the acquired voice feature data exceeds a predetermined evoking threshold. When the learning evoking unit 57 determines that there is event information including voice feature data in which the similarity level to the acquired voice feature data exceeds the evoking threshold (Step ST8: Yes), the processing proceeds to Step ST9. A case where the acquired voice feature data is "EA0015" will be described below as an example.
In Step ST9, the control unit 5 outputs the event meta information of a corresponding event. Specifically, the learning evoking unit 57 first retrieves the event feature data “E001” (see
The action plan control unit 58 to which the evoking event meta information 48 is input executes an action plan based on the position information included in the evoking event meta information 48, and controls the drive unit 6. As a result, the information processing apparatus 1 moves to the place indicated by the position information included in the evoking event meta information 48.
In contrast, when the learning evoking unit 57 determines in Step ST8 that there is no event information including voice feature data in which the similarity level to the acquired voice feature data exceeds the evoking threshold (Step ST8: No), the evoking phase executed by the information processing apparatus 1 ends.
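Steps ST7 to ST9 of the evoking phase likewise reduce to a retrieval followed by output of the associated meta information, as in the sketch below (with an assumed similarity function and threshold value); the returned value corresponds to the evoking event meta information 48.

```python
def evoking_phase(event_db: dict, voice_feature,
                  similarity_fn, evoking_threshold: float = 0.8):
    """Retrieve the event whose stored voice feature data is most similar to
    the input (Steps ST7 and ST8) and return its event meta information
    (Step ST9), or None when no stored event exceeds the evoking threshold."""
    best_meta, best_score = None, evoking_threshold
    for record in event_db.values():
        score = similarity_fn(record["feature"]["voice"], voice_feature)
        if score > best_score:
            best_meta, best_score = record["meta"], score
    return best_meta  # used as the evoking event meta information 48
```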
Next, a specific example of the information processing apparatus 1 according to the embodiment will be described. In the present example, a case where a resident such as a husband or father comes home to the house H will be described. First, as illustrated in
Furthermore, as illustrated in
Thereafter, as illustrated in
In contrast, as illustrated in
Although, in the present example, the information processing apparatus 1 retrieves event information including similar voice data based on the acquired voice data and moves to the position indicated by the associated event meta information, the information processing apparatus 1 may do the same based on acquired video data. For example, the information processing apparatus 1 that has acquired a flash of lightning as video data may retrieve event information including object feature data similar to that video data, and move to a position based on the associated event meta information.
Note that, as described above, the information processing apparatus 1 including a mobile robot can newly generate event information that is not stored in the event DB 46 only when voice data and video data corresponding to each other are acquired substantially at the same time and the processing proceeds to the learning phase. In this case, the generation of the event information depends on chance. Accordingly, various methods can be adopted to facilitate simultaneous acquisition of voice data and video data related to each other. In the above-described example, for example, when the position of the door D of the entrance is not mapped, an application installed in a mobile terminal device possessed by the resident and GPS information included in the mobile terminal device may be linked. First, setting is performed, by using the application of the mobile terminal device, to inform the resident of information on the mobile robot and to transmit the position information on the resident to the mobile robot. Then, when the resident approaches the house H, the mobile robot is controlled to move to a random place and stand by. Furthermore, when there is no resident in the house H, the mobile robot may be set to stand by at a different place each time. Furthermore, beamforming may be applied to the microphone sensor 21 to add an action plan for moving in the direction in which sound is emitted. Moreover, the application of the mobile terminal device may prompt a resident who has not gone out to greet the resident coming home together with the mobile robot.
(Variations)
Next, a variation of the above-described example will be described.
As illustrated in
In the state where the event information is stored in the storage unit 4 of the information processing apparatus 1A as described above, the information processing apparatus 1A acquires sound emitted by a home appliance as illustrated in
[Outline of Variation]
Although, in the above-described embodiment, a case where the information processing apparatuses 1 and 1A are disposed in a predetermined area (house H) has been described, this is not a limitation. For example, the information processing apparatus 1 can be configured as a server apparatus.
For example, the information processing apparatus 300 receives voice data and video data as sensing results of environmental information transmitted from a pet-type robot 400. The pet-type robot 400 includes the sensor unit 2, the drive unit 6 that can move to a position specified by input position information, and a drive control unit that drives the drive unit 6. The information processing apparatus 300 controls the action of the pet-type robot 400 based on the event meta information stored in the event meta information DB and the event feature data stored in the event feature DB 145. The information processing apparatus 300 transmits, based on the received voice data or video data, information on the position to which the pet-type robot 400 is to move. The pet-type robot 400 that has received the position information moves to the position included in the received position information. Note that, although a case where the information processing apparatus 300 receives a sensing result from the pet-type robot 400 has been described, this is not a limitation.
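A very small sketch of the server-side exchange described here is shown below; the message fields and reply format are assumptions of the sketch, and the transport between the information processing apparatus 300 and the pet-type robot 400 is left unspecified.

```python
def handle_sensing_result(message: dict, event_db: dict,
                          similarity_fn, evoking_threshold: float = 0.8) -> dict:
    """Server-side handler: receive a sensing result (e.g., a voice feature
    extracted from voice data) from the pet-type robot 400, retrieve matching
    event information, and reply with the position the robot should move to."""
    voice_feature = message.get("voice_feature")
    best, best_score = None, evoking_threshold
    for record in event_db.values():
        score = similarity_fn(record["feature"]["voice"], voice_feature)
        if score > best_score:
            best, best_score = record, score
    return {"move_to": None if best is None else best["meta"]["position"]}
```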
Moreover, the information processing apparatus 300 and a mobile terminal device 500 possessed by a user may be made communicable with each other, and the movement of the pet-type robot 400 may be made controllable by the mobile terminal device 500.
[Other Variations]
Although, in the above-described embodiment, a predetermined area has been described as the house H, this is not a limitation. Any area can be set as the predetermined area.
An information device such as the information processing apparatus, an HMD, and a controller according to the above-described embodiment is implemented by a computer 1000 having a configuration as illustrated in
The CPU 1100 operates based on a program stored in the ROM 1300 or the HDD 1400, and controls each unit. For example, the CPU 1100 loads a program stored in the ROM 1300 or the HDD 1400 onto the RAM 1200, and executes processing corresponding to various programs.
The ROM 1300 stores a boot program such as a basic input output system (BIOS) executed by the CPU 1100 at the time when the computer 1000 is started, a program depending on the hardware of the computer 1000, and the like.
The HDD 1400 is a computer-readable recording medium that non-transiently records a program executed by the CPU 1100, data used by the program, and the like. Specifically, the HDD 1400 is a recording medium that records a program according to the present disclosure, which is an example of program data 1450.
The communication interface 1500 is used for connecting the computer 1000 to an external network 1550 (e.g., Internet). For example, the CPU 1100 receives data from another device and transmits data generated by the CPU 1100 to another device via the communication interface 1500.
The input/output interface 1600 is used for connecting an input/output device 1650 and the computer 1000 to each other. For example, the CPU 1100 receives data from an input device such as a keyboard and a mouse via the input/output interface 1600. Furthermore, the CPU 1100 transmits data to an output device such as a display, a speaker, and a printer via the input/output interface 1600. Furthermore, the input/output interface 1600 may function as a medium interface that reads a program and the like recorded in a predetermined recording medium (medium). The medium includes, for example, an optical recording medium such as a digital versatile disc (DVD) and a phase change rewritable disk (PD), a magneto-optical recording medium such as a magneto-optical disk (MO), a tape medium, a magnetic recording medium, a semiconductor memory, and the like.
For example, when the computer 1000 functions as the information processing apparatus 1 according to the embodiment, the CPU 1100 of the computer 1000 implements the functions of the spatial position information acquisition unit 55 and the like by executing a program loaded on the RAM 1200. Furthermore, the HDD 1400 stores a program according to the present disclosure and data in the storage unit 4. Note that the CPU 1100 reads the program data 1450 from the HDD 1400 and executes the program data 1450. In another example, the CPU 1100 may acquire these programs from another device via the external network 1550.
Conventional information processing apparatuses such as mobile robots have had difficulty operating in an environment such as the inside of the house H, since they sometimes cannot evoke both an image and voice in association with space information, or have restrictions on input information. In contrast, according to the above-described embodiment, both the image data and the voice data can be associated with the space information and stored in a mutually evocable state. This allows all of the other information, specifically the voice data, the image data, and the event meta information, to be retrieved by acquiring only the voice data or only the video data, and all of that information can be used for controlling the action of the mobile robot. Furthermore, even in an environment where video data cannot be acquired, the information processing apparatuses 1 and 1A such as a mobile robot can move to the place where an event occurs as long as the voice data can be acquired; similarly, even in an environment where voice data cannot be acquired, they can move to the place where an event occurs as long as the video data can be acquired. Moreover, since event information is registered or continuously updated at the timing when voice data and video data can be simultaneously acquired, the information processing apparatuses 1 and 1A can operate robustly in response to changes in the environment. The environment around an object or the like in the house H changes from moment to moment; by shifting the processing to the learning phase at the timing when the voice data and the video data are simultaneously acquired, the information processing apparatuses 1 and 1A can also respond to such changes in the next and subsequent occasions.
Note that the present technology can also have the configurations as follows.
(1)
An information processing apparatus comprising:
a sensor unit that senses environmental information on a predetermined area;
a storage unit that stores event information including event feature data on a predetermined event and meta information including space information on the predetermined event associated with the event feature data; and
a control unit that retrieves the event information from the storage unit and acquires the space information included in the event information based on a sensing result from the sensor unit.
(2)
The information processing apparatus according to (1),
wherein the control unit
determines a similarity level between a sensing result sensed by the sensor unit and the event feature data stored in the storage unit, and
when the similarity level exceeds a predetermined evoking threshold, retrieves event information including event feature data that exceeds the evoking threshold from the storage unit.
(3)
The information processing apparatus according to (1) or (2),
wherein the event feature data includes object feature data obtained by an object that is allowed to be sensed by the sensor unit and voice feature data obtained by voice that is allowed to be sensed by the sensor unit.
(4)
The information processing apparatus according to (3),
wherein the control unit retrieves, from the storage unit, event information including voice feature data whose similarity level to voice feature data obtained from voice sensed by the sensor unit exceeds a predetermined evoking threshold.
(5)
The information processing apparatus according to (3),
wherein the control unit retrieves, from the storage unit, event information including object feature data whose similarity level to object feature data obtained from an object sensed by the sensor unit exceeds a predetermined evoking threshold.
(6)
The information processing apparatus according to any one of (3) to (5),
wherein the object feature data corresponds to a feature amount of an object sensed by the sensor unit, and
the voice feature data corresponds to a feature amount of voice emitted from an object sensed by the sensor unit.
(7)
The information processing apparatus according to any one of (1) to (6), allowed to control a mobile robot including a drive unit that moves a housing,
wherein the control unit makes an action plan based on the acquired space information, and performs control of causing the mobile robot to act based on the action plan.
(8)
The information processing apparatus according to any one of (1) to (7),
wherein the information processing apparatus is a mobile robot.
(9)
An information processing method comprising
a computer retrieving event information from a storage unit and outputting space information included in the event information based on a sensing result from a sensor unit that senses environmental information on a predetermined area, the storage unit storing the event information that includes event feature data on a predetermined event and meta information including the space information on the predetermined event associated with the event feature data.
(10)
A program causing a computer to function as:
a sensor unit that senses environmental information on a predetermined area;
a storage unit that stores event information including event feature data on a predetermined event and meta information including space information on the predetermined event associated with the event feature data; and
a control unit that retrieves the event information from the storage unit and outputs the space information included in the event information based on a sensing result from the sensor unit.
(11)
An information processing apparatus comprising:
a sensor unit that senses environmental information on a predetermined area; and
a generation unit that generates event information by associating event feature data on a predetermined event obtained based on a sensing result from the sensor unit and meta information including space information on the predetermined event obtained based on the sensing result with each other.
(12)
The information processing apparatus according to (11),
in which the event feature data includes object feature data obtained by an object allowed to be sensed by the sensor unit and voice feature data obtained by voice allowed to be sensed by the sensor unit,
the control unit determines a coincidence level between the object feature data and the voice feature data obtained based on the sensing result; and
when the coincidence level exceeds a predetermined coincidence threshold, the generation unit generates the event information.
(13)
An information processing method comprising
a computer generating event information by associating event feature data on a predetermined event obtained based on a sensing result from a sensor unit that senses environmental information on a predetermined area and meta information including space information on the predetermined event obtained based on the sensing result with each other.
(14)
A program causing a computer to function as:
a sensor unit that senses environmental information on a predetermined area; and
a generation unit that generates event information by associating event feature data on a predetermined event obtained based on a sensing result from the sensor unit and meta information including space information on the predetermined event obtained based on the sensing result with each other.
Priority application: Number 2019-168590 | Date: Sep 2019 | Country: JP | Kind: national
Filing document: PCT/JP2020/027500 | Filing date: 7/15/2020 | Kind: WO