This application is based on and claims priority under 35 USC 119 from Japanese Patent Application No. 2021-005971 filed Jan. 18, 2021.
The present disclosure relates to an information processing apparatus, an information processing method, and a non-transitory computer readable medium.
Currently, a technology that visually augments reality by superimposing and displaying information created by data processing on the real world has been put into practical use. This technology is called augmented reality (AR).
Devices capable of displaying an augmented-reality image (hereinafter also referred to as an “AR image”) include a glasses-type device. The glasses-type device may be used in the same way as ordinary glasses when it does not display an AR image.
With the widespread use of devices capable of displaying AR images, it is expected that the chances of displaying AR images representing the contents of information in front of the field of view of a user wearing the device will increase.
Meanwhile, even when the user's surrounding environment and operating conditions are the same, the user may at one time want to display the contents of information whose existence has been detected, and at another time may not want to display those contents. For example, the user may not be able to check the contents of the information at that point of time but may want to check them later.
Aspects of non-limiting embodiments of the present disclosure relate to enabling display that reflects the user's wish at each point of time, as compared with a case where the contents of information are displayed as an augmented-reality image in front of the field of view at the same time that the existence of the information is detected.
Aspects of certain non-limiting embodiments of the present disclosure address the above advantages and/or other advantages not described above. However, aspects of the non-limiting embodiments are not required to address the advantages described above, and aspects of the non-limiting embodiments of the present disclosure may not address advantages described above.
According to an aspect of the present disclosure, there is provided an information processing apparatus including: a processor configured to: when existence of predetermined information is detected, inquire a user whether to display contents of the information before the contents of the information are displayed as an augmented-reality image in front of a user's field of view; and control the displaying of the contents of the information by the augmented-reality image according to a user's instruction in response to the inquiry.
Exemplary embodiment(s) of the present disclosure will be described in detail based on the following figures, wherein:
Hereinafter, exemplary embodiments of the present disclosure will be described with reference to the drawings.
Entire System Configuration
The information processing system 1 illustrated in
Here, the term “connected” refers to a state in which communication is possible.
In the case of
Further, one xR device 20 may be connected to plural servers 10, or plural xR devices 20 may be connected to plural servers 10.
For example, a wireless LAN (Local Area Network), the Internet, a mobile communication system such as 4G or 5G, and Bluetooth are used for the communication network 30 in
The server 10 used in the present exemplary embodiment functions as an information processing apparatus that implements an information provision service through cooperation with the xR device 20.
The xR device 20 used in the present exemplary embodiment is a glasses-type device worn by the user on the head. In the xR device 20 used in the present exemplary embodiment, a camera 21 is attached to the central portion of a frame.
The camera 21 is used as an “imaging unit” that captures an image in front of the user's field of view. The viewing angle of the camera 21 is substantially equal to, or greater than, the viewing angle of a person.
However, the camera 21 may be a device capable of capturing a panoramic image or other wide-angle image, or may be a device capable of capturing an image of the entire celestial sphere or half celestial sphere. The panoramic image may be an image that captures 360° in the horizontal direction.
The xR device 20 with the camera 21 is also called an AR glass or an MR (Mixed Reality) glass. In the present exemplary embodiment, the glasses-type devices are collectively referred to as an “xR device”. However, the appearance of the xR device 20 is not limited to the glasses type illustrated in
In the present exemplary embodiment, a two-dimensional image is assumed as the AR image, but a three-dimensional image may also be used as the AR image. The three-dimensional image is an image in which distance information is recorded for each pixel, and is also called a “range image”. As for the camera 21, a stereo camera or a LiDAR (Light Detection and Ranging) may be used to acquire the three-dimensional image.
Server Configuration
The server 10 illustrated in
Here, a device compliant with a protocol used for communication via the communication network 30 is used for the communication module 120.
The server 10 may be additionally provided with a display, a keyboard, and a mouse.
The data processing unit 100 includes a processor 101, a Read Only Memory (ROM) 102, and a Random-Access Memory (RAM) 103.
Both the ROM 102 and the RAM 103 are semiconductor memories. The ROM 102 stores a Basic Input Output System (BIOS). The RAM 103 is used as a main storage device used for executing a program. For example, a Dynamic RAM (DRAM) is used for the RAM 103.
The processor 101 includes, for example, a Central Processing Unit (CPU). The processor 101 implements various functions through the execution of a program.
The processor 101 illustrated in
The image analyzer 101A corresponds to a function of analyzing an image obtained by capturing the direction of the user's line of sight (hereinafter, also referred to as “front of the field of view”).
The image analyzer 101A has a first analysis function of detecting the existence of predetermined information and a second analysis function of detecting the timing of outputting information to a user.
The first analysis function is used, for example, to detect the existence of predetermined information as the provision target, among various types of information included in the image obtained by capturing the direction of the user's line of sight.
In the present exemplary embodiment, the information as the provision target is registered in an inquiry target table 112.
The second analysis function is used, for example, to detect that the landscape or environment of the image obtained by capturing the direction of the user's line of sight corresponds to the timing of providing information. In the present exemplary embodiment, the timing of providing information is registered in a provision timing table 113.
The inquiry controller 101B corresponds to a function of controlling (i) a process of, when the existence of the information as the provision target is detected, inquiring the user whether to display the detected information before displaying the contents of the detected information in front of the user's field of view and (ii) a process of receiving a response to the inquiry from the user.
The inquiry controller 101B may know the existence of the information as the provision target through the analysis result of the image analyzer 101A or may know it by a notification from the xR device 20.
In the present exemplary embodiment, the result of recognition of characters included in the image is given from the xR device 20. However, the characters included in the image may be recognized by analysis of the image by the server 10.
The output controller 101C corresponds to a function of controlling the output of the contents of the corresponding information when a scene corresponding to the output timing is detected by the second analysis function.
In the present exemplary embodiment, the contents of the information are output either by displaying them as the AR image or by outputting them by voice.
In the case of output by the AR image, the output controller 101C in the present exemplary embodiment generates the AR image according to the contents of the information and controls its output position. When the contents of the information are output by voice, the output controller 101C generates a voice file corresponding to the contents of the information.
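As an illustration of the division of roles among the image analyzer 101A, the inquiry controller 101B, and the output controller 101C, the following is a minimal sketch in Python. The class and method names, the ask and wait_for_answer callables, and the print stand-ins for AR rendering and voice output are assumptions introduced for the example and do not appear in the disclosure.

```python
from dataclasses import dataclass

@dataclass
class DetectedInfo:
    kind: str   # e.g. "operation information" or "advertisement"
    text: str   # character string recognized in the captured image

class ImageAnalyzer:
    """Corresponds to the image analyzer 101A."""
    def __init__(self, provision_targets: set[str]):
        self.provision_targets = provision_targets

    def detect_information(self, recognized: list[DetectedInfo]) -> list[DetectedInfo]:
        # First analysis function: keep only information registered as a provision target.
        return [info for info in recognized if info.kind in self.provision_targets]

    def scene_matches_timing(self, scene_label: str, timing_label: str) -> bool:
        # Second analysis function: does the captured scene match a registered provision timing?
        return scene_label == timing_label

class InquiryController:
    """Corresponds to the inquiry controller 101B."""
    def inquire(self, info: DetectedInfo, ask, wait_for_answer) -> str | None:
        ask(f"Information on {info.kind} was found. Display it?")
        return wait_for_answer()

class OutputController:
    """Corresponds to the output controller 101C."""
    def output(self, info: DetectedInfo, form: str) -> None:
        if form == "AR image":
            print(f"[AR image] {info.text}")   # stand-in for AR rendering
        else:
            print(f"[voice] {info.text}")      # stand-in for voice output
```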
The hard disk drive 110 is an auxiliary storage device using a magnetic disk as a recording medium. In the present exemplary embodiment, the hard disk drive 110 is used as the auxiliary storage device, but a non-volatile rewritable semiconductor memory may be used as the auxiliary storage device. An operating system and application programs are installed on the hard disk drive 110.
In the following, the operating system and the application programs are referred to as “program” without distinguishing between them.
Image data 111, the inquiry target table 112, and the provision timing table 113 are recorded in the hard disk drive 110 illustrated in
In the present exemplary embodiment, the image data 111 is captured by the camera 21 (see
In the present exemplary embodiment, a moving image is used as the image data 111. However, a still image captured at a predetermined cycle may be used as the image data 111. The capturing cycle is, for example, 5 seconds. Further, the image data 111 may be captured when a change in the direction of the head is detected. The change in the direction of the head may be detected, for example, by a change in acceleration.
In
The presence or absence of a “user setting” is recorded for each content.
The “user setting” here is associated with each user who uses the information provision service. Therefore, even when certain information is detected, it may or may not be provided depending on the user.
In the case of
This setting may be changed at any time.
In
In the case of
In the case of
Further, in the case of
In
These settings are used when the user permits the provision of the contents of the information and there is no instruction regarding the output form. When the output form instructed by the user is different from a set output form, the user's instruction is given priority.
In the case of
Further, as for “time designation”, “19:00” is designated by the user. Similarly, as for “elapsed time designation”, “30 minutes later” is designated by the user.
Each of these times may be registered as a response to an inquiry, or may be registered in advance.
Further, as for “when browsing a specific homepage on a terminal”, “AB store” and “CD store” are registered as specific homepages. Browsing of a specific homepage is detected by analysis of the image data 111 (see
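For illustration, the following sketch shows one possible in-memory shape for the inquiry target table 112 and the provision timing table 113. Only the values quoted in the text (“19:00”, “30 minutes later”, “AB store”, “CD store”) are taken from the disclosure; the field names and the remaining entries are assumptions.

```python
# Hypothetical layout of the two tables, keyed by user. Field names are assumptions.
inquiry_target_table = {
    "user_001": {
        "operation information": True,   # user setting present: inquire when detected
        "advertisement": True,
        "event notice": False,           # user setting absent: do not inquire
    },
}

provision_timing_table = {
    "user_001": {
        "output_form": "AR image",                       # or "voice"
        "time_designation": "19:00",
        "elapsed_time_designation": "30 minutes later",
        "specific_homepages": ["AB store", "CD store"],  # display when browsing these pages
        "display_when_user_stops": True,
    },
}
```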
xR Device Configuration
The xR device 20 illustrated in
The data processing unit 200 includes a processor 201, a ROM 202, a RAM 203, and a flash memory 204.
The ROM 202, the RAM 203, and the flash memory 204 are all semiconductor memories. A BIOS is stored in the ROM 202. The RAM 203 is a main storage device used for executing a program. For example, a DRAM is used for the RAM 203.
The flash memory 204 is used for recording firmware, programs, and data files. The flash memory 204 is used as an auxiliary storage device.
The processor 201 includes, for example, a CPU. The processor 201 implements various functions through the execution of a program.
The processor 201 implemented in
The character recognition unit 201A corresponds to a function of recognizing characters included in an image obtained by capturing the direction of the user's line of sight.
The character recognition unit 201A in the present exemplary embodiment converts a recognized character string into a text string and outputs the text string. In the present exemplary embodiment, the character recognition is executed entirely by the processor 201, but the text string may also be output in cooperation with another device having a character recognition function. An artificial intelligence technology may be applied to the character recognition.
Further, the server 10 (see
Further, a server specialized in character recognition may be used as another device.
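The disclosure does not specify how the character recognition itself is implemented. The following is a minimal sketch assuming the open-source pytesseract wrapper for the Tesseract OCR engine is used as the recognition back end; the library choice, the function name, and the language setting are assumptions.

```python
from PIL import Image
import pytesseract  # assumes the Tesseract OCR engine is installed

def recognize_characters(image_path: str) -> str:
    # Recognize the characters contained in the captured frame and return them
    # as a text string, as the character recognition unit 201A is described as doing.
    return pytesseract.image_to_string(Image.open(image_path), lang="jpn+eng")
```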
For example, a CMOS image sensor or a CCD image sensor is used for the camera 21. The number of cameras 21 may be one or more. In the example of
For example, when two cameras 21 are used, the two cameras 21 are arranged at both ends of the front portion of a frame. By using the two cameras 21, stereo imaging becomes possible, which makes it possible to measure a distance to a subject and estimate the anteroposterior relationship between subjects.
The AR module 220 is a module that implements the visual recognition of augmented reality obtained by synthesizing an AR image with a real landscape, and includes optical components and electronic components.
Typical methods of the AR module 220 include a method of arranging a half mirror in front of the user's eye, a method of arranging a volume hologram in front of the user's eye, and a method of arranging a blazed diffraction grating in front of the user's eye.
The microphone 230 is a device that converts a user's voice and ambient sound into an electric signal.
The speaker 240 is a device that converts an electric signal into sound and outputs the sound. The speaker 240 may be a bone conduction speaker or a cartilage conduction speaker.
The speaker 240 may be a device independent of the xR device 20, such as a wireless earphone. In this case, the speaker 240 is connected to the xR device 20 by Bluetooth or the like.
The inertial sensor 250 includes, for example, a 6-axis sensor. The 6-axis sensor includes a 3-axis acceleration sensor and a 3-axis angular velocity sensor. The motion of the user wearing the xR device 20 is estimated from the output of the inertial sensor 250. In the present exemplary embodiment, the output of the inertial sensor 250 is used to detect that the user has stopped.
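The disclosure only states that the user's stop is detected from the output of the inertial sensor 250, not how. One simple heuristic, sketched below, is to watch the variance of the acceleration magnitude over a short window; the window length, the sampling rate, and the threshold are assumptions to be tuned empirically.

```python
from collections import deque
import math
import statistics

WINDOW = 50            # samples, e.g. one second at 50 Hz
STOP_THRESHOLD = 0.05  # variance of |a|; below this the user is treated as stopped

_recent = deque(maxlen=WINDOW)

def user_has_stopped(ax: float, ay: float, az: float) -> bool:
    # Feed one 3-axis acceleration sample; returns True once recent motion is small enough.
    _recent.append(math.sqrt(ax * ax + ay * ay + az * az))
    if len(_recent) < WINDOW:
        return False
    return statistics.pvariance(_recent) < STOP_THRESHOLD
```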
The positioning sensor 260 is a GPS module that measures the position of its own terminal by receiving GPS signals transmitted from, for example, three or more Global Positioning System (GPS) satellites. Positioning by receiving GPS signals transmitted from GPS satellites is limited to outdoor use.
Further, an indoor positioning module may be separately prepared as the positioning sensor 260. The indoor positioning module includes, for example, a module that receives a Bluetooth Low Energy (BLE) beacon and determines the position of its own terminal, a module that receives a Wi-Fi (registered trademark) signal and determines the position of its own terminal, a module that determines the position of its own terminal by autonomous navigation, and a module that receives an Indoor Messaging System (IMES) signal and determines the position of its own terminal.
The vibrator 270 is a device that generates continuous vibration or intermittent vibration. In the present exemplary embodiment, the vibration of the vibrator 270 is used, for example, for the purpose of notifying the user that the existence of information that may be provided has been detected.
A device compliant with the protocol of the communication network 30 is used for the communication module 280, which is also used for communication with an external device. For example, Wi-Fi (registered trademark) or Bluetooth (registered trademark) is used for communication with the external device.
The AR module 220 illustrated in
The light guide plate 221 corresponds to a lens of eyeglasses. The light guide plate 221 has a transmittance of, for example, 85% or more. Therefore, the user may directly see the scenery in front of the light guide plate 221 through the light guide plate 221. External light L1 travels straight so as to pass through the light guide plate 221 and the diffraction grating 223B, and is incident on the user's eye E.
The micro display 222 is a display device that displays an AR image to be visually recognized by the user. The light of the AR image displayed on the micro display 222 is projected as the image light L2 onto the light guide plate 221. The image light L2 is refracted by the diffraction grating 223A and reaches the diffraction grating 223B while being reflected inside the light guide plate 221. The diffraction grating 223B refracts the image light L2 in the direction of the user's eye E.
This causes the external light L1 and the image light L2 to be simultaneously incident on the user's eye E. As a result, the user recognizes the presence of the AR image in front of the line of sight.
Processing Operation
Hereinafter, a processing operation executed by cooperation between the server 10 and the xR device 20 will be described with reference to
The processing operation illustrated in
In the present exemplary embodiment, it is assumed that the processing operation illustrated in
The processor 101 immediately acquires image data of the camera 21 (see
Next, the processor 101 analyzes the image data to detect predetermined information (step 2). The predetermined information is detected by collation with the inquiry target table 112 (see
When the predetermined information is detected, the processor 101 inquires the user whether to display the contents of the information (step 3). In the present exemplary embodiment, a method that does not use vision, such as sound or voice, is used for the inquiry. The sound or voice is output from the speaker 240 (see
Next, the processor 101 determines whether an instruction has been detected within a predetermined time (step 4).
In the present exemplary embodiment, voice is assumed as an instruction from the user. The predetermined time is preferably set for each user. However, when there is no user setting, an initial value is used. For example, 10 seconds is used for the predetermined time.
In consideration of a case where the user does not notice the inquiry, the inquiry in step 3 may be executed again when the predetermined time first elapses without an instruction.
When the user's instruction is not detected even after the predetermined time has elapsed, the processor 101 obtains a negative result in step 4. This negative result also covers a tacit refusal, that is, the case where a user who does not want the display simply ignores the inquiry.
When the negative result is obtained in step 4, the processor 101 ends the process without providing the contents of the detected information to the user.
On the other hand, when the user's instruction is detected within the predetermined time, the processor 101 obtains an affirmative result in step 4.
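Steps 3 and 4 can be summarized as a simple wait-with-timeout loop. The sketch below assumes hypothetical helpers ask_by_voice() and poll_voice_instruction(); the 10-second default and the single re-inquiry follow the description above.

```python
import time

def inquire_and_wait(ask_by_voice, poll_voice_instruction,
                     timeout_s: float = 10.0, retry_once: bool = True):
    ask_by_voice("Display information?")          # step 3: non-visual inquiry
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:            # step 4: wait for an instruction
        instruction = poll_voice_instruction()
        if instruction is not None:
            return instruction                    # affirmative result in step 4
        time.sleep(0.1)
    if retry_once:                                # repeat the inquiry once in case it was not noticed
        return inquire_and_wait(ask_by_voice, poll_voice_instruction,
                                timeout_s, retry_once=False)
    return None                                   # negative result: the process ends without display
```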
When the affirmative result is obtained in step 4, the processor 101 determines whether there is an instruction to display by an AR image (step 5).
When an affirmative result is obtained in step 5, the processor 101 determines whether “display immediately” is instructed (step 6). The instruction here may be given by the user each time, or may be set in the provision timing table 113.
When an affirmative result is obtained in step 6, the processor 101 displays an AR image (step 7). This AR image contains the content of the detected information.
On the other hand, when a negative result is obtained in step 6, the processor 101 determines whether a display condition is that the user stops (step 8). In this case as well, an instruction to display at the time when the user stops, that is, when the user stops walking, may be given from the user each time, or may be set in the provision timing table 113.
When an affirmative result is obtained in step 8, the processor 101 determines whether the user has stopped (step 9).
In the present exemplary embodiment, the processor 101 repeats the determination in step 9 while a negative result is obtained in step 9.
Eventually, when the user stops and an affirmative result is obtained in step 9, the processor 101 displays an AR image (step 10). When the output condition is changed while the determination in step 9 is repeated, the arrival of the changed condition is determined. For example, when the user gives an instruction to “display immediately” while the determination in step 9 is repeated, the AR image is immediately displayed.
When a negative result is also obtained in step 8, that is, when the contents of the user's instruction are neither “display immediately” nor “a display condition is that the user stops”, the processor 101 determines whether the timing designated by the user has arrived (step 11).
This timing may be specifically instructed by the user each time, or may be set in the provision timing table 113.
In any case, the processor 101 repeats the determination in step 11 while a negative result is obtained in step 11.
Eventually, when the provision timing arrives, the processor 101 displays an AR image (step 12). When the output condition is changed while the determination in step 11 is repeated, the arrival of the changed condition is determined. For example, when the user gives an instruction to “display immediately” while the determination in step 11 is repeated, the AR image is immediately displayed.
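The branch for display by the AR image (steps 5 to 12) can be sketched as the loop below. Because the output condition may change while waiting, the current condition is re-read on every pass, so an instruction such as “display immediately” given during the wait takes effect at once. The helper names are assumptions.

```python
import time

def run_ar_display_branch(info, get_current_condition, user_has_stopped,
                          designated_timing_arrived, display_ar_image):
    while True:
        condition = get_current_condition()          # re-read: the condition may have been changed
        if condition == "display immediately":       # steps 6 and 7
            display_ar_image(info)
            return
        if condition == "when the user stops":       # steps 8 to 10
            if user_has_stopped():
                display_ar_image(info)
                return
        elif designated_timing_arrived(condition):   # steps 11 and 12
            display_ar_image(info)
            return
        time.sleep(0.5)                              # keep waiting for the condition to be satisfied
```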
Subsequently, a case where a negative result is obtained in step 5 will be described. The processing operation in this case is illustrated in
When the negative result is obtained in step 5, the processor 101 determines whether there is an instruction to provide by voice (step 21).
When the user explicitly utters a word such as “unnecessary”, the processor 101 obtains a negative result in step 21. In this case, the processor 101 ends the process without providing the contents of the detected information to the user.
On the other hand, when an affirmative result is obtained in step 21, the processor 101 converts the contents of the information into a voice file (step 22).
Subsequently, the processor 101 determines whether “play immediately” has been instructed (step 23). The instruction here may be given by the user each time, or may be set in the provision timing table 113.
When an affirmative result is obtained in step 23, the processor 101 plays the voice file (step 24). In this case, a voice is output from the speaker 240 (see
On the other hand, when a negative result is obtained in step 23, the processor 101 determines whether a playback condition is that the user stops (step 25). In this case as well, an instruction to play at the time when the user stops, that is, when the user stops walking, may be given from the user each time, or may be set in the provision timing table 113.
When an affirmative result is obtained in step 25, the processor 101 determines whether the user has stopped (step 26).
In the present exemplary embodiment, the processor 101 repeats the determination in step 26 while a negative result is obtained in step 26. When the output condition is changed while the determination in step 26 is repeated, the arrival of the changed condition is determined. For example, when the user gives an instruction to “play immediately” while the determination in step 26 is repeated, the voice is immediately played.
Eventually, when the user stops and an affirmative result is obtained in step 26, the processor 101 plays the voice file (step 27).
When a negative result is also obtained in step 25, that is, when the contents of the user's instruction are neither “play immediately” nor “a playback condition is that the user stops”, the processor 101 determines whether the timing designated by the user has arrived (step 28). This timing may be specifically instructed by the user each time, or may be set in the provision timing table 113.
In any case, the processor 101 repeats the determination in step 28 while a negative result is obtained in step 28.
Eventually, when the provision timing arrives, the processor 101 plays the voice file (step 29). When the playback condition is changed while the determination in step 28 is repeated, the arrival of the changed condition is determined. For example, when the user gives an instruction to “play immediately” while the determination in step 28 is repeated, the voice is immediately played.
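The voice branch (steps 21 to 29) mirrors the AR branch except that the contents are first converted into a voice file in step 22. The disclosure does not name a speech synthesis engine; the sketch below uses the offline pyttsx3 library purely as an example, and the file name is an assumption.

```python
import pyttsx3

def contents_to_voice_file(text: str, path: str = "information.wav") -> str:
    # Step 22: convert the contents of the information into a voice file for later playback.
    engine = pyttsx3.init()
    engine.save_to_file(text, path)
    engine.runAndWait()
    return path
```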
Hereinafter, examples of presentation of the contents of information according to the present exemplary embodiment will be described with reference to
Here, it is assumed that the contents of the operation information posted on an electric bulletin board in a station yard are provided as an AR image to a user.
In the case of
The camera 21 of the xR device 20 is capturing an image in front of the user. Therefore, the camera 21 captures the electric bulletin board 43 on the way to the automatic ticket gate 41. In this example, the image captured by the camera 21 is uploaded to and analyzed by the server 10 (see
As a result of the image analysis, when the existence of the operation information notifying a train delay is detected, the xR device 20 outputs a voice that inquires the user whether to display the contents of the information, as illustrated in
In the example of
In this way, in the present exemplary embodiment, the user is not provided with all the contents of the detected information every time information is detected; instead, the user is first asked whether the information may be displayed.
The reason for making an inquiry before the contents of the information are displayed is that the user may not want the AR image to be displayed immediately, for example, when walking in a crowd or when it is necessary to check the safety of the surroundings. However, in the present exemplary embodiment, the display of the AR image is not uniformly prohibited; the AR image may be displayed immediately when the user so desires.
That is, in the method of the present exemplary embodiment, the display, non-display, display timing, and the like of the AR image are determined by checking the user's convenience, which may not be determined only from the surrounding environment captured by the camera 21 and the user's motion.
In the case of
The fact that the user has stopped may be determined from the output waveform of the inertial sensor 250 (see
This instruction corresponds to the case where the affirmative result is obtained in step 9 (see
In the case of
In the case of these instructions, the contents of the operation information will not be provided by the AR image until the conditions are satisfied. This instruction corresponds to the case where the negative result is obtained in step 11 (see
In the case of
In the case of
In any case, when the user moves without stopping until he/she arrives at the platform, the AR image will not be displayed in front of the user's field of view while the user approaches the automatic ticket gate 41, while the user passes through the automatic ticket gate 41, and while the user goes up the stairs to the platform.
In the case of
In general, since the condition of “when going on platform” is first satisfied, at the point of time when the analysis of the image captured by the camera 21 reveals that the user is on the platform, the contents of the detected information are displayed as an AR image.
In the example of
In addition, there is a possibility that the information is updated between the time when the existence of the operation information is detected and the time when the condition is satisfied. Therefore, the processor 201 (see
When the network environment is poor and it is difficult to access the above-mentioned homepage, the AR image generated by the server 10 (see
Further, when the character recognition unit 201A (see
With the above functions, the contents of the operation information may be displayed in front of the eyes at the timing designated by the user.
In
In the case of
In this way, when the contents of the information are related to a specific direction, place, or region, and the direction of the user's line of sight specified by image analysis is related to that direction, place, or region, the contents of the information associated with the direction of the user's line of sight are displayed as the AR image to the user.
By providing this function in the server 10 or the xR device 20, a mismatch between the displayed contents and the direction of the user's line of sight is reduced.
Surely, even when the image captured by the camera 21 is analyzed, the relevance to the contents of the information may not be known. In such a case, delay information about both the direction of Yokohama and the direction of Shinagawa is displayed as the AR image regardless of whether the user is looking at the train bound for Yokohama or the train bound for Shinagawa.
In the case of
Therefore, in the example of
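The selection of contents according to the direction of the user's line of sight can be sketched as below, following the Yokohama/Shinagawa platform example. The destination_in_view value stands for a hypothetical result of analyzing the captured image; when no relevance can be established, the information for both directions is returned, as described above.

```python
def select_delay_info(delay_info_by_destination: dict[str, str],
                      destination_in_view: str | None) -> dict[str, str]:
    if destination_in_view in delay_info_by_destination:
        # The line of sight is related to one destination, so only that content is displayed.
        return {destination_in_view: delay_info_by_destination[destination_in_view]}
    # Relevance unknown: display the delay information for both directions.
    return dict(delay_info_by_destination)

# Example: select_delay_info({"Yokohama": "10 min delay", "Shinagawa": "on time"}, "Yokohama")
```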
Here, it is assumed that the contents of an advertisement posted near the entrance of a commercial building are provided as an AR image to the user.
In the case of
Since the user is walking toward the entrance 62, it is difficult to notice the existence of the signboard 63.
The camera 21 of the xR device 20 is capturing an image in front of the user. Therefore, the signboard 63 is captured by the camera 21 on the way to the entrance 62 of the commercial building 61. In this example, an image obtained by the camera 21 is uploaded to and analyzed by the server 10 (see
When the existence of the signboard 63 is detected as a result of the image analysis, the xR device 20 outputs a voice inquiring the user whether to display the contents of the information, as illustrated in
In the example of
It is also possible to change the expression according to the contents of the information or according to the number of times the voice is output. In the former case, the difference in expression makes it possible to predict the contents of the information. However, when there are too many expressions, it becomes difficult to predict the contents of the information. In the latter case, it is possible to prevent the contents of the inquiry from becoming monotonous.
In the example of
The xR device 20 that receives an instruction “immediately” acquires an AR image, which is generated by the image analysis, from the server 10 (see
In this example, even when the distance between the user and the signboard 63 is long and the contents described on the signboard 63 cannot be read, the user may be made aware of the contents of the advertisement.
In the example of
For example, when “when the surroundings are not crowded” is set by the user as the condition for displaying the AR image (see
Here, a modification to example 1 or example 2 will be described.
In the case of
As a result, in the example of
This instruction corresponds to a case where the affirmative result is obtained in step 23 (see
In this example, since the method of providing the contents of the information to the user by voice is adopted, the user's field of view is not obstructed by the AR image. Therefore, even when there are other passersby around the user, the user may know the contents of the signboard 63 posted on the commercial building 61 in the direction of the line of sight.
In the example of
In addition to “on the store”, a station or other places may be added, or the contents suggesting the type of information such as “traffic” and “regulation” may be added.
In any case, at the inquiry stage, the burden on the user is reduced by simplifying the contents as much as possible.
This is because, when the contents of the inquiry are detailed, the inquiry is no different from providing the contents of the information. At the inquiry stage, it is sufficient to inform the user of the existence of information that may be provided. In that sense, the inquiry may be “display information?” or simply “display?”.
The user in
Here, a case where plural pieces of information exist in the image captured by the camera 21 (see
In the case of
The face of the user in
In
When looking sideways in the direction of the station while walking, even when the electric bulletin board 43 and the signboard 63 are included in the user's field of view, the characters and the like are not read unless the user looks at them closely. Therefore, the operation information and the contents of the advertisement are not noticed.
Meanwhile, the range captured by the camera 21 includes the electric bulletin board 43 and the signboard 63, and the characters are also captured as an image. The captured image is uploaded from the xR device 20 to the server 10 (see
As a result of the image analysis, the existence of the operation information and the existence of the advertisement are detected. After that, the xR device 20 outputs an inquiry to the user as to whether to display the contents of information.
In the case of
Therefore, the user will be aware of the existence of both the station information and the store information.
In the case of
As illustrated in
In the second exemplary embodiment, an example in which a gesture is used for an instruction will be described.
In
One finger is allocated to the instruction to “display immediately”. Since all that matters is that the number of fingers is one, the finger to be used does not have to be the index finger. For example, the finger may be a thumb or a little finger.
Further, in this example, the direction of the finger has no meaning, and the direction of the fingertip may be leftward or downward. The number of fingers is detected by analysis of an image captured by the camera 21.
In
In the case of
In the example of
In the case of
In the case of
Since it is unnecessary, an AR image is not displayed in front of the user's line of sight in
The instruction for the inquiry is not limited to the number of fingers.
For example, the direction of the finger may be used for the instruction. For example, when the fingertip points to the left, it means “display immediately”. When the fingertip points up, it means “display later”. When the fingertip points down, it means “unnecessary”.
In addition, the movement of fingers or a hand may be used as an instruction. For example, when a finger or a hand is raised up, it means “display immediately”. When a finger or a hand is thrust forward, it means “display later”. When the hand is moved left or right, it means “unnecessary”. Further, the action of clenching the hand or the action of opening the hand may be used for the instruction, or the number of times the clenching and opening actions are repeated may be used for the instruction.
In addition, the motion of fingers or a hand may be combined with the speed of the motion of fingers or a hand. For example, when a finger or a hand is raised quickly, it may mean “display immediately”. When a finger or a hand is raised slowly, it may mean “display later”.
Moreover, a shape made by fingers may be used for the instruction. For example, “display immediately” may be allocated to a ring shape, and “display later” may be allocated to a C shape.
Further, the right hand may be used for “display immediately”, and the left hand may be used for “display later”. Surely, the allocation of the instruction may be reversed left and right.
Moreover, both the right hand and the left hand may be used for the instruction. For example, when the user makes a circle shape with both hands, it may mean “display immediately”. When the user makes a heart shape with both hands, it may mean “display later”.
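The gesture-based instruction can be reduced to a lookup from a detected finger count to an instruction. Only the allocation of one finger to “display immediately” is stated in the text; the counts assigned to the other two instructions below are assumptions consistent with the alternatives listed.

```python
FINGER_COUNT_TO_INSTRUCTION = {
    1: "display immediately",   # stated in the text
    2: "display later",         # assumed allocation
    3: "unnecessary",           # assumed allocation
}

def instruction_from_finger_count(finger_count: int) -> str | None:
    # finger_count is assumed to come from analysis of the image captured by the camera 21.
    return FINGER_COUNT_TO_INSTRUCTION.get(finger_count)
```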
In a third exemplary embodiment, an example in which a button provided on the xR device 20 is used as an instruction will be described.
In the xR device 20 illustrated in
“Display immediately” is allocated to a single tap of the button 22 illustrated in
Surely, the number of taps and the allocation of the instruction contents are merely examples. Further, when a sensor that detects contact is used, the length of time during which a finger is in contact may be allocated to the instruction contents. For example, when the contact time is less than 1 second, it is regarded as “display immediately”. When the contact time is 1 second or more, it is regarded as “display later”.
The instruction using the button 22 is particularly useful when there are many people around and an instruction by voice or gesture is not appropriate.
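The button-based instruction of the third exemplary embodiment can be sketched in the same way. The single-tap allocation and the 1-second contact-time boundary follow the text; the double-tap allocation is an assumption.

```python
def instruction_from_taps(tap_count: int) -> str | None:
    if tap_count == 1:
        return "display immediately"   # stated allocation for a single tap
    if tap_count == 2:
        return "display later"         # assumed allocation for a double tap
    return None

def instruction_from_contact_time(seconds: float) -> str:
    # A contact shorter than 1 second is regarded as "display immediately",
    # 1 second or more as "display later".
    return "display immediately" if seconds < 1.0 else "display later"
```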
In a fourth exemplary embodiment, an example of using an AR image for an inquiry will be described.
In the case of
In the case of
In the example of
The user who confirms these marks gives an instruction as to whether to permit display. An AR image displaying the mark here is also an example of a second augmented-reality image.
In the example of
Further, in
Further, when displaying the inquiry as an AR image, the inquiry may be displayed avoiding the center of the field of view. Since the user's interest is unknown at the inquiry stage, the inquiry is displayed around the field of view so as not to obstruct the user's field of view. However, as the inquiry is closer to the periphery of the field of view, it becomes harder to notice. Thus, it is also possible to place it in the center of the field of view. When the AR image is used for the inquiry, it is desirable that the user designates a position at which the AR image is to be displayed, in advance.
The xR device 20 used in a fifth exemplary embodiment vibrates the vibrator 270 (see
The xR device 20 used in the present exemplary embodiment notifies the user, with vibration, that the information registered in the inquiry target table 112 has been detected (see
The type of vibration may be changed according to the contents of information. For example, for the operation information on a transportation, a single vibration may be used, and for an advertisement, a double vibration may be used. Further, a pattern of vibration defined by the strength and length of the vibration may be changed.
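Changing the type of vibration according to the contents of the information can be sketched as below, with a single vibration for operation information and a double vibration for an advertisement as described. The vibrate callable and the durations are assumptions standing in for the driver of the vibrator 270.

```python
import time

def notify_by_vibration(content_type: str, vibrate) -> None:
    pulses = 1 if content_type == "operation information" else 2
    for _ in range(pulses):
        vibrate(duration_ms=200)   # strength and length could also be varied per content type
        time.sleep(0.2)
```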
In
The xR device 20A used in the present exemplary embodiment is provided with the function of the server 10 (see
A processor 201 illustrated in
The image data 111, the inquiry target table 112, and the provision timing table 113 are stored in the flash memory 204.
The xR device 20A used in the present exemplary embodiment alone executes detection, inquiry, and output timing control of information to be displayed as an AR image. Therefore, the xR device 20A may provide a service even when communication with the server 10 is disconnected.
The xR device 20A in the present exemplary embodiment is an example of an information processing apparatus.
An xR device 20C used in
In the seventh exemplary embodiment, the camera 21 is, for example, removable and is attached to a user's clothing or hat. However, a direction in which the camera 21 captures an image is aligned with the direction of the user's line of sight.
The camera 21 is wirelessly connected to the xR device 20C. For example, the camera 21 is connected to the xR device 20C by Bluetooth.
The xR device 20C here is also called a smart glass.
In the information processing system 1B illustrated in
In the case of
In the example of
The terminal 40 is an example of an information processing apparatus.
In a ninth exemplary embodiment, a case where information is provided by using position information on the xR device 20 will be described.
In the present exemplary embodiment, when there is information associated with position information measured by the positioning sensor 260 (see
The processing operation illustrated in
The processor 101 (see
Next, the processor 101 detects information associated with the user's current position (step 32). The position information is stored in the hard disk drive 110 (see
When information corresponding to the position of the user exists, the processor 101 inquires the user whether to display the contents of the information (step 33).
The subsequent processing operations are the same as the step 4 (see
In the present exemplary embodiment, it is not necessary to capture the direction of the user's line of sight with the camera 21 (see
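Step 32 of the ninth exemplary embodiment amounts to looking up registered information whose associated position covers the user's current position. A minimal sketch follows; the record layout, the matching radius, and the planar distance approximation are assumptions.

```python
import math

def find_information_for_position(lat: float, lon: float, records) -> list:
    # records: iterable of (latitude, longitude, radius_in_meters, contents) tuples.
    hits = []
    for rec_lat, rec_lon, radius_m, contents in records:
        # Rough planar distance in meters; adequate for radii of tens of meters.
        d = math.hypot((lat - rec_lat) * 111_000,
                       (lon - rec_lon) * 111_000 * math.cos(math.radians(lat)))
        if d <= radius_m:
            hits.append(contents)
    return hits
```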
(1) Although the exemplary embodiments of the present disclosure have been described above, the technical scope of the present disclosure is not limited to the scope described in the above-described exemplary embodiments. It is clear from the description of the claims that various modifications or improvements to the above-described exemplary embodiments are also included in the technical scope of the present disclosure.
(2) In the above-described exemplary embodiments, it is assumed that the camera 21 (see
(3) In the above-described exemplary embodiments, when the information included in the image obtained by capturing the surroundings of the user or the information associated with the user's position satisfies a predetermined condition, the user is inquired whether to display the information. However, the user may also be inquired whether to display the contents of a received e-mail or alarm. A server that sends the e-mail or alarm here is an example of an external device. A notification such as the e-mail or alarm occurs independently of the user's position and surrounding environment.
In addition, the external device includes a server that displays, for example, an automatic ticket gate, a digital signage arranged in a street, and characters of a game, as an AR image, on a user's terminal. The notification here is a notification of the contents of information associated with the user's position. A place where the automatic ticket gate is installed, a place where a terminal of the digital signage is installed, and a place where the characters of the game appear are examples of specific places.
(4) In the above-described first exemplary embodiment, the character recognition unit 201A (see
(5) In the above-described exemplary embodiments, the contents of information are output by the AR image or the voice, but the contents of information may be output by using both the AR image and the voice.
(6) In the embodiments above, the term “processor” refers to hardware in a broad sense. Examples of the processor include general processors (e.g., CPU: Central Processing Unit) and dedicated processors (e.g., GPU: Graphics Processing Unit, ASIC: Application Specific Integrated Circuit, FPGA: Field Programmable Gate Array, and programmable logic device).
In the embodiments above, the term “processor” is broad enough to encompass one processor or plural processors in collaboration which are located physically apart from each other but may work cooperatively. The order of operations of the processor is not limited to one described in the embodiments above, and may be changed.
The foregoing description of the exemplary embodiments of the present disclosure has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Obviously, many modifications and variations will be apparent to practitioners skilled in the art. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical applications, thereby enabling others skilled in the art to understand the disclosure for various embodiments and with the various modifications as are suited to the particular use contemplated. It is intended that the scope of the disclosure be defined by the following claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
JP2021-005971 | Jan 2021 | JP | national |