The present disclosure relates to an information processing apparatus, an information processing method, and an information processing program.
Augmented reality (AR) has become widespread as a technology for realizing realistic experiences. AR is a technology for adding, emphasizing, attenuating, or deleting information with respect to the real environment surrounding a user to expand the real space viewed by the user. AR is realized by using, for example, a see-through type head mounted display (hereinafter also referred to as "AR glasses"). According to the AR technology, it is possible to superimpose a virtual object on the scenery in the real space observed by the user through the AR glasses, to emphasize or attenuate the display of a specific real object, to display a specific real object as deleted so that it appears not to exist, and the like.
Meanwhile, Patent Literature 1 discloses a laser marking device that irradiates four surfaces in a real space, namely two opposing side walls, a ceiling, and a floor, with line light indicating a vertical plane using laser light. With this laser marking device, for example, by placing the device on a floor surface, the four surfaces of the walls, the ceiling, and the floor can be irradiated with line light indicating a vertical plane with the floor surface as a reference plane. In interior construction, for example, it is possible to perform construction work such as installing an object in a room or opening a hole in a wall surface, a floor, or a ceiling based on the line light.
Patent Literature 1: JP 2005-010109 A
It is easy to display a line in a virtual space using the AR technology. However, with the conventional AR technology, it is difficult to set a plane that serves as a reference for displaying a line, and it is therefore difficult to present, in the virtual space, a display that serves as a reference for work such as installing an object in the real space. As a result, work such as arranging objects in the real space may be difficult.
An object of the present disclosure is to provide an information processing apparatus, an information processing method, and an information processing program capable of more easily executing work in a real space.
To solve the problem described above, an information processing apparatus according to one aspect of the present disclosure has an acquisition unit configured to acquire motion information indicating a motion of a user; and a display control unit configured to perform display control on a display unit capable of superimposing and displaying a virtual space on a real space, wherein the display control unit specifies a real surface that is a surface in the real space based on the motion information, and displays a region image indicating a region for arranging a virtual object or a real object on a virtual surface that is a surface in the virtual space corresponding to the real surface, according to an azimuth extracted based on the real surface.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. Note that, in the following embodiments, the same parts are denoted by the same reference numerals, and redundant description will be omitted.
Hereinafter, embodiments of the present disclosure will be described in the following order.
First, an outline of the technology according to the present disclosure will be described. The present disclosure relates to an augmented reality (AR) technology, and uses AR glasses including a display unit that can be worn on the head of a user and can display a virtual space superimposed on a real space, and an acquisition unit that acquires motion information indicating a motion of the user. The acquisition unit is realized as one of the functions of the AR glasses.
In the AR glasses, a display control unit for controlling display by the display unit specifies a surface (referred to as a real surface) in the real space based on the motion information acquired by the acquisition unit, and sets a surface (referred to as a virtual surface) corresponding to the specified real surface in the virtual space. The display control unit displays, for example, a region image indicating a region for arranging a real object or a virtual object on the virtual surface according to an azimuth extracted based on the real surface.
In the present disclosure, as described above, the virtual surface for displaying the region image indicating the region for arranging the real object or the virtual object and the azimuth thereof are determined according to the motion of the user. Therefore, the user can easily acquire information on the position and azimuth in the real space or the virtual space. Furthermore, the user can more accurately execute the arrangement of the real object with respect to the real space and the arrangement of the virtual object with respect to the virtual space.
Prior to describing an information processing apparatus according to each embodiment of the present disclosure, a technology applicable to each embodiment will be described.
In
An AR glass system 1b illustrated in
Note that, in
In an AR glass system 1c illustrated in
In
The sensor unit 110 includes the outward camera 1101, an inward camera 1102, a microphone 1103, a posture sensor 1104, an acceleration sensor 1105, and an azimuth sensor 1106.
As the outward camera 1101, for example, an RGB camera capable of outputting a so-called full-color captured image of each color of red (R), green (G), and blue (B) can be applied. The outward camera 1101 is arranged in the AR glasses 10 so as to capture an image in the line-of-sight direction of the user wearing the AR glasses 10. The outward camera 1101 is capable of imaging, for example, a motion of a finger of the user.
Furthermore, the outward camera 1101 may further include at least one of an IR camera including an IR light emitting unit that emits infrared (IR) light and an IR light receiving unit that receives IR light, and a time of flight (TOF) camera for performing distance measurement based on the time difference between light emission timing and light reception timing. In a case where the IR camera is used as the outward camera 1101, a retroreflective material is attached to an object to be captured, such as the back of a hand; the IR camera emits infrared light, and the infrared light reflected from the retroreflective material can be received.
The inward camera 1102 includes, for example, an RGB camera, and is installed to be able to photograph the inside of the AR glasses 10, more specifically, the eye of the user wearing the AR glasses 10. The line-of-sight direction of the user can be detected based on the photographed image of the inward camera 1102.
Image signals of the images captured by the outward camera 1101 and the inward camera 1102 are transferred to the control unit 100.
As the microphone 1103, a microphone using a single sound collection element can be applied. The present invention is not limited thereto, and the microphone 1103 may be a microphone array including a plurality of sound collection elements. The microphone 1103 collects a voice uttered by the user wearing the AR glasses 10 and an ambient sound of the user. A sound signal based on the sound collected by the microphone 1103 is transferred to the control unit 100.
The posture sensor 1104 is, for example, a 3-axis or 9-axis gyro sensor, and detects the posture of the AR glasses 10, for example, roll, pitch, and yaw. The acceleration sensor 1105 detects acceleration applied to the AR glasses 10. The azimuth sensor 1106 is, for example, a geomagnetic sensor, and detects the azimuth in which the AR glasses 10 face. For example, a current position with respect to an initial position of the AR glasses 10 can be obtained based on the detection result of the acceleration sensor 1105 and the detection result of the azimuth sensor 1106. The posture sensor 1104, the acceleration sensor 1105, and the azimuth sensor 1106 may be configured by an inertial measurement unit (IMU).
Each sensor signal output from each of the posture sensor 1104, the acceleration sensor 1105, and the azimuth sensor 1106 is transferred to the control unit 100. The control unit 100 can detect the position and posture of the head of the user wearing the AR glasses 10 based on these sensor signals.
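As a non-limiting sketch of how such a detection could be implemented, the following Python code fuses the gyro, accelerometer, and magnetometer signals with a simple complementary filter; the function name, the blending parameter `alpha`, and the sensor axis conventions are assumptions introduced here for illustration only.

```python
import numpy as np

def update_head_pose(rpy, gyro, accel, mag, dt, alpha=0.98):
    """Complementary-filter update of head roll/pitch/yaw (radians).

    rpy   : current (roll, pitch, yaw) estimate
    gyro  : angular rates from the posture sensor 1104 (rad/s)
    accel : acceleration from the acceleration sensor 1105 (m/s^2)
    mag   : magnetic field vector from the azimuth sensor 1106
    dt    : time step (s)
    """
    # Fast but drifting estimate: integrate the gyro rates.
    roll, pitch, yaw = np.asarray(rpy, float) + np.asarray(gyro, float) * dt

    # Absolute roll/pitch from the gravity direction.
    acc_roll = np.arctan2(accel[1], accel[2])
    acc_pitch = np.arctan2(-accel[0], np.hypot(accel[1], accel[2]))

    # Absolute yaw (azimuth) from the tilt-compensated magnetometer.
    mx = mag[0] * np.cos(acc_pitch) + mag[2] * np.sin(acc_pitch)
    my = (mag[0] * np.sin(acc_roll) * np.sin(acc_pitch)
          + mag[1] * np.cos(acc_roll)
          - mag[2] * np.sin(acc_roll) * np.cos(acc_pitch))
    mag_yaw = np.arctan2(-my, mx)

    # Blend the fast gyro integration with the slow absolute references.
    return np.array([alpha * roll + (1 - alpha) * acc_roll,
                     alpha * pitch + (1 - alpha) * acc_pitch,
                     alpha * yaw + (1 - alpha) * mag_yaw])
```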
The output unit includes a display unit 1201, a sound output unit 1202, and a vibration presentation unit 1203. Note that, here, the left and right display units 1201L and 1201R illustrated in
The display unit 1201 includes a transmissive display installed in front of both eyes or one eye of the user wearing the AR glasses 10, and is used to display the virtual world. More specifically, the display unit 1201 performs display of information (for example, an image of a virtual object), and display of emphasis, attenuation, deletion, or the like of an image of a real object to expand the real space viewed from the user. The display unit 1201 performs a display operation in accordance with a display control signal from the control unit 100. In addition, a mechanism of transparently displaying a virtual space image with respect to a real space image in the display unit 1201 is not particularly limited.
The sound output unit 1202 includes a single sounding element that converts a sound signal supplied from the control unit 100 into a sound as aerial vibration and outputs the sound, or an array of a plurality of sounding elements, and constitutes a speaker or an earphone. The sound output unit 1202 is arranged, for example, on at least one of the left and right ears of the user in the AR glasses 10. The control unit 100 can cause the sound output unit 1202 to output a sound related to the virtual object displayed on the display unit 1201. The present invention is not limited to this, and the control unit 100 can also cause the sound output unit 1202 to output sounds by other types of sound signals.
Under the control of the control unit 100, the vibration presentation unit 1203 generates, for the hand sensor 20, a control signal for giving a stimulus (for example, vibration) to the finger of the user wearing the hand sensor 20.
A communication unit 130 communicates with the hand sensor 20 via wireless communication or wired communication. The communication unit 130 communicates with the hand sensor 20 using, for example, wireless communication by Bluetooth (registered trademark). The communication method by which the communication unit 130 communicates with the hand sensor is not limited to Bluetooth (registered trademark). Furthermore, the communication unit 130 can execute communication via a network such as the Internet. As an example, in the AR glass system 1c illustrated in
A storage unit 140 can store data generated by the control unit 100 and data used by the control unit 100 in a nonvolatile manner.
In
As in the case of the sensor unit 110, the posture sensor 2001, the acceleration sensor 2002, and the azimuth sensor 2003 may be configured by an inertial measurement unit (IMU). In the following description, it is assumed that the posture sensor 2001, the acceleration sensor 2002, and the azimuth sensor 2003 are constituted by an IMU.
The vibrator 2004 is supplied with the control signal generated by the vibration presentation unit 1203 described above, and performs an operation of giving a stimulus (vibration in this example) to the hand of the user wearing the hand sensor 20 according to the control signal.
The IMU 201 is mounted between an MP joint and an IP joint of the first finger (thumb) of a hand 21 by a belt 211 or the like. The IMUs 202 and 203 are respectively mounted by belts 212 and 213 and the like between an MP joint and a PIP joint and between the PIP joint and a DIP joint of the second finger (index finger) of the hand 21. The direction pointed by the second finger can be obtained based on the sensor signals of the two IMUs 202 and 203 worn on the second finger.
More specifically, the control unit 100 can detect the opening angle between the first finger and the second finger, the angle of the PIP joint (second joint) of the second finger, the presence or absence of contact between the fingertips of the first finger and the second finger, and the like based on the sensor signals output from the IMUs 201, 202, and 203. As a result, the control unit 100 can recognize the position and posture (or the form taken by the fingers) of the fingers of the hand 21 of the user and gestures made by the fingers.
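As an illustrative sketch only (the actual recognition method is not limited to this), a pinch-like gesture could be inferred from the segment directions measured by the IMUs 201 to 203 as follows; the thresholds and function names are assumptions.

```python
import numpy as np

def angle_between(v1, v2):
    """Angle (degrees) between two direction vectors."""
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def detect_pinch(dir_thumb, dir_index_proximal, dir_index_distal,
                 open_thresh=15.0, bend_thresh=40.0):
    """Rough pinch-gesture test from the pointing directions of IMUs 201-203.

    dir_thumb          : direction of the thumb segment (IMU 201)
    dir_index_proximal : direction of the index proximal segment (IMU 202)
    dir_index_distal   : direction of the index distal segment (IMU 203)
    """
    opening = angle_between(dir_thumb, dir_index_proximal)          # thumb-index opening angle
    pip_bend = angle_between(dir_index_proximal, dir_index_distal)  # PIP joint angle
    # A small opening angle together with a bent PIP joint suggests
    # that the fingertips of the first and second fingers are in contact.
    return opening < open_thresh and pip_bend > bend_thresh
```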
The hand sensor control unit 204 is attached by being wound around the palm of the hand 21 by a belt 214 or the like. The hand sensor control unit 204 includes a communication unit (not illustrated) that communicates with the AR glasses 10, and the vibrator 2004. The hand sensor control unit 204 transmits each sensor signal output from the IMUs 201 to 203 to the AR glasses 10 by the communication unit. In addition, the hand sensor control unit 204 vibrates the vibrator 2004 according to the control signal generated by the vibration presentation unit 1203 and transmitted from the AR glasses 10, and can thereby give a stimulus to the hand 21 on which the hand sensor 20 is worn.
In the example of
The application execution unit 1001, the head position/posture detection unit 1002, the output control unit 1003, the finger position/posture detection unit 1004, and the finger gesture detection unit 1005 are realized by reading and executing an information processing program stored in the storage unit 140, for example, by a central processing unit (CPU) included in the AR glasses 10 to be described later. Not limited to this, some or all of the application execution unit 1001, the head position/posture detection unit 1002, the output control unit 1003, the finger position/posture detection unit 1004, and the finger gesture detection unit 1005 may be configured by a hardware circuit that operates in cooperation with each other.
In
Furthermore, the AR application can acquire three-dimensional information of the surroundings based on the captured image acquired by the outward camera 1101. In a case where the outward camera 1101 includes a TOF camera, the AR application can acquire surrounding three-dimensional information based on distance information obtained using the function of the TOF camera. Furthermore, the AR application can also analyze the sound signal output from the microphone 1103 and acquire an instruction by utterance of the user wearing the AR glasses 10, for example. Furthermore, the AR application can acquire an instruction by the user based on a gesture detected by the finger gesture detection unit 1005 to be described later.
The application execution unit 1001 further generates a display control signal for controlling display on the display unit 1201, and controls the display operation of the virtual object on the display unit 1201 by the AR application according to the generated display control signal. The virtual object generated by the AR application is arranged around the entire circumference of the user.
The head position/posture detection unit 1002 detects the position and posture of the head of the user based on the sensor signals of the posture sensor 1104, the acceleration sensor 1105, and the azimuth sensor 1106 included in the sensor unit 110 mounted on the AR glasses 10, and further recognizes the line-of-sight direction or the visual field range of the user.
The output control unit 1003 controls outputs of the display unit 1201, the sound output unit 1202, and the vibration presentation unit 1203 based on an execution result of an application program such as an AR application. For example, the output control unit 1003 specifies the visual field range of the user based on the detection result of the head position/posture detection unit 1002, and controls the display operation of the virtual object by the display unit 1201 so that the user can observe the virtual object arranged in the visual field range through the AR glasses 10, that is, so as to follow the motion of the head of the user.
Furthermore, the output control unit 1003 can superimpose and display the image of the virtual space on the image of the real space transmitted through the display units 1201L and 1201R. That is, in the AR glasses 10, the control unit 100 functions as a display control unit that performs display control of superimposing the virtual space on the real space and displaying the virtual space on the display units 1201L and 1201R by the output control unit 1003.
The head position/posture detection unit 1002 detects posture information including motions (θz, θy, θx) of the head of the user 800 in the roll, pitch, and yaw directions and parallel movement of the head based on the sensor signals of the posture sensor 1104, the acceleration sensor 1105, and the azimuth sensor 1106. The output control unit 1003 moves a display field angle of the display unit 1201 with respect to the real space in which the virtual object is arranged so as to follow the posture of the head of the user 800, and displays the image of the virtual object existing at the display field angle on the display unit 1201.
As a more specific example, the output control unit 1003 moves the display field angle so as to cancel the motion of the head of the user by rotation according to the Roll component (θz) of the head movement of the user 800 with respect to a region 802a, movement according to the Pitch component (θy) of the head movement of the user 800 with respect to a region 802b, movement according to the Yaw component (θx) of the head movement of the user 800 with respect to a region 802c, and the like. As a result, the virtual object arranged at the display field angle moved following the position and posture of the head of the user 800 is displayed on the display unit 1201, and the user 800 can observe the real space on which the virtual object is superimposed through the AR glasses 10.
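A minimal sketch of this cancellation, assuming the θz/θy/θx (roll/pitch/yaw) convention described above and virtual object positions anchored in the real-space coordinate system, might look as follows; the function names and the exact rotation order are illustrative assumptions.

```python
import numpy as np

def head_rotation(roll, pitch, yaw):
    """Rotation matrix of the head from roll (theta_z), pitch (theta_y),
    and yaw (theta_x) in radians, following the convention used above."""
    cz, sz = np.cos(roll), np.sin(roll)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return Rz @ Ry @ Rx

def to_display_coords(p_world, head_pos, roll, pitch, yaw):
    """Transform a world-anchored virtual object position into the display
    frame so that the object appears fixed in the real space: the head
    rotation is cancelled by applying its inverse (transpose)."""
    R = head_rotation(roll, pitch, yaw)
    return R.T @ (np.asarray(p_world, float) - np.asarray(head_pos, float))
```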
The function of the control unit 100 will be described with reference to
That is, in the AR glasses 10, the control unit 100 functions as an acquisition unit that acquires, by the finger position/posture detection unit 1004, motion information indicating the motion of the user wearing the AR glasses 10 based on each sensor signal output from the hand sensor 20 and the image captured by the outward camera 1101.
The storage device 1505 is a nonvolatile storage medium such as a flash memory, and implements the function of the storage unit 140 described with reference to
The camera I/F 1503 is an interface for the outward camera 1101 and the inward camera 1102, and supplies image signals output from the outward camera 1101 and the inward camera 1102 to the bus 1520. In addition, a control signal for controlling the outward camera 1101 and the inward camera 1102, which is generated by the CPU 1500 according to the information processing program, is transmitted to the outward camera 1101 and the inward camera 1102 via the camera I/F 1503.
The sensor I/F 1504 is an interface for the posture sensor 1104, the acceleration sensor 1105, and the azimuth sensor 1106, and the sensor signals output from the posture sensor 1104, the acceleration sensor 1105, and the azimuth sensor 1106 are supplied to the bus 1520 via the sensor I/F 1504.
The display control unit 1506 controls display operation by the display units 1201L and 1201R in accordance with a command from the CPU 1500. For example, the display control unit 1506 converts a display control signal generated by the CPU 1500 according to the information processing program into a display signal displayable by the display units 1201L and 1201R, and supplies the display signal to the display units 1201L and 1201R.
The audio I/F 1507 is an interface for the microphone 1103 and the sound output unit 1202. For example, the audio I/F 1507 converts an analog sound signal based on the sound collected by the microphone 1103 into a digital sound signal and supplies the digital sound signal to the bus 1520. Furthermore, the audio I/F 1507 converts a digital sound signal generated by the CPU 1500 according to the information processing program and supplied via the bus 1520 into a signal in a format that can be reproduced by the sound output unit 1202, and supplies the converted signal to the sound output unit 1202.
The communication I/F 1508 controls communication between the AR glasses 10 and the hand sensor 20 in accordance with a command from the CPU 1500. Furthermore, the communication I/F 1508 can also control communication with the outside. For example, the communication I/F 1508 controls communication with the server 3 via the network 2 in the AR glass system 1c of
For example, by executing the information processing program according to each embodiment, the CPU 1500 configures the application execution unit 1001, the head position/posture detection unit 1002, the output control unit 1003, the finger position/posture detection unit 1004, and the finger gesture detection unit 1005 included in the control unit 100 described above on the main storage area of the RAM 1502 as modules, for example. Note that the information processing program can be acquired from the outside (for example, the server 3) via the communication I/F 1508, for example, and can be installed on the AR glasses 10.
In the example of
The I/F 2101 is an interface for the posture sensor 2001, the acceleration sensor 2002, and the azimuth sensor 2003, and sensor signals output from the posture sensor 2001, the acceleration sensor 2002, and the azimuth sensor 2003 are supplied to the CPU 2100 via the I/F 2101. The CPU 2100 transmits the sensor signals supplied from the posture sensor 2001, the acceleration sensor 2002, and the azimuth sensor 2003 from the communication I/F 2103 to the AR glasses 10.
The I/F 2102 is an interface for the vibrator 2004. For example, the I/F 2102 generates a drive signal for driving the vibrator 2004 based on a command issued by the CPU 2100 according to a control signal transmitted from the AR glasses 10 and received by the communication I/F 2103, and supplies the drive signal to the vibrator 2004.
Next, a first embodiment of the present disclosure will be described.
First, processing by the AR glasses 10 as the information processing apparatus according to the first embodiment will be schematically described.
In Step S100, the AR glasses 10 measure a surrounding three-dimensional (3D) shape using an existing technology based on the image captured by the outward camera 1101, for example. For example, the user looks around with the AR glasses 10 worn. Meanwhile, the AR glasses 10 capture images at regular time intervals by the outward camera 1101 and acquire a plurality of captured images obtained by imaging the surroundings. The AR glasses 10 analyze the acquired captured image and measure the surrounding three-dimensional shape. In a case where the outward camera 1101 includes a TOF camera, the AR glasses 10 can obtain surrounding depth information. The AR glasses 10 measure the surrounding three-dimensional shape based on the depth information.
The AR glasses 10 generate a three-dimensional model of the real space based on the measurement result of the surrounding three-dimensional shape. In this case, the AR glasses 10 can generate an independent three-dimensional model based on edge information or the like for the real object arranged in the real space. The AR glasses 10 store data of the three-dimensional model generated based on a result of measuring the surrounding three-dimensional shape in, for example, the storage unit 140.
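As one possible sketch of the measurement in Step S100 when a TOF camera is available, the depth image could be back-projected into a point cloud using a pinhole camera model as below; the intrinsic parameters and the function name are assumptions introduced for illustration.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a TOF depth image (meters) into a 3D point cloud using
    a pinhole camera model with intrinsics fx, fy (focal lengths) and
    cx, cy (principal point)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]   # drop invalid (zero-depth) pixels
```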
In the next Step S101, the AR glasses 10 designate a plane based on an operation of the user. More specifically, the AR glasses 10 designate the plane (real surface) in the real space on which the region image indicating the region for arranging the real object or the virtual object is to be displayed in the virtual space, and specify the plane related to the display of the region image. In the next Step S102, the AR glasses 10 designate the orientation (azimuth) of the region image to be arranged on the plane designated in Step S101 based on an operation of the user. In the next Step S103, the AR glasses 10 set the density of the region image whose orientation is designated in Step S102.
By the processing of the flowchart of
Here, the content of the region image is not limited as long as the region image can indicate the region for arranging the real object or the virtual object. As an example, a grid image indicating a grid obtained by combining lines having the orientation designated by the user in Step S102 and lines having a different orientation (for example, an orientation orthogonal to the designated orientation) can be used as the region image. The region image is not limited to this, and may be dots indicating coordinate points on the real surface, or may be an image in which tile images of a predetermined size are arranged on the real surface.
Furthermore, for example, in a case where the region image is a grid or a dot, the density of the region image is an interval between grids or dots. When the region image is a tile image, the tile image size corresponds to the density.
Note that, in the flowchart of
In the following description, it is assumed that the region image is the grid image indicating the grid. In addition, the grid image indicating the grid is simply referred to as a “grid”.
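As a hedged sketch of how such a grid could be generated once the plane, its orientation, and its density are known, the endpoints of the grid line segments on the virtual surface might be computed as follows; the function name and parameters are illustrative assumptions.

```python
import numpy as np

def make_grid_segments(origin, u_dir, normal, spacing, half_extent):
    """Endpoints of grid line segments on a virtual surface.

    origin      : a point on the plane (e.g. where the user pointed)
    u_dir       : unit vector in the plane giving the designated orientation
    normal      : unit normal of the plane
    spacing     : grid interval, i.e. the density set in Step S103
    half_extent : half-size of the square area covered by the grid
    """
    u = np.asarray(u_dir, float)
    u /= np.linalg.norm(u)
    v = np.cross(np.asarray(normal, float), u)   # in-plane direction orthogonal to u
    v /= np.linalg.norm(v)
    o = np.asarray(origin, float)

    segments = []
    n = int(half_extent // spacing)
    for i in range(-n, n + 1):
        # Lines parallel to u (offset along v) and lines parallel to v (offset along u).
        segments.append((o + i * spacing * v - half_extent * u,
                         o + i * spacing * v + half_extent * u))
        segments.append((o + i * spacing * u - half_extent * v,
                         o + i * spacing * u + half_extent * v))
    return segments
```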
The designation method of the plane in Step S101 in the flowchart of
The first designation method of the plane will be described.
The user performs an operation of pointing at the plane desired to be designated with a finger of the hand 21. In a case where the user wears the hand sensor 20 on the hand 21, the pointing action is performed with the finger (the second finger in this example) on which the IMUs 202 and 203 are provided. The AR glasses 10 detect the motion of the hand 21 of the user based on each sensor signal transmitted from the hand sensor 20, and designate the plane (floor surface 300) that intersects the direction pointed by the user as the real surface. Not limited to this, the AR glasses 10 can also acquire the direction pointed by the finger of the hand 21 based on a captured image obtained by imaging the finger of the hand 21 or the like with the outward camera 1101.
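A minimal sketch of this remote designation, assuming the pointing ray (origin and direction) has been estimated from the hand sensor 20 or the captured image and that candidate planes have been extracted from the three-dimensional model of Step S100, could be a simple ray-plane intersection test; the names used here are illustrative.

```python
import numpy as np

def ray_plane_intersection(ray_origin, ray_dir, plane_point, plane_normal, eps=1e-6):
    """Point where the pointing ray hits a plane, or None if it points away."""
    o, d = np.asarray(ray_origin, float), np.asarray(ray_dir, float)
    p, n = np.asarray(plane_point, float), np.asarray(plane_normal, float)
    denom = np.dot(n, d)
    if abs(denom) < eps:                      # ray (almost) parallel to the plane
        return None
    t = np.dot(n, p - o) / denom
    return o + t * d if t > 0 else None

def designate_plane(ray_origin, ray_dir, planes):
    """Choose, among candidate planes (point, normal, plane_id) extracted from
    the three-dimensional model, the one hit first by the pointing ray."""
    best = None
    for point, normal, plane_id in planes:
        hit = ray_plane_intersection(ray_origin, ray_dir, point, normal)
        if hit is not None:
            dist = np.linalg.norm(hit - np.asarray(ray_origin, float))
            if best is None or dist < best[0]:
                best = (dist, plane_id, hit)
    return best   # (distance, plane_id, intersection point) or None
```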
In the example of
In Step S100 in the flowchart of
Although the plane is specified remotely in the above description, the designation method is not limited to this example.
The second designation method of a plane will be described. In the case of the AR glass system 1b illustrated in
The third designation method of a plane will be described. The third designation method of a plane is a designation method of the plane as the real surface based on the line-of-sight direction of the user wearing the AR glasses 10. For example, the AR glasses 10 image an eyeball of the user wearing the AR glasses 10 using the inward camera 1102. The AR glasses 10 detect a line-of-sight (line-of-sight direction) of the user using an existing technology based on a captured image obtained by imaging the eyeball by the inward camera 1102. The AR glasses 10 designate a plane intersecting the line-of-sight as a real surface.
A fourth designation method of a plane will be described. In the fourth designation method of a plane, the AR glasses 10 designate the plane on which the user wearing the AR glasses 10 stands as the real surface.
Note that the AR glasses 10 can obtain, for example, the vertical line 314 passing through a predetermined position on the AR glasses 10. Not limited to this, the AR glasses 10 can also obtain the vertical line 314 passing through a position estimated to be roughly the center of the head of the user wearing the AR glasses 10.
A designation method of the orientation (azimuth) in the real space of the region image displayed in the virtual space corresponding to the plane of the real space designated in Step S101 in Step S102 in the flowchart of
The first designation method of the orientation will be described.
In
In the example of
The present disclosure is not limited to this example. The AR glasses 10 can also detect the motion in which the position pointed by the hand 21 moves from the point 317a to the point 317b based on a captured image obtained by imaging the hand 21 with the outward camera 1101, and can designate the direction along the boundary 318 as the orientation of the region image based on the line segment connecting the points 317a and 317b.
In this manner, in a case where the orientation of the region image is designated using the feature portion on the real surface, a margin can be provided for the position pointed by the user. For example, in the case of using the hand sensor 20, even when the pointing position includes a slight deviation from the boundary 318, it can be regarded that a point on the boundary 318 existing in the vicinity of the pointing position is designated based on the three-dimensional model acquired in Step S100, for example. This also applies to a case where the image captured by the outward camera 1101 is used.
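For example, such a margin could be realized by snapping the pointed position onto the nearby boundary when it falls within a tolerance. The following sketch assumes the boundary 318 is available as a line segment from the three-dimensional model acquired in Step S100; the function name and the tolerance value are assumptions.

```python
import numpy as np

def snap_to_boundary(pointed, boundary_a, boundary_b, margin):
    """Snap a pointed position onto the boundary segment a-b if it lies
    within `margin`; otherwise return the pointed position unchanged."""
    a = np.asarray(boundary_a, float)
    b = np.asarray(boundary_b, float)
    p = np.asarray(pointed, float)
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    foot = a + t * ab                       # closest point on the boundary segment
    return foot if np.linalg.norm(p - foot) <= margin else p
```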
The second designation method of the orientation will be described. In the case of the AR glass system 1b illustrated in
The third designation method of the orientation will be described. The third designation method of the orientation is an example of designating the orientation of the region image based on the pattern on the plane of the real space designated in Step S101.
In the example of
In addition, it is also possible to use an edge of the designated plane as the pattern on the plane of the real space. For example, the AR glasses 10 detect the edge of the plane (floor surface 300) designated in Step S101 based on the image captured by the outward camera 1101 or the three-dimensional model of the real space generated in Step S100. When the plane is the floor surface 300, the boundary 318 between the floor surface 300 and the wall surface 301 illustrated in
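As one possible (non-limiting) way to extract a dominant direction from such a pattern or edge in the captured image, a Hough-transform-based sketch using OpenCV is shown below; in practice the detected image-space direction would still have to be projected onto the designated plane, and all parameter values here are assumptions.

```python
import cv2
import numpy as np

def dominant_edge_direction(image_bgr):
    """Dominant line direction (radians, in image coordinates) of the
    pattern or edges visible in the captured image, or None if no line
    segments are found."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=10)
    if lines is None:
        return None
    # Accumulate segment angles weighted by segment length.
    angles, weights = [], []
    for x1, y1, x2, y2 in lines[:, 0]:
        angles.append(np.arctan2(y2 - y1, x2 - x1) % np.pi)
        weights.append(np.hypot(x2 - x1, y2 - y1))
    hist, bin_edges = np.histogram(angles, bins=36, range=(0, np.pi), weights=weights)
    k = int(np.argmax(hist))
    return 0.5 * (bin_edges[k] + bin_edges[k + 1])
```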
A method of setting the density of the region image designated in Step S102 in Step S103 in the flowchart of
The AR glasses 10 can set the density of the region image based on a default value that the system has in advance. Not limited to this, the AR glasses 10 can set the density of the region image, for example, according to a user operation by the user wearing the AR glasses 10. As an example, the AR glasses 10 display an operation menu using the display units 1201L and 1201R or on the display screen of the controller 11 in the case of using the controller 11. The user operates the operation menu with a gesture, a voice, an operator, or the like to set the density of the region image.
Step S100 is the processing corresponding to Step S100 in the flowchart of
Next Steps S101-1 and S101-2 correspond to the processing of Step S101 of the flowchart of
Meanwhile, in a case where the AR glasses 10 determine in Step S101-1 that either the operation in which the hand 21 points at the plane in the real space or the operation in which the hand 21 touches the plane in the real space has been executed (Step S101-1, "Yes"), the processing proceeds to Step S101-2. In Step S101-2, the AR glasses 10 specify the plane pointed at or touched by the hand 21 in Step S101-1 as the plane of the real space corresponding to the plane on which the region image in the virtual space is displayed.
The processing proceeds to the next Step S102-1. This Step S102-1 and the following Steps S102-2 and S102-3 correspond to the processing of Step S102 of the flowchart of
In Step S102-1, the AR glasses 10 determine whether or not the hand 21 (finger) traces the plane specified in Step S101-2. The AR glasses 10 determine that the hand 21 traces the plane in a case where the hand 21 is moved linearly while the state in which the hand 21 points at the plane or touches the plane in Step S101-1 is maintained. When the AR glasses 10 determine that the plane is traced by the hand 21 (Step S102-1, "Yes"), the processing proceeds to Step S102-3.
Meanwhile, when the AR glasses 10 determine that the plane is not traced by the hand 21 (Step S102-1, "No"), the processing proceeds to Step S102-2. In Step S102-2, the AR glasses 10 determine whether the hand 21 traces (linearly moves along) a line segment on the plane specified in Step S101-2 while pointing at the line segment. When the AR glasses 10 determine that the operation of tracing the line segment while pointing at it has not been executed (Step S102-2, "No"), the processing returns to Step S102-1.
In a case where the AR glasses 10 determine in Step S102-2 that the operation of tracing the line segment while pointing at it has been executed (Step S102-2, "Yes"), the processing proceeds to Step S102-3.
In Step S102-3, the AR glasses 10 specify the direction in which the hand 21 traced the plane in Step S102-1 or Step S102-2 as the orientation (azimuth) of the region image to be displayed on the plane in the virtual space corresponding to the plane of the real space specified in Step S101-2.
When the orientation (azimuth) in which the region image is displayed is specified in Step S102-3, the processing proceeds to Step S103, and the density of the region image is set similarly to Step S103 of the flowchart of
In this manner, the AR glasses 10 can execute the identification of the plane in the virtual space for displaying the region image and the orientation of the region image to be displayed on the plane in the identified virtual space based on a series of motions of the hand 21 of the user. As a result, the user can easily display the region image on the plane in the virtual space corresponding to the plane in the real space.
Next, a display example of the region image applicable to the first embodiment will be described. Note that, here, the description will be given assuming that the region image is a grid.
In the example of
In addition, it is also possible to expand a virtual surface on which the grid is displayed. In this case, for example, the AR glasses 10 can display the grid displayed on the virtual surface corresponding to the designated real surface by extending the grid to a virtual surface corresponding to another real surface.
As another example of expanding the grid display, it is also possible to display the grid in the entire virtual space.
In addition, it is also possible to display the grid on the virtual surface corresponding to the designated real surface on the building model 330 or the like described with reference to
Next, effects of the technology according to the first embodiment will be described in comparison with the existing technology.
A section (b) in
The section (a) in
Here, in the case of the technology according to the first embodiment, the grid is displayed in the virtual space by the display units 1201L and 1201R of the AR glasses 10. Therefore, by moving to another position around the object 420, the user can see the grid at the position that was hidden by the object 420 in the initial position. More specifically, as illustrated in the section (b) of
Next, a second embodiment of the present disclosure will be described. The second embodiment relates to display in the virtual space when a real object is arranged in the real space corresponding to the virtual space in which a grid is displayed using the technology of the first embodiment described above.
However, in a case where the ground contact surface with the floor surface 300 has a shape having no straight side as in each of the real objects 430a, 430b, and 430c illustrated in
Therefore, the AR glasses 10 according to the second embodiment can measure the feature amount of the real object to be arranged and acquire the position and posture of the real object based on the measured feature amount. Furthermore, the AR glasses 10 according to the second embodiment set the origin and coordinates on the real object based on the acquired position and posture. Then, a coordinate space represented by the set origin and coordinates is displayed in the virtual space.
As an example of acquiring the position and posture of the real object 430a, a first method will be described. The AR glasses 10 measure the shape and texture of the real object 430a. For example, the AR glasses 10 first identify the real object 430a that the user is touching with the hand 21. In this case, the AR glasses 10 may specify the real object 430a based on each sensor signal output from the hand sensor 20 or based on a captured image obtained by imaging the hand 21 with the outward camera 1101.
Next, the AR glasses 10 capture an image of the specified real object 430a by, for example, the outward camera 1101, three-dimensionally model the real object 430a in real time based on the captured image, and acquire a three-dimensional model. In addition, the AR glasses 10 set a certain posture of the real object 430a (for example, the posture at the time of performing the three-dimensional modeling) as a state in which roll, pitch, and yaw are each "0". The AR glasses 10 register the information indicating the three-dimensional model and the posture acquired in this manner for the real object 430a. For example, the AR glasses 10 store the information for identifying the real object 430a and the information indicating the three-dimensional model and the posture of the real object 430a in the storage unit 140 in association with each other.
In addition, the AR glasses 10 constantly measure the feature amount of the real object 430a and compare the feature amount with the shape indicated by the registered three-dimensional model of the real object 430a to specify the current position and posture of the real object 430a. It is preferable that the AR glasses 10 update the display of the coordinate space 440a based on the information on the position and posture of the real object 430a specified in this way.
Note that the coordinate system of the virtual space and the coordinate system of the coordinate space can be associated with each other by, for example, matrix calculation using known rotation and translation.
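A minimal sketch of such an association, assuming the rotation matrix and translation vector of the object's coordinate space relative to the virtual space are known, is the usual homogeneous-transform calculation; the function names are illustrative.

```python
import numpy as np

def make_transform(R, t):
    """4x4 homogeneous transform from a 3x3 rotation R and a translation t."""
    T = np.eye(4)
    T[:3, :3] = np.asarray(R, float)
    T[:3, 3] = np.asarray(t, float)
    return T

def object_to_virtual(p_object, R_object, t_object):
    """Map a point given in the coordinate space of a real object (origin and
    axes set from its estimated position and posture) into the virtual-space
    coordinate system."""
    T = make_transform(R_object, t_object)
    p = np.append(np.asarray(p_object, float), 1.0)
    return (T @ p)[:3]
```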
As described above, in the second embodiment, the AR glasses 10 acquire the origin and the coordinates based on the position and the posture for each real object arranged in the real space, and set the coordinate space for each real object based on the acquired origin and coordinates. Therefore, the user can easily arrange a real object having a shape in which it is difficult to visually specify the position and posture in the real space in accordance with the grid displayed in the virtual space.
As another example of acquiring the position and posture of the real object 430a, a second method will be described. In this other example, the texture of a part of the real object 430a is regarded as a marker such as an AR marker, and the position and posture of the marker are detected to detect the position and posture of the real object 430a. For example, the AR glasses 10 image the real object 430a by the outward camera 1101, and detect the texture of the real object 430a based on the captured image. A part of the detected texture is extracted, and the extracted part is used as a marker.
As still another example of acquiring the position and posture of the real object 430a, a third method will be described.
The AR glasses 10 can specify the current position and posture of the real object 430a by comparing the three-dimensional model 431a downloaded from the server 3 with the real object 430a.
Next, processing on a partially deformed real object will be described. Examples of the partially deformed real object include a potted plant. In the potted plant, the portion of the pot is not deformed, but the position and posture of the plant itself may change due to wind or the like. Therefore, when the real object arranged in the real space is the potted plant, it is preferable not to use the portion of the plant for detection of the position and posture.
For example, the user wears the AR glasses 10 and traces the pot portion 451 having a fixed shape with the hand 21. In this case, for example, the AR glasses 10 image the entire real object 450 with the outward camera 1101 to detect the motion of the hand 21 of the user, and extract the portion (pot portion 451) of the real object 450 designated according to the motion of the hand 21 as the detection target of the position and posture. In a case where the user wears the hand sensor 20, the AR glasses 10 may extract the detection target of the position and posture in the real object 450 based on each sensor signal output from the hand sensor 20.
Furthermore, the AR glasses 10 can also perform motion detection processing on a captured image obtained by imaging the real object 450 and extract a motion portion in the real object 450. In this case, a portion other than the extracted motion portion in the real object 450 is extracted as a detection target of the position and posture in the real object 450.
The AR glasses 10 ignore a portion (plant portion 452) of the real object 450 that has not been extracted as a detection target of the position and posture.
The AR glasses 10 measure feature amounts of the pot portion 451 extracted as the detection target of the position and posture of the real object 450 based on the image captured by the outward camera 1101 to acquire the position and posture. The AR glasses 10 set an origin and coordinates based on the acquired position and posture with respect to the pot portion 451, and display a coordinate space 440d represented by the set origin and coordinates in the virtual space.
As described above, in the second embodiment, the AR glasses 10 acquire the position and posture ignoring the deformed portion in the real object that is partially deformed. Therefore, the coordinate space can also be set for the real object that is partially deformed, and the real object can be easily arranged in the real space in accordance with the grid displayed in the virtual space.
Next, a modification of the second embodiment will be described. In the second embodiment described above, the coordinate space of the real object is displayed at the position in the virtual space corresponding to the real object arranged in the real space. Meanwhile, in the modification of the second embodiment, the coordinate space is displayed in the virtual space in advance. Then, the real object is moved to a position in the real space corresponding to a position in the virtual space where the coordinate space is displayed, and the real object and the coordinate space are associated with each other.
When the user moves the real object 430a to a predetermined position in the coordinate space 442a, the AR glasses 10 acquire the position and posture of the real object 430a, and register the real object 430a in association with the coordinate space 442a. The registration of the real object 430a is performed, for example, in accordance with a gesture of the hand 21 or an utterance by the user wearing the AR glasses 10, or a predetermined operation on the controller 11 in a case where the AR glasses 10 use the controller 11. In this case, the posture of the real object 430a at the time when the registration is performed can be regarded as a state (initial state) in which roll, pitch, and yaw are each "0".
Next, a third embodiment of the present disclosure will be described. In the third embodiment, the AR glasses 10 give the user a notification by sound or a tactile stimulus according to the positional relationship between the position of the hand 21 of the user wearing the AR glasses 10 and the region image.
For example, a case will be considered in which the user wearing the AR glasses 10 and wearing the hand sensor 20 on the hand 21 moves the hand 21 to a portion hidden from the line-of-sight of the user by the building model 330 and arranges an object along the grid. At this time, the AR glasses 10 detect the position of the hand 21 of the user based on each sensor signal of the hand sensor 20, and in a case where the hand 21 approaches the grid, a notification 500 is issued to the user in a predetermined pattern. The AR glasses 10 issue the notification 500 using, for example, at least one of vibration by the vibrator 2004 of the hand sensor 20 and sound output from the sound output unit 1202 of the AR glasses 10.
Furthermore, the AR glasses 10 can make the pattern of the notification 500 different according to the distance of the detected position of the hand 21 from the grid. Furthermore, the pattern of the notification 500 can be made different depending on from which direction the detected position of the hand 21 has approached a grid line.
In
Here, the AR glasses 10 make a first sound output according to the distance between the estimated fingertip position and the grid line 321i and a second sound output according to the distance between the estimated fingertip position and the grid line 321j different from each other, thereby making the pattern of the notification 500 different. For example, the AR glasses 10 make the frequency of the first sound different from the frequency of the second sound. As an example, the AR glasses 10 make the frequency of the first sound lower than the frequency of the second sound. In this case, the first sound is heard, for example, as “Po” and the second sound is heard, for example, as “Pi”. As a result, the user can know which one of the grid line 321i along the vertical direction and the grid line 321j along the horizontal direction the fingertip position is closer to when placed on the grid.
The element for making the first sound and the second sound different is not limited to the frequency. For example, the AR glasses 10 may have different timbre (waveform) between the first sound and the second sound. Furthermore, in a case where the sound output unit 1202 is provided corresponding to each of both ears of the user, the localization of the first sound and the localization of the second sound may be different. Furthermore, the first sound and the second sound can be made different from each other by combining the plurality of elements.
The AR glasses 10 further change the frequency at which the first sound and the second sound are emitted according to the distance between the estimated fingertip position and the grid lines 321i and 321j, thereby making the pattern of the notification 500 different. More specifically, the AR glasses 10 increase the frequency at which the first sound is emitted as the estimated fingertip position approaches the grid line 321i. Similarly, the AR glasses 10 increase the frequency at which the second sound is emitted as the estimated fingertip position approaches the grid line 321j.
Note that, in a case where the estimated fingertip position is in the middle between a certain grid line and a grid line parallel and adjacent to the grid line, that is, in a case where the fingertip position is a position separated from the grid line by ½ of a grid interval in a direction orthogonal to the grid line, the AR glasses 10 do not emit the sound of the notification 500.
A more specific description will be given with reference to
Meanwhile, the position 510b is, for example, a center position of the grid, and is not close to any of the grid lines 321i and 321j. In other words, the position 510b is an intermediate position between the specific grid line 321i and the grid line on the right of the specific grid line among the grid lines along the vertical direction. In a case where it is estimated that the position 510b is the fingertip position, the AR glasses 10 do not output either the first sound or the second sound.
In a case where the estimated fingertip position is a position 510c that is a position on the grid line 321j and is a position separated from the grid line 321i by a distance of ½ of the grid interval in the horizontal direction, the AR glasses 10 do not output the first sound but output the second sound at the first frequency. For example, the AR glasses 10 continuously output only the second sound like “pipipipi...”.
In a case where the estimated fingertip position is a position on the grid line 321j and is at a position 510d which is a position closer than ½ of the grid interval in the horizontal direction from the grid line 321i, the AR glasses 10 output the second sound at the first frequency and output the first sound at a second frequency which is lower than the first frequency. For example, the AR glasses 10 continuously output the second sound like “pipipipi...” and intermittently output the first sound like “po, po, po,...” in parallel with the second sound.
Furthermore, in a case where the estimated fingertip position is a position on the grid line 321i and is at a position 510e which is closer than ½ of the grid interval in the vertical direction from the grid line 321j, the AR glasses 10 output the first sound at the first frequency and output the second sound at the second frequency. For example, the AR glasses 10 continuously output the first sound like "popopopo..." and intermittently output the second sound like "pi, pi, pi,..." in parallel with the first sound.
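The notification logic described above could be sketched as follows, assuming the fingertip position has been projected onto the grid plane and expressed as coordinates along the two grid directions; the linear mapping from distance to beep rate and the function name are assumptions introduced for illustration.

```python
def grid_notification(u, v, spacing, max_rate=8.0):
    """Beep repetition rates (beeps per second) for the two notification sounds.

    u, v    : fingertip position projected onto the grid plane, expressed as
              coordinates along the two grid directions
    spacing : grid interval (the density set in Step S103)
    Returns (rate_first, rate_second); the first sound reflects the distance,
    measured along u, to the nearest grid line, and the second sound the
    distance measured along v. A rate of 0 means the sound is not emitted,
    which happens exactly midway between two adjacent grid lines.
    """
    def rate(coord):
        d = coord % spacing                 # distance to the previous grid line
        d = min(d, spacing - d)             # ... or to the next one, whichever is nearer
        # The closer the fingertip is to a grid line, the more frequently the
        # corresponding sound is emitted; silent at half the grid interval.
        return max_rate * (1.0 - 2.0 * d / spacing)

    return rate(u), rate(v)
```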
Note that, although an example of using sound as the notification 500 has been described above, the notification is not limited to this example. That is, the AR glasses 10 can control the operation of the vibrator 2004 provided in the hand sensor 20 according to the distance between the position (fingertip position) of the hand 21 and a grid line. In this case, it is conceivable to make the vibration pattern itself different for grid lines in different directions.
As described above, in the third embodiment, the notification 500 is issued to the user by the AR glasses 10 in a pattern according to the positional relationship between the hand 21 and the grid. As a result, even in a case where the hand 21 is at a position hidden from the line-of-sight of the user, the user can roughly grasp the position of the hand 21, and can easily perform work or the like at the position.
Next, a fourth embodiment of the present disclosure will be described. The fourth embodiment relates to display of the image of the hand 21 on the AR glasses 10 in a case where the hand 21 of the user goes around the back side of the real object in the line-of-sight direction of the user and is hidden behind the real object.
First, a first display method according to the fourth embodiment will be described.
Here, a case where the user grips a tree model 350 with the hand 21 (not illustrated) on which the hand sensor 20 is worn and tries to arrange the tree model 350 in a region 351 included in the plane 331 will be considered. The region 351 is a region on the back side of the site 341 as viewed from the user, and cannot be viewed from the user by being blocked by the building model 330.
The AR glasses 10 estimate the position and posture of the hand 21 based on each sensor signal output from the hand sensor 20. The AR glasses 10 generate a virtual image 22 imitating the position and posture of the hand 21 based on the estimation result, and display the generated virtual image 22 to be superimposed on an image of the real space at a position in the virtual space corresponding to the estimated position of the hand 21. In this case, the AR glasses 10 may display the virtual image 22 so as not to transmit the image of the real space, or can display the virtual image 22 so as to transmit the image of the real space seen at the position of the virtual image 22. Furthermore, the AR glasses 10 also display the grid shielded by the building model 330 so as to be superimposed on the image of the real space.
As a result, the user can confirm the position of the hand 21 in the region that is obstructed by the real object and cannot be viewed, and for example, can more accurately execute the arrangement of the object with respect to the region.
Next, a second display method according to the fourth embodiment will be described.
In the second method according to the fourth embodiment, similarly to the above-described first method, the AR glasses 10 acquire the position and posture of the hand 21 based on each sensor signal of the hand sensor 20 worn on the hand 21. In addition, the AR glasses 10 acquire a three-dimensional model of the real object (building model 330 in this example) that blocks the line-of-sight of the user with respect to the hand 21. The three-dimensional model of the real object may be generated based on the image captured in advance using, for example, the outward camera 1101 or the like, or may be acquired from the server 3 in a case where the three-dimensional model is registered in advance in the server 3.
Based on the acquired position and posture of the hand 21 and the three-dimensional model of the real object (building model 330), the AR glasses 10 generate an image of the portion that cannot be viewed from the user position due to being blocked by the real object. In this case, the AR glasses 10 may generate an enlarged image obtained by enlarging the portion. This image includes a virtual image 23 of the hand 21 based on the position and posture of the hand 21.
The AR glasses 10 form a window 620 for information presentation in the field of view of the AR glasses 10, and display the generated image in the window 620. In the example of
By referring to the image displayed in this window 620, the user can easily perform, for example, fine adjustment of the position at the time of work of arranging the tree model 350 in the region 351 on the back side of the building model 330 as viewed from the user. Furthermore, the user can confirm the state of the back of the shielding object when viewed from the user based on the image of the window 620. In this case, the AR glasses 10 can display the state of the back of the shielding object in the window 620 regardless of the presence or absence of the hand 21 of the user.
Note that the image in the window 620 can be enlarged and reduced by a predetermined user operation or the like.
Also by this second method, the user can confirm the position of the hand 21 in an area that is obstructed by the real object and cannot be seen, and for example, can more accurately perform the arrangement of the object with respect to the region. Furthermore, in the second method, the user can confirm the state of the back of the shielding object regardless of whether or not the hand 21 of the user is at the position.
Next, a fifth embodiment of the present disclosure will be described. In the fifth embodiment, a designated real object is duplicated in a virtual space, and a duplicated virtual object in which the real object is duplicated is arranged in the virtual space.
The real object 430a to be duplicated can be designated by the fingertip of the finger (for example, the second finger) on which the IMUs 202 and 203 of the hand 21 on which the hand sensor 20 is worn are provided. Alternatively, the finger of the user may be imaged by the outward camera 1101 of the AR glasses 10, and the real object 430a to be duplicated may be designated based on the captured image.
When designating the real object 430a to be duplicated, the user instructs the AR glasses 10 to duplicate the designated real object 430a. The duplication instruction by the user may be issued by utterance, for example, or may be issued by operating an operator of the controller 11.
Here, the three-dimensional model of the real object 430a has already been acquired in Step S100 of the flowchart of
The user can move the virtual real object 430a_copy in which the real object 430a is duplicated in the virtual space. For example, the user performs an operation of picking the virtual real object 430a_copy displayed in the virtual space with the finger, and further moves the finger in a picked state. When detecting the operation of picking and moving with the fingers based on the image captured by the outward camera 1101, the AR glasses 10 move the picked virtual real object 430a_copy in the virtual space according to the motion of the fingers.
In the fifth embodiment, as described above, the three-dimensional model of the real object in the real space is duplicated and arranged in the vicinity of the position corresponding to the position of the real object in the virtual space. For example, although there is only one real object in the real space, there is a case where it is desired to confirm a state in which a plurality of the real objects are arranged in the real space. By applying the fifth embodiment, the user can easily confirm a state as if a plurality of the real objects were arranged.
Next, a sixth embodiment of the present disclosure will be described. In the sixth embodiment, the AR glasses 10 generate a virtual space (referred to as a reduced virtual space) obtained by reducing the real space, and superimpose and display the virtual space on the real space in a non-transmissive manner.
The AR glasses 10 generate a virtual space (referred to as a reduced virtual space) obtained by reducing the real space based on the surrounding three-dimensional model acquired in Step S100 of
In this case, as described in the fifth embodiment, the three-dimensional model of each of the real objects 430a, 430b, and 430c in the real space has already been acquired. Reduced virtual real objects 430a_mini, 430b_mini, and 430c_mini respectively corresponding to the real objects 430a, 430b, and 430c in the reduced virtual space are generated using the three-dimensional models of the real objects 430a, 430b, and 430c, respectively.
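As an illustrative sketch (the generation method is not limited to this), the vertices of each measured three-dimensional model could be mapped into the reduced virtual space by a uniform scale about an anchor point; the scale value and the names below are assumptions.

```python
import numpy as np

def to_reduced_space(points_real, anchor_real, anchor_display, scale=0.05):
    """Map vertices of a real-space three-dimensional model into the reduced
    virtual space displayed in the region 600.

    points_real    : Nx3 vertices of the model measured in Step S100
    anchor_real    : reference point of the real space (e.g. its center)
    anchor_display : where the reduced model is placed in the virtual space
    scale          : reduction ratio of the reduced virtual space
    """
    p = np.asarray(points_real, float)
    return (p - np.asarray(anchor_real, float)) * scale + np.asarray(anchor_display, float)
```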
In addition, the AR glasses 10 can move the reduced virtual real objects 430a_mini, 430b_mini, and 430c_mini in the reduced virtual space in the reduced virtual space according to a user operation.
As an example, a case where the reduced virtual real object 430a_mini is moved will be described. The user performs an operation of, for example, picking the reduced virtual real object 430a_mini displayed in the region 600 with the fingers of the hand 21, and moves the reduced virtual real object 430a_mini in the reduced virtual space as indicated by an arrow 610, for example, while maintaining the picked state. The AR glasses 10 capture an image of the movement of the hand 21 of the user by the outward camera 1101, and detect a picking motion, a moving direction, and the like based on the captured image.
Note that the arrow 610 is merely for describing the movement of the reduced virtual real object 430a_mini, and is not an object actually displayed in the region 600.
Here, even when the reduced virtual real object 430a_mini is moved in the reduced virtual space, the corresponding real object 430a in the real space does not move. Therefore, the AR glasses 10 display an object (image) 612 indicating the movement (arrow 610) in the reduced virtual space at the position in the virtual space corresponding to the real object 430a in the real space, corresponding to the reduced virtual real object 430a_mini moved in the reduced virtual space. In the example of
The user can reflect the movement of the reduced virtual real object 430a_mini in the reduced virtual space displayed in the region 600 on the movement of the real object 430a in the real space by actually moving the real object 430a in the real space according to the navigation by the object 612. In this manner, by displaying the object for reflecting the movement of the reduced virtual object in the reduced virtual space on the movement of the corresponding object in the real space, the user can easily determine the arrangement of each object in the real space.
Note that the effects described in the present specification are merely examples and are not limited, and other effects may be provided.
Note that the present technology can also have the following configurations.
1a, 1b, 1c AR glass system
3 Server
10 AR glasses
11 Controller
20 Hand sensor
21 Hand
22, 23 Virtual image
100 Control unit
110 Sensor unit
120 Output unit
130 Communication unit
140 Storage unit
201, 202, 203 IMU
204 Hand sensor control unit
300 Floor surface
301 Wall surface
318 Boundary
321a, 321b, 321c, 321d, 321e, 321f, 321g, 321h, 321i, 321j Grid line
330 Building model
430a, 430b, 430c Real object
440a, 440b, 440c Coordinate space
500, 501a, 501b, 501c, 501d, 501e Notification
600 Region
620 Window
1001 Application execution unit
1002 Head position/posture detection unit
1003 Output control unit
1004 Finger position/posture detection unit
1005 Finger gesture detection unit
1101 Outward camera
1102 Inward camera
1103 Microphone
1104 Posture sensor
1105 Acceleration sensor
1106 Azimuth sensor
1201, 1201L, 1201R Display unit
1202 Sound output unit
Priority application: 2020-088930, May 2020, JP (national).
International filing: PCT/JP2021/018224, filed May 13, 2021 (WO).