The present application is based on, and claims priority from JP Application Serial Number 2019-014290, filed Jan. 30, 2019, the disclosure of which is hereby incorporated by reference herein in its entirety.
The present disclosure relates to a technique for detecting the position of a pointer.
JP-A-2016-184850 (Patent Literature 1) discloses a projector capable of projecting a projection screen onto a screen, capturing, with a camera, an image including a pointer such as a finger, and detecting the position of the pointer using the captured image. When the tip of the pointer is in contact with the screen, the projector recognizes that a predetermined instruction for drawing or the like has been input to the projection screen and redraws the projection screen according to the instruction. Therefore, a user can input various instructions using the projection screen as a user interface. A projector of this type, which allows the projection screen on the screen to be used as a user interface through which the user inputs instructions, is called an “interactive projector”. A screen surface functioning as a surface used to input instructions with the pointer is also called an “operation surface”. The position of the pointer is determined by triangulation using a plurality of images captured by a plurality of cameras.
However, in the related art, the detection accuracy of the distance between the pointer and the operation surface, and of other distance-related parameters derived from that distance, is not always sufficient. Therefore, there has been a demand for improving the detection accuracy of the distance-related parameters related to the distance between the pointer and the operation surface.
According to an aspect of the present disclosure, there is provided a position detecting method for detecting a parameter related to a position of a pointer with respect to an operation surface. The position detecting method includes: (a) imaging, using a first camera, the pointer over the operation surface as a background to capture a first captured image, and imaging, using a second camera disposed in a position different from a position of the first camera, the pointer over the operation surface as the background to capture a second captured image; (b) acquiring a first image for processing from the first captured image and acquiring a second image for processing from the second captured image; (c) extracting a first region of interest image and a second region of interest image, each including the pointer, from the first image for processing and the second image for processing; and (d) determining a distance-related parameter related to a distance between the operation surface and the pointer using a convolutional neural network including an input layer including a first input channel to which the first region of interest image is input and a second input channel to which the second region of interest image is input and an output layer that outputs the distance-related parameter.
The present disclosure can also be realized in the form of a position detecting device, and in various forms other than the position detecting method and the position detecting device, such as an interactive projector, a computer program for realizing the functions of the method or the device, and a non-transitory recording medium on which the computer program is recorded.
The projector 100 includes a projection lens 210 that projects an image onto the screen plate 820, two cameras 310 and 320 that capture images including the pointer 80, and two illuminating sections 410 and 420 that emit infrared light for detecting the pointer 80, the two illuminating sections 410 and 420 corresponding to the two cameras 310 and 320.
The projection lens 210 projects the projection screen PS onto the operation surface SS. The projection screen PS includes an image drawn in the projector 100. When no image is drawn in the projector 100, the projector 100 projects light onto the projection screen PS and a white image is displayed. In this specification, the “operation surface SS” means a surface used to input an instruction using the pointer 80. The “projection screen PS” means a region of an image projected onto the operation surface SS by the projector 100.
In the system 800, one or a plurality of non-light-emitting pointers 80 are usable. As the pointer 80, non-light-emitting objects such as a finger and a pen are usable. The instruction tip portion of the non-light-emitting pointer 80 desirably reflects infrared light well and has a retroreflection characteristic.
The first camera 310 and the second camera 320 are each set to be capable of imaging the entire operation surface SS and each have a function of capturing an image of the pointer 80 over the operation surface SS as a background. That is, the first camera 310 and the second camera 320 create images including the pointer 80 by receiving, of the infrared light emitted from the first illuminating section 410 and the second illuminating section 420, the light reflected by the operation surface SS and the pointer 80. When the two images captured by the first camera 310 and the second camera 320 are used, a three-dimensional position of the pointer 80 can be calculated by triangulation or the like. The number of cameras may be three or more.
The first illuminating section 410 has a function of a peripheral illuminating section that illuminates the periphery of an optical axis of the first camera 310 with infrared light. In the example shown in
The number of illuminating elements configuring the first illuminating section 410 is not limited to four and may be any number equal to or larger than two. However, the plurality of illuminating elements configuring the first illuminating section 410 are desirably disposed at rotationally symmetrical positions centered on the first camera 310. The first illuminating section 410 may be configured using a ring-like illuminating element instead of the plurality of illuminating elements. Further, a coaxial illuminating section that emits infrared light through the lens of the first camera 310 may be used as the first illuminating section 410. These modifications are applicable to the second illuminating section 420 as well. When, with N set to an integer equal to or larger than 2, N cameras are provided, a peripheral illuminating section or a coaxial illuminating section is desirably provided for each of the cameras.
An example shown in
The interactive projection system 800 is also operable in modes other than the white board mode. For example, the system 800 is also operable in a PC interactive mode for displaying, on the projection screen PS, an image of data transferred from a not-shown personal computer via a communication line. In the PC interactive mode, an image of data of spreadsheet software or the like is displayed. Input, creation, correction, and the like of data can be performed using various tools and icons displayed in the image.
The control section 700 performs control of the sections of the projector 100. The control section 700 has a function of an imaging control section 710 that acquires an image of the pointer 80 using the imaging section 300 and the infrared illuminating section 400. Further, the control section 700 has a function of an operation executing section 720 that recognizes content of an instruction performed on the projection screen PS by the pointer 80 detected by the position detecting section 600 and instructs the projection-image generating section 500 to create or change a projection image according to the content of the instruction.
The projection-image generating section 500 includes an image memory 510 that stores a projection image. The projection-image generating section 500 has a function of generating a projection image to be projected onto the operation surface SS by the projecting section 200. The projection-image generating section 500 desirably further has a function of a keystone correction section that corrects trapezoidal distortion of the projection screen PS.
The projecting section 200 has a function of projecting the projection image generated by the projection-image generating section 500 onto the operation surface SS. The projecting section 200 includes a light modulating section 220 and a light source 230 besides the projection lens 210 explained with reference to
The infrared illuminating section 400 includes the first illuminating section 410 and the second illuminating section 420 explained with reference to
The imaging section 300 includes the first camera 310 and the second camera 320 explained with reference to
The position detecting section 600 has a function of calculating the position of the tip portion of the pointer 80 using a first captured image acquired by the first camera 310 and a second captured image acquired by the second camera 320. The position detecting section 600 includes an image-for-processing acquiring section 610, a region-of-interest extracting section 620, and a convolutional neural network 630. These sections may be stored as models in a storage region of the position detecting section 600. The image-for-processing acquiring section 610 acquires, from the two captured images captured by the two cameras 310 and 320, a first image for processing and a second image for processing, which are two images for processing to be processed by the region-of-interest extracting section 620. In an example, the image-for-processing acquiring section 610 creates two calibration images by performing stereo calibration on the two captured images captured by the two cameras 310 and 320 and acquires the two calibration images as the two images for processing. The region-of-interest extracting section 620 extracts, from the two images for processing, a first region of interest image and a second region of interest image, which are two region of interest images, each including the pointer 80. The convolutional neural network 630 includes an input layer to which the two region of interest images are input and an output layer that outputs a distance-related parameter related to the distance between the operation surface SS and the pointer 80. Details of the functions of the sections 610 to 630 are explained below.
Functions of the sections of the control section 700 and functions of the sections of the position detecting section 600 are realized by, for example, a processor in the projector 100 executing computer programs. A part of the functions of the sections may be realized by a hardware circuit such as an FPGA (field-programmable gate array).
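For illustration only, the following Python sketch shows how such a computer program could chain the image-for-processing acquiring section 610, the region-of-interest extracting section 620, and the convolutional neural network 630; the function names and the idea of passing the three stages as callables are hypothetical and are not part of the disclosure.

```python
import numpy as np

def detect_distance_parameter(captured_1, captured_2,
                              acquire_images_for_processing,
                              extract_regions_of_interest,
                              cnn_model):
    """Hypothetical pipeline mirroring the sections 610, 620, and 630.

    captured_1, captured_2: the first and second captured images from the
    cameras 310 and 320.  The three callables stand in for the
    image-for-processing acquiring section 610 (e.g. stereo calibration),
    the region-of-interest extracting section 620, and the convolutional
    neural network 630, respectively.
    """
    # Acquire the first and second images for processing (step S200).
    proc_1, proc_2 = acquire_images_for_processing(captured_1, captured_2)
    # Extract the two region of interest images, each including the
    # pointer (step S300).
    roi_1, roi_2 = extract_regions_of_interest(proc_1, proc_2)
    # Feed the two region of interest images to the network as two input
    # channels and obtain the distance-related parameter (step S400).
    stacked = np.stack([roi_1, roi_2], axis=0)
    return cnn_model(stacked)
```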
In step S100, the imaging section 300 acquires a plurality of images by imaging the pointer 80 over the operation surface SS as the background.
In step S110, the imaging control section 710 turns on the first illuminating section 410 and turns off the second illuminating section 420. In step S120, the imaging control section 710 captures images using the first camera 310 and the second camera 320. As a result, the first image IM1_1 and the second image IM2_1 shown in an upper part of
In step S130, the imaging control section 710 turns off the first illuminating section 410 and turns on the second illuminating section 420. In step S140, the imaging control section 710 captures images using the first camera 310 and the second camera 320. As a result, the first image IM1_2 and the second image IM2_2 shown in a middle part of
When the imaging in step S120 and step S140 ends, as shown in a lower part of
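As a minimal sketch of this exclusive illumination timing (steps S110 to S140), the following Python function assumes hypothetical camera and illuminator driver objects with capture(), on(), and off() methods; it is not the actual control code of the projector 100.

```python
def capture_image_set(camera_1, camera_2, illuminator_1, illuminator_2):
    """Capture the four images IM1_1, IM2_1, IM1_2, and IM2_2."""
    # Steps S110/S120: illuminate with the first illuminating section only
    # and capture with both cameras.
    illuminator_1.on()
    illuminator_2.off()
    im1_1 = camera_1.capture()
    im2_1 = camera_2.capture()

    # Steps S130/S140: illuminate with the second illuminating section only
    # and capture with both cameras.
    illuminator_1.off()
    illuminator_2.on()
    im1_2 = camera_1.capture()
    im2_2 = camera_2.capture()

    illuminator_2.off()
    return im1_1, im2_1, im1_2, im2_2
```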
When the processing in step S100 ends in this way, in step S200 in
In a method 1, two calibration images are created by performing stereo calibration on the two images IM1_1 and IM2_2. The two calibration images are set as the two images for processing.
In this embodiment, as the “stereo calibration”, processing for adjusting the coordinates of one of the two images IM1_1 and IM2_2 is performed so as to eliminate the parallax on the operation surface SS. For example, when the first image IM1_1, which is in the coordinate system (U, V), is set as a reference image and the second image IM2_2 is set as a comparative image for calculating the parallax, the parallax between the first image IM1_1 and the second image IM2_2 on the operation surface SS can be eliminated by adjusting the coordinate system (η, ξ) of the second image IM2_2 to the coordinate system (U, V). Calibration parameters such as the conversion coefficients necessary for the stereo calibration are determined in advance and set in the image-for-processing acquiring section 610. Two images IM1 and IM2 shown in an upper part of
In a method 2, the two images IM1_1 and IM2_2 themselves are acquired as the two images for processing.
In a method 3, two images for processing are created by executing preprocessing, such as distortion correction or parallelization (rectification), on the two images IM1_1 and IM2_2.
According to an experiment by the inventors, among the method 1 to the method 3, the distance-related parameters were determined most accurately when the method 1 was used. This is presumed to be because lens-specific distortion and image distortion due to positional deviation of a camera are corrected by performing the stereo calibration. However, the method 2 and the method 3 have an advantage that the processing can be simplified compared with the method 1.
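As a rough sketch of the method 1, assuming the calibration parameters can be expressed as a plane-to-plane homography estimated beforehand from reference points on the operation surface SS, the coordinate adjustment could look as follows; the OpenCV calls are standard, but the function and variable names are illustrative.

```python
import cv2
import numpy as np

def stereo_calibrate_pair(im1_1, im2_2, pts_cam2, pts_cam1):
    """Warp the second image into the coordinate system (U, V) of the
    first image so that the parallax on the operation surface vanishes.

    pts_cam2, pts_cam1: corresponding reference points on the operation
    surface observed by the second and first cameras during a prior
    calibration step (at least four point pairs).
    """
    # Plane-to-plane mapping (eta, xi) -> (U, V); this plays the role of
    # the conversion coefficients determined in advance.
    H, _ = cv2.findHomography(np.asarray(pts_cam2, dtype=np.float32),
                              np.asarray(pts_cam1, dtype=np.float32))
    h, w = im1_1.shape[:2]
    # The first image is the reference image and is left unchanged; the
    # second image is adjusted to eliminate the parallax on the surface.
    im2_calibrated = cv2.warpPerspective(im2_2, H, (w, h))
    return im1_1, im2_calibrated
```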
Instead of setting illumination periods for the two illuminating sections 410 and 420 at the exclusive timings different from each other and sequentially capturing images in the respective illumination periods as explained with reference to
In step S300 in
In step S400, the convolutional neural network 630 determines distance-related parameters from the two region of interest images RO1 and RO2. In the first embodiment, the distance itself between the operation surface SS and the pointer 80 is used as the distance-related parameter.
Numerical value examples of a pixel size Nx in the X direction, a pixel size Ny in the Y direction, and the number of channels Nc at the outputs of the layers are shown at the lower right of the layers shown in
The configuration of the convolutional neural network 630 shown in
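Purely as an illustration of a network with such a two-channel input layer and a scalar output layer, the following PyTorch sketch could be used; the numbers of layers, kernel sizes, and channel counts are placeholder values and are not the values shown in the figure.

```python
import torch
import torch.nn as nn

class DistanceCNN(nn.Module):
    """Illustrative two-input-channel CNN; all layer sizes are placeholders."""

    def __init__(self, roi_size=64):
        super().__init__()
        # Input layer: channel 1 receives the first region of interest
        # image RO1 and channel 2 receives the second region of interest
        # image RO2 (corresponding to the input layer 631).
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Fully connected part leading to the output layer, which emits the
        # distance-related parameter (here the distance itself).
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (roi_size // 4) * (roi_size // 4), 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, roi_pair):
        # roi_pair has shape (batch, 2, roi_size, roi_size).
        return self.head(self.features(roi_pair))

# Example: one pair of 64 x 64 region of interest images (untrained output).
model = DistanceCNN(roi_size=64)
delta_z = model(torch.rand(1, 2, 64, 64))
```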
The distance-related parameter can be determined using the convolutional neural network 630 because the distance-related parameter has a positive or negative correlation with feature values of the two region of interest images RO1 and RO2. One such feature value having a correlation with the distance-related parameter is a representative correlation value indicating the correlation between the two region of interest images RO1 and RO2. As an example of a method of creating the representative correlation value of the two region of interest images RO1 and RO2, there is a method of first calculating a correlation value for each pixel of the two region of interest images RO1 and RO2, using kernel regions centered on the pixels of the two region of interest images RO1 and RO2, to create a correlation image formed by the correlation values, and then calculating a statistical representative value of the correlation values in the correlation image. As the correlation values, a correlation coefficient, an SAD (Sum of Absolute Differences), an SSD (Sum of Squared Differences), and the like can be used. The statistical representative value is, for example, an average, a maximum, or a median. Such a representative correlation value, or a value similar to it, is calculated as one of the feature values of the two region of interest images RO1 and RO2 in the intermediate layer 632 of the convolutional neural network 630 and input to the fully connected layer 633. As explained above, the distance ΔZ between the operation surface SS and the pointer 80 has a positive or negative correlation with the feature values of the two region of interest images RO1 and RO2. Therefore, the distance ΔZ can be determined using the convolutional neural network 630 to which the two region of interest images RO1 and RO2 are input. If, during learning, the convolutional neural network 630 is caused to learn a distance-related parameter other than the distance ΔZ, that distance-related parameter can likewise be obtained using the convolutional neural network 630.
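For reference, the following is a minimal NumPy sketch of one such representative correlation value, using an SAD kernel compared at the same pixel positions of the two region of interest images and the average as the statistical representative value; the kernel size, the zero-disparity comparison, and the choice of SAD and average are only one combination of the options named above.

```python
import numpy as np

def sad_correlation_image(roi_1, roi_2, kernel=5):
    """Correlation image: per-pixel SAD over a kernel region centered on
    each pixel, computed between the two region of interest images."""
    roi_1 = roi_1.astype(np.float32)
    roi_2 = roi_2.astype(np.float32)
    pad = kernel // 2
    abs_diff = np.abs(roi_1 - roi_2)
    padded = np.pad(abs_diff, pad, mode="edge")
    corr = np.zeros_like(abs_diff)
    # Sum the absolute differences over the kernel region of each pixel.
    for dy in range(-pad, pad + 1):
        for dx in range(-pad, pad + 1):
            corr += padded[pad + dy:pad + dy + roi_1.shape[0],
                           pad + dx:pad + dx + roi_1.shape[1]]
    return corr

def representative_correlation_value(roi_1, roi_2, kernel=5):
    # Statistical representative value of the correlation image; the
    # average is used here, but a maximum or a median could be used.
    return float(np.mean(sad_correlation_image(roi_1, roi_2, kernel)))
```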
In step S500 in
In step S400, the distance ΔZ between the operation surface SS and the pointer 80 is determined as the distance-related parameter. However, a parameter other than the distance ΔZ may be calculated as the distance-related parameter. For example, when it can be assumed in step S400, from the feature values obtained from the two region of interest images RO1 and RO2, that the distance ΔZ is sufficiently small, the operation in step S600 may be executed immediately without calculating the distance ΔZ. In this case, the distance-related parameter is an operation execution parameter, such as a flag or a command, indicating that the operation corresponding to the position of the pointer 80 is to be executed. The operation execution parameter is output from the convolutional neural network 630. With this configuration, in a situation in which the distance ΔZ between the pointer 80 and the operation surface SS is assumed to be sufficiently small, the operation on the operation surface SS can be executed using the pointer 80 without determining the distance ΔZ between the pointer 80 and the operation surface SS.
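As a variant sketch only, the output layer in this case could emit a probability that is thresholded into such a flag; the head below reuses the flattened feature size of the earlier illustrative network, and both the head and the threshold value are assumptions, not the configuration of the disclosure.

```python
import torch
import torch.nn as nn

class TouchFlagHead(nn.Module):
    """Hypothetical output head emitting an operation execution probability."""

    def __init__(self, feature_dim=8192):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(feature_dim, 64), nn.ReLU(),
                                nn.Linear(64, 1))

    def forward(self, features):
        # Probability that the pointer is effectively on the operation surface.
        return torch.sigmoid(self.fc(features))

def operation_execution_parameter(probability, threshold=0.5):
    # Flag indicating that the operation corresponding to the pointer
    # position is to be executed (illustrative threshold).
    return bool(probability.item() >= threshold)

# Example usage with random features from the convolutional part.
head = TouchFlagHead()
execute = operation_execution_parameter(head(torch.rand(1, 8192)))
```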
As explained above, in the first embodiment, the distance-related parameter related to the distance ΔZ between the operation surface SS and the pointer 80 is determined using the convolutional neural network 630 to which the two region of interest images RO1 and RO2 are input and from which the distance-related parameter is output. Therefore, it is possible to accurately determine the distance-related parameter.
In the first embodiment, the region of interest images RO1 and RO2 input to the convolutional neural network 630 are stereo-calibrated images. Therefore, lens-specific distortion and image distortion due to positional deviation of a camera are corrected by the stereo calibration. Consequently, it is possible to reduce feature extraction errors in the convolutional neural network 630. As a result, there is an advantage that the trained convolutional neural network 630 can also be applied to different lenses and cameras.
The number of cameras may be three or more. That is, with N set to an integer equal to or larger than 3, N cameras may be provided. In this case, the image-for-processing acquiring section 610 acquires N images for processing. The region-of-interest extracting section 620 extracts N region of interest images, each including the pointer 80, from the N images for processing. The input layer 631 of the convolutional neural network 630 is configured to include N input channels to which the N region of interest images are input. With this configuration, the distance-related parameter is determined from the N region of interest images. Therefore, it is possible to accurately determine the distance-related parameter.
The present disclosure is not limited to the embodiments explained above and can be realized in various forms in a range not departing from the gist of the present disclosure. For example, the present disclosure can also be realized by the following aspects. The technical features in the embodiments corresponding to technical features in the aspects described below can be substituted or combined as appropriate in order to solve a part or all of the problems of the present disclosure or in order to achieve a part or all of the effects of the present disclosure. If the technical features are not explained as essential technical features in this specification, the technical features can be deleted as appropriate.
(1) According to a first aspect of the present disclosure, there is provided a position detecting method for detecting a parameter related to a position of a pointer with respect to an operation surface. The position detecting method includes: (a) imaging, using a first camera, the pointer over the operation surface as a background to capture a first captured image, and imaging, using a second camera disposed in a position different from a position of the first camera, the pointer over the operation surface as the background to capture a second captured image; (b) acquiring a first image for processing from the first captured image and acquiring a second image for processing from the second captured image; (c) extracting a first region of interest image and a second region of interest image, each including the pointer, from the first image for processing and the second image for processing; and (d) determining a distance-related parameter related to a distance between the operation surface and the pointer using a convolutional neural network including an input layer including a first input channel to which the first region of interest image is input and a second input channel to which the second region of interest image is input and an output layer that outputs the distance-related parameter.
With the position detecting method, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the two region of interest images are input and from which the distance-related parameter is output, it is possible to accurately determine the distance-related parameter.
(2) In the position detecting method, in the (a), with N set to an integer equal to or larger than 3, the pointer over the operation surface as the background may be captured using N cameras to acquire N captured images, in the (b), N images for processing may be acquired from the N captured images, in the (c), N region of interest images, each including the pointer, may be extracted from the N images for processing, and, in the (d), the distance-related parameter may be determined using a convolutional neural network including an input layer including N input channels to which the N region of interest images are input and an output layer that outputs the distance-related parameter.
With the position detecting method, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the N region of interest images are input and from which the distance-related parameter is output, it is possible to more accurately determine the distance-related parameter.
(3) In the position detecting method, in the (b), the first image for processing and the second image for processing may be created by performing stereo calibration on the first captured image and the second captured image.
With the position detecting method, since the two region of interest images are extracted from the two images for processing on which the stereo calibration is performed, it is possible to accurately determine the distance-related parameter using the convolutional neural network to which the two region of interest images are input.
(4) In the position detecting method, in the (b), the first captured image and the second captured image may be acquired as the first image for processing and the second image for processing.
With the position detecting method, since the first captured image and the second captured image are acquired as the first image for processing and the second image for processing, it is possible to simplify processing for calculating the distance-related parameter.
(5) In the position detecting method, the distance-related parameter may be the distance between the operation surface and the pointer.
With the position detecting method, it is possible to accurately determine the distance between the operation surface and the pointer using the convolutional neural network.
(6) In the position detecting method, the distance-related parameter may be an operation execution parameter indicating that operation on the operation surface corresponding to a position of the pointer is executed.
With the position detecting method, in a situation in which it is assumed that the distance between the pointer and the operation surface is sufficiently small, it is possible to execute the operation on the operation surface using the pointer without determining the distance between the pointer and the operation surface.
(7) In the position detecting method, the (a) may include: sequentially selecting a first infrared illuminating section provided to correspond to the first camera and a second infrared illuminating section provided to correspond to the second camera; and executing imaging using the first camera while performing illumination with the first infrared illuminating section without performing illumination with the second infrared illuminating section, executing imaging using the second camera while performing illumination with the second infrared illuminating section without performing illumination with the first infrared illuminating section, and sequentially acquiring the first captured image and the second captured image one by one at different timings, and the first infrared illuminating section and the second infrared illuminating section may be configured to include at least one of a coaxial illuminating section configured to perform coaxial illumination on the cameras and a peripheral illuminating section disposed to surround peripheries of optical axes of the cameras.
With this position detecting method, since the first captured image and the second captured image can be captured in a state in which the shadow of the pointer on the operation surface is reduced, it is possible to accurately determine the distance-related parameter.
(8) According to a second aspect of the present disclosure, there is provided a position detecting device that detects a parameter related to a position of a pointer with respect to an operation surface. The position detecting device includes: an imaging section including a first camera configured to image the pointer over the operation surface as a background to capture a first captured image and a second camera disposed in a position different from a position of the first camera and configured to image the pointer over the operation surface as the background to capture a second captured image; an image-for-processing acquiring section configured to acquire a first image for processing from the first captured image and acquire a second image for processing from the second captured image; a region-of-interest extracting section configured to extract a first region of interest image and a second region of interest image, each including the pointer, from the first image for processing and the second image for processing; and a convolutional neural network including an input layer including a first input channel to which the first region of interest image is input and a second input channel to which the second region of interest image is input and an output layer that outputs a distance-related parameter related to a distance between the operation surface and the pointer.
With the position detecting device, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the two region of interest images are input and from which the distance-related parameter is output, it is possible to accurately determine the distance-related parameter.
(9) In the position detecting device, the imaging section may include, with N set to an integer equal to or larger than 3, N cameras configured to image the pointer over the operation surface as the background to capture N captured images, the image-for-processing acquiring section may acquire N images for processing from the N captured images, the region-of-interest extracting section may extract N region of interest images, each including the pointer, from the N images for processing, and the convolutional neural network may include an input layer including N input channels to which the N region of interest images are input and an output layer that outputs the distance-related parameter.
With the position detecting device, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the N region of interest images are input and from which the distance-related parameter is output, it is possible to more accurately determine the distance-related parameter.
(10) In the position detecting device, the image-for-processing acquiring section may create the first image for processing and the second image for processing by performing stereo calibration on the first captured image and the second captured image.
With the position detecting device, since the two region of interest images are extracted from the two images for processing on which the stereo calibration is performed, it is possible to accurately determine the distance-related parameter using the convolutional neural network to which the two region of interest images are input.
(11) In the position detecting device, the image-for-processing acquiring section may acquire the first captured image and the second captured image as the first image for processing and the second image for processing.
With the position detecting device, since the first captured image and the second captured image are acquired as the first image for processing and the second image for processing, it is possible to simplify processing for calculating the distance-related parameter.
(12) In the position detecting device, the distance-related parameter may be the distance between the operation surface and the pointer.
With the position detecting device, it is possible to accurately determine the distance between the operation surface and the pointer using the convolutional neural network.
(13) In the position detecting device, the distance-related parameter may be an operation execution parameter indicating that operation on the operation surface corresponding to a position of the pointer is executed.
With the position detecting device, in a situation in which it is assumed that the distance between the pointer and the operation surface is sufficiently small, it is possible to execute the operation on the operation surface using the pointer without determining the distance between the pointer and the operation surface.
(14) The position detecting device may further include: a first infrared illuminating section configured to include at least one of a coaxial illuminating section configured to perform coaxial illumination on the first camera and a peripheral illuminating section disposed to surround a periphery of an optical axis of the first camera; a second infrared illuminating section configured to include at least one of a coaxial illuminating section configured to perform coaxial illumination on the second camera and a peripheral illuminating section disposed to surround a periphery of an optical axis of the second camera; and an imaging control section configured to control imaging performed using the first camera and the first infrared illuminating section and the second camera and the second infrared illuminating section. The imaging control section may sequentially select the first camera and the first infrared illuminating section and the second camera and the second infrared illuminating section, execute imaging using the first camera while performing illumination with the first infrared illuminating section without performing illumination with the second infrared illuminating section, execute imaging using the second camera while performing illumination with the second infrared illuminating section without performing illumination with the first infrared illuminating section and sequentially capture the first captured image and the second captured image at different timings.
With this position detecting device, since the first captured image and the second captured image can be captured in a state in which the shadow of the pointer on the operation surface is reduced, it is possible to accurately determine the distance-related parameter.
(15) According to a third aspect of the present disclosure, there is provided an interactive projector that detects a parameter related to a position of a pointer with respect to an operation surface. The interactive projector includes: a projecting section configured to project a projection image onto the operation surface; an imaging section including a first camera configured to image the pointer over the operation surface as a background to capture a first captured image and a second camera disposed in a position different from a position of the first camera and configured to image the pointer over the operation surface as the background to capture a second captured image; an image-for-processing acquiring section configured to acquire a first image for processing from the first captured image and acquire a second image for processing from the second captured image; a region-of-interest extracting section configured to extract a first region of interest image and a second region of interest image, each including the pointer, from the first image for processing and the second image for processing; and a convolutional neural network including an input layer including a first input channel to which the first region of interest image is input and a second input channel to which the second region of interest image is input and an output layer that outputs a distance-related parameter related to a distance between the operation surface and the pointer.
With the interactive projector, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the two region of interest images are input and from which the distance-related parameter is output, it is possible to accurately determine the distance-related parameter.
Number | Date | Country | Kind
---|---|---|---
2019-014290 | Jan. 30, 2019 | JP | national