The present application is based on, and claims priority from JP Application Serial Number 2019-014287, filed Jan. 30, 2019, the disclosure of which is hereby incorporated by reference herein in its entirety.
The present disclosure relates to a technique for detecting the position of a pointer.
JP-A-2016-184850 (Patent Literature 1) discloses a projector capable of projecting a projection screen onto a screen, capturing, with a camera, an image including a pointer such as a finger, and detecting the position of the pointer using the captured image. When the tip of the pointer is in contact with the screen, the projector recognizes that a predetermined instruction for drawing or the like has been input to the projection screen and redraws the projection screen according to the instruction. The user is therefore capable of inputting various instructions using the projection screen as a user interface. A projector of this type, which allows the user to input instructions using the projection screen on the screen as a user interface, is called an "interactive projector". A screen surface that functions as the surface for inputting instructions with the pointer is also called an "operation surface". The position of the pointer is determined by triangulation using a plurality of images captured by a plurality of cameras.
However, in the related art, the detection accuracy of the distance between the pointer and the operation surface, and of other parameters related to that distance, is not always sufficient. There has therefore been a demand for improving the detection accuracy of the distance-related parameters concerning the distance between the pointer and the operation surface.
According to an aspect of the present disclosure, there is provided a position detecting method for detecting a parameter related to a position of a pointer with respect to an operation surface. The position detecting method includes: (a) imaging, using a first camera, the pointer over the operation surface as a background to capture a first captured image, and imaging, using a second camera disposed in a position different from a position of the first camera, the pointer over the operation surface as the background to capture a second captured image; (b) creating a first calibration image and a second calibration image by performing stereo calibration on the first captured image and the second captured image; (c) extracting a first region of interest image and a second region of interest image, each including the pointer, from the first calibration image and the second calibration image; (d) creating, by calculating, concerning each of the first region of interest image and the second region of interest image, a correlation value of a first kernel region and a second kernel region having corresponding pixels respectively as reference positions, a correlation image having the correlation value as a pixel value; and (e) determining a distance-related parameter related to a distance between the operation surface and the pointer using a convolutional neural network including an input layer to which the correlation image is input and an output layer that outputs the distance-related parameter.
The present disclosure can also be realized as a position detecting device, and in various forms other than the position detecting method and the position detecting device, such as an interactive projector, a computer program for realizing the functions of the method or the device, and a non-transitory recording medium on which the computer program is recorded.
The projector 100 includes a projection lens 210 that projects an image onto the screen plate 820, two cameras 310 and 320 that capture images including the pointer 80, and two illuminating sections 410 and 420 that emit infrared light for detecting the pointer 80, the two illuminating sections 410 and 420 corresponding to the two cameras 310 and 320.
The projection lens 210 projects the projection screen PS onto the operation surface SS. The projection screen PS includes an image drawn in the projector 100. When there is no image drawn in the projector 100, light is projected onto the projection screen PS from the projector 100 and a white image is displayed. In this specification, the "operation surface SS" means a surface used to input an instruction using the pointer 80. The "projection screen PS" means a region of an image projected onto the operation surface SS by the projector 100.
In the system 800, one or a plurality of non-light-emitting pointers 80 are usable. As the pointer 80, non-light-emitting objects such as a finger and a pen are usable. The tip portion of the non-light-emitting pointer 80 used for instructions desirably reflects infrared light well and desirably has a retroreflection characteristic.
The first camera 310 and the second camera 320 are each set to be capable of imaging the entire operation surface SS and each have a function of capturing an image of the pointer 80 over the operation surface SS as a background. That is, the first camera 310 and the second camera 320 create images including the pointer 80 by receiving, out of the infrared light emitted from the first illuminating section 410 and the second illuminating section 420, the light reflected by the operation surface SS and the pointer 80. When the two images captured by the first camera 310 and the second camera 320 are used, a three-dimensional position of the pointer 80 can be calculated by triangulation or the like. The number of cameras may be three or more.
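For reference, with a stereo-calibrated (rectified) pair of cameras, the depth recovered by such triangulation follows the standard disparity relation below, where f is the focal length in pixels, B is the baseline between the two cameras, and d is the horizontal disparity of the pointer tip between the two images; this is textbook stereo geometry, not a formula specific to this disclosure.

```latex
Z = \frac{f\,B}{d}, \qquad d = x_1 - x_2
```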
The first illuminating section 410 has a function of a peripheral illuminating section that illuminates the periphery of an optical axis of the first camera 310 with infrared light. In the example shown in
The number of illuminating elements configuring the first illuminating section 410 is not limited to four and may be any number equal to or larger than two. However, the plurality of illuminating elements configuring the first illuminating section 410 are desirably disposed at rotationally symmetrical positions centered on the first camera 310. The first illuminating section 410 may be configured using a ring-like illuminating element instead of the plurality of illuminating elements. Further, a coaxial illuminating section that emits infrared light through a lens of the first camera 310 may be used as the first illuminating section 410. These modifications are applicable to the second illuminating section 420 as well. When, with N set to an integer equal to or larger than 2, N cameras are provided, a peripheral illuminating section or a coaxial illuminating section is desirably provided for each of the cameras.
An example shown in
The interactive projection system 800 is also operable in modes other than the white board mode. For example, the system 800 is also operable in a PC interactive mode for displaying, on the projection screen PS, an image of data transferred from a personal computer (not shown) via a communication line. In the PC interactive mode, an image of data of spreadsheet software or the like is displayed. Input, creation, correction, and the like of data can be performed using various tools and icons displayed in the image.
The control section 700 performs control of the sections of the projector 100. The control section 700 has a function of an imaging control section 710 that acquires an image of the pointer 80 using the imaging section 300 and the infrared illuminating section 400. Further, the control section 700 has a function of an operation executing section 720 that recognizes content of an instruction performed on the projection screen PS by the pointer 80 detected by the position detecting section 600 and instructs the projection-image generating section 500 to create or change a projection image according to the content of the instruction.
The projection-image generating section 500 includes an image memory 510 that stores a projection image. The projection-image generating section 500 has a function of generating a projection image to be projected onto the operation surface SS by the projecting section 200. The projection-image generating section 500 desirably further has a function of a keystone correction section that corrects trapezoidal distortion of the projection screen PS.
The projecting section 200 has a function of projecting the projection image generated by the projection-image generating section 500 onto the operation surface SS. The projecting section 200 includes a light modulating section 220 and a light source 230 besides the projection lens 210 explained with reference to
The infrared illuminating section 400 includes the first illuminating section 410 and the second illuminating section 420 explained with reference to
The imaging section 300 includes the first camera 310 and the second camera 320 explained with reference to
The position detecting section 600 has a function of calculating a position of the tip portion of the pointer 80 using a first captured image acquired by the first camera 310 and a second captured image acquired by the second camera 320. The position detecting section 600 includes a calibration executing section 610, a region-of-interest extracting section 620, a correlation-image creating section 630, and a convolutional neural network 640. These sections may be stored in a storage region of the position detecting section as models. The calibration executing section 610 creates a first calibration image and a second calibration image, which are two calibration images, by performing stereo calibration on the first captured image and the second captured image, which are the two images captured by the two cameras 310 and 320. The region-of-interest extracting section 620 extracts, from the two calibration images, a first region of interest image and a second region of interest image, which are two region of interest images, each including the pointer 80. The correlation-image creating section 630 creates a correlation image, explained below, from the two region of interest images. The convolutional neural network 640 includes an input layer to which the correlation image is input and an output layer that outputs a distance-related parameter related to the distance between the operation surface SS and the pointer 80. Details of the functions of the sections 610 to 640 are explained below.
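To make the data flow through the sections 610 to 640 concrete, the following is a minimal structural sketch in Python; all names are hypothetical stand-ins injected from outside, not the disclosure's actual implementation.

```python
import numpy as np

class PositionDetector:
    """Structural sketch of the data flow through sections 610 to 640."""

    def __init__(self, rectify, extract_rois, correlate, cnn):
        self.rectify = rectify            # calibration executing section 610
        self.extract_rois = extract_rois  # region-of-interest extracting section 620
        self.correlate = correlate        # correlation-image creating section 630
        self.cnn = cnn                    # convolutional neural network 640

    def detect(self, captured1: np.ndarray, captured2: np.ndarray) -> float:
        cal1, cal2 = self.rectify(captured1, captured2)  # stereo calibration
        roi1, roi2 = self.extract_rois(cal1, cal2)       # ROIs containing the pointer 80
        rim = self.correlate(roi1, roi2)                 # correlation image RIM
        return self.cnn(rim)                             # distance-related parameter
```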
Functions of the sections of the control section 700 and functions of the sections of the position detecting section 600 are realized by, for example, a processor in the projector 100 executing computer programs. A part of the functions of the sections may be realized by a hardware circuit such as an FPGA (field-programmable gate array).
In step S100, the imaging section 300 acquires a plurality of images by imaging the pointer 80 over the operation surface SS as the background.
In step S110, the imaging control section 710 turns on the first illuminating section 410 and turns off the second illuminating section 420. In step S120, the imaging control section 710 captures images using the first camera 310 and the second camera 320. As a result, the first image IM1_1 and the second image IM2_1 shown in an upper part of
In step S130, the imaging control section 710 turns off the first illuminating section 410 and turns on the second illuminating section 420. In step S140, the imaging control section 710 captures images using the first camera 310 and the second camera 320. As a result, the first image IM1_2 and the second image IM2_2 shown in a middle part of
When the imaging in step S120 and step S140 ends, as shown in a lower part of
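A minimal sketch of the alternating illumination sequence of steps S110 to S140 follows; the camera and illuminator driver objects and their methods are hypothetical, and the assumption that each camera's image taken under its own illuminating section is the one carried forward reflects the shadow-minimizing peripheral illumination described above.

```python
def capture_image_set(cam1, cam2, illum1, illum2):
    """Capture IM1_1, IM2_1, IM1_2, IM2_2 under mutually exclusive illumination."""
    # Steps S110/S120: first illuminating section on, second off.
    illum1.on(); illum2.off()
    im1_1, im2_1 = cam1.capture(), cam2.capture()
    # Steps S130/S140: first illuminating section off, second on.
    illum1.off(); illum2.on()
    im1_2, im2_2 = cam1.capture(), cam2.capture()
    illum2.off()
    # Assumption: each camera's image under its own illumination (IM1_1, IM2_2)
    # is used downstream, since that lighting minimizes the pointer's shadow.
    return im1_1, im2_2
```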
When the processing in step S100 ends in this way, in step S200 in
Instead of setting the illumination periods of the two illuminating sections 410 and 420 to mutually exclusive timings and sequentially capturing images in the respective illumination periods as explained with reference to
In step S300 in
In step S400, the correlation-image creating section 630 creates a correlation image RIM from the two region of interest images RO1 and RO2. A pixel value of the correlation image RIM is obtained by calculating, with the following expressions, a correlation value ρ of two kernel regions KR having corresponding pixels respectively as reference positions RP in the two region of interest images RO1 and RO2 as shown in the middle part of

$$\rho = \frac{\sigma_{pq}}{\sigma_p\,\sigma_q} \qquad \text{(1a)}$$

$$\sigma_{pq} = \frac{1}{m^2}\sum_{i=1}^{m^2}(P_i - \mu_p)(Q_i - \mu_q) \qquad \text{(1b)}$$

$$\sigma_p = \sqrt{\frac{1}{m^2}\sum_{i=1}^{m^2}(P_i - \mu_p)^2} \qquad \text{(1c)}$$

$$\sigma_q = \sqrt{\frac{1}{m^2}\sum_{i=1}^{m^2}(Q_i - \mu_q)^2} \qquad \text{(1d)}$$

where Pi is a pixel value of the first calibration image IM1, Qi is a pixel value of the second calibration image IM2, μp is the average of the pixel values in the first kernel region KR of the first calibration image IM1, μq is the average of the pixel values in the second kernel region KR of the second calibration image IM2, m is the number of pixels on one side of the kernel region KR, σp and σq are the standard deviations, and σpq is the covariance. The averages μp and μq are calculated using a publicly known averaging method such as a simple average or a Gaussian average.
The correlation value ρ given by the above Expressions (1a) to (1d) is a so-called correlation coefficient and indicates the similarity of the two calibration images IM1 and IM2 in the kernel region KR. That is, a larger positive correlation value ρ means that the similarity of the two calibration images IM1 and IM2 in the kernel region KR is higher and the parallax is smaller. Therefore, it is possible to determine the distance-related parameter related to the distance between the operation surface SS and the pointer 80 using the correlation image RIM.
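The per-pixel computation of Expressions (1a) to (1d) can be sketched compactly with local box filters. Below is a minimal NumPy/SciPy sketch assuming the simple average; the function name, the use of scipy.ndimage.uniform_filter, and the epsilon guard are illustrative choices, not the disclosure's implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def correlation_image(roi1: np.ndarray, roi2: np.ndarray, m: int = 5) -> np.ndarray:
    """Correlation image RIM: per-pixel correlation coefficient of m-by-m kernels."""
    p = roi1.astype(np.float64)
    q = roi2.astype(np.float64)
    mu_p = uniform_filter(p, size=m)                         # local mean of P
    mu_q = uniform_filter(q, size=m)                         # local mean of Q
    var_p = uniform_filter(p * p, size=m) - mu_p ** 2        # local variance of P
    var_q = uniform_filter(q * q, size=m) - mu_q ** 2        # local variance of Q
    cov_pq = uniform_filter(p * q, size=m) - mu_p * mu_q     # local covariance (1b)
    eps = 1e-12  # guards against division by zero in textureless regions
    return cov_pq / np.sqrt(np.maximum(var_p * var_q, eps))  # Expression (1a)
```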
A value other than the correlation coefficient can be used as the correlation value ρ. For example, the SAD (Sum of Absolute Differences) or the SSD (Sum of Squared Differences) may be used as the correlation value ρ. Preprocessing such as normalization and smoothing is desirably performed on the first calibration image IM1 and the second calibration image IM2 before the correlation image RIM is created.
In step S500 in
The distance-related parameter can be determined using the convolutional neural network 640 because the distance-related parameter has a positive or negative correlation with a feature value of the correlation image RIM. One feature value having such a correlation is a statistical representative value of the correlation value ρ in the correlation image RIM, such as an average, a maximum, or a median. In the following explanation, an average ρave of the correlation value ρ is used as an example of the statistical representative value of the correlation value ρ in the correlation image RIM. The statistical representative value ρave has a negative correlation with the distance ΔZ between the operation surface SS and the pointer 80.
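As one possible realization of the convolutional neural network 640, the following is a minimal PyTorch sketch that takes a single-channel correlation image at the input layer and regresses a scalar distance-related parameter at the output layer; the layer count and channel widths are illustrative assumptions, since the disclosure does not fix a particular architecture here.

```python
import torch
import torch.nn as nn

class DistanceCNN(nn.Module):
    """Maps a 1-channel correlation image to a scalar distance-related parameter."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # tolerates varying region-of-interest sizes
        )
        self.head = nn.Linear(32, 1)  # output layer: the distance-related parameter

    def forward(self, rim: torch.Tensor) -> torch.Tensor:
        # rim: (batch, 1, H, W) correlation image with values in [-1, 1]
        return self.head(self.features(rim).flatten(1))
```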
In step S600 in
In step S500, the distance ΔZ between the operation surface SS and the pointer 80 is determined as the distance-related parameter. However, a parameter other than the distance ΔZ may be calculated as the distance-related parameter. For example, when, from the feature values obtained according to the correlation image RIM, it can be assumed in step S500 that the distance ΔZ is sufficiently small, the operation in step S700 may be immediately executed without calculating the distance ΔZ. In this case, the distance-related parameter is an operation execution parameter such as a flag or a command indicating execution of operation corresponding to the position of the pointer 80. With this configuration, in a situation in which the distance ΔZ between the pointer 80 and the operation surface SS is assumed to be sufficiently small, it is possible to execute operation on the operation surface SS using the pointer 80 without determining the distance ΔZ between the pointer 80 and the operation surface SS.
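Interpreted that way, the network's output drives a branch rather than a distance value; a minimal sketch, in which the threshold, the callback, and all names are hypothetical:

```python
def handle_pointer(flag_score: float, xy, execute_operation, threshold: float = 0.5):
    """Treat the CNN output as an operation execution flag (assumed in [0, 1])."""
    if flag_score > threshold:
        execute_operation(xy)  # e.g., draw at the pointer position (step S700)
```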
As explained above, in the first embodiment, the distance-related parameter related to the distance ΔZ between the operation surface SS and the pointer 80 is determined using the convolutional neural network 640 to which the correlation image is input and from which the distance-related parameter is output. Therefore, it is possible to accurately determine the distance-related parameter.
The number of cameras may be three or more. That is, with N set to an integer equal to or larger than 3, N cameras may be provided. In this case, the calibration executing section 610 creates N calibration images from the images respectively captured by the N cameras. The region-of-interest extracting section 620 extracts N region of interest images, each including the pointer 80, from the N calibration images. With M set to an integer equal to or larger than 1 and equal to or smaller than {N(N−1)/2}, the correlation-image creating section 630 can create M correlation images from M sets of region of interest images, two of which are selected out of the N region of interest images. That is, the correlation-image creating section 630 is capable of creating, by calculating, concerning the two region of interest images of each set, a correlation value of two kernel regions KR having corresponding pixels respectively as reference positions in the two region of interest images, the M correlation images having the correlation value as a pixel value, as sketched below. The input layer 641 of the convolutional neural network 640 is configured to receive the M correlation images as inputs. With this configuration, since the N images can be captured in a state in which the pointer 80 casts less shadow on the operation surface, it is possible to accurately determine the distance-related parameter.
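The pairing of the N region of interest images into at most N(N−1)/2 correlation images can be sketched as follows, reusing the correlation_image routine sketched earlier; as before, the names are illustrative.

```python
from itertools import combinations

def correlation_images(rois, m: int = 5):
    """One correlation image per unordered pair of region of interest images.

    For N inputs this yields N*(N-1)/2 images; a subset of M of them may be
    selected if fewer channels are wanted at the CNN's input layer 641.
    """
    return [correlation_image(a, b, m) for a, b in combinations(rois, 2)]
```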
The present disclosure is not limited to the embodiments explained above and can be realized in various forms in a range not departing from the gist of the present disclosure. For example, the present disclosure can also be realized by the following aspects. The technical features in the embodiments corresponding to technical features in the aspects described below can be substituted or combined as appropriate in order to solve a part or all of the problems of the present disclosure or in order to achieve a part or all of the effects of the present disclosure. If the technical features are not explained as essential technical features in this specification, the technical features can be deleted as appropriate.
(1) According to a first aspect of the present disclosure, there is provided a position detecting method for detecting a parameter related to a position of a pointer with respect to an operation surface. The position detecting method includes: (a) imaging, using a first camera, the pointer over the operation surface as a background to capture a first captured image, and imaging, using a second camera disposed in a position different from a position of the first camera, the pointer over the operation surface as the background to capture a second captured image; (b) creating a first calibration image and a second calibration image by performing stereo calibration on the first captured image and the second captured image; (c) extracting a first region of interest image and a second region of interest image, each including the pointer, from the first calibration image and the second calibration image; (d) creating, by calculating, concerning each of the first region of interest image and the second region of interest image, a correlation value of a first kernel region and a second kernel region having corresponding pixels respectively as reference positions, a correlation image having the correlation value as a pixel value; and (e) determining a distance-related parameter related to a distance between the operation surface and the pointer using a convolutional neural network including an input layer to which the correlation image is input and an output layer that outputs the distance-related parameter.
With the position detecting method, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the correlation image is input and from which the distance-related parameter is output, it is possible to accurately determine the distance-related parameter.
(2) In the position detecting method, in the (a), with N set to an integer equal to or larger than 3, the pointer over the operation surface as the background may be captured by N cameras to acquire N captured images, in the (b), N calibration images may be created by performing the stereo calibration on the N captured images, in the (c), N region of interest images, each including the pointer, may be extracted from the N calibration images, in the (d), with M set to an integer equal to or larger than 1 and equal to or smaller than {N(N−1)/2}, by calculating, concerning each of M sets of two region of interest images, two of which are selected out of the N region of interest images, a correlation value of two kernel regions having corresponding pixels respectively as reference positions in the two region of interest images, M correlation images having the correlation value as a pixel value may be created, and in the (e), the distance-related parameter may be determined using a convolutional neural network including an input layer to which the M correlation images are input and an output layer that outputs the distance-related parameter.
With the position detecting method, since the distance-related parameter is determined using the three or more cameras, it is possible to more accurately determine the distance-related parameter.
(3) In the position detecting method, the distance-related parameter may be the distance between the operation surface and the pointer.
With the position detecting method, it is possible to accurately determine the distance between the operation surface and the pointer according to a statistical representative value of correlation values concerning a plurality of correlation images.
(4) In the position detecting method, the distance-related parameter may be an operation execution parameter indicating that operation on the operation surface corresponding to a position of the pointer is executed.
With the position detecting method, in a situation in which it is assumed that the distance between the pointer and the operation surface is sufficiently small, it is possible to execute the operation on the operation surface using the pointer without determining the distance between the pointer and the operation surface.
(5) In the position detecting method, the (a) may include: sequentially selecting a first infrared illuminating section provided to correspond to the first camera and a second infrared illuminating section provided to correspond to the second camera; and executing imaging using the first camera while performing illumination with the first infrared illuminating section without performing illumination with the second infrared illuminating section, executing imaging using the second camera while performing illumination with the second infrared illuminating section without performing illumination with the first infrared illuminating section, and sequentially acquiring the first captured image and the second captured image one by one at different timings, and the first infrared illuminating section and the second infrared illuminating section may be configured to include at least one of a coaxial illuminating section configured to perform coaxial illumination on the cameras and a peripheral illuminating section disposed to surround peripheries of optical axes of the cameras.
With this position detecting method, since the two captured images can be captured in a state in which the pointer casts less shadow on the operation surface, it is possible to accurately determine the distance-related parameter.
(6) According to a second aspect of the present disclosure, there is provided a position detecting device that detects a parameter related to a position of a pointer with respect to an operation surface. The position detecting device includes: an imaging section including a first camera configured to image the pointer over the operation surface as a background to capture a first captured image and a second camera disposed in a position different from a position of the first camera and configured to image the pointer over the operation surface as the background to capture a second captured image; a calibration executing section configured to create a first calibration image and a second calibration image by performing stereo calibration on the first captured image and the second captured image; a region-of-interest extracting section configured to extract a first region of interest image and a second region of interest image, each including the pointer, from the first calibration image and the second calibration image; a correlation-image creating section configured to create, by calculating, concerning each of the first region of interest image and the second region of interest image, a correlation value of a first kernel region and a second kernel region having corresponding pixels respectively as reference positions, a correlation image having the correlation value as a pixel value; and a convolutional neural network including an input layer to which the correlation image is input and an output layer that outputs a distance-related parameter related to a distance between the operation surface and the pointer.
With the position detecting device, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the correlation image is input and from which the distance-related parameter is output, it is possible to accurately determine the distance-related parameter.
(7) In the position detecting device, with N set to an integer equal to or larger than 3, the imaging section may include N cameras configured to image the pointer over the operation surface as the background to acquire N captured images, the calibration executing section may create N calibration images by performing the stereo calibration on the N captured images, the region-of-interest extracting section may extract N region of interest images, each including the pointer, from the N calibration images, with M set to an integer equal to or larger than 1 and equal to or smaller than {N(N−1)/2}, the correlation-image creating section may create, by calculating, concerning each of M sets of region of interest images, two of which are selected out of the N region of interest images, a correlation value of two kernel regions having corresponding pixels respectively as reference positions in the two region of interest images, M correlation images having the correlation value as a pixel value, and the convolutional neural network may include an input layer to which the M correlation images are input and an output layer that outputs the distance-related parameter related to the distance between the operation surface and the pointer.
With the position detecting device, since the distance-related parameter is determined using the three or more cameras, it is possible to more accurately determine the distance-related parameter.
(8) In the position detecting device, the distance-related parameter may be the distance between the operation surface and the pointer.
With the position detecting device, it is possible to accurately determine the distance between the operation surface and the pointer according to a statistical representative value of correlation values concerning a plurality of correlation images.
(9) In the position detecting device, the distance-related parameter may be an operation execution parameter indicating that operation on the operation surface corresponding to a position of the pointer is executed.
With the position detecting device, in a situation in which it is assumed that the distance between the pointer and the operation surface is sufficiently small, it is possible to execute the operation on the operation surface using the pointer without determining the distance between the pointer and the operation surface.
(10) The position detecting device may further include: a first infrared illuminating section configured to include at least one of a coaxial illuminating section configured to perform coaxial illumination on the first camera and a peripheral illuminating section disposed to surround a periphery of an optical axis of the first camera; a second infrared illuminating section configured to include at least one of a coaxial illuminating section configured to perform coaxial illumination on the second camera and a peripheral illuminating section disposed to surround a periphery of an optical axis of the second camera; and an imaging control section configured to control imaging performed using the first camera and the first infrared illuminating section and the second camera and the second infrared illuminating section. The imaging control section may sequentially select the first camera and the first infrared illuminating section and the second camera and the second infrared illuminating section and sequentially acquire the first captured image and the second captured image at different timings by executing imaging using the first camera while performing illumination with the first infrared illuminating section without performing illumination with the second infrared illuminating section and executing imaging using the second camera while performing illumination with the second infrared illuminating section without performing illumination with the first infrared illuminating section.
With this position detecting device, since the two captured images can be captured in a state in which the pointer casts less shadow on the operation surface, it is possible to accurately determine the distance-related parameter.
(11) According to a third aspect of the present disclosure, there is provided an interactive projector that detects a parameter related to a position of a pointer with respect to an operation surface. The interactive projector includes: a projecting section configured to project a projection image onto the operation surface; an imaging section including a first camera configured to image the pointer over the operation surface as a background to capture a first captured image and a second camera disposed in a position different from a position of the first camera and configured to image the pointer over the operation surface as the background to capture a second captured image; a calibration executing section configured to create a first calibration image and a second calibration image by performing stereo calibration on the first captured image and the second captured image; a region-of-interest extracting section configured to extract a first region of interest image and a second region of interest image, each including the pointer, from the first calibration image and the second calibration image; a correlation-image creating section configured to create, by calculating, concerning each of the first region of interest image and the second region of interest image, a correlation value of a first kernel region and a second kernel region having corresponding pixels respectively as reference positions, a correlation image having the correlation value as a pixel value; and a convolutional neural network including an input layer to which the correlation image is input and an output layer that outputs a distance-related parameter related to a distance between the operation surface and the pointer.
With the interactive projector, since the distance-related parameter related to the distance between the operation surface and the pointer is determined using the convolutional neural network to which the correlation image is input and from which the distance-related parameter is output, it is possible to accurately determine the distance-related parameter.