This application claims the priority benefit of Taiwan application serial no. 101149581, filed on Dec. 24, 2012. The entirety of the above-mentioned patent application is hereby incorporated by reference herein and made a part of this specification.
The disclosure relates to a three-dimensional (3D) interactive device and an operation method thereof.
In recent years, contact-free human-machine interfaces (cfHMIs) have developed rapidly. According to a research paper authored by an analyst from Forrester Research, Inc., once the motion sensing technology that is revolutionizing how human beings interact with electronic devices fully enters our daily lives, it will shape the interactive experiences of the future. At present, a number of manufacturers are dedicated to creating various products that may be applied in our daily lives. For instance, Kinect, the new-era motion sensing input device launched by Microsoft, enables gamers to interact with the console through gestures and body movements without touching a game controller; at the exhibition hall of Boeing, an interactive virtual simulator allows people to experience 3D flight simulation.
Depth images provide complete spatial image information, and therefore how to obtain the information of the third dimension (i.e., the depth) in an effective, timely, and stable manner is essential to the development of interactive virtual input technologies. According to the related art, depth map technology that may achieve “spatial 3D interaction” has drawn most of the attention; however, the absolute coordinates of hand motion, let alone delicate finger motion, can barely be obtained through depth map estimation due to existing limitations of distance, resolution, and so on. As such, it is rather difficult to apply the depth map technology to meet small-range cfHMI requirements.
The existing interactive input devices are mostly applied to human-machine interactions within a wide range, such as a large-sized immersive interactive virtual device, an interactive digital blackboard, a motion-sensing interactive game, and so forth, whereas an object moving within a rather small range in the 3D space cannot easily be positioned accurately. For instance, the existing interactive input technology is not suitable for capturing fine, short-range motions of hand-sized objects. Although some interactive devices equipped with handheld infrared projectors or markers may track and recognize users' gestures or motions, these interactive devices are only applicable when the interaction range is wide. Owing to the fixed projection image area and the imprecise dimensions of markers, the conventional interactive technologies are not yet applicable to a portable, contact-free 3D human-machine interactive interface capable of capturing short-range motions.
In an exemplary embodiment of the disclosure, a 3D interactive device that includes a projection unit, an image capturing unit, and an image processing unit is provided. The projection unit projects an interactive pattern to a surface of a body, such that a user performs an interactive trigger operation on the interactive pattern by a gesture. Here, the interactive pattern is projected within a projection range. The image capturing unit captures a depth image within an image capturing range, and the image capturing range covers the projection range. The image processing unit is connected to the projection unit and the image capturing unit, receives the depth image, and determines whether the depth image includes a hand region of the user. If so, the image processing unit performs hand geometric recognition on the hand region to obtain gesture interactive semantics. According to the gesture interactive semantics, the image processing unit controls the projection unit and the image capturing unit.
In another exemplary embodiment of the disclosure, an operation method of a 3D interactive device is provided, and the 3D interactive device includes a projection unit and an image capturing unit. The operation method includes the following steps. A coordinate calibration process is performed on a projection coordinate of the projection unit and an image capturing coordinate of the image capturing unit. An interactive pattern is projected to a surface of a body by the projection unit, such that a user performs an interactive trigger operation on the interactive pattern by a gesture. Here, the interactive pattern is projected within a projection range. A depth image within an image capturing range is captured by the image capturing unit, and the image capturing range covers the projection range. Whether the depth image includes a hand region of the user is determined. If so, hand geometric recognition is performed on the hand region to obtain gesture interactive semantics. The projection unit and the image capturing unit are controlled according to the gesture interactive semantics.
Several exemplary embodiments accompanied with figures are described in detail below to further explain the disclosure.
The accompanying drawings are included to provide further understanding, and are incorporated in and constitute a part of this specification. The drawings illustrate exemplary embodiments and, together with the description, serve to explain the principles of the disclosure.
In the three-dimensional (3D) interactive device described in the exemplary embodiment, the design of an image capturing unit/device and that of a projection unit/device are combined, and a technique of calibrating a projection coordinate and an image capturing coordinate is applied, such that the portable 3D interactive device described herein may achieve contact-free interactive input effects. Since the target object is recognized and tracked by means of a depth image, the 3D interactive device is resistant to environmental changes and background light variations and is able to withstand ambient light interference. Besides, the user of the 3D interactive device described herein need not wear markers; the 3D interactive device can still perform the gesture recognition function and is capable of positioning the fingertips of a user in the 3D space, thus ensuring contact-free interaction with the user's motion within a small range (e.g., the size of a hand portion). The portable contact-free 3D interactive device is able to locate the coordinate of the motion of an object within a small range, so as to achieve 3D spatial interaction effects by projecting an interactive pattern to any location.
In order to make the disclosure more comprehensible, several exemplary embodiments are described below. The exemplary embodiments provided herein are explanatory and do not serve to limit the scope of the disclosure.
In the disclosure, the 3D interactive device may be integrated into a medical device used during operations, for instance, such that the medical device can provide not only the input function of pressing contact-type physical keys but also the function of positioning fingertips in the 3D space. Thereby, paramedics are able to control and operate the medical device in a contact-free manner, thus lowering the possibility of bacterial infection caused by human contact.
In the exemplary embodiment of the disclosure, the 3D interactive device 10 at least includes a projection unit and an image capturing unit (that are not shown in
Although a change in the angle or the position of the operation lamp 20 may affect the image capturing location or the projection location of the 3D interactive device 10, the 3D interactive device 10 may continuously capture the depth image that covers the hand portion of the user U by means of the image capturing unit and further recognize the coordinate of the hand portion, such that the projection unit is allowed to project the interactive pattern to the hand portion of the user U. The size of the hand portion of the user U is approximately 10 cm×10 cm (i.e., a small area). The 3D interactive device 10 may also accurately analyze the variations in the hand gestures and movement of the user U, interpret the gesture interactive semantics of the gestures and movement, and thereby present the result of the interaction.
The detailed way to implement the 3D interactive device 10 shown in
With reference to
The projection unit 210 projects an interactive pattern to a surface of a body. The body may, for instance, refer to a projection screen, human body parts, a user's hand portion, an operation table, a hospital bed, a working table, a desktop, a wall, a notebook, a piece of paper, a wooden board, or any other object onto which the interactive pattern may be projected; however, the disclosure is not limited thereto. In an exemplary embodiment, the projection unit 210 may be a pico-projector (also referred to as a mini projector), for instance. In general, the pico-projector uses a light emitting diode (LED) or another solid-state optical source as its light source, so as to provide the required lumens and thereby increase the brightness of the image projected by the pico-projector. The dimensions of the pico-projector are similar to those of an ordinary consumer mobile phone; therefore, the pico-projector is portable, can be used anywhere, and is thus suitable for being utilized in the 3D interactive device 200 described herein.
For instance, the projection unit 210 may have different specifications and may be the “BENQ Jobee GP2” (trade name) with a 44-inch short-distance projection function and a brightness of 200 lumens, the “ViewSonic high-definition palm-sized LED projector PLED-W200” (trade name) with a 40-inch short-focal projection design and a brightness of 250 lumens, or the “i-connect ViewX” (trade name) laser pico-projector. The products exemplified above are examples of the projection unit 210 and should not be construed as limitations to the disclosure.
In an exemplary embodiment, the image capturing unit 220 may be a “depth image camera” which not only takes two-dimensional pictures but also emits infrared light. By calculating the time it takes for the emitted infrared light to reach the object being captured and to be reflected back, the depth image camera may determine the distance from the object to the camera itself and thus obtain a depth image/depth map indicating the distance to the object. According to an exemplary embodiment, the image capturing unit 220 may be a contact-free depth camera with the active scanning specification.
For instance, the image capturing unit 220 may have different specifications and may be a time-of-flight camera, a stereo vision depth camera, a laser speckle camera, a laser tracking camera, and so forth.
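As a non-limiting illustration of the time-of-flight principle described above, the following minimal Python sketch (an assumption of this description rather than the firmware of any actual camera; the NumPy usage and the numeric example are hypothetical) converts measured round-trip times of the emitted infrared light into a per-pixel depth map using d = c·t/2.

```python
# Minimal time-of-flight sketch: converts measured round-trip times of the
# emitted infrared light into per-pixel distances (a depth map).
# Illustrative assumption only, not the actual firmware of any depth camera.
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0  # meters per second


def round_trip_time_to_depth(round_trip_time_s: np.ndarray) -> np.ndarray:
    """Convert per-pixel round-trip times (seconds) into distances (meters).

    The light travels to the object and back, so the one-way distance is
    half of the total path: d = c * t / 2.
    """
    return SPEED_OF_LIGHT * round_trip_time_s / 2.0


if __name__ == "__main__":
    # A 2x2 example: a 6.67 ns round trip corresponds to roughly 1 m.
    times = np.array([[6.67e-9, 13.3e-9],
                      [3.33e-9, 6.67e-9]])
    print(round_trip_time_to_depth(times))
```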
The image processing unit 230 may be implemented in the form of software, hardware, or a combination thereof, which should not be construed as a limitation to the disclosure. The software may refer to application software or a driver, for instance. The hardware may refer to a central processing unit (CPU), a general-purpose or special-purpose programmable microprocessor, a digital signal processor (DSP), and so on, for instance.
The image processing unit 230 may further include a gesture recognition unit 232 which not only can identify the hand region of the user in the depth image captured by the image capturing unit 220 but also can recognize the geometrical shape of the hand region of the user. Through comparing a sample in a gesture interactive semantic database (not shown) with the recognized hand region, the gesture recognition unit 232 obtains the gesture interactive semantics represented by the gesture of the user.
The coordinate calibration unit 240 is coupled to the projection unit 210 and the image capturing unit 220 for calibrating a projection coordinate of the projection unit 210 and an image capturing coordinate of the image capturing unit 220. The calibration method will be elaborated hereinafter.
In step S310, the coordinate calibration unit 240 performs a coordinate calibration process on a projection coordinate of the projection unit 210 and an image capturing coordinate of the image capturing unit 220 in the 3D interactive device 200. After a coordinate transformation between the projection coordinate and the image capturing coordinate is obtained, in step S320, the projection unit 210 projects a first interactive pattern to a surface of a body, such that a user performs an interactive trigger operation on the first interactive pattern by a gesture. Here, the first interactive pattern is projected within a predetermined projection range. According to the exemplary embodiment, the body may, for instance, refer to a user's hand, human body parts, a surface of a platform, a projection screen, an operation table, a hospital bed, a working table, a desktop, a wall, a notebook, a piece of paper, a wooden board, or any other object on which the first interactive pattern may be projected; however, the disclosure is not limited thereto.
In step S330, the image capturing unit 220 captures a depth image within an image capturing range, and the image capturing range covers the projection range. After the depth image is captured, in step S340, the image processing unit 230 determines whether the depth image includes a hand region of the user through the gesture recognition unit 232. If so, hand geometric recognition is performed on the hand region to obtain gesture interactive semantics (step S350). In step S360, the image processing unit 230 controls the projection unit 210 and the image capturing unit 220 according to the gesture interactive semantics. In an exemplary embodiment, the image processing unit 230 may control the projection unit 210 to project a second interactive pattern (i.e., a resultant interactive pattern) corresponding to the gesture interactive semantics onto the surface of the body, so as to continue the interaction. The image processing unit 230 may also control the image capturing unit 220 to continuously capture the depth image that contains the gesture of the user.
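For illustration only, the following Python sketch outlines the control flow of steps S310 to S360. The projector, camera, recognizer, and calibrator interfaces named here are hypothetical placeholders assumed for this sketch and are not the actual implementation of the 3D interactive device 200.

```python
# Sketch of the operation flow of steps S310-S360, using hypothetical
# projector/camera/recognizer/calibrator objects that are assumptions of
# this description rather than the device's real components.

def run_interactive_loop(projector, depth_camera, recognizer, calibrator,
                         first_pattern):
    # Step S310: calibrate the projection coordinate against the image
    # capturing coordinate, obtaining a coordinate transformation.
    transform = calibrator.calibrate(projector, depth_camera)

    # Step S320: project the first interactive pattern onto the body surface.
    projector.project(first_pattern)

    while True:
        # Step S330: capture a depth image covering the projection range.
        depth_image = depth_camera.capture_depth()

        # Step S340: check whether the depth image contains a hand region.
        hand_region = recognizer.find_hand_region(depth_image)
        if hand_region is None:
            continue

        # Step S350: hand geometric recognition yields gesture interactive
        # semantics.
        semantics = recognizer.recognize(hand_region)

        # Step S360: control the projector and camera according to the
        # semantics, e.g., project a second (resultant) interactive pattern.
        projector.project(semantics.resultant_pattern(transform))
```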
The coordinate calibration process performed by the coordinate calibration unit 240 as described in step S310 is elaborated hereinafter.
In an exemplary embodiment, the coordinate calibration process may be divided into steps S312, S314, and S316. The projection unit 210 respectively projects a border marker symbol and a center marker symbol onto at least one border point and the center of the projection range, so as to form a preset calibration pattern (step S312). According to the exemplary embodiment, the border marker symbol is a pattern of a circular point, for instance, and the center marker symbol is a hand-shaped pattern, for instance. However, the disclosure is not limited thereto, and both the border marker symbol and the center marker symbol may have any shape. As shown in
In step S314, the image capturing unit 220 captures a three-primary-color image of the calibration pattern. During the coordinate calibration process, the image capturing unit 220 serves to capture the three-primary-color image (i.e., the RGB image) instead of the depth image. Since the depth image merely provides depth information, the three-primary-color image of the calibration pattern has to be additionally obtained. With reference to
Through conducting an image comparison method, the coordinate calibration unit 240 analyzes a coordinate of the border marker symbol and a coordinate of the center marker symbol in the three-primary-color image, so as to obtain a coordinate transformation between the projection coordinate and the image capturing coordinate (step S316). The image comparison method includes but is not limited to a chamfer distance image comparison method. Any image comparison method that is suitable for analyzing and comparing the border marker symbol and the center marker symbol can be applied in this step.
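A non-limiting sketch of how step S316 might be carried out is given below: the projected marker symbols are located in the captured three-primary-color image with a simple chamfer-distance template match, and the coordinate transformation between the projection coordinate and the image capturing coordinate is then estimated as a homography. The use of OpenCV and the specific function names are assumptions of this illustration, not the actual code of the coordinate calibration unit 240.

```python
# Hedged sketch of the coordinate calibration step (S316): locate the projected
# marker symbols in the captured RGB image with a coarse chamfer-distance
# template match, then estimate the projector-to-camera transformation as a
# homography from the marker correspondences.
import cv2
import numpy as np


def chamfer_match(scene_gray: np.ndarray, template_gray: np.ndarray):
    """Return the (x, y) location in the scene whose window best matches the
    template under the chamfer distance (average distance from template edge
    pixels to the nearest scene edge pixel)."""
    scene_edges = cv2.Canny(scene_gray, 50, 150)
    template_edges = cv2.Canny(template_gray, 50, 150)
    # Distance from every scene pixel to the nearest scene edge pixel.
    dist = cv2.distanceTransform(255 - scene_edges, cv2.DIST_L2, 3)
    ty, tx = np.nonzero(template_edges)
    h, w = template_gray.shape
    H, W = scene_gray.shape
    best_score, best_xy = np.inf, (0, 0)
    for y in range(0, H - h, 4):          # coarse stride keeps the sketch fast
        for x in range(0, W - w, 4):
            score = dist[y + ty, x + tx].mean()
            if score < best_score:
                best_score, best_xy = score, (x + w // 2, y + h // 2)
    return best_xy


def estimate_projector_to_camera(marker_projector_xy, marker_camera_xy):
    """Homography mapping projector coordinates to camera (RGB image)
    coordinates, computed from at least four marker correspondences."""
    src = np.asarray(marker_projector_xy, dtype=np.float32)
    dst = np.asarray(marker_camera_xy, dtype=np.float32)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC)
    return H
```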
As shown in
Another exemplary embodiment is provided hereinafter to elaborate the detailed steps S340 and S350 performed by the gesture recognition unit 232 shown in
With reference to
It is determined whether the depth image includes an image block with depth values greater than a depth threshold, so as to determine whether the depth image includes a hand region of the user (step S503). The depth threshold is set to 200, for instance.
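A minimal sketch of step S503 is given below, assuming that larger depth values correspond to objects closer to the image capturing unit and that the depth image is scaled so that the example threshold of 200 applies; the minimum block size is likewise an assumed value.

```python
# Illustrative sketch of step S503: segment a candidate hand region by
# thresholding the depth image. Threshold and minimum block size follow the
# example above and an assumed pixel count, respectively.
import numpy as np

DEPTH_THRESHOLD = 200
MIN_HAND_PIXELS = 1500        # assumed minimum block size for a hand region


def find_hand_mask(depth_image: np.ndarray):
    """Return a boolean mask of the candidate hand region, or None if the
    depth image contains no sufficiently large block above the threshold."""
    mask = depth_image > DEPTH_THRESHOLD
    return mask if mask.sum() >= MIN_HAND_PIXELS else None
```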
If the depth image includes the hand region of the user, a convex hull and a convex deficiency of the depth image are analyzed according to a contour of the hand region (step S505).
The information regarding the convex hull and the convex deficiency of the depth image may be applied to recognize the geometrical shape of the hand region (step S507). It is then determined whether the hand region is the left-hand region or the right-hand region of the user (step S509). If it is determined that the hand region is the left-hand region, the projection unit 210 may project the interactive pattern to the left palm of the user; thus, in the following step S511, a location of a centroid point of the hand region (i.e., the left-hand region) of the user is analyzed and recognized. In step S513, the coordinate calibration unit 240 outputs the location of the centroid point to the projection unit 210. According to the coordinate of the centroid point, the projection unit 210 is capable of correspondingly adjusting a projection location of the interactive pattern or adjusting a dimension of the interactive pattern. That is, in an exemplary embodiment, the projection unit 210 accurately projects the interactive pattern to the left hand of the user, and the user is able to perform the interactive trigger operation on the interactive pattern projected to his or her left hand by the gesture of his or her right hand.
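The following hedged OpenCV sketch illustrates steps S505 to S513: the hand contour is extracted from the thresholded depth mask, its convex hull and convexity defects (convex deficiencies) are analyzed, and the centroid used as the projection target is computed from image moments. The simplified finger-valley count is a placeholder for the geometry recognition, not the actual rule of the gesture recognition unit 232.

```python
# Hedged OpenCV (4.x-style) sketch of steps S505-S513.
import cv2
import numpy as np


def analyze_hand(mask: np.ndarray):
    mask_u8 = mask.astype(np.uint8) * 255
    contours, _ = cv2.findContours(mask_u8, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    contour = max(contours, key=cv2.contourArea)

    # Convex hull and convexity defects ("convex deficiencies") of the contour.
    hull_idx = cv2.convexHull(contour, returnPoints=False)
    defects = cv2.convexityDefects(contour, hull_idx)

    # Centroid of the hand region from image moments.
    m = cv2.moments(contour)
    if m["m00"] == 0:
        return None
    cx, cy = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])

    # Rough finger-valley count: deep convexity defects typically sit between
    # extended fingers (defect depth is a fixed-point value scaled by 256).
    valleys = 0
    if defects is not None:
        valleys = int(np.sum(defects[:, 0, 3] > 10_000))
    return {"contour": contour, "defects": defects,
            "centroid": (cx, cy), "finger_valleys": valleys}
```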
In step S509, if it is determined that the hand region is the right-hand region, it indicates that the user performs the interactive trigger operation by the gesture of his or her right hand. In the following step S515, the gesture recognition unit 232 analyzes the depth location of at least one fingertip of the hand region (i.e., the right-hand region) of the user and tracks a motion trajectory of the at least one fingertip.
Further, the gesture recognition unit 232 analyzes the motion trajectories of the thumb and the index finger of the hand region (i.e., the right-hand region) of the user (step S517). Through comparing a sample in the gesture interactive semantic database with the depth position of the at least one fingertip of the hand region of the user, the gesture recognition unit 232 recognizes the gesture interactive semantics represented by the motion trajectories (step S519).
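As a non-limiting illustration of steps S515 to S517, the sketch below locates fingertip candidates as convex-hull points far from the hand centroid and accumulates their (x, y, depth) samples into motion trajectories. Limiting the tracker to two fingertips labeled "thumb" and "index" is an assumption made only for this example, not the device's actual tracking rule.

```python
# Hedged sketch of fingertip localization and trajectory tracking (S515-S517).
import cv2
import numpy as np


def fingertip_candidates(contour, centroid, max_tips=2):
    hull = cv2.convexHull(contour)               # hull points, shape (N, 1, 2)
    pts = hull[:, 0, :].astype(np.float32)
    dist = np.linalg.norm(pts - np.asarray(centroid, np.float32), axis=1)
    order = np.argsort(dist)[::-1]               # farthest hull points first
    return [tuple(map(int, pts[i])) for i in order[:max_tips]]


class FingertipTracker:
    """Accumulates (x, y, depth) samples per fingertip across frames."""

    def __init__(self):
        self.trajectories = {"thumb": [], "index": []}

    def update(self, depth_image, contour, centroid):
        tips = fingertip_candidates(contour, centroid)
        for name, (x, y) in zip(("thumb", "index"), tips):
            # Record the pixel position together with its depth value.
            self.trajectories[name].append((x, y, int(depth_image[y, x])))
```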
In an exemplary embodiment, the establishment of the gesture interactive semantic database is based on several sets of gestures (including basic gestures and the corresponding motion trajectories) as the learning and comparison samples, such as spreading out five fingers, making a fist, picking up or putting down things with fingers, tapping, nudging, spreading-to-enlarge, pinching-to-shrink, and so forth.
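Purely as an assumed illustration of such a database, the sketch below pairs a few schematic template trajectories with gesture interactive semantics and matches an observed fingertip trajectory against them using a simple resample-and-compare metric; the template coordinates and the matching rule are not taken from the disclosure.

```python
# Hedged sketch of a gesture interactive semantic database and a naive matcher.
import numpy as np


def resample(traj, n=16):
    """Resample an (x, y) trajectory to n points by linear interpolation."""
    traj = np.asarray(traj, dtype=np.float32)
    idx = np.linspace(0, len(traj) - 1, n)
    xs = np.interp(idx, np.arange(len(traj)), traj[:, 0])
    ys = np.interp(idx, np.arange(len(traj)), traj[:, 1])
    return np.stack([xs, ys], axis=1)


GESTURE_DATABASE = {
    # semantics         : schematic template trajectory (assumed samples)
    "slide-to-scroll"   : [(0, 0), (40, 0), (80, 0), (120, 0)],
    "spread-to-enlarge" : [(0, 0), (20, 20), (40, 40), (60, 60)],
    "pinch-to-shrink"   : [(60, 60), (40, 40), (20, 20), (0, 0)],
}


def recognize_semantics(observed_traj):
    """Return the semantics whose template is closest to the observed trajectory."""
    obs = resample(observed_traj)
    obs -= obs[0]                                 # translate to a common origin
    best, best_d = None, np.inf
    for semantics, template in GESTURE_DATABASE.items():
        tpl = resample(template)
        tpl -= tpl[0]
        d = np.linalg.norm(obs - tpl, axis=1).mean()
        if d < best_d:
            best, best_d = semantics, d
    return best
```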
Alternatively, in step S521, the image processing unit 230 outputs the gesture interactive semantics analyzed by the gesture recognition unit 232 to the projection unit 210, such that the projection unit 210 projects the second interactive pattern (e.g., the resultant interactive pattern) corresponding to the gesture interactive semantics. For instance, as shown in
Moreover, the gesture interactive semantics may have one or more corresponding 3D interactive parameters. For instance, the gesture interactive semantics “slide-to-scroll” indicate that the interactive pattern or object projected by the projection unit is moved together with the movement of the gesture until the gesture stops moving. However, the speed and the direction of the movement are determined by the 3D interactive parameters. When analyzing and tracking the motion trajectory of the hand region in the depth image, the gesture recognition unit 232 may take the depth coordinate of the hand region, the variations in the depth value, the acceleration, and the power as the 3D interactive parameters, and the image processing unit 230 may transmit the 3D interactive parameters to the projection unit 210. According to the 3D interactive parameters, the projection unit 210 is allowed to learn the direction in which the projected interactive pattern or object is to be moved, the speed of the movement, and so forth. As such, the 3D interactive effects achieved herein may be further enhanced.
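The following sketch illustrates, under an assumed frame rate and assumed parameter names, how 3D interactive parameters such as direction, speed, acceleration, and depth change could be derived from a tracked fingertip trajectory before being transmitted to the projection unit 210.

```python
# Hedged sketch of deriving 3D interactive parameters from a fingertip
# trajectory of (x, y, depth) samples, one sample per captured frame.
import numpy as np

FRAME_RATE_HZ = 30.0  # assumed capture rate of the depth image stream


def interactive_parameters(trajectory):
    """trajectory: list of (x, y, depth) samples, one per captured frame."""
    pts = np.asarray(trajectory, dtype=np.float32)
    if len(pts) < 3:
        return None
    dt = 1.0 / FRAME_RATE_HZ
    velocity = np.diff(pts[:, :2], axis=0) / dt        # pixels per second
    acceleration = np.diff(velocity, axis=0) / dt       # pixels per second^2
    direction = velocity[-1] / (np.linalg.norm(velocity[-1]) + 1e-6)
    return {
        "direction": direction,               # unit vector of the latest motion
        "speed": float(np.linalg.norm(velocity[-1])),
        "acceleration": float(np.linalg.norm(acceleration[-1])),
        "depth_change": float(pts[-1, 2] - pts[0, 2]),
    }
```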
To sum up, in the 3D interactive device described in the disclosure, the design of the image capturing unit/device and that of the projection unit/device are combined, and the technique of calibrating the projection coordinate and the image capturing coordinate is applied, such that the 3D interactive device described herein may position the hand portion of a user in the 3D space, and the user is allowed to interact with the projected patterns in the 3D space in a contact-free manner. Since the hand portion is recognized and tracked by means of a depth image, the 3D interactive device is resistant to environmental changes and background light variations and is able to withstand ambient light interference. Moreover, if integrated into a medical instrument, the 3D interactive device described herein allows paramedics to input information to the medical device within a hand-sized 3D space in an accurate, contact-free manner, thus lowering the possibility of bacterial infection caused by human contact.
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the disclosed embodiments without departing from the scope or spirit of the disclosure. In view of the foregoing, it is intended that the disclosure cover modifications and variations of this disclosure provided they fall within the scope of the following claims and their equivalents.
Number | Date | Country | Kind
--- | --- | --- | ---
101149581 A | Dec. 2012 | TW | national