The subject matter herein generally relates to controlling of electronic devices and related gesture recognition.
Many electronic devices support gesture recognition. In a current gesture recognition system, when the gesture recognition system of an electronic device establishes a detection block to detect and recognize a gesture of a hand, the established detection block may comprise one or more other objects (a wall, a human head, etc.). If such objects are not filtered from the detection block, the recognition precision of the gesture recognition system can be affected.
Implementations of the present technology will now be described, by way of example only, with reference to the attached figures.
It will be appreciated that for simplicity and clarity of illustration, where appropriate, reference numerals have been repeated among the different figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein can be practiced without these specific details. In other instances, methods, procedures, and components have not been described in detail so as not to obscure the relevant feature being described. Also, the description is not to be considered as limiting the scope of the embodiments described herein. The drawings are not necessarily to scale and the proportions of certain parts may be exaggerated to better illustrate details and features of the present disclosure. It should be noted that references to “an” or “one” embodiment in this disclosure are not necessarily to the same embodiment, and such references mean “at least one”.
Several definitions that apply throughout this disclosure will now be presented.
The term “coupled” is defined as connected, whether directly or indirectly through intervening components, and is not necessarily limited to physical connections. The connection can be such that the objects are permanently connected or releasably connected. The term “comprising,” when utilized, means “including, but not necessarily limited to”; it specifically indicates open-ended inclusion or membership in the so-described combination, group, series, and the like.
The electronic device 100 comprises a gesture recognition system 1, at least one processor 2, and at least one storage 3. The gesture recognition system 1 detects and recognizes a gesture to control the electronic device 100.
In one exemplary embodiment, the gesture recognition system 1 can be stored in the storage 3.
Referring to
The image obtaining module 11 obtains an image. The first filtering module 12 filters static objects comprised in the image.
In one exemplary embodiment, the image obtaining module 11 can enable a depth camera 4 (shown in
In one exemplary embodiment, the image comprises at least one frame. Each pixel of each frame can be represented by an XY-coordinate, and depth information of each pixel can be represented by a Z-coordinate.
In one exemplary embodiment, the image comprises one or more static objects and one or more objects in motion. The first filtering module 12 can filter out the static objects in the image through a Gaussian mixture model (GMM) to retain the motion objects. The static objects can be a wall, a table, a chair, or a sofa, for example. The motion objects can be a hand, a head, or a body, for example.
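By way of a non-limiting illustration, the separation of static objects from motion objects can be sketched as follows. The sketch is not a full GMM: it keeps a single running Gaussian (mean and variance) per pixel, which is a simplified stand-in for the mixture described above. The class name, the learning rate, the threshold, and the variance values are all assumptions made for illustration.

```python
from typing import List

Frame = List[List[float]]  # one intensity (or depth) value per pixel, indexed [y][x]


class RunningGaussianBackground:
    """Per-pixel single-Gaussian background model (simplified stand-in for a GMM)."""

    def __init__(self, first_frame: Frame, learning_rate: float = 0.05,
                 threshold_sigmas: float = 2.5, initial_variance: float = 25.0,
                 min_variance: float = 4.0) -> None:
        self.alpha = learning_rate
        self.k = threshold_sigmas
        self.min_variance = min_variance
        self.mean = [row[:] for row in first_frame]
        self.var = [[initial_variance for _ in row] for row in first_frame]

    def apply(self, frame: Frame) -> List[List[bool]]:
        """Return a mask (True = motion pixel) and update the background model."""
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, value in enumerate(row):
                diff = value - self.mean[y][x]
                # A pixel far from its background Gaussian is treated as motion.
                is_motion = diff * diff > (self.k ** 2) * self.var[y][x]
                mask_row.append(is_motion)
                if not is_motion:
                    # Only pixels that match the background update the model.
                    self.mean[y][x] += self.alpha * diff
                    new_var = self.var[y][x] + self.alpha * (diff * diff - self.var[y][x])
                    self.var[y][x] = max(self.min_variance, new_var)
            mask.append(mask_row)
        return mask


# Hypothetical 3x3 scene: a static background of value 100.
static = [[100.0] * 3 for _ in range(3)]
model = RunningGaussianBackground(static)
for _ in range(5):
    model.apply(static)

# A moving object changes the center pixel only.
moving = [row[:] for row in static]
moving[1][1] = 200.0
motion_mask = model.apply(moving)
```

Pixels flagged `True` in `motion_mask` would be retained as motion objects, and the remaining pixels would be filtered out as static. A true mixture model would keep several weighted Gaussians per pixel so that backgrounds with more than one stable appearance are also handled.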
The first establishing module 13 obtains hand coordinate information and establishes a first block 200 comprising the hand 20.
In one exemplary embodiment, the first establishing module 13 can locate the hand coordinate information through a deep learning algorithm after the first filtering module 12 filters out the static objects. The first establishing module 13 establishes characteristic values of the hand 20 through the deep learning algorithm and obtains the hand coordinate information according to the characteristic values. Then, the first establishing module 13 can establish the first block 200 according to the hand coordinate information.
In one exemplary embodiment, the deep learning algorithm applies artificial neural networks (ANNs) that contain more than one hidden layer to learning tasks, and the deep learning algorithm is part of a broader family of machine learning methods based on learning data representations.
In one exemplary embodiment, the recognition speed of the gesture recognition system 1 improves when the area of the hand 20, as a percentage of the total area of the first block 200, is greater than a predetermined percentage value. For example, the predetermined percentage value can be forty percent.
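As a minimal sketch of the percentage check, assuming that the hand 20 is available as a boolean mask over the pixels of the first block 200 (the mask and both function names are hypothetical and are not part of the disclosure):

```python
from typing import List


def hand_area_ratio(hand_mask: List[List[bool]]) -> float:
    """Fraction of the first block's pixels that belong to the hand."""
    total = sum(len(row) for row in hand_mask)
    if total == 0:
        raise ValueError("first block has no pixels")
    hand = sum(1 for row in hand_mask for is_hand in row if is_hand)
    return hand / total


def exceeds_predetermined_percentage(hand_mask: List[List[bool]],
                                     predetermined: float = 0.40) -> bool:
    """True when the hand fills more than the predetermined share of the block."""
    return hand_area_ratio(hand_mask) > predetermined


# Hypothetical 2x5 first block in which 5 of the 10 pixels belong to the hand.
mask = [[True, True, True, False, False],
        [True, True, False, False, False]]
ok = exceeds_predetermined_percentage(mask)
```

The default of 0.40 corresponds to the forty percent example above; the comparison is strict because the text requires the percentage to be greater than the predetermined value.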
The counting module 14 obtains depth information of each pixel of the first block 200 and counts the number of pixels of each depth level.
In one exemplary embodiment, the first block 200 comprises a plurality of pixels. Each pixel can also be represented by an XY-coordinate, and depth information of each pixel can also be represented by a Z-coordinate. The counting module 14 can search coordinate information of each pixel of the first block 200 to obtain the depth information of each pixel. Then, the counting module 14 can further count the number of pixels of each depth level through a histogram (as shown in
Referring to
The second establishing module 15 obtains hand depth information according to a counting result and establishes a second block 300 (shown in
In one exemplary embodiment, the second establishing module 15 can obtain hand depth information according to a counting result of the histogram. If a depth value of a first pixel is less than a predetermined depth value, the first pixel can be recognized as a noise pixel. Then, the first pixel can be filtered out of the histogram. In this exemplary embodiment, the predetermined depth value is 10, by way of example.
The fourth depth level is less than the predetermined depth value (e.g., 8<10), thus the second establishing module 15 filters the fourth depth level out of the histogram. The updated histogram then comprises only the first, second, and third depth levels. The second establishing module 15 extracts two depth levels from the updated histogram, namely the two depth levels that contain the largest numbers of pixels. The second establishing module 15 further selects, from these two depth levels, the depth level with the smaller value as the hand depth information for establishing the second block 300.
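The selection described above can be sketched as follows, under the literal reading that a depth level is treated as noise when its depth value is below the predetermined depth value. The function name and the pixel counts are hypothetical; only the depth value 8 and the predetermined depth value 10 come from the example in the text.

```python
from typing import Dict


def select_hand_depth(histogram: Dict[int, int],
                      predetermined_depth_value: int = 10) -> int:
    """Pick the hand depth level from a {depth value: pixel count} histogram."""
    # Step 1: drop noise levels whose depth value is below the threshold.
    updated = {depth: count for depth, count in histogram.items()
               if depth >= predetermined_depth_value}
    if not updated:
        raise ValueError("no depth level left after noise filtering")
    # Step 2: keep the two depth levels that contain the most pixels.
    top_two = sorted(updated, key=lambda depth: updated[depth], reverse=True)[:2]
    # Step 3: of those two, the smaller depth value is taken as the hand depth.
    return min(top_two)


# Hypothetical histogram with four depth levels (depth value -> pixel count).
example_histogram = {50: 120, 90: 300, 130: 40, 8: 15}
hand_depth = select_hand_depth(example_histogram)
```

In this hypothetical histogram the level at depth 8 is removed as noise, the levels at depths 90 and 50 hold the most pixels, and the smaller of the two, 50, is returned. Choosing the smaller value reflects the expectation that the hand is closer to the depth camera than other motion objects.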
According to
In one exemplary embodiment, the second establishing module 15 further filters other motion objects (a head or a body, for example) from the first block 200 to retain only the hand 20. The second establishing module 15 establishes a depth level range according to the hand depth information. The depth level range can be 48-52, for example. The second establishing module 15 filters out of the first block 200 second pixels that are not within the depth level range. Then, the second establishing module 15 can generate a planar block 30 comprising the hand 20 and further establish the second block 300 based on the planar block 30 and the hand depth information (In
The recognizing module 16 detects a moving track of the hand 20 in the second block 300 and recognizes a gesture of the hand 20 according to the moving track.
In one exemplary embodiment, the storage 3 can store a gesture library. In the gesture library, different moving tracks correspond to different gestures. The recognizing module 16 can recognize the gesture of the hand 20 according to the detected moving track and the gesture library. The recognizing module 16 can further update the gesture library through the deep learning algorithm.
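One possible shape for such a gesture library is sketched below. The disclosure does not specify how moving tracks are encoded, so the sketch assumes that a moving track is a sequence of hand positions and that it is reduced to its dominant direction before the lookup. The library contents, the function name, and the minimum distance are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical gesture library: dominant track direction -> gesture name.
GESTURE_LIBRARY = {
    "left": "swipe_left",
    "right": "swipe_right",
    "up": "swipe_up",
    "down": "swipe_down",
}


def recognize_gesture(track: List[Tuple[int, int]],
                      min_distance: int = 20) -> Optional[str]:
    """Map a moving track of (x, y) hand positions to a gesture, if any."""
    if len(track) < 2:
        return None
    dx = track[-1][0] - track[0][0]
    dy = track[-1][1] - track[0][1]
    # Ignore tracks that are too short to be a deliberate movement.
    if max(abs(dx), abs(dy)) < min_distance:
        return None
    if abs(dx) >= abs(dy):
        direction = "right" if dx > 0 else "left"
    else:
        # Image Y-coordinates are assumed to grow downward.
        direction = "down" if dy > 0 else "up"
    return GESTURE_LIBRARY.get(direction)


# Hypothetical track: the hand moves 60 pixels to the right with slight jitter.
gesture = recognize_gesture([(100, 200), (130, 204), (160, 198)])
```

A learned model could replace the direction heuristic, in which case updating the gesture library would amount to adding or retraining entries rather than editing a fixed table.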
In step 600, the image obtaining module 11 obtains an image that comprises the hand 20 and image depth information.
In step 602, the first filtering module 12 filters static objects comprised in the image.
In step 604, the first establishing module 13 obtains hand coordinate information and establishes the first block 200 comprising the hand 20 according to the hand coordinate information.
In step 606, the counting module 14 obtains depth information of each pixel of the first block 200 and counts a number of pixels of each depth level.
In step 608, the second establishing module 15 obtains hand depth information according to a counting result and establishes the second block 300 based on the hand depth information.
In step 610, the recognizing module 16 detects a moving track of the hand 20 in the second block 300 and recognizes a gesture of the hand 20 according to the moving track.
In one exemplary embodiment, the image obtaining module 11 can enable the depth camera 4 to capture the image. The depth camera 4 can obtain the image depth information, and the image can be an RGB color image.
In one exemplary embodiment, the first filtering module 12 can filter the static objects comprised in the image through the GMM.
In one exemplary embodiment, the first establishing module 13 can locate the hand coordinate information through a deep learning algorithm after the first filtering module 12 filters out the static objects. The first establishing module 13 establishes characteristic values of the hand 20 through the deep learning algorithm and obtains the hand coordinate information according to the characteristic values. Then, the first establishing module 13 can establish the first block 200 according to the hand coordinate information.
In one exemplary embodiment, the counting module 14 can search coordinate information of each pixel of the first block 200 to obtain the depth information of each pixel. Then, the counting module 14 can further count the number of pixels of each depth level through a histogram.
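The counting step can be sketched as follows, assuming that the first block 200 is given as a list of (X, Y, Z) pixel coordinates and that each distinct integer Z value is one depth level. The disclosure does not define how depth values are grouped into levels, so that grouping and the function name are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]  # (x, y, z), where z carries the depth information


def count_depth_levels(first_block: List[Pixel]) -> Dict[int, int]:
    """Return a histogram {depth level: number of pixels} for the first block."""
    counts = Counter(z for _x, _y, z in first_block)
    # Sort by depth level so the histogram reads from near to far.
    return dict(sorted(counts.items()))


# Hypothetical first block with six pixels at three depth levels.
block = [(0, 0, 50), (1, 0, 50), (2, 0, 90),
         (0, 1, 50), (1, 1, 90), (2, 1, 130)]
depth_histogram = count_depth_levels(block)
```

If the depth camera reports finer-grained or floating-point depth values, the Z-coordinate would first have to be quantized into levels, for example by integer division with a chosen bin width.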
In one exemplary embodiment, the second establishing module 15 can obtain hand depth information according to a counting result of the histogram. If a depth value of a first pixel is less than a predetermined depth value, the first pixel can be recognized as a noise pixel. Then, the first pixel can be filtered out of the histogram.
In one exemplary embodiment, the second establishing module 15 extracts, from the updated histogram, the two depth levels that contain the most pixels. The second establishing module 15 further selects, from these two depth levels, the depth level with the smaller value as the hand depth information to establish the second block 300.
In one exemplary embodiment, the second establishing module 15 establishes a depth level range according to the hand depth information. The second establishing module 15 filters out of the first block 200 second pixels that are not within the depth level range. Then, the second establishing module 15 can generate the planar block 30 comprising the hand 20 and establish the second block 300 according to the planar block 30 and the hand depth information.
In one exemplary embodiment, the storage 3 can store a gesture library. The recognizing module 16 can recognize the gesture of the hand 20 according to the detected moving track and the gesture library. The recognizing module 16 can further update the gesture library through the deep learning algorithm.
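A minimal sketch of the range filtering follows. It assumes that the depth level range is symmetric around the hand depth, so that a hand depth of 50 with a tolerance of 2 would give the 48-52 range mentioned earlier as an example. The tolerance, the function names, and the representation of the planar block 30 as an XY bounding box are assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (x, y, z)


def filter_to_depth_range(first_block: List[Pixel], hand_depth: int,
                          tolerance: int = 2) -> List[Pixel]:
    """Keep only pixels whose depth lies within hand_depth +/- tolerance."""
    low, high = hand_depth - tolerance, hand_depth + tolerance
    return [(x, y, z) for x, y, z in first_block if low <= z <= high]


def planar_bounds(pixels: List[Pixel]) -> Tuple[int, int, int, int]:
    """XY bounding box (x_min, y_min, x_max, y_max) of the retained pixels."""
    if not pixels:
        raise ValueError("no pixels within the depth level range")
    xs = [x for x, _y, _z in pixels]
    ys = [y for _x, y, _z in pixels]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical first block: hand pixels near depth 50, a head near depth 90.
first_block = [(10, 12, 49), (11, 12, 50), (11, 13, 52),
               (40, 5, 90), (41, 5, 91)]
hand_pixels = filter_to_depth_range(first_block, hand_depth=50)
planar_block = planar_bounds(hand_pixels)
```

The bounding box together with the depth level range would then delimit a volume around the hand, which is one way to read the second block 300 being established from the planar block 30 and the hand depth information.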
The exemplary embodiments shown and described above are only examples. Many such details are neither shown nor described. Even though numerous characteristics and advantages of the present technology have been set forth in the foregoing description, together with details of the structure and function of the present disclosure, the disclosure is illustrative only, and changes may be made in the detail, including in matters of shape, size, and arrangement of the parts within the principles of the present disclosure, up to and including the full extent established by the broad general meaning of the terms used in the claims. It will therefore be appreciated that the exemplary embodiments described above may be modified within the scope of the claims.
Number | Date | Country | Kind |
---|---|---|---|
106120799 A | Jun 2017 | TW | national |
Number | Date | Country |
---|---|---|
20180373927 A1 | Dec 2018 | US |