One embodiment of the invention relates to gesture recognition, and more particularly to a gesture recognition apparatus and a method thereof using the Doppler effect.
With the development of personal electronic products, the ways in which people interact with electronic products have become varied, for example mouse input, keyboard input, touch-screen input, and gesture recognition.
Gesture recognition is becoming an increasingly popular means of interacting with computers. Gesture recognition enables humans to communicate with computers and interact naturally without any mechanical devices. For example, a user can point a finger at the computer screen, without touching the screen, so that the cursor moves accordingly.
Present gesture recognition can be based on video or sound. However, video-based gesture recognition has the weaknesses of a heavy computational load, a high rate of recognition errors, and a dependence on illumination.
Therefore, an improved gesture recognition apparatus and a method thereof are provided in the embodiment of the present disclosure to solve the problems mentioned above.
a is a first frequency shift curve of a left basic signal.
b is a first frequency shift curve of a right basic signal.
a is a first two-value frequency shift curve of
b is a second two-value frequency shift curve of
Many aspects of the embodiments can be better understood with reference to the drawings mentioned above. The components in the drawings are not necessarily drawn to scale, the emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
Reference will now be made to describe exemplary embodiments of the present invention in detail. In this section we shall explain several exemplary embodiments of this invention with reference to the appended drawings. Whenever the shapes, relative positions and other aspects of the parts described in the embodiments are not clearly defined, the scope of the invention is not limited only to the parts shown, which are meant merely for the purpose of illustration. Also, while numerous details are set forth, it is understood that some embodiments of the invention may be practiced without these details. In other instances, well-known structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
The present invention relates to sound-based gesture recognition. This technique uses a well-understood phenomenon known as the “Doppler effect” or “Doppler shift”, which characterizes the change in frequency of a sound wave when the source and the listener move toward or away from each other. When the source moves toward the listener, the wavelength of the received sound is shortened and its frequency is increased, that is, a blue shift is caused. When the source moves away from the listener, the wavelength of the received sound is lengthened and its frequency is decreased, that is, a red shift is caused. Red shift is the opposite of blue shift. Using this effect, the present invention detects motion in front of and around a computing device and recognizes a set of gestures.
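As a rough illustration of the size of the effect, the following sketch computes the frequency shift of a tone reflected by a moving surface. It is a simplified model, not part of the disclosed embodiment: it assumes the source and the detector are stationary and co-located, that the hand moves directly along the line of sight, and that the speed of sound is 343 m/s. The function name and its parameters are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def reflected_doppler_shift(f0_hz, v_m_s, c=SPEED_OF_SOUND_M_S):
    """Frequency shift in Hz of a tone reflected by a moving surface.

    v_m_s > 0 means the surface approaches the (co-located, stationary)
    source and detector; v_m_s < 0 means it moves away.
    """
    if abs(v_m_s) >= c:
        raise ValueError("reflector speed must be below the speed of sound")
    # The reflector hears a shifted tone and re-emits it while moving,
    # so the received frequency is f0 * (c + v) / (c - v).
    return f0_hz * (c + v_m_s) / (c - v_m_s) - f0_hz


# A hand approaching at 0.5 m/s gives a positive (blue) shift,
# a hand receding at 0.5 m/s gives a negative (red) shift.
blue = reflected_doppler_shift(19_000, 0.5)
red = reflected_doppler_shift(19_000, -0.5)
```

Under these assumptions, typical hand speeds produce shifts of only a few tens of hertz around a 19 kHz tone, which is why a long analysis window with fine frequency resolution is useful.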
Referring to
The left source 11 and the right source 12 are disposed at two sides of the notebook, at symmetrical positions. The detector 13 is disposed between the two sources 11, 12, on the central line between them. The distance from the detector 13 to the left source 11 is equal to the distance from the detector 13 to the right source 12. The left source 11 is kept at a distance of 30-60 cm from the right source 12. The distance from the left source 11 to the right source 12 is determined by actual circumstances, according to the required functions or sensitivity.
The left source 11 and the right source 12 generate inaudible sound waves in the range of 18-22 kHz. The left source 11 generates a left basic signal Fl. The right source 12 generates a right basic signal Fh. The frequency difference between the left basic signal Fl and the right basic signal Fh is not less than 1 kHz, so that the sounds from the two sources do not interfere with each other. In this embodiment, the frequency of the left basic signal Fl is 19 kHz, and the frequency of the right basic signal Fh is 20 kHz.
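The two basic signals of this embodiment can be sketched as plain sine tones. This is only an illustration: the 48 kHz sample rate, the tone duration, and the helper name `make_tone` are assumptions that do not appear in the disclosure.

```python
import numpy as np

SAMPLE_RATE = 48_000  # Hz; assumed, the disclosure does not state a sample rate

F_LEFT_HZ = 19_000    # left basic signal in this embodiment
F_RIGHT_HZ = 20_000   # right basic signal in this embodiment


def make_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return a unit-amplitude sine tone as a float64 array."""
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)


# Half a second of each tone (duration chosen arbitrarily for the sketch).
left = make_tone(F_LEFT_HZ, 0.5)
right = make_tone(F_RIGHT_HZ, 0.5)
```

Both tones lie below the Nyquist frequency of the assumed sample rate (24 kHz) and are separated by 1 kHz, matching the minimum separation stated above.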
Referring to
Referring to
The processor is configured to process signals from the detector 13 and recognize a corresponding gesture.
Referring to
S1, the left source 11 and the right source 12 respectively emit a left basic signal Fl and a right basic signal Fh into the detection area.
S2, the detector 13 detects the left basic signal Fl, the right basic signal Fh, and the Doppler shift signals produced when the left and right basic signals are reflected by the hand.
S3, the processor processes the signals from the detector 13 and recognizes a gesture. The detailed processing steps are described below.
S31, a Hamming window is applied to the signals from the detector. The coefficients of the Hamming window, in its standard form, are given by
$$w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N}\right), \quad 0 \le n \le N$$
where N=L−1, and L is the length of the Hamming window, which is the same as the length of the FFT (Fast Fourier Transform).
L is in a range of 4096-8192. In this embodiment, L is 6144.
S32, the windowed signals are transformed into frequency-domain signals by computing a Fast Fourier Transform (FFT) whose length is equal to L.
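Steps S31 and S32 can be sketched with NumPy as follows. This is a minimal illustration rather than the disclosed implementation: the 48 kHz sample rate and the helper names `windowed_spectrum` and `bin_to_hz` are assumptions. `np.hamming(L)` uses the standard Hamming coefficients with a denominator of L−1, which corresponds to N=L−1 above.

```python
import numpy as np

SAMPLE_RATE = 48_000  # Hz; assumed, not stated in the disclosure
L = 6144              # window length and FFT length in this embodiment


def windowed_spectrum(frame):
    """Apply a Hamming window to one frame of L samples (S31) and
    return the magnitude spectrum of its FFT (S32), bins 0..L/2."""
    frame = np.asarray(frame, dtype=float)
    if frame.shape != (L,):
        raise ValueError(f"expected a frame of exactly {L} samples")
    window = np.hamming(L)  # 0.54 - 0.46 * cos(2*pi*n / (L - 1))
    return np.abs(np.fft.rfft(frame * window))


def bin_to_hz(k):
    """Center frequency in Hz of FFT bin k."""
    return k * SAMPLE_RATE / L


# Usage: a pure 19 kHz tone should peak at the bin nearest 19 kHz.
t = np.arange(L) / SAMPLE_RATE
spectrum = windowed_spectrum(np.sin(2 * np.pi * 19_000 * t))
peak_hz = bin_to_hz(int(np.argmax(spectrum)))
```

With these assumed values each FFT bin is about 7.8 Hz wide, so Doppler shifts of a few tens of hertz spread over several bins on either side of the carrier.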
Referring to
S33, the first frequency shift and the second frequency shift are normalized, because an amplitude difference between the left and right basic signals may cause a difference in the strength of the frequency shifts reflected by the same hand. The first frequency shift is divided by the amplitude of the left basic signal Fl. The second frequency shift is divided by the amplitude of the right basic signal Fh.
S34, the processor calculates a sum of blue shift energy of the normalized signals, and a sum of red shift energy of the normalized signals. Referring to
The sum of the red shift or blue shift energy is given by
$$E = \sum_{k=0}^{M-1} A_k^2$$
where M is one half of the length of the FFT, and $A_k$ is the amplitude of each frequency component of the red shift or blue shift.
We define S as the frequency shift energy in a time interval:
$$S = E_b - E_r$$
where $E_b$ is the blue shift energy and $E_r$ is the red shift energy.
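The normalization of S33 and the energy difference of S34 can be sketched for a single carrier as below. This is an illustration under stated assumptions, not the disclosed implementation: the function name `shift_energy` and the parameters `guard_bins` and `band_bins`, which decide which FFT bins count as red shift or blue shift, are hypothetical and are not specified in the disclosure.

```python
import numpy as np


def shift_energy(spectrum, carrier_bin, guard_bins=2, band_bins=20):
    """Return S = Eb - Er for one carrier from a magnitude spectrum.

    guard_bins: bins skipped on each side of the carrier (hypothetical).
    band_bins:  bins summed on each side of the carrier (hypothetical).
    """
    spectrum = np.asarray(spectrum, dtype=float)
    carrier_amp = spectrum[carrier_bin]
    if carrier_amp == 0:
        raise ValueError("carrier amplitude is zero; cannot normalize")
    lo = carrier_bin - guard_bins - band_bins
    hi = carrier_bin + guard_bins + band_bins
    if lo < 0 or hi >= len(spectrum):
        raise ValueError("shift band falls outside the spectrum")

    # S33: divide by the amplitude of the basic signal.
    normalized = spectrum / carrier_amp

    # S34: sum squared amplitudes below (red) and above (blue) the carrier.
    red = normalized[lo:carrier_bin - guard_bins]
    blue = normalized[carrier_bin + guard_bins + 1:hi + 1]
    e_red = float(np.sum(red ** 2))
    e_blue = float(np.sum(blue ** 2))
    return e_blue - e_red
```

Evaluating this function on successive frames gives one value of S per time interval, which together form a frequency shift curve. A positive S indicates that blue shift energy dominates in that interval, and a negative S indicates that red shift energy dominates.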
Referring to
S35, a suitable positive threshold and a suitable negative threshold are selected to simplify the frequency shift curve into a two-value curve. The frequency shift curve is compared with both thresholds: “+1” is assigned where the curve is greater than the positive threshold, “−1” is assigned where the curve is smaller than the negative threshold, and “0” is assigned where the curve lies between the two thresholds. The positive threshold is in a range of 0.00005 to 0.0005, and the negative threshold is in a range of −0.0005 to −0.00005. In this embodiment, the positive threshold is 0.0004 and the negative threshold is −0.0001.
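The thresholding of S35 can be sketched as follows, using the threshold values of this embodiment as defaults. The function name `quantize_shift_curve` is hypothetical.

```python
import numpy as np

POS_THRESHOLD = 0.0004   # positive threshold in this embodiment
NEG_THRESHOLD = -0.0001  # negative threshold in this embodiment


def quantize_shift_curve(curve, pos=POS_THRESHOLD, neg=NEG_THRESHOLD):
    """Map each value of S to +1, -1, or 0 by comparing it with the
    positive and negative thresholds."""
    curve = np.asarray(curve, dtype=float)
    states = np.zeros(curve.shape, dtype=int)
    states[curve > pos] = 1    # blue shift dominates
    states[curve < neg] = -1   # red shift dominates
    return states              # values between the thresholds stay 0


states = quantize_shift_curve([0.001, 0.0002, -0.00005, -0.0003])
```

Values that fall between the two thresholds are treated as no significant motion, which suppresses small fluctuations in the frequency shift curve.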
Referring to
S36, the state sequence is compared with a gesture template as shown in
S37, search the gesture L2R in the table as shown in
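One possible way to compare a state sequence with stored gesture templates is sketched below. Everything specific in it is an assumption made for illustration: the helper names, the way runs of states are collapsed, the gesture name "R2L", and above all the template patterns, which are placeholders and do not reproduce the template or table of the disclosure.

```python
def collapse(states):
    """Drop 0 states, then merge consecutive repeats.
    For example, [0, 1, 1, 0, -1] becomes (1, -1)."""
    out = []
    for s in states:
        if s != 0 and (not out or out[-1] != s):
            out.append(s)
    return tuple(out)


# Placeholder patterns only: keys are (left sequence, right sequence).
# The actual templates of the disclosure are not reproduced here.
HYPOTHETICAL_TEMPLATES = {
    ((-1,), (1,)): "L2R",
    ((1,), (-1,)): "R2L",  # hypothetical gesture name
}


def match_gesture(left_states, right_states, templates=HYPOTHETICAL_TEMPLATES):
    """Return the name of the matching gesture, or None if nothing matches."""
    key = (collapse(left_states), collapse(right_states))
    return templates.get(key)


gesture = match_gesture([0, -1, -1, 0], [0, 1, 1, 0])
```

Collapsing runs makes the comparison independent of how long the hand takes to complete the motion, since only the order of the state changes is compared.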
The gesture recognition in accordance with the present disclosure uses the speakers and the microphone embedded in the notebook, which reduces cost. In addition, processing sound signals requires less computation than processing images.
While the present disclosure has been described with reference to the specific embodiment, the description of the disclosure is illustrative and is not to be construed as limiting the disclosure. Various modifications can be made to the exemplary embodiments by those skilled in the art without departing from the true spirit and scope of the disclosure as defined by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
201210435804.5 | Nov 2012 | CN | national |