Impulse-like gesture recognition method, and impulse-like gesture recognition system

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention

The present invention relates to a recognition method and a recognition system, and more particularly to an impulse-like gesture recognition method and an impulse-like gesture recognition system.

2. Description of the Related Art

Recognition systems generally receive sensing signals from a sensor to recognize a motion of a user. For example, a recognition system receives sensing signals from the sensor, processes the sensing signals, and implements a recognition method to determine whether a user being observed by the sensor is using portions of his or her body to make particular actions or form particular shapes or gestures. The recognition system classifies the motion of the user, and associates the motion of the user with executable commands or instructions.

Recently, online gesture recognition has become increasingly popular in the research community due to its various application possibilities in human-machine interaction. However, online gesture recognition is challenging mainly for the following reasons:

- 1. the detection score violates monotonicity;
- 2. the reaction time is long;
- 3. rapid consecutive gestures cannot be easily decomposed; and
- 4. post-processing using a hand-crafted mechanism needs to be applied.

Namely, when the motion of the user is a complex motion, a gesture recognition process of the recognition system is time consuming. The recognition system may not be able to perform online gesture recognition due to the complex motion of the user.

Therefore, the recognition system needs to be further improved.

SUMMARY OF THE INVENTION

An objective of the present invention is to provide an impulse-like gesture recognition method and an impulse-like gesture recognition system. The present invention may provide online gesture recognition, and achieve the following capabilities:

- 1. the detection score is non-decreasing;
- 2. fast reaction time for an incoming gesture is achieved;
- 3. the rapid consecutive gestures are easily decomposed; and
- 4. the expensive post-processing is not needed.

The impulse-like gesture recognition method includes a performing procedure, and the performing procedure includes steps of:

- receiving a sensing signal from a sensing unit; wherein the sensing signal comprises a plurality of sensing frames;
- determining a prediction with at least one impulse-like label according to the sensing frames by a deep learning-based model; wherein the at least one impulse-like label labels at least one detection score of the deep learning-based model; and
- classifying at least one gesture event according to the prediction.

Further, the impulse-like gesture recognition system includes a performing device. The performing device includes a sensing unit, a memory unit, and a processing unit.

The sensing unit senses a sensing signal, and the sensing signal comprises a plurality of sensing frames. The memory unit stores a deep learning-based model.

The processing unit is electrically connected to the sensing unit and the memory unit. The processing unit executes a performing procedure.

The processing unit receives the sensing signal from the sensing unit, determines a prediction with at least one impulse-like label according to the sensing frames by the deep learning-based model stored in the memory unit, and classifies at least one gesture event according to the prediction. The at least one impulse-like label labels at least one detection score of the deep learning-based model.

Since the impulse-like gesture recognition system uses the at least one impulse-like label to label the at least one detection score of the deep learning-based model, the detection score is non-decreasing.

When the at least one impulse-like label labels at least one detection score, the at least one gesture event can be classified immediately. Therefore, reaction time of the at least one gesture event for an incoming gesture is short.

Further, rapid consecutive gesture events can be classified by individual impulse-like labels. Namely, the rapid consecutive gestures are easily decomposed, and an expensive post-processing is not needed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart of a performing procedure of an impulse-like gesture recognition method of the present invention;

FIG. 2 is a flowchart of a training procedure of the impulse-like gesture recognition method of the present invention;

FIG. 3 is a block diagram of an impulse-like gesture recognition system of the present invention;

FIG. 4A is a schematic diagram of a prediction with at least one impulse-like label;

FIG. 4B is a schematic diagram of a ground truth with the at least one impulse-like label;

FIG. 5A is a schematic diagram of the filtered prediction; and

FIG. 5B is a schematic diagram of the filtered ground truth.

DETAILED DESCRIPTION OF THE INVENTION

With reference to FIGS. 1 and 2, the present invention relates to an impulse-like gesture recognition method and an impulse-like gesture recognition system.

The impulse-like gesture recognition method includes a performing procedure, and the performing procedure includes steps of:

- receiving a sensing signal from a sensing unit (S101); wherein the sensing signal comprises a plurality of sensing frames;
- determining a prediction with at least one impulse-like label according to the sensing frames by a deep learning-based model (S102); wherein the at least one impulse-like label labels at least one detection score of the deep learning-based model; and
- classifying at least one gesture event according to the prediction (S103).

When the at least one impulse-like label labels at least one detection score, the at least one gesture event can be classified immediately. Therefore, reaction time of the at least one gesture event for an incoming gesture is short.
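By way of a non-limiting illustration, steps S102 and S103 may be sketched as follows. The disclosure does not specify how the prediction is mapped to a gesture event, so the fixed threshold, the function name `classify_gesture_events`, and the example scores are assumptions made only for this sketch; the deep learning-based model itself is not shown and is represented by its per-frame detection scores.

```python
from typing import List, Sequence, Tuple


def classify_gesture_events(
    scores: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed value, not taken from the disclosure
) -> List[Tuple[int, int]]:
    """Return (frame_index, gesture_class) for every impulse in the scores.

    scores[t][c] is the detection score of gesture class c at sensing frame t,
    as a deep learning-based model might emit it.
    """
    events = []
    for t, frame_scores in enumerate(scores):
        # Pick the class with the highest detection score in this frame.
        best_class = max(range(len(frame_scores)), key=frame_scores.__getitem__)
        # An impulse-like score above the threshold is reported at once,
        # without waiting for later frames.
        if frame_scores[best_class] >= threshold:
            events.append((t, best_class))
    return events


# Hypothetical detection scores for two gesture classes over six frames.
example_scores = [
    [0.0, 0.1],
    [0.9, 0.0],  # impulse for class 0 at frame 1
    [0.1, 0.0],
    [0.0, 0.8],  # impulse for class 1 at frame 3
    [0.0, 0.1],
    [0.0, 0.0],
]
events = classify_gesture_events(example_scores)
print(events)  # prints [(1, 0), (3, 1)]
```

In this sketch each impulse yields one gesture event in the frame where it occurs, so two gestures that are only two frames apart are reported separately without further post-processing.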

Moreover, the impulse-like gesture recognition method further includes a training procedure for training the deep learning-based model, and the training procedure includes steps of:

- receiving a training signal (S201); wherein the training signal comprises a plurality of training frames;
- determining a length of the training signal (S202);
- initializing a function according to the length of the training signal (S203);
- determining the prediction with the at least one impulse-like label according to the training frames by the deep learning-based model (S204);
- receiving a ground truth with the at least one impulse-like label (S205);
- filtering the prediction and the ground truth (S206); wherein the prediction and the ground truth are filtered by the initialized function;
- measuring the Manhattan distance between the filtered prediction and the filtered ground truth (S207); and
- supervising a training of the deep learning-based model by using the Manhattan distance as a loss function (S208).
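As a hedged illustration, the loss used in steps S206 to S208 may be written as follows, where Φ denotes the initialized function, * denotes the filtering operation (assumed here to be a convolution along the frame axis), $\hat{Y}$ and $Y$ denote the prediction and the ground truth, and $T$ denotes the number of training frames. The symbols $\mathcal{L}$, * and $T$ are notation introduced for this illustration and do not appear in the original disclosure:

$\mathcal{L} = \lVert Φ * \hat{Y} - Φ * Y \rVert_{1} = \sum_{t=1}^{T} \left| (Φ * \hat{Y})_{t} - (Φ * Y)_{t} \right|$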

In an embodiment, the length of the training signal is determined according to an amount of the training frames, and the function is the Gaussian kernel.

Moreover, the Gaussian kernel is:

$Φ_{1 D} = G_{1 D} (u; σ) = \frac{1}{\sqrt{2 π σ^{2}}} e^{- \frac{u^{2}}{2 σ^{2}}},$

where the parameter σ determines a width of the Gaussian kernel.

In statistics, when the Gaussian is considered as a probability density function, σ is the standard deviation.
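A minimal sketch of sampling the Gaussian kernel at discrete frame offsets is given below. The disclosure states only that the function is initialized according to the length of the training signal; how the length determines the number of taps or the value of σ is not specified, so the parameters `length` and `sigma` and the function name `gaussian_kernel_1d` are assumptions made for illustration.

```python
import math
from typing import List


def gaussian_kernel_1d(length: int, sigma: float) -> List[float]:
    """Sample G(u; sigma) at `length` unit-spaced offsets u centered on zero."""
    if length < 1 or sigma <= 0:
        raise ValueError("length must be >= 1 and sigma must be > 0")
    center = (length - 1) / 2.0
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return [
        norm * math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
        for i in range(length)
    ]


# Hypothetical choice: seven taps with sigma of one frame.
kernel = gaussian_kernel_1d(length=7, sigma=1.0)
```

A larger σ spreads each impulse over more neighboring frames, and a smaller σ keeps the filtered signal closer to the original impulse.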

With reference to FIG. 3, the impulse-like gesture recognition system includes a performing device 10 and a training device 20.

The performing device 10 includes a sensing unit 101, a memory unit 102, and a processing unit 103.

The sensing unit 101 senses the sensing signal, and the sensing signal comprises a plurality of sensing frames. The memory unit 102 stores the deep learning-based model.

The processing unit 103 is electrically connected to the sensing unit 101 and the memory unit 102. The processing unit 103 receives the sensing signal from the sensing unit 101, determines a prediction with at least one impulse-like label according to the sensing frames by the deep learning-based model stored in the memory unit 102, and classifies at least one gesture event according to the prediction.

In the embodiment, the at least one impulse-like label labels at least one detection score of the deep learning-based model.

Further, the training device 20 includes a memory unit 201 and a processing unit 202. The memory unit 201 stores the deep learning-based model, a training signal, and a ground truth. The training signal comprises a plurality of training frames.

The processing unit 202 is electrically connected to the memory unit 201. The processing unit 202 receives the training signal, determines the prediction with the at least one impulse-like label according to the training frames by the deep learning-based model, receives the ground truth with the at least one impulse-like label, filters the prediction and the ground truth, measures the Manhattan distance between the filtered prediction and the filtered ground truth, and supervises a training of the deep learning-based model by using the Manhattan distance as a loss function.

In the embodiment, the deep learning-based model stored in the memory unit 102 of the performing device 10 is loaded from the memory unit 201 of the training device 20.

Moreover, the processing unit 202 of the training device 20 further determines a length of the training signal and initializes a function according to the length of the training signal.

In an embodiment, the length of the training signal is determined according to an amount of the training frames, and the function is the Gaussian kernel.

For example, with reference to FIGS. 4A and 4B, when the deep learning-based model receives the sensing signal, the deep learning-based model can determine the prediction $\hat{Y}_{i}$ with the at least one impulse-like label shown in FIG. 4A. The ground truth $Y_{i}$ with the at least one impulse-like label stored in the memory unit 201 of the training device 20 is shown in FIG. 4B.
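For illustration only, a ground truth with impulse-like labels may be represented as a per-frame sequence that is zero everywhere except at the frames where gesture events are annotated. The drawings are not reproduced here, so the unit impulse height, the function name `impulse_ground_truth`, and the example frame indices are assumptions of this sketch rather than details taken from FIG. 4B.

```python
from typing import List, Sequence


def impulse_ground_truth(num_frames: int, event_frames: Sequence[int]) -> List[float]:
    """Build a per-frame label sequence with a unit impulse at each event frame."""
    labels = [0.0] * num_frames
    for t in event_frames:
        if not 0 <= t < num_frames:
            raise ValueError(f"event frame {t} is outside 0..{num_frames - 1}")
        labels[t] = 1.0
    return labels


# Hypothetical annotation: two gesture events in a ten-frame training signal.
ground_truth = impulse_ground_truth(num_frames=10, event_frames=[3, 7])
```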

Further, with reference to FIGS. 5A and 5B, when the prediction with the at least one impulse-like label is filtered by the Gaussian kernel, the filtered prediction is shown in FIG. 5A. When the ground truth with the at least one impulse-like label is filtered by the Gaussian kernel, the filtered ground truth is shown in FIG. 5B.

When the prediction and the ground truth are filtered by the Gaussian kernel, the processing unit 202 of the training device 20 can measure the Manhattan distance between the filtered prediction and the filtered ground truth, and further train the deep learning-based model by using the Manhattan distance as the loss function.
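The filtering and distance measurement described above may be sketched as follows. This is a minimal sketch under stated assumptions: the filtering is taken to be a zero-padded convolution with an odd-length Gaussian kernel, a single gesture class is considered, and the names `gaussian_kernel_1d`, `filter_same`, and `manhattan_loss` as well as the example sequences are hypothetical. Gradient-based training of the deep learning-based model is not shown.

```python
import math
from typing import List, Sequence


def gaussian_kernel_1d(length: int, sigma: float) -> List[float]:
    """Sample the Gaussian kernel at an odd number of unit-spaced offsets."""
    half = length // 2
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return [
        norm * math.exp(-((i - half) ** 2) / (2.0 * sigma ** 2))
        for i in range(length)
    ]


def filter_same(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Filter `signal` with `kernel`, zero-padding so the length is kept."""
    half = len(kernel) // 2
    filtered = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            s = t + k - half
            if 0 <= s < len(signal):
                acc += weight * signal[s]
        filtered.append(acc)
    return filtered


def manhattan_loss(
    prediction: Sequence[float],
    ground_truth: Sequence[float],
    kernel: Sequence[float],
) -> float:
    """Manhattan (L1) distance between the filtered prediction and ground truth."""
    filtered_pred = filter_same(prediction, kernel)
    filtered_truth = filter_same(ground_truth, kernel)
    return sum(abs(p - g) for p, g in zip(filtered_pred, filtered_truth))


# Hypothetical twelve-frame example with one annotated event at frame 5.
kernel = gaussian_kernel_1d(length=7, sigma=1.5)
truth = [0.0] * 12
truth[5] = 1.0
pred_near = [0.0] * 12
pred_near[6] = 1.0  # impulse one frame late
pred_far = [0.0] * 12
pred_far[10] = 1.0  # impulse five frames late

loss_exact = manhattan_loss(truth, truth, kernel)
loss_near = manhattan_loss(pred_near, truth, kernel)
loss_far = manhattan_loss(pred_far, truth, kernel)
```

Without filtering, both misplaced impulses in this example would be at the same Manhattan distance of 2 from the ground truth. After filtering, the impulse that is one frame late yields a smaller distance than the impulse that is five frames late, so the loss grows with the timing error.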

In the embodiment of the present invention, the sensing unit is a Doppler radar, the performing device is a smart phone with the Doppler radar, and the deep learning-based model is a Convolutional Neural Network (CNN) or a Recurrent Neural Network (RNN).

In the embodiment of the present invention, the impulse-like gesture recognition method is executed by the impulse-like gesture recognition system. For example, the performing device 10 of the impulse-like gesture recognition system executes the performing procedure of the impulse-like gesture recognition method, and the training device 20 of the impulse-like gesture recognition system executes the training procedure of the impulse-like gesture recognition method.

Even though numerous characteristics and advantages of the present invention have been set forth in the foregoing description, together with details of the structure and function of the invention, the disclosure is illustrative only. Changes may be made in detail, especially in matters of shape, size, and arrangement of parts within the principles of the invention to the full extent indicated by the broad general meaning of the terms in which the appended claims are expressed.

Claims

1. A gesture recognition method comprising: a performing procedure, wherein the performing procedure includes steps of: receiving a sensing signal from a Doppler radar; wherein the sensing signal comprises a plurality of sensing frames; determining a prediction with at least one label according to the sensing frames by a deep learning-based model; wherein the at least one label labels at least one detection score of the deep learning-based model; and classifying at least one gesture event according to the prediction.
2. The gesture recognition method as claimed in claim 1, further comprising: a training procedure for training the deep learning-based model, wherein the training procedure comprises steps of: receiving a training signal; wherein the training signal comprises a plurality of training frames; determining the prediction with the at least one label according to the training frames by the deep learning-based model; receiving a ground truth with the at least one label; filtering the prediction and the ground truth; measuring the Manhattan distance between the filtered prediction and the filtered ground truth; and supervising a training of the deep learning-based model by using the Manhattan distance as a loss function.
3. The gesture recognition method as claimed in claim 2, wherein the training procedure further comprises steps of: determining a length of the training signal; and initializing a function according to the length of the training signal; wherein the prediction and the ground truth are filtered by the initialized function.
4. The gesture recognition method as claimed in claim 3, wherein the length of the training signal is determined according to an amount of the training frames.
5. The gesture recognition method as claimed in claim 3, wherein the function is the Gaussian kernel.
6. A gesture recognition system comprising: a performing device; wherein the performing device comprises: a Doppler radar, wherein the Doppler radar senses a sensing signal and the sensing signal comprises a plurality of sensing frames; a first memory unit, wherein the first memory unit stores a deep learning-based model; and a first processing unit electrically connected to the Doppler radar and the first memory unit, wherein the first processing unit receives the sensing signal from the Doppler radar, determines a prediction with at least one label according to the sensing frames by the deep learning-based model stored in the first memory unit, and classifies at least one gesture event according to the prediction; wherein the at least one label labels at least one detection score of the deep learning-based model; and a training device, wherein the training device comprises: a second memory unit, wherein the second memory unit stores the deep learning-based model, a training signal, and a ground truth; wherein the training signal comprises a plurality of training frames; and a second processing unit, electrically connected to the second memory unit of the training device; wherein the second processing unit receives the training signal, determines the prediction with the at least one label according to the training frames by the deep learning-based model, receives the ground truth with the at least one label, filters the prediction and the ground truth, measures the Manhattan distance between the filtered prediction and the filtered ground truth, and supervises a training of the deep learning-based model by using the Manhattan distance as a loss function; wherein the deep learning-based model stored in the first memory unit is loaded from the second memory unit.
7. The gesture recognition system as claimed in claim 6, wherein the second processing unit further determines a length of the training signal and initializes a function according to the length of the training signal; wherein the prediction and the ground truth are filtered by the initialized function.
8. The gesture recognition system as claimed in claim 7, wherein the length of the training signal is determined according to an amount of the training frames.
9. The gesture recognition system as claimed in claim 7, wherein the function is the Gaussian kernel.

Related Publications (1)

	Number	Date	Country
	20220137184 A1	May 2022	US
