An example embodiment of the present invention relates generally to user interface technology and, more particularly, to a method, apparatus and computer program product for identifying a gesture.
In order to facilitate user interaction with a computing device, user interfaces have been developed to respond to gestures by the user. Typically, these gestures are intuitive and therefore serve to facilitate the use of the computing device and to improve the overall user experience. The gestures that may be recognized by a computing device may serve numerous functions, such as to open a file, close a file, move to a different location within the file, increase the volume, etc. One type of gesture that may be recognized by a computing device is a hand wave. A hand wave may be defined to provide various types of user input including, for example, navigational commands to control a media player, gallery browsing or a slide presentation.
Computing devices generally provide for gesture recognition based upon the signals provided by a single sensor, such as a camera, an accelerometer or a radar sensor. By relying upon a single sensor, however, computing devices may be somewhat limited with regard to the recognition of gestures. For example, a computing device that relies upon a camera to capture images from which a gesture is recognized may have difficulty adapting to changes in the illumination as well as the white balance within the images captured by the camera. Also, computing devices that rely upon an accelerometer or gyroscope to provide the signals from which a gesture is recognized cannot detect the gesture in an instance in which the computing device itself is fixed in position. Further, a computing device that relies upon a radar sensor to provide the signals from which a gesture is identified may have difficulty determining what the object making the gesture actually is.
A method, apparatus and computer program product are therefore provided according to an example embodiment in order to provide for improved gesture recognition based upon the fusion of signals provided by different types of sensors. In one embodiment, for example, a method, apparatus and computer program product are provided in order to recognize a gesture based upon the fusion of signals provided by a camera or other image capturing device and a radar sensor. By relying upon the signals provided by different types of sensors and by appropriately weighting the evaluation scores associated with those signals, a gesture may be recognized in a more reliable fashion and with fewer limitations than is possible with computing devices that rely upon a single sensor for the recognition of a gesture.
In one embodiment, a method is provided that includes receiving a series of image frames and receiving a sequence of radar signals. The method of this embodiment also determines an evaluation score for the series of image frames that is indicative of a gesture. In this regard, the determination of the evaluation score may include determining the evaluation score based on the motion blocks in an image area and the shift of the motion blocks between image frames. The method of this embodiment also includes determining an evaluation score for the sequence of radar signals that is indicative of the gesture. In this regard, the determination of the evaluation score may include determining the evaluation score based upon the sign distribution in the sequence and the intensity distribution in the sequence. The method of this embodiment also weights each of the evaluation scores and fuses the evaluation scores, following the weighting, to identify the gesture.
The method may determine the evaluation score for the series of image frames by down-sampling image data to generate down-sampled image blocks for the series of image frames, extracting a plurality of features from the down-sampled image blocks and determining a moving status of the down-sampled image blocks so as to determine the motion blocks based upon changes in values of respective features in consecutive image frames. In this regard, the method may also determine a direction of motion of the gesture based on movement of a first border and a second border of a projection histogram determined based on the moving status of respective down-sampled image blocks.
The method of one embodiment may determine the evaluation score for the series of image frames by determining the evaluation score based on a ratio of average motion blocks in the image area. The intensity of the radar signals may depend upon the distance between an object that makes the gesture and the radar sensor, while a sign associated with the radar signals may depend upon the direction of motion of the object relative to the radar sensor. Weighting each of the evaluation scores may include determining weights to be associated with the evaluation scores based upon linear discriminant analysis, Fisher discriminant analysis or a linear support vector machine. The method of one embodiment may also include determining a direction of motion of the gesture based upon the series of image frames in an instance in which the gesture is identified.
In another embodiment, an apparatus is provided that includes at least one processor and at least one memory including computer program code with the memory and the computer program code being configured to, with the processor, cause the apparatus to receive a series of image frames and to receive a sequence of radar signals. The at least one memory and the computer program code of this embodiment are also configured to, with the processor, cause the apparatus to determine an evaluation score for the series of image frames that is indicative of a gesture by determining the evaluation score based upon the motion blocks in an image area and a shift of motion blocks between image frames. The at least one memory and the computer program code of this embodiment are also configured to, with the processor, cause the apparatus to determine an evaluation score for the sequence of radar signals that is indicative of the gesture by determining the evaluation score based upon the sign distribution in the sequence and the intensity distribution in the sequence. The at least one memory and the computer program code of this embodiment are also configured to, with the processor, cause the apparatus to weight each of the evaluation scores and fuse the evaluation scores, following the weighting, to identify the gesture.
The at least one memory and the computer program code are also configured to, with the processor, cause the apparatus of one embodiment to determine the evaluation score for the series of image frames by down-sampling image data to generate down-sampled image blocks for the series of image frames, extracting a plurality of features from the down-sampled image blocks and determining a moving status of the down-sampled image blocks so as to determine the motion blocks based upon changes in values of respective features in consecutive image frames. The at least one memory and the computer program code of this embodiment may be further configured to, with the processor, cause the apparatus to determine a direction of motion of the gesture based on movement of a first border and a second border of a projection histogram determined based on the moving status of respective down-sampled image blocks.
The at least one memory and the computer program code of one embodiment may be configured to, with the processor, cause the apparatus to determine an evaluation score for the series of image frames by determining the evaluation score based upon a ratio of average motion blocks in the image area. The intensity of the radar signals may depend upon the distance between an object that makes the gesture and the radar sensor, while a sign associated with the radar signals may depend upon a direction of motion of the object relative to the radar sensor. The at least one memory and the computer program code are configured to, with the processor, cause the apparatus of one embodiment to weight each of the evaluation scores by determining weights to be associated with the evaluation scores based upon linear discriminant analysis, Fisher discriminant analysis or a linear support vector machine. The at least one memory and the computer program code are further configured to, with the processor, cause the apparatus of one embodiment to determine a direction of motion of the gesture based upon the series of image frames in an instance in which the gesture is identified. The apparatus of one embodiment may also include user interface circuitry configured to facilitate user control of at least some functions of the apparatus through use of a display and cause at least a portion of the user interface of the apparatus to be displayed on the display to facilitate user control of at least some functions of the apparatus.
In a further embodiment, a computer program product is provided that includes at least one computer-readable storage medium having computer-executable program code portions stored therein with the computer-executable program code portions including program instructions configured to receive a series of image frames and to receive a sequence of radar signals. The program instructions of this embodiment are also configured to determine an evaluation score for the series of image frames that is indicative of a gesture by determining the evaluation score based upon motion blocks in an image area and the shift of motion blocks between image frames. The program instructions of this embodiment are also configured to determine an evaluation score for the sequence of radar signals that is indicative of the gesture by determining the evaluation score based upon the sign distribution in the sequence and the intensity distribution in the sequence. The program instructions of this embodiment are also configured to weight each of the evaluation scores and to fuse the evaluation scores, following the weighting, to identify the gesture.
The computer-executable program code portions of one embodiment may also include program instructions configured to determine the evaluation score for the series of image frames by down-sampling image data to generate down-sampled image blocks for the series of image frames, extracting a plurality of features from the down-sampled image blocks and determining a moving status of the down-sampled image blocks so as to determine the motion blocks based upon changes in values of respective features in consecutive image frames. The computer-executable program code portions of this embodiment may also include program instructions configured to determine a direction of motion of the gesture based on movement of a first border and a second border of a projection histogram determined based on the moving status of respective down-sampled image blocks.
The program instructions that are configured to determine an evaluation score for the series of image frames in accordance with one embodiment may include program instructions configured to determine the evaluation score based upon a ratio of the average motion blocks in the image area. The radar signals may have an intensity that depends upon a distance between an object that makes the gesture and the radar sensor and a sign that depends upon a direction of motion of the object relative to the radar sensor. The program instructions that are configured to weight each of the evaluation scores may include, in one embodiment, program instructions configured to determine weights to be associated with the evaluation scores based upon linear discriminant analysis, Fisher discriminant analysis or a linear support vector machine. The computer-executable program code portions of one embodiment may also include program instructions configured to determine a direction of motion of the gesture based upon the series of image frames in an instance in which the gesture is identified.
In yet another embodiment, an apparatus is provided that includes means for receiving a series of image frames and means for receiving a sequence of radar signals. The apparatus of this embodiment also includes means for determining an evaluation score for the series of image frames that is indicative of a gesture. In this regard, the means for determining the evaluation score may determine the evaluation score based upon the motion blocks in an image area and a shift of motion blocks between image frames. The apparatus of this embodiment also includes means for determining an evaluation score for the sequence of radar signals that is indicative of the gesture. In this regard, the means for determining the evaluation score may determine the evaluation score based upon the sign distribution in the sequence and the intensity distribution in the sequence. The apparatus of this embodiment also includes means for weighting each of the evaluation scores and means for fusing the evaluation scores, following the weighting, to identify the gesture.
Having thus described certain example embodiments of the present invention in general terms, reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, and wherein:
Some embodiments of the present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. Indeed, various embodiments of the invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like reference numerals refer to like elements throughout. As used herein, the terms “data,” “content,” “information,” and similar terms may be used interchangeably to refer to data capable of being transmitted, received and/or stored in accordance with embodiments of the present invention. Thus, use of any such terms should not be taken to limit the spirit and scope of embodiments of the present invention.
Additionally, as used herein, the term ‘circuitry’ refers to (a) hardware-only circuit implementations (e.g., implementations in analog circuitry and/or digital circuitry); (b) combinations of circuits and computer program product(s) comprising software and/or firmware instructions stored on one or more computer readable memories that work together to cause an apparatus to perform one or more functions described herein; and (c) circuits, such as, for example, a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation even if the software or firmware is not physically present. This definition of ‘circuitry’ applies to all uses of this term herein, including in any claims. As a further example, as used herein, the term ‘circuitry’ also includes an implementation comprising one or more processors and/or portion(s) thereof and accompanying software and/or firmware. As another example, the term ‘circuitry’ as used herein also includes, for example, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, other network device, and/or other computing device.
As defined herein, a “computer-readable storage medium,” which refers to a non-transitory physical storage medium (e.g., volatile or non-volatile memory device), can be differentiated from a “computer-readable transmission medium,” which refers to an electromagnetic signal.
As described below, a method, apparatus and computer program product are provided that permit a gesture, such as a hand wave, to be identified based upon the fusion of multiple and different types of sensor signals. For example, the method, apparatus and computer program product of one embodiment may identify a gesture based upon the fusion of sensor signals from a camera or other image capturing device and sensor signals from a radar sensor. As described below, the apparatus that may identify a gesture based upon the fusion of sensor signals may, in one example embodiment, be configured as shown in the accompanying figures.
It should also be noted that while the accompanying figures illustrate one example of a configuration of an apparatus that may identify a gesture, numerous other configurations may also be used to implement embodiments of the present invention.
Referring now to the figures, the apparatus 10 for identifying a gesture may include or otherwise be in communication with a processor 12, a memory 14, a communication interface 16 and, optionally, a user interface 18, as well as a camera 20 and a radar sensor 22, as described below.
The apparatus 10 may, in some embodiments, be a user terminal (e.g., a mobile terminal) or a fixed communication device or computing device configured to employ an example embodiment of the present invention. However, in some embodiments, the apparatus 10 or at least components of the apparatus, such as the processor 12, may be embodied as a chip or chip set. In other words, the apparatus 10 may comprise one or more physical packages (e.g., chips) including materials, components and/or wires on a structural assembly (e.g., a baseboard). The structural assembly may provide physical strength, conservation of size, and/or limitation of electrical interaction for component circuitry included thereon. The apparatus 10 may therefore, in some cases, be configured to implement an embodiment of the present invention on a single chip or as a single “system on a chip.” As such, in some cases, a chip or chipset may constitute means for performing one or more operations for providing the functionalities described herein.
The processor 12 may be embodied in a number of different ways. For example, the processor 12 may be embodied as one or more of various hardware processing means such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), a processing element with or without an accompanying DSP, or various other processing circuitry including integrated circuits such as, for example, an ASIC (application specific integrated circuit), an FPGA (field programmable gate array), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. As such, in some embodiments, the processor 12 may include one or more processing cores configured to perform independently. A multi-core processor may enable multiprocessing within a single physical package. Additionally or alternatively, the processor 12 may include one or more processors configured in tandem via the bus to enable independent execution of instructions, pipelining and/or multithreading.
In an example embodiment, the processor 12 may be configured to execute instructions stored in the memory 14 or otherwise accessible to the processor. Alternatively or additionally, the processor 12 may be configured to execute hard coded functionality. As such, whether configured by hardware or software methods, or by a combination thereof, the processor 12 may represent an entity (e.g., physically embodied in circuitry) capable of performing operations according to an embodiment of the present invention while configured accordingly. Thus, for example, when the processor 12 is embodied as an ASIC, FPGA or the like, the processor may be specifically configured hardware for conducting the operations described herein. Alternatively, as another example, when the processor 12 is embodied as an executor of software instructions, the instructions may specifically configure the processor 12 to perform the algorithms and/or operations described herein when the instructions are executed. However, in some cases, the processor 12 may be a processor of a specific device (e.g., a mobile terminal) configured to employ an embodiment of the present invention by further configuration of the processor 12 by instructions for performing the algorithms and/or operations described herein. The processor 12 may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processor.
Meanwhile, the communication interface 16 may be any means such as a device or circuitry embodied in either hardware or a combination of hardware and software that is configured to receive and/or transmit data from/to a network and/or any other device or module in communication with the apparatus 10. In this regard, the communication interface 16 may include, for example, an antenna (or multiple antennas) and supporting hardware and/or software for enabling communications with a wireless communication network. Additionally or alternatively, the communication interface 16 may include the circuitry for interacting with the antenna(s) to cause transmission of signals via the antenna(s) or to handle receipt of signals received via the antenna(s). In some environments, the communication interface 16 may alternatively or also support wired communication. As such, for example, the communication interface 16 may include a communication modem and/or other hardware/software for supporting communication via cable, digital subscriber line (DSL), universal serial bus (USB) or other mechanisms.
In some embodiments, such as instances in which the apparatus 10 is embodied by a user device, the apparatus may include a user interface 18 that may, in turn, be in communication with the processor 12 to receive an indication of a user input and/or to cause provision of an audible, visual, mechanical or other output to the user. As such, the user interface 18 may include, for example, a keyboard, a mouse, a joystick, a display, a touch screen(s), touch areas, soft keys, a microphone, a speaker, or other input/output mechanisms. Alternatively or additionally, the processor 12 may comprise user interface circuitry configured to control at least some functions of one or more user interface elements such as, for example, a speaker, ringer, microphone, display, and/or the like. The processor 12 and/or user interface circuitry comprising the processor may be configured to control one or more functions of one or more user interface elements through computer program instructions (e.g., software and/or firmware) stored on a memory accessible to the processor (e.g., memory 14, and/or the like). In other embodiments, however, the apparatus 10 may not include a user interface 18.
The apparatus 10 may include or otherwise be associated or in communication with a camera 20 or other image capturing element configured to capture a series of image frames including images of a gesture, such as a hand wave. In an example embodiment, the camera 20 is in communication with the processor 12. As noted above, the camera 20 may be any means for capturing an image for analysis, display and/or transmission. For example, the camera 20 may include a digital camera capable of forming a digital image file from a captured image. As such, the camera 20 includes all hardware, such as a lens or other optical device, and software necessary for creating a digital image file from a captured image. Alternatively, the camera 20 may include only the hardware needed to view an image, while the memory 14 stores instructions for execution by the processor 12 in the form of software necessary to create a digital image file from a captured image. In an example embodiment, the camera 20 may further include a processing element such as a co-processor which assists the processor 12 in processing image data and an encoder and/or decoder for compressing and/or decompressing image data. The encoder and/or decoder may encode and/or decode according to a joint photographic experts group (JPEG) standard format. The images that are recorded may be stored for future viewing and/or manipulation in the memory 14.
The apparatus 10 may also include or otherwise be associated or in communication with a radar sensor 22 configured to capture a sequence of radar signals indicative of the presence and movement of an object, such as the hand of a user that is making a gesture, such as a hand wave. Radar supports an object detection system that utilizes electromagnetic waves, such as radio waves, to detect the presence of objects, their speed and direction of movement, as well as their range from the radar sensor 22. Emitted waves which bounce back, e.g., reflect, from an object are detected by the radar sensor 22. In some radar systems, the range to an object may be determined based on the time difference between the emitted and reflected waves. Additionally, movement of the object toward or away from the radar sensor 22 may be detected through the detection of a Doppler shift. Further, the direction to an object may be determined by radar sensors 22 with two or more receiver channels by angle estimation methods, for example, beamforming. The radar sensor 22 may be embodied by any of a variety of radar devices, such as a Doppler radar system, a frequency modulated continuous wave (FMCW) radar or an impulse/ultra wideband radar.
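By way of a non-limiting illustration, the standard radar relations underlying these measurements may be sketched as follows; the 24 GHz carrier frequency in the example is merely an assumed value for illustration and is not a property of the radar sensor 22 described herein:

```python
C = 3.0e8  # speed of light (m/s)

def range_from_round_trip(delta_t_s: float) -> float:
    """Range from the delay between emitted and reflected waves: R = c * dt / 2."""
    return C * delta_t_s / 2.0

def radial_velocity_from_doppler(doppler_hz: float, carrier_hz: float) -> float:
    """Radial velocity from the Doppler shift: v = f_d * c / (2 * f_c).
    A positive shift corresponds to motion toward the sensor."""
    return doppler_hz * C / (2.0 * carrier_hz)

# Example: a 24 GHz sensor observing a 160 Hz Doppler shift sees an object
# moving toward it at 1 m/s.
v = radial_velocity_from_doppler(160.0, 24.0e9)
```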
The operations performed by a method, apparatus and computer program product of one example embodiment may be described with reference to the accompanying flowchart. In this regard, the apparatus 10 may include means, such as the processor 12, the camera 20, the radar sensor 22 or the like, for receiving a series of image frames and for receiving a sequence of radar signals.
The series of image frames and the sequence of radar signals may then be processed and respective evaluation scores may be determined for the series of image frames and for the sequence of radar signals. In this regard, the evaluation score for the series of image frames may be indicative of a gesture in that the evaluation score provides an indication as to the likelihood that a gesture was recognized within the series of image frames. Similarly, the evaluation score that is determined for the sequence of radar signals provides an indication as to the likelihood that a gesture was recognized within the sequence of radar signals.
In this regard and as shown in block 34 of the flowchart, the apparatus 10 may include means, such as the processor 12 or the like, for determining an evaluation score for the series of image frames that is indicative of a gesture, such as by determining the evaluation score based upon the motion blocks in an image area and the shift of the motion blocks between image frames.
In this regard and as shown in the accompanying figures, the image data may first be preprocessed in order to facilitate the determination of the evaluation score.
In some embodiments, the preprocessing may include down-sampling, as indicated above, in order to reduce the influence that could otherwise be caused by pixel-wise noise. In an example embodiment, each input image may be smoothed and down-sampled such that a mean value of a predetermined number of pixels (e.g., a 4-pixel by 4-pixel patch) is assigned to a corresponding pixel of a down-sampled image. Thus, in this example, the working resolution would be 1/16 of the input resolution. For a working image Fi,j, where 1≦i≦H and 1≦j≦W, with W and H being the width and height of the image, respectively, and given a length λ (10 in one example), the image can be partitioned into M×N square blocks Zi,j with 1≦i≦M and 1≦j≦N, where M=H/λ and N=W/λ. For each block, various statistical characteristics may then be computed with respect to the red, green and blue channels descriptive of the pixel values within the down-sampled image. A plurality of features may then be extracted from the down-sampled image. In an example embodiment, the following six statistical characteristics (or features) may be computed: the mean of the luminance L, the variance of the luminance LV, the mean of the red channel R, the mean of the green channel G, the mean of the blue channel B, and the mean of the normalized red channel NR. The normalized red value may be computed as shown in equation (1) below:
nr=255*r/(r+g+b) (1)
where r, g and b are values of the original three channels, respectively. An example embodiment has shown that the normalized red value may often be the simplest value that may be used to approximately describe skin color in a phone camera environment. Normally, for a typical skin area (e.g., a hand and/or a face) in the image, the normalized red value will be relatively large compared with those of background objects.
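As a minimal illustrative sketch of this feature extraction, assuming NumPy, an RGB working image whose dimensions are multiples of the block length λ, and a conventional Rec. 601 luminance formula (the disclosure does not specify how the luminance L is computed), the six block features might be derived as follows:

```python
import numpy as np

def block_features(img: np.ndarray, lam: int = 10) -> dict:
    """Partition an H x W x 3 RGB working image into lam x lam blocks and
    compute, per block, the six features named in the text: mean luminance L,
    luminance variance LV, channel means R, G and B, and mean normalized red
    NR = 255 * r / (r + g + b) per equation (1)."""
    H, W, _ = img.shape
    M, N = H // lam, W // lam
    img = img[:M * lam, :N * lam].astype(np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    lum = 0.299 * r + 0.587 * g + 0.114 * b      # assumed luminance formula
    nr = 255.0 * r / np.maximum(r + g + b, 1.0)  # guard against all-zero pixels

    def pool(x: np.ndarray, stat) -> np.ndarray:
        # Reshape into (M, lam, N, lam) so each (i, j) block can be reduced.
        return stat(x.reshape(M, lam, N, lam), axis=(1, 3))

    return {"L": pool(lum, np.mean), "LV": pool(lum, np.var),
            "R": pool(r, np.mean), "G": pool(g, np.mean),
            "B": pool(b, np.mean), "NR": pool(nr, np.mean)}
```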
Moving block estimation may then be performed with respect to the data corresponding to the six statistical characteristics (or features) extracted in the example described above. For gesture detection, such as hand wave detection, the moving status of the blocks may be determined by checking for changes between the blocks of a current frame and a previous frame.
More specifically, a block Zi,j,t (where t denotes the index of the frame) may be regarded as a moving block if the following conditions are satisfied:
(1) |Li,j,t−Li,j,t-1|>θ1 or NRi,j,t−NRi,j,t-1>θ2. This condition stresses the difference between consecutive frames.
(2) LVi,j,t<θ3. This condition is based on the fact that the hand area typically has a uniform color distribution.
(3) Ri,j,t>θ4
(4) Ri,j,t>θ5*Gi,j,t and Ri,j,t>θ5*Bi,j,t
(5) Ri,j,t>θ6*Gi,j,t or Ri,j,t>θ6*Bi,j,t
Of note, conditions (3)-(5) reflect the fact that, for skin, the red channel typically has a relatively larger value compared with the blue and green channels.
(6) θ7<Li,j,t<θ8. This is an empirical condition to discard the most evident background objects. In an example embodiment, the above θ1-θ8 may be set as 15, 10, 30, 10, 0.6, 0.8, 10 and 240, respectively.
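A minimal sketch of this moving-block test, assuming the feature maps produced in the previous sketch and assuming that conditions (1) through (6) must all hold jointly (the combination is not stated explicitly above), might read:

```python
import numpy as np

# Example threshold values from the text: θ1..θ8
TH = (15, 10, 30, 10, 0.6, 0.8, 10, 240)

def moving_blocks(cur: dict, prev: dict, th=TH) -> np.ndarray:
    """Boolean M x N map marking moving blocks; `cur` and `prev` are the
    feature dicts for frames t and t-1, respectively."""
    t1, t2, t3, t4, t5, t6, t7, t8 = th
    c1 = (np.abs(cur["L"] - prev["L"]) > t1) | (cur["NR"] - prev["NR"] > t2)
    c2 = cur["LV"] < t3                                  # uniform hand color
    c3 = cur["R"] > t4                                   # skin: strong red channel
    c4 = (cur["R"] > t5 * cur["G"]) & (cur["R"] > t5 * cur["B"])
    c5 = (cur["R"] > t6 * cur["G"]) | (cur["R"] > t6 * cur["B"])
    c6 = (t7 < cur["L"]) & (cur["L"] < t8)               # drop extreme backgrounds
    return c1 & c2 & c3 & c4 & c5 & c6
```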
A projection histogram may then be formed from the moving status of the blocks, such as by counting the moving blocks in each column of the image. The left border BLt and right border BRt of the histogram may be determined as the leftmost and rightmost non-empty bins of the histogram, respectively.
With respect to sequential image frames designated as t, t−1 and t−2, it may be determined that a right wave has occurred in the sequence in an instance in which the two conditions below are satisfied, that is, in which the right border shifts rightward across consecutive frames, along with corresponding constraints upon the histogram values HBL near the left border:
(1) BRt>BRt-1+1
(2) BRt>BRt-2+1
However, if the two conditions below are satisfied instead, along with corresponding constraints upon the histogram values HBR near the right border, it may be determined that a left wave has occurred in the sequence:
(3) BLt<BLt-1−1
(4) BLt<BLt-2−1
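Because the constraints upon the histogram values HBL and HBR are not reproduced in full above, the following sketch illustrates only the recoverable border-shift tests; the histogram construction and all function names are illustrative assumptions:

```python
import numpy as np

def projection_histogram(moving: np.ndarray) -> np.ndarray:
    """Vertical projection: number of moving blocks in each column."""
    return moving.sum(axis=0)

def borders(hist: np.ndarray):
    """Left and right borders: the outermost non-empty histogram bins."""
    nz = np.nonzero(hist)[0]
    return (int(nz[0]), int(nz[-1])) if nz.size else (None, None)

def wave_direction(BL, BR):
    """BL and BR hold border positions for frames ordered [t-2, t-1, t]."""
    if None in BL or None in BR:
        return None
    if BR[2] > BR[1] + 1 and BR[2] > BR[0] + 1:   # right border sweeps rightward
        return "right"
    if BL[2] < BL[1] - 1 and BL[2] < BL[0] - 1:   # left border sweeps leftward
        return "left"
    return None
```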
To deal with cases in which the track of a hand is not entirely horizontal, such as the angled left-to-right and right-to-left movements shown in the accompanying figures, the conditions above may be modified to operate upon diagonal projection histograms, such as 45 degree histograms. Similarly, a corresponding condition, designated equation (7), may be employed for use with 135 degree histograms.
The conditions above (with or without modifications for detection of angles other than 0 degrees) may be used for hand wave detection in various different orientations. An example of the vertical histograms associated with a series of image frames with moving blocks is shown in the accompanying figures.
To eliminate or reduce the likelihood of false alarms caused by background movement (which may occur in driving environments or other environments in which the user is moving), a region-wise color histogram may also be used to verify detection, as indicated in operation 62 of the flowchart. In this regard, the image may be divided into sub-regions, and a color histogram HCi,t may be computed for the ith sub-region of frame t.
After detection of a hand wave, HC1,t-HC6,t may be used for verification. Specifically, for example, if an ith sub-region contains moving blocks, the squared Euclidean distance may be computed between HCi,t and HCi,t-1.
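A minimal sketch of this per-sub-region verification, assuming normalized color histograms and an illustrative distance threshold (the disclosure does not state a threshold value), might be:

```python
import numpy as np

def region_changed(hc_t: np.ndarray, hc_prev: np.ndarray, tau: float = 0.05) -> bool:
    """Verification for one sub-region that contains moving blocks: the squared
    Euclidean distance between the region's normalized color histograms at
    frames t and t-1 should indicate genuine foreground change (tau is
    illustrative only)."""
    return float(np.sum((hc_t - hc_prev) ** 2)) > tau
```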
Once the motion blocks have been identified, the apparatus 10, such as the processor 12, of one embodiment may determine the ratio of average effective motion blocks in the image area. The ratio of average effective motion blocks in the image area may be defined as the average percentage of motion blocks in each image of the series of image frames, as shown in the accompanying figures.
The apparatus 10, such as the processor 12, of one embodiment may also determine the shift of the motion blocks between image frames, such as between temporally adjacent image frames. In an image frame, such as shown in the accompanying figures, the shift distance for a left-right gesture may be determined based upon the movement of the left or right border of the moving block histogram between frames.
Although the shift distance for a forward-backward gesture in an instance in which the apparatus 10 is laid upon a horizontal surface with the camera 20 facing upwards may be determined in the same manner as described above in regards to a left-right gesture, the shift distance may be defined differently for an up-down gesture. In this regard, the shift distance for an up-down gesture in an instance in which the apparatus is laid upon a horizontal surface with the camera facing upwards may be the sum of the shift distances of both the left and right borders in the moving block histograms because the shift distance of the left or right border alone may not be sufficient for detection. Additionally and as described below, Pmin, Prange, Dmin and Drange for an up-down gesture may be the same as for other types of gestures, including a forward-backward gesture.
In one embodiment, the apparatus 10 may include means, such as the processor 12 or the like, for determining the evaluation score based upon the motion blocks in the image area and the shift of the motion blocks between the image frames, as shown in block 34 of the flowchart.
By way of further description with respect to Prange and Drange, an analysis of the collected signal data may permit Prange and Drange to be set so that a predefined percentage, such as 70%, of the moving block percentages are less than Prange and a predefined percentage, such as 70%, of the histogram border shifts in the hand wave sequences are less than Drange. Although Prange may be less than ½, the moving block percentage is generally near that value throughout the sequence of the hand wave. For certain frames, however, such as frame t−1 shown in the accompanying figures, the moving block percentage may deviate from that value.
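The exact mapping from the moving-block percentage and the border shift distance to the evaluation score is not reproduced above. One plausible, purely illustrative reading, in which each quantity is normalized by its (minimum, range) pair and clipped to [0, 1], is:

```python
def camera_score(P: float, D: float, Pmin: float, Prange: float,
                 Dmin: float, Drange: float) -> float:
    """Hypothetical evaluation score in [0, 1] for the series of image frames,
    combining the moving-block percentage P and the border shift distance D."""
    sp = min(max((P - Pmin) / Prange, 0.0), 1.0)   # normalized motion-block term
    sd = min(max((D - Dmin) / Drange, 0.0), 1.0)   # normalized border-shift term
    return (sp + sd) / 2.0                         # averaged; the source may differ
```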
With reference to block 36 of the flowchart, the apparatus 10 may also include means, such as the processor 12 or the like, for determining an evaluation score for the sequence of radar signals that is indicative of the gesture, such as by determining the evaluation score based upon the sign distribution in the sequence and the intensity distribution in the sequence.
By way of an example in which a hand moves from left to right relative to the radar sensor, the radar sensor may provide the following radar signals: 20, 13, 11, −12, −20, designated 1, 2, 3, 4 and 5, respectively, in the accompanying figures.
Based upon the radar signals, the apparatus 10, such as the processor 12, may initially determine the mean of the absolute values of the radar signal sequence R comprised of radar signals ri and having a length N. The mean of the absolute values advantageously exceeds a predefined threshold to ensure that the sequence of radar signals represents a gesture and is not simply random background movement. In an instance in which the mean of the absolute values satisfies the predefined threshold such that the sequence of radar signals is considered to represent a gesture, the apparatus, such as the processor, may determine whether the gesture is parallel to the display plane or perpendicular to the display plane. In one embodiment, the apparatus, such as the processor, may determine whether a quantity derived from the signs of the radar signals satisfies a predefined threshold. If the quantity is smaller than the predefined threshold, the gesture may be interpreted to be parallel to the display plane, while if the quantity equals or exceeds the predefined threshold, the gesture may be interpreted to be perpendicular to the display plane.
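The quantity used for this orientation test is not reproduced above. One plausible, assumption-labeled choice, which exploits the tendency of Doppler signs to cancel for motion parallel to the display, is sketched below:

```python
import numpy as np

def is_parallel(r: np.ndarray, thresh: float = 0.5) -> bool:
    """Hypothetical orientation test: for motion parallel to the display the
    Doppler signs largely cancel, so |sum(r)| / sum(|r|) tends to be small;
    for perpendicular motion the signs agree and the ratio approaches 1."""
    ratio = abs(float(r.sum())) / max(float(np.abs(r).sum()), 1e-9)
    return ratio < thresh
```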
In an instance in which the gesture is interpreted to be parallel to the display plane, the apparatus 10, such as the processor 12, may then determine the evaluation score based upon the sign distribution in the sequence of radar signals and the intensity distribution in the sequence of radar signals. By way of example, a sequence of radar signals may be defined to be ri with i=1, 2, 3, . . . N. In this embodiment, the effectiveness Eori of the sign distribution in this sequence may be defined to be equal to (Eori1+Eori2)/2. In order to determine the effectiveness of the sign distribution in the sequence of radar signals, the apparatus 10, such as the processor 12, may divide the sequence of radar signals into two portions, that is, R1 and R2, having lengths NR1 and NR2, respectively. In this regard, R1 and R2 may be defined as follows: R1={ri}, i=1, . . . NH, R2={ri}, i=NH+1 . . . , N, where NH is the half position of the sequence of radar signals.
As such, the apparatus 10, such as the processor 12, of this embodiment may define Eori1 and Eori2 based upon the consistency of the signs of the radar signals within R1 and R2, respectively. In this example, it is noted that if Eori1 or Eori2 is negative, the respective value will be set to zero.
The apparatus 10, such as the processor 12, of this embodiment may also determine the effectiveness Eint of the intensity distribution in the sequence of radar signals. Based upon the effectiveness Eori of the sign distribution in the sequence of radar signals and the effectiveness Eint of the intensity distribution in the sequence of radar signals, the apparatus 10, such as the processor 12, of this embodiment may determine the evaluation score for the sequence of radar signals to be Sr=Eori*Eint, with the score varying between 0 and 1.
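The defining equations for Eori1, Eori2 and Eint are not reproduced above, so the sketch below is only one plausible reading, in which consistent but opposite signs in the two halves of the sequence and an intensity peak near the half position are rewarded; it is applied to the example sequence given earlier:

```python
import numpy as np

def radar_parallel_score(r: np.ndarray) -> float:
    """Hypothetical evaluation score Sr = Eori * Eint for a gesture parallel
    to the display. Eori1 and Eori2 reward consistent, opposite signs in the
    two halves of the sequence; Eint rewards intensity peaking mid-sequence,
    where the hand passes closest to the radar sensor."""
    n = len(r)
    nh = n // 2                                  # the "half position" NH
    e1 = max(float(np.sign(r[:nh]).mean()), 0.0)   # sign consistency of R1
    e2 = max(float(-np.sign(r[nh:]).mean()), 0.0)  # opposite sign in R2
    eori = (e1 + e2) / 2.0                       # Eori = (Eori1 + Eori2) / 2
    a = np.abs(r).astype(float)
    eint = a[nh] / max(float(a.max()), 1e-9)     # near 1 when the peak is central
    return eori * eint

# The example sequence from the text: a left-to-right hand wave.
score = radar_parallel_score(np.array([20, 13, 11, -12, -20]))
```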
In another instance in which the gesture is determined to be perpendicular to the display plane, the apparatus 10, such as the processor 12, may initially determine the direction of movement based upon the signs of the radar signals, such as based upon the sum of the radar signals in the sequence. In an instance in which this quantity is greater than 0, the hand is determined to be approaching the apparatus, while the hand will be determined to be moving away from the apparatus in an instance in which this quantity is less than 0. In this embodiment, the intensity and the score may both vary between 0 and 1 and may be determined by the apparatus, such as the processor, based upon the intensity distribution of the sequence of radar signals.
As shown in block 38 of the flowchart, the apparatus 10 may also include means, such as the processor 12 or the like, for weighting each of the evaluation scores.
In this regard, the apparatus 10, such as the processor 12, of one embodiment may define a weight factor w=(wc,wr) in which wc and wr are the respective weights associated with the series of image frames and the sequence of radar signals. While the respective weights may be determined by the apparatus 10, such as the processor 12, in various manners, the apparatus, such as the processor, of one embodiment may determine the weights by utilizing, for example, a linear discriminant analysis (LDA), a Fisher discriminant analysis or a linear support vector machine (SVM). In this regard, the determination of the appropriate weights to be assigned to the evaluation scores for the series of image frames and the sequence of radar signals is similar to the determination of axes and/or planes that separate two directions of a hand wave. In an embodiment that utilizes LDA in order to determine the weights, the apparatus 10, such as the processor 12, may maximize the ratio of the inter-class distance to the intra-class distance, with the LDA attempting to determine a linear transformation that achieves the maximum class discrimination. In this regard, classical LDA may attempt to determine an optimal discriminant subspace, spanned by the column vectors of a projection matrix, to maximize the inter-class separability and the intra-class compactness of the data samples in a low-dimensional vector space.
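As one concrete realization of this weight determination, a binary LDA may be fit over (camera score, radar score) pairs using scikit-learn; the training values below are invented for illustration only:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Illustrative training data: rows are (camera score, radar score) pairs,
# labeled 1 for the predefined gesture and 0 otherwise.
X = np.array([[0.9, 0.8], [0.7, 0.9], [0.8, 0.7],
              [0.2, 0.3], [0.1, 0.4], [0.3, 0.1]])
y = np.array([1, 1, 1, 0, 0, 0])

lda = LinearDiscriminantAnalysis().fit(X, y)
wc, wr = lda.coef_[0]   # weights for the camera and radar evaluation scores
```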
As shown in operation 40 of the flowchart, the apparatus 10 may also include means, such as the processor 12 or the like, for fusing the evaluation scores, following the weighting, in order to identify the gesture.
In one embodiment, the apparatus 10, such as the processor 12, may be trained so as to determine the combination of the weighted evaluation scores for a number of different movements. As such, the apparatus 10, such as the processor 12, may be trained so as to identify the combinations of weighted evaluation scores that are associated with a predefined gesture, such as a hand wave, and, conversely, the combinations of weighted evaluation scores that are not associated with a predefined gesture. The apparatus 10 of one embodiment may therefore include means, such as the processor 12 or the like, for identifying a gesture, such as a hand wave, based upon the similarity of the combination of weighted evaluation scores for a particular series of image frames and a particular sequence of radar signals to the combinations of weighted evaluation scores that were determined during training to be associated with a predefined gesture, such as a hand wave, and the combinations of weighted evaluation scores that were determined during training to not be associated with a predefined gesture. For example, the apparatus 10, such as the processor 12, may utilize a nearest neighbor classifier CNN to identify a gesture based upon these similarities.
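A minimal sketch of such a nearest neighbor decision in the weighted score space, reusing the illustrative training data and weights from the previous sketch, might be:

```python
import numpy as np

def classify_nearest(scores, train_scores, train_labels, w):
    """1-nearest-neighbor decision in the weighted score space, where
    w = (wc, wr) are the weights learned above; returns the label of the
    closest training combination (1 = gesture, 0 = not a gesture)."""
    d = np.linalg.norm((np.asarray(train_scores) - np.asarray(scores))
                       * np.asarray(w), axis=1)
    return int(np.asarray(train_labels)[np.argmin(d)])

# Example usage with the previous sketch's data:
# label = classify_nearest([0.75, 0.85], X, y, [wc, wr])
```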
As shown in operation 42 of the flowchart, the apparatus 10 of one embodiment may also include means, such as the processor 12 or the like, for determining a direction of motion of the gesture based upon the series of image frames in an instance in which the gesture is identified.
As described above, the accompanying flowchart illustrates the operations of a method, apparatus and computer program product according to example embodiments of the invention. It will be understood that each block of the flowchart may be implemented by various means, such as hardware, firmware, circuitry and/or other devices associated with execution of software including one or more computer program instructions.
Accordingly, blocks of the flowchart support combinations of means for performing the specified functions and combinations of operations for performing the specified functions. It will also be understood that one or more blocks of the flowchart, and combinations of blocks in the flowchart, can be implemented by special purpose hardware-based computer systems which perform the specified functions, or combinations of special purpose hardware and computer instructions.
In some embodiments, certain ones of the operations above may be modified or further amplified. Furthermore, in some embodiments, additional optional operations may be included. Modifications, additions, or amplifications to the operations above may be performed in any order and in any combination.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Moreover, although the foregoing descriptions and the associated drawings describe example embodiments in the context of certain example combinations of elements and/or functions, it should be appreciated that different combinations of elements and/or functions may be provided by alternative embodiments without departing from the scope of the appended claims. In this regard, for example, different combinations of elements and/or functions than those explicitly described above are also contemplated as may be set forth in some of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
Filing Document | Filing Date | Country | Kind | 371(c) Date
---|---|---|---|---
PCT/CN2011/083759 | 12/9/2011 | WO | 00 | 5/29/2014