This application is based on and hereby claims priority to International Application No. PCT/EP2013/003137 filed on Oct. 18, 2013 and German Application No. 10 2012 110 460.3 filed on Oct. 31, 2012, the contents of which are hereby incorporated by reference.
The invention relates to a method for inputting a control command for a component of a motor vehicle.
Particularly in a motor vehicle, it should be possible to operate an electronic apparatus with as little distraction of the driver from driving as possible. In this context, a man-machine interaction on the basis of gesture recognition has proven expedient. In this case, the movement of a hand of the user or of another input object in space is identified by an acquisition device, and a corresponding control command for positioning the selection element on a screen of the electronic apparatus is generated therefrom.
DE 201 22 526 discloses a method in which so-called “structured light”, for example in the form of a fan beam, is used in order to produce a virtual input instrument. It is furthermore known to use stereoscopic cameras in order to determine the spatial position of a hand of a user and to use this for gesture recognition.
Known methods suffer from the disadvantage of requiring large computing power for evaluating the acquired images. Commercially available instruments for gesture recognition are furthermore usually designed for use in the home entertainment sector. In contrast to most home applications, however, the illumination conditions in motor vehicles are much more complex, which makes the evaluation of acquired images even more difficult.
One possible object is to provide a method which allows simple and reliable gesture recognition in motor vehicles.
The inventors propose a method for inputting a control command for a component of a motor vehicle, which involves: acquiring, by an imaging device, an image sequence of an input object located in an acquisition region in a passenger compartment of the motor vehicle; identifying a position change of the input object from the image sequence; and generating the control command for the component on the basis of the identified position change.
According to the proposal, the imaging device comprises at least one infrared-sensitive camera, and the acquisition region is illuminated with at least one infrared source.
In this way, simple and reliable gesture recognition can be achieved with little equipment outlay. In particular, the absence of stereoscopic image recognition reduces the computing outlay required, so that the image recognition can be carried out in real time. By virtue of the illumination of the acquisition region, reliable recognition is simultaneously made possible even under the difficult illumination conditions in motor vehicles. By way of example, a hand of the user may in this case be used as the input object, although the input of gestures is also possible by acquiring other objects. For example, a nod of the head or a shake of the head of the user may be identified. Objects outside the body, for example a stylus, may also be used.
It is particularly expedient here for the position change of the input object to be identified by adapting at least two images of the image sequence to a skeleton model of the input object and comparing the parameters of the skeleton model for the at least two images.
Adaptation to such a skeleton model allows rapid data reduction in the evaluation of the acquired images, so that particularly little computing power is required. Such a skeleton model may describe the shape of the input object by parameters which describe, for example, the flexion angles of the individual finger joints. By varying these parameters until the skeleton model describes the same shape of the hand as can be seen in the image, a set of parameters is obtained, namely for example the flexion angles of the finger joints, with the aid of which the relative positions of the fingers can be determined by a computer.
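By way of illustration only, the following minimal sketch shows what such a parameter comparison could look like for a single, planar two-segment finger; the function name, the law-of-cosines model and all numeric values are assumptions made for this example and are not prescribed by the proposal.

```python
import numpy as np

def fit_flexion_angle(knuckle, tip, bone_lengths):
    """Fit the single joint parameter of a planar two-segment finger: return
    the inter-segment angle (pi when the finger is fully extended) that
    reproduces the observed knuckle-to-fingertip distance."""
    observed = np.linalg.norm(np.asarray(tip, float) - np.asarray(knuckle, float))
    l1, l2 = bone_lengths
    # Law of cosines for a two-segment chain of lengths l1 and l2.
    cos_angle = (l1 ** 2 + l2 ** 2 - observed ** 2) / (2 * l1 * l2)
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Comparing the fitted parameter for two images of the sequence yields the
# position change (here: the finger has flexed between frame A and frame B).
angle_a = fit_flexion_angle((0.0, 0.0), (7.5, 0.0), (4.0, 4.0))
angle_b = fit_flexion_angle((0.0, 0.0), (6.0, 0.0), (4.0, 4.0))
flexion_change = angle_a - angle_b
```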
In another configuration, a calibration image is acquired for a predetermined position of the input object in the acquisition region in order to calibrate the skeleton model for a specific input object. This only needs to be done once, in order to be able to identify a particular object later. For example, the size of a hand of the user may thus be acquired accurately in order subsequently to determine position information from the ratio between the actual object size and the image size.
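A minimal sketch of such a one-off calibration, assuming a simple pinhole camera model; the function name and parameters (pixel pitch, focal length, known calibration distance) are hypothetical and serve only to illustrate the size-from-ratio idea.

```python
def calibrate_object_size(image_size_px, pixel_pitch_mm,
                          calibration_depth_mm, focal_mm):
    """From a single image taken at a predetermined, known calibration
    distance, recover the real-world size of the input object so that later
    frames can exploit the ratio of real size to image size."""
    image_size_mm = image_size_px * pixel_pitch_mm
    # Pinhole model: object_size / depth = image_size / focal_length.
    return image_size_mm * calibration_depth_mm / focal_mm
```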
According to another aspect, the position of at least one point of the input object in a plane perpendicular to a viewing direction of the at least one camera is determined with the aid of the coordinates of the acquired point on a detector matrix of the camera. In this way, two-dimensional position information, which is directly usable on its own for the gesture recognition, is obtained particularly simply.
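One possible reading of this step, sketched with the same assumed pinhole model; the names and parameters are illustrative, and the depth value is only needed if metric coordinates rather than raw pixel coordinates are desired.

```python
def pixel_to_plane(px, py, principal_point, pixel_pitch_mm, depth_mm, focal_mm):
    """Map detector-matrix pixel coordinates to x/y coordinates in a plane
    perpendicular to the viewing direction of the camera."""
    # Offset of the acquired point from the optical axis, on the sensor.
    u = (px - principal_point[0]) * pixel_pitch_mm
    v = (py - principal_point[1]) * pixel_pitch_mm
    # Similar triangles: the sensor offset scales with depth / focal length.
    return u * depth_mm / focal_mm, v * depth_mm / focal_mm
```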
In order to determine the position of at least one point of the input object along an axis parallel to the viewing direction of the at least one camera, i.e. to obtain depth information, the depth position is determined with the aid of an image distance between at least two points of the input object and the ratio thereof to a known object distance between those two points.
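A sketch of this depth estimate under the same assumed pinhole model; the two points could be, for example, two knuckles whose real distance was measured during calibration. Function name and parameters are again hypothetical.

```python
def depth_from_size(image_distance_px, object_distance_mm,
                    pixel_pitch_mm, focal_mm):
    """Estimate the depth along the viewing axis from the known real-world
    distance between two points of the input object and the apparent
    distance of their images on the detector matrix."""
    image_distance_mm = image_distance_px * pixel_pitch_mm
    # Pinhole model: image_distance / focal_length = object_distance / depth.
    return focal_mm * object_distance_mm / image_distance_mm
```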
Further depth information can be obtained by determining the position of at least one point of the input object along the axis parallel to the viewing direction of the at least one camera with the aid of a luminous intensity of the light of the infrared source scattered back by the input object to the camera. This allows accurate distance determination in the simplest way, since the intensity of the light emitted by the infrared source—and therefore also of the light scattered back—decreases with the square of the distance. Even small distance changes therefore lead to a considerable brightness change, so that a high measurement accuracy is made possible.
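Taking the square-law fall-off stated above at face value, a relative depth estimate might look like the following sketch; a reference intensity at a known reference distance (e.g. from calibration) is assumed, and real back-scatter may deviate from this idealised model.

```python
import math

def depth_from_intensity(intensity, reference_intensity, reference_depth_mm):
    """Relative depth from back-scattered IR intensity, assuming the received
    intensity falls off with the square of the distance:
    intensity / reference_intensity = (reference_depth / depth) ** 2."""
    return reference_depth_mm * math.sqrt(reference_intensity / intensity)
```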
In order to accommodate varying light conditions and the limited bandwidth of the camera, it is in this case expedient for an illumination power of the infrared source to be varied cyclically between at least two predetermined power values. In this way, it is reliably possible to avoid overexposures or underexposures, which may impair the distance determination. The variation of the illumination power may in this case also comprise intervals in which the infrared source does not emit any light. Since such cycling of the infrared source is carried out according to a known pattern, in this way the light component of the light source can be reliably separated in the camera signal from environmentally induced fluctuations in the ambient infrared light, so that the image recognition is simplified considerably.
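One simple way to exploit such a known on/off cycle is sketched below: a frame captured while the infrared source is switched off is subtracted from a frame captured while it is on, leaving mainly the controlled illumination. The function and the 8-bit image format are assumptions for illustration.

```python
import numpy as np

def remove_ambient(frame_ir_on, frame_ir_off):
    """Suppress ambient infrared light by differencing two frames of the
    known illumination cycle (source on vs. source off)."""
    diff = frame_ir_on.astype(np.int32) - frame_ir_off.astype(np.int32)
    return np.clip(diff, 0, 255).astype(np.uint8)
```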
For the same purpose, it is furthermore advantageous to vary an exposure time of the at least one camera cyclically between at least two predetermined values. Even in the event of strong contrasts in the image, it is thus possible to obtain full information, for example by evaluating dark regions only in images with a long exposure time and bright regions only in images with a short exposure time.
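A correspondingly simple sketch of combining two cyclically acquired exposures; the threshold and the direct pixel substitution (without radiometric rescaling) are simplifications chosen only to illustrate the idea.

```python
import numpy as np

def combine_exposures(short_exposure, long_exposure, saturation_threshold=200):
    """Evaluate dark regions from the long exposure and bright (nearly
    saturated) regions from the short exposure."""
    use_short = long_exposure >= saturation_threshold
    combined = long_exposure.copy()
    combined[use_short] = short_exposure[use_short]
    return combined
```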
Since the geometry of the passenger compartment is furthermore known, a position ratio between the at least one point and a vehicle-fixed object with a known position, likewise acquired by the at least one camera, may be determined in order to identify the position of the at least one point of the input object. By setting the input object in relation to known vehicle-fixed objects, additional geometrical information is obtained, which can be used to improve the accuracy of the position identification or to validate positions already determined.
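A sketch of how such a vehicle-fixed reference might be used for validation; the choice of reference object, the tolerance value and the simple offset correction are illustrative assumptions.

```python
import numpy as np

def validate_against_reference(hand_xyz, reference_xyz_known,
                               reference_xyz_measured, tolerance_mm=20.0):
    """Compare the measured position of a vehicle-fixed object (whose true
    position is known from the passenger-compartment geometry) with its
    known position; use the discrepancy to correct the hand estimate or to
    flag it as implausible."""
    offset = (np.asarray(reference_xyz_known, float)
              - np.asarray(reference_xyz_measured, float))
    corrected_hand = np.asarray(hand_xyz, float) + offset
    plausible = np.linalg.norm(offset) <= tolerance_mm
    return corrected_hand, plausible
```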
The inventors further propose a motor vehicle having an electronic apparatus and a gesture control device with an imaging device for acquiring an operating gesture of an operating object in a passenger compartment of the motor vehicle. According to the proposal, the gesture control device is configured in order to carry out the proposed method. The advantages derive from the advantages explained with reference to the method.
These and other objects and advantages of the present invention will become more apparent and more readily appreciated from the following description of the preferred embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
For the operation of motor-vehicle components, for example entertainment systems, mobile telephones, navigation equipment, or also electrical power windows, seat adjustments, air conditioning systems and the like, the attention of a driver of the motor vehicle should be distracted as little as possible from the road. For this reason, control is intended to be carried out by gesture recognition, since in this case the driver does not need to look for operating elements and thus does not need to divert his attention from the road in order to do so.
To this end, the interior of the motor vehicle around the driver is recorded with at least one infrared camera 10 and this region is simultaneously illuminated with at least one infrared light source, preferably in the wavelength range of 780-870 nm. From the recorded image sequence, variations in the position of a hand 14 or of another input object can be determined, and these can in turn be assigned to particular control commands.
In order to reduce the computing outlay for the image recognition as much as possible, and to permit reliable real-time processing of the camera images, for this purpose a so-called skeleton model 16 of the hand 14 is initially constructed in a calibration phase. For this, the hand is recorded for the first time in a predetermined calibration position, so that the basic dimensions of the hand 14 can be determined. The resulting skeleton model 16 then assigns to the hand 14 the position of the individual joints 18 and of the fingertips 20, and furthermore comprises the invariant distances between respectively connected points.
Images subsequently acquired with the camera 10 for the gesture recognition can then be evaluated by adapting the skeleton model 16 to the shape of the hand 14 in the image. In this case, the possible movement space of the hand 14, which is limited by the respective flexion angle ranges of the joints, is searched for a configuration which corresponds to the image of the hand 14.
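The following sketch illustrates, for a single planar finger, what such a bounded search over the admissible flexion-angle range could look like; the forward model, the angle range and the grid resolution are assumptions for illustration rather than the actual fitting procedure.

```python
import numpy as np

def fit_finger_configuration(observed_tip, knuckle, bone_lengths,
                             flexion_range=(0.0, np.pi / 2)):
    """Search the admissible flexion-angle range for the configuration whose
    predicted fingertip position best matches the image."""
    l1, l2 = bone_lengths
    knuckle = np.asarray(knuckle, float)
    observed_tip = np.asarray(observed_tip, float)
    best_angle, best_err = None, np.inf
    for angle in np.linspace(flexion_range[0], flexion_range[1], 200):
        # Forward model: first segment along +x, second segment flexed by `angle`.
        tip = knuckle + np.array([l1 + l2 * np.cos(angle), -l2 * np.sin(angle)])
        err = np.linalg.norm(tip - observed_tip)
        if err < best_err:
            best_angle, best_err = angle, err
    return best_angle, best_err
```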
From changes in the configuration of the hand 14 and its position relative to the camera 10 in sequences of successive images, a respective gesture of the user can then be identified and the corresponding control command can be generated.
Besides the configuration of the hand 14 as defined by the flexion angles of the joints 18, it is thus also necessary to determine its position with respect to the camera 10.
Determination of the position of the hand 14 in the x-y plane, i.e. the plane perpendicular to the viewing direction of the camera 10, is in this case particularly simple. The camera 10 comprises a matrix sensor 22 having a multiplicity of pixels 24. The x-y position can therefore be determined easily from the position of the image 26 of the hand 14 on the matrix sensor 22.
An increased accuracy may be achieved by using two cameras 10. The cameras 10 are in this case to be oriented in such a way that the pixels 24 of the respective matrix sensors 22 are not exactly congruent, but rather have an offset from one another, as represented in the drawings.
Besides the x-y position, however, it is also necessary to determine the position of the hand 14 in the z direction, i.e. in the viewing direction of the camera 10. To this end, there are several possibilities.
First, the distance between the hand 14 and the camera 10 may be deduced from the size of the hand 14, known from the calibration, and the image size of the hand 14 on the matrix sensor 22 of the camera 10. As shown in the drawings, the ratio between the image size and the known real size of the hand 14 yields its distance along the z axis.
An improved accuracy is achieved when the hand 14 is moved along the z axis. From the change of the image size in successive images, it is possible, substantially on the basis of the laws of central projection, to calculate the distance change of the hand 14 with an accuracy of ±1 cm. For many gesture recognition problems, such an accuracy is already sufficient.
In order to determine the z position with the highest accuracy, the propagation characteristic of the light of the infrared source 12 may be used. Specifically, the illumination strength, i.e. the light flux per unit area, is inversely proportional to the square of the distance from the infrared source 12. This naturally also applies for the light scattered back or reflected from the hand 14 to the camera 10. It follows that even small changes in distance between the hand 14 and the infrared source 12 lead to strong luminous intensity changes in the camera image of the hand 14, on the basis of which the distance change can be determined with an accuracy of ±0.5 mm.
In this case, however, the problem arises that the bandwidth of the camera 10 is restricted. If, in the event of a strong luminous intensity of the infrared source 12, the hand 14 is located very close to the camera 10, then overdriving of the camera 10 may occur, so that useful image evaluation is no longer possible. As illustrated in the drawings, the illumination power of the infrared source 12 is therefore varied cyclically between at least two predetermined power values, so that for every distance of the hand 14 images with a suitable exposure are available.
This may be further reinforced by a cyclic variation of the exposure time of the camera 10, as shown in the drawings.
Besides adaptation to the strongly varying intensity of the light scattered back by the hand 14, this moreover makes it possible to minimize error sources due to incident ambient light, which may vary greatly in the motor vehicle.
Since both the configuration of the hand and its position in all spatial directions can now be acquired, these values may be stored for each recorded image. From the sequence of changes in these parameters, gestures of the driver can then be identified reliably with known image analysis methods.
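Purely as an illustration of evaluating such stored per-image values, the sketch below detects a lateral swipe from the net displacement of the hand; the gesture vocabulary, the threshold and the coordinate convention are assumptions.

```python
import numpy as np

def classify_gesture(hand_positions_xyz, swipe_threshold_mm=100.0):
    """Identify a left/right swipe from the stored per-frame hand positions
    by the net displacement along the lateral (x) axis."""
    positions = np.asarray(hand_positions_xyz, float)
    net_dx = positions[-1, 0] - positions[0, 0]
    if net_dx > swipe_threshold_mm:
        return "swipe_right"
    if net_dx < -swipe_threshold_mm:
        return "swipe_left"
    return "no_gesture"
```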
All the analysis methods described may be improved by the use of a plurality of redundant cameras 10. This allows a plausibility check of the values acquired by each individual camera 10, and optionally the exclusion of implausible recordings or evaluations. In this way, despite the strong perturbing influences present in the motor vehicle, gestures can be reliably identified and used for control.
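A minimal sketch of such a plausibility check across redundant cameras; the median-based outlier rejection and the tolerance value are assumptions chosen for illustration.

```python
import numpy as np

def fuse_camera_estimates(estimates_xyz, max_deviation_mm=30.0):
    """Exclude implausible per-camera position estimates (those far from the
    median of all cameras) and average the remaining ones."""
    estimates = np.asarray(estimates_xyz, dtype=float)
    median = np.median(estimates, axis=0)
    deviations = np.linalg.norm(estimates - median, axis=1)
    plausible = estimates[deviations <= max_deviation_mm]
    # Fall back to the median if every individual estimate was rejected.
    return plausible.mean(axis=0) if len(plausible) else median
```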
The invention has been described in detail with particular reference to preferred embodiments thereof and examples, but it will be understood that variations and modifications can be effected within the spirit and scope of the invention covered by the claims which may include the phrase “at least one of A, B and C” as an alternative expression that means one or more of A, B and C may be used, contrary to the holding in Superguide v. DIRECTV, 69 USPQ2d 1865 (Fed. Cir. 2004).
Number | Date | Country | Kind |
---|---|---|---
10 2012 110 460 | Oct 2012 | DE | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---
PCT/EP2013/003137 | 10/18/2013 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---
WO2014/067626 | 5/8/2014 | WO | A |
Number | Name | Date | Kind |
---|---|---|---
6353428 | Maggioni et al. | Mar 2002 | B1 |
7456815 | Reime | Nov 2008 | B2 |
7925077 | Woodfill | Apr 2011 | B2 |
8175374 | Pinault | May 2012 | B2 |
8180114 | Nishihara et al. | May 2012 | B2 |
8249334 | Berliner | Aug 2012 | B2 |
20040184659 | Bang | Sep 2004 | A1 |
20050025345 | Ohta et al. | Feb 2005 | A1 |
20050031166 | Fujimura | Feb 2005 | A1 |
20050088407 | Bell | Apr 2005 | A1 |
20050185825 | Hoshino | Aug 2005 | A1 |
20060209072 | Jairam | Sep 2006 | A1 |
20060238490 | Stanley et al. | Oct 2006 | A1 |
20090096783 | Shpunt | Apr 2009 | A1 |
20090262070 | Wilson | Oct 2009 | A1 |
20100195867 | Kipman | Aug 2010 | A1 |
20100235786 | Maizels | Sep 2010 | A1 |
20100295773 | Alameh | Nov 2010 | A1 |
20100302138 | Poot | Dec 2010 | A1 |
20100303289 | Polzin | Dec 2010 | A1 |
20100306716 | Perez | Dec 2010 | A1 |
20100309288 | Stettner | Dec 2010 | A1 |
20100328475 | Thomas | Dec 2010 | A1 |
20110052006 | Gurman | Mar 2011 | A1 |
20110080490 | Clarkson et al. | Apr 2011 | A1 |
20110129124 | Givon | Jun 2011 | A1 |
20110150271 | Lee | Jun 2011 | A1 |
20110164032 | Shadmi | Jul 2011 | A1 |
20110211754 | Litvak | Sep 2011 | A1 |
20110237324 | Clavin | Sep 2011 | A1 |
20110293137 | Gurman | Dec 2011 | A1 |
20120019485 | Sato | Jan 2012 | A1 |
20120056982 | Katz et al. | Mar 2012 | A1 |
20120070070 | Litvak | Mar 2012 | A1 |
20120120073 | Haker | May 2012 | A1 |
20120127128 | Large | May 2012 | A1 |
20120229377 | Kim et al. | Sep 2012 | A1 |
20120327125 | Kutliroff | Dec 2012 | A1 |
20130141574 | Dalal | Jun 2013 | A1 |
20130157607 | Paek | Jun 2013 | A1 |
20150304638 | Cho | Oct 2015 | A1 |
Number | Date | Country |
---|---|---
101813995 | Aug 2010 | CN |
102385237 | Mar 2012 | CN |
102439538 | May 2012 | CN |
19708240 | Sep 1998 | DE |
10022321 | Nov 2001 | DE |
10133823 | Feb 2003 | DE |
10242890 | Mar 2004 | DE |
20122526 | Jun 2006 | DE |
102009023875 | Feb 2010 | DE |
102010031801 | Jan 2012 | DE |
2009045861 | Apr 2009 | WO |
Entry |
---
WIPO English language translation of the International Preliminary Report on Patentability for PCT/EP2013/003137, downloaded from WIPO website on Sep. 23, 2015, 9 pages. |
German Office Action for German Priority Patent Application No. 10 2012 110 460.3, issued May 5, 2014, 6 pages. |
English language International Search Report for PCT/EP2013/003137, mailed Feb. 26, 2014, 3 pages. |
PCT/EP2013/003137, Oct. 18, 2013, Ulrich Mueller et al. |
DE 102012110460.3, Oct. 31, 2012, Ulrich Mueller et al. |
European Office Action dated Jun. 13, 2016 from European Patent Application No. 13786625.7, 6 pages. |
Chinese Office Action dated Sep. 5, 2016 from Chinese Patent Application No. 201380042805.6, 7 pages. |
Number | Date | Country
---|---|---
20150301591 A1 | Oct 2015 | US |