This disclosure relates to driver and machine interfaces in vehicles, and, more particularly, to such interfaces which permit a driver to interact with the machine without physical contact.
Systems for occupant interaction with a vehicle are available in the art. An example is the ‘SYNC’ system, which provides easy interaction between a driver and the vehicle, including options to make hands-free calls, manage music controls and other functions through voice commands, use a ‘push-to-talk’ button on the steering wheel, and access the internet when required. Further, many vehicles are equipped with human-machine interfaces provided at appropriate locations, including switches on the steering wheel, knobs on the center stack, touch-screen interfaces and track-pads.
At times, many of these controls are not easily reachable by the driver, especially those provided on the center stack. The driver may have to hunt for the desired switches and, quite often, must stretch out a hand to reach the desired control function(s). Steering wheel switches are easily reachable, but the limited space available on the steering wheel constrains the number of advanced control features that can be operated through its buttons. Though voice commands can assist in this respect, they can be cumbersome for simple operations requiring a variable input, such as adjusting the volume of the music system, changing tracks or flipping through albums, tuning the radio frequency, and so on. For such tasks, voice command operations can take longer, and the driver may prefer to control the desired operation by hand rather than repeat commands when the voice recognition system fails to recognize the desired command on the first utterance.
Therefore, there exists a need for a better system for enabling interaction between the driver and the vehicle's control functions, which can effectively address the aforementioned problems.
The present disclosure describes a gesture-based recognition system, and a method for interpreting the gestures of a vehicle's occupant, and actuating corresponding desired commands after recognition.
In one embodiment, this disclosure provides a gesture-based recognition system to interpret the gestures of a vehicle occupant and obtain the occupant's desired command inputs. The system includes a means for capturing an image of the vehicle's interior section. The image can be a two-dimensional image or a three-dimensional depth map corresponding to the vehicle's interior section. A gesture recognition processor separates the occupant's image from the background in the captured image, analyzes the separated image, interprets the occupant's gesture from it, and generates an output. A command actuator receives the output from the gesture recognition processor and generates an interpreted command. The actuator further generates a confirmation message corresponding to the interpreted command, delivers the confirmation message to the occupant, and actuates the command on receipt of a confirmation from the occupant. The system further includes an inference engine processor coupled to a set of sensors. The inference engine processor evaluates the occupant's state of attentiveness and receives signals from the sensors corresponding to any potential threats. A drive-assist system is coupled to the inference engine processor and receives signals from it. When the inference engine processor detects a potential threat, the drive-assist system provides warning signals to the occupant at a time determined by the occupant's attentiveness.
In another embodiment, this disclosure provides a method of interpreting a vehicle occupant's gestures and obtaining the occupant's desired command inputs. The method includes capturing an image of the vehicle's interior section and separating the occupant's image from the captured image. The separated image is analyzed, and the occupant's gesture is interpreted from it. The occupant's desired command is then interpreted, and a corresponding confirmation message is delivered to the occupant. On receipt of a confirmation, the interpreted command is actuated.
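Purely by way of illustration, the following Python sketch outlines one way the capture-segment-interpret-confirm-actuate flow summarized above could be organized in software. The class and function names, and the placeholder segmentation, classification and confirmation steps, are assumptions made for illustration only and do not form part of the claimed system or method.

# Illustrative sketch of the capture -> segment -> interpret -> confirm -> actuate
# flow. All names here are hypothetical; the disclosure does not prescribe any
# particular software structure.

from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    """A captured image of the vehicle's interior (2-D image or 3-D depth map)."""
    pixels: bytes


def segment_occupant(frame: Frame) -> Frame:
    # Separate the occupant's image from the static interior background.
    return frame  # placeholder segmentation


def classify_gesture(occupant: Frame) -> Optional[str]:
    # Match the separated image against a gesture database; return a command label.
    return "increase_music_volume"  # placeholder classification


def confirm_with_occupant(command: str) -> bool:
    # Deliver a confirmation message (spoken or on-screen) and await approval.
    print(f"Did you mean: {command.replace('_', ' ')}?")
    return True  # placeholder confirmation


def actuate(command: str) -> None:
    print(f"Actuating: {command}")


def process_frame(frame: Frame) -> None:
    occupant = segment_occupant(frame)
    command = classify_gesture(occupant)
    if command and confirm_with_occupant(command):
        actuate(command)


if __name__ == "__main__":
    process_frame(Frame(pixels=b""))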
Additional aspects, advantages, features and objects of the present disclosure will be made apparent from the drawings and the detailed description of the illustrative embodiments, construed in conjunction with the appended claims that follow.
The following detailed description discloses aspects of the disclosure and the ways in which it can be implemented. However, the description does not define or limit the invention, such definition or limitation being solely contained in the claims appended thereto. Although the best mode of carrying out the invention has been disclosed, those skilled in the art will recognize that other embodiments for carrying out or practicing the invention are also possible.
The present disclosure pertains to a gesture-based recognition system and a method for interpreting the gestures of an occupant and obtaining the occupant's desired command inputs by interpreting the gestures.
The occupant's vehicle may also be equipped with a high-precision collision detection system 160, which may be any appropriate collision detection system commonly known in the art. The collision detection system 160 may include a set of radar sensors, image processors, side cameras, and the like, working in combination. The collision detection system 160 may also include a blind-spot monitoring system for side sensing and lane change assist (LCA), a short-range sensing system for detecting a rapidly approaching adjacent vehicle. The primary mode of this system is a short-range sensing mode that normally operates at about 24 GHz. The blind-spot detection system can also be a vision-based system that uses cameras for blind-spot monitoring. In another embodiment, the collision detection system 160 may include a Valeo Raytheon system that operates at 24 GHz and monitors vehicles in the blind-spot areas on both sides of the vehicle. Using several beams of its multi-beam radar, the Valeo system accurately determines the position, distance and relative speed of an approaching vehicle in the blind-spot region. The range of the system is around 40 meters, with about a 150 degree field of view.
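As a rough, non-limiting illustration of how a short-range blind-spot return might be screened, the sketch below flags a radar return as a potential threat when it falls within an assumed 40-meter range and 150-degree field of view and is closing on the host vehicle. The data structure, threshold values and closing-speed test are assumptions made for illustration; any suitable collision detection logic known in the art may be used instead.

from dataclasses import dataclass

# Assumed parameters, taken from the approximate figures quoted above.
MAX_RANGE_M = 40.0          # approximate range of the blind-spot radar
FIELD_OF_VIEW_DEG = 150.0   # approximate field of view


@dataclass
class RadarReturn:
    range_m: float             # distance to the detected vehicle
    bearing_deg: float         # bearing relative to the sensor boresight
    closing_speed_mps: float   # positive when the detected vehicle is approaching


def is_blind_spot_threat(ret: RadarReturn) -> bool:
    """Flag a return as a potential threat for the lane-change-assist function."""
    in_range = ret.range_m <= MAX_RANGE_M
    in_view = abs(ret.bearing_deg) <= FIELD_OF_VIEW_DEG / 2.0
    approaching = ret.closing_speed_mps > 0.0
    return in_range and in_view and approaching


if __name__ == "__main__":
    print(is_blind_spot_threat(RadarReturn(12.0, -35.0, 4.2)))  # True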
On identification of any potential collision threats, the collision detection system 160 provides corresponding signals to a gesture recognition processor 120. For simplicity and economy of expression, the gesture recognition processor 120 will be referred to as ‘processor 120’ hereinafter.
Another gesture that the processor 120 interprets, with the corresponding images stored in the database 122, is a scrolling/flipping/panning feature.
Other applicable gestures, with corresponding images in the database 122 though not shown in the disclosure drawings, include those corresponding to a moon-roof opening/closing function. To enable this feature, the occupant provides an input by posing a gesture pretending to grab a cord near the front of the moon-roof and then pulling it backward or pushing it forward. Continuous capture of the occupant's image improves the interpretation of this gesture, and the moon-roof stops opening or closing when the occupant's hand stops moving. A quick yank backward or forward fully opens or closes the moon-roof. Another gesture pushes the moon-roof up and away from the occupant: the occupant brings a hand near the moon-roof, with the palm facing upward toward it, and then pushes the hand slightly further upward. To close a ventilated moon-roof, the occupant brings a hand close to the moon-roof, pretends to hold a cord, and pulls it down. Another gesture that can be interpreted by the gesture recognition processor 120, though not shown in the figures, is the ‘swipe gesture’. This gesture is used to move displayed content between the heads-up display (HUD), the cluster and the center stack of the vehicle. To use this gesture, the occupant points an index finger at the content to be moved and moves the finger in the desired direction, in a manner resembling a swiping action. Moving the index finger from the heads-up display toward the center stack, for example, moves the pointed-to content from the HUD to the center stack.
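The mapping below is a hypothetical sketch of how such gestures might be associated with vehicle functions in the gesture database 122. The gesture labels and command identifiers are invented for illustration and do not limit the gestures or functions that the processor 120 may support.

# Hypothetical mapping from recognized gesture labels to vehicle commands,
# illustrating the kind of association the gesture database 122 might hold.
GESTURE_COMMANDS = {
    "grab_cord_pull_back": "moonroof_open",                 # stops when the hand stops
    "grab_cord_push_forward": "moonroof_close",
    "quick_yank_back": "moonroof_open_full",
    "quick_yank_forward": "moonroof_close_full",
    "palm_up_push": "moonroof_vent_open",
    "grab_cord_pull_down": "moonroof_vent_close",
    "index_swipe_toward_center_stack": "move_content_hud_to_center_stack",
    "index_swipe_toward_hud": "move_content_center_stack_to_hud",
}


def lookup_command(gesture_label: str) -> str:
    """Return the command associated with a recognized gesture, or a no-op."""
    return GESTURE_COMMANDS.get(gesture_label, "no_op")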
Processor 120 includes an inference engine processor 124 (referred to as ‘processor 124’ hereinafter). Processor 124 uses the image captured by the means 110, along with inputs from the vehicle's interior sensors 112 and exterior sensors 114, to identify the driver's state of attentiveness. This includes identifying cases where the driver is inattentive, such as being in a drowsy or sleepy state, or conversing with a back-seat or side occupant. In such cases, if the collision detection system 160 identifies a potential threat, for instance a vehicle rapidly approaching the occupant's vehicle and posing a collision risk, the detection system 160 passes potential threat signals to the processor 124. The processor 124 conveys the driver's inattentiveness to a drive-assist system 150, which provides a warning signal to the driver/occupant. The warning signal is conveyed either by verbally communicating with the occupant or by an audible beep. Alternatively, the warning signal can be rendered on a user interface, with details displayed on the interface. The exact time at which such a warning signal is conveyed to the occupant depends on the occupant's attentiveness. Specifically, for a drowsy or sleepy driver, the signals are conveyed immediately, much earlier than they would be for an attentive driver. If the vehicle's exterior sensors 114 identify a sharp turn ahead, a sudden speed bump, or the like, and the occupant is detected sitting without having fastened a seat belt, then the drive-assist system 150 can provide a signal to the occupant to fasten the seat belt.
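One way to express the timing rule described above is sketched below: the lead time before a warning is issued grows as the inferred attentiveness drops, so a drowsy driver is warned immediately and well before an attentive driver would be. The attentiveness categories and the specific lead times are assumptions made for illustration only.

from enum import Enum


class Attentiveness(Enum):
    ATTENTIVE = "attentive"
    DISTRACTED = "distracted"   # e.g., conversing with a back-seat occupant
    DROWSY = "drowsy"


# Hypothetical warning lead times, in seconds before the predicted threat.
WARNING_LEAD_S = {
    Attentiveness.ATTENTIVE: 1.5,
    Attentiveness.DISTRACTED: 3.0,
    Attentiveness.DROWSY: 5.0,   # warn as early as possible
}


def schedule_warning(time_to_threat_s: float, state: Attentiveness) -> float:
    """Return how many seconds from now the drive-assist warning should fire."""
    lead = WARNING_LEAD_S[state]
    return max(0.0, time_to_threat_s - lead)  # 0.0 means warn immediately


if __name__ == "__main__":
    print(schedule_warning(4.0, Attentiveness.DROWSY))     # 0.0 -> immediate warning
    print(schedule_warning(4.0, Attentiveness.ATTENTIVE))  # 2.5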
The processor 120 further includes a driver recognition module 126, which is configured to identify the driver's image. Specifically, the driver recognition module 126 is configured to identify the image of the owner of the car, or of the person who most frequently drives the car. In one embodiment, the driver recognition module 126 uses a facial recognition system having a set of pre-stored images in a facial database, corresponding to the owner or to the person who drives the car most frequently. Each time the owner drives the car, the driver recognition module 126 obtains the captured image of the vehicle's interior section from the means 110 and matches the occupant's image against the images in the facial database. Those skilled in the art will recognize that the driver recognition module 126 extracts features or landmarks from the occupant's captured image and matches those features against the images in the facial database. The driver recognition module 126 can use any suitable recognition algorithm known in the art for recognizing the driver, including the Fisherface algorithm (based on linear discriminant analysis), elastic bunch graph matching, dynamic link matching, and so on.
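The sketch below illustrates, in general terms, the kind of matching the driver recognition module 126 might perform: features extracted from the captured image are compared against enrolled feature vectors in the facial database, and the closest match above a threshold identifies the driver. The feature-vector representation, cosine-similarity measure and threshold are assumptions made for illustration; the disclosure only requires some suitable recognition algorithm known in the art.

from typing import Optional

import numpy as np

# Hypothetical facial database: enrolled feature vectors keyed by driver identity.
FACIAL_DATABASE = {
    "owner": np.random.default_rng(0).normal(size=128),
    "frequent_driver": np.random.default_rng(1).normal(size=128),
}

MATCH_THRESHOLD = 0.8  # assumed similarity threshold for a positive identification


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def recognize_driver(features: np.ndarray) -> Optional[str]:
    """Return the enrolled identity whose features best match, if close enough."""
    best_name, best_score = None, -1.0
    for name, enrolled in FACIAL_DATABASE.items():
        score = cosine_similarity(features, enrolled)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= MATCH_THRESHOLD else None


if __name__ == "__main__":
    # Querying with the owner's own enrolled features yields "owner".
    print(recognize_driver(FACIAL_DATABASE["owner"]))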
Once the driver recognition module 126 recognizes the driver/owner occupying the driving seat, it passes signals to a personalization functions processor 128. The personalization functions processor 128 readjusts a set of the vehicle's personalization functions to a set of pre-stored settings. The pre-stored settings correspond to the driver's preferences, for example, a preferred temperature for the air-conditioning system, a preferred volume range for the music controls, the most frequently tuned radio frequency band, and the preferred, comfortable position of the driver's seat.
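A minimal sketch of applying such pre-stored preferences is shown below; the profile keys, values and setter callback are invented for illustration and do not limit the personalization functions the processor 128 may adjust.

# Hypothetical per-driver preference profiles, as the personalization functions
# processor 128 might store them.
PREFERENCE_PROFILES = {
    "owner": {
        "cabin_temperature_c": 21.0,
        "audio_volume": 12,
        "radio_frequency_mhz": 101.1,
        "seat_position": "memory_1",
    },
}


def apply_preferences(driver_id, set_function):
    """Push each stored preference to the vehicle through a setter callback."""
    for function_name, value in PREFERENCE_PROFILES.get(driver_id, {}).items():
        set_function(function_name, value)


if __name__ == "__main__":
    apply_preferences("owner", lambda name, value: print(f"{name} -> {value}"))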
A command actuator 130 (referred to as ‘actuator 130’ hereinafter) is coupled to the processor 120. The actuator 130 actuates the occupant's desired command after the processor 120 interprets the occupant's gesture. Specifically, on interpreting the occupant's gesture, the processor 120 generates a corresponding output and delivers the output to the actuator 130. The actuator 130 generates the desired command from the output and sends a confirmation message to the occupant before actuating the command. The confirmation message can be verbally communicated to the occupant through a communication module 134, in a questioning mode, or it can be rendered on a user interface 132 with an approving option embedded therein (e.g., ‘Yes’ and ‘No’ icons). The occupant confirms the interpreted command either by providing a verbal confirmation or by selecting the approving option on the user interface 132. Where the occupant provides a verbal confirmation, a voice-recognition module 136 interprets the confirmation. The actuator 130 then executes the occupant's desired command. If a gesture is misinterpreted and the occupant declines to execute the interpreted command, the actuator 130 renders a confirmation message corresponding to a different, though similar, command option. For instance, if the desired command is to increase the volume of the music system and it is misinterpreted as increasing the temperature of the air-conditioning system, then, on receipt of a denial from the occupant, the actuator 130 renders confirmation messages corresponding to other commands until the desired command is identified and actuated. In one embodiment, the occupant provides a gesture-based response to the rendered confirmation message. For example, a gesture corresponding to the occupant's approval to execute an interpreted command can be a ‘thumbs-up’ in the air, and a denial can be indicated by a ‘thumbs-down’ gesture. In those aspects, the gesture database 122 stores the corresponding images for the processor 120 to interpret the gesture-based approvals.
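The confirm-then-actuate behavior, including the fallback to the next, similar command after a denial, might be organized as in the sketch below. The ranked candidate list and the ask/actuate callbacks are assumptions made for illustration only.

from typing import Callable, Sequence


def confirm_and_actuate(
    candidates: Sequence[str],
    ask: Callable[[str], bool],
    actuate: Callable[[str], None],
) -> bool:
    """Walk a ranked list of interpreted commands, confirming each with the
    occupant (verbally or via the user interface) and actuating the first one
    that is approved. Returns True if some command was actuated."""
    for command in candidates:
        if ask(command):          # e.g., "Increase the music volume?" -> yes/no
            actuate(command)
            return True
    return False                  # every interpretation was denied


if __name__ == "__main__":
    # Example mirroring the scenario above: the gesture is first misread as a
    # temperature change, the occupant denies it, and the next candidate
    # (raising the music volume) is confirmed and actuated.
    ranked = ["increase_cabin_temperature", "increase_music_volume"]
    confirm_and_actuate(
        ranked,
        ask=lambda command: command == "increase_music_volume",
        actuate=lambda command: print("actuating", command),
    )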
At step 514, the method 500 evaluates the driver's state of attentiveness by analyzing the captured image of the vehicle's interior section. At step 516, the method identifies any potential threats, for example, a rapidly approaching vehicle, an upcoming speed bump, or a sharp turn ahead. Any suitable means known in the art can be used for this purpose, including in-vehicle collision detection systems, radar, lidar, and the vehicle's interior and exterior sensors. If a potential threat exists and the driver is found inattentive, then, at step 520, warning signals are provided to the occupant at a specific time. The exact time at which such signals are provided depends on the level of attentiveness of the occupant/driver; for a sleepy or drowsy driver, the signals are provided immediately.
At step 522, the method 500 recognizes the driver through an analysis of the captured image. Suitable methods, including the facial recognition systems known in the art and explained earlier, can be used for the recognition. The image of the owner of the car, or of the person who drives the car most often, can be stored in a facial database. When the same person enters the car again, the method 500 matches the captured image of the person against the images in the facial database to recognize the person. On recognition, at step 524, a set of personalization functions corresponding to the person is reset to a set of pre-stored settings. For example, the interior temperature can be automatically set to a pre-specified value, or the driver-side window may automatically open halfway when the person occupies the seat, according to the person's usual preferences.
The disclosed gesture-based recognition system can be used in any vehicle equipped with suitable devices, as described above, to achieve the objects of the disclosure.
Although the current invention has been described comprehensively and in considerable detail to cover the possible aspects and embodiments, those skilled in the art will recognize that other versions of the invention are also possible.