This disclosure generally relates to a human machine interface system for a vehicle, and more particularly relates to using an occupant monitor to determine a body orientation or gaze direction to help interpret a voice command.
As the number of devices or features in a vehicle that can be controlled with voice commands increases, it can become difficult for a voice recognition system to determine to which device or feature an occupant-issued voice command is directed. One strategy is to impose a rigorous voice command structure so the voice recognition system can follow a predetermined logic structure to interpret a voice command. However, the operator may have trouble remembering a complicated voice command logic structure, and so voice commands may be misinterpreted and the operator may become frustrated and/or dissatisfied with the voice recognition system.
In accordance with one embodiment, a human machine interface (HMI) system for a vehicle is provided. The system includes a plurality of voice activated devices, and an occupant monitor. The occupant monitor is configured to determine gaze direction of an occupant of the vehicle. The system is configured to determine to which of the plurality of voice activated devices a voice command is directed based on the gaze direction.
In another embodiment, a method of operating a vehicle is provided. The method includes the step of determining a gaze direction of an occupant of a vehicle with an occupant monitor. The method also includes the step of determining to which of a plurality of voice activated devices a voice command is directed based on the gaze direction.
Further features and advantages will appear more clearly on a reading of the following detailed description of the preferred embodiment, which is given by way of non-limiting example only and with reference to the accompanying drawings.
The present invention will now be described, by way of example with reference to the accompanying drawings, in which:
Described herein is a Human Machine Interface (HMI) system that combines voice recognition with gaze direction detection or gesture detection to help the voice recognition system determine to what the operator's voice command refers. For example, an inquiry (i.e., a voice command) such as “What is that?” or “Give me more information” might be clear when there is only one display or device operable by voice command. However, if there are multiple displays or devices that respond to voice commands, it may be unclear to which display or device the voice command is directed. The system described herein helps interpret a voice command by considering a gaze direction or gesture of the occupant issuing the voice command.
The system 10 may also include a controller 30 in electrical and/or functional communication with the occupant monitor 14, the microphone 28, and the plurality of voice activated devices 20.
The microphone 28 generally conveys a voice signal 34 to the controller 30 based on a voice command 36. The microphone 28 may include an amplifier, a filter, or other signal processing capability known to those skilled in the art. The filter may be configured to accentuate the voice command and/or reduce ambient noise so the voice command 36 can be more accurately interpreted.
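By way of non-limiting illustration, the following Python sketch shows one possible filtering stage for the voice signal 34. The speech pass band of roughly 300 Hz to 3400 Hz, the sample rate, and the use of the SciPy library are illustrative assumptions only and are not prescribed by this disclosure.

    import numpy as np
    from scipy.signal import butter, lfilter

    def filter_voice_signal(samples: np.ndarray, sample_rate: int = 16000) -> np.ndarray:
        """Band-pass filter raw microphone samples to accentuate speech and reduce ambient noise."""
        nyquist = sample_rate / 2.0
        # Keep roughly the speech band (assumed 300-3400 Hz) and attenuate the rest.
        b, a = butter(N=4, Wn=[300 / nyquist, 3400 / nyquist], btype="band")
        return lfilter(b, a, samples)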
By way of example and not limitation, the controller 30 responds to receiving a voice signal 34 that corresponds to a voice command by analyzing the image signal 32 to determine toward which of the plurality of voice activated devices 20 the voice command is directed. Alternatively, the occupant monitor 14 may be configured to autonomously determine the gaze direction of an occupant 18 of the vehicle, and communicate that gaze direction information directly to the controller 30.
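A minimal Python sketch of this routing logic follows. The GazeZone categories, the Controller class, and the handle() method on each device are hypothetical abstractions introduced only to illustrate directing a voice command based on gaze direction; they are not part of the disclosure.

    from enum import Enum, auto

    class GazeZone(Enum):
        # Hypothetical gaze zones corresponding to dashed lines 16A, 16B, and 16D.
        OUTSIDE_VEHICLE = auto()
        NAVIGATION_DEVICE = auto()
        ENTERTAINMENT_DEVICE = auto()

    class Controller:
        """Sketch of controller 30: routes a recognized voice command to the device being looked at."""

        def __init__(self, devices):
            # devices: mapping of GazeZone -> voice activated device object
            self.devices = devices

        def on_voice_command(self, command_text, gaze_zone):
            device = self.devices.get(gaze_zone)
            if device is None:
                return None  # no voice activated device is associated with this gaze zone
            return device.handle(command_text)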
Referring to
In another situation, the controller 30 receives a voice signal 34 corresponding to the occupant 18 saying “louder” or “increase volume.” If the gaze direction 16 of the occupant 18 is directed toward the navigation device 22, and so corresponds to dashed line 16B, then the controller 30 may direct the voice command 36 to the navigation device 22. In response, the navigation device 22 may increase the volume used to announce information relevant to an upcoming turn to reach a selected destination. Alternatively, if the gaze direction is directed toward the entertainment device 24 and so corresponds to dashed line 16D, then the controller 30 may direct the voice command 36 to the entertainment device 24. In response, the entertainment device 24 may increase its volume setting so music from speakers (not shown) is louder.
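Continuing the hypothetical Controller sketch above, the same spoken words may be routed to different devices depending only on the reported gaze direction. The NavigationDevice and EntertainmentDevice classes below are illustrative stand-ins for the navigation device 22 and the entertainment device 24, and the string matching is a placeholder for whatever command interpretation the controller 30 performs.

    class NavigationDevice:
        def handle(self, command_text):
            if "louder" in command_text or "volume" in command_text:
                return "navigation device 22: raise turn-announcement volume"

    class EntertainmentDevice:
        def handle(self, command_text):
            if "louder" in command_text or "volume" in command_text:
                return "entertainment device 24: raise music volume"

    controller = Controller({
        GazeZone.NAVIGATION_DEVICE: NavigationDevice(),
        GazeZone.ENTERTAINMENT_DEVICE: EntertainmentDevice(),
    })

    # Gaze along dashed line 16B (toward the navigation device):
    print(controller.on_voice_command("increase volume", GazeZone.NAVIGATION_DEVICE))
    # Gaze along dashed line 16D (toward the entertainment device):
    print(controller.on_voice_command("increase volume", GazeZone.ENTERTAINMENT_DEVICE))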
Step 410, DETECT VOICE COMMAND, may include the controller 30 processing the voice signal 34 to determine what words were spoken by the occupant 18.
Step 420, GAZE DIRECTION?, may include the controller 30 analyzing the image signal 32 from the occupant monitor 14 to determine a gaze direction 16 of the occupant. As shown in this non-limiting example, there are three choices. However, more than three choices are contemplated, and three choices are shown here only to simplify the illustration.
Option 431, OUTSIDE VEHICLE, is selected if the occupant monitor 14 indicates that the gaze direction corresponds to dashed line 16A.
Step 441, INTERPRET VOICE COMMAND, may include the controller 30 trying to match the words that were spoken by the occupant 18 to a list of possible commands related to the navigation device 22.
Step 451, DESCRIBE POINT OF INTEREST, may include the navigation device 22 announcing the name of a business proximate to the gaze direction 16.
Option 432, NAVIGATION DEVICE, is selected if the occupant monitor 14 indicates that the gaze direction corresponds to dashed line 16B.
Step 442, INTERPRET VOICE COMMAND, may include the controller 30 trying to match the words that were spoken by the occupant 18 to a list of possible commands related to the navigation device 22, for example “Show Map” or “Zoom In”.
Step 452, CHANGE DESTINATION, may include the navigation device 22 setting the destination to HOME in response to the occupant 18 saying the phrase ‘take me home’.
Option 433, ENTERTAINMENT DEVICE, is selected if the occupant monitor 14 indicates that the gaze direction corresponds to dashed line 16D.
Step 443, INTERPRET VOICE COMMAND, may include the controller 30 trying to match the words that were spoken by the occupant 18 to a list of possible commands related to the entertainment device 24.
Step 453, PLAY SONG, may include the entertainment device 24 playing a song entitled TAKE ME HOME in response to the occupant 18 saying the phrase ‘take me home’.
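By way of further non-limiting illustration, the branching of steps 410 through 453 may be sketched in Python as follows. The string matching, the device method names, and the printed responses are hypothetical placeholders; the words recognized in step 410 and the gaze classification from step 420 are taken here as inputs rather than computed.

    class NavigationDevice22:
        def describe_point_of_interest(self):
            print("Announcing the name of a business proximate to the gaze direction")  # Step 451

        def set_destination(self, destination):
            print("Destination set to " + destination)                                  # Step 452

    class EntertainmentDevice24:
        def play_song(self, title):
            print("Playing song: " + title)                                             # Step 453

    def method_400(words, gaze, nav, ent):
        # words: result of step 410 (DETECT VOICE COMMAND)
        # gaze:  result of step 420 (GAZE DIRECTION?)
        words = words.lower()
        if gaze == "OUTSIDE_VEHICLE":                      # Option 431
            if "what is that" in words:                    # Step 441
                nav.describe_point_of_interest()           # Step 451
        elif gaze == "NAVIGATION_DEVICE":                  # Option 432
            if "take me home" in words:                    # Step 442
                nav.set_destination("HOME")                # Step 452
        elif gaze == "ENTERTAINMENT_DEVICE":               # Option 433
            if "take me home" in words:                    # Step 443
                ent.play_song("TAKE ME HOME")              # Step 453

    # The same phrase yields different actions depending on gaze direction:
    method_400("take me home", "NAVIGATION_DEVICE", NavigationDevice22(), EntertainmentDevice24())
    method_400("take me home", "ENTERTAINMENT_DEVICE", NavigationDevice22(), EntertainmentDevice24())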
Accordingly, a system 10, a controller 30 for the system 10, and a method 400 of operating a vehicle equipped with the system 10 are provided.
While this invention has been described in terms of the preferred embodiments thereof, it is not intended to be so limited, but rather only to the extent set forth in the claims that follow.