This application claims the benefit of priority from European Patent Application No. 06 020730.5, filed Oct. 2, 2006, which is incorporated by reference.
1. Technical Field
This disclosure relates to control of vehicular functions. In particular, this disclosure relates to voice control of vehicular functions.
2. Related Art
Occupants of vehicles may operate different equipment in a vehicle cabin. Some equipment, such as side-view mirrors, may be manipulated by hand or by servo motors. Other equipment, such as locks or latches, is usually operated manually, either by use of a key or by applying pressure to a lever or button. To release the hood or trunk, a user outside of the vehicle typically inserts a key into a locking mechanism. However, this may be difficult or inconvenient if the user's hands are not free.
A vehicular voice control system includes a first and a second microphone located on the vehicle external to a vehicle cabin. The microphones receive audio signals from an audio source external to the vehicle and generate microphone output signals. A signal processor processes the microphone output signals, generates a processed signal, and determines a location of the audio source. A speech recognition system receives the processed signal and obtains a recognition result. A controller controls one or more vehicular elements based on the recognition result and the determined location of the audio source.
Other systems, methods, features, and advantages will be, or will become, apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the invention, and be protected by the following claims.
The system may be better understood with reference to the following drawings and description. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. Moreover, in the figures, like-referenced numerals designate corresponding parts throughout the different views.
To respond to verbal commands issued by the user 306, the vehicle control system 300 may include a voice recognition system 310 in communication with a digital signal processor 320. One or more devices that convert sound into operating signals, such as sound transducers, may be mounted external to the vehicle cabin 120. The devices or microphones 326 may receive audio signals from a speaker or user 306 located outside of the vehicle 102. The voice recognition system 310 may process the signals from the microphones 326 through the digital signal processor 320. A vehicle computer or processor 340 may receive input from the voice recognition system 310, which may generate a recognition result. Based on the recognition result, the vehicle computer 340 may provide commands to an actuator system 350. The actuator system 350 may activate a vehicle function, such as releasing a trunk latch 362, a door latch 364, or a hood latch 366, and/or may activate a vehicle ignition system 368. The user 306 may also issue a command while outside of the vehicle 102 to turn on an air-conditioning system 370 to cool the vehicle, or to control the operation of a convertible top system 372.
The voice recognition system 310 may recognize limited vocabularies, may process various speech patterns as well as accents, and may “learn” by receiving weighted inputs that, with adjustment, time, and repetition, can produce a desired result. The digital signal processor 320 may process the digitized speech signal in parallel with the voice recognition system 310. The digital signal processor 320 or the voice recognition system 310 may perform spectral analysis. An analog-to-digital converter 380 may convert the output of the microphones 326 to digital form. The digitized speech may be sampled at a rate between about 6.6 kHz and about 22.1 kHz. Representations of the digitized speech may be derived from the short-term power spectra, and may represent a sequence of characterizing vectors containing values referred to as features or feature parameters. The values of the feature parameters may be used in succeeding processing stages to generate a probability estimate that the portion of the analyzed waveform corresponds to a word in a vocabulary list. The voice recognition system 310 may recognize verbal utterances as either isolated words or continuous speech captured by the microphones 326.
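As an illustration only, the derivation of characterizing vectors from short-term power spectra might be sketched as follows. The frame length, hop size, Hann window, and the function names `power_spectrum` and `feature_vectors` are assumptions made for this sketch and are not taken from the description. A naive discrete Fourier transform is used for clarity rather than speed.

```python
import cmath
import math


def power_spectrum(frame):
    """Return the one-sided power spectrum of one Hann-windowed frame."""
    n = len(frame)
    # Hann window (assumed here) to reduce spectral leakage at frame edges.
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def feature_vectors(samples, frame_len=64, hop=32):
    """Split digitized speech into overlapping frames, one vector per frame."""
    last_start = len(samples) - frame_len
    return [
        power_spectrum(samples[start:start + frame_len])
        for start in range(0, last_start + 1, hop)
    ]
```

Each returned vector holds one power value per frequency bin. A later stage could compress or transform these values before estimating word probabilities. That stage is outside this sketch.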
The recognition result provided by the voice recognition system 310, for example, an entry in a vocabulary list, may represent the verbal utterance or command issued by the user 306. For example, if the user 306 issues a command “open hatch,” an appropriate recognition result may provide access to the compartment. Based on a recognition result, the vehicle computer 340 may direct the actuator system 350 to open the trunk latch 362 via mechanical linkage, or electronic control of the physical hatch latch. Manual operation by the user 306 may be obviated.
The voice recognition system 310 may process the digitized audio signals and issue a command to the actuator system 350 corresponding to the command spoken by the user 306. Based on the issued command, the actuator system 350 may activate or release the trunk latch 362, the door latch 364, or the hood latch 366, or may activate the vehicle ignition system 368 or the convertible top 372.
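One possible way to relate a recognition result to a vehicular element is a simple lookup table, sketched below. Only the phrase “open hatch” appears in the description. The other phrases, the string identifiers, and the names `COMMAND_TABLE` and `select_actuator` are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical vocabulary list. Only "open hatch" comes from the description;
# the remaining phrases and all identifiers are illustrative assumptions.
COMMAND_TABLE = {
    "open hatch": "trunk_latch_362",
    "open door": "door_latch_364",
    "open hood": "hood_latch_366",
    "start engine": "ignition_system_368",
    "open top": "convertible_top_372",
}


def select_actuator(recognition_result):
    """Return the element to actuate, or None if the phrase is unknown."""
    # Normalize case and whitespace before the lookup.
    key = " ".join(recognition_result.lower().split())
    return COMMAND_TABLE.get(key)
```

Returning `None` for an unrecognized phrase lets the caller leave every actuator untouched when no vocabulary entry matches.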
The microphones 326 may be installed in a lens housing that is fixed to the vehicle body rather than in a lens housing fixed to the hatch 110 or trunk of the vehicle 102. Lens housings that are fixed to the hatch or trunk 110 may move when the hatch or trunk is opened, which may adversely affect the audio signal received by the microphones 326 due to their changing position.
The pattern of microphones 326 may be located in any housing of the vehicle 102, such as the left or right tail light housing 102, the left or right headlight housing 206, a fog light housing, or a turn signal housing. The microphones may be housed in other housings on or in the vehicle. For example, a plurality of microphones may be supported on or in a structure located on the roof, hood, trunk or other vehicle structure. Such lens or housing structures may be made of plastic or glass, which may conduct sound well. Audio signals, such as verbal commands, may be received by the microphones 326 without significant attenuation through the material from which the lens or housing is formed.
The beamforming circuit 1100 may be a fixed beamformer, such as a delay-and-sum beamformer. The beamforming circuit 1100 may be an adaptive beamformer having permanent adaptive filter coefficients. The beamforming circuit 1100 may include the processes described in “Adaptive Beamforming for Audio Signal Acquisition,” by Herbordt and Kellermann, from a book entitled “Adaptive Signal Processing: Applications to Real-World Problems,” p. 155, Springer, Berlin 2003. The beamforming circuit 1100 may include a generalized sidelobe canceling (GSC) circuit 1114. The GSC circuit may include processes described in “An Alternative Approach to Linearly Constrained Adaptive Beamforming,” by Griffiths and Jim, IEEE Transactions on Antennas and Propagation, vol. 30, p. 27, 1982. The GSC circuit may include a first adaptive path having a blocking matrix and an adaptive noise canceling circuit, and a second non-adaptive path having a fixed beamforming circuit.
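A fixed delay-and-sum beamformer can be sketched in a few lines. The sketch below assumes whole-sample steering delays and equal channel weights. The function name `delay_and_sum` and its argument layout are illustrative and do not describe the beamforming circuit 1100 itself. The adaptive and GSC variants are not covered.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its steering delay and average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   list of integer sample delays; delays[m] is how many samples
              channel m lags the reference for the desired direction.
    """
    if len(channels) != len(delays):
        raise ValueError("one delay is required per channel")
    length = len(channels[0])
    output = []
    for n in range(length):
        total = 0.0
        for signal, delay in zip(channels, delays):
            index = n + delay
            # Samples shifted in from outside the buffer are treated as zero.
            if 0 <= index < len(signal):
                total += signal[index]
        output.append(total / len(channels))
    return output
```

When the delays match the true arrival pattern, the channels add in phase and the desired signal is preserved. Signals arriving with a different delay pattern are spread over several output samples and therefore reduced in amplitude.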
The digital signal processor 1104 may determine a difference in the receipt times between the signals of the individual microphones 326 or the microphones in the microphone array 1110. This may be based on the time that each microphone 326 receives the audio signal from a particular source. An uppermost microphone of a vertical microphone arrangement may detect the speech signal before a lowermost microphone detects the same speech signal. Minimal time difference may be detected for the ambient noise 1206 because the incident angle is about zero degrees relative to the individual microphones 326. The audio signal corresponding to noise may reach each of the microphones 326 essentially at the same time.
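The description does not state how the difference in receipt times is measured. One common approach, shown here purely as an assumed example, is to search for the lag that maximizes the cross-correlation between two microphone signals. The names `estimate_lag` and `lag_to_seconds` are hypothetical.

```python
def estimate_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive result means `other` received the signal later.
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(reference):
            m = n + lag
            if 0 <= m < len(other):
                score += x * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_seconds(lag_samples, sample_rate_hz):
    """Convert a lag in samples to a time difference in seconds."""
    return lag_samples / sample_rate_hz
```

For sound that reaches all microphones at essentially the same time, such as the ambient noise discussed above, the estimated lag would be close to zero.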
A verbal utterance by the user 306 may originate at a finite vertical angle α with respect to the horizontal plane. Thus, the user's speech signal may be detected by the substantially vertically arranged microphones 326 at slightly different times. Based on the measured time difference, the digital signal processor 1104 or a location determining circuit 1130 within the digital signal processor 1104 may determine a vertical angle of incidence α of the speech signal. The vertical angle of incidence may establish the location of the source of the audio signal, namely the speaker, relative to the microphones 326. A location determining circuit 1130 may be separate from the digital signal processor 1104.
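Under a far-field (plane wave) assumption, the relation between the measured time difference and the vertical angle α for two vertically spaced microphones can be written as sin α = c·Δt / d, where d is the microphone spacing and c is the speed of sound. The sketch below applies that relation. The far-field model, the value of 343 m/s for c, and the function name `vertical_angle_deg` are assumptions that are not stated in the description.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def vertical_angle_deg(time_diff_s, spacing_m, speed=SPEED_OF_SOUND_M_S):
    """Estimate the vertical angle of incidence from a time difference.

    time_diff_s: arrival time at the lower microphone minus arrival time at
                 the upper microphone, in seconds.
    spacing_m:   vertical distance between the two microphones, in meters.
    """
    ratio = speed * time_diff_s / spacing_m
    # Clamp so that measurement noise cannot push asin out of its domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A time difference of zero yields an angle of zero degrees, which matches the case of sound arriving horizontally at both microphones simultaneously.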
Based on the measured vertical angle α, an attenuation circuit 1140 may attenuate audio signals having a vertical angle α below a threshold of, for example, between about 10 and 20 degrees. The attenuation circuit 1140 may be part of the beamforming circuit 1100 or the digital signal processor 1104, or may be a separate circuit. Thus, the ambient noise 1206 signals may be attenuated while preserving the speech signals. Such multi-channel signal processing by the digital signal processor 1104 or beamforming circuit 1100 may enhance the signal-to-noise ratio of the speech signal. This may increase voice recognition accuracy. Beamforming may include amplifying microphone signals corresponding to audio signals detected from a desired direction by equal phase addition. Beamforming may also include attenuation of microphone signals corresponding to audio signals originating from undesired directions.
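The angle-dependent attenuation can be illustrated with a simple gain rule. In the sketch below, the 15 degree threshold is picked from within the 10 to 20 degree range mentioned above. The residual gain of 0.1 and the names `angle_gain` and `apply_gain` are assumptions. The attenuation circuit 1140 may apply a different rule, for example a smooth transition rather than a hard switch.

```python
def angle_gain(angle_deg, threshold_deg=15.0, floor_gain=0.1):
    """Return a reduced gain for sound arriving below the threshold angle."""
    return floor_gain if angle_deg < threshold_deg else 1.0


def apply_gain(frame, angle_deg, threshold_deg=15.0, floor_gain=0.1):
    """Scale one frame of samples according to its estimated vertical angle."""
    gain = angle_gain(angle_deg, threshold_deg, floor_gain)
    return [gain * sample for sample in frame]
```

With these assumed values, a frame estimated at about zero degrees, such as ambient noise, is scaled down, while a frame estimated at a steeper angle, such as speech from a standing user, passes unchanged.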
The plurality of microphones 326 or the individual microphones of a particular microphone array 1110 need not necessarily be located in a single lens housing. For example, one or more microphones 326 of a vertically arranged array may be installed in the left side tail lens, while other microphones 326 of the vertically arranged array may be installed in the right side tail lens. Because the microphone arrays 1110 may be separated by a relatively large distance, a high spatial resolution of the direction of the speech signal may be available. However, it may be more cost-effective to have a microphone array 1110 installed in a single lens housing.
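One way to see why a larger separation may improve spatial resolution is to consider time differences that can only be resolved to one sample period. Under that assumption, and under a far-field model, the smallest nonzero angle that can be distinguished from zero is asin(c / (fs·d)). The sketch below evaluates this expression. The model, the 343 m/s speed of sound, and the name `angular_step_deg` are assumptions introduced for illustration.

```python
import math


def angular_step_deg(spacing_m, sample_rate_hz, speed=343.0):
    """Smallest resolvable angle for a one-sample time-difference step."""
    ratio = speed / (sample_rate_hz * spacing_m)
    if ratio >= 1.0:
        # The spacing is too small to resolve any angle at this sample rate.
        return 90.0
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Hypothetical spacings: within one lens housing versus across the vehicle.
    for spacing in (0.05, 1.2):
        print(spacing, angular_step_deg(spacing, 16000))
```

The angular step shrinks as the spacing grows, which is consistent with the observation that widely separated microphones may resolve direction more finely.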
While various embodiments of the invention have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the invention. Accordingly, the invention is not to be restricted except in light of the attached claims and their equivalents.
| Number | Date | Country | Kind |
|---|---|---|---|
| 06020730.5 | Oct 2006 | EP | regional |