DVE system with normalized selection

BACKGROUND AND SUMMARY OF THE INVENTION

The invention relates to digital voice enhancement, DVE, communication systems, and more particularly to enhanced selection techniques between microphones.

The invention may be used in duplex systems, for example as shown in U.S. Pat. No. 5,033,082, and U.S. application Ser. No. 08/927,874, filed Sep. 11, 1997, simplex systems, for example as shown in U.S. application Ser. No. 09/050,511, filed Mar. 30, 1998, all incorporated herein by reference, and in other DVE communication systems.

The invention of the '874 application relates to acoustic echo cancellation systems, including active acoustic attenuation systems and communication systems. The invention of the '874 application arose during continuing development efforts relating to the subject matter of U.S. Pat. No. 5,033,082, incorporated herein by reference.

In one aspect of the invention of the '874 application, a fully coupled active echo cancellation matrix is provided, cancelling echo due to acoustic transmission between zones, in addition to cancellation of echoes due to electrical transmission between zones as in incorporated U.S. Pat. No. 5,033,082. In the latter patent, a communication system is provided including a first acoustic zone, a second acoustic zone, a first microphone at the first zone, a first loudspeaker at the first zone, a second microphone at the second zone and having an output supplied to the first loudspeaker such that a first person at the first zone can hear the speech of a second person at the second zone as transmitted by the second microphone and the first loudspeaker, a second loudspeaker at the second zone and having an input supplied from the first microphone such that the second person at the second zone can hear the speech of the first person at the first zone as transmitted by the first microphone and the second loudspeaker, a first model cancelling the speech of the second person in the output of the first microphone otherwise present due to electrical transmission from the second microphone to the first loudspeaker and broadcast by the first loudspeaker to the first microphone, the cancellation of the speech of the second person in the output of the first microphone preventing rebroadcast thereof by the second loudspeaker, and a second model cancelling the speech of the first person in the output of the second microphone otherwise present due to electrical transmission from the first microphone to the second loudspeaker and broadcast by the second loudspeaker to the second microphone, the cancellation of the speech of the first person in the output of the second microphone preventing rebroadcast thereof by the first loudspeaker. 
In the invention of the '874 application, there is provided a third model cancelling the speech of the first person in the output of the first microphone otherwise present due to acoustic transmission from the second loudspeaker in the second zone to the first microphone in the first zone, and a fourth model cancelling the speech of the second person in the output of the second microphone otherwise due to acoustic transmission from the first loudspeaker in the first zone to the second microphone in the second zone. The invention of the '874 application has desirable application in those implementations where there is acoustic coupling between the first and second zones, for example in a vehicle such as a minivan, where the first zone is the front seat and the second zone is a rear seat, and it is desired to provide an intercom communication system, and cancel echoes not only due to local acoustic transmission in a zone but also global acoustic transmission between zones, including in combination with active acoustic attenuation.

In another aspect of the invention of the '874 application, there is provided a switch having open and closed states, and conducting the output of a microphone therethrough in the closed state, a voice activity detector having an input from the output of the microphone at a node between the microphone and the switch, an occupant sensor sensing the presence of a person at the acoustic zone, and a logical AND function having a first input from the voice activity detector, a second input from the occupant sensor, and an output to the switch to actuate the latter between open and closed states. This feature is desirable in automotive applications when there are no additional passengers for a driver to communicate with.
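By way of illustration only, the gating logic described above may be sketched as follows. The function names and the choice to mute the output with zeros when the switch is open are assumptions made for the sketch and are not taken from the '874 application.

```python
from typing import List, Sequence


def switch_closed(voice_detected: bool, occupant_present: bool) -> bool:
    """Logical AND of the voice activity detector and the occupant sensor."""
    return voice_detected and occupant_present


def gate_microphone(samples: Sequence[float],
                    voice_detected: bool,
                    occupant_present: bool) -> List[float]:
    """Conduct the microphone output when the switch is closed, else mute it.

    Muting with zeros is an assumption of this sketch.
    """
    if switch_closed(voice_detected, occupant_present):
        return list(samples)
    return [0.0] * len(samples)
```

With no occupant sensed at the zone, the microphone output is blocked even if the voice activity detector fires, for example on radio audio or road noise.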

In another aspect of the invention of the '874 application, an input to a model is supplied through a variable training signal circuit providing increasing training signal levels with increasing speech signal levels or increased interior ambient noise levels associated with higher vehicle speeds. This is desirable for on-line training noise to be imperceptible by the occupant yet have a sufficient signal to noise ratio for accurate model convergence.

In another aspect of the invention of the '874 application, a noise responsive high pass filter is provided between a microphone and a remote yet acoustically coupled loudspeaker, and having a filter cutoff effective at elevated noise levels and reducing bandwidth and making more gain available, to improve intelligibility of speech of a person in the zone of the microphone transmitted to the remote loudspeaker. In vehicle applications, the high pass filter is vehicle speed sensitive, such that at higher vehicle speeds and resulting higher noise levels, lower frequency speech content is blocked and higher frequency speech content is passed, the lower frequency speech content being otherwise masked at higher speeds by broadband vehicle and wind noise, so that the reduced bandwidth and the absence of the lower frequency speech content does not sacrifice the perceived quality of speech, and such that at lower vehicle speeds and resulting lower noise levels, the cutoff frequency of the filter is lowered such that lower frequency speech content is passed, in addition to higher frequency speech content, to provide enriched low frequency performance, and overcome objections to a tinny sounding system.
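A minimal sketch of a vehicle speed sensitive high pass filter is given below for illustration. The cutoff frequencies, the speed range, the linear mapping from speed to cutoff, and the first-order filter structure are all hypothetical choices; the '874 application does not specify them here.

```python
import math
from typing import List, Sequence

# Hypothetical tuning values, chosen only for illustration.
LOW_SPEED_CUTOFF_HZ = 100.0
HIGH_SPEED_CUTOFF_HZ = 500.0
MAX_SPEED_KMH = 120.0


def cutoff_for_speed(speed_kmh: float) -> float:
    """Raise the cutoff linearly with vehicle speed, clamped at both ends."""
    frac = min(max(speed_kmh / MAX_SPEED_KMH, 0.0), 1.0)
    return LOW_SPEED_CUTOFF_HZ + frac * (HIGH_SPEED_CUTOFF_HZ - LOW_SPEED_CUTOFF_HZ)


def high_pass(samples: Sequence[float],
              cutoff_hz: float,
              sample_rate_hz: float) -> List[float]:
    """First-order high pass filter: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    a = rc / (rc + dt)
    out: List[float] = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = a * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

At low speed the lower cutoff passes more low frequency speech content; at high speed the higher cutoff blocks content that would be masked by vehicle and wind noise.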

In another aspect of the invention of the '874 application, there is provided a feedback detector having an input from a microphone, and an output controlling an adjustable notch filter filtering the output of the microphone supplied to a remote yet acoustically coupled loudspeaker. This overcomes prior objections in closed loop communication systems which can become unstable whenever the total loop gain exceeds unity. Careful setting of system gain and acoustic echo cancellation may be used to ensure system stability. For various reasons, such as high gain requirements, acoustic feedback may occur, which is often at the system resonance or where the free response is relatively undamped. These resonances usually have a very high Q factor and can be represented by a narrow band in the frequency domain. Thus, the total system gain ceiling is determined by a small portion of the communication system bandwidth, in essence limiting performance across all frequencies in the band for one or more narrow regions. The present invention overcomes this objection.
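For illustration, a feedback detector driving an adjustable notch filter might be sketched as below. The detector here is an assumed one: it measures power at a set of candidate frequencies with the Goertzel algorithm and reports a frequency only when its power dominates the others. The notch is a standard second-order (biquad) section. The threshold, the Q value and all names are hypothetical and are not taken from the '874 application.

```python
import math
from typing import List, Optional, Sequence


def goertzel_power(samples: Sequence[float], freq_hz: float,
                   sample_rate_hz: float) -> float:
    """Signal power at one frequency, computed with the Goertzel recursion."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s1 = 0.0
    s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_feedback(samples: Sequence[float], sample_rate_hz: float,
                    candidates_hz: Sequence[float],
                    ratio_threshold: float = 10.0) -> Optional[float]:
    """Return the dominant narrowband frequency, or None if nothing stands out.

    Requires at least two candidate frequencies.
    """
    powers = [goertzel_power(samples, f, sample_rate_hz) for f in candidates_hz]
    peak = max(powers)
    mean_rest = (sum(powers) - peak) / (len(powers) - 1)
    if peak > ratio_threshold * mean_rest:
        return candidates_hz[powers.index(peak)]
    return None


def notch_filter(samples: Sequence[float], notch_hz: float,
                 sample_rate_hz: float, q: float = 30.0) -> List[float]:
    """Second-order notch centered on notch_hz; a high Q gives a narrow band."""
    w0 = 2.0 * math.pi * notch_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = 1.0 / a0
    b1 = -2.0 * math.cos(w0) / a0
    b2 = 1.0 / a0
    a1 = -2.0 * math.cos(w0) / a0
    a2 = (1.0 - alpha) / a0
    out: List[float] = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

Because only a narrow band around the detected resonance is attenuated, the gain available to the rest of the communication bandwidth need not be limited by that resonance.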

In another aspect of the invention of the '874 application, an acoustic feedback tonal canceler is provided, removing tonal noise from the output of the microphone to prevent broadcast thereof by a remote but acoustically coupled loudspeaker.

The invention of the '511 application arose during development efforts directed toward reducing complexities of full duplex voice communication systems, i.e. bidirectional voice transmission where talkers exchange information simultaneously. In a full duplex system, acoustic echo cancellation is needed to overcome feedback generated by closed loop communication channel instabilities. Use of a simplex scheme that alternately selects one or another microphone or channel as active is another way to effectively control feedback into a near end microphone from a near end loudspeaker. In a simplex system, voice transmission is unidirectional, i.e. either one way or the other way at any given time, but not in both directions at the same time.

A simplex digital voice enhancement communication system does not rely on acoustic echo cancellation to ensure stable communication loop gains for closely coupled microphones and loudspeakers. However, there is a potential for feedback into a near end microphone from a far end loudspeaker. This situation exists because it would be self-defeating to have the active microphone switched off. The invention of the '511 application addresses and solves this problem in a particularly simple and effective manner with a combination of readily available known components.

The present invention relates to enhanced selection techniques in a digital voice enhancement communication system for selecting which of a plurality of microphones to connect to a loudspeaker. The switch in the DVE system must decide which microphone from an array of microphones to select as the active one. In the past, this decision was made by comparing the average magnitudes of all microphone signals in which speech was detected (voice plus noise signals). The accuracy of this method depended on the sensitivity of each microphone and on the background (noise) signal level at each microphone. For example, a first talker might have a more sensitive microphone than a second talker and would therefore have a higher chance of being selected as the active talker. As another example, a third talker might be in a noisier location and therefore have a higher chance of being selected. The noted prior art method was thus not immune to differing microphone sensitivities and differing background noise levels. The present invention addresses and solves this problem.
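The prior selection method described above, and its bias toward a more sensitive microphone, can be illustrated with the following sketch. The data layout, the microphone names and the function names are assumptions made for the example; only the comparison of average magnitudes among microphones with detected speech comes from the text.

```python
from typing import Dict, Optional, Sequence


def average_magnitude(frame: Sequence[float]) -> float:
    """Mean absolute value of one frame of microphone samples."""
    return sum(abs(x) for x in frame) / len(frame)


def select_active_microphone(frames: Dict[str, Sequence[float]],
                             speech_detected: Dict[str, bool]) -> Optional[str]:
    """Prior method: pick the speech-bearing microphone with the largest
    average magnitude (voice plus noise), with no normalization."""
    levels = {
        name: average_magnitude(frame)
        for name, frame in frames.items()
        if speech_detected.get(name, False) and len(frame) > 0
    }
    if not levels:
        return None
    return max(levels, key=levels.get)


def demo_sensitivity_bias() -> Optional[str]:
    """A weaker voice wins because its microphone is three times as sensitive."""
    strong_voice = [0.5, -0.5, 0.5, -0.5]        # microphone gain 1.0
    weak_voice = [0.2 * 3.0, -0.2 * 3.0] * 2     # microphone gain 3.0
    return select_active_microphone(
        {"mic_a": strong_voice, "mic_b": weak_voice},
        {"mic_a": True, "mic_b": True},
    )
```

In the demonstration, the unnormalized comparison selects "mic_b" even though the talker at "mic_a" is speaking more loudly, which is the kind of error the text attributes to the prior method.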

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1-8 are taken from the noted '874 application.

FIG. 1 shows an active acoustic attenuation and communication system in accordance with the invention of the '874 application.

FIG. 2 shows an intercom communication system in accordance with the invention of the '874 application.

FIG. 3 shows a portion of a communication system in accordance with the invention of the '874 application.

FIG. 4 shows a communication system in accordance with the invention of the '874 application.

FIG. 5 shows a communication system in accordance with the invention of the '874 application.

FIG. 6 shows a communication system in accordance with the invention of the '874 application.

FIG. 7 shows a communication system in accordance with the invention of the '874 application.

FIG. 8 shows a communication system in accordance with the invention of the '874 application.

FIG. 9 is taken from the noted '511 application.

FIG. 9 schematically illustrates a digital voice enhancement communication system in accordance with the invention of the '511 application.

FIG. 10 shows a DVE, digital voice enhancement, communication system in accordance with the present invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 is similar to the drawing of incorporated U.S. Pat. No. 5,033,082, and uses like reference numerals where appropriate to facilitate understanding.

FIG. 1 shows an active acoustic attenuation system 10 having a first zone 12 subject to noise from a noise source 14, and a second zone 16 spaced from zone 12 and subject to noise from a noise source 18. Microphone 20 senses noise from noise source 14. Microphone 22 senses noise from noise source 18. Zone 12 includes a talking location 24 therein such that a person 26 at location 24 is subject to noise from noise source 14. Zone 16 includes a talking location 28 therein such that a person 30 at location 28 is subject to noise from noise source 18. Loudspeaker 32 introduces sound into zone 12 at location 24. Loudspeaker 34 introduces sound into zone 16 at location 28. An error microphone 36 senses noise and speech at location 24. Error microphone 38 senses noise and speech at location 28.

An adaptive filter model 40 adaptively models the acoustic path from noise microphone 20 to talking location 24. Model 40 is preferably that disclosed in U.S. Pat. No. 4,677,676, incorporated herein by reference. Adaptive filter model 40 has a model input 42 from noise microphone 20, an error input 44 from error microphone 36, and outputs at output 46 a correction signal to loudspeaker 32 to introduce cancelling sound at location 24 to cancel noise from noise source 14 at location 24, all as in incorporated U.S. Pat. No. 4,677,676.

An adaptive filter model 48 adaptively models the acoustic path from noise microphone 22 to talking location 28. Model 48 has a model input 50 from noise microphone 22, an error input 52 from error microphone 38, and outputs at output 54 a correction signal to loudspeaker 34 to introduce cancelling sound at location 28 to cancel noise from noise source 18 at location 28.

An adaptive filter model 56 adaptively cancels noise from noise source 14 in the output 58 of error microphone 36. Model 56 has a model input 60 from noise microphone 20, an output correction signal at output 62 subtractively summed at summer 64 with the output 58 of error microphone 36 to provide a sum 66, and an error input 68 from sum 66.
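The arrangement of model 56, in which a reference input is filtered, subtracted from the microphone output, and the resulting sum fed back as the error, can be illustrated with a simplified sketch. The preferred models are recursive least mean square filters per incorporated U.S. Pat. No. 4,677,676; the sketch below substitutes a plain finite impulse response least mean square update purely for brevity. The tap count, step size and names are assumptions.

```python
from typing import List, Sequence, Tuple


def lms_cancel(reference: Sequence[float],
               primary: Sequence[float],
               num_taps: int = 4,
               step_size: float = 0.05) -> Tuple[List[float], List[float]]:
    """Cancel the part of `primary` that is correlated with `reference`.

    reference : samples at the model input (e.g. from noise microphone 20)
    primary   : samples at the microphone output (e.g. output 58)
    Returns the error samples (the sum, e.g. sum 66) and the final weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps          # most recent reference sample first
    errors: List[float] = []
    for x, d in zip(reference, primary):
        history = [x] + history[:-1]
        # Correction signal at the model output.
        y = sum(w * h for w, h in zip(weights, history))
        # Subtractive summing; the sum doubles as the error input.
        e = d - y
        # LMS weight update driven by the error and the reference history.
        weights = [w + step_size * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights
```

In the system described, speech is uncorrelated with the noise reference and therefore remains in the sum while the noise component is removed.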

An adaptive filter model 70 adaptively cancels noise from noise source 18 in the output 72 of error microphone 38. Model 70 has a model input 74 from noise microphone 22, an output correction signal at output 76 subtractively summed at summer 78 with the output 72 of error microphone 38 to provide a sum 80, and an error input 82 from sum 80.

An adaptive filter model 84 adaptively cancels speech from person 30 in the output 58 of error microphone 36. Model 84 has a model input 86 from error microphone 38, an output correction signal at output 88 subtractively summed at summer 90 with sum 66 to provide a sum 92, and an error input 94 from sum 92. Sum 92 is additively summed at summer 96 with the output 54 of model 48 to provide a sum 98 which is supplied to loudspeaker 34. Sum 92 is thus supplied to loudspeaker 34 such that person 30 can hear the speech of person 26.

An adaptive filter model 100 adaptively cancels speech from person 26 in the output 72 of error microphone 38. Model 100 has a model input 102 from error microphone 36 at sum 92, an output correction signal at output 104 subtractively summed at summer 106 with sum 80 to provide a sum 108, and an error input 110 from sum 108. Sum 108 is additively summed at summer 112 with the output 46 of model 40 to provide a sum 114 which is supplied to loudspeaker 32. Hence, sum 108 is supplied to loudspeaker 32 such that person 26 can hear the speech of person 30. Model input 86 is provided by sum 108, and model input 102 is provided by sum 92.

Sum 98 supplied to loudspeaker 34 is substantially free of noise from noise source 14 as acoustically and electrically cancelled by adaptive filter models 40 and 56, respectively. Sum 98 is substantially free of speech from person 30 as electrically cancelled by adaptive filter model 84. Hence, sum 98 to loudspeaker 34 is substantially free of noise from noise source 14 and speech from person 30 but does contain speech from person 26, such that loudspeaker 34 cancels noise from noise source 18 at location 28 and introduces substantially no noise from noise source 14 and introduces substantially no speech from person 30 and does introduce speech from person 26, such that person 30 can hear person 26 substantially free of noise from noise sources 14 and 18 and substantially free of his own speech.

Sum 114 supplied to loudspeaker 32 is substantially free of noise from noise source 18 as acoustically and electrically cancelled by adaptive filter models 48 and 70, respectively. Sum 114 is substantially free of speech from person 26 as electrically cancelled by adaptive filter model 100. Sum 114 to loudspeaker 32 is thus substantially free of noise from noise source 18 but does contain speech from person 30, such that loudspeaker 32 cancels noise from noise source 14 at location 24 and introduces substantially no noise from noise source 18 and introduces substantially no speech from person 26 and does introduce speech from person 30, such that person 26 can hear person 30 substantially free of noise from noise sources 14 and 18 and substantially free of his own speech.

Each of the adaptive filter models is preferably that shown in above incorporated U.S. Pat. No. 4,677,676. Each model adaptively models its respective forward path from its respective input to its respective output on-line without dedicated off-line pretraining. Each of models 40 and 48 also adaptively models its respective feedback path from its respective loudspeaker to its respective microphone for both broadband and narrowband noise without dedicated off-line pretraining and without a separate model dedicated solely to the feedback path and pretrained thereto. Each of models 40 and 48, as in above noted incorporated U.S. Pat. No. 4,677,676, adaptively models the feedback path from the respective loudspeaker to the respective microphone as part of the adaptive filter model itself without a separate model dedicated solely to the feedback path and pretrained thereto. Each of models 40 and 48 has a transfer function comprising both zeros and poles to model the forward path and the feedback path, respectively. Each of models 56 and 70 has a transfer function comprising both poles and zeros to adaptively model the pole-zero acoustical transfer function between its respective input microphone and its respective error microphone. Each of models 84 and 100 has a transfer function comprising both poles and zeros to adaptively model the pole-zero acoustical transfer function between its respective output loudspeaker and its respective error microphone. The adaptive filter for all models is preferably accomplished by the use of a recursive least mean square filter, as described in incorporated U.S. Pat. No. 4,677,676. It is also preferred that each of the models 40 and 48 be provided with an auxiliary noise source, such as 140 in incorporated U.S. Pat. No. 4,677,676, introducing auxiliary noise into the respective adaptive filter model which is random and uncorrelated with the noise from the respective noise source to be cancelled.

In one embodiment, noise microphones 20 and 22 are placed at the end of a probe tube in order to avoid placing the microphones directly in a severe environment such as a region of high temperature or high electromagnetic field strength. Alternatively, the signals produced by noise microphones 20 and 22 are obtained from a vibration sensor placed on the respective noise source or obtained from an electrical signal directly associated with the respective noise source, for example a tachometer signal on a machine or a computer generated drive signal on a device such as a magnetic resonance scanner.

In one embodiment, a single noise source 14 and model 40 are provided, with cancellation via loudspeaker 32 and communication from person 26 via microphone 36. In another embodiment, only models 40 and 56 are provided. In another embodiment, only models 40, 56 and 84 are provided.

It is thus seen that communication system 10 includes a first acoustic zone 12, a second acoustic zone 16, a first microphone 36 at the first zone, a first loudspeaker 32 at the first zone, a second microphone 38 at the second zone and having an output supplied to first loudspeaker 32 such that a first person 26 at first zone 12 can hear the speech of a second person 30 at second zone 16 as transmitted by second microphone 38 and first loudspeaker 32, and a second loudspeaker 34 at second zone 16 and having an input supplied from first microphone 36 such that the second person 30 at the second zone 16 can hear the speech of the first person 26 at the first zone 12 as transmitted by first microphone 36 and second loudspeaker 34. Each of the zones is subject to noise. First person 26 at first talking location 24 in first zone 12 and second person 30 at second talking location 28 in second zone 16 are each subject to noise. Loudspeaker 32 introduces sound into first zone 12 at first talking location 24. Loudspeaker 34 introduces sound into second zone 16 at second talking location 28. Error microphone 36 senses noise and speech at location 24. Model 40 has a model input 42 from a reference signal correlated to the noise as provided by input microphone 20 sensing noise from noise source 14. Model 40 has an error input 44 from microphone 36. Model 40 has a model output 46 outputting a correction signal to loudspeaker 32 to introduce canceling sound at location 24 to attenuate noise thereat. Error microphone 38 senses noise and speech at location 28. Model 48 has a model input 50 from a reference signal correlated with the noise as provided by input microphone 22 sensing the noise from noise source 18. Model 48 has an error input 52 from microphone 38. Model 48 has a model output 54 outputting a correction signal to loudspeaker 34 to introduce cancelling sound at location 28 to attenuate noise thereat. Model 56 has a model input 60 from microphone 20, a model output 62 outputting a correction signal summed at summer 64 with the output 58 of microphone 36 to electrically cancel noise from first zone 12 in the output of microphone 36, and an error input 68 from the output 66 of summer 64. Model 70 has a model input 74 from microphone 22, a model output 76 outputting a correction signal summed at summer 78 with the output 72 of microphone 38 to cancel noise from zone 16 in the output of microphone 38, and an error input 82 from the output 80 of summer 78. Model 84 cancels the speech of second person 30 in the output of microphone 36 otherwise present due to electrical transmission from microphone 38 to loudspeaker 32 and broadcast by loudspeaker 32 to microphone 36, the cancellation of the speech of person 30 in the output of microphone 36 preventing rebroadcast thereof by loudspeaker 34. Model 100 cancels the speech of person 26 in the output of microphone 38 otherwise present due to electrical transmission from microphone 36 to loudspeaker 34 and broadcast by loudspeaker 34 to microphone 38, the cancellation of the speech of person 26 in the output of microphone 38 preventing rebroadcast thereof by loudspeaker 32.

The system above described is shown in incorporated U.S. Pat. No. 5,033,082.

In the system of the '874 application, additional models 120 and 122 are provided. Model 120 cancels the speech of person 26 in the output of microphone 36 otherwise present due to acoustic transmission from loudspeaker 34 in zone 16 to microphone 36 in zone 12. This is desirable in implementations where there is no acoustic isolation or barrier between zones 12 and 16, for example as in a vehicle such as a minivan where zone 12 may be the front seat and zone 16 a back seat, i.e. where there is acoustic coupling of the zones and acoustic transmission therebetween such that sound broadcast by loudspeaker 34 is not only electrically transmitted via microphone 38 and loudspeaker 32 to zone 12, but is also acoustically transmitted from loudspeaker 34 to zone 12. Model 122 cancels the speech of person 30 in the output of microphone 38 otherwise present due to acoustic transmission from loudspeaker 32 in zone 12 to microphone 38 in zone 16.

Model 84 models the path from loudspeaker 32 to microphone 36. Model 100 models the path from loudspeaker 34 to microphone 38. Model 120 models the path from loudspeaker 34 to microphone 36. Model 122 models the path from loudspeaker 32 to microphone 38. Model 84 has a model input 86 from the input to loudspeaker 32 supplied from the output of microphone 38, and a model output 88 to the output of microphone 36 supplied to the input of loudspeaker 34. Model 100 has a model input 102 from the input to loudspeaker 34 supplied from the output of microphone 36, and a model output 104 to the output of microphone 38 supplied to the input of loudspeaker 32. Model 120 has a model input 124 from the input to loudspeaker 34 supplied from the output of microphone 36, and a model output 126 to the output of microphone 36 supplied to the input of loudspeaker 34. Model 122 has a model input 128 from the input to loudspeaker 32 supplied from the output of microphone 38, and a model output 130 to the output of microphone 38 supplied to the input of loudspeaker 32. An auxiliary noise source 132, like auxiliary noise source 140 in incorporated U.S. Pat. No. 4,677,676, introduces auxiliary noise through summer 134 into model inputs 102 and 124 of models 100 and 120, respectively, which auxiliary noise is random and uncorrelated with the noise from the respective noise source to be canceled. In one embodiment, the auxiliary noise source 132 is provided by a Galois sequence, M. R. Schroeder, Number Theory In Science And Communications, Berlin: Springer-Verlag, 1984, pages 252-261, though other random uncorrelated noise sources may of course be used. The Galois sequence is a pseudo random sequence that repeats after 2^M - 1 points, where M is the number of stages in a shift register. The Galois sequence is preferred because it is easy to calculate and can easily have a period much longer than the response time of the system. An auxiliary random noise source 136 introduces auxiliary noise through summer 138 into model inputs 86 and 128 of models 84 and 122, respectively, which auxiliary noise is random and uncorrelated with the noise from the respective noise source to be canceled. It is preferred that auxiliary noise source 136 be provided by a Galois sequence, as above described. Each of auxiliary noise sources 132 and 136 is random and uncorrelated relative to each other and relative to noise from noise source 14, speech from person 26, noise from noise source 18, and speech from person 30. Model 120 is trained to converge to and model the path from loudspeaker 34 to microphone 36 by the auxiliary noise from source 132. Model 100 is trained to converge to and model the path from loudspeaker 34 to microphone 38 by the auxiliary noise from source 132. Model 84 is trained to converge to and model the path from loudspeaker 32 to microphone 36 by the auxiliary noise from source 136. Model 122 is trained to converge to and model the path from loudspeaker 32 to microphone 38 by the auxiliary noise from source 136.
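The period of a shift register sequence of the kind described for auxiliary noise sources 132 and 136 can be checked with a short sketch. It assumes a right-shifting Galois linear feedback shift register; the feedback masks (0xC for a 4-stage register, 0xB400 for a 16-stage register), the seeds, and the mapping of bits to +1/-1 training samples are illustrative choices not taken from the text.

```python
from typing import List, Tuple


def lfsr_step(state: int, mask: int) -> Tuple[int, int]:
    """Advance a right-shifting Galois LFSR by one step.

    Returns the new state and the bit shifted out.
    """
    bit = state & 1
    state >>= 1
    if bit:
        state ^= mask
    return state, bit


def sequence_period(mask: int, seed: int = 1) -> int:
    """Count steps until the register returns to its (nonzero) seed state."""
    if seed == 0:
        raise ValueError("seed must be nonzero")
    state, _ = lfsr_step(seed, mask)
    count = 1
    while state != seed:
        state, _ = lfsr_step(state, mask)
        count += 1
    return count


def training_noise(mask: int, seed: int, count: int) -> List[float]:
    """Map output bits to +1.0 / -1.0 samples for use as a training signal."""
    if seed == 0:
        raise ValueError("seed must be nonzero")
    out: List[float] = []
    state = seed
    for _ in range(count):
        state, bit = lfsr_step(state, mask)
        out.append(1.0 if bit else -1.0)
    return out
```

With M = 4 stages and mask 0xC the sequence repeats after 2^4 - 1 = 15 points, and with M = 16 stages and mask 0xB400 it repeats after 2^16 - 1 = 65535 points, so a modest register length already gives a period far longer than a typical model response.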

FIG. 2 shows a system similar to FIG. 1, and uses like reference numerals where appropriate to facilitate understanding. The system of FIG. 2 is used in a vehicle 140, such as a minivan. Loudspeaker 32 provides enhanced voice from zone 2, i.e. with noise and echo cancellation as above described. Loudspeaker 32 also provides audio for zone 1 and cellular phone for zone 1 at 12 such as the front seat. Also supplied at zone 1 is voice in zone 1 from person 26 such as the driver and/or front seat passenger. Also supplied at zone 1 due to acoustic coupling from zone 2 are the echo of enhanced voice 1 broadcast by speaker 34, with noise and echo cancellation as above described, and audio from zone 2 and cellular phone from zone 2. The signal content in the output 58 of microphone 36 as shown at 59 includes: voice 1; enhanced voice 1 echo; enhanced voice 2; audio 1; audio 2; cell phone 1; cell phone 2. Loudspeaker 34 broadcasts enhanced voice 1, audio for zone 2 and cellular phone for zone 2 at 16 such as a rear seat of the vehicle. Also supplied at zone 2 are voice in zone 2 from person 30, such as one or more rear seat passengers, enhanced voice 2 echo which is the voice from zone 2 as broadcast by speaker 32 in zone 1 due to acoustic coupling therebetween, as well as audio from zone 1 and cell phone from zone 1 as broadcast by speaker 32. The signal content in the output 72 of microphone 38 as shown at 73 includes: voice 2; enhanced voice 2 echo; enhanced voice 1; audio 1; audio 2; cell phone 1; cell phone 2. Summer 90 sums the output 58 of microphone 36, the output 88 of model 84, and the output 126 of model 120, and supplies the resultant sum at 92 to summer 134, error correlator multiplier 142 of model 84, and error correlator multiplier 144 of model 120. Summer 134 sums the output 92 of summer 90, the training signal from auxiliary random noise source 132, and the audio 2 and cell phone 2 signals for zone 2, and supplies the resultant sum to loudspeaker 34, model input 124 of model 120, and model input 102 of model 100. Summer 106 sums the output 72 of microphone 38, model output 104 of model 100, and model output 130 of model 122, and supplies the resultant sum at 108 to summer 138, error correlator multiplier 146 of model 100, and error correlator multiplier 148 of model 122. Summer 138 sums the output 108 of summer 106, the training signal from auxiliary random noise source 136, and the audio 1 and cell phone 1 signals for zone 1, and supplies the resultant sum to loudspeaker 32, model input 86 of model 84, and model input 128 of model 122. The training signal from auxiliary random noise source 132 is supplied to summer 134 and to error correlator multipliers 146 and 144 of models 100 and 120, respectively. The training signal from auxiliary random noise source 136 is supplied to summer 138 and to error correlator multipliers 142 and 148 of models 84 and 122, respectively.

In digital voice enhancement, DVE, systems, acoustic echo cancelers, AEC, are used to minimize acoustic reflection and echo, prevent acoustic feedback, and remove additional unwanted signals. Acoustic echo cancelers are most often only applied between the immediate zone loudspeaker and microphone, e.g. model 84 modeling the path from loudspeaker 32 to microphone 36. However, in certain applications where the propagation losses or physical damping between communication zones such as 12 and 16 is not sufficient, e.g. a vehicle interior such as a minivan, the acoustic path between these zones may allow significant coupling and cause added system echo, acoustic feedback and signal corruption.

The system applies acoustic echo cancelers between all microphones and loudspeakers in the digital voice enhancement system as shown in FIG. 2. This allows signal contributions from the following sources to be removed from the microphone signal so that it includes only the voice signal from the near end talker: the far end voice broadcast from the near end loudspeaker; the near end audio broadcast from the near end loudspeaker; the near end voice broadcast from the far end loudspeaker; the far end audio broadcast from the far end loudspeaker; cellular phone broadcast from near end and far end loudspeakers. By removing these components, the closed loop full duplex communication system is more stable with desired system gains that were not previously possible. In addition, the resulting signal has less extraneous noise which allows enhanced precision in speech processing activities.

Acoustic echo cancellation may require on-line estimation of the acoustic echo path. In vehicle implementations, it is desirable to detect when occupant movement occurs, so that the acoustic echo cancellation models can be updated as quickly as possible. In a desirable feature enabled by the present invention, the available supplemental restraint occupant sensor or a seat belt use detector may be monitored. If the sensor indicates a change in occupant location or seat belt use, an occupant movement is assumed, and rapid adaptation occurs to correct the acoustic echo cancellation models and ensure optimal performance of the system.

Further in vehicle implementations, the proper placement of a communication microphone is difficult due to varying sizes of occupants and seat track locations. Less ideal microphone locations result in lower signal to noise ratios, higher required system gain, and lower performance. In a desirable aspect, the system enables utilization of supplemental restraint occupant sensors or seat track location sensors, potentially available in future supplemental restraint occupant position detection systems. From such sensors, certain weight, height, fore/aft location information, etc., may be available. The system enables use of such information to select the most appropriate microphone, e.g. from a bank of microphones, and/or gain selection to ensure system performance. For example, certain weight or height information would signal a short occupant. From this information, the general seat track position may be presumed or obtained from a seat track location sensor, and a best suited microphone selected. Also, from height information, the distance from the occupant to the selected microphone might be estimated, and an appropriate gain applied to account for extra distance from the selected microphone. The system enables utilization of such signals to increase system robustness by selecting appropriate transducers and parameters. This provides microphone selection and/or gain selection by occupant sensor input.

Multidimensional digital voice enhancement systems can be reconfigured during operation to match occupant requirements. Many activities are processor intensive and compromise system robustness when compared with smaller dimensioned systems. In a desirable aspect, the system enables utilization of vehicle occupant sensor or seat belt use detector information to determine if an occupant is present in a particular digital voice enhancement zone. If an occupant is not detected, certain functions associated with that zone may be eliminated from the computational activities. Processor capacity may be reassigned to other zones to do more elaborate signal processing. This enables the system to reconfigure its dimensionality to perform optimally under the requirements placed on it. This provides digital voice enhancement zone hibernation based on occupant sensors.

In digital voice enhancement systems, acoustic echo cancelers are used to minimize echo, stabilize closed loop communication channels, and prevent acoustic feedback, as above noted. The acoustic echo cancelers model the acoustic path between each loudspeaker and each microphone associated with the system. This full coupling of all the loudspeakers and microphones may be computationally expensive and objectionable in certain applications. In a desirable aspect, the system allows acoustic echo cancelers to be applied to loudspeaker-microphone acoustic paths when limited processor capabilities exist. Transfer functions are taken between each loudspeaker-microphone combination. The gain over the communication system bandwidth is compared between transfer functions. Those transfer functions exhibiting a higher gain trend over the frequency band indicate greater acoustic coupling between the particular loudspeaker and microphone. The system designer may use a gain trend ranking to apply acoustic echo cancelers first to those paths with the greater acoustic coupling. This allows the system designer to prioritize applying acoustic echo cancelers to the loudspeaker-microphone paths which most need assistance to ensure stable communication. Paths that cannot be serviced with acoustic echo cancelers would rely on the physical damping and propagation losses of the acoustic path for echo reduction, or other less intensive electronic means for increased stability. This enables digital voice enhancement optimization using physical characteristics.
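The gain trend ranking described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes the gain trend of a path is summarized by the mean of its measured gains in dB across the band, and the path labels, gain values, and the `budget` parameter are hypothetical.

```python
from statistics import mean


def rank_paths_by_coupling(gains_db):
    """Order loudspeaker-microphone paths from strongest to weakest coupling.

    gains_db maps a (loudspeaker, microphone) pair to a list of measured
    transfer function gains in dB across the communication band.
    The mean gain is an assumed stand-in for the "gain trend".
    """
    return sorted(gains_db, key=lambda path: mean(gains_db[path]), reverse=True)


def assign_cancelers(gains_db, budget):
    """Return the paths that receive an echo canceler, given a model budget."""
    return rank_paths_by_coupling(gains_db)[:budget]


# Hypothetical measurements for a two-zone system (illustrative values only).
example_gains_db = {
    ("loudspeaker 32", "microphone 36"): [-6.0, -4.0, -5.0],
    ("loudspeaker 34", "microphone 38"): [-7.0, -5.0, -6.0],
    ("loudspeaker 34", "microphone 36"): [-20.0, -18.0, -22.0],
    ("loudspeaker 32", "microphone 38"): [-25.0, -21.0, -23.0],
}
serviced_paths = assign_cancelers(example_gains_db, budget=2)
```

With a budget of two models, the two same-zone paths in this hypothetical data are serviced first, and the remaining cross-zone paths rely on physical damping and propagation losses.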

A voice activity detection algorithm is judged by how accurately it responds to a wide variety of acoustic events. One that provides a 100% hit rate on desired voice signals and a 0% falsing rate on unwanted noises is considered ideal. Use of an occupant sensing device as one of the inputs to the voice activity detection algorithm can provide certainty, within limits of the occupant sensing device, that no falsing will occur when a location is not occupied. This feature would be especially relevant to automotive applications when there are no additional passengers for a driver to communicate with. Smart airbags and other passive safety devices may soon be required to know attributes such as the size, shape, and presence of passengers in vehicles for proper deployment. The minimum information desired at the time of deployment is whether there is a passenger to be protected. No passenger or, possibly more important, a small passenger or child seat would require disarming of the passive restraint system. This sensing information would be useful as a compounding condition in digital voice enhancement systems to also deactivate a voice sensing microphone when no occupant is present. This provides voice activity detection with occupant sensing devices.

FIG. 3

shows a switch

150

having open and closed states, and conducting the output of microphone

38

therethrough in the closed state. A voice activity detector

152

has an input from the output of microphone

38

at a node

154

between microphone and switch

150

. An occupant sensor

156

senses the presence of a person at acoustic zone

16

, for example a rear passenger seat. A logic AND function provided by AND gate

158

has a first input

160

from voice activity detector

152

, a second input

162

from occupant sensor

156

, and an output

164

to switch

150

to actuate the latter between the open and closed states, to control whether the latter passes a zone transmit out signal or not.
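The gating logic of FIG. 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, and an open switch is assumed to output silence (0.0) rather than an attenuated signal.

```python
def gate_output(voice_detected: bool, occupant_present: bool) -> bool:
    """Logic AND of the voice activity detector and occupant sensor outputs."""
    return voice_detected and occupant_present


def zone_transmit_out(mic_sample: float, voice_detected: bool,
                      occupant_present: bool) -> float:
    """Pass the microphone sample when the switch is closed, else silence.

    Outputting 0.0 in the open state is an assumption of this sketch.
    """
    if gate_output(voice_detected, occupant_present):
        return mic_sample
    return 0.0
```

Because the occupant sensor is an AND input, a false trigger of the voice activity detector cannot close the switch for an unoccupied zone.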

It is desirable for on-line training noise to be imperceptible by the occupant, yet have sufficient signal to noise ratio for accurate model convergence. In a desirable aspect, the present system may be used to exploit microphone gate activity to increase the allowable training signal and acoustic echo cancellation convergence. This allows the acoustic echo cancellation models to be more aggressively and accurately adapted. When the microphone gate is opened, some level of speech will be present. When speech is transmitted, a higher level training signal may be added to the speech signal and still be imperceptible to the occupant. This can be accomplished by a gate controlled training signal gain, FIG.

4

. The present invention enables utilization of pre-existing system features to increase overall robustness in an unobtrusive fashion. This provides acoustic echo cancellation training noise level based on microphone gate activity.

In

FIG. 4

, the input to model

84

is supplied through a variable training signal circuit

170

providing increased training signal level with increasing speech signal levels from microphone

38

. Training signal circuit

170

includes a summer

172

having an input

174

from microphone

38

, an input

176

from a training signal, and an output

178

to loudspeaker

32

and to model

84

. A variable gain element

180

supplies the training signal from training signal source

182

to input

176

of summer

172

. A voice activity detector gate

184

senses the speech signal level from microphone

38

at a node

186

between microphone

38

and input

174

of summer

172

, and controls the gain of variable gain element

180

. As noted above, it is desired that the training signal levels be maintained below a level perceptible to a person at zone

12

.

Further in

FIG. 4

, the input to model

100

is supplied through variable training signal circuit

188

providing increasing training signal levels with increasing speech signal levels from microphone

36

. Training signal circuit

188

includes a summer

190

having an input

192

from microphone

36

, an input

194

from a training signal, and an output

196

to loudspeaker

34

and to model

100

. Variable gain element

198

supplies the training signal from training signal source

200

to input

194

of summer

190

. Voice activity detector gate

202

senses the speech signal level from microphone

36

at node

204

between microphone

36

and input

192

of summer

190

, and controls the gain of variable gain element

198

. It is preferred that the training signal level be maintained below a level perceptible to a person at zone

16

.
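A gate controlled training signal gain of the kind shown in FIG. 4 might be sketched as follows. All parameter names and the proportional gain rule are assumptions made for illustration; the text states only that the training signal level increases with speech level while remaining imperceptible.

```python
def training_gain(speech_level, gate_threshold, idle_gain, masking_ratio, max_gain):
    """Choose the training signal gain from the sensed speech level.

    Below the gate threshold the gate is closed and a low idle gain is used.
    Above it, the gain follows the speech level (assumed proportional rule),
    capped at max_gain so the training noise stays masked by the speech.
    """
    if speech_level < gate_threshold:
        return idle_gain
    return min(max_gain, max(idle_gain, masking_ratio * speech_level))


def loudspeaker_drive(speech_sample, noise_sample, gain):
    """Summer output: speech plus the gain-scaled training signal."""
    return speech_sample + gain * noise_sample
```

For example, with `gate_threshold=0.05`, `idle_gain=0.001`, `masking_ratio=0.1`, and `max_gain=0.05`, a speech level of 0.2 yields a training gain of about 0.02, while silence leaves the gain at its idle value.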

It is desirable to detect when occupant movement or luggage loading changes occur. In one implementation of the system, the vehicle door ajar or courtesy light signal may be monitored. If any door is opened, all on-line modeling is halted. This prohibits the models from adapting to both changes in the acoustic boundary characteristics due to open doors, and also to changes in loudspeaker location when mounted to the moving door. After the doors are determined to be shut, and a system settling time has passed, it can be assumed that an occupant movement or luggage loading change is likely to have occurred. Accordingly, adaptation can occur to correct the acoustic echo cancellation models and ensure optimal performance of the system. Alternatively, an echo return loss enhancement measurement can be made on each model to calculate the echo reduction offered by each acoustic echo cancellation and to determine if they are adequate. If it is determined that they are deficient, an aggressive adaptation could then correct the acoustic echo cancellation models. Again, the system enables the utilization of available signals to ensure system stability and robustness not only by not adapting while the physical system is in a nonfunctional condition but also by modeling when the system is returned to a functional condition to account for possible occupant or luggage movements.
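The door-based adaptation control and the echo return loss enhancement check can be sketched as below. This is a hedged illustration: the function names, the settling time parameter, and the minimum acceptable enhancement are hypothetical, and the enhancement is computed with the conventional ratio of microphone power to residual power in dB.

```python
import math


def adaptation_allowed(door_open, seconds_since_doors_shut, settle_seconds):
    """Halt on-line modeling while any door is open or the system is settling."""
    if door_open:
        return False
    return seconds_since_doors_shut >= settle_seconds


def erle_db(mic_samples, residual_samples):
    """Echo return loss enhancement: 10*log10(mic power / residual power)."""
    mic_power = sum(x * x for x in mic_samples)
    residual_power = sum(e * e for e in residual_samples)
    if mic_power == 0.0 or residual_power == 0.0:
        raise ValueError("ERLE is undefined for a zero-power signal")
    return 10.0 * math.log10(mic_power / residual_power)


def needs_aggressive_adaptation(erle, minimum_erle_db):
    """Flag a model whose echo reduction falls short of the assumed minimum."""
    return erle < minimum_erle_db
```

A model that reduces the echo amplitude by a factor of ten shows an enhancement of about 20 dB; a value below the chosen minimum would trigger the more aggressive adaptation.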

Digital voice enhancement systems may pick up and rebroadcast engine related noise in vehicle applications or other applications involving periodic or tonal noise. This becomes particularly annoying when one of the communication zones has much lower engine related noise than others. In this situation, the rebroadcast noise is not masked by the primary engine related noise. In a desirable aspect of the system, the engine or engine related tach signal may be conditioned with DC blocking and magnitude clipping to meet proper A/D limitations. A rising edge or zero crossing detector monitors the input signal and calculates a scalar frequency value. An average magnitude detector also monitors the input signal to shut down the frequency detection routine if the average magnitude drops below a specified level. This is a noise rejection scheme for signals with varying amplitude depending on engine speed, revolutions per minute, RPM. The calculated frequency is then converted to the engine related frequencies of interest which are summed and input to an electronic noise control, ENC, filter reference, to be described. The output of the filter is then subtracted from the microphone signal to remove the engine related component from the signal.
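The tach frequency estimation described above might look like the following sketch. It assumes the tach signal has already been DC blocked and clipped, uses rising zero crossings to measure the period, and treats the threshold and the multiplier set (1, 2, 4) as illustrative parameters; the function names are hypothetical.

```python
import math


def average_magnitude(samples):
    """Mean absolute value of the buffer."""
    return sum(abs(s) for s in samples) / len(samples)


def tach_frequency_hz(samples, sample_rate_hz, min_average_magnitude):
    """Estimate the tach frequency from rising zero crossings.

    Returns None when the average magnitude is below the threshold (the
    noise rejection step) or when fewer than two rising edges are found.
    """
    if average_magnitude(samples) < min_average_magnitude:
        return None
    edges = [n for n in range(1, len(samples))
             if samples[n - 1] < 0.0 <= samples[n]]
    if len(edges) < 2:
        return None
    # (number of whole periods) / (time spanned by those periods)
    return sample_rate_hz * (len(edges) - 1) / (edges[-1] - edges[0])


def engine_tone_frequencies(tach_hz, multipliers=(1, 2, 4)):
    """Convert the tach frequency to the engine related frequencies of interest."""
    return [m * tach_hz for m in multipliers]


# Illustrative input: a 50 Hz tach signal sampled at 8 kHz for one second.
tach_signal = [math.sin(2.0 * math.pi * 50.0 * n / 8000.0 + 0.1) for n in range(8000)]
estimated_hz = tach_frequency_hz(tach_signal, 8000.0, min_average_magnitude=0.1)
```

Measuring the span between the first and last rising edge, rather than counting edges per buffer, avoids a bias when the buffer does not hold a whole number of periods.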

In

FIG. 5

, a tonal noise remover

210

senses periodic noise and removes same from the output of microphone

36

to prevent broadcast thereof by loudspeaker

34

. Tonal noise remover

210

includes a summer

212

having an input

214

from microphone

36

, an input

216

from a tone generator

218

generating one or more tones in response to periodic noise and supplying same through adaptive filter model

220

, and an output

222

to loudspeaker

34

through summer

90

. Tone generator

218

receives a plurality of tach signals

224

,

226

, and outputs a plurality of tone signals to summer

228

for each of the tach signals, for example a tone signal

1

N

1

which is the same frequency as tach signal

1

, a tone signal

2

N

1

which is twice the frequency of tach signal

1

, a tone signal

4

N

1

which is four times the frequency of tach signal

1

, a tone signal

1

N

2

which is the same frequency as tach signal

2

, a tone signal

2

N

2

which is twice the frequency of tach signal

2

, etc. Model

220

has a model input

230

from summer

228

, a model output

232

outputting a correction signal to summer input

216

, and an error input

234

from summer output

222

.

Further in

FIG. 5

, a second tonal noise remover

240

senses periodic noise and removes same from the output of microphone

38

to prevent broadcast thereof by loudspeaker

32

. Tonal noise remover

240

includes summer

242

having an input

254

from microphone

38

, an input

246

from a tone generator

248

generating one or more tones in response to periodic noise and supplying same through adaptive filter model

260

, and an output

262

to loudspeaker

32

through summer

106

. Tone generator

258

receives a plurality of tach signals such as

264

and

266

, and outputs a plurality of tone signals to summer

268

, one for each of the tach signals, as above described for tone generator

218

and tach signals

224

and

226

. Model

260

has a model input

270

from summer

268

, a model output

272

outputting a correction signal to summer input

246

, and an error input

274

from summer output

262

. In the noted vehicle implementation, tach

1

signals

224

and

264

are the same, and tach 2 signals

226

and

266

are the same.
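The tonal noise removal of FIG. 5 can be illustrated with a small adaptive canceler. This sketch swaps in a specific, well-known technique that the text does not mandate: each tone is given a cosine and a sine reference with one LMS weight apiece, rather than feeding the summed tones to a single general adaptive filter model. The step size `mu`, the sample rate, and the test signal are assumptions.

```python
import math


def cancel_tones(mic, tone_freqs_hz, sample_rate_hz, mu=0.01):
    """Subtract adaptively weighted tones from the microphone signal.

    Each tone has a cosine and a sine reference with one LMS weight each,
    so the canceler can match any amplitude and phase at that frequency.
    Returns the residual (summer output) for every input sample.
    """
    weights = [[0.0, 0.0] for _ in tone_freqs_hz]
    residual = []
    for n, mic_sample in enumerate(mic):
        refs = []
        for freq in tone_freqs_hz:
            phase = 2.0 * math.pi * freq * n / sample_rate_hz
            refs.append((math.cos(phase), math.sin(phase)))
        # Correction signal: weighted sum of all references.
        correction = sum(w[0] * c + w[1] * s for w, (c, s) in zip(weights, refs))
        # Summer output, also used as the error input for adaptation.
        error = mic_sample - correction
        for w, (c, s) in zip(weights, refs):
            w[0] += mu * error * c
            w[1] += mu * error * s
        residual.append(error)
    return residual


# Illustrative microphone signal containing only engine tones at 50 and 100 Hz.
SAMPLE_RATE_HZ = 8000.0
mic_signal = [0.5 * math.sin(2.0 * math.pi * 50.0 * n / SAMPLE_RATE_HZ + 0.3)
              + 0.2 * math.sin(2.0 * math.pi * 100.0 * n / SAMPLE_RATE_HZ + 1.0)
              for n in range(4000)]
cleaned = cancel_tones(mic_signal, [50.0, 100.0], SAMPLE_RATE_HZ)
```

In this noise-free example the residual decays toward zero as the weights converge. With speech present, the same structure removes only the components correlated with the tone references, which is why the tach-derived frequencies matter.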

In vehicle implementations, background ambient noise increases with vehicle speed, and as a result more gain is needed in a communication system to sustain adequate speech intelligibility. In a desirable aspect, the system enables application of a noise responsive, including vehicle speed sensitive, high pass filter to the microphone signal. The filter cutoff would increase with elevated noise levels, such as elevated vehicle speeds, and therefore reduce the system bandwidth. By limiting system bandwidth, more gain is available, resulting in improved speech intelligibility. At higher speeds, the lower frequency speech content is masked by broadband vehicle and wind noise, so that the reduced bandwidth does not sacrifice the perceived quality of speech. At low speeds, the high pass filter lowers its cutoff frequency, to provide enriched low frequency performance, thus overcoming objections to a tinny sounding digital voice enhancement system. This provides noise responsive, including speed dependent, band limiting for a communication system.
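A speed dependent high pass filter of this kind can be sketched as follows. The linear mapping from speed to cutoff, the cutoff limits, the speed range, and the first-order filter are all assumptions chosen for illustration; the text does not specify the filter order or the mapping.

```python
import math


def highpass_cutoff_hz(speed_kph, low_cutoff_hz=100.0, high_cutoff_hz=400.0,
                       max_speed_kph=120.0):
    """Raise the cutoff linearly with vehicle speed (assumed mapping)."""
    fraction = min(max(speed_kph / max_speed_kph, 0.0), 1.0)
    return low_cutoff_hz + fraction * (high_cutoff_hz - low_cutoff_hz)


def highpass(samples, cutoff_hz, sample_rate_hz):
    """First-order high pass filter applied to a block of samples."""
    alpha = 1.0 / (1.0 + 2.0 * math.pi * cutoff_hz / sample_rate_hz)
    filtered = []
    previous_in = 0.0
    previous_out = 0.0
    for x in samples:
        previous_out = alpha * (previous_out + x - previous_in)
        previous_in = x
        filtered.append(previous_out)
    return filtered
```

At standstill the cutoff sits at its lowest value, preserving low frequency speech content; at or above the maximum speed it is clamped to the highest value, trading bandwidth for available gain.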

The adaptation of the acoustic echo cancellation models with random noise may be accomplished by injecting the training noise before or after the noise responsive or speed sensitive filter, FIG.

6

. Injection before such filter provides a system wherein the training noise is speed varying filtered. This approach is advantageous in obtaining the highest training signal allowed while being imperceptible to the occupant. However, the acoustic echo cancellation filters would have potentially unconstrained frequency components. Injection after the speed sensitive filter provides a system wherein the training noise would always be full bandwidth. This has the potential of being more robust, yet has the limitation of lower training noise levels allowed to be imperceptible to the occupant. In a desirable aspect, the system utilizes the natural trade-offs between bandwidth and gain, and results in a more robust communication system.

In

FIG. 6

, a noise responsive high pass filter

290

between microphone

36

and loudspeaker

34

has a filter cutoff effective at elevated noise levels and reducing bandwidth and making more gain available, to improve intelligibility of speech of person

26

transmitted from microphone

36

to loudspeaker

34

. In the noted vehicle application, high pass filter

290

is vehicle speed sensitive, such that at higher vehicle speeds and resulting higher noise levels, lower frequency speech content is blocked, and higher frequency speech content is passed, the lower frequency speech content being otherwise masked at higher speeds by broadband vehicle and wind noise, so that the reduced bandwidth and the absence of the lower frequency speech content does not sacrifice the perceived quality of speech, and such that at lower vehicle speeds and resulting lower noise levels, the cutoff frequency of the filter is lowered such that lower frequency speech content is passed, in addition to higher frequency speech content, to provide enriched low frequency performance, and overcome objections to a tinny sounding system. In one embodiment, a summer

292

has a first input

294

from microphone

36

, a second input

296

from a training signal supplied by training signal source

298

, and an output

300

to high pass filter

290

, such that the training signal is variably filtered according to noise level, namely vehicle speed in vehicle implementations. In an alternate embodiment, training signal source

298

is deleted, and a summer

302

is provided having an input

304

from high pass filter

290

, an input

306

from a training signal supplied by training signal source

308

, and an output

310

to loudspeaker

34

. In this embodiment, the training signal is full bandwidth and not variably filtered according to noise level or vehicle speed.

Further in

FIG. 6

, a noise responsive high pass filter

312

between microphone

38

and loudspeaker

32

has a filter cutoff effective at elevated noise levels and reducing bandwidth and making more gain available, to improve intelligibility of speech of person

30

transmitted from microphone

38

to loudspeaker

32

. In the noted vehicle application, high pass filter

312

is vehicle speed sensitive, such that at higher vehicle speeds and resulting high noise levels, lower frequency speech content is blocked and higher frequency speech content is passed, the lower frequency speech content being otherwise masked at higher speeds by broadband vehicle and wind noise, so that the reduced bandwidth and the absence of the lower frequency speech content does not sacrifice the perceived quality of speech, and such that at lower vehicle speeds and resulting lower noise levels, the cutoff frequency of the filter is lowered such that lower frequency speech content is passed, in addition to higher frequency speech content, to provide enriched low frequency performance, and overcome objections to a tinny sounding system. In one embodiment, a summer

314

has a first input

316

from microphone

38

, a second input

318

from a training signal supplied by training signal source

320

, and an output

322

to high pass filter

312

, such that the training signal is variably filtered according to noise level, namely vehicle speed in vehicle implementations. In an alternate embodiment, training signal source

320

is deleted, and a summer

324

is provided having an input

326

from high pass filter

312

, an input

328

from a training signal supplied by training signal source

330

, and an output

332

to loudspeaker

32

. In this embodiment, the training signal is full bandwidth and not variably filtered according to noise level or vehicle speed.

Optimal voice pickup in a digital voice enhancement system can be characterized by having the largest talking zone and the highest signal to noise ratio. The larger the talking zone, the less sensitive the digital voice enhancement system will be to the talker's physical size, seating position, and head position/movement. Large talking zones are associated with good system performance and ergonomics. High signal to noise ratios are associated with speech intelligibility and good sound quality. These two design goals are not always complementary. Large talking zones may be accomplished by having multiple microphones to span the talking zone; however, this may have a negative impact on the signal to noise ratio. It is desired that the available set of microphones be scanned to determine the best candidate for maximum speech reception. This may be based on short term averages of power or magnitude. An average magnitude estimation and subsequent comparison from two microphones is one implementation in a digital voice enhancement system.
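The average magnitude comparison mentioned above can be sketched as follows. The frame length and the function names are hypothetical; the sketch simply picks the microphone whose most recent frame has the largest short term average magnitude.

```python
def average_magnitude(frame):
    """Short term average magnitude of one frame of samples."""
    return sum(abs(s) for s in frame) / len(frame)


def select_microphone(frames):
    """Return the index of the microphone with the largest average magnitude.

    frames holds one equally long frame of recent samples per microphone.
    """
    return max(range(len(frames)), key=lambda i: average_magnitude(frames[i]))
```

For two microphones with frames `[0.1, -0.1, 0.1]` and `[0.4, -0.5, 0.3]`, the second microphone (index 1) is selected.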

As above noted, closed loop communication systems can become unstable whenever the total loop gain exceeds unity. Careful setting of the system gain, and acoustic echo cancellation may be used to ensure system stability. For various reasons such as high gain requirements, or less than ideal acoustic echo cancellation performance, acoustic feedback can occur. Acoustic feedback often occurs at a system resonance or where the free response is relatively undamped. These resonances usually occur at a very high Q, quality factor, and can be represented by a narrow band in the frequency domain. Therefore, the total system gain ceiling is determined by only a small portion of the communication system bandwidth, in essence limiting performance across all frequencies in the band for one or more narrow regions. In a desirable aspect, the system enables observation, measurement and treatment of persistent high Q system dynamics. These dynamics may relate to acoustic instabilities to be minimized. The observation of acoustic feedback can be performed in the frequency domain. Acoustic feedback is commonly observed as a screeching or howling burst of energy. The sound quality of this type of instability is beyond reverberation, echoes, or ringing, and is observable in the frequency domain by monitoring the power spectrum. Measurement of such a disturbance can be accomplished with a feedback detector, where the exact frequency and magnitude of the feedback can be quantified. Time domain based schemes such as autocorrelation could alternatively be applied to obtain similar measurements. Observation and measurement steps could be performed as a background task reducing real time digital signal processing requirements. Treatment follows by converting this feedback frequency information into notch filter coefficients that are implemented by a filter applied to the communication channel. The magnitude of the reduction, or depth of the notch filter's null, can be progressively applied or set to maximum attenuation as desired. Once the filter has been applied, the observed acoustic feedback should vanish; however, hysteresis should be applied in the measurement process to avoid cycling of the feedback reduction. Long term statistics of the feedback treatment process can be utilized for determining if the notch filter could be removed from the communication channel. Additionally, multiple notch filters may be connected in series to eliminate more complicated acoustic feedback situations often encountered in three dimensional sound fields.
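The observation, measurement, and treatment steps can be sketched as below. This is an illustration under stated assumptions: feedback is declared when the largest power spectrum bin exceeds the mean bin power by a hypothetical ratio, the spectrum is computed with a plain discrete Fourier transform for clarity, and the notch is a standard second-order (biquad) design with an assumed Q. Hysteresis and progressive notch depth are omitted.

```python
import cmath
import math


def detect_feedback(frame, sample_rate_hz, peak_to_mean_threshold=20.0):
    """Return the frequency of a dominant narrow band peak, or None."""
    n_samples = len(frame)
    powers = []
    for k in range(1, n_samples // 2):  # skip DC, stop below Nyquist
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n, x in enumerate(frame))
        powers.append(abs(acc) ** 2)
    mean_power = sum(powers) / len(powers)
    if mean_power == 0.0:
        return None
    peak_index = max(range(len(powers)), key=powers.__getitem__)
    if powers[peak_index] / mean_power < peak_to_mean_threshold:
        return None
    return (peak_index + 1) * sample_rate_hz / n_samples


def notch_coefficients(freq_hz, sample_rate_hz, q=30.0):
    """Biquad notch coefficients (b, a), normalized so that a[0] == 1."""
    w0 = 2.0 * math.pi * freq_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def gain_at(b, a, freq_hz, sample_rate_hz):
    """Magnitude response of the biquad at one frequency."""
    z = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    return abs((b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z))


# Illustrative howl at 1 kHz, sampled at 8 kHz, in a 256 sample frame.
howl = [math.sin(2.0 * math.pi * 32 * n / 256) for n in range(256)]
feedback_hz = detect_feedback(howl, 8000.0)
notch_b, notch_a = notch_coefficients(1000.0, 8000.0)
```

The detected frequency is converted directly into notch coefficients; several such notches could be cascaded in series for more than one feedback frequency.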

In

FIG. 7

, feedback detector

350

has an input

352

from microphone

36

, and an output

354

controlling an adjustable notch filter

356

filtering the output of microphone

36

supplied to loudspeaker

34

. Adjustable notch filter

356

has an input

358

from the output of microphone

36

. Feedback detector

350

has an input

352

from microphone

36

at a node

360

between the output of microphone

36

and the input

358

of adjustable notch filter

356

. Summer

90

has an input from the output of model

84

, an input from the output of model

120

, and an input from the output of adjustable notch filter

356

, and an output supplied to loudspeaker

34

. A second feedback detector

370

has an input

372

from microphone

38

, and an output

374

controlling a second adjustable notch filter

376

filtering the output of microphone

38

supplied to loudspeaker

32

. Adjustable notch filter

376

has an input

378

from microphone

38

at a node

380

between the output of microphone

38

and the input

378

of adjustable notch filter

376

. Summer

106

has an input from the output of model

100

, an input from the output of model

122

, and an input from the output of adjustable notch filter

376

. Summer

106

has an output supplied to loudspeaker

32

.

In a further aspect, a sine wave or multiple sine waves can be generated from the detected feedback frequency and serve as the reference to the electronic noise control filter. The ENC filter will form notches at the exact frequencies, and adjust its attenuation until the offending feedback tones are minimized to the level of the noise floor. The ENC filter is similar to a classical adaptive interference canceler application as discussed in

Adaptive Signal Processing

, Widrow and Stearns, Prentice-Hall, Inc., Englewood Cliffs, N.J. 07632, 1985, pages 316-323. The output of the filter is then subtracted from the microphone signal to remove the feedback component from the signal. The feedback suppression is performed before the acoustic echo cancellation.

In

FIG. 8

, an acoustic feedback tonal canceler

390

removes tonal feedback noise from the output of microphone

36

to prevent broadcast thereof by loudspeaker

34

. Feedback tonal canceler

390

includes a summer

392

having an input

394

from microphone

36

, an input

396

from feedback detector

398

and tone generator

400

supplied through adaptive filter model

402

, and an output

404

to loudspeaker

34

through summer

90

. Model

402

has a model input

406

from tone generator

400

, a model output

408

supplying a correction signal to summer input

396

, and an error input

410

from summer output

404

. A second feedback tonal canceler

420

is comparable to feedback tonal canceler

390

. Feedback tonal canceler

420

includes a summer

422

having an input

424

from microphone

38

, an input

426

from feedback detector

428

and tone generator

430

supplied through adaptive filter model

432

, and an output

434

supplied to loudspeaker

32

through summer

106

. Model

432

has a model input

436

from tone generator

430

, a model output

438

supplying a correction signal to summer input

426

, and an error input

440

from summer output

434

.

It is desirable for communication systems to be usable as soon as possible after activation. However, this cannot take place until the acoustic echo cancellation models have converged to an accurate solution so that the system may be used with appropriate gain. In a desirable aspect of the system, the acoustic echo cancellation models may be stored in memory and used immediately upon system start up. These models may need some minor correction to account for changes in occupant position, luggage loading, and temperature. These model corrections may be accomplished with quicker adaptation from the stored models rather than starting from null vectors, for example in accordance with U.S. Pat. No. 5,022,082, incorporated herein by reference.

FIG. 9

shows a simplex digital voice enhancement communication system

502

in accordance with the noted '511 application, including a first acoustic zone

504

, a second acoustic zone

506

, a first microphone

508

in the first zone, a first loudspeaker

510

in the first zone, a second microphone

512

in the second zone, and a second loudspeaker

514

in the second zone. A voice sensitive gated switch

516

has a first mode with switch element

516

a

closed and supplying the output of microphone

508

over a first channel

518

to loudspeaker

514

. Switch

516

has a second mode with switch element

516

b

closed and supplying the output of microphone

512

over a second channel

520

to loudspeaker

510

. The noted first and second modes are mutually exclusive such that only one of the channels

518

and

520

can be active at a time. In the first mode, switch element

516

a

is closed and switch element

516

b

is open such that the switch blocks, or at least substantially reduces, transmission from microphone

512

to loudspeaker

510

. In the second mode, switch element

516

b

is closed and switch element

516

a

is open to block or substantially reduce transmission from microphone

508

to loudspeaker

514

. Voice activity detectors or gates

522

and

524

have respective inputs from microphones

508

and

512

, for controlling operation of switch

516

. When switch

516

is in its first mode, with switch element

516

a

closed and switch element

516

b

open, the speech of person

526

in zone

504

can be heard by person

528

in zone

506

as broadcast by speaker

514

receiving the output of microphone

508

. The speech of person

528

and the output of speaker

514

as picked up by microphone

512

are not transmitted to speaker

510

because switch element

516

b

is open. Thus, there is no echo transmission of the voice of person

526

back through microphone

512

and speaker

510

, and hence no need to cancel same. This provides the above noted simplification in circuitry and processing otherwise required for echo cancellation. The same considerations apply in the noted second mode of switch

516

, with switch element

516

b

closed and switch element

516

a

open, wherein there is no rebroadcast by speaker

514

of the speech of person

528

and hence no echo and hence no need to cancel same. A suitable gate and switch combination

522

,

524

,

516

uses a short-time, average magnitude estimating function to detect if a voice signal is present in the respective channel. Other suitable estimating functions are disclosed in

Digital Processing of Speech Signals

, Lawrence R. Rabiner, Ronald W. Schafer, 1978, Bell Laboratories, Inc., Prentice-Hall, pp. 120-126, and also as noted in U.S. Pat. No. 5,706,344, incorporated herein by reference.
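The mutually exclusive switching of FIG. 9 can be sketched with a short-time average magnitude gate. The threshold, the mode names, and the rule for the case where both talkers are active (hold the current mode, else favor the louder channel) are assumptions of this sketch, as is the choice to output silence on the blocked channel.

```python
def average_magnitude(frame):
    """Short-time average magnitude of one frame of samples."""
    return sum(abs(s) for s in frame) / len(frame)


def select_mode(frame_508, frame_512, threshold, current_mode=None):
    """Pick the active channel: "first", "second", or None when both are idle.

    "first" routes microphone 508 to loudspeaker 514; "second" routes
    microphone 512 to loudspeaker 510. Only one mode is active at a time.
    """
    level_508 = average_magnitude(frame_508)
    level_512 = average_magnitude(frame_512)
    active_508 = level_508 >= threshold
    active_512 = level_512 >= threshold
    if active_508 and active_512:
        # Assumed tie rule: hold the current mode, else favor the louder talker.
        if current_mode in ("first", "second"):
            return current_mode
        return "first" if level_508 >= level_512 else "second"
    if active_508:
        return "first"
    if active_512:
        return "second"
    return None


def route(mode, sample_508, sample_512):
    """Return the (loudspeaker 510, loudspeaker 514) drive samples."""
    if mode == "first":
        return (0.0, sample_508)
    if mode == "second":
        return (sample_512, 0.0)
    return (0.0, 0.0)
```

Because at most one element of the returned pair is nonzero, the loudspeaker output picked up by the far microphone is never retransmitted, which is why no echo cancellation is needed in this arrangement.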

A first noise sensitive bandpass filter 530 and a first equalization filter 532 are provided in first channel 518. A second noise sensitive bandpass filter 534 and a second equalization filter 536 are provided in second channel 520. Noise sensitive bandpass filter 530 is a noise responsive highpass filter having a filter cutoff frequency effective at elevated noise levels and reducing bandwidth and making more gain available, to improve intelligibility of speech of person 526 transmitted from microphone 508 to loudspeaker 514, and as disclosed in the noted '874 application. Noise sensitive bandpass filter 534 is like filter 530 and is a noise responsive highpass filter having a filter cutoff effective at elevated noise levels and reducing bandwidth and making more gain available, to improve intelligibility or quality of speech of person 528 transmitted from microphone 512 to loudspeaker 510. Equalization filter 532 reduces resonance peaks in the acoustic transfer function between loudspeaker 514 and microphone 508 to reduce feedback by damping the resonance peaks. This is desirable because in various applications, including vehicle implementations where zone 506 is the back seat and zone 504 is the front seat, there may be acoustic coupling between speaker 514 and microphone 508. The resonance peaks may or may not be unstable, depending on total system gain. The equalization filter can take several forms including but not limited to graphic, parametric, inverse, adaptive, and as disclosed in U.S. Pat. Nos. 5,172,416, 5,396,561, 5,715,320, all incorporated herein by reference. The equalization filter may also take the form of a notch filter designed to selectively remove transfer function resonance peaks. Such a filter could be adaptive or determined offline based on the acoustic characteristics of a particular system. In one embodiment, equalization filter 532 is a set of one or more frequency selective notch filters determined from the acoustic transfer function between loudspeaker 514 in zone 506 and microphone 508 in zone 504. Equalization filter 536 is like filter 532 and reduces resonance peaks in the acoustic transfer function between loudspeaker 510 and microphone 512 to reduce feedback by damping resonance peaks.
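The notch filter form of the equalization filter can be illustrated with a minimal sketch. It assumes a standard second-order (biquad) notch design, which the source does not specify, and the 250 Hz resonance, Q of 8, and 8 kHz sample rate are hypothetical values chosen only for illustration. The function names are likewise hypothetical.

```python
import cmath
import math


def notch_coefficients(f0_hz, q, fs_hz):
    """Return normalized biquad notch coefficients (b, a) centered at f0_hz.

    Assumption: a conventional second-order notch; the source does not
    prescribe a particular filter topology.
    """
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def gain_at(b, a, f_hz, fs_hz):
    """Magnitude of the filter response at frequency f_hz."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f_hz / fs_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)


# Hypothetical resonance peak at 250 Hz in the loudspeaker-to-microphone path.
b, a = notch_coefficients(f0_hz=250.0, q=8.0, fs_hz=8000.0)
print(gain_at(b, a, 250.0, 8000.0))   # close to zero at the resonance
print(gain_at(b, a, 2000.0, 8000.0))  # close to unity away from it
```

A set of such sections, one per measured resonance peak, would correspond to the "one or more frequency selective notch filters" described above; whether the peaks are found adaptively or offline is left open, as in the text.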

In the above noted vehicle implementation, each of highpass filters 530 and 534 is vehicle speed sensitive, preferably by having an input from the vehicle speedometer 538. At higher vehicle speeds and resulting higher noise levels, lower frequency speech content is blocked and higher frequency speech content is passed, the lower frequency speech content being otherwise masked at higher speeds by broadband vehicle and wind noise, so that the reduced bandwidth and the absence of the lower frequency speech content do not sacrifice the perceived quality of speech. At lower vehicle speeds and resulting lower noise levels, the cutoff frequency of each of highpass filters 530 and 534 is lowered such that lower frequency speech content is passed, in addition to higher frequency speech content, to provide enriched low frequency performance, and overcome objections to a tinny sounding system. In vehicles having an in-cabin audio system, i.e. a radio and/or tape player and/or compact disc player and/or mobile phone, a digital voice enhancement activation switch 540 is provided for actuating and deactuating the voice sensitive gated switch 516, i.e. turning the latter on or off, and providing an audio mute signal muting, or reducing to some specified level, the in-cabin audio system as shown at radio mute 542.
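The speed dependence of the highpass cutoff described above can be sketched as a simple mapping from speedometer reading to cutoff frequency. The linear interpolation, the speed breakpoints, and the cutoff values below are all assumptions made for illustration; the source states only that the cutoff is raised at higher speeds and lowered at lower speeds.

```python
def highpass_cutoff_hz(speed_kph,
                       low_speed_kph=30.0, high_speed_kph=120.0,
                       low_cutoff_hz=100.0, high_cutoff_hz=500.0):
    """Map vehicle speed to a highpass cutoff frequency.

    All default values are hypothetical. Below low_speed_kph the cutoff
    stays at low_cutoff_hz (fuller low-frequency speech); above
    high_speed_kph it stays at high_cutoff_hz (masked low frequencies
    are blocked); in between it is interpolated linearly.
    """
    if speed_kph <= low_speed_kph:
        return low_cutoff_hz
    if speed_kph >= high_speed_kph:
        return high_cutoff_hz
    fraction = (speed_kph - low_speed_kph) / (high_speed_kph - low_speed_kph)
    return low_cutoff_hz + fraction * (high_cutoff_hz - low_cutoff_hz)


for speed in (0.0, 75.0, 200.0):
    print(speed, highpass_cutoff_hz(speed))
```

A stepped or table-driven mapping would serve equally well; the only property taken from the text is that the cutoff rises with speed.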

In one embodiment, equalization filter 532 is a first frequency responsive spectral transfer function, and equalization filter 536 is a second frequency responsive spectral transfer function, each for example as disclosed in above noted U.S. Pat. No. 5,715,320. The first frequency responsive spectral transfer function is a function of a model of the acoustic transfer function between loudspeaker 514 and microphone 508. The second frequency responsive spectral transfer function of filter 536 is a function of a model of the acoustic transfer function between loudspeaker 510 and microphone 512. In some embodiments, these first and second acoustic transfer functions are the same, e.g. where zones 504 and 506 are small, and in some implementations these first and second acoustic transfer functions are different. In one preferred form, the first frequency responsive spectral transfer function of filter 532 is the inverse of the noted first acoustic transfer function between loudspeaker 514 and microphone 508, for example as disclosed in above noted U.S. Pat. No. 5,715,320. Likewise, the noted second frequency responsive spectral transfer function of filter 536 is the inverse of the noted second acoustic transfer function between loudspeaker 510 and microphone 512, also as in above noted U.S. Pat. No. 5,715,320.
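The inverse form of the equalization filter can be illustrated on a per-frequency-bin basis. This is a minimal sketch that works on magnitudes only and assumes a modeled transfer function is already available as a list of bin magnitudes. The `max_boost` limit is an added assumption, not something stated in the source, included so that deep nulls in the model do not demand unbounded gain.

```python
def inverse_eq_gains(model_magnitudes, max_boost=10.0):
    """Return per-bin equalizer gains approximating 1 / |H| for a modeled
    loudspeaker-to-microphone transfer function H.

    max_boost is a hypothetical safety limit on the inverse gain.
    """
    gains = []
    for magnitude in model_magnitudes:
        if magnitude <= 0.0:
            gains.append(max_boost)
        else:
            gains.append(min(1.0 / magnitude, max_boost))
    return gains


# Hypothetical model: a resonance peak (4.0) and a deep null (0.01).
model = [0.5, 2.0, 4.0, 0.01]
gains = inverse_eq_gains(model)
equalized = [m * g for m, g in zip(model, gains)]
print(gains)
print(equalized)  # flat (1.0) wherever the boost limit was not reached
```

The resonance peak is attenuated to unity, which is the damping effect the text attributes to the equalization filter.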

The disclosed combination is simple and effective, and is particularly desirable because it enables use of available known components. By using a speed variable highpass filter in the communication channel, the digital voice enhancement system does not excite lower order cabin modes in vehicle implementations. The highpass filter also greatly reduces transmitted wind and road noises, which are a function of speed, improving the overall sound quality of the digital voice enhancement system. No losses in speech quality are perceived due to aural masking effects from the in-cabin noise. Secondly, the post-processing equalization filter minimizes resonance peaks in the total acoustic transfer function. This has the benefit of reducing the potential for feedback by damping resonance peaks, and also creating a more natural sounding reproduction of speech. The audio mute signal from activation switch 540 is desirable so that when the user selects the digital voice enhancement system, the in-cabin audio system, if present, is disabled, or its output significantly reduced, i.e. muted, as shown at radio mute 542. This prevents the digital voice enhancement system from detecting false information from the audio system and prevents distortions of the audio system by not allowing the digital voice enhancement system to rebroadcast the audio program.

FIG. 10 shows a DVE, digital voice enhancement, communication system in accordance with the present invention, and uses like reference numerals from above where appropriate to facilitate understanding. The system may be used in a duplex mode as in FIGS. 1-8, a simplex mode as in FIG. 9, and in other modes.

FIG. 10 shows a DVE system 550 having a plurality of microphones 508, 552, 554, 556, etc., and at least one loudspeaker 514, and other loudspeakers if desired such as 558, 560, etc. Each microphone has a respective gate 562, 564, 566, 568, etc., as above, and the microphone signals are supplied in parallel through respective SNNR ratio calculators 570, 572, 574, 576, to be described, and supplied in parallel to switch 578. As above described for gates 522, 524, a short-time average magnitude estimating function is used to detect if a voice signal is present in the respective channel, to provide a measure or function of the respective voice+noise signals 580, 582, 584, 586, etc. Other suitable estimating functions may be used as noted above and disclosed in Digital Signal Processing of Speech Signals, Lawrence R. Rabiner, Ronald W. Schafer, 1978, Bell Laboratories, Inc., Prentice-Hall, pages 120-126, and also as noted in U.S. Pat. No. 5,706,344, incorporated herein by reference. A longer-time average magnitude sensing function is used in the absence of voice activity detection, to create a measure or function of noise signals 588, 590, 592, 594, etc.
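The pairing of a short-time estimate for voice+noise with a longer-time estimate for noise can be sketched as two exponential averages of sample magnitude, with the slow average frozen while voice is detected. The class name, the smoothing constants, and the threshold-based voice detection rule are all assumptions for illustration; the source does not give the averaging constants or the detection rule.

```python
class ChannelLevels:
    """Track a fast voice+noise level and a slow noise level for one channel.

    Hypothetical parameters:
      fast_alpha  - smoothing constant of the short-time average
      slow_alpha  - smoothing constant of the longer-time average
      voice_ratio - fast/slow ratio above which voice is assumed present
    """

    def __init__(self, fast_alpha=0.3, slow_alpha=0.01, voice_ratio=2.0):
        self.fast_alpha = fast_alpha
        self.slow_alpha = slow_alpha
        self.voice_ratio = voice_ratio
        self.fast = None   # short-time average magnitude (voice+noise)
        self.noise = None  # longer-time average magnitude (noise only)
        self.voice_active = False

    def update(self, sample):
        magnitude = abs(sample)
        if self.fast is None:
            # Seed both averages from the first sample.
            self.fast = self.noise = max(magnitude, 1e-9)
            return self.voice_active
        self.fast += self.fast_alpha * (magnitude - self.fast)
        self.voice_active = self.fast > self.voice_ratio * self.noise
        if not self.voice_active:
            # The noise estimate is updated only in the absence of voice.
            self.noise += self.slow_alpha * (magnitude - self.noise)
        return self.voice_active

    def snnr(self):
        """Ratio of the voice+noise measure to the noise measure."""
        return self.fast / max(self.noise, 1e-9)


levels = ChannelLevels()
for n in range(500):                      # background noise only
    levels.update(0.01 if n % 2 == 0 else -0.01)
print(levels.voice_active, levels.snnr())
for n in range(50):                       # talker starts speaking
    levels.update(0.2 if n % 2 == 0 else -0.2)
print(levels.voice_active, levels.snnr())
```

With noise alone the ratio sits near unity; once the talker speaks, the fast average rises quickly while the held noise value stays in the denominator. A practical system would also need a rule for recovering when the background noise itself jumps, which this sketch does not attempt.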

Switch 578 selects which microphone to electrically couple to loudspeaker 514, and to any other loudspeaker if desired, so that a listener at loudspeaker 514 can hear the speech of a talker at the selected microphone. The selection decision is based on a given function of the speech of a respective talker relative to his/her acoustic environment at the respective microphone. The selection decision is based on a selection technique normalizing at least one and preferably both of a) different microphone sensitivities and b) different background noise levels at the respective microphones. This is accomplished by calculators 570, 572, 574, 576, etc. Calculator 570 determines the ratio

$$\mathrm{SNNR} = \frac{f(\text{voice} + \text{noise})}{f(\text{noise})}$$

where SNNR is the ratio of speech+noise to noise, and f is a given function thereof, preferably average magnitude, average power (magnitude²), or peak hold with a given decay rate, and outputs an SNNR signal 580. The remaining calculators likewise determine the respective ratio for the respective inputs and output SNNR signals 582, 584, 586, etc. The switching decision by switch 578 is based on the largest of the SNNR signals. Switch 578 electrically couples the loudspeaker to the respective selected microphone. The selection decision is based on the ratio of how much louder a talker speaks over the background noise at his/her respective microphone.
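The ratio calculation and the largest-ratio switching decision can be sketched as follows, with the three named choices for f (average magnitude, average power, peak hold with decay) written as interchangeable functions. The function names, the frame-based interface, the decay value, and the small floor on the denominator are assumptions for illustration.

```python
def average_magnitude(frame):
    return sum(abs(x) for x in frame) / len(frame)


def average_power(frame):
    return sum(x * x for x in frame) / len(frame)


def peak_hold(frame, decay=0.99):
    """Peak magnitude that decays by a fixed factor per sample (hypothetical rate)."""
    peak = 0.0
    for x in frame:
        peak = max(abs(x), peak * decay)
    return peak


def snnr(voice_plus_noise_frame, noise_frame, f=average_magnitude):
    """SNNR = f(voice+noise) / f(noise), with a floor to avoid dividing by zero."""
    return f(voice_plus_noise_frame) / max(f(noise_frame), 1e-12)


def select_channel(snnr_values):
    """Index of the channel with the largest SNNR."""
    return max(range(len(snnr_values)), key=lambda i: snnr_values[i])


speech = [0.4, -0.4, 0.4, -0.4]   # hypothetical voice+noise samples
noise = [0.1, -0.1, 0.1, -0.1]    # hypothetical noise-only samples
print(snnr(speech, noise))                    # magnitude ratio
print(snnr(speech, noise, f=average_power))   # power ratio
print(select_channel([1.0, 4.0, 2.5]))        # channel 1 has the largest SNNR
```

Because both numerator and denominator come from the same microphone, a fixed sensitivity difference between microphones scales both and cancels in the ratio, which is the normalizing effect the text describes.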

As an example, if a first talker and his microphone 508 were in a library, and a second talker and his microphone 552 were in a car on a cell phone, the background noise alone in the car might be louder than the first talker's voice plus the background noise in the library, and hence, if selection were based on overall sound level alone, microphone 552 would always be selected, even if the first talker at microphone 508 was talking. If the second talker is also talking, the addition of his voice to the background noise in the car even further increases the sound level thereat, and further reduces the chances of the first talker ever being selected. In contrast, in the present invention, with the normalizing effect of the SNNR ratio, the selection decision is based on the ratio of how much louder the talker speaks over the background noise at his/her respective microphone. The talker in the library does not have to shout as loud as the talker in the car, nor shout over the background noise in the car, to have his microphone chosen to be active because it is not the overall voice+noise power which is used for the selection decision, but rather the ratio of voice+noise to noise, i.e. SNNR as noted above. The noted time average functions for the microphones are selected such that the addition of the talker's voice to the background noise signal is quickly recognized to provide the voice+noise signal 580 as the numerator to the calculator 570, at which time the most recent noise value from the slower time averaging signal 588 is used for the denominator of the SNNR ratio. When the voice+noise signal 580 falls, the slower longer-time averaging is used to monitor noise signal 588, with the resulting SNNR ratio being approximately unity, awaiting the next voice activated fast averaging rise of signal 580.
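The library and car example can be put into numbers. The levels below are hypothetical and chosen only to reproduce the situation described: the car's noise alone exceeds the library talker's voice plus noise. The sketch compares a selection based on overall level with a selection based on SNNR.

```python
def select_by_level(voice_plus_noise):
    """Pick the channel with the largest overall level (no normalization)."""
    return max(range(len(voice_plus_noise)), key=lambda i: voice_plus_noise[i])


def select_by_snnr(voice_plus_noise, noise):
    """Pick the channel with the largest voice+noise to noise ratio."""
    ratios = [vn / n for vn, n in zip(voice_plus_noise, noise)]
    return max(range(len(ratios)), key=lambda i: ratios[i])


LIBRARY, CAR = 0, 1

# Hypothetical average magnitudes, index 0 = library, index 1 = car.
noise = [0.01, 0.20]
both_talking = [0.05, 0.30]        # each talker adds to the local noise
only_library_talks = [0.05, 0.20]  # the car channel carries noise alone

print(select_by_level(only_library_talks))        # 1: the car wins on level
print(select_by_snnr(only_library_talks, noise))  # 0: library SNNR 5.0 vs 1.0
print(select_by_snnr(both_talking, noise))        # 0: library SNNR 5.0 vs 1.5
```

With these figures the level-based rule never selects the library microphone, while the SNNR rule selects it whenever the library talker rises further above his own background than the car talker does above his.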

It is recognized that various equivalents, alternatives and modifications are possible within the scope of the appended claims.

US Referenced Citations

Number	Name	Date	Kind
4359602	Ponto et al.	Nov 1982	A
4602337	Cox	Jul 1986	A
4658425	Julstrom	Apr 1987	A
4677676	Eriksson	Jun 1987	A
5022082	Eriksson et al.	Jun 1991	A
5033082	Eriksson et al.	Jul 1991	A
5111508	Gale et al.	May 1992	A
5172416	Allie et al.	Dec 1992	A
5216722	Popovich	Jun 1993	A
5243659	Stafford et al.	Sep 1993	A
5355419	Yamamoto et al.	Oct 1994	A
5386477	Popovich et al.	Jan 1995	A
5396561	Popovich et al.	Mar 1995	A
5533120	Staudacher	Jul 1996	A
5544242	Robinson	Aug 1996	A
5557682	Warner et al.	Sep 1996	A
5561598	Nowak et al.	Oct 1996	A
5586189	Allie et al.	Dec 1996	A
5590205	Popovich	Dec 1996	A
5602928	Eriksson et al.	Feb 1997	A
5602929	Popovich	Feb 1997	A
5621803	Laak	Apr 1997	A
5627747	Melton et al.	May 1997	A
5633936	Oh	May 1997	A
5673327	Julstrom	Sep 1997	A
5680337	Pedersen et al.	Oct 1997	A
5706344	Finn	Jan 1998	A
5710822	Steenhagen et al.	Jan 1998	A
5715320	Allie et al.	Feb 1998	A
5940486	Schlaff	Aug 1999	A
6031918	Chahabadi	Feb 2000	A

Foreign Referenced Citations

Number	Date	Country
0568129	Nov 1993	EP
0721178	Jul 1996	EP
