Claims
- 1. A method for detecting speech in a vocoded signal, comprising the steps of:
- receiving a vocoded signal having a succession of frames, each frame containing audio information and a corresponding frame energy value;
- calculating a staggered average value derived from the frame energy value by:
- comparing a current frame energy value with a present staggered average value;
- if the current frame energy value is greater than the present staggered average value, setting the staggered average value equal to the current frame energy value; and
- if the current frame energy value is less than the present staggered average value, calculating a current staggered average value by reducing the present staggered average value by an averaging factor;
- providing a threshold voice indicator value; and
- declaring speech present when the staggered average value is greater than the threshold voice indicator value.
- 2. A method for detecting speech as defined in claim 1, wherein in the step of calculating, the averaging factor has a form of y(n) = a·y(n-1) + (1-a)·x(n), where:
- y(n) is the current staggered average value;
- a is a scaling factor having a value from zero to one;
- y(n-1) is the present staggered average value; and
- x(n) is the current frame energy value.
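The staggered average of claims 1 and 2 can be read as a peak-hold with exponential decay. The following is a minimal sketch under stated assumptions, not the patented implementation: the function names, the initial average of 0.0, and the default a = 0.9 are illustrative and do not come from the claims.

```python
def update_staggered_average(prev_avg: float, frame_energy: float, a: float = 0.9) -> float:
    """Return y(n) given y(n-1) = prev_avg and x(n) = frame_energy.

    The scaling factor a (assumed default 0.9) must lie in [0, 1].
    """
    if frame_energy > prev_avg:
        # Rising energy: jump straight to the new frame energy.
        return frame_energy
    # Falling (or equal) energy: decay toward the frame energy,
    # y(n) = a*y(n-1) + (1-a)*x(n).
    return a * prev_avg + (1.0 - a) * frame_energy


def detect_speech(frame_energies, threshold: float, a: float = 0.9):
    """Return one boolean per frame: True where speech is declared present."""
    avg = 0.0  # assumed starting value; the claims do not specify one
    flags = []
    for energy in frame_energies:
        avg = update_staggered_average(avg, energy, a)
        flags.append(avg > threshold)
    return flags
```

The claims do not say what happens when the frame energy equals the present average; in this sketch that case falls into the decay branch, where the formula returns the same value either way.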
- 3. A method for detecting speech as defined in claim 2, wherein in the step of calculating, the scaling factor has a value dependent on the current frame energy value.
- 4. A method for detecting speech as defined in claim 3, wherein in the step of calculating, the value of the scaling factor is dependent on a range of the current frame energy value.
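Claims 3 and 4 make the scaling factor depend on the range in which the current frame energy falls. A hedged sketch of one way to do that follows; the breakpoints (100 and 1000) and the factor values (0.95, 0.90, 0.80) are hypothetical placeholders, since the claims give no numbers.

```python
def scaling_factor_for(frame_energy: float) -> float:
    """Pick the scaling factor a from the energy range (hypothetical ranges)."""
    if frame_energy < 100.0:
        return 0.95  # assumed: low energy, slow decay
    if frame_energy < 1000.0:
        return 0.90  # assumed: mid energy
    return 0.80      # assumed: high energy, faster decay


def update_staggered_average_ranged(prev_avg: float, frame_energy: float) -> float:
    """Staggered-average update with a range-dependent scaling factor."""
    if frame_energy > prev_avg:
        return frame_energy
    a = scaling_factor_for(frame_energy)
    return a * prev_avg + (1.0 - a) * frame_energy
```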
- 5. A method for detecting speech as defined in claim 1, wherein the vocoded signal comprises a voicing value with each frame, and wherein, in the step of calculating the staggered average value, the staggered average value is the product of the frame energy value and the voicing value.
- 6. A method for detecting speech as defined in claim 5, wherein the step of calculating a staggered average comprises:
- comparing a product of a current frame energy value and a current voicing value with a present staggered average value;
- if the product is greater than the present staggered average value, setting the staggered average value equal to the product; and
- if the product is less than the present staggered average value, calculating a current staggered average value by reducing the present staggered average value by an averaging factor.
- 7. A method for detecting speech as defined in claim 6, wherein in the step of calculating, the averaging factor has the form of y(n) = a·y(n-1) + (1-a)·x(n), where:
- y(n) is the current staggered average value;
- a is a scaling factor having a value from zero to one;
- y(n-1) is the present staggered average value; and
- x(n) is the product of the current frame energy value and the current voicing value.
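In claims 5 through 7 the input x(n) is the product of the frame energy and the voicing value. A short sketch of that variant is given below; it assumes the voicing value is a non-negative number where 0 means unvoiced (the claims do not define its scale), and the default a = 0.9 is likewise an assumption.

```python
def update_voiced_staggered_average(prev_avg: float, frame_energy: float,
                                    voicing: float, a: float = 0.9) -> float:
    """Staggered-average update where x(n) = frame_energy * voicing."""
    x = frame_energy * voicing  # voicing-weighted energy
    if x > prev_avg:
        # The weighted energy exceeds the present average: take it directly.
        return x
    # Otherwise decay: y(n) = a*y(n-1) + (1-a)*x(n).
    return a * prev_avg + (1.0 - a) * x
```

Weighting by the voicing value means an energetic but unvoiced frame (voicing of 0 under this assumption) cannot raise the average, so it only decays.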
- 8. A method for detecting speech as defined in claim 6, wherein in the step of calculating, the scaling factor has a value dependent on the current frame energy value.
- 9. A method for detecting speech as defined in claim 8, wherein in the step of calculating, the value of the scaling factor is dependent on a range of the current frame energy value.
- 10. A method for detecting speech as defined in claim 1, wherein in the step of declaring speech, the threshold voice indicator value is a constant value.
- 11. A method for detecting speech as defined in claim 1, wherein the step of providing a threshold voice indicator value comprises calculating a running average of the frame energy when the staggered average value is below a previous threshold voice indicator value and a voicing value corresponding to the frame energy value indicates an unvoiced frame.
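The adaptive threshold of claim 11 can be sketched as a running average of frame energy that is updated only while the staggered average is below the previous threshold and the frame is unvoiced. This is an illustration under several assumptions that the claim does not state: a voicing value of 0 marks an unvoiced frame, the running average uses a smoothing constant b, and the threshold is that average times a hypothetical margin.

```python
def update_threshold(prev_threshold: float, noise_avg: float,
                     staggered_avg: float, frame_energy: float, voicing: float,
                     b: float = 0.95, margin: float = 2.0):
    """Return (new_threshold, new_noise_avg) for the current frame.

    b and margin are assumed tuning constants, not values from the claims.
    """
    is_unvoiced = voicing == 0  # assumed encoding of an unvoiced frame
    if staggered_avg < prev_threshold and is_unvoiced:
        # Likely background noise: fold this frame into the running average.
        noise_avg = b * noise_avg + (1.0 - b) * frame_energy
        return margin * noise_avg, noise_avg
    # Otherwise hold the threshold and the running average unchanged.
    return prev_threshold, noise_avg
```

Holding the threshold during voiced or high-energy frames keeps speech itself from inflating the estimate that the threshold is built on.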
- 12. A method for detecting speech in a vocoded signal, comprising the steps of:
- receiving a vocoded signal having a succession of frames, each frame containing audio information and a corresponding frame energy value and a voicing value;
- calculating a staggered average value derived from a product of the frame energy value and the voicing value by:
- comparing a current frame energy value with a present staggered average value;
- if the current frame energy value is greater than the present staggered average value, setting the staggered average value equal to the current frame energy value; and
- if the current frame energy value is less than the present staggered average value, calculating a current staggered average value by reducing the present staggered average value by an averaging factor;
- providing a threshold voice indicator value; and
- declaring speech present when the staggered average value is greater than the threshold voice indicator value.
- 13. A method for detecting speech as defined in claim 12, wherein in the step of calculating, the averaging factor has the form of y(n) = a·y(n-1) + (1-a)·x(n), where:
- y(n) is the current staggered average value;
- a is a scaling factor having a value from zero to one;
- y(n-1) is the present staggered average value; and
- x(n) is the product of the current frame energy value and the current voicing value.
- 14. A method for detecting speech as defined in claim 13, wherein in the step of calculating, the scaling factor has a value dependent on the current frame energy value.
- 15. A method for detecting speech as defined in claim 14, wherein in the step of calculating, the value of the scaling factor is dependent on a range of the current frame energy value.
- 16. A method for detecting speech as defined in claim 14, wherein in the step of declaring speech, the threshold voice indicator value is a constant value.
- 17. A method for detecting speech as defined in claim 14, wherein the step of providing a threshold voice indicator value comprises calculating a running average of the frame energy when the staggered average value is below a previous threshold voice indicator value and a voicing value corresponding to the frame energy value indicates an unvoiced frame.
Parent Case Info
This application is related to the co-pending application entitled "Method For Suppressing Speaker Activation In A Portable Communication Device Operated In A Speakerphone Mode", having U.S. patent application Ser. No. 09/127,692; to the co-pending application entitled "A Method For Selectively Including Leading Fricative Sounds In A Portable Communication Device Operated In A Speakerphone Mode", having U.S. patent application Ser. No. 09/127,536; and to the co-pending application entitled "Method And Apparatus For Providing Speakerphone Operation In A Portable Communication Device", having U.S. patent application Ser. No. 09/127,348, all of said applications being commonly assigned with the present application and filed on even date herewith.