Claims
- 1. A voice presence/absence discriminator for dividing an input voice signal into unitary base frames each of which corresponds to a prescribed time period and for discriminating between the voice presence and voice absence in each base frame, said discriminator comprising:
- voice signal generation means for generating said input voice signal;
- frame generation means for dividing said input voice signal into a plurality of base frames and for dividing each of said base frames into a plurality of sub-frames;
- sub-frame power calculation means for calculating respective sub-frame electric powers of said sub-frames;
- frame maximum power production means for determining a frame maximum power of each of said base frames to be a maximum one of values related to sub-frame powers corresponding to a respective base frame;
- background noise power estimation means for estimating a background noise electric power based on a plurality of consecutive sub-frame electric powers that include a most recent sub-frame power; and
- voice presence/absence discrimination means for discriminating between a voice presence condition and a voice absence condition of said input voice signal for each base frame based on a difference between said frame maximum power and said background noise power.
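The arrangement recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the use of decibels, the plain average used as the noise estimate, and the single threshold `threshold_db` are all assumptions made for illustration.

```python
import math

def subframe_powers(samples, subframe_len):
    """Mean-square power of each complete sub-frame, in dB (dB scale is an assumption)."""
    powers = []
    for start in range(0, len(samples) - subframe_len + 1, subframe_len):
        chunk = samples[start:start + subframe_len]
        mean_square = sum(x * x for x in chunk) / subframe_len
        # Small offset avoids log10(0) on digital silence.
        powers.append(10.0 * math.log10(mean_square + 1e-12))
    return powers

def classify_frames(powers, subframes_per_frame, noise_window, threshold_db):
    """Return one bool (True = voice present) per complete base frame."""
    decisions = []
    for end in range(subframes_per_frame, len(powers) + 1, subframes_per_frame):
        frame = powers[end - subframes_per_frame:end]
        frame_max = max(frame)  # frame maximum power
        # Assumed noise estimate: average over the most recent sub-frame
        # powers, including the newest one.
        recent = powers[max(0, end - noise_window):end]
        noise = sum(recent) / len(recent)
        decisions.append(frame_max - noise >= threshold_db)
    return decisions

# Two quiet base frames followed by one loud base frame.
signal = [0.01] * 32 + [1.0] * 16
powers = subframe_powers(signal, subframe_len=4)
print(classify_frames(powers, subframes_per_frame=4, noise_window=12, threshold_db=6.0))
```

With these illustrative values only the third base frame is flagged as containing voice, because its maximum sub-frame power stands well above the running noise estimate.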
- 2. The discriminator of claim 1, wherein said frame maximum power production means comprises:
- short-period average value calculation means for, each time a sub-frame power is calculated by said sub-frame power calculation means, calculating, based on a prescribed number of consecutive sub-frame electric powers, which are smaller in number than a number of sub-frames into which base frames are divided, and which include a most recent sub-frame power having been calculated by said sub-frame power calculation means, short-period average values each of which is an electric power average value of the prescribed number of consecutive sub-frame electric powers;
- wherein said frame maximum power production means determines a maximum one of the short-period average values to be said frame maximum power.
- 3. The discriminator of claim 2, wherein said background noise power estimation means comprises:
- long-period average value calculation means for, each time a sub-frame power is calculated by said sub-frame power calculation means, calculating, based on a prescribed number of consecutive sub-frame electric powers, which are larger in number than a number of sub-frames into which base frames are divided, and which include a most recent sub-frame power having been calculated by said sub-frame power calculation means, long-period average values each of which is an electric power average value of the prescribed number of consecutive sub-frame electric powers; and
- selection means for determining as a background noise power of each base frame a minimum one of long-period average values of sub-frames corresponding to a respective base frame, which have been calculated by the long-period average value calculation means.
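The short-period and long-period averaging of claims 2 and 3 can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, the averages are plain arithmetic means, and windows that reach back before the start of the signal are simply shortened.

```python
def trailing_averages(powers, window):
    """Average of up to `window` most recent powers, one value per sub-frame."""
    averages = []
    for i in range(len(powers)):
        recent = powers[max(0, i - window + 1):i + 1]
        averages.append(sum(recent) / len(recent))
    return averages

def per_frame(values, subframes_per_frame, pick):
    """Apply `pick` (max or min) to the values of each complete base frame."""
    last_start = len(values) - subframes_per_frame
    return [pick(values[s:s + subframes_per_frame])
            for s in range(0, last_start + 1, subframes_per_frame)]

def frame_max_and_noise(powers, subframes_per_frame, short_window, long_window):
    """Frame maximum power (max of short averages) and noise (min of long averages)."""
    if not 0 < short_window < subframes_per_frame < long_window:
        raise ValueError("expected short_window < subframes_per_frame < long_window")
    frame_max = per_frame(trailing_averages(powers, short_window),
                          subframes_per_frame, max)
    noise = per_frame(trailing_averages(powers, long_window),
                      subframes_per_frame, min)
    return frame_max, noise

powers = [0, 0, 0, 0, 0, 12, 12, 0]
print(frame_max_and_noise(powers, subframes_per_frame=4, short_window=2, long_window=6))
```

The short window reacts quickly to a burst of energy, while the minimum of the long-window averages stays near the quiet level, so the difference between the two grows when speech begins.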
- 4. The discriminator of claim 1, wherein said background noise power estimation means comprises:
- long-period average value calculation means for, each time a sub-frame power is calculated by said sub-frame power calculation means, calculating, based on a prescribed number of consecutive sub-frame electric powers, which are larger in number than a number of sub-frames into which base frames are divided, and which include a most recent sub-frame power having been calculated by said sub-frame power calculation means, long-period average values each of which is an electric power average value of the prescribed number of consecutive sub-frame electric powers; and
- selection means for determining as a background noise power of each base frame a minimum one of long-period average values of sub-frames corresponding to a respective base frame, which have been calculated by the long-period average value calculation means.
- 5. The discriminator of claim 1, further comprising:
- parameter extraction means for performing linear predictive analysis on said input voice signal in units of base frames to thereby extract a characteristic parameter that represents a characteristic of a frequency spectrum envelope of said input voice signal;
- wherein said voice presence/absence discrimination means includes
- first determination means for determining a base frame wherein a difference between a frame maximum power and a background noise power thereof is not smaller than a prescribed first threshold value to be a voice presence frame and for determining a base frame wherein the difference therebetween is not greater than a prescribed second threshold value that is smaller than said first threshold value to be a voice absence frame, and
- second determination means for, when said difference therebetween is smaller than said first threshold value and greater than said second threshold value, performing a determination of said voice presence condition and said voice absence condition based on said characteristic parameter extracted by said parameter extraction means.
- 6. The discriminator of claim 5, wherein said characteristic parameter extracted by said parameter extraction means is a lower-order reflection coefficient.
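The two-threshold decision of claims 5 and 6 can be sketched as below. The function name, the parameter names and the numeric values are hypothetical. The fallback rule for the ambiguous band (first-order reflection coefficient above a level means voice) follows the direction recited in claim 18 and is otherwise an assumption.

```python
def classify_frame(frame_max, noise, upper, lower, k1, k1_level):
    """Return True for a voice presence frame, False for a voice absence frame.

    upper / lower are the first and second threshold values (lower < upper);
    k1 is a first-order reflection coefficient used only in the ambiguous band.
    """
    if lower >= upper:
        raise ValueError("second threshold must be smaller than first threshold")
    difference = frame_max - noise
    if difference >= upper:
        return True   # clearly above the noise floor
    if difference <= lower:
        return False  # clearly at the noise floor
    # Ambiguous band: decide from the spectral-envelope parameter (assumed rule).
    return k1 > k1_level

print(classify_frame(frame_max=19.0, noise=10.0, upper=12.0, lower=6.0,
                     k1=0.9, k1_level=0.5))
```

Only frames whose power difference falls between the two thresholds consult the reflection coefficient, so the spectral test never overrides a clear power-based decision.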
- 7. The discriminator of claim 1, further comprising period determination means for, of voice presence frames that have been so determined by said voice presence/absence discrimination means and voice absence frames that have been so determined thereby, determining a voice presence frame and a prescribed, or smaller than prescribed, number of voice absence frames that consecutively succeed said voice presence frame to be a voice presence period and determining voice absence frames that further consecutively succeed the prescribed number of voice absence frames to be a voice absence period.
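The period determination of claim 7 behaves like a hangover counter, which can be sketched as follows. The function name and the `hangover_frames` parameter are hypothetical, and the sketch assumes the per-frame decisions are already available as booleans.

```python
def apply_hangover(frame_decisions, hangover_frames):
    """Extend each voice presence frame over the next `hangover_frames` absence frames."""
    periods = []
    remaining = 0
    for voiced in frame_decisions:
        if voiced:
            remaining = hangover_frames  # restart the hangover after every voiced frame
            periods.append(True)
        elif remaining > 0:
            remaining -= 1               # absence frame still inside the hangover
            periods.append(True)
        else:
            periods.append(False)        # absence frame beyond the hangover
    return periods

print(apply_hangover([False, True, False, False, False, True, False], hangover_frames=2))
# prints [False, True, True, True, False, True, True]
```

Holding the voice presence period open for a few frames avoids clipping weak word endings that fall below the power thresholds.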
- 8. A voice presence/absence discriminator for dividing an input voice signal into unitary base frames each of which corresponds to a prescribed time period and for discriminating between the voice presence and voice absence in each base frame, said discriminator comprising:
- voice signal generation means for generating said input voice signal;
- frame generation means for dividing said input voice signal into a plurality of base frames and for dividing each of said base frames into a plurality of sub-frames;
- sub-frame power calculation means for calculating respective sub-frame electric powers of said sub-frames;
- voice presence/absence discrimination means for determining a base frame to be a voice presence frame if a value representative of sub-frame powers of sub-frames of said base frame exceeds a specified parameter,
- wherein said background noise power estimation means comprises:
- long-period average value calculation means for, each time a sub-frame power is calculated by said sub-frame power calculation means, calculating, based on a prescribed number of consecutive sub-frame electric powers, which are larger in number than a number of sub-frames into which base frames are divided, and which include a most recent sub-frame power having been calculated by said sub-frame power calculation means, long-period average values each of which is an electric power average value of the prescribed number of consecutive sub-frame electric powers; and
- selection means for determining as a background noise power of each base frame a minimum one of long-period average values of sub-frames corresponding to a respective base frame, which have been calculated by the long-period average value calculation means; and
- reference value setting means for setting said specified parameter based on a selected background noise power.
- 9. A method of detecting a voice presence condition of an electrical signal, said method comprising the steps of:
- dividing said signal into a plurality of base frames;
- dividing each of said base frames into a plurality of sub-frames;
- calculating power parameters representative of powers of said sub-frames;
- determining a voice presence condition in a portion of said signal corresponding to a base frame in which one of said power parameters exceeds a first given level;
- determining a background noise power level of said signal, said background noise power level being estimated based on a plurality of consecutive sub-frame electric powers that include a most recent sub-frame power; and
- setting said first given level based on said background noise power level.
- 10. The method of claim 9, wherein said background noise power level determining step comprises the steps of:
- calculating a plurality of moving averages of said sub-frame powers; and
- selecting a minimum value in said plurality of moving averages as said background noise power level.
- 11. The method of claim 10, wherein a number of sub-frame powers averaged in each of said plurality of moving averages is greater than a number of sub-frames into which each of said base frames is divided.
- 12. The method of claim 9, said power parameter calculating step comprising the steps of:
- calculating a plurality of moving averages of said sub-frame powers; and
- selecting a maximum value in said plurality of moving averages as a power parameter corresponding to a base frame containing said averaged sub-frame powers.
- 13. The method of claim 12, wherein a number of sub-frame powers averaged in each of said plurality of moving averages is less than a number of sub-frames into which each of said base frames is divided.
- 14. The method of claim 9, further comprising the step of determining a voice absence condition in a portion of said signal corresponding to a base frame in which one of said power parameters exceeds a second given level.
- 15. The method of claim 14, further comprising the steps of:
- determining a background noise power level of said signal; and
- setting said second given level based on said background noise power level.
- 16. The method of claim 15, wherein said background noise power level determining step comprises the steps of:
- calculating a plurality of moving averages of said sub-frame powers; and
- selecting a minimum value in said plurality of moving averages as said background noise power level.
- 17. The method of claim 16, wherein a number of sub-frame powers averaged in each of said plurality of moving averages is greater than a number of sub-frames into which each of said base frames is divided.
- 18. The method of claim 9, further comprising the steps of:
- calculating a first-order reflection coefficient of said base frame; and
- determining a voice presence condition in a portion of said signal corresponding to said base frame when said first-order reflection coefficient is greater than a second given level.
- 19. The method of claim 9, further comprising the steps of:
- calculating a second-order reflection coefficient of said base frame; and
- determining a voice presence condition in a portion of said signal corresponding to said base frame when said second-order reflection coefficient is less than a second given level.
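The first-order and second-order reflection coefficients referred to in claims 18 and 19 can be obtained from a base frame with the Levinson-Durbin recursion, sketched below. This is an illustration only: the claims do not specify the computation, the function names are hypothetical, and the sign convention (first coefficient equal to r[1]/r[0]) is an assumption, since conventions differ between texts.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one base frame."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]

def reflection_coefficients(frame, order=2):
    """Reflection coefficients k_1..k_order via Levinson-Durbin (k_1 = r[1]/r[0])."""
    r = autocorrelation(frame, order)
    ks = [0.0] * order
    if r[0] == 0:
        return ks                      # silent frame: no spectral information
    a = [0.0] * (order + 1)            # predictor coefficients a[1..m]
    err = r[0]                         # prediction error energy
    for m in range(1, order + 1):
        if err <= 0:
            break                      # perfectly predictable frame, stop early
        k = (r[m] - sum(a[i] * r[m - i] for i in range(1, m))) / err
        ks[m - 1] = k
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] - k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return ks

k1, k2 = reflection_coefficients([1.0] * 8)
print(k1, k2)
```

Under this convention a slowly varying frame gives a first-order coefficient close to +1, while a frame that alternates sign from sample to sample gives a negative one.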
Priority Claims (1)
Number | Date | Country | Kind
7-312814 | Nov 1995 | JPX |
CROSS-REFERENCE TO RELATED APPLICATION
The present application is related to and claims priority from Japanese Patent Application No. Hei 7-312814, incorporated herein by reference.
US Referenced Citations (4)
Number | Name | Date | Kind
4351983 | Crouse et al. | Sep 1982 |
5016205 | Shumway | May 1991 |
5649055 | Gupta et al. | Jul 1997 |
5664052 | Nishiguchi et al. | Sep 1997 |
Foreign Referenced Citations (2)
Number | Date | Country
5-323996 | Dec 1993 | JPX
6-75599 | Mar 1994 | JPX