Claims
- 1. A method for identifying audio content, said method comprising the steps of:
obtaining an audio signal characterized by a time dependent power spectrum; analyzing the spectrum to obtain a plurality of time dependent frequency components; and detecting a plurality of events in the plurality of time dependent frequency components.
- 2. The method according to claim 1, wherein the detecting step includes the sub-step of detecting a plurality of extrema in the plurality of time dependent frequency components.
- 3. The method according to claim 1, further comprising the steps of:
detecting a set of events occurring approximately simultaneously in a set of adjacent time dependent frequency components; and selecting a subset of the set of events for further processing.
- 4. The method according to claim 1, further comprising the step of determining a time dependent frequency component power corresponding to each event.
- 5. The method according to claim 1, wherein the analyzing step includes the sub-steps of:
sampling the audio signal to obtain a plurality of audio signal samples; taking a plurality of subsets from the plurality of audio signal samples; and performing a Fourier transform on each of the plurality of subsets to obtain a set of Fourier frequency components.
- 6. The method according to claim 5, wherein the analyzing step further includes the sub-step of averaging together corresponding Fourier frequency components obtained from two or more successive subsets selected from the plurality of subsets.
- 7. The method according to claim 6, wherein the analyzing step further includes the sub-step of collecting Fourier frequency components into a plurality of semitone frequency bands.
- 8. The method according to claim 1, wherein the detecting step includes the sub-steps of:
taking a first running average with a first averaging period of each of a first subset of the plurality of time dependent frequency components to obtain a first sequence of average powers at a set of successive times for each of the plurality of time dependent frequency components; taking a second running average with a second averaging period that is different from the first averaging period of each of the subset of the plurality of time dependent frequency components to obtain a second sequence of average powers at the set of successive times for each of the plurality of time dependent frequency components; and identifying a plurality of average crossing events at a plurality of event times at which the first running average crosses the second running average.
- 9. The method according to claim 8,
wherein the first averaging period is between 1/10 of a second and 1 second, and the second averaging period is from 2 to 8 times as long as the first averaging period.
- 10. The method according to claim 1, further comprising the step of collecting the plurality of events in a plurality of time groups each of which covers an interval of time.
- 11. The method according to claim 10, further comprising the step of:
in response to detecting each event in each of the plurality of time dependent frequency components, selecting one or more combinations of events from a plurality of events that occurred within a number of time groups, and within a number of time dependent frequency components.
- 12. The method according to claim 11, wherein the selecting step includes the sub-step of selecting one or more combinations of events from a plurality of events that occurred within a number of time groups, and within a number of time dependent frequency components, taking only one event at a time from each time group.
- 13. The method according to claim 11, further comprising the step of forming a plurality of keys from the one or more combinations each of which comprises a time to be associated with the combination of events, and a key sequence including information about each event in the combination.
- 14. A method for forming an identifying feature of a portion of a recording of audio signals, said method comprising the steps of:
performing a Fourier transformation of the audio signals of the portion into a time series of audio power dissipated over a first plurality of frequencies; grouping the frequencies into a smaller second plurality of bands that each include a range of neighboring frequencies; detecting power dissipation events in each of the bands; and grouping together the power dissipation events from mutually adjacent bands at a selected moment so as to form the identifying feature.
- 15. The method according to claim 14, further comprising the step of integrating power dissipation in each of the bands over a predetermined period.
- 16. The method according to claim 15, wherein each of the power dissipation events is a crossover of rolling energy dissipation levels over time periods of different lengths.
- 17. A method of determining whether an audio stream includes at least a portion of a known recording of audio signals, said method comprising the steps of:
forming at least a first identifying feature based on the portion of the known recording using the method of claim 14; storing the first identifying feature in a database; forming at least a second identifying feature based on a portion of the audio stream using the method of claim 14; and comparing the first and second identifying features to determine whether there is at least a selected degree of similarity.
- 18. The method according to claim 17, wherein each of the power dissipation events is a crossover of rolling energy dissipation levels over time periods of different lengths.
- 19. A computer-readable medium encoded with a program for identifying audio content, said program containing instructions for performing the steps of:
obtaining an audio signal characterized by a time dependent power spectrum; analyzing the spectrum to obtain a plurality of time dependent frequency components; and detecting a plurality of events in the plurality of time dependent frequency components.
- 20. The computer-readable medium according to claim 19, wherein said program further contains instructions for performing the steps of:
detecting a set of events occurring approximately simultaneously in a set of adjacent time dependent frequency components; and selecting a subset of the set of events for further processing.
- 21. The computer-readable medium according to claim 19, wherein the analyzing step includes the sub-steps of:
sampling the audio signal to obtain a plurality of audio signal samples; taking a plurality of subsets from the plurality of audio signal samples; and performing a Fourier transform on each of the plurality of subsets to obtain a set of Fourier frequency components.
- 22. The computer-readable medium according to claim 19, wherein the detecting step includes the sub-steps of:
taking a first running average with a first averaging period of each of a first subset of the plurality of time dependent frequency components to obtain a first sequence of average powers at a set of successive times for each of the plurality of time dependent frequency components; taking a second running average with a second averaging period that is different from the first averaging period of each of the subset of the plurality of time dependent frequency components to obtain a second sequence of average powers at the set of successive times for each of the plurality of time dependent frequency components; and identifying a plurality of average crossing events at a plurality of event times at which the first running average crosses the second running average.
- 23. A computer-readable medium encoded with a program for forming an identifying feature of a portion of a recording of audio signals, said program containing instructions for performing the steps of:
performing a Fourier transformation of the audio signals of the portion into a time series of audio power dissipated over a first plurality of frequencies; grouping the frequencies into a smaller second plurality of bands that each include a range of neighboring frequencies; detecting power dissipation events in each of the bands; and grouping together the power dissipation events from mutually adjacent bands at a selected moment so as to form the identifying feature.
- 24. A system for identifying a recording of an audio signal, said system comprising:
an interface for receiving an audio signal to be identified; a spectrum analyzer for obtaining a plurality of time dependent frequency components from the audio signal; an event detector for detecting a plurality of events in each of the time dependent frequency components; and a key generator for grouping the plurality of events by frequency and time, and assembling a plurality of keys based on the plurality of events.
- 25. The system according to claim 24, wherein the event detector is a peak detector.
- 26. The system according to claim 24, further comprising a database of keys of known recordings of audio signals.
- 27. A system for forming an identifying feature of a portion of a recording of audio signals, said system comprising:
means for performing a Fourier transformation of the audio signals of the portion into a time series of audio power dissipated over a first plurality of frequencies; means for grouping the frequencies into a smaller second plurality of bands that each include a range of neighboring frequencies; means for detecting power dissipation events in each of the bands; and means for grouping together the power dissipation events from mutually adjacent bands at a selected moment so as to form the identifying feature.
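For illustration only, the average-crossing event detection recited in claims 8, 9, and 22 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names `running_average` and `crossing_events`, the trailing-window form of the running average, the expression of averaging periods as frame counts, and the "up"/"down" labels are all hypothetical choices made for this example. The input is assumed to be the power of a single time dependent frequency component sampled at successive times.

```python
from typing import List, Tuple


def running_average(power: List[float], period: int) -> List[float]:
    """Trailing running average of one frequency component's power.

    Assumption: the first few samples are averaged over however many
    samples are available, since the claims do not specify start-up handling.
    """
    averages: List[float] = []
    window_sum = 0.0
    for i, value in enumerate(power):
        window_sum += value
        if i >= period:
            # Drop the sample that just left the averaging window.
            window_sum -= power[i - period]
        averages.append(window_sum / min(i + 1, period))
    return averages


def crossing_events(
    power: List[float], short_period: int, long_period: int
) -> List[Tuple[int, str]]:
    """Return (time index, direction) wherever the short average crosses the long one."""
    short_avg = running_average(power, short_period)
    long_avg = running_average(power, long_period)
    events: List[Tuple[int, str]] = []
    for t in range(1, len(power)):
        before = short_avg[t - 1] - long_avg[t - 1]
        after = short_avg[t] - long_avg[t]
        if before <= 0 < after:
            events.append((t, "up"))    # short average rises above the long average
        elif before >= 0 > after:
            events.append((t, "down"))  # short average falls below the long average
    return events


if __name__ == "__main__":
    # Hypothetical power sequence: a burst of energy in one frequency band.
    band_power = [0, 0, 0, 0, 4, 4, 4, 4, 0, 0, 0, 0]
    # The long period is 2 times the short period, within the 2 to 8 range of claim 9.
    print(crossing_events(band_power, short_period=2, long_period=4))
```

In this sketch the averaging periods are given in frames; converting the 1/10 second to 1 second range of claim 9 into a frame count would depend on the frame rate of the spectrum analysis, which is not fixed by the claims.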
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is based upon and claims priority from prior U.S. Provisional Application No. 60/245,799, filed Nov. 3, 2000, the entire disclosure of which is herein incorporated by reference.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60245799 | Nov 2000 | US |