Method and apparatus for estimating tempo based on inter-onset interval count

Information

  • Patent Application
  • 20070180980
  • Publication Number
    20070180980
  • Date Filed
    November 22, 2006
  • Date Published
    August 09, 2007
Abstract
An apparatus for estimating a tempo includes a peak time detection unit for detecting peak times of input audio data when an amplitude of the audio data reaches peak values; an inter-onset interval (IOI) determining unit for determining IOIs between the detected peak times; an IOI clustering unit for clustering the IOIs into a plurality of IOI clusters and for determining an average of the IOIs contained in each of the IOI clusters; and a tempo estimating unit for estimating a tempo of the input audio data based on the average of the IOIs of one of the IOI clusters.
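The chain of units in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names, the local-maximum peak rule, the `lookahead` and `max_gap` parameters, and the conversion of the chosen mean IOI to beats per minute are all assumptions made for illustration.

```python
from statistics import mean


def detect_peak_times(envelope, frame_rate):
    """Return the times (in seconds) of local maxima in an amplitude envelope.

    Assumption: a peak is any frame strictly above its predecessor and
    not below its successor; the source does not specify the rule.
    """
    return [
        i / frame_rate
        for i in range(1, len(envelope) - 1)
        if envelope[i - 1] < envelope[i] >= envelope[i + 1]
    ]


def inter_onset_intervals(peak_times, lookahead=1):
    """IOIs between each peak and up to `lookahead` later peaks (assumed parameter)."""
    iois = []
    for i, start in enumerate(peak_times):
        for later in peak_times[i + 1 : i + 1 + lookahead]:
            iois.append(later - start)
    return iois


def cluster_iois(iois, max_gap=0.025):
    """Sort the IOIs and group neighbours whose difference is at most `max_gap`.

    Returns one (count, mean IOI) pair per cluster. `max_gap` (seconds) is a
    hypothetical stand-in for the "predetermined range of size difference".
    """
    groups = []
    for ioi in sorted(iois):
        if groups and ioi - groups[-1][-1] <= max_gap:
            groups[-1].append(ioi)
        else:
            groups.append([ioi])
    return [(len(group), mean(group)) for group in groups]


def estimate_tempo_bpm(clusters):
    """Pick the cluster with the most IOIs and convert its mean IOI to BPM.

    The BPM conversion is an assumption; the source only says the tempo is
    based on the average IOI of one cluster.
    """
    _, mean_ioi = max(clusters, key=lambda cluster: cluster[0])
    return 60.0 / mean_ioi


if __name__ == "__main__":
    peaks = [0.0, 0.5, 1.0, 1.52, 2.0, 2.25]
    clusters = cluster_iois(inter_onset_intervals(peaks))
    print(clusters, estimate_tempo_bpm(clusters))
```

Choosing the most populated cluster corresponds to the simplest variant in the claims; the weighted variants replace only the selection step.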
Description

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects, features and advantages of the present invention will become apparent from the following description of preferred embodiments given in conjunction with the accompanying drawings, in which:



FIG. 1 is a block diagram of a conventional tempo estimating apparatus;



FIG. 2 is a block diagram of a tempo estimating apparatus according to an embodiment of the present invention;



FIG. 3 is a detailed block diagram of a preprocessing unit 100 of FIG. 2 according to an embodiment of the present invention;



FIG. 4 is a flowchart illustrating a method of estimating a tempo according to an embodiment of the present invention;



FIG. 5 is a flowchart illustrating a method of preprocessing audio data according to an embodiment of the present invention;



FIG. 6 is a flowchart illustrating a method of detecting peak times according to an embodiment of the present invention;



FIG. 7 is a flowchart illustrating an IOI calculating method according to an embodiment of the present invention;



FIG. 8 is a flowchart illustrating an IOI clustering method according to an embodiment of the present invention;



FIG. 9 is a flowchart illustrating a method of detecting associated IOI clusters according to an embodiment of the present invention;



FIG. 10 shows a block diagram of a tempo estimating apparatus according to another embodiment of the present invention;



FIG. 11 is a flowchart illustrating a method of estimating a tempo according to another embodiment of the present invention;



FIG. 12 is a graph showing a relation between a Mel frequency and a linear frequency; and



FIG. 13 is a graph showing the weighting factors of a triangle filter.


Claims
  • 1. An apparatus for estimating a tempo, comprising: a peak time detection unit for detecting peak times of input audio data when an amplitude of the audio data reaches peak values; an inter-onset interval (IOI) calculation unit for calculating IOIs between the detected peak times; an IOI clustering unit for clustering the IOIs according to the respective IOIs within a predetermined range of size difference into a plurality of IOI clusters and for calculating a number of the IOIs and a mean of the IOIs contained in each of the IOI clusters; and a tempo estimating unit for determining one of the means of the IOIs in the IOI clusters as a tempo of the input audio data according to the number of the IOIs contained in each of the IOI clusters.
  • 2. The apparatus as claimed in claim 1, wherein the IOI calculation unit calculates IOIs between a peak time and a predetermined number of adjacent peak times detected after the peak time.
  • 3. The apparatus as claimed in claim 1, wherein the IOI clustering unit sorts the IOIs in order of size and clusters the sequentially sorted IOIs using the IOIs within a predetermined range of size difference.
  • 4. The apparatus as claimed in claim 1, wherein the tempo estimating unit estimates the mean of the IOIs of one of the IOI clusters having a largest number of the IOIs as the tempo of the input audio data.
  • 5. The apparatus as claimed in claim 1, wherein the tempo estimating unit determines a genre weighting factor for each of the IOI clusters according to predetermined genre data and determines the one of the means of the IOIs in the IOI clusters as the tempo of the input audio data according to the number of the IOIs and the genre weighting factor.
  • 6. The apparatus as claimed in claim 1, further comprising: an IOI association unit for determining a cluster weighting factor of each of the IOI clusters according to the number of the IOIs contained in the IOI cluster, wherein among all of the IOI clusters, any one of the IOI clusters whose mean IOI is a predetermined rational number multiple of the mean of the IOIs of a relevant one of the IOI clusters is detected, and the tempo estimating unit determines the one of the means of the IOIs in the IOI clusters as the tempo of the input audio data according to the determined cluster weighting factor.
  • 7. The apparatus as claimed in claim 6, wherein the tempo estimating unit determines a genre weighting factor for each of the IOI clusters according to predetermined genre data and determines the one of the means of the IOIs in the IOI clusters as the tempo of the input audio data according to the cluster weighting factor and the genre weighting factor.
  • 8. The apparatus as claimed in claim 1, further comprising: a preprocessing unit for dividing the received audio data into frames with a predetermined length, and extracting frequency coefficients contained in each of the frames through discrete Fourier transform to perform a band pass filtering operation if the input audio data are audio data in a time domain or extracting frequency coefficients contained in each of the frames to perform a band pass filtering operation if the input audio data are compressed audio data in a frequency domain.
  • 9. The apparatus as claimed in claim 8, wherein the preprocessing unit further comprises: a linear regression unit for calculating slope data of the audio data by performing linear regression on the band pass filtered audio data, wherein the peak time detection unit detects peak times at which the slope data reach the peak values.
  • 10. A method of estimating a tempo, the method comprising: detecting peak times of input audio data when an amplitude of the audio data reaches peak values; calculating inter-onset intervals (IOIs) between the detected peak times; clustering the IOIs according to the respective IOIs within a predetermined range of size difference into a plurality of IOI clusters; calculating a number of the IOIs and a mean of the IOIs contained in each of the IOI clusters; and determining the one of the means of the IOIs in the IOI clusters as a tempo of the input audio data according to the number of the IOIs contained in each of the IOI clusters.
  • 11. The method as claimed in claim 10, wherein the step of calculating the IOIs comprises the step of calculating the IOIs between a peak time and a predetermined number of adjacent peak times detected after the peak time.
  • 12. The method as claimed in claim 10, wherein the step of clustering comprises the step of sorting the IOIs in order of size and clustering the sequentially sorted IOIs using the IOIs within the predetermined range of size difference.
  • 13. The method as claimed in claim 10, wherein the determining step comprises the step of estimating the mean IOI of one of the IOI clusters having a largest number of the IOIs as the tempo of the input audio data.
  • 14. The method as claimed in claim 10, wherein the estimating step comprises the step of determining a genre weighting factor of each of the IOI clusters according to predetermined genre data and determining the one of the means of the IOIs in the IOI clusters as the tempo of the input audio data according to the number of the IOIs and the genre weighting factor.
  • 15. The method as claimed in claim 10, said method further comprising: detecting, among all of the IOI clusters, any one of the IOI clusters whose mean IOI is a predetermined rational number multiple of the mean of the IOIs of a relevant one of the IOI clusters; and determining a cluster weighting factor for each of the IOI clusters according to the number of the IOIs contained in the corresponding IOI cluster and the IOI clusters detected as the rational number multiple, wherein the determining step comprises the step of determining the one of the means of the IOIs in the IOI clusters as the tempo of the input audio data according to the determined cluster weighting factor.
  • 16. The method as claimed in claim 15, wherein the determining step comprises the step of determining a genre weighting factor of each of the IOI clusters according to predetermined genre data and determining the one of the means of the IOIs in the IOI clusters as the tempo of the input audio data according to the cluster weighting factor and the genre weighting factor.
  • 17. The method as claimed in claim 10, said method further comprising: preprocessing to divide the received audio data into frames with a predetermined length, and to extract frequency coefficients contained in each of the frames through discrete Fourier transform to perform a band pass filtering operation if the input audio data are audio data in a time domain or to extract frequency coefficients contained in each of the frames to perform a band pass filtering operation if the input audio data are compressed audio data in a frequency domain.
  • 18. The method as claimed in claim 17, wherein the step of preprocessing further comprises: calculating slope data of the audio data by performing linear regression on the band pass filtered audio data, wherein the peak time detecting step comprises the step of detecting the peak times at which the slope data reach the peak values.
  • 19. An apparatus for estimating a tempo, comprising: a peak time detection unit for detecting peak times of input audio data when an amplitude of the audio data reaches peak values; an inter-onset interval (IOI) determining unit for determining IOIs between the detected peak times; an IOI clustering unit for clustering the IOIs into a plurality of IOI clusters and for determining an average of the IOIs contained in each of the IOI clusters; and a tempo estimating unit for estimating a tempo of the input audio data based on the average of the IOIs of one of the IOI clusters.
  • 20. The apparatus of claim 19, wherein the IOI clustering unit clusters the IOIs based on a predetermined range of size difference of the IOIs.
  • 21. The apparatus of claim 19, wherein the IOI clustering unit determines a number of the IOIs contained in each of the IOI clusters, and the tempo estimating unit estimates the tempo of the input audio data based on the number of the IOIs contained in each of the IOI clusters.
  • 22. The apparatus of claim 21, wherein the tempo estimating unit estimates the tempo of the input audio data as the average of the IOIs of one of the IOI clusters with a largest number of the IOIs.
  • 23. The apparatus of claim 20, further comprising: an IOI association unit for determining a cluster weighting factor of each of the IOI clusters based on the number of the IOIs contained in the corresponding IOI cluster, wherein, among all of the IOI clusters, any one of the IOI clusters whose average IOI is a predetermined rational number multiple of the average of the IOIs of a relevant one of the IOI clusters is detected, and the tempo estimating unit determines the average of the IOIs in the one of the IOI clusters as the tempo of the input audio data based on the determined cluster weighting factor.
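The cluster weighting and genre weighting recited in claims 5 to 7, 14 to 16 and 23 might be sketched as follows. This is only one possible reading: the set of rational multiples in `RATIOS`, the relative `tolerance`, the additive way counts are combined, and the `genre_weight` callback are all hypothetical, since the claims say only that these quantities are "predetermined".

```python
from fractions import Fraction

# Assumed set of rational multiples; the claims do not list the values.
RATIOS = (Fraction(1, 3), Fraction(1, 2), Fraction(2), Fraction(3))


def cluster_weights(clusters, ratios=RATIOS, tolerance=0.05):
    """Weight each (count, mean IOI) cluster by its own count plus the counts
    of clusters whose mean IOI is close to a rational multiple of its mean.

    Summing the counts is an assumption; the claims leave the formula open.
    """
    weights = []
    for i, (count, mean_ioi) in enumerate(clusters):
        weight = float(count)
        for j, (other_count, other_mean) in enumerate(clusters):
            if i == j:
                continue
            ratio = other_mean / mean_ioi
            # Relative tolerance around each candidate multiple.
            if any(abs(ratio - float(r)) <= tolerance * float(r) for r in ratios):
                weight += other_count
        weights.append(weight)
    return weights


def pick_tempo_ioi(clusters, genre_weight=lambda mean_ioi: 1.0):
    """Return the mean IOI of the cluster with the highest combined score.

    `genre_weight` is a hypothetical callback standing in for the
    "predetermined genre data"; by default every cluster gets weight 1.0.
    """
    weights = cluster_weights(clusters)
    scores = [w * genre_weight(m) for w, (_, m) in zip(weights, clusters)]
    best = max(range(len(clusters)), key=scores.__getitem__)
    return clusters[best][1]


if __name__ == "__main__":
    clusters = [(3, 0.25), (4, 0.5), (2, 1.0), (1, 0.37)]
    print(cluster_weights(clusters))
    print(pick_tempo_ioi(clusters))
    # A genre prior that favours short IOIs (fast tempi), purely illustrative.
    print(pick_tempo_ioi(clusters, genre_weight=lambda m: 2.0 if m < 0.3 else 1.0))
```

In this reading, a cluster supported by related clusters at half or double its interval outranks an isolated cluster of similar size, which is one way to reduce spurious tempo picks.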
Priority Claims (1)
Number Date Country Kind
10-2006-0011618 Feb 2006 KR national