Claims
- 1. A method of identifying recordings, comprising:
extracting at least one candidate fingerprint from at least one portion of an unidentified recording; and searching for a match between at least one value derived from the at least one candidate fingerprint and at least one value in at least one reference fingerprint among a plurality of reference fingerprints.
- 2. A method as recited in claim 1, wherein said searching comprises computing at least one weighted absolute difference between the at least one candidate fingerprint and the at least one reference fingerprint using a weight based on a value derived from the at least one candidate fingerprint.
- 3. A method as recited in claim 1, further comprising prior to said extracting, expanding dynamic range of the at least one portion of the unidentified recording.
- 4. A method as recited in claim 3, wherein said expanding of the dynamic range makes all sample values within the at least one portion of an unidentified recording more equally likely.
- 5. A method as recited in claim 1, further comprising:
storing in a cache memory matched candidate fingerprints with identifiers of corresponding reference fingerprints; and determining whether a new candidate fingerprint is included in the matched candidate fingerprints in the cache memory prior to said searching using the new candidate fingerprint.
- 6. A method as recited in claim 5, further comprising:
indicating a match between the new candidate fingerprint and a corresponding reference fingerprint when the new candidate fingerprint is included in the matched candidate fingerprints in the cache memory; and adding the new candidate fingerprint to the cache memory and associating a corresponding identifier for the corresponding reference fingerprint with the new candidate fingerprint in the cache memory.
- 7. A method as recited in claim 1,
wherein said extracting results in each candidate fingerprint including a predetermined number of candidate values for corresponding frequency ranges and each reference fingerprint includes the predetermined number of reference values for the corresponding frequency ranges, and wherein said method further comprises determining whether each candidate fingerprint matches one of the reference fingerprints based on selectively weighted differences between corresponding candidate and reference values for different frequency ranges.
- 8. A method as recited in claim 7, further comprising generating each of the candidate and reference fingerprints to include values representing a magnitude of power at frequencies in frequency ranges with mid-range frequencies weighted less than high- and low-range frequencies.
- 9. A method as recited in claim 1, wherein generation of each of the candidate and reference fingerprints comprises:
computing power in each of a plurality of frequency bands; and normalizing the power for each frequency within each band so that a mean of the power within each band is equal to a predetermined value.
- 10. A method as recited in claim 1, wherein generation of each of the candidate and reference fingerprints comprises computing a frequency distribution within each of a plurality of different frequency bands using a finer resolution at lower frequency bands than at higher frequency bands.
- 11. A method as recited in claim 1,
wherein said extracting extracts first and second candidate fingerprints from the at least one portion of the unidentified recording, the first candidate fingerprint having low discernability of frequency variation from the original and the second candidate fingerprint having low discernability of amplitude variation from the original, and wherein said method further comprises:
storing first reference fingerprints having low discernability of frequency variation and second reference fingerprints with low discernability of amplitude variation; and comparing the first candidate fingerprint with the first reference fingerprints and the second candidate fingerprint with the second reference fingerprints.
- 12. A method as recited in claim 11, wherein a first processor is used for said comparing of the first candidate fingerprint with the first reference fingerprints and concurrently a second processor is used for said comparing of the second candidate fingerprint with the second reference fingerprints.
- 13. A method as recited in claim 11, wherein a first result of said comparing of the first candidate fingerprint with the first reference fingerprints is combined with a second result of said comparing of the second candidate fingerprint with the second reference fingerprints to determine whether corresponding first and second reference fingerprints for both the first and second fingerprints are stored.
- 14. A method as recited in claim 1, wherein the portion of the unidentified recording has a duration of less than 25 seconds.
- 15. A method as recited in claim 14, wherein the portion of the unidentified recording has a duration of at least 10 seconds and no greater than 20 seconds.
- 16. A method as recited in claim 1, wherein said extracting obtains weighted frequency spectra using overlapping frames with time weighting to smoothly transition between frames, and
wherein said searching comprises:
transforming the weighted frequency spectra to transformed frequency spectra using a perceptual power scale attenuating high values relative to low values; and computing the at least one value from the transformed frequency spectra.
- 17. A method as recited in claim 1,
wherein said extracting comprises partitioning the portion of the unidentified recording into time-frequency regions, each time-frequency region covering at least three ranges of time frames and at least three ranges of frequencies, and wherein said searching comprises:
weighting the time-frequency regions to produce weighted time-frequency regions with emphasis on at least one middle-time and middle-frequency region; and computing the at least one value using the weighted time-frequency regions.
- 18. A method as recited in claim 1,
further comprising storing a plurality of the reference fingerprints for each of a plurality of reference recordings, wherein said extracting produces a plurality of candidate fingerprints from successive frames at a regular time interval, and wherein said searching identifies the unidentified recording as corresponding to a single reference recording only if matches are found between the reference fingerprints from the single reference recording and the candidate fingerprints obtained from a predetermined number of the successive frames.
- 19. A method as recited in claim 1,
further comprising storing a plurality of the reference fingerprints for each of a plurality of reference recordings, wherein said extracting produces a plurality of candidate fingerprints, and wherein said searching comprises:
finding a first match between a first candidate fingerprint and one of the reference fingerprints for a potentially matching reference recording; and comparing other candidate fingerprints from the unknown recording with the reference fingerprints for the potentially matching reference recording until a predetermined number of matches are found.
- 20. A method as recited in claim 1,
further comprising storing a plurality of the reference fingerprints for each of a plurality of reference recordings, and wherein said searching includes all of the reference fingerprints, unless a match is found.
- 21. A method as recited in claim 1, further comprising generating the reference fingerprints for reference recordings by
extracting a principal fingerprint from a specified portion of each reference recording; extracting auxiliary fingerprints from the reference recording at a regular time interval; computing distance measures from the principal fingerprint to the auxiliary fingerprints, respectively; generating a song profile based on the distance measures; and storing the principal fingerprint combined with the song profile as the reference fingerprint for the reference recording.
- 22. A method as recited in claim 1, wherein said extracting comprises:
separating the at least one portion of the unidentified recording into frequency bands; computing power spectra for the frequency bands, respectively; and computing at least one value from all the power spectra.
- 23. A method as recited in claim 22, wherein the frequency bands are output from filters derived from one prototype filter corresponding to an analysis wavelet.
- 24. A method as recited in claim 23, wherein a ratio of bandwidth to center frequency is substantially identical for all of the filters.
- 25. A method as recited in claim 1,
further comprising generating the reference fingerprints for reference recordings by
extracting a principal fingerprint from a specified portion of each reference recording; extracting auxiliary fingerprints from the reference recording at a regular time interval; computing reference distance measures from the principal fingerprint to the auxiliary fingerprints, respectively; generating a reference song profile based on the reference distance measures; and storing the principal fingerprint combined with the reference song profile as the reference fingerprint for the reference recording, wherein said extracting produces an initial candidate fingerprint and subsequent candidate fingerprints following the initial candidate fingerprint at the regular time interval, and wherein said searching comprises
comparing the initial candidate fingerprint with the principal fingerprint for the reference recordings, and when a potentially matching reference recording is found,
computing candidate distance measures from the initial candidate fingerprint to the subsequent candidate fingerprints, respectively; generating a candidate song profile based on the candidate distance measures; and identifying the unknown recording as the potentially matching reference recording only if the candidate song profile has a predetermined correlation to the reference song profile for the potentially matching reference recording.
- 26. A method as recited in claim 25, wherein said comparing begins prior to completing said extracting of the subsequent candidate fingerprints.
- 27. A method as recited in claim 1, wherein each of the candidate and reference fingerprints includes a vector of at least 5 elements having at least 256 values each.
- 28. A method as recited in claim 27, wherein each of the candidate and reference fingerprints includes a vector of up to 38 elements having no more than 65,536 values each.
- 29. A method as recited in claim 28, wherein each of the candidate and reference fingerprints includes a vector of approximately 30 elements of approximately 16 bits each.
- 30. A method as recited in claim 1, wherein said extracting produces a plurality of candidate fingerprints, each from different copies corresponding to a single reference recording, at least one of the different copies having been modified prior to said extracting.
- 31. A method as recited in claim 30, wherein the at least one of the different copies has been modified by at least one of a time based audio effect, a frequency based audio effect, and a signal compression scheme.
- 32. A method of generating reference fingerprints of reference recordings for identifying unknown recordings, comprising:
extracting a principal fingerprint from a specified portion of each reference recording; extracting auxiliary fingerprints from the reference recording at regular frame intervals; computing distance measures from the principal fingerprint to the auxiliary fingerprints, respectively; generating a song profile based on the distance measures; and storing the principal fingerprint combined with the song profile as the reference fingerprint for the reference recording.
- 33. A method of generating reference fingerprints of reference recordings for identifying unknown recordings, comprising:
separating a specified portion of each reference recording into frequency bands; computing power spectra for the frequency bands, respectively; and computing at least one value from all the power spectra.
- 34. A method as recited in claim 33, wherein the frequency bands are output from filters derived from one prototype filter corresponding to an analysis wavelet.
- 35. A method as recited in claim 34, wherein a ratio of bandwidth to center frequency is substantially identical for all of the filters.
- 36. At least one computer readable medium storing at least one program embodying a method of identifying recordings, comprising:
extracting at least one candidate fingerprint from at least one portion of an unidentified recording; and searching for a match between at least one value derived from the at least one candidate fingerprint and at least one value in at least one reference fingerprint among a plurality of reference fingerprints.
- 37. At least one computer readable medium as recited in claim 36, wherein said searching comprises computing at least one weighted absolute difference between the at least one candidate fingerprint and the at least one reference fingerprint using a weight based on a value derived from the at least one candidate fingerprint.
- 38. At least one computer readable medium as recited in claim 36, further comprising prior to said extracting, expanding dynamic range of the at least one portion of the unidentified recording.
- 39. At least one computer readable medium as recited in claim 38, wherein said expanding of the dynamic range makes all sample values within the at least one portion of an unidentified recording more equally likely.
- 40. At least one computer readable medium as recited in claim 36, further comprising:
storing in a cache memory matched candidate fingerprints with identifiers of corresponding reference fingerprints; and determining whether a new candidate fingerprint is included in the matched candidate fingerprints in the cache memory prior to said searching using the new candidate fingerprint.
- 41. At least one computer readable medium as recited in claim 40, further comprising:
indicating a match between the new candidate fingerprint and a corresponding reference fingerprint when the new candidate fingerprint is included in the matched candidate fingerprints in the cache memory; and adding the new candidate fingerprint to the cache memory and associating a corresponding identifier for the corresponding reference fingerprint with the new candidate fingerprint in the cache memory.
- 42. At least one computer readable medium as recited in claim 36,
wherein said extracting results in each candidate fingerprint including a predetermined number of candidate values for corresponding frequency ranges and each reference fingerprint includes the predetermined number of reference values for the corresponding frequency ranges, and wherein said method further comprises determining whether each candidate fingerprint matches one of the reference fingerprints based on selectively weighted differences between corresponding candidate and reference values for different frequency ranges.
- 43. At least one computer readable medium as recited in claim 42, further comprising generating each of the candidate and reference fingerprints to include values representing a magnitude of power at frequencies in frequency ranges with mid-range frequencies weighted less than high- and low-range frequencies.
- 44. At least one computer readable medium as recited in claim 36, wherein generation of each of the candidate and reference fingerprints comprises:
computing power in each of a plurality of frequency bands; and normalizing the power for each frequency within each band so that a mean of the power within each band is equal to a predetermined value.
- 45. At least one computer readable medium as recited in claim 36, wherein generation of each of the candidate and reference fingerprints comprises computing a frequency distribution within each of a plurality of different frequency bands using a finer resolution at lower frequency bands than at higher frequency bands.
- 46. At least one computer readable medium as recited in claim 36,
wherein said extracting extracts first and second candidate fingerprints from the at least one portion of the unidentified recording, the first candidate fingerprint having low discernability of frequency variation from the original and the second candidate fingerprint having low discernability of amplitude variation from the original, and wherein said method further comprises:
storing first reference fingerprints having low discernability of frequency variation and second reference fingerprints with low discernability of amplitude variation; and comparing the first candidate fingerprint with the first reference fingerprints and the second candidate fingerprint with the second reference fingerprints.
- 47. At least one computer readable medium as recited in claim 46, wherein a first processor is used for said comparing of the first candidate fingerprint with the first reference fingerprints and concurrently a second processor is used for said comparing of the second candidate fingerprint with the second reference fingerprints.
- 48. At least one computer readable medium as recited in claim 46, wherein a first result of said comparing of the first candidate fingerprint with the first reference fingerprints is combined with a second result of said comparing of the second candidate fingerprint with the second reference fingerprints to determine whether corresponding first and second reference fingerprints for both the first and second fingerprints are stored.
- 49. At least one computer readable medium as recited in claim 36, wherein the portion of the unidentified recording has a duration of less than 25 seconds.
- 50. At least one computer readable medium as recited in claim 49, wherein the portion of the unidentified recording has a duration of at least 10 seconds and no greater than 20 seconds.
- 51. At least one computer readable medium as recited in claim 36,
wherein said extracting obtains weighted frequency spectra using overlapping frames with time weighting to smoothly transition between frames, and wherein said searching comprises:
transforming the weighted frequency spectra to transformed frequency spectra using a perceptual power scale attenuating high values relative to low values; and computing the at least one value from the transformed frequency spectra.
- 52. At least one computer readable medium as recited in claim 36,
wherein said extracting comprises partitioning the portion of the unidentified recording into time-frequency regions, each time-frequency region covering at least three ranges of time frames and at least three ranges of frequencies, and wherein said searching comprises:
weighting the time-frequency regions to produce weighted time-frequency regions with emphasis on at least one middle-time and middle-frequency region; and computing the at least one value using the weighted time-frequency regions.
- 53. At least one computer readable medium as recited in claim 36,
further comprising storing a plurality of the reference fingerprints for each of a plurality of reference recordings, wherein said extracting produces a plurality of candidate fingerprints from successive frames at a regular time interval, and wherein said searching identifies the unidentified recording as corresponding to a single reference recording only if matches are found between the reference fingerprints from the single reference recording and the candidate fingerprints obtained from a predetermined number of the successive frames.
- 54. At least one computer readable medium as recited in claim 36,
further comprising storing a plurality of the reference fingerprints for each of a plurality of reference recordings, wherein said extracting produces a plurality of candidate fingerprints, and wherein said searching comprises:
finding a first match between a first candidate fingerprint and one of the reference fingerprints for a potentially matching reference recording; and comparing other candidate fingerprints from the unknown recording with the reference fingerprints for the potentially matching reference recording until a predetermined number of matches are found.
- 55. At least one computer readable medium as recited in claim 36,
further comprising storing a plurality of the reference fingerprints for each of a plurality of reference recordings, and wherein said searching includes all of the reference fingerprints, unless a match is found.
- 56. At least one computer readable medium as recited in claim 36, further comprising generating the reference fingerprints for reference recordings by
extracting a principal fingerprint from a specified portion of each reference recording; extracting auxiliary fingerprints from the reference recording at a regular time interval; computing distance measures from the principal fingerprint to the auxiliary fingerprints, respectively; generating a song profile based on the distance measures; and storing the principal fingerprint combined with the song profile as the reference fingerprint for the reference recording.
- 57. At least one computer readable medium as recited in claim 36, wherein said extracting comprises:
separating the at least one portion of the unidentified recording into frequency bands; computing power spectra for the frequency bands, respectively; and computing at least one value from all the power spectra.
- 58. At least one computer readable medium as recited in claim 57, wherein the frequency bands are output from filters derived from one prototype filter corresponding to an analysis wavelet.
- 59. At least one computer readable medium as recited in claim 58, wherein a ratio of bandwidth to center frequency is substantially identical for all of the filters.
- 60. At least one computer readable medium as recited in claim 36,
further comprising generating the reference fingerprints for reference recordings by
extracting a principal fingerprint from a specified portion of each reference recording; extracting auxiliary fingerprints from the reference recording at a regular time interval; computing reference distance measures from the principal fingerprint to the auxiliary fingerprints, respectively; generating a reference song profile based on the reference distance measures; and storing the principal fingerprint combined with the reference song profile as the reference fingerprint for the reference recording, wherein said extracting produces an initial candidate fingerprint and subsequent candidate fingerprints following the initial candidate fingerprint at the regular time interval, and wherein said searching comprises
comparing the initial candidate fingerprint with the principal fingerprint for the reference recordings, and when a potentially matching reference recording is found,
computing candidate distance measures from the initial candidate fingerprint to the subsequent candidate fingerprints, respectively; generating a candidate song profile based on the candidate distance measures; and identifying the unknown recording as the potentially matching reference recording only if the candidate song profile has a predetermined correlation to the reference song profile for the potentially matching reference recording.
- 61. At least one computer readable medium as recited in claim 60, wherein said comparing begins prior to completing said extracting of the subsequent candidate fingerprints.
- 62. At least one computer readable medium as recited in claim 36, wherein each of the candidate and reference fingerprints includes a vector of at least 5 elements having at least 256 values each.
- 63. At least one computer readable medium as recited in claim 62, wherein each of the candidate and reference fingerprints includes a vector of up to 38 elements having no more than 65,536 values each.
- 64. At least one computer readable medium as recited in claim 63, wherein each of the candidate and reference fingerprints includes a vector of approximately 30 elements of approximately 16 bits each.
- 65. At least one computer readable medium as recited in claim 36, wherein said extracting produces a plurality of candidate fingerprints, each from different copies corresponding to a single reference recording, at least one of the different copies having been modified prior to said extracting.
- 66. At least one computer readable medium as recited in claim 65, wherein the at least one of the different copies has been modified by at least one of a time based audio effect, a frequency based audio effect, and a signal compression scheme.
- 67. At least one computer readable medium storing at least one program embodying a method of generating reference fingerprints of reference recordings for identifying unknown recordings, said method comprising:
extracting a principal fingerprint from a specified portion of each reference recording; extracting auxiliary fingerprints from the reference recording at regular frame intervals; computing distance measures from the principal fingerprint to the auxiliary fingerprints, respectively; generating a song profile based on the distance measures; and storing the principal fingerprint combined with the song profile as the reference fingerprint for the reference recording.
- 68. At least one computer readable medium storing at least one program embodying a method of generating reference fingerprints of reference recordings for identifying unknown recordings, said method comprising:
separating a specified portion of each reference recording into frequency bands; computing power spectra for the frequency bands, respectively; and computing at least one value from all the power spectra.
- 69. At least one computer readable medium as recited in claim 68, wherein the frequency bands are output from filters derived from one prototype filter corresponding to an analysis wavelet.
- 70. At least one computer readable medium as recited in claim 69, wherein a ratio of bandwidth to center frequency is substantially identical for all of the filters.
- 71. A system for identifying recordings, comprising:
a storage unit storing reference fingerprints; and a processor, coupled to said storage unit, to extract at least one candidate fingerprint from at least one portion of an unidentified recording and to search for a match between at least one value derived from the at least one candidate fingerprint and at least one value in at least one reference fingerprint among the reference fingerprints.
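
By way of illustration only, the search recited in claims 1 and 2 and the cache check recited in claims 5 and 6 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `weighted_abs_difference` and `FingerprintMatcher`, the particular weight `1 / (1 + c)`, the match threshold, and the representation of a fingerprint as a tuple of non-negative integers are all hypothetical choices made for the example and do not appear in the claims.

```python
from typing import Dict, Optional, Sequence, Tuple

# Assumption: a fingerprint is a fixed-length tuple of non-negative integers.
Fingerprint = Tuple[int, ...]


def weighted_abs_difference(candidate: Sequence[int], reference: Sequence[int]) -> float:
    """Sum of |c - r| per element, weighted by a value derived from the candidate.

    The weight 1 / (1 + c) is a hypothetical choice for illustration; the claims
    only require that the weight be based on a value derived from the candidate.
    """
    if len(candidate) != len(reference):
        raise ValueError("fingerprints must have the same number of elements")
    return sum(abs(c - r) / (1.0 + c) for c, r in zip(candidate, reference))


class FingerprintMatcher:
    """Searches reference fingerprints, consulting a cache of earlier matches first."""

    def __init__(self, references: Dict[str, Fingerprint], threshold: float) -> None:
        self.references = references          # identifier -> reference fingerprint
        self.threshold = threshold            # hypothetical maximum distance for a match
        self.cache: Dict[Fingerprint, str] = {}  # matched candidate -> identifier
        self.searches = 0                     # number of full searches performed

    def identify(self, candidate: Fingerprint) -> Optional[str]:
        # Check the cache before searching (compare claims 5 and 6).
        if candidate in self.cache:
            return self.cache[candidate]

        # Otherwise search all reference fingerprints for the closest one.
        self.searches += 1
        best_id: Optional[str] = None
        best_dist = float("inf")
        for ref_id, ref in self.references.items():
            dist = weighted_abs_difference(candidate, ref)
            if dist < best_dist:
                best_id, best_dist = ref_id, dist

        # Report a match only if the closest reference is near enough,
        # and remember the matched candidate with its identifier.
        if best_id is not None and best_dist <= self.threshold:
            self.cache[candidate] = best_id
            return best_id
        return None
```

In this sketch, calling `identify` twice with the same candidate performs the full search only once; the second call is answered from the cache. A linear scan is used for simplicity, and a practical system would likely index the reference fingerprints rather than compare against every one.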
CROSS-REFERENCE TO RELATED APPLICATION(S)
[0001] This application is related to and claims priority to U.S. provisional application entitled AUTOMATIC IDENTIFICATION OF SOUND RECORDINGS, having serial No. 60/306,911, by Wells et al., filed Jul. 20, 2001 and incorporated by reference herein.
Provisional Applications (1)

| Number | Date | Country |
| --- | --- | --- |
| 60306911 | Jul 2001 | US |