The present invention generally relates to techniques for identifying digitized samples of time varying signals and in particular, to a method and apparatus for identifying input files using reference files associated with nodes of a sparse binary tree.
In searching for particular audio files on the Internet, it is useful to be able to determine the identity of untitled audio files as well as to confirm that titled audio files are what they purport to be. Although a human may conceivably make such determinations and confirmations by simply listening to the content of the audio files by playing them through a media player, such an approach is not always reliable. Also, a process such as this involving human judgment is inherently very slow.
Therefore, it is advantageous to employ a computer to determine the identity of untitled audio files as well as to confirm that titled audio files are what they purport to be. The computer can not only store a lot of information to assist in identifying an input audio file, it can also process that information very quickly.
In one technique employing a computer, an algorithm is used to uniquely identify audio file content. Using this approach, a master code is generated by performing the algorithm on content in a master audio file. By applying the same algorithm to the content of an input audio file, the calculated code may then be compared with the master code to determine a match.
Use of such an algorithm, however, does not always lead to proper identification, because the content of an audio file may not have exactly the same length of recording as the content of the master audio file, for example, by starting at a point a little later in time, thus giving rise to a calculated code that would not match the master code. Also, if the content of the input audio file contains noise spikes or background noise, this would also give rise to a calculated code that would not match the master code. Thus, in both of these cases, the stored content is not properly identified.
Accordingly, one object of the present invention is to provide a method and apparatus for identifying input files that are reliable even if their content is offset in time, or contains noise spikes or background noise.
Another object is to provide a method and apparatus for identifying input files that are computationally fast when performed in a computer system.
Another object is to provide a method and apparatus for identifying input files that minimize data storage requirements in a computer system.
These and other objects are accomplished by the various aspects of the present invention, wherein briefly stated, one aspect is a method for matching an input audio file with reference audio files, comprising: identifying potential matches of an input audio file among reference audio files based upon at least one common characteristic; and searching for a match of the input audio file among the potential matches.
Another aspect is a method for matching an input audio file with reference audio files, comprising: identifying potential matches of an input audio file among reference audio files based upon at least one common characteristic; and comparing an input profile resulting from a measurable attribute of the input audio file with reference profiles resulting from the same measurable attribute of the potential matches to determine a match.
Another aspect is a method for matching an input file with reference files, comprising: identifying potential matches of an input file among reference files by associating nodes of a sparse binary tree with the input file in a same manner used to associate nodes of the sparse binary tree with the reference files; and searching for a match of the input file among the potential matches.
Another aspect is a method for matching an input file with reference files, comprising: identifying potential matches of an input file among reference files by associating nodes of a sparse binary tree with the input file in a same manner used to associate nodes of the sparse binary tree with the reference files; and comparing a profile resulting from a measurable attribute of the input file with profiles resulting from the same measurable attribute of the potential matches to determine a match.
Another aspect is a method for matching an input audio file with reference audio files, comprising: generating an input profile from an input audio file based upon a measurable attribute also used to generate reference profiles from reference audio files; identifying potential matches among the reference profiles with the input profile by processing the input profile in a manner used to associate individual ones of the reference profiles with nodes of a sparse binary tree; and comparing the input profile with the potential matches to determine a match.
Still another aspect is a method for matching an input audio file with reference audio files, comprising: generating reference profiles from reference audio files using a measurable attribute; generating a sparse binary tree by applying a process to the reference profiles such that identifications of the reference profiles are associated at resulting nodes of the sparse binary tree; generating an input profile from the input audio file using the measurable attribute; applying the process to the input profile so that associated reference profiles are identified from resulting nodes of the sparse binary tree; and comparing at least a portion of the input profile with corresponding portions of the identified reference profiles to determine a match.
Another aspect is an apparatus for matching an input audio file with reference audio files, comprising at least one computer configured to: identify potential matches of an input audio file among reference audio files based upon at least one common characteristic; and search for a match of the input audio file among the potential matches.
Another aspect is an apparatus for matching an input audio file with reference audio files, comprising at least one computer configured to: identify potential matches of an input audio file among reference audio files based upon at least one common characteristic; and compare an input profile resulting from a measurable attribute of the input audio file with reference profiles resulting from the same measurable attribute of the potential matches to determine a match.
Another aspect is an apparatus for matching an input file with reference files, comprising at least one computer configured to: identify potential matches of an input file among reference files by associating nodes of a sparse binary tree with the input file in a same manner used to associate nodes of the sparse binary tree with the reference files; and search for a match of the input file among the potential matches.
Another aspect is an apparatus for matching an input file with reference files, comprising at least one computer configured to: identify potential matches of an input file among reference files by associating nodes of a sparse binary tree with the input file in a same manner used to associate nodes of the sparse binary tree with the reference files; and compare a profile resulting from a measurable attribute of the input file with profiles resulting from the same measurable attribute of the potential matches to determine a match.
Another aspect is an apparatus for matching an input audio file with reference audio files, comprising at least one computer configured to: generate an input profile from an input audio file based upon a measurable attribute also used to generate reference profiles from reference audio files; identify potential matches among the reference profiles with the input profile by processing the input profile in a manner used to associate individual ones of the reference profiles with nodes of a sparse binary tree; and compare the input profile with the potential matches to determine a match.
Yet another aspect is an apparatus for matching an input audio file with reference audio files, comprising at least one computer configured to: generate reference profiles from reference audio files using a measurable attribute; generate a sparse binary tree by applying a process to the reference profiles such that identifications of the reference profiles are associated at resulting nodes of the sparse binary tree; generate an input profile from the input audio file using the measurable attribute; apply the process to the input profile so that associated reference profiles are identified from resulting nodes of the sparse binary tree; and compare at least a portion of the input profile with corresponding portions of the identified reference profiles to determine a match.
Additional objects, features and advantages of the various aspects of the present invention will become apparent from the following description of its preferred embodiment, which description should be taken in conjunction with the accompanying drawings.
All methods, generators and programs described herein are preferably performed on one or more computers cooperating together such as in a distributed or other processing environment.
Referring to
The reference audio clips in this case may be published music that is protected by copyright law, and the input audio clips may be audio files either residing on user computers or being transmitted through the Internet using a file sharing network. Formats for the audio clips may be any standard format such as MP3.
The profile generator 202 is used to generate reference profiles 102 from reference audio clips 201 (as shown in
As used herein, the term “chunk offset” means the difference in number of chunks between a current chunk of the reference profile and a first chunk of the reference profile, plus one. Thus, the number of the chunk is equal to the chunk offset in this convention.
Two programmable parameters are used in the method. The term “velocity” means the number of chunks between local maximums in the reference profile, and the term “acceleration” means the change in velocity divided by the number of chunks over which the change occurs. Initial values for velocity and acceleration are pre-defined prior to performance of the function 402. As an example, the initial velocity may be set to 1, and the initial acceleration may also be set to 1. The velocity is then modified according to the method. The acceleration, on the other hand, is generally constant at its initial value.
In 501, the chunk offset is initialized to be equal to the initial velocity. In 502, a determination is made whether the zero crossing count for the current chunk is a local maximum. To be considered a local maximum, the zero crossing count for the current chunk must be greater by a programmed threshold value than both the zero crossing count for the chunk right before the current chunk and the zero crossing count for the chunk right after the current chunk. In situations where the current chunk does not have either a chunk right before it (i.e., it is the first chunk in the reference profile) or a chunk right after it (i.e., it is the last chunk in the reference profile), a zero will be assumed for the zero crossing count in those cases.
If the determination in 502 is YES, then in 503, a profile hook for this chunk offset is stored in the reference profiles tree 103. Additional details on 503 are described in reference to
On the other hand, if the determination in 502 is NO, then in 504, the chunk offset is incremented by the velocity.
In 505, a determination is then made whether the end of the reference file has been reached. This determination would be YES, if the new chunk offset is greater than the chunk number of the last chunk in the reference profile. Therefore, if the determination in 505 is YES, then the method is done, and another reference profile can be processed as shown in
On the other hand, if the determination in 505 is NO, then in 506, the velocity is incremented by the acceleration. By incrementing the velocity in this fashion, chunks will be processed in a more efficient manner. Rather than processing every chunk in a reference profile to see if it is a local maximum, chunks are sampled at quadratically increasing chunk offsets to take advantage of the observation that matches between input profiles and reference profiles usually can be determined early on in the profiles.
The method then loops back to 502 to process the newly calculated chunk offset, and continues looping through 502-506 until the end of the reference profile is determined in 505.
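The loop through 501-506 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, "greater by a programmed threshold value" is read as a margin of at least `threshold`, and the flow is assumed to continue to 504 after a hook is stored in 503. Storing the hook is replaced here by collecting the chunk offset in a list.

```python
from typing import List, Sequence


def is_local_maximum(counts: Sequence[int], chunk_offset: int, threshold: int) -> bool:
    """Step 502. counts[i - 1] is the zero crossing count of chunk i (1-based)."""
    current = counts[chunk_offset - 1]
    # A missing neighbor (before the first or after the last chunk) counts as zero.
    before = counts[chunk_offset - 2] if chunk_offset >= 2 else 0
    after = counts[chunk_offset] if chunk_offset < len(counts) else 0
    # Assumption: "greater by a threshold" means a margin of at least `threshold`.
    return current - before >= threshold and current - after >= threshold


def scan_hook_offsets(
    counts: Sequence[int],
    initial_velocity: int = 1,
    acceleration: int = 1,
    threshold: int = 1,
) -> List[int]:
    """Return the chunk offsets at which a profile hook would be stored."""
    if initial_velocity < 1 or acceleration < 0:
        raise ValueError("velocity must be positive and acceleration non-negative")
    velocity = initial_velocity
    chunk_offset = initial_velocity              # 501
    hook_offsets: List[int] = []
    while chunk_offset <= len(counts):           # 505: stop past the last chunk
        if is_local_maximum(counts, chunk_offset, threshold):   # 502
            hook_offsets.append(chunk_offset)    # stand-in for 503
        chunk_offset += velocity                 # 504
        velocity += acceleration                 # 506 (unused if the loop now ends)
    return hook_offsets
```

With an initial velocity and acceleration of 1, the offsets examined are 1, 2, 4, 7, 11, 16 and so on, so a peak at an offset that is skipped (for example chunk 3) produces no hook.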
In 603, a determination is made whether the zero crossing count for the current chunk is greater than a programmable constant or threshold value. If the determination in 603 is NO, then in 604, the current node is changed to a right-branch child node, which is created at that time if it doesn't already exist in the reference profiles tree 103. On the other hand, if the determination in 603 is YES, then in 605, the current node is changed to a left-branch child node, which is created at that time if it doesn't already exist in the reference profiles tree 103.
In 606, a determination is then made whether the current chunk is the last chunk in the reference profile. If the determination in 606 is NO, then in 607, the current chunk is incremented by 1, and the method loops back to 603, and continues looping through 603-607 until the determination in 606 is YES. When the determination in 606 is YES, then in 608, the method stores the profile hook in the then current node, and is done. The profile hook in this case includes a profile identification or “ID” and the chunk offset that is being processed at the time in function 503. The profile ID serves to uniquely identify the content of the reference profile in this case.
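A minimal sketch of 603-608 is given below, assuming the walk starts at the root node with the current chunk set to the chunk offset being processed (as stated for the corresponding input-side steps 1001 and 1002). The `Node` class, the function name, and the representation of a profile hook as a `(profile_id, chunk_offset)` tuple are illustrative choices, not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    """One node of the sparse binary tree; children are created on demand."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    hooks: List[Tuple[str, int]] = field(default_factory=list)


def store_profile_hook(
    root: Node,
    counts: Sequence[int],
    profile_id: str,
    chunk_offset: int,
    threshold: int,
) -> Node:
    """Walk from chunk_offset to the last chunk, then store the hook (608)."""
    node = root
    for chunk in range(chunk_offset, len(counts) + 1):   # 606/607
        if counts[chunk - 1] > threshold:                # 603 is YES
            if node.left is None:
                node.left = Node()                       # 605: create if missing
            node = node.left
        else:                                            # 603 is NO
            if node.right is None:
                node.right = Node()                      # 604: create if missing
            node = node.right
    node.hooks.append((profile_id, chunk_offset))        # 608
    return node
```

Because child nodes are created only along paths that some reference profile actually takes, most of the possible branches never exist, which is what keeps the tree sparse.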
In the following description, it is assumed that generation of the reference profiles tree 103 is complete, so that its nodes contain profile hooks for each of the reference profiles 102.
Assuming mini-matches have been identified between the input profile and one or more reference profiles, in a second function 802 the audio matcher 100 stores the mini-matches and, when appropriate, merges them for subsequent processing. In a third function 803, the audio matcher 100 then determines one of the following: an acceptable best match for the input profile; a determination that the input profile is a spoof; or a no-match if the input profile is not determined to be a spoof or if an acceptable best match cannot be found.
In 903, however, rather than storing a profile hook in the reference profiles tree for the chunk offset as performed in 503 of
Starting in 1001, the current node in the reference profiles tree 103 is initially set to the root node, and in 1002, the current chunk is set to the chunk offset currently being processed.
In 1003, a determination is made whether the zero crossing count for the current chunk is greater than a programmable constant. The constant that is to be used here is the same as that used in 603 of
If the determination in 1003 is NO, then in 1004, the current node is changed to a right-branch child node. On the other hand, if the determination in 1003 is YES, then in 1005, the current node is changed to a left-branch child node.
In 1006, a determination is then made whether the current chunk is the last chunk in the input profile. If the determination in 1006 is NO, then in 1007, the current chunk is incremented by 1, and the method loops back to 1003, and continues looping through 1003-1007 until the determination in 1006 is YES. When the determination in 1006 is YES, then in 1008, the method matches the input profile against all reference profiles identified in profile hooks stored at the current node of the reference profiles tree 103.
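The traversal in 1001-1007 can be sketched as a read-only walk of the same tree. The `Node` class and function name are hypothetical, and returning an empty list when a required child node does not exist is an assumption; the description does not state what happens in that case.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    """Tree node holding (profile_id, chunk_offset) profile hooks."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    hooks: List[Tuple[str, int]] = field(default_factory=list)


def lookup_hooks(
    root: Node,
    counts: Sequence[int],
    chunk_offset: int,
    threshold: int,
) -> List[Tuple[str, int]]:
    """Return the profile hooks stored at the node the input profile leads to."""
    node: Optional[Node] = root                          # 1001
    for chunk in range(chunk_offset, len(counts) + 1):   # 1002, 1006/1007
        # 1003: the same constant as used when the tree was built.
        node = node.left if counts[chunk - 1] > threshold else node.right
        if node is None:
            return []   # assumption: a missing branch means no candidates
    return list(node.hooks)                              # candidates for 1008
```

Unlike the build step, the lookup never creates nodes, so the cost of a query is bounded by the number of chunks walked rather than by the number of reference profiles.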
On the other hand, if the determination in 1101 is YES, then in 1102, the first N chunks of the input profile are compared with the corresponding first N chunks of a first reference profile identified. In 1103, a determination is made whether they match. In order for corresponding chunks to match, their zero crossing counts do not have to be exactly equal. As long as the absolute difference between the zero crossing counts is within a programmed tolerance, they may be determined to be a match. Also, it may not be necessary for all of the first N chunks to match; the match determination may be a YES as long as a high enough percentage of the first N chunks match.
If the determination in 1103 is a YES, then in 1104, a mini-match at the current offset of the input profile is generated. Generation of the mini-match involves including the information in the following table in the mini-match.
On the other hand, if the determination in 1103 is a NO, then in 1105, a determination is made whether there is another reference profile identified at the current node of the reference profiles tree 103. If the determination in 1105 is YES, then in 1106, the first N chunks of the input profile are then compared with those of the next identified reference profile, and the method continues by looping through 1103-1106 until either a match is found or there are no more reference profiles to be compared against the input profile.
If the determination in 1105 results at any time in a NO, then in 1107, the method generates a “non-full” mini-match using the best matching one of the reference profiles identified at the current node of the reference profiles tree 103 (i.e., the reference profile whose first N chunks came closest to being determined as a match to the first N chunks of the input profile). As with the “full” mini-match generated in 1104, the “non-full” mini-match will also be associated to the current offset of the input profile.
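The comparison loop of 1102-1107 might look like the sketch below. Several points are assumptions made for illustration: the two chunk sequences are taken to be already aligned, "closest to being determined as a match" is measured by the fraction of chunks within tolerance, and the mini-match is reduced to three fields (`profile_id`, `input_offset`, `full`) rather than the full set of parameters.

```python
from typing import Dict, Optional, Sequence


def matched_fraction(
    input_counts: Sequence[int],
    reference_counts: Sequence[int],
    n: int,
    tolerance: int,
) -> float:
    """Fraction of the first n chunk pairs whose counts differ by <= tolerance."""
    pairs = list(zip(input_counts[:n], reference_counts[:n]))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return hits / len(pairs)


def make_mini_match(
    input_counts: Sequence[int],
    candidates: Dict[str, Sequence[int]],
    input_offset: int,
    n: int,
    tolerance: int,
    min_fraction: float,
) -> Optional[dict]:
    """Return a full mini-match for the first match, else the best non-full one."""
    best_id: Optional[str] = None
    best_fraction = -1.0
    for profile_id, reference_counts in candidates.items():     # 1102, 1105/1106
        fraction = matched_fraction(input_counts, reference_counts, n, tolerance)
        if fraction >= min_fraction:                            # 1103 is YES
            return {"profile_id": profile_id,
                    "input_offset": input_offset, "full": True}     # 1104
        if fraction > best_fraction:
            best_id, best_fraction = profile_id, fraction
    if best_id is None:
        return None   # no reference profiles at this node
    return {"profile_id": best_id,
            "input_offset": input_offset, "full": False}            # 1107
```

Allowing a tolerance per chunk and a percentage over the N chunks is what lets the comparison survive noise spikes or background noise in the input file.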
In 1207, a determination is then made whether there are any more mini-matches to be input. If the determination in 1207 is YES, then the method jumps back to 1201 to input the next mini-match. In 1202, a determination is once again made whether there are any stored mini-matches. This time, since the first mini-match was stored, the determination will result in a YES, so that the method proceeds to 1204.
In 1204, a search is performed to find a merger candidate for the current mini-match among the mini-matches already in the store. In order to be considered a merger candidate, the current mini-match and the stored mini-match must refer to the same reference profile ID, and any difference between their respective wt1 parameters (offsets into the input profile at which the reference profile begins) must be within a specified tolerance such as 50 chunks or 5 seconds.
In 1205, a determination is then made whether a merger candidate has been found. If the determination in 1205 is NO, then the current mini-match is added to the store in 1203, and the method proceeds from there as previously described.
On the other hand, if the determination in 1205 is YES, then in 1206, the current mini-match is merged with the merger candidate. When merging the current mini-match with the merger candidate, the parameter values for wt1, wt2, time1 and time2 of the merged mini-match are weighted averages of the current mini-match and the merger candidate values, weighted by their respective matched times. The parameter value for “err” of the merged mini-match is the sum of the current mini-match and the merger candidate values. If either the current mini-match or the merger candidate is a “full” match, then the merged mini-match has its full match parameter set to true.
After merger, the method proceeds to 1207.
In 1207, a determination is made whether there are any more mini-matches to be processed. If the determination in 1207 is YES, then the method proceeds by looping through 1201-1207 until all mini-matches have been processed by either being stored individually in the audio matcher store or merged with another mini-match already stored in the audio matcher store, and the determination in 1207 at that time results in a NO.
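The store-and-merge procedure of 1201-1207 is sketched below. The `MiniMatch` class and function names are hypothetical. The weighted averaging of wt1, wt2, time1 and time2, the summing of "err", and the handling of the full match flag follow the description; summing the matched times of the two mini-matches is an assumption, since the description does not say how that value is combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MiniMatch:
    profile_id: str
    wt1: float
    wt2: float
    time1: float
    time2: float
    err: float
    time_matched: float
    full: bool


def merge(a: MiniMatch, b: MiniMatch) -> MiniMatch:
    """1206: combine two mini-matches that refer to the same reference profile."""
    total = a.time_matched + b.time_matched
    # Weights are the respective matched times; fall back to equal weights.
    wa, wb = (a.time_matched / total, b.time_matched / total) if total > 0 else (0.5, 0.5)
    return MiniMatch(
        profile_id=a.profile_id,
        wt1=wa * a.wt1 + wb * b.wt1,
        wt2=wa * a.wt2 + wb * b.wt2,
        time1=wa * a.time1 + wb * b.time1,
        time2=wa * a.time2 + wb * b.time2,
        err=a.err + b.err,
        time_matched=total,          # assumption: matched times add up
        full=a.full or b.full,
    )


def add_to_store(store: List[MiniMatch], current: MiniMatch,
                 wt1_tolerance: float = 50.0) -> None:
    """1202-1206: merge with a candidate if one exists, otherwise store as is."""
    for i, stored in enumerate(store):                               # 1204
        if (stored.profile_id == current.profile_id
                and abs(stored.wt1 - current.wt1) <= wt1_tolerance):
            store[i] = merge(stored, current)                        # 1206
            return
    store.append(current)                                            # 1203
```

Calling `add_to_store` once per incoming mini-match reproduces the loop: the first call finds an empty store and simply appends.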
On the other hand, if the determination in 1401 is YES, then in 1403, a determination is made whether the sum of the time matched for all the mini-matches in the store is greater than some threshold percentage of the input profile such as, for example, 70%. If the determination in 1403 results in a NO, then in 1402, a no spoof found conclusion is made and the method stops at that point.
On the other hand, if the determination in 1403 is YES, then in 1404, a determination is made whether each mini-match has an error/second value that is less than some maximum value. The error/second value for each mini-match may be calculated by the ratio of the mini-match's “err” parameter and “time matched” parameter. If the determination in 1404 results in a NO, then in 1402, a no spoof found conclusion is made and the method stops at that point.
On the other hand, if the determination in 1404 is YES, then in 1405, a spoof found conclusion is made and the method stops at that point. In this case, the spoof may be formed by compositing several tracks together or looping the same segment of one track. Since these kinds of spoofs are quite common on peer-to-peer networks, the ability to automatically identify them is useful.
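The checks in 1403-1405 can be sketched as below. Only those two checks are covered; the condition tested in 1401 is not reproduced. Representing each mini-match as an `(err, time_matched)` pair, the function name, and treating a mini-match with no matched time as failing the error-rate check are assumptions for illustration.

```python
from typing import Sequence, Tuple


def looks_like_spoof(
    mini_matches: Sequence[Tuple[float, float]],
    input_duration: float,
    max_err_per_second: float,
    min_coverage: float = 0.70,
) -> bool:
    """Apply the 1403 coverage check and the 1404 error-rate check."""
    if not mini_matches or input_duration <= 0:
        return False
    total_matched = sum(time_matched for _, time_matched in mini_matches)
    if total_matched <= min_coverage * input_duration:       # 1403 is NO
        return False                                         # 1402
    for err, time_matched in mini_matches:                   # 1404
        if time_matched <= 0 or err / time_matched >= max_err_per_second:
            return False                                     # 1402
    return True                                              # 1405
```

For example, two mini-matches covering 40 and 45 seconds of a 100-second input pass the 70% coverage check together even though neither would pass it alone, which is the situation a composite of several tracks produces.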
In 1503, the method then identifies one of the remaining mini-matches as a best match according to programmed criteria such as its errors/second value, its time matched value, and the percentage of its reference profile that it recognizes. Typically, the best match will be a mini-match that exceeds all other mini-matches in all of these criteria. In the event that two mini-matches are close, some weighting of the criteria may be performed to determine a best match between the two.
In 1504, a determination is then made whether the percentage of the input profile and the reference profile covered by the best match exceeds some minimum value. If the determination in 1504 is YES, then in 1505, the best match identified in 1503 is concluded to be an acceptable best match and the method ends at that point. On the other hand, if the determination in 1504 is NO, then the best match identified in 1503 is concluded in 1506 to be an unacceptable best match and the method ends at that point with a conclusion in this case that no acceptable best match was found.
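One way to express 1503-1506 is sketched below. The description leaves the weighting of the criteria open, so the scoring formula here (coverage of the input plus coverage of the reference, minus a weighted errors/second value) is purely a hypothetical choice, as are the `Candidate` class and its field names. The steps preceding 1503 are not reproduced.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    profile_id: str
    err_per_second: float
    input_coverage: float        # fraction of the input profile matched
    reference_coverage: float    # fraction of the reference profile matched


def acceptable_best_match(
    candidates: Sequence[Candidate],
    min_coverage: float,
    err_weight: float = 1.0,
) -> Optional[Candidate]:
    """Pick a best match (1503) and accept it only if coverage suffices (1504)."""
    if not candidates:
        return None

    def score(c: Candidate) -> float:
        # Hypothetical weighting: reward coverage, penalize the error rate.
        return c.input_coverage + c.reference_coverage - err_weight * c.err_per_second

    best = max(candidates, key=score)                                 # 1503
    if best.input_coverage > min_coverage and best.reference_coverage > min_coverage:
        return best                                                   # 1505
    return None                                                       # 1506
```

Returning `None` corresponds to the conclusion that no acceptable best match was found.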
Although the various aspects of the present invention have been described with respect to a preferred embodiment, it will be understood that the invention is entitled to full protection within the full scope of the appended claims.
This application claims priority to U.S. Provisional Application Ser. No. 60/568,881, filed May 6, 2004, which is incorporated herein by reference; and is a continuation-in-part of commonly-owned U.S. application Ser. No. 10/472,458, filed Sep. 19, 2003, now abandoned, entitled “Method and Apparatus for Identifying Electronic Files,” which is also incorporated herein by reference.
Number | Date | Country |
---|---|---|
20050216433 A1 | Sep 2005 | US |

Number | Date | Country |
---|---|---|
60568881 | May 2004 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 10472458 | Sep 2003 | US |
Child | 10963306 | | US |