Claims
- 1. A method for detecting a copy of a digital image, comprising:
dividing a suspected copy of a digital image into sub-images; determining average intensities of pixels associated with each of the sub-images; transforming the average intensities into a series of coefficients; defining a rank matrix from the series of coefficients; and comparing the rank matrix from the series of coefficients to a rank matrix of a query image to determine if the suspected copy is an actual copy of the digital image.
- 2. The method of claim 1, wherein each of the sub-images consists of an 8×8 block, and the digital image has 64 equal sized sub-images.
- 3. The method of claim 1, wherein the series of coefficients includes AC coefficients of a discrete cosine transform (DCT) function.
- 4. The method of claim 1, wherein the method operation of transforming the average intensities into a series of coefficients includes,
generating a two dimensional series of coefficients by performing a two dimensional discrete cosine transform (DCT).
- 5. The method of claim 1, wherein the method operation of defining a rank matrix from the series of coefficients includes,
arranging the series of coefficients in a one dimensional rank matrix.
- 6. The method of claim 5, wherein the one dimensional rank matrix places magnitudes of the series of coefficients in descending order.
- 7. A content-based image copy detection method, comprising:
selecting image data associated with an image; defining a signature index from the image data; storing the signature index from the image data; determining a signature index for a query image; and identifying a match between the stored signature index from the image data and the signature index for the query image.
- 8. The method of claim 7, wherein a size of the signature index from the image data and a size of the signature index for a query image are optimized to minimize false detections.
- 9. The method of claim 7, wherein both the signature index from the image data and the signature index for a query image are defined by a signature index having a number of coefficients selected from the group consisting of 24, 35, 48, and 63.
- 10. The method of claim 7, wherein the signature index from the image data is stored in a database.
- 11. The method of claim 7, wherein the method operation of selecting image data associated with an image includes,
searching a distributed network for the image data; and defining a cluster from a plurality of signature indices.
- 12. The method of claim 11, wherein a k-means algorithm is used to define the cluster.
- 13. The method of claim 11, wherein a number of clusters is determined by a cluster validity analysis.
- 14. The method of claim 11, wherein the cluster includes a cluster centroid.
- 15. The method of claim 14, further including:
comparing the signature index for the query image with the cluster centroid to determine whether the cluster is searched.
- 16. A method for finding an unauthorized copy of a digital image, comprising:
computing a rank matrix for a test image; computing a rank matrix for a query image; determining a threshold value, the threshold value indicating whether the test image is a copy of the query image; determining a distance value associated with a distance between the rank matrix for the test image and the rank matrix of the query image; and comparing the distance value with the threshold value, wherein if the distance value is less than the threshold value, then the test image is a copy of the query image.
- 17. The method of claim 16, wherein the method operation of determining a threshold value includes,
applying a maximum a posteriori (MAP) criterion to calculate the threshold value.
- 18. The method of claim 16, wherein the threshold value is inversely proportional to a ratio of prior probabilities.
- 19. The method of claim 16, further including:
calculating the rank matrix for the test image and the rank matrix for the query image through a two dimensional discrete cosine transform.
- 20. Computer code configured to be executed on a computer system, the computer code comprising:
program instructions for identifying image data; program instructions for defining a feature vector from the identified image data; program instructions for storing the feature vector; program instructions for determining a match between a feature vector of a query image and the feature vector from the identified image; and program instructions for displaying the match between the feature vector of the query image and the feature vector from the identified image.
- 21. The computer code of claim 20, wherein the feature vector is a rank matrix.
- 22. The computer code of claim 20, wherein the program instructions defining a feature vector include program instructions for performing a two dimensional discrete cosine transform (DCT).
- 23. The computer code of claim 20, wherein the program instructions for determining a match include program instructions for cluster-based detection.
- 24. The computer code of claim 23, wherein the program instructions for cluster-based detection are k-means clustering program instructions.
- 25. Computer code for determining whether a test image is a copy of a query image, the computer code comprising:
program instructions for calculating a rank matrix associated with a query image and a rank matrix associated with a test image; program instructions for determining a threshold value indicative of whether the test image is a copy of the query image; and program instructions for comparing the rank matrix associated with the query image to the rank matrix associated with the test image, the program instructions for comparing including,
program instructions for determining a distance value between the test image and the query image; and program instructions for examining whether the distance value is less than the threshold value, wherein when the distance value is less than the threshold value the test image is a copy of the query image.
- 26. The computer code of claim 25, wherein the program instructions for determining a threshold value indicative of whether the test image is a copy of the query image include,
program instructions for applying a maximum a posteriori (MAP) criterion to calculate the threshold value.
- 27. The computer code of claim 25, wherein the program instructions for calculating a rank matrix associated with a query image and a rank matrix associated with a test image include,
program instructions for calculating each rank matrix by applying a two dimensional discrete cosine transform.
- 28. The computer code of claim 25, wherein the threshold value is associated with an optimal precision value and an optimal recall value.
- 29. A computer readable media having program instructions for detecting a copy of a digital image, comprising:
program instructions for dividing a suspected copy of a digital image into sub-images; program instructions for determining average intensities associated with each of the sub-images; program instructions for transforming the average intensities into a series of coefficients; program instructions for defining a rank matrix from the series of coefficients; and program instructions for comparing the rank matrix from the series of coefficients to a rank matrix of a query image to determine if the suspected copy is an actual copy of the digital image.
- 30. The computer readable media of claim 29, wherein the program instructions for transforming the average intensities into a series of coefficients include,
program instructions for generating a two dimensional series of coefficients by performing a two dimensional discrete cosine transform (DCT).
- 31. The computer readable media of claim 29, wherein the program instructions for defining a rank matrix from the series of coefficients include,
program instructions for arranging the series of coefficients in a one dimensional rank matrix.
- 32. The computer readable media of claim 29, further including,
program instructions for defining a cluster from a plurality of rank matrices.
- 33. The computer readable media of claim 32, further including,
program instructions for comparing the rank matrix of a query image with a rank matrix of a centroid of the cluster to determine if the cluster is searched.
- 34. A computer system, comprising:
a database generation system for assembling image data, the database generation system including,
an image gathering system for identifying the image data; feature extraction code for extracting a signature index for the image data; and a database query system for matching the image data with query data, the database query system including,
a database configured to store the signature index for the image data; and a feature matching system configured to identify matches between a signature index of the query data and the signature index for the image data to determine if the image data is a copy of the query data.
- 35. The computer system of claim 34, wherein the image gathering system is a web crawler.
- 36. The computer system of claim 34, wherein the signature index is a rank matrix.
- 37. The computer system of claim 34, wherein the feature matching system includes the capability to determine whether a cluster associated with the database is searched.
- 38. The computer system of claim 34, wherein the signature index is derived through a discrete cosine transform (DCT) function.
- 39. The computer system of claim 34, wherein the feature matching system defines a threshold value for use in determining whether the image data is a copy of the query data.
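The operations recited in claims 1 through 6 and claim 16 can be illustrated with a minimal sketch. It assumes a grayscale image supplied as a list of rows of integer intensities whose height and width are multiples of 8. The function names, the row-major ordering of the AC coefficients, the L1 distance between rank matrices, and the threshold passed to `is_copy` are illustrative assumptions and are not details taken from the claims.

```python
import math

N = 8  # sub-images per side, giving 8 x 8 = 64 equal-sized sub-images


def block_means(image, n=N):
    """Return an n x n matrix of average intensities, one per sub-image.

    Assumes the image height and width are multiples of n.
    """
    bh, bw = len(image) // n, len(image[0]) // n
    means = []
    for by in range(n):
        row = []
        for bx in range(n):
            total = sum(image[y][x]
                        for y in range(by * bh, (by + 1) * bh)
                        for x in range(bx * bw, (bx + 1) * bw))
            row.append(total / (bh * bw))
        means.append(row)
    return means


def dct2(block):
    """Orthonormal two dimensional DCT-II of a square matrix."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[y][x]
                    * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                    for y in range(n) for x in range(n))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def rank_matrix(coeffs):
    """One dimensional rank matrix of the AC coefficient magnitudes.

    The DC term at (0, 0) is skipped; rank 0 marks the largest magnitude.
    Row-major coefficient order is an assumption of this sketch.
    """
    n = len(coeffs)
    ac = [abs(coeffs[u][v])
          for u in range(n) for v in range(n) if (u, v) != (0, 0)]
    order = sorted(range(len(ac)), key=lambda i: -ac[i])
    ranks = [0] * len(ac)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def signature(image):
    """Sub-image averages -> 2-D DCT -> rank matrix (63 entries)."""
    return rank_matrix(dct2(block_means(image)))


def rank_distance(r1, r2):
    """L1 distance between two rank matrices (an assumed distance measure)."""
    return sum(abs(a - b) for a, b in zip(r1, r2))


def is_copy(test_image, query_image, threshold):
    """Report a copy when the distance value is less than the threshold."""
    distance = rank_distance(signature(test_image), signature(query_image))
    return distance < threshold
```

Because only the ordering of the coefficient magnitudes is retained, an operation that scales every intensity by the same factor leaves the rank matrix unchanged in this sketch. The threshold is left as a caller-supplied parameter; claims 17 and 18 describe deriving it from a maximum a posteriori (MAP) criterion, which is not implemented here.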
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority from: (1) U.S. Provisional Patent Application No. 60/364,769, filed Mar. 14, 2002, entitled “CONTENT-BASED IMAGE COPY DETECTION,” (2) U.S. Provisional Patent Application No. 60/384,584 filed May 31, 2002, entitled “CONTENT-BASED IMAGE COPY DETECTION,” and (3) U.S. Provisional Patent Application No. 60/372,208 filed Apr. 12, 2002, entitled “ORDINAL MEASURE OF DCT COEFFICIENTS FOR IMAGE COPY DETECTION.” Each of these provisional applications is incorporated by reference herein.
Provisional Applications (3)
| Number | Date | Country |
| --- | --- | --- |
| 60364769 | Mar 2002 | US |
| 60384584 | May 2002 | US |
| 60372208 | Apr 2002 | US |