System and method for automatic detection of duplicate digital photos

Information

  • Patent Application
  • Publication Number
    20080091725
  • Date Filed
    October 13, 2006
  • Date Published
    April 17, 2008
Abstract
Hashes of metadata of digital photographs on, e.g., a removable camera memory are compared against values in a hash table representing previously stored photographs on an archive computer to ensure that only previously unstored photos are copied onto the archive computer.
Description

BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a block diagram of a non-limiting system that can be used to implement the invention; and



FIG. 2 is a flow chart of non-limiting logic that can be executed by the system shown in FIG. 1.





DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Referring initially to FIG. 1, a system is shown, generally designated 10, that includes a user computer 12, such as but not limited to a personal computer, laptop computer, or notebook computer, or a dedicated computerized storage device such as a so-called “digital shoebox”. If desired, the user computer 12 may communicate over the Internet 14 or other wide area network with a server 16, although Internet communication is not necessarily central to the present invention. In typical non-limiting implementations the user computer 12 includes data entry devices 18 such as keyboards and mice, and data output devices such as a monitor 20.


Additionally, the user computer 12 can include a local internal or external data store 22 such as but not limited to a hard disk drive or optical disk drive, alone or in combination with solid state memory, etc. Digital photographs may be stored in the local data store 22. Also, the computer 12 may be engageable with a removable memory 24 such as but not limited to a Sony Memory Stick® that may also bear digital photographs taken by a camera 26 with which the removable memory 24 can be engaged. A user computer processor 28 can execute logic stored in local memory to execute various steps described further below.


The camera 26 typically stores a digital photograph in file form, appending to the file metadata known as “Exchangeable Image File” (EXIF) data. In one non-limiting implementation, the EXIF data may include but may not be limited to file name, camera model name, shooting date/time, shooting mode, photo effect, shutter speed, aperture value, light metering, exposure compensation, ISO speed, lens type, focal length, whether zoom was used, IS mode, image size, image quality, and whether a flash was used and if so what type. Additional metadata that can be appended to a photo file either by the camera 26 or by the processor 28 can include the file size.
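As a non-limiting illustration, the metadata described above might be held in a small record before any comparison is made. The following Python sketch is an assumption for illustration only: the names `PhotoMetadata` and `make_metadata` are hypothetical, and the EXIF fields are represented as plain strings rather than being read from a real photo file.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PhotoMetadata:
    """Hypothetical record of the metadata associated with one photo file."""
    file_name: str
    file_size: int                        # size in bytes
    exif: Tuple[Tuple[str, str], ...]     # (field name, value) pairs, sorted


def make_metadata(file_name: str, file_size: int,
                  exif: Dict[str, str]) -> PhotoMetadata:
    # Sort the EXIF fields so that two records built from the same fields
    # compare equal regardless of the order in which the fields were read.
    return PhotoMetadata(file_name, file_size, tuple(sorted(exif.items())))


record = make_metadata(
    "DSC00042.JPG",
    2_481_152,
    {"Shooting date/time": "2006:10:13 09:30:00", "ISO speed": "100"},
)
print(record.file_name, record.file_size, len(record.exif))
```

Because the record is frozen and its fields are sorted, two records describing the same photo compare equal, which is the property the later comparison step relies on.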


Completing the description of FIG. 1, the server 16 includes a server processor 32 that can access a server store 34, and the server store 34 can contain photograph files and other data, including user shipping data and billing information. Also, the server 16 can print hard copy prints of digital photographs using a server printer 36, for shipping of the prints to a user of the user computer 12.


Now turning to FIG. 2, the present logic can be seen. Commencing at block 40, when, e.g., the removable memory 24 is engaged with the computer 12 for the purpose of automatically archiving photograph files generated by the camera 26 onto the local data store 22, a do loop is entered for each photo file. More generally, photos on one storage such as the removable memory 24 or other storage, including, e.g., the Internet server store 34 or other data store reached via a wired or wireless connection, are sought to be archived onto the local data store 22.
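By way of a hedged example, the per-file loop entered at block 40 could begin by enumerating the photo files on the source storage. The Python sketch below assumes the source storage is mounted as a directory and that photo files can be recognized by a `.jpg` or `.jpeg` suffix; neither assumption comes from the description above, and the names `iter_photo_files` and `PHOTO_SUFFIXES` are hypothetical.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Assumption: photo files are identified by suffix. The description above
# does not say how photo files are recognized on the source storage.
PHOTO_SUFFIXES: Tuple[str, ...] = (".jpg", ".jpeg")


def iter_photo_files(source_root: str,
                     suffixes: Tuple[str, ...] = PHOTO_SUFFIXES) -> Iterator[Path]:
    """Yield each photo file found under the mounted source storage."""
    for path in sorted(Path(source_root).rglob("*")):
        # Compare suffixes case-insensitively, since cameras often write
        # upper-case names such as DSC00042.JPG.
        if path.is_file() and path.suffix.lower() in suffixes:
            yield path
```

Each yielded path would then be handed to the metadata and comparison steps described next.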


The do loop proceeds to block 42 to obtain metadata of the file. In one non-limiting implementation, the data obtained is file name, file size, and other (or all of the above) EXIF data. In a particularly preferred implementation the file name, size, and other EXIF data are hashed.
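A minimal sketch of the hashing performed at block 42 follows, written in Python. The description does not name a hash function or a serialization, so SHA-256 and a canonical JSON encoding are assumptions chosen for illustration, and `hash_metadata` is a hypothetical name.

```python
import hashlib
import json
from typing import Dict


def hash_metadata(file_name: str, file_size: int, exif: Dict[str, str]) -> str:
    """Return a hex digest of the file name, file size, and EXIF fields."""
    record = {"file_name": file_name, "file_size": file_size, "exif": exif}
    # sort_keys makes the encoding independent of the order in which the
    # EXIF fields were read, so identical metadata always hashes the same.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


digest = hash_metadata("DSC00042.JPG", 2_481_152, {"ISO speed": "100"})
print(len(digest))  # a SHA-256 hex digest is 64 characters long
```

Note that only metadata is hashed, not the image data, so two files with identical name, size, and EXIF fields are treated as the same photograph.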


Decision diamond 44 indicates that the metadata obtained at block 42 is compared to metadata in a table that is accessible to the processor 28 and that contains metadata of photo files that have already been stored on the local data store 22. When a hash is used, the table stores hash values, and at decision diamond 44 the processor 28 simply compares the hash obtained at block 42 with the values in the hash table. If no match is found, at block 46 the hash that was obtained at block 42 is added to the table and the photograph file is stored on the local data store 22. On the other hand, if the metadata of the photo file under test matches data in the table (e.g., if the hash from block 42 matches a hash in the table), the process flows from decision diamond 44 to block 48 wherein the photo file is not stored and, if desired, a message is returned to the user to the effect that “this photograph has already been stored.” Further, if desired the process can erase redundant photo files from the removable memory 24.
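The flow from block 40 through blocks 46 and 48 can be sketched as follows in Python. This is an illustration under stated assumptions, not the claimed implementation: the hash table is modeled as a Python `set`, SHA-256 stands in for the unspecified hash function, and the names `metadata_hash`, `archive_new_photos`, and the `store` callback are hypothetical.

```python
import hashlib
import json
from typing import Callable, Dict, Iterable, List, Set, Tuple

# (file name, file size in bytes, EXIF fields); a hypothetical representation.
Photo = Tuple[str, int, Dict[str, str]]


def metadata_hash(photo: Photo) -> str:
    name, size, exif = photo
    canonical = json.dumps([name, size, exif], sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def archive_new_photos(photos: Iterable[Photo],
                       known_hashes: Set[str],
                       store: Callable[[Photo], None]) -> List[str]:
    """Store only previously unstored photos; return names of skipped ones."""
    skipped: List[str] = []
    for photo in photos:                   # block 40: loop over photo files
        digest = metadata_hash(photo)      # block 42: hash the metadata
        if digest in known_hashes:         # decision diamond 44: match?
            skipped.append(photo[0])       # block 48: do not store
            continue
        store(photo)                       # block 46: store the file ...
        known_hashes.add(digest)           # ... and add its hash to the table
    return skipped


archive: List[Photo] = []
table: Set[str] = set()
card = [("DSC1.JPG", 100, {"ISO speed": "100"}),
        ("DSC2.JPG", 200, {"ISO speed": "400"})]
print(archive_new_photos(card, table, archive.append))  # → []
print(archive_new_photos(card, table, archive.append))  # → ['DSC1.JPG', 'DSC2.JPG']
```

In this sketch the hash is added to the table only after `store` returns, so a failed copy is not recorded as archived; that ordering is a design choice of the example rather than something the description specifies. The returned list of skipped names could drive the optional “this photograph has already been stored” message.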


While the particular SYSTEM AND METHOD FOR AUTOMATIC DETECTION OF DUPLICATE DIGITAL PHOTOS is herein shown and described in detail, it is to be understood that the subject matter which is encompassed by the present invention is limited only by the claims.

Claims
  • 1. A method for storing, onto a first computer storage, digital photo files on a second computer storage, comprising: for at least one photo file sought to be stored, accessing metadata of the file; comparing the metadata or a hash thereof with data in a data structure representing photo files that have been previously stored onto the first computer storage; and determining whether to store the photo file onto the first computer storage based at least in part on the result of the comparing act.
  • 2. The method of claim 1, wherein all photo files on the second computer storage are automatically sought to be stored on the first computer storage, and metadata or a hash thereof for each photo file is compared to data in the data structure.
  • 3. The method of claim 1, wherein a hash of metadata is compared to data in the data structure, and the data structure is a hash table.
  • 4. The method of claim 3, wherein the hash is a hash of file name, file size, and predetermined EXIF data.
  • 5. The method of claim 3, wherein if the hash of metadata associated with the photo file sought to be stored matches a value in the hash table, the photo file is not copied onto the first computer storage.
  • 6. The method of claim 5, wherein if the hash of metadata associated with the photo file sought to be stored does not match a value in the hash table, the photo file is copied onto the first computer storage and the hash of metadata associated with the photo file sought to be stored is added to the hash table.
  • 7. The method of claim 4, wherein the EXIF data includes camera model name and/or shooting date/time and/or shooting mode and/or photo effect and/or shutter speed and/or aperture value and/or light metering and/or exposure compensation and/or ISO speed and/or lens type and/or focal length and/or whether zoom was used and/or IS mode and/or image size and/or image quality and/or whether a flash was used and if so what type.
  • 8. The method of claim 4, wherein the EXIF data includes camera model name and shooting date/time and shooting mode and photo effect and shutter speed and aperture value and light metering and exposure compensation and ISO speed and lens type and focal length and whether zoom was used and IS mode and image size and image quality and whether a flash was used and if so what type.
  • 9. An apparatus for storing digital photo files, comprising: at least a first computer storage; at least one processor accessing a second computer storage to compare hash values of metadata associated with photo files on the second computer storage with values in a hash table and determining whether to store each photo file onto the first computer storage at least partially based thereon.
  • 10. The apparatus of claim 9, wherein all photo files on the second computer storage are automatically sought to be stored on the first computer storage.
  • 11. The apparatus of claim 10, wherein the hash value represents a hash of file name, file size, and predetermined EXIF data.
  • 12. The apparatus of claim 9, wherein if a hash value of metadata associated with a photo file sought to be stored matches a value in the hash table, the photo file is not copied onto the first computer storage.
  • 13. The apparatus of claim 12, wherein if a hash value of metadata associated with a photo file sought to be stored does not match a value in the hash table, the photo file is copied onto the first computer storage and the hash value of metadata associated with the photo file is added to the hash table.
  • 14. A computer readable medium bearing instructions executable by a computer processor to compare hashes of metadata of digital photographs against values in a hash table representing previously stored photographs on an archive data store to ensure that only previously unstored photos are copied onto the archive data store.
  • 15. The computer readable medium of claim 14, wherein the instructions include causing the processor to hash file name, file size, and EXIF data of each digital photograph sought to be stored on the archive data store.
  • 16. The computer readable medium of claim 14, wherein if a hash value of metadata associated with a digital photograph sought to be stored matches a value in the hash table, the digital photograph is not copied onto the archive computer storage.
  • 17. The computer readable medium of claim 16, wherein if a hash value of metadata associated with a digital photograph sought to be stored does not match a value in the hash table, the digital photograph is copied onto the archive computer storage and the hash value of metadata associated with the digital photograph is added to the hash table.