The present invention relates generally to software. More specifically, a method and system for stacking images is described.
Advances in conventional digital camera technology have made the process of capturing images easier. However, capturing a good quality image is difficult and problematic using conventional techniques. To work around this problem, many users capture the same image multiple times with the expectation that one of the images may be usable for printing or assembly in a slide show. For example, a feature referred to as “burst mode” or “continuous shooting mode” may allow a digital camera to take a sequence of images in rapid succession. Some digital camera models allow for an unlimited number of image captures, while other models limit the number of successive images in a single burst.
Capturing multiple images may contribute to image window clutter and unmanageability. In some conventional applications, users can group images in folders or files, such as the filing system in an operating system of a computer. Users may be required to manually select the images, drag them to a folder, and drop them in. This may be very tedious for an avid photographer. Even after the tedious organizational effort, the generic filing systems of operating systems may require the user to locate and select one particular image for printing, viewing, or copying functions.
Image-based applications may provide the concept of a “stack” which may provide a way to group images with some added benefits. The “stack” of images may be treated as if it were one image. That is, one thumbnail image is shown in a display window as a representation of the stack. The other images in the stack are stored and available, but may not have a visible thumbnail. Further, the image selected as the thumbnail may be automatically pre-selected for printing, or viewing. However, similarly to files or folders, users may be required to manually select the images in the stack, which may be tedious and time consuming.
Some applications provide for the grouping of images based on time proximity. That is, the images captured during a selected time interval may be grouped together. However, this method may group images together that have no subject matter commonality. In other words, images taken during a selected time interval may be visually different or disparate. For example, a user may be on a whale watching trip and capture images of a whale breaching when a sea bird passes between the lens and the whale. These images may be grouped by time proximity analysis simply because they were captured during the same time range. Other applications provide for the grouping of images based on visual similarity. That is, images that share common scenery or common subject matter are grouped together. However, this method may group differing views of the same subject captured years apart that a user may want to keep separate.
Thus, what is needed is a method and system for stacking images without the limitations of conventional techniques.
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings:
Various embodiments of the invention may be implemented in numerous ways, including as a system, a process, an apparatus, or a series of program instructions on a computer readable medium such as a computer readable storage medium or a computer network where the program instructions are sent over optical or electronic communication links. In general, operations of disclosed processes may be performed in an arbitrary order, unless otherwise provided in the claims.
A detailed description of one or more embodiments is provided below along with accompanying figures. The detailed description is provided in connection with such embodiments, but is not limited to any particular example. The scope is limited only by the claims and numerous alternatives, modifications, and equivalents are encompassed. Numerous specific details are set forth in the following description in order to provide a thorough understanding. These details are provided for the purpose of example and the described techniques may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the embodiments has not been described in detail to avoid unnecessarily obscuring the description.
Techniques for stacking images are described, including receiving a plurality of electronic images, determining one or more groupings for the plurality of electronic images based on a time proximity and a visual similarity, and creating one or more stacks for the plurality of electronic images based on one or more of the groupings. A system for stacking digital images is also described, including a visual similarity engine that may be configured to assess a visual similarity between digital images, a time analyzer configured to analyze a time proximity between digital images, a grouping module configured to group digital images in one or more groups based on the visual similarity and the time proximity, and a user interface configured to display the groups of digital images.
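For illustration, the pipeline described above may be sketched in Python. This is a minimal sketch, not the claimed implementation: the `Image` record and the `time_proximate` and `visually_similar` predicates are hypothetical stand-ins for the time analyzer and visual similarity engine.

```python
from dataclasses import dataclass

@dataclass
class Image:
    name: str
    timestamp: float   # capture time, e.g. seconds since epoch
    features: tuple    # placeholder for visual features

def group_images(images, time_proximate, visually_similar):
    """Group images that are close in time AND at least somewhat similar.
    Both predicates are caller-supplied stand-ins for the analyzers."""
    groups = []
    for img in sorted(images, key=lambda i: i.timestamp):
        placed = False
        for group in groups:
            last = group[-1]
            if time_proximate(last, img) and visually_similar(last, img):
                group.append(img)
                placed = True
                break
        if not placed:
            groups.append([img])
    return groups

def create_stacks(groups):
    """Each stack keeps all of its images; here the first image in a
    group is used as the representative thumbnail by default."""
    return [{"representative": g[0], "images": g} for g in groups]
```

A user interface component would then display one thumbnail per stack rather than one per image.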
In some examples, raster formats may be based on picture elements, or pixels. A display space or image may be divided up into thousands or millions of pixels. The pixels may be arranged in rows and columns and may be referred to as colored dots. Raster formats differ in the number of bits used per pixel and the compression technique used in storage. One bit per pixel can store black and white images. For example, a bit value of 0 may indicate a black pixel, and a bit value of 1 may indicate a white pixel. As the number of bits per pixel increases, the number of colors a pixel can represent may increase. Example raster formats for image files may include: Graphics Interchange Format (GIF), Joint Photographic Experts Group (JPEG), and Microsoft Windows™ Bitmap (Bitmap). Additional raster formats for image files may include: Windows® Clip Art (CLP), Soft Paintbrush™ (DCX), OS/2™ Warp Format (DIB), Kodak FlashPix™ (FPX), GEM Paint™ Format (IMG), JPEG Related Image Format (JIF), MacPaint™ (MAC), MacPaint™ New Version (MSP), Macintosh™ PICT Format (PCT), ZSoft Paintbrush™ (PCX), Portable Pixel Map by UNIX™ (PPM), Paint Shop Pro™ format (PSP), Unencoded image format (RAW), Run-Length Encoded (RLE), Tagged Image File Format (TIFF), and WordPerfect™ Image Format (WPG).
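The relationship between bit depth and representable colors described above is exponential, as a short sketch makes concrete:

```python
# The number of distinct colors a pixel can represent doubles with
# each additional bit of depth: 2 ** bits.
def colors_per_pixel(bits: int) -> int:
    return 2 ** bits

# 1 bit per pixel: two values, e.g. 0 = black, 1 = white.
# 8 bits per pixel: 256 values, as in a typical palette image.
# 24 bits per pixel: over 16 million values ("true color" RGB).
```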
Vector formats are not based on pixels, but on vectors of data stored in mathematical formats, which may allow for curve shaping and improved stability over raster images. Some example meta/vector image formats may include: CorelDraw™ (CDR), Hewlett-Packard Graphics Language (HGL), Hewlett-Packard Plotter Language (GL/2), Windows™ Metafile (EMF or WMF), Encapsulated PostScript (EPS), Computer Graphics Metafile (CGM), Flash™ Animation, Scalable Vector Graphics (SVG), and Macintosh™ graphic file format (PICT).
Thumbnails image 1 through image 24 in
Display window 100 may be a user interface in a file management application such as a digital image management application, a web page, or other application. Although display window 100 displays thumbnails image 1 through image 24, some other embodiments may include many more images to be managed. Management of images may include creating stacks, selecting stacks for slide shows and/or printing, and the like.
In some embodiments, creating a stack refers to the grouping of image files in a common holder represented by a single thumbnail. The stack may preserve the plurality of images without cluttering up the display window or user interface with multiple thumbnails. The image chosen as the stack representative may be printed or used in a slide show by selecting the stack. That is, upon selecting the stack, the representative image is automatically selected.
In some embodiments, grouping module 202 may include time analyzer 206. In some embodiments, time analyzer 206 may analyze the plurality of images for time proximity. Here, time proximity may include the measure of the difference between the time stamps of two or more images and the comparison of the difference to a time range. Time analyzer 206 may include a time range such that images which have time stamps that fall within the time range may be considered to have time proximity. In some embodiments, the time range may be set via user interface 210. In some embodiments, when two or more images possess time proximity and at least a low to medium visual similarity, they may be grouped together by grouping module 202.
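The time proximity test described above reduces to a comparison of a time stamp difference against a configured range. A minimal sketch, assuming time stamps in seconds and a hypothetical function name:

```python
def have_time_proximity(ts_a: float, ts_b: float, time_range: float) -> bool:
    """Two images possess time proximity when the absolute difference
    between their time stamps falls within the time range (which, per
    the description, may be user-configurable via the interface)."""
    return abs(ts_a - ts_b) <= time_range
```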
In some embodiments, grouping module 202 may include visual similarity engine 204. Visual similarity engine 204 may perform visual similarity analysis. That is, visual similarity engine 204 may analyze the plurality of images input into grouping module 202 to determine if they are visually similar. In some embodiments, visual similarity may be the measure of the degree of similarity between two or more images. Here, visual similarity may refer to having a likeness such that the images capture the same scene. In some embodiments, visual similarity engine 204 may select nodal points in an image for correlation to nodal points in another image. Visual similarity engine 204 may include a threshold value used as the measure beyond which visual similarity may be achieved. In some embodiments, visual similarity engine 204 may include more than one threshold value (e.g., minimum/low and maximum/high thresholds or low, medium, and high thresholds) such that the high threshold is greater than the medium threshold and the medium threshold is greater than the low threshold. Multiple threshold values may allow various levels of visual similarity to be determined. For example, if low, medium, and high visual similarity thresholds are set, the various levels of visual similarity may include low visual similarity, medium visual similarity, and high visual similarity. In some embodiments, the threshold(s) may be set via user interface 210. In some embodiments, two or more images that possess time proximity and at least a low visual similarity may be grouped. In some embodiments, two or more images that are determined to have a visual similarity above a “high” or a stricter threshold may be grouped together.
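Mapping a raw similarity score onto the low/medium/high levels described above is a simple threshold cascade. A sketch, assuming scores normalized to [0, 1] and hypothetical default thresholds (the description leaves the actual values user-configurable):

```python
def similarity_level(score: float,
                     low: float = 0.5,
                     medium: float = 0.65,
                     high: float = 0.8) -> str:
    """Classify a similarity score against low < medium < high
    thresholds, returning the highest level the score reaches."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    if score >= low:
        return "low"
    return "none"
```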
In some embodiments, the groups may be displayed in user interface 210. Once the groups are displayed in user interface 210 they may be modified. Modification may refer to alteration, adjustment, revision, or change. Group modification or alteration may include moving images from one group to another group, deleting images from a group, and splitting a group into two groups. In some other embodiments, the groups may be sent to stacking module 212 to create stacks based on the groups.
In some embodiments, system 200 may include stacking module 212. Stacking module 212 may create stacks based on the groups. As mentioned previously, creating a stack refers to the grouping of image files in a common holder represented by a single thumbnail. The stack may preserve the plurality of images without cluttering up the display window or user interface with multiple thumbnails. One image in the stack may be chosen as the stack representative. This representative image may be printed or used in a slide show by selecting the stack. That is, upon selecting the stack, the representative image is automatically selected. In some embodiments, the stack representative image may be selected by a user. In some embodiments, the stack representative image may be automatically selected. The resulting stacks may be displayed in user interface 210.
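The stack behavior described above, a common holder whose selection automatically yields its representative image, can be sketched as a small class. This is an illustrative model only; the class and method names are hypothetical:

```python
class Stack:
    """A common holder for grouped images, represented in the display
    window by a single thumbnail (the representative image)."""

    def __init__(self, images, representative=None):
        if not images:
            raise ValueError("a stack must hold at least one image")
        self.images = list(images)
        # Default to the first image unless a representative is
        # chosen (e.g. by the user, per the description).
        self.representative = representative if representative is not None \
            else self.images[0]

    def select(self):
        """Selecting the stack automatically selects its representative
        image, e.g. for printing or inclusion in a slide show."""
        return self.representative
```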
If decision block 272 determines that the time range has not been met, then decision block 282 determines if the visual similarity of image A and image B has met a maximum similarity threshold. If the maximum visual similarity is met, process action 276 may group the images. If the maximum, or stricter, similarity threshold is not met, then no grouping occurs. Process 285 flows to decision block 278 where it may be determined if there are more images to group. If there are more images, image B may become image A and the next image in time order becomes image B in process action 280. Then process 285 compares the new image A and image B as described above. The maximum similarity threshold may be stricter than, more stringent than, or greater than the minimum similarity threshold. That is, if the images do not possess time proximity, process 285 allows for a more stringent visual similarity requirement to group the images. In other words, the maximum similarity threshold requires a higher degree of similarity than the minimum similarity threshold. For example, the minimum visual similarity threshold may be 70% similarity and the maximum similarity threshold may be 80%. With these settings, process 285 may group some images that have time proximity and have a visual similarity of at least 70%. Process 285 may also group images that do not have time proximity but do have a visual similarity of 80% or more.
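The two-threshold walk described above (decision blocks 272 and 282, grouping action 276, advancing action 280) can be sketched as a single pass over time-ordered images: adjacent images are grouped when they have time proximity and meet the minimum similarity, or, failing time proximity, when they meet the stricter maximum similarity. The `similarity` callable is a hypothetical stand-in for the visual similarity engine; the 70%/80% defaults mirror the example values in the text.

```python
def group_pairwise(images, similarity, time_range,
                   min_sim=0.70, max_sim=0.80):
    """images: list of (timestamp, image_id) tuples.
    similarity(a, b) -> score in [0, 1].
    Returns a list of groups (lists of tuples)."""
    images = sorted(images, key=lambda i: i[0])
    groups = [[images[0]]]
    # Compare each image (image A) with the next in time order (image B).
    for a, b in zip(images, images[1:]):
        score = similarity(a, b)
        in_range = abs(b[0] - a[0]) <= time_range
        if (in_range and score >= min_sim) or \
           (not in_range and score >= max_sim):
            groups[-1].append(b)   # group B with A (action 276)
        else:
            groups.append([b])     # no grouping; B starts a new group
    return groups
```

With the example settings, two images 70% similar are grouped only if captured within the time range, while images 80% or more similar are grouped regardless of capture time.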
The groupings created by the grouping process may, in some embodiments, be modified by a user. The modifications may include splitting a group into two groups, moving images from one group to another group, and deleting images from a group.
In an example, group 336 in
According to some embodiments of the invention, computer system 500 performs specific operations by processor 504 executing one or more sequences of one or more instructions stored in system memory 506. Such instructions may be read into system memory 506 from another computer readable medium, such as static storage device 508 or disk drive 510. In some embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention.
The term “computer readable medium” refers to any medium that participates in providing instructions to processor 504 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as disk drive 510. Volatile media includes dynamic memory, such as system memory 506. Transmission media includes coaxial cables, copper wire, and fiber optics, including wires that comprise bus 502. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
Common forms of computer readable media include, for example, floppy disk, flexible disk, hard disk, magnetic tape, any other magnetic medium, CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, any other memory chip or cartridge, carrier wave, or any other medium from which a computer can read.
In some embodiments of the invention, execution of the sequences of instructions to practice the invention is performed by a single computer system 500. According to some embodiments of the invention, two or more computer systems 500 coupled by communication link 520 (e.g., LAN, PSTN, or wireless network) may perform the sequence of instructions to practice the invention in coordination with one another. Computer system 500 may transmit and receive messages, data, and instructions, including programs, i.e., application code, through communication link 520 and communication interface 512. Received program code may be executed by processor 504 as it is received, and/or stored in disk drive 510, or other non-volatile storage for later execution.
Although the foregoing examples have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed examples are illustrative and not restrictive.
Number | Name | Date | Kind |
---|---|---|---|
5675358 | Bullock et al. | Oct 1997 | A |
6282317 | Luo et al. | Aug 2001 | B1 |
6538698 | Anderson | Mar 2003 | B1 |
6580437 | Liou et al. | Jun 2003 | B1 |
6745186 | Testa et al. | Jun 2004 | B1 |
6819783 | Goldberg et al. | Nov 2004 | B2 |
6834122 | Yang et al. | Dec 2004 | B2 |
6915011 | Loui et al. | Jul 2005 | B2 |
6928233 | Walker et al. | Aug 2005 | B1 |
6950533 | Zlotnick | Sep 2005 | B2 |
6956573 | Bergen et al. | Oct 2005 | B1 |
7437005 | Drucker et al. | Oct 2008 | B2 |
7694236 | Gusmorino et al. | Apr 2010 | B2 |
7831599 | Das et al. | Nov 2010 | B2 |
8326087 | Perronnin et al. | Dec 2012 | B2 |
8385663 | Xu et al. | Feb 2013 | B2 |
20020001404 | Yoshikawa et al. | Jan 2002 | A1 |
20020009286 | Kasutani | Jan 2002 | A1 |
20020075322 | Rosenzweig et al. | Jun 2002 | A1 |
20020168108 | Loui et al. | Nov 2002 | A1 |
20030026507 | Zlotnick | Feb 2003 | A1 |
20030059107 | Sun et al. | Mar 2003 | A1 |
20030072486 | Loui et al. | Apr 2003 | A1 |
20030084065 | Lin et al. | May 2003 | A1 |
20030123713 | Geng | Jul 2003 | A1 |
20030123737 | Mojsilovic et al. | Jul 2003 | A1 |
20030145279 | Bourbakis et al. | Jul 2003 | A1 |
20030152363 | Jeannin et al. | Aug 2003 | A1 |
20030184653 | Ohkubo | Oct 2003 | A1 |
20030189602 | Dalton et al. | Oct 2003 | A1 |
20030195883 | Mojsilovic et al. | Oct 2003 | A1 |
20030206668 | Nakajima et al. | Nov 2003 | A1 |
20030227468 | Takeda | Dec 2003 | A1 |
20040001631 | Camara et al. | Jan 2004 | A1 |
20040208365 | Loui et al. | Oct 2004 | A1 |
20040228504 | Chang | Nov 2004 | A1 |
20050004690 | Zhang et al. | Jan 2005 | A1 |
20050163378 | Chen | Jul 2005 | A1 |
20050283742 | Gusmorino et al. | Dec 2005 | A1 |
20060026524 | Ma et al. | Feb 2006 | A1 |
20060071942 | Ubillos et al. | Apr 2006 | A1 |
20060071947 | Ubillos et al. | Apr 2006 | A1 |
20060106816 | Oisel et al. | May 2006 | A1 |
20060195475 | Logan et al. | Aug 2006 | A1 |
20060214953 | Crew et al. | Sep 2006 | A1 |
20060220986 | Takabe et al. | Oct 2006 | A1 |
20070035551 | Ubillos | Feb 2007 | A1 |
20070088748 | Matsuzaki et al. | Apr 2007 | A1 |
20070201558 | Xu et al. | Aug 2007 | A1 |
20070226255 | Anderson | Sep 2007 | A1 |
20080205772 | Blose et al. | Aug 2008 | A1 |
20090123021 | Jung et al. | May 2009 | A1 |
20090150376 | O'Callaghan et al. | Jun 2009 | A1 |
20090161962 | Gallagher et al. | Jun 2009 | A1 |
20090220159 | Tanaka et al. | Sep 2009 | A1 |
20090313267 | Girgensohn et al. | Dec 2009 | A1 |
20100128919 | Perronnin et al. | May 2010 | A1 |
20100172551 | Gilley et al. | Jul 2010 | A1 |
20110064317 | Ubillos | Mar 2011 | A1 |
20120082378 | Peters et al. | Apr 2012 | A1 |
20120328190 | Bercovich et al. | Dec 2012 | A1 |
20130121590 | Yamanaka et al. | May 2013 | A1 |
20130125002 | Spaeth et al. | May 2013 | A1 |
Number | Date | Country |
---|---|---|
1246085 | Oct 2002 | EP |
1369792 | Dec 2003 | EP |
Entry |
---|
Yong Rui, Huang, T.S., Mehrotra, S., “Exploring video structure beyond the shots”, Jun. 28, 1998, “Multimedia Computing and Systems, 1998. Proceedings. IEEE International Conference on”, p. 1-4. |
Graham, Adrian, Garcia-Molina, Hector, Paepcke, Andreas, Winograd, Terry, “Time as essence for photo browsing through personal digital libraries”, 2002, Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, p. 326-335. |
Rodden, Kerry, “How do people manage their digital photographs”, 2003, ACM Press, p. 409-416. |
Cooper et al., “Temporal Event Clustering for Digital Photo Collections”, Aug. 2005, ACM Transactions on Multimedia Computing, Communications and Applications, vol. 1, No. 3, p. 269-288. |
Loui, A. and Savakis, A., 2003, “Automatic event clustering and quality screening of consumer pictures for digital albuming”, IEEE Trans. Multimed. 5, 3, p. 390-402. |
J. Platt, et al, “PhotoTOC: Automatic Clustering for Browsing Personal Photographs”, 2002, http://research.microsoft.com/en-us/um/people/jplatt/PhotoToC-pacrim.pdf, p. 1-5. |
Platt, John C. “AutoAlbum: Clustering Digital Photographs using Probabilistic Model Merging”, Jun. 2000, http://research.microsoft.com/pubs/68971/cbaivl.pdf, p. 1-6. |
Number | Date | Country | |
---|---|---|---|
20130125002 A1 | May 2013 | US |