The present disclosure relates generally to editing photos and more specifically relates to systems and methods for editing digital photos using surrounding context.
Handheld digital cameras have become relatively inexpensive and easy to use and are now widely used as a convenient way to take pictures. Such handheld digital cameras typically store hundreds of pictures or more on a small removable memory device and allow the user to browse the photos that have been taken and delete any that are unsatisfactory.
Digital cameras have even been incorporated into other devices, such as cellular telephones, laptop computers, smartphones, and tablet computers. Despite the prevalence of such devices, photo editing functionality on such devices is typically limited to deleting and retaking photos. For example, if a person takes a photograph, but accidentally covers part of the lens with his thumb, his only option will typically be to retake the picture. This is not ideal as in some cases it may be very difficult or impossible to recreate the scene from the original photo. And while some cameras allow a user to create simulated panorama photographs based on multiple successive pictures of different parts of a larger scene, such technology does not allow a user to edit a photo, but only to stitch multiple photos together to create a larger image.
Embodiments disclosed herein provide systems and methods for editing digital photos using surrounding context. For example, one embodiment disclosed herein is a device comprising a camera, a display, memory for storing images, and a processor. In this illustrative embodiment, the processor is configured to receive a first image; receive a selection of a portion of the first image; receive a second image, the second image comprising a different image than the first image; determine a portion of the second image corresponding to the portion of the first image; and replace the portion of the first image with the portion of the second image.
This illustrative embodiment is mentioned not to limit the disclosure, but rather to provide an example to aid understanding thereof. Illustrative embodiments are discussed in the Detailed Description, which provides further description of various embodiments. Advantages offered by various embodiments may be further understood by examining this specification.
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more examples of embodiments and, together with the description of example embodiments, serve to explain the principles and implementations of the embodiments.
Example embodiments are described herein in the context of systems and methods for editing digital photos using surrounding context. An example system according to the present disclosure is a mobile device capable of taking digital photos. If a photo is taken but is undesirable for some reason, such as showing a thumb partially covering the lens of the camera, a photography application executing on the mobile device allows the user to manually select the undesired portion of the photo. The photography application then allows the user to take a second photo to correct the defect in the first photo. While the user is preparing to take the second photo, the photography application displays a preview of the first photo with the camera viewfinder image overlaid on the first photo to assist the user in aligning the camera to take the second photo. The photography application then blends the two photos to create a new photo with no unwanted artifacts.
Those of ordinary skill in the art will realize that the preceding example and the following description are illustrative only and are not intended to be in any way limiting. Other embodiments will readily suggest themselves to such skilled persons having the benefit of this disclosure. Reference will now be made in detail to implementations of example embodiments as illustrated in the accompanying drawings. The same reference indicators will be used throughout the drawings and the following description to refer to the same or like items.
In the interest of clarity, not all of the routine features of the implementations described herein are shown and described. It will, of course, be appreciated that in the development of any such actual implementation, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another.
To illustrate further, consider again the example mobile device introduced above: after the user takes a first photo marred by a thumb partially covering the lens, the photography application allows the user to select the undesired portion of the first photo and then to capture a second photo of the same scene to correct the defect.
In this case, however, rather than displaying the second image, the camera 100 continues to display the first image, but allows a portion of the second image to be overlaid on (or be seen through) the selected portion of the first image. Thus, the camera 100 displays a composite image based on the first image and the second image, with a part of the second image replacing the selected portion of the first image, e.g. the user's thumb. In this embodiment, the camera performs an analysis on the two images to align the first image and the second image such that the visible portion of the second image fills in the selected portion of the first image. However, the user may also maneuver the second image to obtain the desired alignment between the two images. After aligning the two images, the user can effectively erase the undesired artifact or object and replace it with image data from the second image to provide a new image of the scene that appears to have been taken naturally in a single photograph. The camera 100 then stitches the two images together, such as by using known image stitching techniques, to provide a substantially seamless integration of the two images into a new composite image.
Referring now to
In the embodiment shown, the processor 212 is in communication with memory 214 and is configured to execute processor-executable instructions stored in memory 214. The camera 216 is in communication with memory 214 and is configured to capture images of scenes and to store them in memory 214. Within the context of this disclosure, the term ‘scene’ generally refers to a portion of reality visible from a viewpoint in a particular direction. While scenes include objects that are visible from a large number of different perspectives, for the purposes of this disclosure, when the specification refers to multiple images of the same scene, the images of the scene result from substantially the same viewpoint in substantially the same direction. It is generally extremely difficult to capture two images from precisely the same viewpoint in precisely the same direction; therefore, while images discussed throughout the disclosure may have slightly different viewpoints and directions, they may be considered images of the same scene. Note that while rotating a camera about an axis along the viewing direction will affect the orientation of the resulting image, the image may still be of the same scene, as it may simply be rotated to match images taken from other rotational orientations.
Referring again to
In the embodiment shown in
In some embodiments, the system 200 may include additional components, such as input devices including buttons, directional pads, switches, or touch-sensitive input devices. For example, in one embodiment, the system includes a shutter button to cause the camera to capture an image, and a directional pad configured to allow the user to navigate amongst a plurality of captured images stored in memory 214, to select various options to configure image capture parameters, or to otherwise allow the user to interact with the system 200.
Some embodiments according to the present disclosure may comprise a plurality of cameras. For example, the system 220 shown in
As is understood in the art, digital images may be represented by one or more pixels, and the size of a digital image is typically described by the width and height of the image in pixels (e.g. 1900×1200). Digital images may be stored in a variety of formats, including compressed and uncompressed formats. For example, a digital camera may capture and store an image in an uncompressed format, such as a bitmap. Alternatively, some digital cameras capture and store images in a compressed image format, such as a JPEG format. This disclosure is not intended to be limited to images captured or stored in a particular format or by a particular method. Examples or embodiments described throughout this specification may discuss a particular image format or structure for ease of understanding, not to limit the scope of the disclosure in any way.
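By way of illustration only, and not as part of any claimed embodiment, the following brief sketch shows how a single captured frame might be written in both an uncompressed format and a compressed format using the open-source OpenCV library; the file names are hypothetical.

```python
# Illustrative sketch only: writing one frame as an uncompressed bitmap and
# as a compressed JPEG with OpenCV. File names are hypothetical.
import cv2

frame = cv2.imread("capture.png")  # a previously captured image (hypothetical file)
cv2.imwrite("capture.bmp", frame)  # uncompressed bitmap
cv2.imwrite("capture.jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 90])  # compressed JPEG
```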
Referring now to
In the embodiment shown in
In some embodiments, the processor 212 may receive an image that was not captured by the camera 216. For example, in one embodiment, the system 200 may receive a flash memory card having one or more images stored on it. The processor 212 may then receive one or more addresses associated with one or more of the images stored on the flash memory card. In some embodiments, a user may use an input device to select an image stored in memory 214. In such an embodiment, the processor 212 may receive the image based on the selection. In some embodiments, after receiving the image, the processor 212 may generate a display signal configured to cause the display 218 to display the received image and transmit the display signal to the display. After receiving the first image 400, the method 300 proceeds to block 304.
At block 304, the processor 212 receives a selection of a portion of the first image 400, wherein the selected portion of the first image 400 is less than the whole first image 400. For example, in one embodiment, the first image 400 comprises an image of a scene with a portion having an undesired object: a thumb that covered a part of the lens when the image was captured. Thus, in some embodiments, a user may select a portion of the first image 400 corresponding to the undesired object. For example, in the system 200 shown in
In some embodiments, a user may select a portion of the image by touching the display 218 at a location corresponding to the undesired portion of the image. In one such embodiment, the display 218 may generate and transmit a touch signal to the processor 212 with one or more parameters associated with the location of the contact. The processor 212 may then determine a location within the image corresponding to the location of the contact. The processor 212 may then perform image analysis to identify an object within the image associated with the location of the contact and select the object. For example, in a photo intended to capture an image of two people, a third person may have inadvertently been captured in the background. Or, in another case, an image may include two people at different distances from the camera, resulting in one of the people being out of focus. To select a portion of the image corresponding to the third person, a user may touch the display 218 at a location corresponding to the image of the third person. The processor 212 receives a touch signal from the display indicating the location of the contact and determines a corresponding location on the image. The processor 212 then performs an analysis of the image and identifies the third person as a visual object within the image and selects pixels within the image corresponding to the visual object (i.e. the third person in this example), thereby selecting a portion of the first image.
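One possible, non-limiting implementation of such touch-to-object selection is sketched below using the open-source OpenCV library: the touch location seeds a GrabCut segmentation, which grows the selection to cover the object under the contact point. The function name, seed radius, and iteration count are illustrative assumptions only.

```python
# Illustrative sketch: grow a selection around a touch location with GrabCut.
# The seed radius and iteration count are assumptions, not disclosed values.
import cv2
import numpy as np

def select_object_at(img, touch_x, touch_y, radius=80):
    # Mark everything "probably background", then seed a disc of
    # "probably foreground" around the touch location.
    mask = np.full(img.shape[:2], cv2.GC_PR_BGD, np.uint8)
    cv2.circle(mask, (touch_x, touch_y), radius, cv2.GC_PR_FGD, -1)

    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(img, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)

    # Pixels labeled (probably) foreground form the selected portion.
    selected = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
    return np.where(selected, 255, 0).astype(np.uint8)
```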
In some embodiments, the camera may automatically identify such undesirable artifacts or objects, such as a thumb or finger over a part of a lens or an object that is out of focus. In one such embodiment, the system 200 is configured to automatically detect one or more undesirable artifacts or objects within an image and to select any such detected artifacts or objects. A user may then deselect any detected artifacts or objects or select any portion of the first image according to embodiments described herein.
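As one hypothetical illustration of such automatic detection, the sketch below tiles the image and scores each tile by the variance of its Laplacian, a common sharpness measure, flagging low-variance tiles as candidate out-of-focus regions; the tile size and threshold are assumptions and would require tuning in practice.

```python
# Illustrative sketch: flag candidate out-of-focus regions by tiling the
# image and thresholding each tile's Laplacian variance (a sharpness score).
import cv2
import numpy as np

def find_blurry_tiles(img, tile=64, threshold=50.0):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    mask = np.zeros(gray.shape, np.uint8)
    for y in range(0, gray.shape[0] - tile + 1, tile):
        for x in range(0, gray.shape[1] - tile + 1, tile):
            patch = gray[y:y + tile, x:x + tile]
            if cv2.Laplacian(patch, cv2.CV_64F).var() < threshold:
                mask[y:y + tile, x:x + tile] = 255  # candidate artifact tile
    return mask
```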
As may be seen in
In some embodiments, the processor 212 may receive selections of a plurality of portions of the image. For example, in one embodiment a user may desire to select multiple portions of the first image. In such an embodiment, the user may touch the touch-sensitive display 218 and draw a boundary around the undesired portions of the first image to select each of the multiple portions. Alternatively, the user may touch the touch-sensitive display 218 at a location corresponding to a displayed object to select the object. As was discussed above, in some embodiments, the processor 212 may generate a display signal configured to cause the display to show an indication for some or all of the selected portions of the first image. After the processor 212 receives a selection of a portion of the first image 400, the method 300 proceeds to block 306.
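Before turning to block 306, a minimal sketch of how drawn boundaries might become a selection mask is shown below; it assumes each boundary arrives as a list of (x, y) touch coordinates and uses the open-source OpenCV library, neither of which is required by the disclosure.

```python
# Illustrative sketch: rasterize user-drawn boundaries into a binary
# selection mask, one filled polygon per selected portion.
import cv2
import numpy as np

def mask_from_boundaries(image_shape, boundaries):
    mask = np.zeros(image_shape[:2], np.uint8)
    for points in boundaries:  # each boundary is a list of (x, y) points
        polygon = np.array(points, np.int32).reshape(-1, 1, 2)
        cv2.fillPoly(mask, [polygon], 255)
    return mask
```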
In block 306, the processor 212 receives a second image. In this embodiment, the second image 420 comprises an image similar to the first image 400 as the second image 420 is intended to provide image data to replace the selected portion 410 of the first image 400; however, in some embodiments, the second image may comprise an image of substantially the same scene, but under different conditions, such as different lighting, different time of day, different weather, different persons within the scene, and so forth, to provide desired image data. For example, a first image may include two persons within a scene at different distances from the camera, such that one person is shown out of focus in the image. A second image may be captured in which the person who was out of focus in the first image is captured in focus in the second image, thus allowing a user to replace the out-of-focus portion of the first image with the in-focus portion of the second image.
In some embodiments, the second image 420 may be captured and received subsequent to capturing and receiving the first image 400. For example,
In some embodiments, a system 220 may comprise a plurality of cameras and may be configured to capture multiple images of similar subject matter substantially simultaneously. For example, system 220 comprises two cameras, which may be activated substantially simultaneously to capture first and second images. Such an embodiment may advantageously provide two images having substantially identical views of the same scene and will be described in more detail below with respect to
In embodiments having only a single forward-facing camera, it may be difficult for a user to locate a suitable viewpoint from which to capture the second image. Thus, in one embodiment in which a system 200 comprises a single camera facing in a single direction, the system 200 provides cues to a user attempting to capture a second image. For example, the processor 212 may generate a display signal to cause the display 218 to display a partially-transparent copy of the first image 400 overlaid on the display as the user aligns the system 200 to capture the second image. Such an embodiment may be advantageous in that as the user prepares to capture the second image 420, he may better align or focus the camera to replicate the image of the scene from the first image 400.
In a similar embodiment, the processor 212 may generate a display signal to cause the display 218 to display a partially-transparent copy of the selected portion of the first image 400 overlaid on the display 218, rather than the full first image 400, as the user aligns the system 200 to capture the second image. Such an embodiment may be seen in
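A non-limiting sketch of this alignment aid follows: the selected portion of the first image is alpha-blended over each live viewfinder frame. It assumes the frame and the first image share the same resolution, and the 0.5 opacity is an illustrative choice.

```python
# Illustrative sketch: ghost the selected portion of the first image over a
# live viewfinder frame to help the user line up the second shot.
import cv2
import numpy as np

def overlay_selection(viewfinder_frame, first_img, selection_mask, alpha=0.5):
    # Blend the two full frames, then keep the ghosted pixels only where
    # the selection mask is set; elsewhere show the live frame unchanged.
    blended = cv2.addWeighted(first_img, alpha, viewfinder_frame, 1.0 - alpha, 0)
    return np.where(selection_mask[..., None] > 0, blended, viewfinder_frame)
```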
In one embodiment, the processor 212 may further shift the preview of the second image to be captured to ensure that the user captures sufficient information to fully replace the selected portion of the first image. A user may capture the first image and then capture a second image such that the alignment between the first image and the second image results in the second image only having sufficient image information to partially replace the selected portion. For example, the user may align the camera too far to the left or the right. Thus, the processor 212 may analyze the location of the selected portion of the first image and translate the preview of the second image from the camera's true perspective to cause the user to orient the camera to capture sufficient information.
For example, as shown in
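Purely as a speculative sketch of one way such a translation might be computed, the code below shifts the live preview opposite the selected region's offset from the image center, nudging the user to frame extra scene content on that side; the direction and magnitude of the shift are assumptions, not disclosed behavior.

```python
# Speculative sketch: translate the preview away from the selected region's
# offset so the user frames enough of the scene to cover the selection.
import cv2
import numpy as np

def shifted_preview(frame, selection_mask):
    h, w = frame.shape[:2]
    x, y, bw, bh = cv2.boundingRect(selection_mask)
    dx = (x + bw // 2) - w // 2   # selection's horizontal offset from center
    dy = (y + bh // 2) - h // 2   # selection's vertical offset from center
    M = np.float32([[1, 0, -dx], [0, 1, -dy]])  # translate opposite the offset
    return cv2.warpAffine(frame, M, (w, h))
```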
As was discussed with respect to block 302, in this embodiment, a camera 216 captures a second image 420 and stores it to memory 214. The processor 212 subsequently receives the second image 420 from memory 214, such as by receiving a memory address of the second image 420. However, in other embodiments, the processor 212 may receive an image that was not captured by the camera 216. For example, in one embodiment, the system 200 may receive a flash memory card having one or more images stored on it. The processor 212 may then receive one or more addresses associated with one or more of the images stored on the flash memory card. In some embodiments, a user may use an input device to select an image stored in memory 214. In such an embodiment, the processor 212 may receive the second image 420 based on the selection. After receiving the second image 420, the method 300 proceeds to block 308.
At block 308, the processor 212 determines a portion of the second image 420 corresponding to the selected portion of the first image 400. As may be seen in
As can be seen in
For example, in this embodiment, the processor 212 determines a portion of the second image 420 corresponding to the selected portion 410 of the first image 400 based on an input received from an input device: the user can touch the display 218 and drag or rotate the second image to obtain a desired alignment between the first image 400 and the second image 420. As may be seen in the composite image 430 in
In some embodiments, the processor 212 determines a portion of the second image 420 corresponding to the selected portion of the first image 400 automatically, without input from a user, after receiving the second image 420. For example, in one embodiment, the processor 212 executes an image alignment and stitching algorithm configured to identify relationships and overlap between the first and second images 400, 420, to align the first and second images 400, 420 based on the identified relationships and overlap, and to blend the first and second images 400, 420 such that the selected portion of the first image 400 is replaced by a corresponding portion of the second image 420. After the portion of the second image 420 has been determined, the method proceeds to block 310.
At block 310, the processor 212 replaces the selected portion 410 of the first image 400 with the determined portion of the second image 420. In the embodiment shown in
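Assuming the second image has already been aligned to the first, the replacement of block 310 can be as simple as the following sketch, which copies pixels from the aligned second image wherever the selection mask is set; the function and argument names are illustrative.

```python
# Illustrative sketch: replace the selected portion of the first image with
# the corresponding pixels of the (already aligned) second image.
import numpy as np

def replace_portion(first_img, aligned_second, selection_mask):
    mask3 = selection_mask[..., None] > 0  # broadcast the mask across channels
    return np.where(mask3, aligned_second, first_img)
```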
After the selected portion 410 of the first image 400 has been replaced by the determined portion of the second image 420, the method may finish, or it may return to an earlier step, such as block 304 or 306 to identify further portions of the image 400, 440 to be replaced or to capture additional images to be used to replace portions of the first image 400. Steps of the method 300 may be repeated iteratively until a desired image is obtained.
Referring now to
In the embodiment shown in
In this embodiment, the two cameras 236a,b each capture an image, such as the images 500, 520 shown in
As discussed previously, in some embodiments, the processor 232 may receive images that were not captured by the cameras 236a,b. For example, in one embodiment, the system 220 may receive a flash memory card having a plurality of images stored on it. The processor 232 may then receive addresses corresponding to the images stored on the flash memory card.
In this embodiment, because multiple images are received, a user may select one of the images as a first image to be modified, such as by configuring the portable image capture device 220 such that images captured by camera 236a are first images, while images captured by camera 236b are second images. In some embodiments, the user may be able to manually select one of the images to be the first image, or the portable image capture device may determine one of the images as the first image. In this embodiment, image 500 will be referred to as the first image, though no limitation should be inferred as to which image comprises the first image in other embodiments. After receiving the images 500, 520, the method 320 proceeds to block 324.
At block 324, the processor 232 receives a selection of a portion of a first image 500, wherein the selected portion of the first image 500 is less than the whole first image 500. In this embodiment, the first image 500 comprises multiple people within the image, some of whom are out of focus. However, because, in this embodiment, the second image 520 was captured using a different focal depth, the people shown out of focus in the first image are in proper focus in the second image. Thus, in this embodiment, the portable image capture device receives a selection 530 of the people that are shown as out of focus in the first image 500. After receiving the selection, the method 320 proceeds to block 326.
At block 326, the processor 232 determines a portion of the second image 520 corresponding to the selected portion of the first image 500. As may be seen in
As can be seen in
In this embodiment, the processor 232 determines a portion of the second image 520 corresponding to the selected portion 510 of the first image 500 based on an input received from an input device. For example, as described previously, the user can touch the display 238 and drag, rotate, zoom, or otherwise adjust the second image to obtain a desired alignment between the first image 500 and the second image 520.
In some embodiments, the processor 232 determines a portion of the second image 520 corresponding to the selected portion of the first image 500 without input from a user. For example, in one embodiment, the processor 232 executes an image alignment and stitching algorithm, such as described below, configured to identify relationships and overlap between the first and second images 500, 520 and to align the first and second images 500, 520 based on the identified relationships and overlap. After the portion of the second image 520 has been determined, the method proceeds to block 328.
At block 328, the processor 232 replaces the selected portion 510 of the first image 500 with the determined portion of the second image 520. In the embodiment shown in
After the selected portion 510 of the first image 500 has been replaced by the determined portion of the second image 520, the method may finish, or it may return to an earlier step, such as block 322 or 324 to identify further portions of the image 500, 540 to be replaced or to capture additional images to be used to replace portions of the first image 500. Steps of the method 320 may be repeated iteratively until a desired image is obtained.
In one embodiment according to the methods 300, 320 shown in
In one such embodiment, the processor 212 identifies one or more interest points in the first image 400 and the second image 420. An interest point may comprise a particular feature within each image, such as a window, an eye, a corner of an object, or SIFT (scale-invariant feature transform), SURF (speeded-up robust features), ORB (oriented FAST and rotated BRIEF (binary robust independent elementary features)), or FAST (features from accelerated segment test) image features (as are understood in the art), etc. The processor 212 then attempts to match interest points that are common to both images 400, 420, which may be used to align the two images 400, 420, such as by using a brute-force matching analysis or another algorithm, such as FLANN (fast library for approximate nearest neighbors) matching. For example, the processor 212 may compute a dot product of the descriptors of a pair of interest points and match the interest points in the respective images if the angle between their descriptors is less than a threshold.
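For illustration only, the sketch below shows one way such interest-point detection and brute-force matching might be implemented with the open-source OpenCV library; the ORB detector, Hamming-distance matcher, and feature counts are assumptions standing in for the alternatives listed above.

```python
# Illustrative sketch: detect ORB interest points in both images and match
# them with a brute-force matcher using Hamming distance (suited to ORB's
# binary descriptors); crossCheck keeps only mutually-best pairs.
import cv2

def match_interest_points(first_img, second_img, max_matches=200):
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(first_img, None)
    kp2, des2 = orb.detectAndCompute(second_img, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return kp1, kp2, matches[:max_matches]  # keep the strongest matches
```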
After matching the interest points, the processor 212 estimates a homography matrix, such as by employing the matched interest points as well as random sample consensus (RANSAC) and a normalized direct linear transform. The processor 212 then transforms the second image 420 according to the homography matrix to create a transformed second image.
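Continuing the same illustrative sketch, the matched interest points can feed OpenCV's RANSAC-based homography estimator, after which the second image is warped into the first image's coordinate frame; the reprojection threshold of 5.0 pixels is an assumed value.

```python
# Illustrative sketch: estimate a homography from matched interest points
# with RANSAC and warp the second image to align with the first.
import cv2
import numpy as np

def align_second_image(kp1, kp2, matches, second_img, first_shape):
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # RANSAC rejects outlier matches while fitting the homography matrix.
    H, inliers = cv2.findHomography(pts2, pts1, cv2.RANSAC, 5.0)

    h, w = first_shape[:2]
    return cv2.warpPerspective(second_img, H, (w, h))
```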
After creating the transformed second image, the selected portion of the first image is eroded, such as using a 5×5 square kernel to obtain an eroded selected portion, which the processor 212 then deletes from the first image 400. The (uneroded) selected portion is also dilated, such as using a 5×5 square kernel, to create a dilated selected portion. The processor 212 uses the dilated selected portion to determine and extract a portion of the second image 420. After the determined portion has been extracted from the second image 420, the processor 212 applies gain compensation to minimize the intensity difference of corresponding pixels from the first image 400 and the second image 420. Finally, at block 310, the processor 212 replaces the selected portion of the first image with the extracted portion of the second image to create the composite image 440. The processor 212 may further employ other manipulations when replacing the selected portion to provide a better quality composite image, such as multiband blending to avoid blur or cloning techniques like mean-value cloning or Poisson cloning.
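The sketch below illustrates these final steps under the same assumptions, substituting OpenCV's Poisson ("seamless") cloning for the blending alternatives named above; the gain-compensation scheme shown (matching mean intensities outside the selection) is a simplification of the techniques the disclosure contemplates.

```python
# Illustrative sketch: erode/dilate the selection mask, apply a crude gain
# compensation, and Poisson-clone the second image's patch into the first.
import cv2
import numpy as np

def replace_selected_region(first_img, warped_second, mask):
    kernel = np.ones((5, 5), np.uint8)          # 5x5 square kernel
    eroded = cv2.erode(mask, kernel)            # region deleted from the first image
    dilated = cv2.dilate(mask, kernel)          # slightly larger donor region

    first_cleared = first_img.copy()
    first_cleared[eroded > 0] = 0               # delete the eroded selection

    # Crude gain compensation: match mean intensity outside the selection.
    outside = dilated == 0
    gain = first_img[outside].mean() / max(warped_second[outside].mean(), 1e-6)
    compensated = np.clip(warped_second * gain, 0, 255).astype(np.uint8)

    # Poisson cloning blends the dilated patch seamlessly into the first image.
    x, y, w, h = cv2.boundingRect(dilated)
    center = (x + w // 2, y + h // 2)
    return cv2.seamlessClone(compensated, first_cleared, dilated, center,
                             cv2.NORMAL_CLONE)
```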
Embodiments according to the present disclosure may have significant advantages over existing digital cameras and the editing options on such cameras. For example, while a user may download images from a camera to a computer and perform image editing using software products such as Adobe® Photoshop®, in many cases the user may not have immediate access to such a facility. Further, a user may not wish to take and save multiple images of a scene for later editing. It may be more efficient and more desirable for a user to simply execute a method according to this disclosure immediately after taking a picture with an undesired feature, whether to save memory on a flash card or to quickly share the photo with friends and family. Further, photo editing products for conventional computers are typically expensive and may be difficult for an amateur to use. A digital camera or cellular phone with such integrated photo editing functionality provides significant benefits to a user and allows immediate correction of otherwise desirable digital photographs. Thus, a user may quickly and easily obtain a desired image without intensive manual editing or the use of a separate computing device.
While the methods and systems herein are described in terms of software executing on various machines, the methods and systems may also be implemented as specifically-configured hardware, such as a field-programmable gate array (FPGA) configured specifically to execute the various methods. For example, referring again to
Such processors may comprise, or may be in communication with, media, for example computer-readable media, that may store instructions that, when executed by the processor, can cause the processor to perform the steps described herein as carried out, or assisted, by a processor. Embodiments of computer-readable media may comprise, but are not limited to, an electronic, optical, magnetic, or other storage device capable of providing a processor, such as the processor in a web server, with computer-readable instructions. Other examples of media comprise, but are not limited to, a floppy disk, CD-ROM, magnetic disk, memory chip, ROM, RAM, ASIC, configured processor, all optical media, all magnetic tape or other magnetic media, or any other medium from which a computer processor can read. The processor, and the processing, described may be in one or more structures, and may be dispersed through one or more structures. The processor may comprise code for carrying out one or more of the methods (or parts of methods) described herein.
The foregoing description of some embodiments has been presented only for the purpose of illustration and description and is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Numerous modifications and adaptations thereof will be apparent to those skilled in the art without departing from the spirit and scope of the disclosure.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, operation, or other characteristic described in connection with the embodiment may be included in at least one implementation. The disclosure is not restricted to the particular embodiments described as such. The appearance of the phrase “in one embodiment” or “in an embodiment” in various places in the specification does not necessarily refer to the same embodiment. Any particular feature, structure, operation, or other characteristic described in this specification in relation to “one embodiment” may be combined with other features, structures, operations, or other characteristics described in respect of any other embodiment.