Exemplary aspects of embodiments of the present invention are related to the technical field of digital photography, especially the field of enhancing the quality of digital photographs in an interactive way. Advantages of the invention may become particularly prominent in assembling a composite image or panoramic image from two or more component images.
Digital photography in general refers to the technology of using an electronic image capturing device for converting a scene or a view of a target into an electronic representation of an image. Said electronic representation typically consists of a collection of pixel values stored in digital form on a storage medium, either as such or in some compressed form. At the time of writing this description a typical electronic image capturing device comprises an optical system designed to direct rays of electromagnetic radiation in or near the range of visible light onto a two-dimensional array of radiation-sensitive elements, as well as reading and storage electronics configured to read radiation-induced charge values from said elements and to store them in memory.
Panoramic image capturing refers to a practice in which two or more images are captured separately and combined so that the resulting panoramic image comprises pixel value information that originates from at least two separate exposures.
A human observer will conceive a displayed image as being of higher quality the fewer artifacts it contains that deviate from what the human observer would consider a natural representation of the whole scene covered by the image.
The following terminology is used in this text.
Scene is an assembly of one or more physical objects, of which a user may want to produce one or more images.
Image is a two-dimensional distribution of electromagnetic radiation intensity at various wavelengths, typically representing a delimited view of a scene.
Electronic representation of an image is an essentially complete collection of electrically measurable and storable values that corresponds to and represents the two-dimensional distribution of intensity values at various wavelengths that constitutes an image.
Pixel value is an individual electrically measurable value that corresponds to and represents an intensity value of at least one wavelength at a particular point of an image.
Image data is any data that constitutes or supports an electronic representation of an image, or a part of it. Image data typically comprises pixel values, but it may also comprise metadata, which does not belong to the electronic representation of an image but complements it with additional information.
Artifact is a piece of image data that, when displayed as a part of an image, makes a human observer conceive the image as being of low quality. An artifact typically makes a part of the displayed image deviate from what the human observer would consider a natural representation of the corresponding scene.
Characterisation of an artifact is data in electronic form that contains information related to a particular artifact.
Representation of an artifact is user-conceivable information that is displayed or otherwise brought to the attention of a human user in order to tell the user about the artifact.
Exemplary embodiments of the invention, which may have the character of a method, device, component, module, system, service, arrangement, computer program, and/or computer program product, may provide an advantageous way of producing a panoramic image that a human observer could conceive as being of high quality. Advantages of such exemplary embodiments of the invention may involve ease of use, reduced need of storage capacity, a user's experience of good quality, and many others.
According to an embodiment of the invention there is provided an apparatus, comprising:
an artifact locating subsystem configured to locate an artifact in an electronic representation of an image,
an artifact evaluating subsystem configured to store a characterisation of a located artifact, and
an artifact data handling subsystem configured to output at least one of a characterisation of an artifact or a representation of a stored characterisation of an artifact.
According to another embodiment of the invention there is provided an apparatus, comprising:
an image data handling subsystem configured to store electronic representations of images,
an artifact data handling subsystem configured to handle characterisations of artifacts located in an image,
a displaying subsystem configured to display an image and representations of artifacts located in said image, and
a user input subsystem configured to receive user inputs concerning corrective action to be taken to correct artifacts, representations of which were displayed in said displaying subsystem.
According to another embodiment of the invention there is provided a method, comprising:
locating an artifact in an electronic representation of an image,
storing a characterisation of the located artifact, and
outputting at least one of a characterisation of the artifact or a representation of the artifact.
According to another embodiment of the invention there is provided a method, comprising:
storing an electronic representation of an image,
displaying the image and representations of artifacts located in said image, and
receiving user inputs concerning corrective action to be taken to correct artifacts, representations of which were displayed.
According to another embodiment of the invention there is provided a computer-readable storage medium having computer-executable components that, when executed on a processor, are configured to implement a process comprising:
locating an artifact in an electronic representation of an image,
storing a characterisation of the located artifact, and
outputting at least one of a characterisation of the artifact or a representation of the artifact.
According to another embodiment of the invention there is provided a computer-readable storage medium, having computer-executable components that, when executed on a processor, are configured to implement a process comprising:
storing an electronic representation of an image,
displaying the image and representations of artifacts located in said image, and
receiving user inputs concerning corrective action to be taken to correct artifacts, representations of which were displayed.
A number of advantageous embodiments of the invention are further described in the depending claims.
a illustrates a part of a user interface for image handling.
b illustrates a part of a user interface for image handling.
c illustrates a transition between states in a method and a computer program product.
Examples of troublesome effects concerning the production of a panoramic image are such features in the component images that tend to make the borders of the component images pronouncedly visible in the panoramic image. For example, a significant difference between component images in the level of exposure of a field that should continue smoothly from one component image to another tends to cause an odd-looking colour change in the panoramic image. Optical aberration in the imaging optics may cause graphical distortion that increases towards the edges of each component image; if neighbouring component images do not overlap enough, it may prove to be difficult to find the correct way of aligning and stitching them together in the production of the panoramic image.
Artifacts that could appear in even a single image include, but are not limited to, those of the above that are not associated with combining image data from different images.
Artifacts in an image, which cause a human observer to conceive it as being of low quality, may be such that the photographer may not notice them while he is still at the scene, although there are also artifacts that are easy to notice. Considering one of the artifacts illustrated in
Similar considerations apply to the other artifacts. Although some of the artifacts may be correctable with later processing of the image data, some are such that a better starting point would be achieved, especially for producing a panoramic image which a human observer would conceive as being of high quality, by taking one or more additional component images. Noticing the artifacts immediately, and/or using human judgement about whether or not an artifact is susceptible to correction by post-processing, may be difficult if the user has only a limited-size display available in the equipment that he carries around for taking images.
According to block 301, the method comprises acquiring image data. It may also comprise producing a panoramic image, or a combined image that includes image data from two or more component images. An example of the latter is a process of acquiring a first image and acquiring at least a second image and possibly a number of subsequent images, so that at least some of the acquired images have some overlapping areas that allow a stitching algorithm to recognize an appropriate way of stitching the images into a combined image. If the method is executed in an electronic image capturing device, acquiring an image typically means reading into run-time memory the digitally stored form of an image that the user of the device has taken. If the method is executed in a processing apparatus external to any electronic image capturing device, acquiring an image typically means receiving into run-time memory the digitally stored form of an image over a communications connection, or reading into run-time memory the digitally stored form of an image from a storage memory that can be internal, external and/or removable.
For producing a panoramic image it is possible to apply a stitching algorithm to stitch acquired images into a larger, combined image. It should be noted that combining a number of component images is not limited to producing an image that covers a wider view than any of the component images alone. Combining images may also involve utilizing the redundant image data of the overlapping areas to selectively enhance resolution or other features of the resulting combined image.
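As a non-limiting sketch of the stitching idea above, the redundant pixel values of a known overlap region between two registered component images could be feathered together with linear weights. The overlap width and the linear weighting scheme are illustrative assumptions, not a definitive implementation of any particular stitching algorithm:

```python
import numpy as np

def blend_overlap(left: np.ndarray, right: np.ndarray, overlap: int) -> np.ndarray:
    """Stitch two registered grayscale images side by side, feathering
    the redundant pixel values in the known overlap region."""
    assert right.shape[0] == left.shape[0] and overlap > 0
    # Linear weights: the left image fades out, the right image fades in.
    w = np.linspace(1.0, 0.0, overlap)
    blended = left[:, -overlap:] * w + right[:, :overlap] * (1.0 - w)
    return np.hstack([left[:, :-overlap], blended, right[:, overlap:]])
```

In practice a stitching algorithm would first determine the overlap by registration; here it is simply given as a parameter.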
We assume that, irrespective of whether a panoramic image was produced in block 301, the image contains some artifacts. According to block 302, artifacts are located and indicated to a user. Locating an artifact means identifying a number of pixels in the digitally stored form of an image that according to an evaluation criterion deviate from optimal image content. Examples of evaluation criteria include, but are not limited to, the following:
Of course these are just very simple examples, listed here mainly for illustration purposes. In practice, more complex and advanced mechanisms or methods are likely to be used. For example, a filter can be designed to address each specific artifact type (such as motion blur, defocus, under- or overexposure, etc.). The filter operates on the image pixel values and gives positive feedback for each pixel or area that contains the corresponding artifact. The filter can additionally tell the likelihood that an artifact occurs and how severe it is.
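As a non-limiting illustration of such a per-type filter, a defocus-type artifact could be flagged wherever the local contrast of a tile of pixels falls below a threshold, with a severity value growing as the contrast falls. The tile size and threshold are illustrative assumptions:

```python
import numpy as np

def defocus_filter(image: np.ndarray, tile: int = 8, threshold: float = 5.0):
    """Give positive feedback for each tile whose local contrast is so low
    that it likely contains a defocus artifact.  Returns (row, col, severity)
    tuples; severity grows toward 1.0 as the contrast falls toward zero."""
    hits = []
    h, w = image.shape
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            contrast = float(image[r:r + tile, c:c + tile].std())
            if contrast < threshold:
                hits.append((r, c, 1.0 - contrast / threshold))
    return hits
```

Analogous filters could be designed for motion blur or exposure artifacts by substituting a different per-tile measure.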
In addition or as an alternative to making the apparatus locate artifacts automatically, it is possible to receive inputs from a user explicitly marking a part of a displayed image as containing an artifact.
If the image is a panoramic image or other kind of combined image, a so-called registration between two component images has been performed, for example by calculating a homography transformation. Evaluation methods can be applied to find out how good the transformation is. It is possible to compare pixel values, gradient values, image descriptors, SIFT (Scale Invariant Feature Transform) features, or the like. If the registered images do not agree well within a given tolerance, this can be determined to be an artifact.
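Taking the simplest of the comparison options listed above, pixel-value agreement, a registration check could be sketched as follows. The tolerance value is an illustrative assumption; gradient- or descriptor-based comparisons would follow the same pattern with a different measure:

```python
import numpy as np

def registration_artifact(a: np.ndarray, b: np.ndarray, tol: float = 10.0) -> bool:
    """Compare the pixel values of the overlapping region of two
    registered component images; if they disagree beyond the given
    tolerance, report an artifact."""
    assert a.shape == b.shape
    # Mean absolute pixel difference over the registered overlap.
    return float(np.abs(a.astype(float) - b.astype(float)).mean()) > tol
```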
When an artifact has been located, it is advantageous to store a characterisation of the artifact. An example of a characterisation includes data about the location of the artifact in the image (which pixels are affected), the type of the artifact (which evaluation criterion caused the artifact to be located), and the severity of the artifact. The severity of the artifact can be analyzed and represented in various forms, like the size of the affected area in the image, the margin by which or the extent to which the evaluation criterion was fulfilled, the likelihood that the artifact will appear in the image, and others.
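The stored characterisation described above could be sketched as a simple record type. The field names and the use of a single numeric severity value are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class ArtifactCharacterisation:
    """Stored characterisation of one located artifact."""
    pixels: list          # affected pixel coordinates, e.g. (row, col) tuples
    artifact_type: str    # which evaluation criterion caused it to be located
    severity: float       # e.g. margin by which the criterion was fulfilled

    @property
    def affected_area(self) -> int:
        # One possible severity-related measure: size of the affected area.
        return len(self.pixels)
```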
Further according to block 302, a representation of at least some of the located artifacts is brought to the attention of a user. We assume that a user interface exists, through which the user receives indications of what the image looks like and/or how the process of producing the panoramic image is proceeding. Most advantageously the user interface comprises a display configured to give alphanumeric and/or graphical indications to the user. Various advantageous ways of indicating located artifacts to a user are considered later.
In addition to displaying representations of the located artifacts to a user, the user interface is configured to receive inputs from the user, indicating what the user wants to do with the located and indicated artifacts. According to block 303, corrective measures are applied according to the inputs received from the user. In an exemplary case, at least one located and indicated artifact is of such a nature that it is susceptible to correction by processing the image data. In that case the indication to the user may include a prompt for the user to select whether corrective processing should be applied. If the user gives a positive input, corrective processing (such as recalculating some of the pixel values with some kind of a filtering algorithm) is applied. In another exemplary case, an artifact contained in at least one image is of such a nature that it would be difficult to correct by just processing existing image data. In that case the indication to the user may include a prompt for the user to shoot at least a significant part of that component image again. If the user takes another component image, that image is taken as additional image data to the production of the panoramic image.
The back-and-forth arrows between blocks 301, 302, and 303 illustrate the fact that the invention does not require (but does not preclude either) executing corresponding method steps in any strictly defined temporal order. Locating artifacts according to block 302 may begin as soon as there is at least one image available, and may continue in parallel with the acquisition of further images according to block 301. Above we already indicated that one way of applying corrective measures according to block 303 can be prompting for and proceeding to acquiring more component images according to block 301. Some artifacts may have been corrected already according to block 303 while locating other artifacts is still running according to block 302. A large number of other examples can be presented, illustrating the not-any-particular-order character of the method.
Step 404 illustrates examining the (panoramic or single) image for artifacts. If the evaluation-criteria-based approach explained above is used, step 404 may involve going through a large number of stored pixel values that represent the image, and examining said stored pixel values part by part in order to notice whether some part(s) of the image fulfil one or more of the criteria. If artifacts are found according to step 405, their characterisations are stored according to step 406. A return from the check of step 407 back to analyzing the image occurs until the whole image has been thoroughly analyzed.
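The loop through steps 404 to 407 could be sketched as follows, with the partitioning of the image and the criterion functions left abstract. The dictionary-based criteria and the stored record format are illustrative assumptions:

```python
def analyze_image(parts, criteria):
    """Steps 404-407 as a loop: examine the stored pixel values part by
    part, store a characterisation for each part that fulfils one or more
    criteria, and return once the whole image has been analyzed."""
    found = []
    for index, part in enumerate(parts):
        for name, criterion in criteria.items():
            if criterion(part):
                # Step 406: store a characterisation of the located artifact.
                found.append({"part": index, "type": name})
    return found
```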
Step 408 illustrates displaying a representation of the found artifacts to the user, preferably together with some prompt(s) or action alternative(s) for the user to give commands about what corrective measures should be taken. If user input is detected at step 409, respective corrective measures are taken according to step 410 and the method returns to displaying the representations of remaining artifacts according to step 408. When no user input is detected at step 409 (or some other user input is detected than such that would have caused a transition to step 410), the method ends.
Assuming that the method and computer program product are executed in an electronic image acquisition device, there is a shutter switch or some other control, the activation of which causes the device to enter an image acquisition state 502, where a new image is acquired. We assume that the current operating mode involves automatic adding of new images to the currently displayed panoramic image, so from said image acquisition state 502 an immediate transition occurs to a stitching state 503, in which the newly acquired image is stitched to the panoramic image that is currently displayed. After that the execution returns to the panoramic display state 501.
If the method and computer program product are executed in an apparatus that is not an electronic image acquisition device, it may happen that there is no shutter switch and no direct means of creating new images by the apparatus itself. In that case there may be a new image acquisition process that otherwise resembles that illustrated as the loop through states 502 and 503 in
We assume that the method and computer program product are executed in an apparatus that comprises a processor. In the embodiment of
A natural alternative to making the processor look for artifacts as a background process is to implement the looking for artifacts as a dedicated process, which is commenced as response to a particular input received from the user and ended either when all applicable parts of the image have been searched through or when an ending command is received.
A specific case of locating an artifact in state 504 is the case of receiving an input from the user, indicating an explicit marking of some part of the image as containing an artifact. In terms of
The panoramic display state 501 of
More than one artifact representation can be selected and highlighted simultaneously. In
After a loop through state 507 the representation of at least one artifact is highlighted in the user interface. The term “highlighted” may mean that in addition to providing the user with visual feedback about the selection of the artifact itself, the apparatus may be configured to offer the user some suggested possibilities of corrective action. Examples include, but are not limited to, displaying action alternatives associated with softkeys or actuatable icons, like “corrective processing”, “take new image”, and the like. If at such a moment the apparatus detects an input from the user that means the selection of corrective processing, the execution enters state 509 in which corrective processing is performed, followed by a return to state 501. As an alternative, if at said moment the apparatus detects a new press of the shutter switch or other signal of acquiring a new image, a new loop through the image acquisition and stitching states 502 and 503 occurs.
Re-entering state 501 after e.g. state 509 or 503 may mean that—while processor time is available—the apparatus is configured to run a check at state 504 to see whether the corrective action was sufficient to remove at least one artifact. If that is the case, returning from state 504 through states 505 and 506 to state 501 may mean that the user does not observe any representation for the corrected artifact any more. If some other artifacts remain, the user may direct the apparatus to select each of them in turn and apply the selected corrective action through repeated actions like those described above. If the user decides to accept a panoramic image displayed in state 501, he may issue a mode de-selection command to exit panoramic imaging mode, or begin acquiring component images for a completely new panoramic image. Depending on how the user interface has been implemented, the latter alternative may involve receiving, at the image acquisition apparatus, an explicit command from the user, or e.g. just acquiring a new component image that does not overlap with any of the component images that constituted the previous panoramic image.
There are various ways of utilizing new image data that has been acquired as a response to receiving from a user a corresponding command. Examples of such ways include, but are not limited to, blending the new image data to the image data of the processed images, creating a tunnel of data to the processed image (i.e. enabling a ‘close-up’ inside an image, that is, with higher resolution than the rest of the image), and simply detaching or attaching data to the processed image.
The user interface 601 comprises also action alternative indicators, or means for indicating action alternatives, 604. These may be audible, visual, tactile, or other kinds of outputs to the user for making the user conscious about what action alternatives are available for responding to the occurrence and indication of known artifact(s) in the displayed image. The user interface 601 comprises also general control indicators, or means for indicating general control alternatives, 605. These may be audible, visual, tactile, or other kinds of outputs to the user for making the user conscious about what general control functionalities, like exiting a current state or moving a selection, are available.
Additionally the user interface 601 comprises input mechanisms, or user input means, 606. These may include, and be any combination of, key(s), touchscreen(s), mouse, joystick(s), navigation key(s), roller ball(s), voice control, or other types of input mechanisms.
a illustrates a part of a user interface according to an embodiment of the invention. The user interface comprises a display 701, which comprises an image display area 702, an information display area 703, and indicators of input alternatives 704, 705, 706, 707, and 708. It is not necessary to divide the area of the display into separate areas for displaying e.g. the image and information; it is likewise possible to use overlaid displaying practices so that e.g. informative graphical elements are displayed on top of a displayed image. Even the elements of a displayed image itself may be given informative functions, for example by making some feature(s) of the image appear in a distinct, artificial colour and/or by making some feature(s) of the image blink or exhibit other kinds of dynamic behaviour.
According to an exemplary embodiment of the invention, the display 701 may be a touch-sensitive display, and the indicators of input alternatives 704, 705, 706, 707, and 708 may be touch keys implemented as predefined areas of the touch-sensitive display. According to another exemplary embodiment, the indicators of input alternatives 704, 705, 706, 707, and 708 may be visual indicators associated with softkeys (not shown), so that the user is given guidance concerning how the apparatus will respond to pressing a particular softkey. According to yet another exemplary embodiment the apparatus may comprise a mouse, a joystick, a navigation key, a roller ball, or some corresponding control device (not shown) with immediate graphical feedback on display, so that the indicators of input alternatives 704, 705, 706, 707, and 708 could be clickable icons. Alternative embodiments of indicators are mutually combinable, so that different techniques can be used for different indicators.
The indicators of input alternatives are in this exemplary embodiment the following:
According to one alternative, the indicator is only displayed on the display to remind the user that one possible way of correcting a particular artifact is to take a new image, but in order to actually take a new image the user must press a separate shutter switch.
Comparing the illustrated state of the user interface of
In this exemplary embodiment we assume that the apparatus is configured to evaluate the severity of each found artifact on a three-tier scale, to store the result of the evaluation as a part of the characterisation of the artifact, and to indicate the stored result of the evaluation with one, two, or three exclamation marks in the representation of the artifact. Additionally we assume that the apparatus is configured to automatically organise the displayed list of artifact representations so that the representations of artifacts for which severity was evaluated to be high are displayed first in said list.
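The three-tier evaluation and ordered list described above could be sketched as follows. The tier boundaries on the severity scale are illustrative assumptions; the embodiment only requires some mapping onto three tiers and a severity-first ordering:

```python
def artifact_representations(characterisations):
    """Render each artifact as its type name plus one, two, or three
    exclamation marks on a three-tier severity scale, with the
    high-severity representations displayed first in the list."""
    def tier(severity):
        # Hypothetical tier boundaries on a 0..1 severity scale.
        return 3 if severity > 0.66 else (2 if severity > 0.33 else 1)
    ordered = sorted(characterisations, key=lambda c: c["severity"], reverse=True)
    return [c["type"] + " " + "!" * tier(c["severity"]) for c in ordered]
```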
It is also useful to zoom into the artifact area so the user can better evaluate visually whether the artifact is something that should be addressed, or whether the user is happy with the current result and the artifact detection was in fact a false alarm. The zooming level should be sufficient for the user to clearly see the problem; in most cases the system should be able to calculate the correct level automatically. If the system keeps track of which artifacts at which severity level the user finds objectionable, it can train its threshold levels so that in the future there will be fewer false alarms.
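One possible way of calculating such a zooming level automatically is to scale the artifact's bounding box to a fixed fraction of the display. The fill fraction and the use of the smaller display dimension are illustrative assumptions:

```python
def zoom_level(artifact_box, display_size, fill: float = 0.5):
    """Choose a zoom factor so that the artifact's bounding box
    (height, width in pixels) fills about `fill` of the smaller
    display dimension, letting the user clearly see the problem."""
    box_h, box_w = artifact_box
    return fill * min(display_size) / max(box_h, box_w)
```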
Which indicators are displayed may depend on which kind of artifact is currently highlighted. The apparatus may have been configured to only offer a particular subset of corrective action alternatives, depending on whether it is assumed to be possible to correct the selected artifact with any of the available alternatives for corrective action. For example, it is hardly plausible to attempt correcting a large area of missing picture content through any other corrective action than taking a new image, while it may be possible to correct a small area of unfocused image content with filtering or other suitable processing. One of the displayed alternatives may be “no action” or “leave as is” or other indication of no action at all, to prepare for cases in which an algorithm for locating artifacts believes something to be an artifact, while it actually is an intended visual element of the (possibly panoramic) image.
In the exemplary case of
b illustrates another aspect of user interface interactivity. We may assume that a representation of such an artifact has been displayed, the correction of which is most advantageously done by acquiring a new component image. A certain part of the displayed image has been considered as problematic, i.e., as containing the artifact. This part is illustrated in the display with a frame 711 overlaid on the displayed image. According to the aspect illustrated in
An example of such instructions is illustrated in
A number of different mechanisms can be utilized to make the electronic image capturing device aware of what kinds of instructions it should give to the user. For example, the image currently provided by the viewfinder functionality (i.e. the electronic representation of an image that is dynamically read from the image sensor) can be compared with the image data of the displayed (possibly panoramic) image to find a match, at which location the current-view frame (illustrated in
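The matching mechanism mentioned above could, as one non-limiting possibility, be an exhaustive sum-of-absolute-differences search of the viewfinder image within the displayed image; a practical device would likely use a faster method, but the sketch shows the principle of finding the location at which the current-view frame should be drawn:

```python
import numpy as np

def locate_view(panorama: np.ndarray, view: np.ndarray):
    """Find where the viewfinder image best matches the displayed image
    by exhaustive sum-of-absolute-differences search, returning the
    (row, col) at which the current-view frame should be positioned."""
    ph, pw = panorama.shape
    vh, vw = view.shape
    best, best_pos = None, (0, 0)
    for r in range(ph - vh + 1):
        for c in range(pw - vw + 1):
            cost = float(np.abs(panorama[r:r + vh, c:c + vw] - view).sum())
            if best is None or cost < best:
                best, best_pos = cost, (r, c)
    return best_pos
```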
Irrespective of which mechanisms are used to instruct the user to prepare for taking a new image, a feedback signal could be given to the user when the apparatus concludes that the user has followed the instructions within an acceptable tolerance, so that the new image can be taken. Such feedback may comprise, e.g., flashing the correctly aligned instructive frames on display, outputting an audible signal, or even automatically acquiring the new image without requiring the user to separately actuate any shutter switch.
The most optimal settings (exposure, aperture, white balance, focus, etc.) that should be used in taking the new component image are possibly not the same as those that were used to take the original (component) image(s). As a part of giving instructions to the user about taking the new component image, instructions may be given about how to make the most appropriate settings. Such instructions could appear as displayed prompts, other visual signals, synthesized speech signals, or other. As an alternative, the apparatus could prepare certain settings for use automatically, and take them into use as a response to observing that the user has followed certain instructions, e.g. pointed and zoomed the electronic image acquisition device according to the suggested frame for the new component image.
c illustrates a feature that can be utilized to make it easier for a human user to make a judgment about a selected artifact. In a so-called regular displaying state 721, which may be for example the state illustrated as 501 in
Subsystems on the right in
According to an embodiment of the invention, the task of correcting located artifacts can be dedicated solely to acquiring new images and stitching them into a panoramic image (if one exists), instead of attempting any kind of corrective processing of the previously existing image data. In such cases the artifact correcting subsystem 810 is actually accommodated in the image acquisition subsystem 802 and the image data handling subsystem 806.
The apparatus of
An image acquisition subsystem in the apparatus of
The user input subsystem in the apparatus of
The subsystems of
The apparatus may comprise another processor or processors, and other functionalities than those illustrated in the exemplary embodiment of
The apparatus of
In order to offer a user the possibility of operating the first apparatus, it comprises a first displaying subsystem 1003 and a first user input subsystem 1004, both coupled to the first processing subsystem 1001. A first power subsystem 1005 is configured to provide the first apparatus with operating power.
The second apparatus comprises a second processing subsystem 1021 configured to perform digital data processing involved in executing methods and computer program products according to embodiments of the invention. Coupled to the second processing subsystem 1021 are a second image data handling subsystem 1026, an artifact locating subsystem 1027, an artifact evaluating subsystem 1028, an artifact data handling subsystem 1029, and an artifact correcting subsystem 1030. These resemble the correspondingly named subsystems in the apparatus of
A second power subsystem 1025 is configured to provide the second apparatus with operating power. A second operations control subsystem 1031 is configured to, and comprises means for, controlling the general operation of the second apparatus, including but not being limited to implementing changes of operating mode according to inputs from user(s), organising work between the different subsystems, distributing processor time, and allocating memory usage. A second displaying subsystem 1023 and a second user input subsystem 1024 may be provided and coupled to the second processing subsystem, but these are not necessary, at least not in all embodiments with two apparatuses like in
The arrangement of
A second way is an embodiment of the invention where a user utilizes the first apparatus for image data acquisition, sends image data over to the second apparatus for panoramic image processing and/or locating artifacts, and receives completed panoramic images and/or other feedback to the first apparatus. As an example, the first apparatus can be a portable electronic device equipped with both a digital camera and a communications part, and the second apparatus can be a server that is coupled to a network and used to offer image processing services to users over the network.
Assuming the last-mentioned purpose and configuration of the first and second apparatuses, an example of processing panoramic image data goes as follows. The user of the first apparatus acquires a number of component images with subsystem 1002 and sends them over to the second apparatus, with the image data handling operations being performed by the first image data handling subsystem 1006. The second apparatus receives the component images, stitches them into a panoramic image in subsystem 1026, and processes the panoramic image for locating and evaluating artifacts in subsystems 1027 and 1028 respectively. Characterisations of artifacts handled by subsystem 1029 are sent back to the first apparatus, together with the stitched panoramic image and possible other information that the user may use in deciding, whether the panoramic image should be improved and in which way. The first apparatus handles the characterisations and/or representations of artifacts in subsystem 1009, displays the panoramic image and representations of artifacts on subsystem 1003, and receives inputs from the user through subsystem 1004 concerning the required corrective action. Information associated with such corrective action that can be implemented through processing is transmitted to the second apparatus, which performs the corrective processing in subsystem 1030 and transmits the corrected panoramic image (or, if only a part of the panoramic image needed to be corrected, the corrected part of the panoramic image) back to the first apparatus. Depending on where the final form of the panoramic image is to be stored, the first apparatus may respond to corresponding user input by storing the corrected panoramic image locally and/or by transmitting to the second apparatus a request for storing the corrected panoramic image at the second apparatus or somewhere else in the network.
A third way is a variation of that described immediately above, with the difference that the first image data handling subsystem 1006 in the first apparatus is configured to stitch component images into an output compound image, so that what gets transmitted to the second apparatus is not component images but a single stitched output image. Here we may consider a continuum, starting from a single untouched image, through a single image where some parts have been changed, to a panoramic image that extends the field of view beyond what can be obtained in a single view, so that the first apparatus may transmit even a single image to the second apparatus. This embodiment of the invention saves transmission bandwidth, because the second apparatus only needs to transmit back the characterisations of artifacts; at that stage both apparatuses already possess that form of the (possibly panoramic) image for which these characterisations are pertinent. Representations of the artifacts may be generated in subsystem 1009 and shown to the user on subsystem 1003, and corrective action may be implemented in the first apparatus, by acquiring one or more additional images and/or by applying corrective processing. The first apparatus may produce a corrected (possibly panoramic) image after such corrective action. It is optional whether the corrected image should be once more transmitted to the second apparatus for another round of locating artifacts and returning artifact data.
The exemplary embodiments of the invention presented in this patent application are not to be interpreted to pose limitations to the applicability of the appended claims. The verb “to comprise” is used in this patent application as an open limitation that does not exclude the existence of also unrecited features. The features recited in depending claims are mutually freely combinable unless otherwise explicitly stated.