The present invention relates to a method of capturing digital images, and to an image capturing apparatus. In particular, the invention relates to determination of motions present in an image, and storing an indication of the motions associated with the image.
An increasing amount of multimedia content in apparatuses, such as mobile telephones with camera capabilities or digital cameras, gives an increased desire to assign proper metadata to the pieces of content for facilitating management of the multimedia content.
Metadata has traditionally been information about creator, naming of content, date, number, etc. Within imaging, data such as light sensitivity settings, shutter speed, time, date, and manually entered text tags have been present. However, when a picture is captured, there are other circumstances that may be of importance for managing a stock of images, but which may be too cumbersome to describe in e.g. the text tag. Therefore, there is a desire to provide at least some such circumstances automatically in the metadata.
The present invention is based on the understanding that during the capturing of an image, information can be collected about activity in the scene. This information can be stored as metadata, which for example can be utilised during rendering of the image to enhance the expression of the image.
According to a first aspect, there is provided a method of capturing digital images. The method comprises registering an image projected on an image sensor; determining motions present in the image; determining a metric representing an amount of the motions; and storing the registered image with associated meta data comprising the metric.
The meta data may be stored in a meta data field of the file of the registered image, in a meta data file separate from the file of the registered image, or in a database with an index associating the meta data to the file of the registered image.
The determining of motions may comprise capturing at least two frames of pictures separate in time; providing the frames to a video encoder; and receiving present motions from the video encoder as vectors.
The determining of motions may comprise capturing at least two frames of pictures separate in time; and determining a shift from one of the frames to another, wherein the motions are described by at least one vector based on the shift.
The determining of the metric may comprise analyzing the at least one vector; and assigning a metric based on the vector analysis. The analysis may provide at least two vectors, and the analyzing of the at least two vectors may comprise averaging of the size of the vectors. The analyzing of the vectors may comprise normalising the vectors by a theoretical maximum of vectors to represent motions in the image. The analyzing of the at least one vector may comprise filtering of the vectors. The analyzing of the at least one vector may comprise compensating for global motions of the image.
The determining of motions and determining the metric may be performed by recording a video clip; determining the motions and metric; and deleting the video clip.
The determining of motions may be performed during a period where an autofocus function of optics projecting the image to the image sensor is operating.
The determining of motions may be performed on a reduced resolution image compared to the registered image.
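As an illustration of motion determination on a reduced resolution image, the following is a minimal sketch, assuming a grayscale frame held as a list of rows of integer intensities. The function name `downsample` and the parameter `factor` are hypothetical and do not appear in the description; averaging over square pixel groups is only one possible way of reducing the resolution.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities


def downsample(frame: Frame, factor: int = 2) -> Frame:
    """Reduce resolution by averaging each factor x factor pixel group.

    Rows and columns that do not fill a complete group are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if not frame or not frame[0]:
        return []
    height = len(frame) - len(frame) % factor
    width = len(frame[0]) - len(frame[0]) % factor
    reduced: Frame = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            total = sum(
                frame[y + r][x + c]
                for r in range(factor)
                for c in range(factor)
            )
            row.append(total // (factor * factor))  # integer mean of the group
        reduced.append(row)
    return reduced
```

A shift measured on frames reduced in this way is expressed in reduced-resolution pixels, so it would need to be multiplied by `factor` if it is to be related to the registered image.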
According to a second aspect, there is provided an image capturing apparatus comprising an image sensor; optics arranged to project an image on the image sensor; a signal processor arranged to receive signals provided by the image sensor, to determine motions present in the image, and to determine a metric representing an amount of the motions; and a memory arranged to store a registered image with associated meta data comprising the metric.
The apparatus may be arranged to store the meta data in a meta data field of the file of the registered image, in a meta data file separate from the file of the registered image, or in a database with an index associating the meta data to the file of the registered image.
The signal processor may comprise a video encoder arranged to receive at least two frames of pictures separate in time and to provide present motions as vectors. The signal processor may further comprise a vector processing mechanism arranged to provide an average of the size of the vectors, filter the vectors, normalise the vectors, or compensate for global motions of the image, or any combination thereof, wherein the metric is determined from an output of the vector processing mechanism.
The optics projecting the image to the image sensor may comprise an autofocus function, and a control signal may be provided when the autofocus function is operating, wherein the determining of motions is arranged to be performed during a period when the control signal indicates operation of the autofocus function.
Here, as a rule of thumb, for economy versions, all processing can be performed in the same processor that is handling other applications of the apparatus. Often, in such a case, the size of the image and performance may be limited by the shared performance of the application processor. In more sophisticated versions, a video encoder is provided, and the approach described above can be utilized. Thus, the processing capability may not need to be shared with other applications, and performance and capability are increased. For even more sophisticated versions, multiple video encoders can be utilized, and the image sensor itself can also comprise some processing. In those cases, even small details in the images can be considered for determination of motions, and a high granularity of representation of activity is enabled.
The determination of shift can be based on block matching algorithms, wherein the amount of changed/unchanged blocks between the frames is determined. Alternatively, the determination of shift can be based on another division of the image into parts, e.g. by recognizing objects and their shifts between the images, or be based on a complex analysis of the aggregate representation of the content of the image.
An example of a practical approach is to capture a short video sequence, i.e. a video clip, at the time of capturing the picture. From the video clip, motions and metric are determined according to the video encoder approach demonstrated above, and then the video clip is erased. Another example of a practical implementation is to perform the motion determination on reduced resolution images compared to the registered and stored image. A further example of a practical implementation is to activate the motion determination during a period where an autofocus mechanism of the optics is activated. Any combination of these practical implementations is of course further advantageous.
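The block matching determination of shift mentioned above can be sketched as follows. This is a minimal illustration only, assuming two grayscale frames of equal size held as lists of rows, an exhaustive search over a small window, and a sum of absolute differences as the matching cost. The names `block_motion_vectors`, `block` and `search` are hypothetical; a video encoder would typically use a considerably more elaborate search.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities


def sad(prev: Frame, curr: Frame, y: int, x: int,
        dy: int, dx: int, block: int) -> int:
    """Sum of absolute differences between the block of curr at (y, x)
    and the block of prev at (y - dy, x - dx)."""
    total = 0
    for r in range(block):
        for c in range(block):
            total += abs(curr[y + r][x + c] - prev[y - dy + r][x - dx + c])
    return total


def block_motion_vectors(prev: Frame, curr: Frame,
                         block: int = 4, search: int = 2) -> List[Tuple[int, int]]:
    """Return one (dy, dx) vector per block of curr, in row-major block order.

    Each vector is the shift from prev to curr that gives the lowest cost;
    a block with no better match than its own position gets (0, 0).
    """
    height, width = len(curr), len(curr[0])
    vectors = []
    for y in range(0, height - block + 1, block):
        for x in range(0, width - block + 1, block):
            best = (0, 0)
            best_cost = sad(prev, curr, y, x, 0, 0, block)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates whose source block falls outside prev.
                    if not (0 <= y - dy <= height - block
                            and 0 <= x - dx <= width - block):
                        continue
                    cost = sad(prev, curr, y, x, dy, dx, block)
                    if cost < best_cost:
                        best_cost, best = cost, (dy, dx)
            vectors.append(best)
    return vectors
```

For two 12 x 12 frames in which a small bright square inside the centre block moves one pixel to the right, the sketch yields the vector (0, 1) for the centre block and (0, 0) for the eight surrounding blocks.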
For provision of a representation of activity that can be properly used, e.g. at rendering of the picture, a proper metric representing the motions is determined in a metric determination step 104. The metric can be determined by analyzing the vectors, and then assigning a metric based on the analysis. The analyzing can comprise averaging of the vectors to form the metric. Filtering and/or normalizing of the vectors can be performed to get a proper representation. The normalising of the vectors is preferably done in view of a theoretical maximum of vectors to represent motions in the image. Thus, considering a case with a single large motion compared to a case with many smaller motions, especially when averaging is applied, normalisation may then give a more representative metric of the motion of the scene. The theoretical maximum of vectors can be determined from the video encoder in use, or from a capability limit of the processing means.
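The vector analysis described above can be illustrated by the following minimal sketch. It rests on assumptions that are not stated in the description: vectors are (dy, dx) pairs, filtering is taken to mean discarding vectors shorter than a noise floor, and the "theoretical maximum of vectors" is read as the largest number of vectors that the encoder or processing means could deliver for one image. The names `mean_magnitude`, `filter_vectors`, `normalised_metric`, `noise_floor` and `max_vectors` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]  # (dy, dx) shift of one image part


def mean_magnitude(vectors: Sequence[Vector]) -> float:
    """Average size of the vectors that are actually present."""
    if not vectors:
        return 0.0
    return sum(math.hypot(dy, dx) for dy, dx in vectors) / len(vectors)


def filter_vectors(vectors: Sequence[Vector], noise_floor: float) -> List[Vector]:
    """Discard vectors shorter than noise_floor (an assumed form of filtering)."""
    return [v for v in vectors if math.hypot(v[0], v[1]) >= noise_floor]


def normalised_metric(vectors: Sequence[Vector], max_vectors: int) -> float:
    """Total vector size divided by the assumed maximum number of vectors,
    so that image parts without motion count as zero-length vectors."""
    if max_vectors <= 0:
        raise ValueError("max_vectors must be positive")
    return sum(math.hypot(dy, dx) for dy, dx in vectors) / max_vectors
```

Under this reading, a scene with a single vector of size 8 out of 16 possible vectors gives a plain average of 8 but a normalised metric of 0.5, whereas 16 vectors of size 1 give a plain average of 1 and a normalised metric of 1, which shows how the choice of normalisation changes the comparison between one large motion and many smaller motions.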
Compensation for global motions, i.e. where the whole image moves the same way during capturing, e.g. because it is hard to keep the camera steady when shooting the picture, can be provided to get a representation of true motion in the sense of the expression of the picture, and not a representation of a shaky hand.
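One conceivable way of compensating for global motions is sketched below. The description does not specify how the global motion is estimated; the sketch assumes that it can be approximated by the component-wise median of all vectors, which is then subtracted from each vector. The name `compensate_global_motion` is hypothetical.

```python
from statistics import median
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]  # (dy, dx) shift of one image part


def compensate_global_motion(vectors: Sequence[Vector]) -> List[Vector]:
    """Subtract an estimate of the common (global) shift from every vector.

    The global shift is assumed to be the component-wise median, which is
    little affected by a minority of parts that move on their own.
    """
    if not vectors:
        return []
    global_dy = median(dy for dy, _ in vectors)
    global_dx = median(dx for _, dx in vectors)
    return [(dy - global_dy, dx - global_dx) for dy, dx in vectors]
```

With four vectors of (1, 2) caused by camera shake and one vector of (4, 6) for a moving object, the estimated global shift is (1, 2), leaving (0, 0) for the background parts and (3, 4) for the moving object.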
When the metric is determined, it is stored as metadata of the image in a metadata storing step 106. The metadata can be stored in a data field of the stored image, in a separate metadata file together with the image file, or in a metadata database with an index associating it with the image file.
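Two of the storage alternatives above, the separate metadata file and the database indexed by image file, can be sketched as follows. This is a minimal illustration assuming a JSON file placed next to the image and an SQLite database; the file naming, the key `motion_metric`, the table `image_metadata` and all function names are hypothetical. Storage in a data field of the image file itself depends on the image file format and is not shown.

```python
import json
import sqlite3
from pathlib import Path
from typing import Optional, Union


def write_sidecar(image_path: Path, metric: float) -> Path:
    """Write the metric to a separate metadata file next to the image file."""
    sidecar = image_path.with_name(image_path.name + ".json")
    sidecar.write_text(json.dumps({"image": image_path.name,
                                   "motion_metric": metric}))
    return sidecar


def _ensure_table(conn: sqlite3.Connection) -> None:
    # The image path serves as the index associating metadata with the file.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_metadata ("
        "image_path TEXT PRIMARY KEY, motion_metric REAL)"
    )


def store_in_database(conn: sqlite3.Connection,
                      image_path: Union[str, Path], metric: float) -> None:
    """Store or update the metric for one image file."""
    _ensure_table(conn)
    conn.execute(
        "INSERT OR REPLACE INTO image_metadata VALUES (?, ?)",
        (str(image_path), metric),
    )
    conn.commit()


def load_from_database(conn: sqlite3.Connection,
                       image_path: Union[str, Path]) -> Optional[float]:
    """Return the stored metric for an image file, or None if there is none."""
    _ensure_table(conn)
    row = conn.execute(
        "SELECT motion_metric FROM image_metadata WHERE image_path = ?",
        (str(image_path),),
    ).fetchone()
    return None if row is None else row[0]
```

A database of this kind allows a stock of images to be sorted or searched by amount of motion without opening the image files, whereas a separate metadata file stays with the image when it is copied together with it.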
As discussed above, the provision of vectors can be made in other ways as well. Video encoding models are a feasible way, as such models often provide a vector based representation. Other models that are not vector based can also be used, where the amount of motion is determined from other parameters provided by video encoding approaches arranged to provide a reduced bit rate representation of dynamic scenes.