1. Field of the Invention
The present invention relates to a technology for displaying image data related to a region of a user's interest.
2. Description of the Related Art
A recent trend toward small-size, large-capacity internal memories, such as those used in digital cameras, allows a large amount of image data to be accumulated. As a result, the amount of previously captured image data grows, and the user may have to spend a long time searching for desired image data among a large amount of image data.
To solve this problem, Japanese Patent Application Laid-Open No. 2010-211785 discusses a technology for increasing the efficiency of searching for photographs that contain the same face by detecting the face region in each photograph and grouping the photographs based on the similarity of the face regions.
With the technology discussed in Japanese Patent Application Laid-Open No. 2010-211785, however, if a photograph includes the faces of a plurality of persons but the user is interested in only one of them, photographs including the other persons' faces may be grouped into the same group. Grouping the photographs in this way can make it difficult to search for the photographs that include the person of interest.
One aspect of the present invention is directed to a technology for displaying image data related to a region of a user's interest without cumbersome operations.
According to an aspect of the present invention, an information processing device includes a memory and a processor, coupled to the memory, wherein the processor controls a display control unit configured to enlarge or reduce and display an area of a part of image data, displayed on a display unit, in response to a user operation, and a determination unit configured to, based on objects included in the area enlarged or reduced and displayed by the display control unit and on objects included in a plurality of image data pieces, determine a display order of the plurality of image data pieces.
Further features of the present invention will become apparent from the following description of exemplary embodiments (with reference to the attached drawings).
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the invention and, together with the description, serve to explain the principles of the invention.
Various exemplary embodiments, features, and aspects of the invention will be described in detail below with reference to the drawings.
For description purposes, the first exemplary embodiment of the present invention refers to a mobile phone as an example of an information processing device. In addition, in the first exemplary embodiment, for discussion purposes, the image data is the digital data of a photograph, and the objects in the image data are the persons in the photograph. Finally, again for description purposes, information on the persons in the photograph is added in advance to the image data as metadata. The following describes a case where a user browses image data on the mobile phone.
In browsing the image data, the user performs a flick operation with a finger to move the image data horizontally, e.g., from right to left or from left to right, in order to browse various pieces of image data. The flick operation refers to the user sliding a finger on a touch panel of a display to move the image data in a horizontal direction. The user can also perform a pinch operation to enlarge or reduce displayed image data. The pinch operation refers to the user sliding at least two fingers on the touch panel in opposite directions to enlarge or reduce the image.
A scroll unit 1014 moves image data horizontally via a flick operation on the touch panel 1002. A holding unit 1015 holds a plurality of image data pieces. An acquisition unit 1016 acquires image data from the holding unit 1015. A cache unit 1017 holds image data, which will be preferentially displayed according to a display order, in a Random Access Memory (RAM) 1022. An order determination unit 1018 determines the order in which image data held in the RAM 1022 is to be displayed. An extraction unit 1019 extracts information on a person in the image data from the metadata attached to the image data, to identify the person in the image data.
The CPU 1021 loads the above-described program from the ROM 1023 into the RAM 1022 and executes it to implement the input unit 1011, the display unit 1012, the enlargement/reduction unit 1013, the scroll unit 1014, the acquisition unit 1016, the cache unit 1017, the order determination unit 1018, and the extraction unit 1019. The holding unit 1015 corresponds to the memory card 1025, and the touch panel 1024 corresponds to the touch panel 1002.
The acquisition unit 1016 acquires (first acquisition) the information on the number of persons who are included in the image data P and who are also included in the image data Q, and calculates the recall as follows:
Recall=2 persons (=A,B)/3 persons (=number of persons in image data P)=2/3
In addition, the acquisition unit 1016 calculates the precision based on the number of persons who are included in the image data Q and who are also included in the image data P. The acquisition unit 1016 acquires (second acquisition) the information on the number of persons who are included in the image data Q and who are also included in the image data P and calculates the precision as follows:
Precision=2 persons (=A,B)/2 persons (=number of persons in image data Q)=1
After that, the acquisition unit 1016 calculates the matching degree between the image data P and the image data Q using the product of the recall and the precision as follows:
Matching degree between image data P and image data Q=Recall×Precision=2/3×1=2/3
Similarly, the acquisition unit 1016 calculates the matching degree between the image data P and the image data R and the matching degree between the image data P and the image data S as follows:
Matching degree between image data P and image data R=3/3×3/3=1
Matching degree between image data P and image data S=2/3×2/2=2/3
From the image data stored in the holding unit 1015, the acquisition unit 1016 acquires only the image data whose matching degree calculated as described above is greater than or equal to the threshold (for example, greater than or equal to ½). After that, the order determination unit 1018 stores the image data, acquired by the acquisition unit 1016, sequentially in the positions [0], [1], [2], [3], and [4] in the cache 2011 in descending order of the matching degree.
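For illustration only, the matching degree calculation and the cache ordering described above can be sketched as follows, assuming each piece of image data is represented simply by the set of person identifiers in its metadata. The function names and data structures are illustrative and are not part of the embodiments, and the person set assumed for the image data S is inferred from the worked numbers above.

```python
def matching_degree(reference_persons, candidate_persons):
    """Recall x precision between the displayed image data and a candidate,
    each represented by the set of persons it contains."""
    if not reference_persons or not candidate_persons:
        return 0.0
    common = reference_persons & candidate_persons
    recall = len(common) / len(reference_persons)
    precision = len(common) / len(candidate_persons)
    return recall * precision

def build_cache(reference_persons, stored_images, threshold=0.5, cache_size=5):
    """Keep only image data whose matching degree is >= threshold and
    order it by descending matching degree (cache positions [0] to [4])."""
    scored = [(matching_degree(reference_persons, persons), image_id)
              for image_id, persons in stored_images.items()]
    scored = [pair for pair in scored if pair[0] >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:cache_size]]

# Worked example: P = {A, B, C}, Q = {A, B}, R = {A, B, C}; S = {B, C} is an
# illustrative set consistent with the matching degrees given in the description.
stored = {"P": {"A", "B", "C"}, "Q": {"A", "B"},
          "R": {"A", "B", "C"}, "S": {"B", "C"}}
print(build_cache(stored["P"], stored))
# P and R score 1, Q and S score 2/3, so P and R are placed before Q and S.
```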
Next, the display processing is described with reference to the flowchart in the drawings. In step S401, the input unit 1011 determines whether the user performs a flick operation on the displayed image data.
In step S402, the display unit 1012 moves the image data horizontally according to the user's flick operation. For example, when the user flicks left with the image data P (image data in the position [0] in the cache 2011) displayed, the display unit 1012 moves the displayed image data P to the left until it is no longer visible on the screen. At the same time, the display unit 1012 moves the image data in the position [1] in the cache 2011 into the screen from the right end of the screen and displays it in the center of the screen.
In step S403, the input unit 1011 determines whether the user performs the pinch operation on the displayed image data P. If the user performs the pinch operation (YES in step S403), the processing proceeds to step S404. On the other hand, if the user does not perform the pinch operation (NO in step S403), the processing returns to step S401. In step S404, the enlargement/reduction unit 1013 enlarges or reduces the image data according to the pinch operation and the display unit 1012 displays the enlarged or reduced image data. For example, assume that the pinch operation is performed on the image data P 7001 shown in the drawings and that an area 7002 of the image data P is enlarged and displayed. In this case, the coordinates of the displayed area 7002 are, for example, as follows:
Coordinates of top-left corner=(30, 40), Coordinates of bottom-right corner=(90, 90)
Similarly, the coordinates of the displayed area are obtained for other enlarged and displayed image data, such as the image data 7003 shown in the drawings.
In step S405, the acquisition unit 1016 detects the persons in the enlarged or reduced image data based on the metadata associated with the enlarged or reduced image data. For example, in the example in the drawings, the coordinates of the persons A and B are included in the displayed area 7002, so the enlarged image data P includes the two persons A and B. The acquisition unit 1016 then recalculates the matching degree between the enlarged image data P and each piece of image data stored in the holding unit 1015. In step S503, if the image data to be processed is the image data Q, the acquisition unit 1016 calculates the recall as follows based on the number of persons who are included in the image data Q and who are also included in the enlarged image data P:
Recall=2 persons (=A,B)/2 persons (=number of persons in enlarged image data P)=1
In step S504, the acquisition unit 1016 calculates the precision as follows based on the number of persons who are included in the image data Q and who are included also in the enlarged image data P:
Precision=2 persons (=A,B)/2 persons (=number of persons in image data Q)=1
In step S505, the acquisition unit 1016 calculates the matching degree between the enlarged image data P and the image data Q as follows:
Matching degree between enlarged image data P and image data Q=Recall×Precision=1×1=1
In step S506, the acquisition unit 1016 determines whether the image data being processed is the last image data stored in the holding unit 1015. If the image data is the last image data (YES in step S506), the processing proceeds to step S407. On the other hand, if the image data is not the last image data (NO in step S506), the processing proceeds to step S507. In step S507, the acquisition unit 1016 adds 1 to the counter N. After that, the processing returns to step S502 and the acquisition unit 1016 performs the processing for the next image data.
In step S503, if the next image data is the image data R indicated by the metadata 3002 in the drawings, the acquisition unit 1016 calculates the recall as follows:
Recall=2 persons (=A,B)/2 persons (=number of persons in enlarged image data P)=1
In step S504, the acquisition unit 1016 calculates the precision as follows based on the number of persons who are included in the image data R and who are included also in the enlarged image data P:
Precision=2 persons (=A,B)/3 persons (=number of persons in image data R)=2/3
In step S505, the acquisition unit 1016 calculates the matching degree between the enlarged image data P and the image data R as follows:
Matching degree between enlarged image data P and image data R=Recall×Precision=1×2/3=2/3
In step S503, if the next image data is the image data S indicated by the metadata 3003 in the drawings, the acquisition unit 1016 calculates the recall as follows:
Recall=1 person (=B)/2 persons (=number of persons in enlarged image data P)=1/2
In step S504, the acquisition unit 1016 calculates the precision as follows based on the number of persons who are included in the image data S and who are included also in the enlarged image data P:
Precision=1 person (=B)/2 persons (=number of persons in image data S)=1/2
In step S505, the acquisition unit 1016 calculates the matching degree between the enlarged image data P and the image data S as follows:
Matching degree between enlarged image data P and image data S=Recall×Precision=1/2×1/2=1/4
After the matching degree has been calculated for the last image data as described above, the processing proceeds to step S407.
In step S407, the acquisition unit 1016 acquires only the image data whose matching degree recalculated as described above is greater than or equal to the threshold (for example, greater than or equal to ½), from the image data stored in the holding unit 1015. The order determination unit 1018 sequentially stores the image data, acquired by the acquisition unit 1016, in the positions [0], [1], [2], [3], and [4] in the cache 2011 in descending order of the matching degree. In this way, the image data stored in the cache 2011 and its storing order are changed.
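For illustration only, steps S405 to S407 can be outlined as follows, assuming the metadata gives one coordinate position per person and the displayed area is given by its top-left and bottom-right coordinates as in the example above. The positions and person sets in the example are illustrative.

```python
def persons_in_area(person_positions, top_left, bottom_right):
    """Persons whose metadata coordinates fall inside the displayed rectangle."""
    (x1, y1), (x2, y2) = top_left, bottom_right
    return {person for person, (x, y) in person_positions.items()
            if x1 <= x <= x2 and y1 <= y <= y2}

def reorder_after_pinch(all_positions, stored_images, top_left, bottom_right,
                        threshold=0.5, cache_size=5):
    """Outline of steps S405 to S407: detect the persons shown in the enlarged
    area, recalculate recall x precision against every stored image, and
    rebuild the cache order."""
    visible = persons_in_area(all_positions, top_left, bottom_right)
    scored = []
    for image_id, persons in stored_images.items():
        common = visible & persons
        recall = len(common) / len(visible) if visible else 0.0
        precision = len(common) / len(persons) if persons else 0.0
        score = recall * precision
        if score >= threshold:
            scored.append((score, image_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:cache_size]]

# Illustrative positions, with A and B inside the area (30, 40)-(90, 90):
positions = {"A": (45, 60), "B": (75, 70), "C": (10, 20)}
stored = {"P": {"A", "B", "C"}, "Q": {"A", "B"}, "R": {"A", "B", "C"}, "S": {"B", "C"}}
print(reorder_after_pinch(positions, stored, (30, 40), (90, 90)))
# Matching degrees: Q=1, P=R=2/3, S=1/4 (below the threshold), so Q comes first.
```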
While the present exemplary embodiment describes that metadata on the persons in image data is attached to the image data in advance, object recognition such as face recognition may also be performed at image data browsing time.
Although the matching degree is calculated based on the recall and the precision in the present exemplary embodiment, the calculation method of the matching degree is not limited to this method. For example, the matching degree may be calculated according to how many persons in the area 7002 match the persons in the target image data. In this case, the matching degree between the enlarged image data P (=A and B are in the image data) and the image data Q (=A and B are in the image data) is 2 (=A and B), and the matching degree between the enlarged image data P and the image data R (=A, B, and C are in the image data) is also 2 (=A and B). When the recall and the precision are used, the matching degree of the image data Q is determined to be higher than the matching degree of the image data R as described above. This is because the matching degree is reduced by the fact that the image data R includes person C who is not in the enlarged image data P. This means that using the recall and the precision allows the matching degree to be calculated more accurately.
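For illustration only, the count-based alternative mentioned above can be contrasted with the recall-and-precision measure in a short sketch; the person sets are those of the example above.

```python
def matching_degree_by_count(area_persons, candidate_persons):
    """Alternative measure: count the persons shared with the enlarged area."""
    return len(area_persons & candidate_persons)

# With the enlarged area containing {A, B}: Q = {A, B} and R = {A, B, C} both
# score 2 by count, while recall x precision ranks Q (1) above R (2/3) because
# R also contains person C, who is not in the enlarged area.
print(matching_degree_by_count({"A", "B"}, {"A", "B"}))       # 2
print(matching_degree_by_count({"A", "B"}, {"A", "B", "C"}))  # 2
```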
Although, in the present exemplary embodiment, the image data is stored in the holding unit 1015 (memory card 1025) and the image data whose matching degree is greater than or equal to the threshold is stored in the cache 2011, the image data may also be stored in a location other than the holding unit 1015 (memory card 1025). For example, the image data may be stored on a network server, and the user may access the image data via a network. In this case, the image data may be displayed in the order based on the matching degree without using the cache 2011. As long as the display order is determined based on the matching degree, this method also achieves the effect that the user can quickly find image data related to the user's interest. However, storing image data in the cache 2011 provides the user with additional effects. One is that image data stored in the cache 2011 can be displayed more quickly when the image data is moved horizontally. Another is that selecting only the image data whose matching degree is greater than or equal to the threshold reduces the amount of memory usage. In addition, while the image data in the present exemplary embodiment is scrolled using a flick operation, the image data may also be scrolled using a key or voice input.
As described above, in the present exemplary embodiment, image data related to an area of the user's interest can be preferentially displayed without cumbersome operations.
Next, a second exemplary embodiment is described. In the first exemplary embodiment, the coordinate positions of the persons in the image data are used as the coordinate positions of the persons in the metadata, as shown in the metadata 2002 in the drawings. In the second exemplary embodiment, the metadata holds, for each person, information on the area that encloses the person, and the ratio of that area displayed in the enlarged or reduced image data (display ratio) is used for calculating the matching degree.
The area enclosing each object corresponding to a person in the image data is represented by a rectangular area such as the one indicated by the dotted line in image data 9001 in the drawings. In this example, when the user enlarges the image data P, 90% of the rectangular area of person A and 60% of the rectangular area of person B are included in the displayed area, so the display ratio of A is 0.9 and the display ratio of B is 0.6.
Of the two persons A and B in the enlarged image data P, the image data T includes only person A. Therefore, the acquisition unit 1016 calculates the recall and the precision as follows:
Recall=0.9 (=display ratio of A)/2 persons (=number of persons in enlarged image data P)=0.45
Precision=0.9 (=display ratio of A)/2 persons (number of persons in image data T)=0.45
From the recall and the precision calculated as described above, the acquisition unit 1016 calculates the matching degree between the image data P and the image data T as follows:
Matching degree between image data P and image data T=Recall×Precision=0.45×0.45≈0.2
On the other hand, of the two persons A and B in the enlarged image data P, the image data U includes only person B. Therefore, the acquisition unit 1016 calculates the recall and the precision as follows:
Recall=0.6 (display ratio of B)/2 persons (number of persons in enlarged image data P)=0.3
Precision=0.6 (=display ratio of B)/2 persons (=number of persons in image data U)=0.3
From the recall and the precision calculated as described above, the acquisition unit 1016 calculates the matching degree between the image data P and the image data U as follows:
Matching degree between image data P and image data U=Recall×Precision=0.3×0.3=0.09
As described above, the matching degree between the image data P and the image data T is larger than the matching degree between the image data P and the image data U. Therefore, the order determination unit 1018 determines the order so that, in the cache 2011, the image data T is placed nearer to the image data P than the image data U. When the user enlarges the image data P as shown by the image data 7003 in the drawings and then scrolls the image data, the image data T is therefore displayed before the image data U.
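For illustration only, the display-ratio-based calculation of the second exemplary embodiment can be sketched as follows. The rectangles below are illustrative values chosen so that the display ratios come out to 0.9 and 0.6, and "X" stands in for the unnamed second person in the image data T and U, respectively.

```python
def rect_area(rect):
    (x1, y1), (x2, y2) = rect
    return max(0, x2 - x1) * max(0, y2 - y1)

def intersect(rect_a, rect_b):
    (ax1, ay1), (ax2, ay2) = rect_a
    (bx1, by1), (bx2, by2) = rect_b
    return ((max(ax1, bx1), max(ay1, by1)), (min(ax2, bx2), min(ay2, by2)))

def display_ratio(person_rect, displayed_rect):
    """Fraction of the person's rectangular area that lies inside the displayed area."""
    area = rect_area(person_rect)
    return rect_area(intersect(person_rect, displayed_rect)) / area if area else 0.0

def matching_degree_with_ratios(displayed_person_rects, displayed_rect, candidate_persons):
    """Recall x precision using display ratios instead of simple person counts."""
    shared = [p for p in displayed_person_rects if p in candidate_persons]
    total = sum(display_ratio(displayed_person_rects[p], displayed_rect) for p in shared)
    recall = total / len(displayed_person_rects) if displayed_person_rects else 0.0
    precision = total / len(candidate_persons) if candidate_persons else 0.0
    return recall * precision

# Illustrative rectangles: A is 90% visible and B is 60% visible in the displayed area.
displayed = ((0, 0), (100, 100))
rects = {"A": ((0, -10), (100, 90)), "B": ((40, 0), (140, 60))}
print(matching_degree_with_ratios(rects, displayed, {"A", "X"}))  # 0.2025, about 0.2
print(matching_degree_with_ratios(rects, displayed, {"B", "X"}))  # 0.09
```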
Although metadata associated with the persons in image data is attached to the image data in advance in the present exemplary embodiment as in the first exemplary embodiment, the face recognition processing can be performed at image data browsing time. While image data in the present exemplary embodiment is scrolled using a flick operation, the image data can be scrolled using a key press or voice input.
A third exemplary embodiment will now be described. In the second exemplary embodiment, the information indicating the rectangular area of a person in image data is stored as metadata. However, when a photograph is taken with the face of a person tilted, as with person C indicated by image data 12001 in the drawings, the rectangular area enclosing the person becomes larger than the person and includes a large background portion, so the display ratio may not be calculated accurately. In the third exemplary embodiment, the area of a person is therefore represented by an area that fits the person more closely, such as an elliptical area.
A fourth exemplary embodiment will now be described. While the user performs a multi-touch-based pinch operation to enlarge or reduce image data in the first to third exemplary embodiments, the fourth exemplary embodiment provides additional methods for enlarging or reducing image data. For example, a dialog for entering an enlargement ratio or a reduction ratio may be displayed to prompt the user to enter a numeric value, or the user may speak "enlarge" or "reduce" to specify the enlargement/reduction operation by voice. Another method is to display a display frame, such as the one shown by a frame 13002 in the drawings, and to enlarge or reduce the image data according to the area enclosed by the frame.
A fifth exemplary embodiment will now be described. While the order determination unit 1018 determines to display image data in descending order of the matching degree in the first exemplary embodiment, the image data need not be displayed in descending order. In the present exemplary embodiment, for example, the order determination unit 1018 may determine to display image data in ascending order of the matching degree. This is useful, for example, when searching for image data that does not include the persons in the specified area.
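For illustration only, the ascending-order variant can be sketched as a change to the ordering step of the earlier sketches; the names are illustrative.

```python
def build_cache_ascending(area_persons, stored_images, cache_size=5):
    """Order image data by ascending matching degree (recall x precision), so
    that image data least related to the specified area is displayed first."""
    def degree(persons):
        if not area_persons or not persons:
            return 0.0
        common = area_persons & persons
        return (len(common) / len(area_persons)) * (len(common) / len(persons))
    ordered = sorted(stored_images.items(), key=lambda item: degree(item[1]))
    return [image_id for image_id, _ in ordered[:cache_size]]
```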
Although only one target area is specified as a target area to be enlarged or reduced and displayed in the first to third exemplary embodiments, two or more target areas may also be specified. Although a person, its position information, and its area information are used as the object information in the first to third exemplary embodiments, a non-person object may also be used. Depth information and color information may also be used in addition to the object position information and the object area information. The area may be any closed area, not just a rectangular area or an elliptical area.
Aspects of the present invention can also be realized by a computer of a system or apparatus (or devices such as a CPU or MPU) that reads out and executes a program recorded on a memory device to perform the functions of the above-described embodiments, and by a method, the steps of which are performed by a computer of a system or apparatus by, for example, reading out and executing a program recorded on a memory device to perform the functions of the above-described embodiments. For this purpose, the program is provided to the computer for example via a network or from a recording medium of various types serving as the memory device (e.g., computer-readable storage medium). In such a case, the system or apparatus, and the recording medium where the program is stored, are included as being within the scope of the present invention.
The exemplary embodiments described above enable the user to preferentially display the image data related to a region of the user's interest without cumbersome operations.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all modifications, equivalent structures, and functions.
This application claims priority from Japanese Patent Application No. 2011-084020 filed Apr. 5, 2011, which is hereby incorporated by reference herein in its entirety.