The present application claims priority upon Japanese Patent Application No. 2007-038350 filed on Feb. 19, 2007, and Japanese Patent Application No. 2007-315244 filed on Dec. 5, 2007, which are herein incorporated by reference.
1. Technical Field
The present invention relates to category classification apparatuses, category classification methods, and storage media storing a program.
2. Related Art
Apparatuses have been proposed that classify the category to which an image to be classified belongs and perform processing suitable for the classified category. For example, an apparatus has been proposed that classifies the category of an image based on its image data and performs enhancement processing suitable for the classified category (see WO 2004/30373). With this apparatus, the hue of pixels within a subject region is calculated based on the image data. Then, the category of the image (portrait, landscape, etc.) is classified in accordance with the proportion of pixels having a specific hue.
For this kind of category classification, there is a demand to speed up processing.
An advantage of some aspects of the present invention is that it is possible to speed up processing.
An aspect of the invention is a category classification apparatus comprising:
Other features of the present invention will become clear by reading the description of the present specification with reference to the accompanying drawings.
For a more complete understanding of the present invention and the advantages thereof, reference is now made to the following description taken in conjunction with the accompanying drawings wherein:
At least the following matters will be made clear by the present specification and the accompanying drawings.
A category classification apparatus can be realized that comprises:
With such a category classification apparatus, the overall characteristic amount obtaining section obtains the overall characteristic amount based on the plurality of partial characteristic amounts. Thus, compared to the case where the partial characteristic amounts and the overall characteristic amount are obtained separately, the processing can be sped up.
In this category classification apparatus, it is preferable that the partial characteristic amount obtaining section obtains, as the partial characteristic amounts, partial average information given as an average value of colors of each of a plurality of pixels constituting the partial image data and partial variance information indicating a variance of colors of each of the plurality of pixels constituting the partial image data, and
With such a category classification apparatus, the partial average information and the partial variance information are obtained as the partial characteristic amounts, and the overall average information and the overall variance information are obtained as the overall characteristic amounts. Therefore, the classification accuracy of the category classification section can be increased.
In this category classification apparatus, it is preferable that the overall characteristic amount obtaining section obtains the overall average information based on a plurality of pieces of the partial average information. In this case, it is preferable that the overall characteristic amount obtaining section takes an average value of the plurality of pieces of partial average information as the overall average information.
With such a category classification apparatus, the overall average information can be obtained with simple processing, and the processing can be sped up.
In this category classification apparatus, it is preferable that the overall characteristic amount obtaining section obtains the overall variance information based on the plurality of pieces of partial average information, a plurality of pieces of partial variance information, and the overall average information.
With such a category classification apparatus, the overall variance information can be obtained with simple processing, and the processing can be sped up.
In this category classification apparatus, it is preferable that the overall characteristic amount obtaining section considers moment information indicating a moment of colors of each of a plurality of pixels constituting the image data as the overall characteristic amount.
With such a category classification apparatus, moment information is obtained as the overall characteristic amount, so that the classification accuracy of the category classification section can be increased.
In this category classification apparatus, it is preferable that the overall characteristic amount obtaining section obtains the moment information based on the plurality of partial average information.
With such a category classification apparatus, moment information can be obtained efficiently, and the processing can be sped up.
In this category classification apparatus, it is preferable that the partial characteristic amount obtaining section obtains the plurality of partial image data by dividing the image data in a grid shape.
With such a category classification apparatus, the processing can be sped up.
In this category classification apparatus, it is preferable that the category classification section includes probability information obtaining sections that obtain probability information indicating the probability that the image belongs to a predetermined category, based on one of the partial characteristic amounts and the overall characteristic amount, the number of probability information obtaining sections corresponding to the number of types of categories.
With such a category classification apparatus, the classification is carried out based on probability information, so that a high level of both processing speed and classification accuracy can be attained.
In this category classification apparatus, it is preferable that the probability information obtaining section is a support vector machine that has performed classification training on images.
With such a category classification apparatus, the accuracy of the obtained probability information is high even with limited training data.
It should furthermore become clear that the following category classification method can be realized.
A category classification method can be realized that comprises:
It should furthermore become clear that the following storage medium storing a program used for a category classification apparatus can be realized.
A storage medium can be realized that stores a program used for a category classification apparatus classifying a category to which image data belongs, the storage medium storing a program that causes the category classification apparatus to:
The following is an explanation of embodiments of the present invention. It should be noted that the following explanations take the multifunctional apparatus 1 shown in
As shown in
The printer-side controller 30 is a component that carries out the printing control, such as the control of the print mechanism 40. The printer-side controller 30 shown in the figure includes a main controller 31, a control unit 32, a driving signal generation section 33, an interface 34, and a memory slot 35. These various components are communicably connected via a bus BU.
The main controller 31 is the central component responsible for control, and includes a CPU 36 and a memory 37. The CPU 36 functions as a central processing unit, and carries out various kinds of control operations in accordance with an operation program stored in the memory 37. Accordingly, the operation program includes code for realizing control operations. The memory 37 stores various kinds of information. As shown for example in
The control unit 32 controls for example motors 41 with which the print mechanism 40 is provided. The driving signal generation section 33 generates driving signals that are applied to driving elements (not shown in the figures) of a head 44. The interface 34 is for connecting to a host apparatus, such as a personal computer. The memory slot 35 is a component for mounting a memory card MC. When the memory card MC is mounted in the memory slot 35, the memory card MC and the main controller 31 are connected in a communicable manner. Accordingly, the main controller 31 is able to read information stored on the memory card MC and to store information on the memory card MC. For example, it can read image data created by capturing an image with the digital still camera DC or it can store enhanced image data, which has been subjected to enhancement processing or the like.
The print mechanism 40 is a component that prints on a medium, such as paper. The print mechanism 40 shown in the figure includes motors 41, sensors 42, a head controller 43, and a head 44. The motors 41 operate based on control signals from the control unit 32. Examples of the motors 41 are a transport motor for transporting the medium and a movement motor for moving the head 44 (neither is shown in the figures). The sensors 42 are for detecting the state of the print mechanism 40. Examples of the sensors 42 are a medium detection sensor for detecting whether a medium is present, a transport detection sensor for detecting the transport of the medium, and a head position sensor for detecting the position of the head 44 (none of which is shown in the figures). The head controller 43 is for controlling the application of driving signals to the driving elements of the head 44. In this image printing section 20, the main controller 31 generates head control signals in accordance with the image data to be printed. The generated head control signals are sent to the head controller 43, and the head controller 43 controls the application of the driving signals based on the received head control signals. The head 44 includes a plurality of driving elements that perform an operation for ejecting ink. The required portions of the driving signals that have passed through the head controller 43 are applied to these driving elements, and the driving elements perform the ink-ejecting operation in accordance with the applied signals. The ejected ink thus lands on the medium, and an image is printed on the medium.
Configuration of the Various Components Realized by the Printer-Side Controller 30
The following is an explanation of the various components realized by the printer-side controller 30. The CPU 36 of the main controller 31 performs a different operation for each of the plurality of operation modules (program units) constituting the operation program. At this time, the main controller 31 fulfills different functions for each operation module, either alone or in combination with the control unit 32 or the driving signal generation section 33. In the following explanations, it is assumed for convenience that the printer-side controller 30 is expressed as a separate device for each operation module.
As shown in
Configuration of Scene Classification Section 30B
The following is an explanation of the scene classification section 30B. The scene classification section 30B of the present embodiment classifies whether a targeted image whose scene could not be determined by the face detection section 30A belongs to a landscape scene, an evening scene, a night scene, a flower scene, an autumnal scene, or another scene. As shown in
The Characteristic Amount Obtaining Section 30E
The characteristic amount obtaining section 30E obtains a characteristic amount indicating a characteristic of the targeted image from the data of the targeted image. This characteristic amount is used for the classification with the overall classifier 30F and the partial image classifier 30G. As shown in
The partial characteristic amount obtaining section 51 obtains partial characteristic amounts for individual sets of partial data, based on partial data obtained by partitioning the data subjected to classification. These partial characteristic amounts represent a characteristic of one portion to be classified, corresponding to the partial data. In this embodiment, an image is subjected to classification. Accordingly, the partial characteristic amounts represent characteristic amounts for each of the plurality of regions into which the overall image has been partitioned (also referred to simply as "partial images"). More specifically, they represent the characteristic amounts of the partial images of 1/64 size that are obtained by splitting the width and the height of the overall image into eight equal portions each, that is, by partitioning the overall image in a grid shape. Moreover, the data of the targeted image corresponds to the data to be classified, the partial image data corresponds to partial data, and the pixels constituting the partial image data correspond to a plurality of samples constituting the partial data. It should be noted that the data of the targeted image in this embodiment is data of QVGA size (320×240 pixels). Therefore, each set of partial image data is data of 1/64 of that size (40×30 pixels=1200 pixels).
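The grid partitioning described above can be sketched as follows. This is an illustrative helper, not the patented implementation; the function name `partition_into_grid` and the row-major pixel layout are assumptions for the example.

```python
def partition_into_grid(pixels, width=320, height=240, cols=8, rows=8):
    """Split a row-major list of pixel values into rows*cols grid blocks.

    Hypothetical helper illustrating the 8x8 grid partitioning: the width
    and height are each split into eight equal portions, so a QVGA image
    yields 64 blocks of 40x30 = 1200 pixels each.
    """
    bw, bh = width // cols, height // rows  # block size, e.g. 40 x 30
    blocks = []
    for J in range(rows):          # vertical block index
        for I in range(cols):      # horizontal block index
            block = []
            for y in range(J * bh, (J + 1) * bh):
                # one scanline segment of the block
                block.extend(pixels[y * width + I * bw : y * width + (I + 1) * bw])
            blocks.append(block)
    return blocks

# A QVGA-sized image (76,800 pixel values) yields 64 blocks of 1200 pixels.
image = list(range(320 * 240))
blocks = partition_into_grid(image)
```

Because each block is an axis-aligned rectangle, it can be specified by just two diagonal corner coordinates, which is the simplification the embodiment relies on.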
The partial characteristic amount obtaining section 51 obtains the color average and the color variance of the pixels constituting the partial image data as the partial characteristic amounts indicating the characteristics of the partial image. Consequently, the partial characteristic amounts are obtained based on the partial image data, and correspond to characteristic amounts obtained from the color information of the pixels.
The color of the pixels can be expressed by numerical values in a color space such as YCC or HSV. Accordingly, the color average can be obtained by averaging these numerical values. Moreover, the variance indicates the extent of spread from the average value for the colors of all pixels. Here, the color average obtained from the partial image data corresponds to partial average information for color, and the variance obtained from the partial image data corresponds to partial variance information for color.
The overall characteristic amount obtaining section 52 obtains the overall characteristic amount from the data subjected to classification. This overall characteristic amount indicates an overall characteristic of the image to be classified. Examples of this overall characteristic amount are the color average and the color variance of the pixels constituting the data of the targeted image. Here, the pixels correspond to a plurality of samples constituting the data to be classified, and the color average and the color variance of the pixels correspond to the overall average information and the overall variance information for color. The overall characteristic amount can also be a moment. This moment is a characteristic amount indicating the distribution (centroid) of color, and corresponds to moment information. The color average, the color variance, and the moment serving as the overall characteristic amounts are characteristic amounts that conventionally would be obtained directly from the data of the targeted image. However, the overall characteristic amount obtaining section 52 of the present embodiment obtains these characteristic amounts using the partial characteristic amounts (this is explained later). Moreover, if the data of the targeted image has been generated by capturing an image with the digital still camera DC, then the overall characteristic amount obtaining section 52 also obtains the Exif attribute information as overall characteristic amounts. For example, image capturing information such as aperture information indicating the aperture, shutter speed information indicating the shutter speed, and strobe information indicating whether or not a strobe was used is also obtained as overall characteristic amounts. It should be noted that the Exif attribute information corresponds to one type of appended information that is appended to the image data.
In the present embodiment, the Exif attribute information that is appended at the time a picture is taken with the digital still camera is given as an example of appended information, but there is no limitation to this. For example, it may also be Exif attribute information that is appended to the image data generated by the image reading section 10 or a scanner (not shown in the figures) by executing a computer program for image processing. Moreover, the appended information is not limited to Exif attribute information, and may also be a similar kind of information.
Obtaining the Characteristic Amounts
The following is an explanation of how the characteristic amounts are obtained. As noted above, in the present embodiment, first the partial characteristic amounts are obtained from the data of the targeted image, and then the overall characteristic amounts are obtained from the obtained partial characteristic amounts. This is in order to speed up the processing. This aspect is explained in the following.
If the characteristic amounts are obtained directly from the data of the targeted image, then it is necessary to read the image data from the memory card MC serving as the storage medium into the memory 37 (main memory) of the main controller 31. In this case, access to the memory card MC and writing into the memory 37 need to be carried out repeatedly, which takes a lot of time. Moreover, if the data of the targeted image is in JPEG format (such data is also referred to in short as "JPEG image data"), then it is necessary to decode this JPEG image data. For this, it is necessary to perform Huffman decoding and inverse DCT transformations, and these processes also take a lot of time.
In order to reduce the number of times the memory card MC is accessed and the number of writing operations with respect to the memory 37, it might seem sufficient to provide memory of the corresponding capacity, but the capacity of the memory 37 that can be installed is limited, so this is difficult in practice. Alternatively, it might seem possible to decode the JPEG image data into RGB image data and convert the RGB image data into YCC image data separately each time the overall characteristic amounts and the partial characteristic amounts are obtained. However, with this method, the processing time becomes long.
In view of this situation, with the multifunctional apparatus 1 according to the present embodiment, the partial characteristic amount obtaining section 51 obtains the partial characteristic amounts for each set of partial data. Then, the obtained partial characteristic amounts are stored in the characteristic amount storage section 37e (which corresponds to a partial characteristic amount storage section) of the memory 37. The overall characteristic amount obtaining section 52 obtains the overall characteristic amounts by reading out the partial characteristic amounts stored in the characteristic amount storage section 37e. Then, the obtained overall characteristic amounts are stored in the characteristic amount storage section 37e (which corresponds to an overall characteristic amount storage section). By employing this configuration, it is possible to keep the number of transformations performed on the data of the targeted image low, and compared to a configuration in which the partial characteristic amounts and the overall characteristic amounts are obtained separately, the processing speed can be increased. Moreover, the capacity of the memory 37 for the decoding can also be kept to the necessary minimum.
Obtaining the Partial Characteristic Amounts
The following is an explanation of how the partial characteristic amounts are obtained by the partial characteristic amount obtaining section 51. As shown in
Then, the partial characteristic amount obtaining section 51 obtains the partial characteristic amounts (S13). In this embodiment, the partial characteristic amount obtaining section 51 obtains the color average and the color variance of the partial image data as the partial characteristic amounts. Here, the color average of the partial image data corresponds to partial average information and, for convenience, is also referred to as the "partial color average". Likewise, the variance of the partial image data corresponds to partial variance information and is also referred to as the "partial color variance". In the j-th (j=1 . . . 64) set of partial image data, the color information of the i-th (i=1 . . . 1200) pixel (for example, the numerical value expressed in the YCC color space) is xi. In this case, the partial color average xavj for the j-th set of partial image data can be expressed by the following Equation (1):
Moreover, for the variance S2 of the present embodiment, the variance defined in Equation (2) below is used. Therefore, the partial color variance Sj2 for the j-th partial image data can be expressed by the following Equation (3), which is obtained by modifying Equation (2).
Consequently, the partial characteristic amount obtaining section 51 obtains the partial color average xavj and the partial color variance Sj2 for the corresponding partial image data by performing the calculations of Equation (1) and Equation (3). Then, the partial color average xavj and the partial color variance Sj2 are stored in the characteristic amount storage section 37e of the memory 37.
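Equations (1) and (3) are not reproduced in this text, so the sketch below assumes the standard definitions: the mean as the sum over the block divided by the pixel count, and the variance rewritten as the mean of squares minus the square of the mean, which is the usual single-pass modification implied by the text. `partial_stats` is a hypothetical helper name.

```python
def partial_stats(block):
    """Per-block (partial) colour statistics for one channel.

    Sketch of Equations (1) and (3): both the partial colour average and
    the partial colour variance are accumulated in a single pass over the
    block, so the pixels never need to be revisited.
    """
    n = len(block)
    total = 0.0
    total_sq = 0.0
    for x in block:               # accumulate sum and sum of squares
        total += x
        total_sq += x * x
    mean = total / n                       # Equation (1): average
    variance = total_sq / n - mean * mean  # Equation (3): E[x^2] - mean^2
    return mean, variance
```

Because only these two accumulated values per block are stored, the memory footprint stays small regardless of image size, which matches the embodiment's aim of keeping the decoding buffer to the necessary minimum.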
When the partial color average xavj and the partial color variance Sj2 have been obtained, the partial characteristic amount obtaining section 51 judges whether there is unprocessed partial image data left (S14). If the partial characteristic amounts are obtained in order starting with the lowest numbers, then the partial characteristic amount obtaining section 51 judges that there is unprocessed partial image data left until the partial characteristic amounts have been obtained for the 64-th set of partial image data. When the partial characteristic amounts have been obtained for the 64-th set of partial image data, it judges that there is no unprocessed partial image data left. If it judges that there is unprocessed partial image data left, then the partial characteristic amount obtaining section 51 returns to step S11 and carries out the same process (S11-S13) for the next set of partial image data. On the other hand, if it is judged at step S14 that there is no unprocessed partial image data left, then the processing with the partial characteristic amount obtaining section 51 ends. In this case, the overall characteristic amounts are obtained with the overall characteristic amount obtaining section 52 in step S15.
Obtaining the Overall Characteristic Amounts
The following is an explanation of how the overall characteristic amounts are obtained by the overall characteristic amount obtaining section 52 (S15). The overall characteristic amount obtaining section 52 obtains the overall characteristic amounts based on the plurality of partial characteristic amounts stored in the characteristic amount storage section 37e. As noted above, the overall characteristic amount obtaining section 52 obtains the color average and the color variance of the data of the targeted image as the overall characteristic amounts. These overall characteristic amounts are obtained from the data of the targeted image and correspond to characteristic amounts obtained from the color information of the pixels. The color average of the data of the targeted image corresponds to overall average information and is also referred to simply as the "overall color average". The color variance of the data of the targeted image corresponds to overall variance information and is also referred to simply as the "overall color variance". If the partial color average of the j-th set of partial image data among the 64 sets of partial image data is xavj, then the overall color average xav can be expressed by Equation (4) below, in which m represents the number of partial images. The overall color variance S2 can be expressed by Equation (5) below. It can be seen from Equation (5) that the overall color variance S2 can be obtained from the partial color averages xavj, the partial color variances Sj2, and the overall color average xav.
Consequently, the overall characteristic amount obtaining section 52 obtains the overall color average xav and the overall color variance S2 for the data of the targeted image by calculating the Equations (4) and (5). Then, the overall color average xav and the overall color variance S2 are stored in the characteristic amount storage section 37e of the memory 37.
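Since Equations (4) and (5) are not reproduced here, the following sketch assumes equal-sized partial images and the population-variance convention used for the partial statistics: the overall mean is then the mean of the block means, and the overall variance follows from the stored block means and variances alone. `overall_stats` is a hypothetical helper name.

```python
def overall_stats(partial_means, partial_vars):
    """Overall mean and variance recovered from equal-sized block statistics.

    Sketch of Equations (4) and (5): the pixels themselves are never
    touched again; only the per-block statistics stored in memory are read.
    """
    m = len(partial_means)
    overall_mean = sum(partial_means) / m            # Equation (4)
    # mean of squares over all pixels, rebuilt per block as Sj^2 + xavj^2
    mean_sq = sum(v + av * av for v, av in zip(partial_vars, partial_means)) / m
    overall_var = mean_sq - overall_mean * overall_mean  # Equation (5)
    return overall_mean, overall_var
```

For example, an image whose two blocks [1, 3] and [5, 7] have means 2 and 6 and variances 1 and 1 yields an overall mean of 4 and an overall variance of 5, identical to computing directly over [1, 3, 5, 7].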
The overall characteristic amount obtaining section 52 obtains the moment as another overall characteristic amount. In this embodiment, an image is to be classified, so the positional distribution of colors can be obtained quantitatively through the moment. In this embodiment, the overall characteristic amount obtaining section 52 obtains the moment from the partial color averages xavj of the sets of partial image data. Here, when the sets of partial image data constituting the data of the targeted image are arranged as a matrix indexed horizontally by I (I=1 . . . 8) and vertically by J (J=1 . . . 8), and the partial color average of the partial image data specified by I and J is expressed as Xav(I, J), then the n-th moment mnh in the horizontal direction for the partial color averages can be expressed as in Equation (6) below.
m_nh = Σ_{I,J} I^n · Xav(I, J)   (6)
Here, the value obtained by dividing the simple primary moment by the sum total of the partial color averages Xav(I, J) is referred to as the "primary centroid moment". This primary centroid moment is as shown in Equation (7) below and indicates the centroid position in the horizontal direction of the partial color averages. The n-th centroid moment, which is a generalization of this centroid moment, is expressed by Equation (8) below. Among the n-th centroid moments, the odd-numbered (n=1, 3, . . . ) centroid moments generally indicate the centroid position, while the even-numbered centroid moments generally indicate the extent of the spread of the characteristic amounts near the centroid position.
m_g1h = Σ_{I,J} I · Xav(I, J) / Σ_{I,J} Xav(I, J)   (7)

m_gnh = Σ_{I,J} (I − m_g1h)^n · Xav(I, J) / Σ_{I,J} Xav(I, J)   (8)
The overall characteristic amount obtaining section 52 of this embodiment obtains six types of moments. More specifically, it obtains the primary moment in a horizontal direction, the primary moment in a vertical direction, the primary centroid moment in a horizontal direction, the primary centroid moment in a vertical direction, the secondary centroid moment in a horizontal direction, and the secondary centroid moment in a vertical direction. It should be noted that the combination of moments is not limited to this. For example, it is also possible to use eight types, adding the secondary moment in a horizontal direction and the secondary moment in a vertical direction.
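The six moments listed above can be sketched from Equations (6)-(8) as follows. `moments` is a hypothetical helper; the grid is indexed 0-based here (with +1 to match the 1-based I, J of the text), and a uniform grid is used in the comments purely as a sanity check.

```python
def moments(xav):
    """Six colour-distribution moments from an 8x8 grid of block averages.

    xav[J][I] holds the partial colour average of the block in column I,
    row J. Returns the primary moments (Equation (6), n=1), the primary
    centroid moments (Equation (7)), and the secondary centroid moments
    (Equation (8), n=2), each in the horizontal and vertical directions.
    """
    rows, cols = len(xav), len(xav[0])
    total = sum(xav[J][I] for J in range(rows) for I in range(cols))
    # Primary moments in the horizontal / vertical directions (Equation (6))
    m1h = sum((I + 1) * xav[J][I] for J in range(rows) for I in range(cols))
    m1v = sum((J + 1) * xav[J][I] for J in range(rows) for I in range(cols))
    # Primary centroid moments: centroid position (Equation (7))
    g1h, g1v = m1h / total, m1v / total
    # Secondary centroid moments: spread about the centroid (Equation (8))
    g2h = sum((I + 1 - g1h) ** 2 * xav[J][I]
              for J in range(rows) for I in range(cols)) / total
    g2v = sum((J + 1 - g1v) ** 2 * xav[J][I]
              for J in range(rows) for I in range(cols)) / total
    return m1h, m1v, g1h, g1v, g2h, g2v
```

On a uniform grid the centroid falls at the geometric center (4.5, 4.5), while a grid that is bright only at the top yields a small vertical centroid moment, capturing statements like "a red region spreads at the top portion of the image".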
By obtaining these moments, it is possible to recognize the color centroid and the extent of the spread of color near the centroid. For example, information such as “a red region spreads at the top portion of the image” or “a yellow region is concentrated near the center” can be obtained. With the classification process of the classification processing section 30I (see
Normalization of the Characteristic Amounts
The overall classifier 30F and the partial image classifier 30G constituting a part of the classification processing section 30I perform the classification using support vector machines (also written "SVM"), which are explained later. These support vector machines have the property that the larger the variance of a characteristic amount is, the greater its influence (extent of weighting) on the classification becomes. Accordingly, the partial characteristic amount obtaining section 51 and the overall characteristic amount obtaining section 52 perform a normalization on the obtained partial characteristic amounts and overall characteristic amounts. That is to say, the average and the standard deviation are calculated for each characteristic amount, and the characteristic amount is normalized such that its average becomes "0" and its variance becomes "1". More specifically, when μi is the average value and σi is the standard deviation of the i-th characteristic amount xi, the normalized characteristic amount xi′ can be expressed by Equation (9) below.
x_i′ = (x_i − μ_i) / σ_i   (9)
Consequently, the partial characteristic amount obtaining section 51 and the overall characteristic amount obtaining section 52 normalize each characteristic amount by performing the calculation of Equation (9). The normalized characteristic amounts are stored in the characteristic amount storage section 37e of the memory 37, and used for the classification process with the classification processing section 30I. Thus, in the classification process with the classification processing section 30I, each characteristic amount can be treated with equal weight. As a result, the classification accuracy can be improved.
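The per-feature normalization of Equation (9) can be sketched as follows. `normalize_features` is a hypothetical helper, and the sketch assumes the statistics μi and σi are taken from the samples themselves; the zero-σ guard is an added safety measure not mentioned in the text.

```python
def normalize_features(samples):
    """Normalize each feature to zero mean and unit variance (Equation (9)).

    samples is a list of equal-length feature vectors; for each feature
    index i, every value is replaced by (x_i - mu_i) / sigma_i.
    """
    n, dim = len(samples), len(samples[0])
    mus, sigmas = [], []
    for i in range(dim):
        column = [s[i] for s in samples]
        mu = sum(column) / n
        var = sum((x - mu) ** 2 for x in column) / n
        sigmas.append(var ** 0.5 or 1.0)  # guard against constant features
        mus.append(mu)
    return [[(s[i] - mus[i]) / sigmas[i] for i in range(dim)]
            for s in samples]
```

After this step every characteristic amount carries equal weight in the SVM's decision, which is the stated reason for normalizing.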
Summary of Characteristic Amount Obtaining Section 30E
As explained above, when the characteristic amounts used for classification are obtained with the characteristic amount obtaining section 30E of this embodiment, the partial characteristic amounts are obtained first based on partial image data, and then the overall characteristic amounts are obtained based on the plurality of partial characteristic amounts. Therefore, the processing performed when obtaining the overall characteristic amounts is simplified and a speed-up of the processing is achieved. For example, it is possible to suppress the number of times the data of the targeted image is read out from the memory 37 to the necessary minimum. And as far as the conversion of image data is concerned, the conversion of partial image data is performed during the obtaining of the partial characteristic amounts, so that no conversion needs to be performed during the obtaining of the overall characteristic amounts. Also with regard to this aspect, a speed-up of the processing is achieved. In this case, the partial characteristic amount obtaining section 51 obtains the partial characteristic amounts based on the partial image data corresponding to portions obtained by dividing the targeted image into a grid shape. With this configuration, it is possible to specify the partial image data by specifying two pixels (coordinates) located on a diagonal line. Accordingly, the processing is simplified and a speed-up is achieved.
Moreover, the partial characteristic amount obtaining section 51 obtains partial color averages and partial color variances as the partial characteristic amounts, whereas the overall characteristic amount obtaining section 52 obtains overall color averages and overall color variances as the overall characteristic amounts. These characteristic amounts are used for the process of classifying the targeted image with the classification processing section 30I. Therefore, the classification accuracy of the classification processing section 30I can be increased. This is because the classification process takes into account information about the coloring and about the localization of colors, obtained for the overall targeted image as well as for the partial images.
The overall characteristic amount obtaining section 52 obtains, as the overall characteristic amounts, the moments of the plurality of pixels constituting the data of the targeted image. With these moments, it is possible to let the overall classifier 30F recognize the position of the centroid of a color and the extent of spread of a color. As a result, it is possible to increase the accuracy with which the targeted image is classified. Furthermore, the overall characteristic amount obtaining section 52 uses the partial characteristic amounts to obtain the moments. Thus, the moments can be obtained efficiently, and a speed-up of the processing is achieved.
Classification Processing Section 30I
The following is an explanation of the classification processing section 30I. First, an overview of the classification processing section 30I is explained. As shown in
Overall Classifier 30F
The overall classifier 30F includes sub-classifiers (also referred to simply as “overall sub-classifiers”), which correspond in number to the number of scenes that can be classified. The overall sub-classifiers classify whether a targeted image belongs to a specific scene based on the overall characteristic amounts. As shown in
The overall classifier 30F carries out the classification with the various overall sub-classifiers in a predetermined order. To explain this in more detail, the overall classifier 30F first classifies with the landscape scene classifier 61 whether the targeted image belongs to a landscape scene. Then, if it has been determined that it does not belong to a landscape scene, it classifies with the evening scene classifier 62 whether the targeted image belongs to an evening scene. After this, the classifications with the night scene classifier 63, the flower scene classifier 64, and the autumnal scene classifier 65 are carried out in that order. That is to say, if a given overall sub-classifier (first overall sub-classifier) could not classify the targeted image as belonging to its corresponding specific scene (first category), then the overall classifier 30F classifies with another overall sub-classifier (second overall sub-classifier) whether the targeted image belongs to another specific scene (second category). Thus, the overall classifier 30F lets the individual overall sub-classifiers carry out the classification of the targeted image in order, so that the reliability of the classification can be increased.
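The fixed order with early termination might be sketched as follows (the classifier names and the decide() interface are hypothetical stand-ins, not the claimed implementation):

```python
def classify_overall(features, sub_classifiers):
    """Run each overall sub-classifier in order; stop at the first positive.

    sub_classifiers: ordered list of (scene_name, decide) pairs, where
    decide(features) returns True if the image is judged to belong to
    that scene. Returns the first scene decided, or None if no overall
    sub-classifier could decide (later-stage classifiers then take over).
    """
    for scene, decide in sub_classifiers:
        if decide(features):
            return scene
    return None
```

In this embodiment the order would be landscape, evening, night, flower, autumnal; as soon as one sub-classifier decides positively, the remaining ones are never run.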
These overall sub-classifiers each include a support vector machine and a decision section. That is to say, the landscape scene classifier 61 includes a landscape scene support vector machine 61a and a landscape scene decision section 61b, whereas the evening scene classifier 62 includes an evening scene support vector machine 62a and an evening scene decision section 62b. The night scene classifier 63 includes a night scene support vector machine 63a and a night scene decision section 63b, the flower scene classifier 64 includes a flower scene support vector machine 64a and a flower scene decision section 64b, and the autumnal scene classifier 65 includes an autumnal scene support vector machine 65a and an autumnal scene decision section 65b.
The Support Vector Machines
The following is an explanation of the support vector machines (landscape scene support vector machine 61a to autumnal scene support vector machine 65a). The support vector machines correspond to probability information obtaining sections and obtain probability information indicating the probability that the object to be classified belongs to a certain category, based on the characteristic amounts indicating the characteristics of the image to be classified. Here, probability information is information that is associated with the probability that an image belongs to a given category. That is to say, once the value of the probability information is determined, the probability that an object to be classified belongs to a certain category is determined in accordance with that value. In this embodiment, the output value (classification function value) of the support vector machines corresponds to the probability information.
The basic form of the support vector machines is that of linear support vector machines. As shown in
Now, linear support vector machines can classify samples that can be linearly separated with high accuracy, but their classification accuracy for objects to be classified that cannot be linearly separated is low. It should be noted that the targeted images handled by the multifunctional apparatus 1 correspond to objects to be classified that cannot be linearly separated. Accordingly, for such objects to be classified, the characteristic amounts are converted non-linearly (that is, mapped to a higher-dimensional space), and a non-linear support vector machine performing linear classification in that space is used. In such a non-linear support vector machine, a new function defined by a suitable number of non-linear functions serves as the data for the linear support vector machine. Because the linear classification is carried out in a higher-dimensional space, even samples that can only be separated non-linearly can be classified with high accuracy. Moreover, non-linear support vector machines use kernel functions. By using a kernel function, the classification function can be determined relatively easily by a kernel calculation, without performing complicated calculations in the higher-dimensional space.
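A minimal sketch of such a kernel-based classification function is shown below (the RBF kernel is used purely as an example of a kernel function; the coefficients, support vectors, and bias would come from training and are hypothetical here):

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel: an inner product in a higher-dimensional space,
    evaluated without ever mapping the points there explicitly."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def decision_value(x, support_vectors, coeffs, bias, gamma=1.0):
    """Classification function f(x) = sum_i coeff_i * K(sv_i, x) + bias.

    The coeffs absorb the training labels and Lagrange multipliers; the
    sign of f(x) gives the class, and its magnitude serves as the
    probability information described in the text.
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + bias
```

The kernel calculation replaces any explicit computation in the higher-dimensional space, which is why the classification function can be evaluated relatively cheaply.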
As shown diagrammatically in
In this embodiment, the overall characteristic amounts are assigned to characteristic amount X1 and characteristic amount X2, as shown in
The overall sub-classifiers (landscape scene classifier 61 to autumnal scene classifier 65) each include such a non-linear support vector machine (that is, a classification function). In each of the support vector machines (landscape scene support vector machine 61a to autumnal scene support vector machine 65a), the parameters in the classification function are determined by training based on different support vectors. As a result, the properties of each of the overall sub-classifiers can be optimized, and it is possible to improve the classification properties of the overall classifier 30F. Each of the support vector machines outputs a numerical value, that is, a classification function value, which depends on the entered sample (image data). This classification function value indicates the degree (probability) to which the entered sample belongs to a certain category. To explain this with the example of
The Decision Sections
The following is an explanation of the decision sections (landscape scene decision section 61b to autumnal scene decision section 65b). Based on the classification function values (probability information) obtained with the support vector machines, these decision sections decide whether the targeted image belongs to a corresponding scene. Each decision section makes a decision based on the above-mentioned probability threshold. That is to say, each decision section decides that the targeted image belongs to the corresponding scene if the probability based on the classification function value obtained by the corresponding support vector machine is equal to or greater than the probability prescribed by the probability threshold. The reason why the decision is made with such a probability threshold is to increase the speed of the processing while maintaining the accuracy of the decision. If the sorting into scenes is carried out using probabilities, ordinarily the probability that an image belongs to a scene is obtained for all possible scenes, and the image is sorted according to which of these probabilities is highest. With this method, the probabilities for all scenes must be obtained, so the processing amount becomes large and the processing tends to become slow. In this respect, the decision sections of this embodiment can decide whether a targeted image is sorted as a specific scene based on the probability information for that specific scene alone, so that the processing is simplified. That is to say, a simple comparison of the classification function value (probability information) with the probability threshold suffices. Moreover, the extent of wrong decisions can be set through the setting of the probability thresholds, so that the balance between processing speed and decision accuracy can be adjusted easily.
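The contrast between the threshold decision of this embodiment and the conventional all-scenes comparison can be sketched as follows (the function names are hypothetical; the values stand for classification function values):

```python
def decide_by_threshold(value, threshold):
    """One comparison per scene: decide as soon as the probability implied
    by the classification function value reaches the probability threshold."""
    return value >= threshold

def decide_by_argmax(values_by_scene):
    """The conventional alternative mentioned in the text: obtain the
    probability for every scene, then pick the largest. More work per image,
    since all scenes must be evaluated before any decision is possible."""
    return max(values_by_scene, key=values_by_scene.get)
```

With the threshold form, a positive decision for an early scene makes evaluating the remaining scenes unnecessary, which is where the speed-up comes from.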
As a measure indicating the accuracy of the decisions made by the decision sections, the recall ratio and the precision (ratio of the correct answers) are used, as shown in
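In sketch form, these two measures are computed from true positives (tp), false positives (fp), and false negatives (fn) in the standard way (this is the general definition, not a value specific to this embodiment):

```python
def recall_and_precision(tp, fp, fn):
    """Recall: the fraction of images actually belonging to the scene that
    were detected. Precision (ratio of correct answers): the fraction of
    detections that really belong to the scene."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return recall, precision
```

Raising the probability threshold tends to raise precision (fewer false positives) at the cost of recall, which is the trade-off the decision sections exploit.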
The Probability Threshold
The probability threshold of the overall classifier 30F is determined taking the precision (ratio of the correct answers) as the standard. This is because, even though there may be some false results, a classification is performed afterwards by the partial image classifier 30G and by the consolidated classifier 30H. Therefore, with the overall classifier 30F, the emphasis is placed on reliability, and targeted images belonging to the respective scene category are selectively classified. However, if the reliability is set too high, the number of targeted images for which the scene can be determined by the overall classifier 30F becomes very low. Accordingly, almost all targeted images are classified by the classifiers of the later stages, and a lot of time will be necessary for the processing. Consequently, the probability threshold is determined such that the reliability and the processing time are balanced. For example, as shown in
As can be seen by comparing
As noted above, the classification function values (calculation results) obtained with the various support vector machines correspond to the probability information, which indicates the probability that an image belongs to the corresponding scene. A small probability that an image belongs to a scene means a large probability that it does not belong to that scene. Consequently, based on the classification function value obtained with a support vector machine, it is also possible to classify that an image does not belong to the scene. For example, if the classification function value obtained with a support vector machine is smaller than a probability threshold for classifying that the image does not belong to the category, then it can be classified that the targeted image does not belong to that scene. Such a probability threshold enables the negative decision that the targeted image does not belong to the scene handled by that overall sub-classifier. Consequently, in the following explanations, a probability threshold enabling such a negative decision is also referred to as a "negative threshold". If it can be classified that the targeted image does not belong to a certain scene, then the classifiers of the later stages do not need to carry out a classification for that scene, so that the processing is simplified and sped up.
The above-explained negative threshold is a probability threshold with which a certain overall sub-classifier decides that an object to be classified does not belong to the category handled by that overall sub-classifier. Here, let us consider the case where there is a plurality of categories whose characteristics differ considerably. Because the characteristics differ considerably, if the probability is high that an image belongs to one of these categories, then the probability that it belongs to another one tends to be small. For example, let us consider the case of a landscape scene and a night scene. A landscape image, which belongs to the landscape scene category, has the basic color tones green and blue, whereas a night image, which belongs to the night scene category, has the basic color tone black. Therefore, for images having the basic color tones green and blue, the probability that they belong to the landscape scene will be high, whereas the probability that they belong to the night scene will be low. And for images having the basic color tone black, the probability that they belong to the night scene will be high, whereas the probability that they belong to the landscape scene will be low. Accordingly, it can be seen that based on the classification function value obtained with a support vector machine, it is possible to classify that a targeted image does not belong to a scene other than the scene handled by that overall sub-classifier. For example, if the classification function value obtained with a support vector machine is larger than the probability threshold for classifying that a targeted image does not belong to another scene, it can be classified that the targeted image does not belong to that other scene.
Such a probability threshold enables the negative decision that a targeted image does not belong to a scene other than the scene handled by that overall sub-classifier, that is, to another scene category handled by another overall sub-classifier. Consequently, in the following explanations, the probability threshold for enabling such a negative decision is also referred to as the "other negative threshold" (other probability threshold).
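The three kinds of decisions available to one overall sub-classifier — positive for its own scene, negative for its own scene, and negative for other scenes — could be sketched as follows (all threshold values and the return convention are hypothetical; "self" marks the sub-classifier's own scene):

```python
def apply_thresholds(value, positive_th, negative_th, other_negative_th):
    """Evaluate one overall sub-classifier's classification function value.

    positive_th: probability threshold for deciding the image belongs
        to this scene.
    negative_th: negative threshold; a value below it rules out this scene.
    other_negative_th: dict scene -> other negative threshold; a value
        above it rules out a scene with very different characteristics.
    Returns (positive decision, set of scenes ruled out).
    """
    positive = value >= positive_th
    negatives = set()
    if value < negative_th:
        negatives.add("self")
    for scene, th in other_negative_th.items():
        if value > th:
            negatives.add(scene)
    return positive, negatives
```

Every scene placed in the ruled-out set can then be skipped by the later-stage classifiers, which is the speed-up the negative thresholds provide.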
The example of
Such negative thresholds are likewise set also with respect to the other overall sub-classifiers. For example, as shown in
Partial Image Classifier 30G
The partial image classifier 30G includes several sub-classifiers (also referred to below simply as “partial sub-classifiers”), corresponding in number to the number of scenes that can be classified. The partial sub-classifiers classify, based on the partial characteristic amounts, whether a targeted image belongs to a specific scene category. That is to say, they perform a classification based on the characteristics of each partial image (the characteristics of each portion of the image). The partial sub-classifiers also classify that the targeted image does not belong to a specific scene. If the partial sub-classifiers have ascertained that the targeted image belongs to a certain scene, then a positive flag is stored in the corresponding region of the positive flag storage section 37h. And if the partial sub-classifiers have ascertained that the targeted image does not belong to a certain scene, then a negative flag is stored in the corresponding region of the negative flag storage section 37i.
It should be noted that in the partial image classifier 30G of the present embodiment, the partial sub-classifiers also use the overall characteristic amounts in addition to the partial characteristic amounts when obtaining the classification function value. That is to say, when classifying a partial image, the partial sub-classifiers also take into account the overall characteristics of the targeted image, in addition to the characteristics of the partial images. This is in order to increase the classification accuracy of the partial images (this is explained further below).
As shown in
Next, the images suitable for classification with the partial image classifier 30G are considered. First of all, a flower scene and an autumnal scene are considered. In both of these scenes, the characteristics of the scene tend to appear locally. For example, in an image of a flower bed or a flower field, a plurality of flowers tend to accumulate in a specific portion of the image. In this case, the characteristics of a flower scene appear in the portion where the plurality of flowers accumulate, whereas characteristics that are close to a landscape scene appear in the other portions. This is the same for autumnal scenes. That is to say, if autumn leaves on a portion of a hillside are captured, then the autumn leaves accumulate on a specific portion of the image. Also in this case, the characteristics of an autumnal scene appear in one portion of the hillside, whereas the characteristics of a landscape scene appear in the other portions. Consequently, by using the flower scene partial classifier 72 and the autumnal scene partial classifier 73 as partial sub-classifiers, the classification properties can be improved even for flower scenes and for autumnal scenes, which are difficult to classify with the overall classifier 30F. That is to say, the classification is carried out for each partial image, so that even if it is an image in which the characteristics of the essential object, such as flowers or autumnal leaves, appear only in a portion of the image, it is possible to increase the ratio at which the essential object is present within the partial image. As a result, the classification can be performed with high accuracy. Next, evening scenes are considered. Also in evening scenes, the characteristics of the evening scene may appear locally. For example, let us consider an image in which the evening sun is captured as it sets at the horizon, and the image is captured immediately prior to the complete setting of the sun. 
In this image, the characteristics of a sunset scene appear at the portion where the evening sun sets, whereas the characteristics of a night scene appear in the other portions. Consequently, by using the evening scene partial classifier 71 as the partial sub-classifier, the classification properties can be improved even for evening scenes that are difficult to classify with the overall classifier 30F.
In the partial image classifier 30G, the classification with the partial sub-classifiers is carried out successively one by one, like the classification with the overall sub-classifiers. With this partial image classifier 30G, it is first classified with the evening scene partial classifier 71 whether the targeted image belongs to an evening scene. Then, if it is determined that it does not belong to an evening scene, it is classified with the flower scene partial classifier 72 whether the targeted image belongs to a flower scene. Furthermore, if it is determined that it does not belong to a flower scene, it is classified with the autumnal scene partial classifier 73 whether the targeted image belongs to an autumnal scene. That is to say, if a given partial sub-classifier (first partial sub-classifier) has not classified the targeted image as belonging to the corresponding specific scene (first category), then the partial image classifier 30G classifies with another partial sub-classifier (second partial sub-classifier) whether the targeted image belongs to another specific scene (second category). Thus, it is possible to increase the classification reliability, since the configuration is such that the classification is carried out with each partial sub-classifier individually.
The partial sub-classifiers each include a partial support vector machine and a detection number counter. That is to say, the evening scene partial classifier 71 includes an evening scene partial support vector machine 71a and an evening scene detection number counter 71b, the flower scene partial classifier 72 includes a flower scene partial support vector machine 72a and a flower scene detection number counter 72b, and the autumnal scene partial classifier 73 includes an autumnal scene partial support vector machine 73a and an autumnal scene detection number counter 73b.
The partial support vector machines (evening scene partial support vector machine 71a to autumnal scene partial support vector machine 73a) are similar to the support vector machines (landscape scene support vector machine 61a to autumnal scene support vector machine 65a) of the overall sub-classifiers. The partial support vector machines differ from the support vector machines of the overall sub-classifiers in that they are trained on partial image data. Consequently, the partial support vector machines carry out a calculation based on the partial characteristic amounts indicating the characteristics of the portions to be classified. It should be noted that the partial support vector machines of the present embodiment carry out their calculation by taking into account the overall characteristic amounts in addition to the partial characteristic amounts.
The more characteristics of the category to be classified the portion has, the larger the value of the calculation result, that is, the classification function value, becomes. By contrast, the more characteristics of another category the portion has, the smaller the value of the calculation result becomes. It should be noted that if the portion has the characteristics of the given category and the characteristics of the other category in equal measure, then the classification function value obtained with the partial support vector machine becomes "0". Consequently, a portion (of the targeted image) for which the classification function value obtained with a partial support vector machine is positive contains more characteristics of the scene handled by that partial support vector machine than of other scenes. Thus, the classification function value obtained with the partial support vector machine corresponds to probability information indicating the probability that the portion belongs to a certain category.
The detection number counters (evening scene detection number counter 71b to autumnal scene detection number counter 73b) count the number of portions for which the classification function value obtained with the partial support vector machine is positive. In other words, they count the number of partial images in which the characteristics of the corresponding scene are stronger than the characteristics of other scenes. These detection number counters constitute a portion of the judgment section that judges that the targeted image belongs to the corresponding category. That is to say, if the count value of the detection number counter has exceeded a judgment threshold, the CPU 36 of the main controller 31 judges, based on the count value of the detection number counter and the judgment threshold, that the targeted image belongs to the corresponding category. Consequently, this judgment section can be said to be constituted by the main controller 31. Moreover, the judgment threshold provides the positive judgment that the targeted image belongs to the scene handled by the partial sub-classifier. Consequently, in the following explanations, the judgment threshold for providing this positive judgment is also referred to as the "positive count value". A positive count value is determined for each partial sub-classifier. In this embodiment, the value "5" is determined for the evening scene partial classifier 71, the value "9" for the flower scene partial classifier 72, and the value "6" for the autumnal scene partial classifier 73, as the positive count value (judgment threshold), as shown in
If the category of a portion of an object to be classified is known, then other categories can also be judged based on this category. For example, if the object to be classified contains a portion belonging to a given category, then it can be judged that the object to be classified does not belong to another category whose characteristics differ considerably from that category. For example, if a partial image is determined to belong to a flower scene during the classification of the targeted image, then it can be judged that the targeted image does not belong to a night scene, whose characteristics are very different from those of a flower scene. Accordingly, if the count value of the detection number counter exceeds another judgment threshold, then the partial sub-classifiers judge, based on the count value of the detection number counter and that other judgment threshold, that the targeted image does not belong to the corresponding category.
This other judgment threshold enables the negative judgment that the targeted image does not belong to a certain scene that is different from the scene handled by the partial sub-classifier. Consequently, the other judgment threshold for providing such a negative judgment is also referred to as the "negative count value" in the following explanations. As with the positive count values, a negative count value is set for each of the partial sub-classifiers. In this embodiment, as shown in
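A detection number counter together with the positive count value and a negative count value might be sketched as follows (the counting rule — a detection is a positive partial classification function value, and a judgment requires the count to exceed the threshold — follows the text; the function name and return convention are hypothetical):

```python
def judge_by_count(partial_values, positive_count, negative_count=None):
    """Count partial images whose partial-SVM classification function value
    is positive, then compare the count against the positive count value
    (belongs to this scene) and, if given, a negative count value (rules
    out a scene with very different characteristics)."""
    detections = sum(1 for v in partial_values if v > 0)
    belongs = detections > positive_count
    rules_out_other = negative_count is not None and detections > negative_count
    return detections, belongs, rules_out_other
```

With the positive count values of this embodiment (5 for evening, 9 for flower, 6 for autumnal), an image is judged to belong to a scene once more than that many partial images exhibit the scene's characteristics.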
As noted above, the partial support vector machines perform their calculation taking into account the overall characteristic amounts in addition to the partial characteristic amounts. The following is an explanation of this aspect. The partial images contain less information than the overall image, so the classification of categories can become difficult. For example, if a given partial image has characteristics that are common to a given scene and another scene, then its classification becomes difficult. Let us assume that the partial image is an image with a strong red tone. In this case, it may be difficult to classify with the partial characteristic amounts alone whether the partial image belongs to an evening scene or to an autumnal scene. In such a case, it may be possible to classify the scene to which the partial image belongs by taking into account the overall characteristic amounts. For example, if the overall characteristic amounts indicate an image that is predominantly black, then the probability is high that the partial image with the strong red tone belongs to an evening scene. And if the overall characteristic amounts indicate an image that is predominantly green or blue, then the probability is high that the partial image with the strong red tone belongs to an autumnal scene. Thus, the classification accuracy of the partial support vector machines can be increased by performing the calculation while taking into account the overall characteristic amounts.
The Consolidated Classifier 30H
As mentioned above, the consolidated classifier 30H classifies the scenes of targeted images for which the scene could be decided neither with the overall classifier 30F nor with the partial image classifier 30G. The consolidated classifier 30H of the present embodiment classifies scenes based on the probability information determined with the overall sub-classifiers (the support vector machines). More specifically, the consolidated classifier 30H selectively reads out the sets of probability information with positive values from the plurality of sets of probability information stored in the probability information storage section 37f of the memory 37. Then, the probability information with the highest value among the sets of probability information that have been read out is specified, and the corresponding scene is taken as the scene of the targeted image. For example, if the probability information for landscape scenes and autumnal scenes is selectively read out, and the probability information for landscape scenes has the value "1.25" while the probability information for autumnal scenes has the value "1.10", then the consolidated classifier 30H classifies the targeted image as being a landscape scene. And if none of the sets of probability information has a positive value, then the consolidated classifier 30H classifies the targeted image as being another scene. By providing such a consolidated classifier 30H, it is possible to classify scenes suitably even when the characteristics of the scene to which the image belongs do not appear strongly in the targeted image. That is to say, the classification properties can be improved.
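The consolidated classification rule described above can be sketched as follows (the dictionary keys are hypothetical scene names; "other" stands for the other-scene result):

```python
def consolidated_classify(probability_info):
    """Pick the scene with the largest positive probability information;
    if no value is positive, classify the image as another scene."""
    positive = {s: v for s, v in probability_info.items() if v > 0}
    if not positive:
        return "other"
    return max(positive, key=positive.get)
```

Because the probability information was already stored by the overall sub-classifiers, this final stage needs no further support vector machine calculations.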
The Result Storage Section 37j
The result storage section 37j stores the classification results for the object to be classified that have been determined by the classification processing section 30I. For example, if, based on the classification results of the overall classifier 30F and the partial image classifier 30G, a positive flag is stored in the positive flag storage section 37h, then information is stored indicating that the object to be classified belongs to the category corresponding to this positive flag. If a positive flag is set that indicates that the targeted image belongs to a landscape scene, then result information indicating that the targeted image belongs to a landscape scene is stored. Similarly, if a positive flag is set that indicates that the targeted image belongs to an evening scene, then result information indicating that the targeted image belongs to an evening scene is stored. It should be noted that for targeted images for which a negative flag has been stored for all scenes, result information indicating that the targeted image belongs to another scene is stored. The classification result (result information) stored in the result storage section 37j is looked up by later processes. In the multifunctional apparatus 1, the image enhancement section 30C (see
The Image Classification Process
The following is an explanation of the image classification process performed by the main controller 31. By executing this image classification process, the main controller 31 functions as the face detection section 30A and the scene classification section 30B (characteristic amount obtaining section 30E, overall classifier 30F, partial image classifier 30G, consolidated classifier 30H, and result storage section 37j). Moreover, the computer program executed by the main controller 31 includes code for realizing the image classification process.
As shown in
If the targeted image contains no face image, then the main controller 31 carries out a process of obtaining characteristic amounts (S23). In the process of obtaining the characteristic amounts, the characteristic amounts are obtained based on the data of the targeted image. That is to say, the overall characteristic amounts indicating the overall characteristics of the targeted image and the partial characteristic amounts indicating the partial characteristics of the targeted image are obtained. It should be noted that the obtaining of these characteristic amounts has already been explained above (see S11 to S15,
When the characteristic amounts have been obtained, the main controller 31 performs a scene classification process (S24). In this scene classification process, the main controller 31 first functions as the overall classifier 30F and performs an overall classification process (S24a). In this overall classification process, classification is performed based on the overall characteristic amounts. Then, when the targeted image could be classified by the overall classification process, the main controller 31 determines the scene of the targeted image as the classified scene (YES in S24b). For example, it determines the image to be the scene for which a positive flag has been stored in the overall classification process. Then, it stores the classification result in the result storage section 37j. It should be noted that the details of the overall classification process are explained later. If the scene was not determined in the overall classification process, then the main controller 31 functions as a partial image classifier 30G and performs a partial image classification process (S24c). In this partial image classification process, classification is performed based on the partial characteristic amounts. Then, if the targeted image could be classified by the partial image classification process, the main controller 31 determines the scene of the targeted image as the classified scene (YES in S24d), and stores the classification result in the result storage section 37j. It should be noted that the details of the partial image classification process are explained later. If the scene was also not determined by the partial image classifier 30G, then the main controller 31 functions as a consolidated classifier 30H and performs a consolidated classification process (S24e). 
In this consolidated classification process, the main controller 31 reads out the probability information with positive values from the probability information storage section 37f and determines the image to be a scene corresponding to the probability information with the largest value, as explained above. Then, if the targeted image could be classified by the consolidated classification process, the main controller 31 determines the scene of the targeted image as the classified scene (YES in S24f). On the other hand, if the targeted image could also not be classified by the consolidated classification process, and negative flags have been stored for all scenes, then the targeted image is classified as being another scene (NO in S24f). It should be noted that in the consolidated classification process, the main controller 31 functioning as the consolidated classifier 30H first judges whether negative flags are stored for all scenes. Then, if it is judged that negative flags are stored for all scenes, the image is classified as being another scene, based on this judgment. In this case, the processing can be performed by confirming only the negative flags, so that the processing can be sped up.
The Overall Classification Process
The following is an explanation of the overall classification process. As shown in
When an overall sub-classifier has been selected, the main controller 31 judges whether the scene classified by the selected overall sub-classifier is subject to classification processing (S32). This judgment is carried out based on the positive flags and negative flags. That is to say, if a positive flag has been stored for a given scene, then the targeted image is decided to be the scene corresponding to that positive flag, so there is no need to classify the image for the other scenes, and they can be excluded from the classification process. Similarly, if a negative flag has been stored for a given scene, then the targeted image is not classified as the scene corresponding to that negative flag, so the scenes corresponding to negative flags can likewise be excluded from the classification process. Let us assume that during the classification with the landscape scene classifier 61, a positive flag for landscape scenes has been stored. In this case, classification with the remaining classifiers does not need to be carried out; it is judged that the scene is not subject to processing (NO in S32), and the classification process is skipped. Let us now assume that during the classification with the landscape scene classifier 61, a negative flag for night scenes has been stored. In this case, the classification with the night scene classifier 63 does not need to be carried out. Therefore, after the classification process with the evening scene classifier 62 is finished, it is judged that the night scene is not subject to processing (NO in S32), and its classification process is skipped. By adopting such a configuration, unnecessary classification processing is eliminated, so that the processing can be sped up.
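The flag check of step S32 can be sketched as follows; the function name and the flag containers are hypothetical stand-ins for the stored positive and negative flags, not elements taken from the claims.

```python
def is_subject_to_processing(scene, positive_flags, negative_flags):
    """Sketch of the S32 judgment.

    A scene is skipped when any scene already carries a positive flag
    (the targeted image has been decided, so the remaining classifiers
    are unnecessary), or when the scene itself carries a negative flag
    (the image has already been ruled out for that scene).
    """
    if positive_flags:            # some scene was already decided
        return False
    if scene in negative_flags:   # this scene was already excluded
        return False
    return True
```

With this check placed before the support vector machine calculation, unnecessary classification processing is eliminated, which is the speed-up the text describes.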
On the other hand, if it is judged in Step S32 that the scene is subject to processing, a calculation with the support vector machine is carried out. In other words, probability information is obtained based on the overall characteristic amounts. In this situation, the main controller 31 functions as the overall sub-classifier corresponding to the scene being processed, and obtains the classification function value serving as the probability information by a calculation based on the overall color average, the overall color variance, the moments and the appended Exif information.
When the classification function value has been obtained, it is judged whether a condition for positive judgment is established (S34). That is to say, the main controller 31 judges whether a condition is established for deciding that the targeted image is a certain scene. In this example, this is judged by comparing the classification function value with a positive threshold. For example, as shown in
If a positive condition has not been established, then it is judged whether a negative condition has been established (S36). That is to say, the main controller 31 judges whether a condition for deciding that the targeted image does not belong to a given scene is established. In this example, this is judged by comparing the classification function value with a negative threshold. For example, as shown in
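Steps S34 through S37 can be sketched as follows. The threshold values, and the assumption that a larger classification function value indicates a higher probability (with the positive threshold above the negative one), are illustrative readings rather than details taken from the figures.

```python
def judge(value, positive_threshold, negative_threshold):
    """Sketch of the threshold comparison in steps S34-S37.

    value: classification function value (probability information)
           obtained by the support vector machine calculation.
    """
    if value >= positive_threshold:
        return "positive"    # positive condition established: store a positive flag (S35)
    if value <= negative_threshold:
        return "negative"    # negative condition established: store a negative flag (S37)
    return "undecided"       # neither condition established (NO in S36)
```

An "undecided" result corresponds to the case where neither flag is stored and the process simply moves on to the next overall sub-classifier.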
After the storing of the positive flag (S35) or the negative flags (S37), or after it has been judged that a negative condition is not established (NO in S36), it is judged whether there is a further overall sub-classifier (S38). Here, the main controller 31 judges whether the processing has been finished up to that of the autumnal scene classifier 65, which has the lowest priority. Then, if the processing has been finished up to that of the autumnal scene classifier 65, it is judged that there is no further classifier, and the sequence of the overall classification process is finished. On the other hand, if the processing up to that of the autumnal scene classifier 65 has not been finished, then the overall sub-classifier with the next highest priority is selected (S31) and the above-described process is repeated.
The Partial Image Classification Process
The following is an explanation of the partial image classification process. As shown in
When a partial sub-classifier has been selected, the main controller 31 judges whether the scene classified by the selected partial sub-classifier is subject to classification processing (S42). This judgment is carried out based on positive flags and negative flags, as in the overall classifier 30F. Here, for the positive flags, only the flags stored by the classification with the partial sub-classifiers are used for this judgment; the flags stored by the classification with the overall classifier are not used. This is because when a positive flag is set with an overall sub-classifier, the scene is decided by the overall classification process, and the partial image classification process is not carried out. For the negative flags, on the other hand, both the flags stored by the classification with the partial sub-classifiers and those stored by the classification with the overall sub-classifiers are used for the judgment. Also in this partial image classification process, if it is judged that the scene is not subject to processing, the classification process is skipped (NO in S42). Therefore, unnecessary classification processing is eliminated, so that the processing can be sped up.
On the other hand, if it is judged in Step S42 that the scene is subject to processing, a calculation with the partial support vector machine is carried out (S43). In other words, probability information for the partial image is obtained based on the partial characteristic amounts. In this situation, the main controller 31 functions as a partial sub-classifier corresponding to the scene being processed, and obtains the classification function value serving as the probability information by a calculation based on the partial color average and the partial color variance. Then, if the obtained classification function value is a positive value, the corresponding detection number counter is incremented (+1). If the classification function value is not a positive value, then the count value of the detection number counter stays the same. It should be noted that the count value of the detection number counter is reset when processing a new targeted image (new targeted image data).
When the obtaining of the probability information for the partial images and the counter processing has been carried out, it is judged whether a condition for positive judgment is established (S44). That is to say, the main controller 31 judges whether a condition is established for deciding that the targeted image is the scene subject to processing. In this example, this is judged by comparing the count value of the detection number counter with a positive count value. For example, as shown in
If a positive condition has not been established, then it is judged whether a negative condition has been established (S46). That is to say, the main controller 31 judges whether a condition for deciding that the targeted image does not belong to a given scene is established. In this example, this is judged by comparing the count value with a negative count value. For example, as shown in
If a negative condition has not been established (NO in S46), then it is judged whether the number of partial images that have been processed has exceeded a predetermined number (S48). Here, if this predetermined number has not yet been exceeded, the procedure advances to Step S43 and the above-described process is repeated. On the other hand, if the predetermined number is exceeded or if a positive flag or a negative flag has been stored (S45, S47), then it is judged whether there is a further partial sub-classifier (S49). Here, the main controller 31 judges whether the processing has been finished up to that of the autumnal scene partial classifier 73, which has the lowest priority. Then, if the processing has been finished up to that of the autumnal scene partial classifier 73, it is judged that there is no further classifier, and the sequence of the partial classification process is finished. On the other hand, if the processing up to that of the autumnal scene partial classifier 73 has not been finished, then the partial sub-classifier with the next highest priority is selected (S41) and the above-described process is repeated.
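The partial image loop of steps S43 through S48 can be sketched as follows. Because the figures defining the negative count value are not reproduced here, the negative condition below is one plausible reading (the positive count can no longer be reached even if every remaining partial image were detected); the function name and parameters are hypothetical.

```python
def partial_classify(partial_values, positive_count, max_partials):
    """Sketch of the partial image classification loop (S43-S48).

    partial_values: classification function values, one per partial image,
                    obtained with the partial support vector machine.
    positive_count: count value at which the positive condition holds.
    max_partials:   predetermined number of partial images to process.
    """
    detections = 0
    for processed, value in enumerate(partial_values[:max_partials], start=1):
        if value > 0:
            detections += 1                # increment the detection number counter
        if detections >= positive_count:   # positive condition established (S44)
            return "positive"
        remaining = max_partials - processed
        if detections + remaining < positive_count:  # assumed negative condition (S46)
            return "negative"
    return "undecided"                     # predetermined number exceeded (S48)
```

Stopping as soon as either condition is established means that, in many cases, only a fraction of the partial images needs to be evaluated, which matches the speed-up argued for in the text.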
Summary of Classification Processing Section 30I
As should become clear from the above explanations, with this classification processing section 30I, the overall classifier 30F classifies the scene to which a targeted image belongs, based on the overall characteristic amounts, and the partial image classifier 30G classifies the scene to which the targeted image belongs, based on the partial characteristic amounts. Thus, the category to which a given targeted image belongs is classified using a plurality of types of classifiers with different properties, so that the accuracy with which scenes are classified can be improved. Furthermore, the overall classifier 30F includes a plurality of overall sub-classifiers that classify whether the targeted image belongs to a specific scene (predetermined category), the number of overall sub-classifiers corresponding to the number of specific scene types that can be classified (the number of predetermined categories). Thus, the properties can be optimized for each overall sub-classifier individually, and the classification accuracy can be increased.
The overall sub-classifiers carry out the classification of the targeted image based on probability information (classification function values) indicating whether the probability that the targeted image belongs to a specific scene is high or low. That is to say, if the probability indicated by the probability information is within a probability range, specified by a probability threshold, for which it can be decided that the object to be classified belongs to a given category, then the targeted image is classified as belonging to that specific category. Thus, the processing can be sped up while guaranteeing the accuracy of the classification. That is to say, it is possible to achieve a high level of both processing speed and classification accuracy. Moreover, based on probability information, the partial sub-classifiers classify whether an image portion belongs to a specific scene (predetermined category), individually for each of the plurality of partial characteristic amounts obtained from the plurality of sets of partial image data, and count the number of portions that are classified as belonging to a specific scene with a detection number counter. Then, based on this count value, it is classified whether the overall targeted image belongs to a specific scene. Thus, the count value serves as a basis for the judgment, so that the classification processing can be performed efficiently.
In this classification processing section 30I, the classification is performed using the consolidated classifier 30H for targeted images whose scenes could be classified neither with the overall classifier 30F nor with the partial image classifier 30G. This consolidated classifier 30H classifies, as the scene to which the targeted image belongs, the scene corresponding to the probability information indicating the highest probability among the probability information (classification function values) obtained for the plurality of scenes. By providing this consolidated classifier 30H, classification can be carried out even if the scene to which an image belongs could not be classified with the overall classifier 30F and the partial image classifier 30G. Therefore, the accuracy of the classification can be improved.
The overall classifier 30F of the classification processing section 30I includes a plurality of overall sub-classifiers with differing classification targets. If the scene to which the targeted image belongs could be decided with the overall sub-classifier of an earlier stage, then a classification with the overall sub-classifiers of the later stages is not carried out. That is to say, if the overall sub-classifier of the earlier stage obtains the probability information with its support vector machine, and if the probability indicated by this probability information is within a probability range, specified by a probability threshold, for which it can be decided that the targeted image belongs to that scene, then a positive flag is stored. In accordance with the stored positive flags, it is judged that the overall sub-classifiers of the later stages do not carry out a classification for this targeted image. In this case, probability information is not obtained by their support vector machines. Consequently, the processing for the scene classification can be sped up. Here, the support vector machine of the overall sub-classifier of an earlier stage and the support vector machines of the overall sub-classifiers of a later stage use the same characteristic amounts. Thus, the process of obtaining the characteristic amounts is shared, so that the processing can be made more efficient.
Moreover, the overall classifier 30F and the partial image classifier 30G of the classification processing section 30I include sub-classifiers performing the classification of the same scenes. In the above-described embodiment, the evening scene classifier 62 of the overall classifier 30F and the evening scene partial classifier 71 of the partial image classifier 30G both classify evening scenes. This is similar for the flower scene classifier 64 and the flower scene partial classifier 72 as well as for the autumnal scene classifier 65 and the autumnal scene partial classifier 73. Moreover, if the scene to which the targeted image belongs could be decided with the overall sub-classifiers (evening scene classifier 62, flower scene classifier 64, and autumnal scene classifier 65), then the partial sub-classifiers (evening scene partial classifier 71, flower scene partial classifier 72, and autumnal scene partial classifier 73) do not perform a classification for the targeted image. Thus, the processing of the scene classification is sped up. Furthermore, the overall sub-classifiers classify the scene to which an image belongs based on the overall characteristic amounts indicating the overall characteristics of the targeted image, and the partial sub-classifiers classify the scene to which an image belongs based on the partial characteristic amounts indicating the partial characteristics of the targeted image. Thus, characteristic amounts that are suitable for the properties of the classifier are used, so that the accuracy of the classification can be increased. For example, with the overall sub-classifiers, a classification is possible that takes into account the overall characteristics of the targeted image, and with the partial sub-classifiers, a classification is possible that takes into account the partial characteristics of the targeted image.
Moreover, with the overall sub-classifiers, a classification by other overall sub-classifiers is not performed in accordance with probability information obtained by the support vector machine of a given overall sub-classifier. That is to say, a given overall sub-classifier compares the obtained probability information with a probability threshold, and can judge that the targeted image does not belong to another scene corresponding to another overall sub-classifier. Then, if it has been judged that the image does not belong to this other scene, a negative flag corresponding to this other scene is stored. Based on this negative flag, it is judged that the other overall sub-classifier does not carry out a classification for the targeted image. With this configuration, the processing can be made more efficient. Moreover, the probability information obtained with the support vector machine of the given overall sub-classifier is used for the judgment of the scene corresponding to that given overall sub-classifier as well as the judgment of the scene corresponding to the other overall sub-classifier. Thus, the probability information is used in various ways, so that also with regard to this aspect, the processing can be made more efficient.
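The reuse of one sub-classifier's probability information to negate other scenes can be sketched as follows. The per-scene negation thresholds are a hypothetical stand-in; the patent states only that the obtained probability information is compared with a probability threshold to judge that the image does not belong to another scene.

```python
def negate_other_scenes(value, negation_thresholds):
    """Sketch: probability information obtained by one overall
    sub-classifier is compared against assumed per-scene thresholds;
    scenes whose threshold is met or exceeded receive a negative flag,
    so their own support vector machines never need to run.

    value: classification function value from the current sub-classifier.
    negation_thresholds: dict mapping other scene names -> threshold
                         above which that scene can be ruled out.
    Returns the set of scenes to negative-flag.
    """
    return {other for other, threshold in negation_thresholds.items()
            if value >= threshold}
```

This is the "various ways" reuse the text describes: a single calculated value both decides the current scene and prunes the classification work for other scenes.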
Furthermore, if the overall classifier 30F has decided that the image does not belong to any of the scenes, based on the probability information obtained with the overall sub-classifiers, then the partial image classifier 30G does not perform a classification for that targeted image. Accordingly, the processing can be sped up.
In the embodiment explained above, the object to be classified is an image based on image data, and the classification apparatus is the multifunctional apparatus 1. However, the classification apparatus classifying images is not limited to the multifunctional apparatus 1. For example, it may also be a digital still camera DC, a scanner, or a computer that can execute a computer program for image processing (for example, retouching software). Moreover, it can also be an image display device that can display images based on image data or an image data storage device that stores image data. Furthermore, the object to be classified is not limited to images. That is to say, any object that can be sorted into a plurality of categories using a plurality of classifiers can serve as the object to be classified.
Furthermore, in the embodiment above, a multifunctional apparatus 1 was described, which classifies the scene of a targeted image, but this includes therein also the disclosure of a category classification apparatus, a category classification method, a method for using a classified category (for example a method for enhancing an image, a method for printing, and a method for ejecting a liquid based on a scene), a computer program, and a storage medium storing a computer program or code.
Moreover, regarding the classifiers, the above-described embodiment explained support vector machines, but as long as they can sort the category of a targeted image, there is no limitation to support vector machines. For example, it is also possible to use a neural network or the AdaBoost algorithm as a classifier.
Number | Date | Country | Kind
---|---|---|---
2007-038350 | Feb 2007 | JP | national
2007-315244 | Dec 2007 | JP | national