This application is based upon and claims the benefit of priority from prior Japanese Patent Application No. 2005-015629, filed Jan. 24, 2005, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to an image compression method and an image compression device which compress an image having characteristics which are recognition objects in a character recognition device, a face collation device or the like.
2. Description of the Related Art
For example, in a mail handling device which sorts mail based on a recognition result of characters written on the mail, the time available for handling each piece of mail is restricted. Therefore, in the mail handling device, it is preferable to shorten the time required to transmit an image of the mail while retaining the resolution required for character recognition. In a face collation device, an image including a collated face image is stored as a log in many cases. In this face collation device, it is preferable to shorten the time required to extract the image stored as the log.
Conventional technologies on image compression are disclosed in Jpn. Pat. Appln. KOKAI Publication Nos. 9-116765 and 2000-156861.
In Jpn. Pat. Appln. KOKAI Publication No. 9-116765, an image coding method is disclosed in which a difference value between adjacent pixels is calculated, and a variable-length code is allocated depending on an appearance frequency in coding of static image data by a reversible system.
Moreover, Jpn. Pat. Appln. KOKAI Publication No. 2000-156861 discloses a method of compressing multi-valued information in which the sign (positive or negative binary information) and the absolute value of a difference value between pieces of luminance information of adjacent pixels are calculated, and a bitmap is developed for every pixel to perform the coding.
However, in the technologies disclosed in Jpn. Pat. Appln. KOKAI Publication Nos. 9-116765 and 2000-156861, there is a problem that a compression ratio is limited because the image compression is reversible. On the other hand, in an image subjected to general irreversible compression such as so-called JPEG, there is a problem that characteristics as recognition objects might be lost.
According to one aspect of the present invention, an object is to provide an image compression method and an image compression device capable of raising a compression efficiency of an image while maintaining characteristics as recognition objects in the image.
According to one aspect of the present invention, there is provided a method of compressing an image, comprising: performing image input processing to input the image having characteristics which are recognition objects; performing conversion processing to convert a pixel value of each pixel forming the image while retaining the characteristics which are the recognition objects in the image input by the image input processing; and performing coding processing with respect to the image converted by the conversion processing.
According to another aspect of the present invention, there is provided an image compression device comprising: an image input section which inputs an image having characteristics which are recognition objects; a conversion processing section which converts a pixel value of each pixel forming the image while retaining the characteristics which are the recognition objects in the image input by the image input section; and a coding processing section which performs coding processing with respect to the image converted by the conversion processing section.
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention, and together with the general description given above and the detailed description of the embodiments given below, serve to explain the principles of the invention.
First to fifth embodiments of the present invention will be described hereinafter with reference to the drawings.
First, a first embodiment will be described.
According to the first embodiment, as shown in
The image input section 101 comprises, for example, a digital camera, a television camera, an image scanner or the like. The image input section 101 may input a color image or a monochromatic image. The image input section 101 inputs an image including a recognition object. The image input section 101 supplies the input image to the smoothing processing section 102.
Moreover, it is assumed that the image input section 101 inputs digital image data in which a value (pixel value) of each pixel is represented by a digital value. For example, the image input section 101 inputs as the image the digital image data constituted of 640 pixels in a lateral direction (x-direction) and 480 pixels in a longitudinal direction (y-direction). Here, description will be given assuming that the image input section 101 inputs the image data in which the pixel value of each pixel is represented by a luminance value. It is to be noted that the image input section 101 may input the image data in which the value (pixel value) of each pixel is represented by a density value.
The smoothing processing section 102 performs smoothing processing with respect to the image. The smoothing processing section 102 performs the smoothing processing as conversion processing with respect to the image supplied from the image input section 101. That is, the smoothing processing section 102 functions as a conversion processing section which performs the conversion processing with respect to the image input as an object of compression processing by the image input section 101. The smoothing processing section 102 supplies the smoothed image to the coding section 103.
For example, the smoothing processing section 102 has a function of subjecting the image supplied from the image input section 101 to filtering processing by a median filter as the smoothing processing.
The coding section 103 performs variable-length coding processing as the compression processing with respect to the image. The coding section 103 subjects the image smoothed by the smoothing processing section 102 to the coding processing. The coding section 103 supplies the coded image to the data output section 104.
The coding section 103 performs the coding processing (Huffman coding processing), for example, based on Huffman coding theory. In this case, the coding section 103 first calculates an appearance frequency for each pixel value of the pixels forming the image (image data) smoothed by the smoothing processing section 102. On calculating the appearance frequency for every pixel value, the coding section 103 allocates a code having fewer bits to a pixel value having a higher appearance frequency. Accordingly, the coding section 103 can allocate a Huffman code to the pixel value of each pixel in the image data. As a result, the coding section 103 performs variable-length coding processing with respect to the image (image data) supplied from the smoothing processing section 102 based on the Huffman coding theory.
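By way of illustration only, the allocation of shorter codes to more frequent pixel values may be sketched as follows. The function names `huffman_code` and `encode`, the tie-breaking order, and the handling of an image having a single pixel value are assumptions of this sketch and are not specified in the embodiment.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(pixels):
    """Return {pixel_value: bit string} built from appearance frequencies."""
    freq = Counter(pixels)
    if len(freq) == 1:
        # Assumption: an image with one distinct value still gets a 1-bit code.
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {v: ""}) for v, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; prefix their codes with 0 / 1.
        n0, _, c0 = heapq.heappop(heap)
        n1, _, c1 = heapq.heappop(heap)
        merged = {v: "0" + b for v, b in c0.items()}
        merged.update({v: "1" + b for v, b in c1.items()})
        heapq.heappush(heap, (n0 + n1, next(tie), merged))
    return heap[0][2]


def encode(pixels, code):
    """Concatenate the variable-length codes of all pixels."""
    return "".join(code[p] for p in pixels)


# Hypothetical smoothed pixel data: one dominant value and two rare values.
pixels = [200] * 12 + [30] * 3 + [198] * 1
code = huffman_code(pixels)
bits = encode(pixels, code)
print(code, len(bits))
```

In this example the most frequent value receives a 1-bit code and the two rarer values receive 2-bit codes, so that the 16 pixels occupy 20 bits instead of 128 bits at 8 bits per pixel.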
It is to be noted that the coding section 103 does not necessarily have to use the Huffman coding processing as the coding processing. For example, the coding section 103 may use run length coding processing or the like as the coding processing.
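As a minimal sketch of the run length coding mentioned above, a row of pixel values may be represented as pairs of a value and the number of times it repeats. The function names and the (value, count) pair representation are assumptions of this sketch; the embodiment does not specify a particular run length format.

```python
def run_length_encode(pixels):
    """Collapse consecutive equal pixel values into (value, count) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return [(v, n) for v, n in runs]


def run_length_decode(runs):
    """Expand (value, count) pairs back into the pixel sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


# Hypothetical smoothed row: long runs of the background value.
row = [200, 200, 200, 200, 30, 30, 200, 200]
runs = run_length_encode(row)
print(runs)
```

The longer the runs of identical values produced by the smoothing processing, the fewer pairs are required to represent the row.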
The data output section 104 outputs as a compressed image the coded image supplied from the coding section 103. For example, the data output section 104 outputs the compressed image to a recording device (not shown).
Next, the smoothing processing in the smoothing processing section 102 will be described.
An image 121 shown in
On the other hand,
In the example shown in
In the smoothing processing section 102, the smoothing processing is performed depending on a magnitude (difference value) of a difference between the intermediate value in the median filter and the luminance value of a noted pixel. That is, when the difference value between the intermediate value and the luminance value of the noted pixel is not less than a preset threshold value, the smoothing processing section 102 leaves the luminance value of the noted pixel unchanged, so that the value after smoothing equals the value before smoothing. When the difference value is less than the preset threshold value, the luminance value of the noted pixel after smoothing is set to the intermediate value.
In other words, as shown in
Moreover, the following equation 1 represents the smoothing processing of the smoothing processing section 102 depending on the magnitude of the difference between the intermediate value and the luminance value of the noted pixel:
|Median(f(x+i, y+i))−f(x, y)|≧Th, f(x, y)=f(x, y); and
|Median(f(x+i, y+i))−f(x, y)|<Th, f(x, y)=Median(f(x+i, y+i)), (Equation 1)
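wherein f(x, y) on the right-hand side denotes the luminance value of the noted pixel before smoothing, f(x, y) on the left-hand side denotes the luminance value of the noted pixel after smoothing, Median(f(x+i, y+i)) denotes the intermediate value of the luminance values in the filter window around the noted pixel, and Th denotes the threshold value.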
Next, a flow of the smoothing processing in the smoothing processing section 102 will be described.
In the smoothing processing section 102 to which the image has been supplied from the image input section 101, each pixel in the image is successively regarded as the noted pixel, and the median filter is applied to the pixel. When the median filter is applied to the noted pixel in the image, first the smoothing processing section 102 calculates the intermediate value in the median filter (step S1). For example, when the smoothing processing section 102 has the median filter with respect to 3×3 pixels, the section calculates the intermediate value with respect to the image constituted of 3×3 pixels.
On calculating the intermediate value, the smoothing processing section 102 calculates the magnitude (difference value) of the difference between the intermediate value and the luminance value of the noted pixel. On calculating the difference value between the intermediate value and the luminance value of the noted pixel, the smoothing processing section 102 judges whether or not the calculated difference value is not less than the preset threshold value (step S2).
When it is judged that the difference value between the luminance value of the noted pixel and the intermediate value is not less than the preset threshold value (step S2, YES), the smoothing processing section 102 leaves the luminance value of the noted pixel unchanged.
Moreover, when it is judged that the difference value between the luminance value of the noted pixel and the intermediate value is less than the preset threshold value (step S2, NO), the smoothing processing section 102 replaces the luminance value of the noted pixel with the intermediate value (step S3).
The smoothing processing section 102 regards as the noted pixels all of the pixels in the image supplied from the image input section 101 to perform the above-described processing. That is, the smoothing processing section 102 repeatedly executes the processing of the steps S1 to S3 to thereby regard all of the pixels in the image as the noted pixels and perform the smoothing processing. Accordingly, the smoothing processing section 102 subjects the whole image supplied from the image input section 101 to the smoothing processing.
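The flow of the steps S1 to S3 may be sketched as follows, purely as an illustration. The function name `threshold_median_smooth`, the clipping of the 3×3 window at the image border, the use of the lower middle value when the clipped window holds an even number of pixels, and the threshold value of 10 are all assumptions of this sketch.

```python
from statistics import median_low


def threshold_median_smooth(image, th):
    """Apply the thresholded 3x3 median filter to a list of pixel rows."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # results never feed back into windows
    for y in range(h):
        for x in range(w):
            # Step S1: intermediate value of the window around the noted
            # pixel (assumption: the window is clipped at the border).
            window = [image[yy][xx]
                      for yy in range(max(0, y - 1), min(h, y + 2))
                      for xx in range(max(0, x - 1), min(w, x + 2))]
            med = median_low(window)
            # Step S2: compare the difference value with the threshold.
            if abs(med - image[y][x]) < th:
                # Step S3: small difference, replace with the intermediate value.
                out[y][x] = med
            # Otherwise (difference >= th) the noted pixel is left unchanged.
    return out


# Hypothetical image: a bright background with slight noise and one dark
# pixel (value 30) standing in for part of a character stroke.
image = [
    [200, 201, 199, 200],
    [200,  30, 202, 200],
    [198, 200, 200, 201],
]
smoothed = threshold_median_smooth(image, th=10)
print(smoothed)
```

In this example the dark pixel differs from the intermediate value by 170 and is therefore retained, whereas the small fluctuations of the background are mostly replaced by the intermediate value.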
In the above-described smoothing processing, it is possible to retain, for example, image information in which the luminance value rapidly changes in the image. Therefore, even in a case where the image information in which the luminance value rapidly changes is a characteristic which is a recognition object, the smoothing processing section 102 can perform the smoothing processing suitable for the compression with respect to the image of the compression object while retaining the characteristics which are the recognition objects.
As described above, in the image compression device of the first embodiment, the image of the compression object is subjected to smoothing processing in which the luminance value of the noted pixel is left unchanged in a case where the difference value between the intermediate value and the luminance value of the noted pixel is not less than the predetermined threshold value, and is set to the intermediate value in a case where the difference value is less than the predetermined threshold value.
Consequently, in the image compression device and method of the first embodiment, the smoothing processing suitable for the compression can be performed with respect to the image which is the compression processing object while retaining the characteristics which are such recognition objects that the luminance value rapidly changes. As a result, in the image compression device and method of the first embodiment, redundancy can be imparted while retaining the characteristics which are the recognition objects, and a compression efficiency of the image can be raised.
Next, a second embodiment will be described.
The image compression device 200 according to the second embodiment is a modification of the image compression device 100 of the first embodiment. As shown in
It is to be noted that the image input section 201, the coding section 203, and the data output section 204 are similar to the image input section 101, the coding section 103, and the data output section 104 in the image compression device 100 of the first embodiment. Therefore, detailed description of the image input section 201, coding section 203, and data output section 204 will be omitted.
The smoothing and difference processing section 202 subjects an image supplied from the image input section 201 to smoothing processing, and processing of a difference between pixels. The smoothing and difference processing section 202 supplies the image as a processing result to the coding section 203. The smoothing processing by the smoothing and difference processing section 202 is smoothing processing depending on a magnitude of the difference between an intermediate value and a luminance value of a noted pixel in a median filter as described in the first embodiment.
Moreover, the processing of the difference between the pixels by the smoothing and difference processing section 202 is processing to calculate a difference value between adjacent pixels. The smoothing and difference processing section 202 combines and executes the smoothing processing and the processing of the difference between the pixels.
It is to be noted that the smoothing and difference processing section 202 functions as a conversion processing section for performing conversion processing with respect to the image which is input from the image input section 201 and which is regarded as an object of compression processing.
Next, the processing of the smoothing and difference processing section 202 will be described.
The smoothing and difference processing section 202 successively regards as the noted pixels the respective pixels in the image supplied from the image input section 201, and performs the processing described later.
When the image is supplied from the image input section 201, the smoothing and difference processing section 202 performs smoothing processing (steps S21 to S23). This smoothing processing is similar to the processing of the steps S1 to S3 described in the first embodiment. That is, the smoothing and difference processing section 202 applies the median filter as the smoothing processing with respect to the noted pixel in the image acquired from the image input section 201, and calculates the intermediate value in the median filter (step S21). For example, in the smoothing and difference processing section 202 having the median filter with respect to 3×3 pixels, the intermediate value is calculated with respect to the image constituted of 3×3 pixels.
On calculating the intermediate value, the smoothing and difference processing section 202 calculates the magnitude (difference value) of the difference between the intermediate value and the luminance value of the noted pixel. On calculating the difference value between the intermediate value and the luminance value of the noted pixel, the smoothing and difference processing section 202 judges whether or not the calculated difference value is not less than a preset threshold value (step S22).
When it is judged that the difference value between the luminance value of the noted pixel and the intermediate value is not less than the preset threshold value (step S22, YES), the smoothing and difference processing section 202 leaves the luminance value of the noted pixel unchanged.
Moreover, when it is judged that the difference value between the luminance value of the noted pixel and the intermediate value is less than the preset threshold value (step S22, NO), the smoothing and difference processing section 202 replaces the luminance value of the noted pixel with the intermediate value (step S23).
Next, the smoothing and difference processing section 202 performs processing to calculate the difference value between the adjacent pixels as the processing of the difference between the pixels (step S24). In the processing of the difference between the pixels, the section calculates the difference value between the luminance value of the noted pixel and that of the pixel adjacent to the noted pixel. For example, the processing of the difference between the pixels is calculated by the following equation 2.
f(x, y)=f(x, y)−f(x−1, y), (Equation 2)
wherein f(x, y) denotes the luminance value of the noted pixel, and f(x−1, y) denotes the luminance value of the pixel adjacent to the noted pixel.
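Equation 2 may be illustrated by the following sketch. The function names, the treatment of the first pixel of each row (which has no left neighbour and is kept as it is), and the computation of every difference from the values before the differencing are assumptions of this sketch. The inverse function is included only to show that the differencing itself loses no information.

```python
def horizontal_difference(image):
    """Replace each pixel by its difference from the left neighbour."""
    out = []
    for row in image:  # assumption: every row holds at least one pixel
        # Assumption: the first pixel of a row has no neighbour, keep it.
        out.append([row[0]] + [row[x] - row[x - 1] for x in range(1, len(row))])
    return out


def undo_horizontal_difference(diff):
    """Recover the original rows by accumulating the differences."""
    out = []
    for row in diff:
        acc, restored = 0, []
        for d in row:
            acc += d
            restored.append(acc)
        out.append(restored)
    return out


# Hypothetical smoothed row: background, a dark character stroke, background.
image = [[200, 200, 200, 30, 30, 200]]
diff = horizontal_difference(image)
print(diff)
```

After smoothing, flat regions yield runs of zero, so that the difference image concentrates on a small number of values, which suits the variable-length coding.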
When the processing of the difference between the pixels ends, the smoothing and difference processing section 202 checks whether or not the processing has ended with respect to all of the pixels in the image acquired from the image input section 201 (step S25).
When it is judged by this check that the processing does not end with respect to all of the pixels, the smoothing and difference processing section 202 updates the noted pixel (step S26), and returns to the step S21. Accordingly, the smoothing and difference processing section 202 repeatedly performs processing similar to the above-described processing with respect to the next noted pixel.
Moreover, when it is judged by the above-described check that the processing with respect to all of the pixels has ended, the smoothing and difference processing section 202 ends the smoothing and the processing of the difference between the pixels with respect to the whole image (i.e., all of the pixels in the image) acquired from the image input section 201. The image subjected to the smoothing and the processing of the difference between the pixels in this manner is supplied to the coding section 203.
In the above-described smoothing processing and the processing of the difference between the pixels, the processing of the difference between the pixels can be performed with respect to the image while retaining characteristics which are recognition objects such as image information in which the luminance value rapidly changes. Therefore, even in a case where the image information in which the luminance value rapidly changes is the characteristic which is the recognition object, the smoothing and difference processing section 202 can perform the processing of the difference between the pixels with respect to the image of a compression object smoothed while retaining the characteristics which are the recognition objects.
As described above, in the image compression device of the second embodiment, the image of the compression object is subjected to smoothing processing in which the luminance value of the noted pixel is left unchanged in a case where the difference value between the intermediate value and the luminance value of the noted pixel is not less than the predetermined threshold value, and is set to the intermediate value in a case where the difference value is less than the predetermined threshold value. Furthermore, the device performs the processing of the difference between the adjacent pixels in the smoothed image.
Consequently, in the image compression device and method of the second embodiment, the processing of the difference between the pixels can be performed with respect to the image smoothed while retaining the characteristics which are such recognition objects that the luminance value rapidly changes. As a result, in the image compression device and method of the second embodiment, redundancy can be imparted while retaining the characteristics which are the recognition objects, and a compression efficiency of the image can be raised.
Next, a third embodiment will be described.
This third embodiment will be described mainly assuming a case where a character image is a characteristic which is a recognition object in an image regarded as a compression object.
According to the third embodiment, as shown in
It is to be noted that the image input section 301, the coding section 304, and the data output section 305 are similar to the image input section 101, the coding section 103, and the data output section 104 in the above-described first embodiment. Therefore, detailed description of the image input section 301, coding section 304, and data output section 305 will be omitted.
The characteristic extracting section 302 extracts characteristics as images which are recognition objects from the image supplied from the image input section 301. The characteristic extracting section 302 extracts the characteristics as the images of the recognition objects from a luminance value of each pixel constituting the image supplied from the image input section 301. The characteristic extracting section 302 outputs the extracted characteristics as the images of the recognition objects to the gradation conversion processing section 303.
For example, in a case where the image supplied from the image input section 301 is an image of a paper sheet which is an object of character recognition, a pixel group (character region) constituting a character and a pixel group (background region) constituting a background are obtained as the characteristics from the image. Especially in a case where the character having a certain luminance value is described on the paper sheet having a uniform luminance value, the image of the paper sheet comprises the pixel group having the luminance value of the paper sheet itself as the background, and the pixel group having the luminance value of the character itself as the character. In this case, the distribution of the luminance values of the respective pixels in the image has two maxima, one at the luminance value of the pixels constituting the background and the other at the luminance value of the pixels constituting the character.
The gradation conversion processing section 303 performs gradation conversion with respect to the image based on the characteristics of the image obtained by the characteristic extracting section 302. That is, the gradation conversion processing section 303 functions as a conversion processing section which performs conversion processing with respect to the image input from the image input section 301 and regarded as an object of compression processing. The gradation conversion processing section 303 performs such gradation conversion processing as to allocate a fixed value to each pixel of the image, for example, based on the characteristics of the image obtained by the characteristic extracting section 302. The gradation conversion processing section 303 outputs image data subjected to the gradation conversion processing to the coding section 304.
For example, in a case where the image supplied from the image input section 301 is the image of the paper sheet which is the object of the character recognition, a threshold value (Th1) indicating the pixel group (character region) constituting the character, and a threshold value (Th2) indicating the pixel group (background region) constituting the background are obtained as the characteristics from the characteristic extracting section 302. In this case, the gradation conversion processing section 303 subjects to the gradation conversion processing a whole image (all of the pixels in the image) supplied from the image input section 301 based on two threshold values (Th1, Th2) supplied from the characteristic extracting section 302.
Next, an example of characteristic extraction processing by the characteristic extracting section 302 will be described.
That is, when the image is supplied from the image input section 301, the characteristic extracting section 302 prepares the histogram of the luminance values of the respective pixels in the image. In a case where the image supplied from the image input section 301 is the image of the paper sheet which is the object of the character recognition, the characteristic extracting section 302 prepares the histogram shown in
Therefore, on preparing the histogram of the luminance values with respect to the image supplied from the image input section 301, the characteristic extracting section 302 extracts two maximum values from the prepared histogram. As described above, each pixel having the luminance value in the vicinity of the maximum value extracted from the histogram is the pixel of the character region or the pixel of the background region. In the example shown in
Therefore, the characteristic extracting section 302 determines the threshold values for the character region and the background region based on the maximum values extracted from the histogram. As to these threshold values, a value distant from each maximum value by a predetermined value may be simply set as each threshold value (Th1, Th2), or a value corresponding to a certain accumulated value of the histogram may be set as each threshold value (Th1, Th2). On determining these threshold values, the characteristic extracting section 302 supplies the determined threshold values (Th1, Th2) as the values indicating the characteristics of the image to the gradation conversion processing section 303.
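The simpler of the two options, in which each threshold value is set at a predetermined distance from a maximum of the histogram, may be sketched as follows. The function name `extract_thresholds`, the parameters `margin` and `min_separation`, the rule used to find the second maximum, and the direction in which each threshold is offset (Th1 above the dark maximum, Th2 below the bright maximum) are assumptions of this sketch.

```python
from collections import Counter


def extract_thresholds(pixels, margin=10, min_separation=50):
    """Return (Th1, Th2) from the two main maxima of the luminance histogram."""
    hist = Counter(pixels)
    # Highest maximum of the histogram (ties go to the smaller luminance).
    first = max(hist, key=lambda v: (hist[v], -v))
    # Assumption: the second maximum is the most frequent value lying at
    # least `min_separation` luminance levels away from the first one.
    far = [v for v in hist if abs(v - first) >= min_separation]
    if not far:
        raise ValueError("histogram has no second, well-separated maximum")
    second = max(far, key=lambda v: (hist[v], -v))
    dark, bright = sorted((first, second))
    # Assumption: Th1 lies above the dark (character) maximum and Th2 lies
    # below the bright (background) maximum, each by `margin`.
    return dark + margin, bright - margin


# Hypothetical paper sheet image: dark characters near 30, background near 200.
pixels = ([30] * 40 + [32] * 10 + [28] * 8
          + [200] * 300 + [198] * 60 + [203] * 50 + [120] * 2)
th1, th2 = extract_thresholds(pixels)
print(th1, th2)
```

A practical implementation would presumably smooth the histogram before searching for the maxima and confirm that Th1 is smaller than Th2; both are omitted here for brevity.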
The gradation conversion processing section 303 performs gradation conversion processing to allocate the fixed value to each pixel of the image based on the threshold values (Th1, Th2) obtained from the characteristic extracting section 302. For example, in a case where the image supplied from the image input section 301 is the image of the paper sheet which is the object of the character recognition, the gradation conversion processing section 303 performs the gradation conversion processing represented by the following equation 3 based on the threshold values (Th1, Th2) supplied from the characteristic extracting section 302:
f′(x, y)=Th1 in a case where f(x, y)≦Th1; and
f′(x, y)=Th2 in a case where f(x, y)≧Th2, (Equation 3)
wherein f(x, y) indicates the luminance value of the pixel in the image supplied from the image input section 301, Th1 and Th2 indicate two threshold values judged by the characteristic extracting section 302, and f′(x, y) indicates the luminance value of the pixel converted by the gradation conversion processing section 303.
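Equation 3 may be illustrated by the following sketch. The function name `gradation_convert` and the sample threshold values are assumptions; pixels whose luminance values lie between Th1 and Th2 are left unchanged here, which Equation 3 implies but does not state explicitly.

```python
def gradation_convert(image, th1, th2):
    """Clamp dark pixels to Th1 and bright pixels to Th2 (Equation 3)."""
    return [[th1 if p <= th1 else th2 if p >= th2 else p for p in row]
            for row in image]


# Hypothetical row: character pixels, one intermediate pixel, background pixels.
image = [[28, 30, 35, 120, 190, 200, 203]]
converted = gradation_convert(image, th1=40, th2=190)
print(converted)
```

In this example seven distinct luminance values are reduced to three, while the contrast between the character region and the background region is retained.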
As shown in
Therefore, in the gradation conversion processing by the gradation conversion processing section 303, redundancy can be imparted without losing the characteristics in the character region or the background region. As a result, in the image compression device 300, a compression efficiency of the image can be raised without influencing the processing such as character recognition with respect to the compressed image.
As described above, in the image compression device of the third embodiment, the characteristics which are objects of recognition processing are extracted from the acquired image, the image is subjected to the gradation conversion processing depending on the characteristics which are the objects of the recognition processing in the extracted image, and the image subjected to the gradation conversion processing is coded.
In the image compression device and method of the third embodiment, the compression efficiency of the image can be raised while retaining the characteristics which are the objects of the recognition processing in the image which is the object of the compression processing.
Next, a fourth embodiment will be described.
The image compression device 400 according to the fourth embodiment is a modification of the third embodiment. The fourth embodiment will be described mainly assuming a case where a characteristic which is a recognition object in an image regarded as a compression object is a face image.
As shown in
It is to be noted that the image input section 401, the gradation conversion processing section 404, the coding section 405, and the data output section 406 are similar to the image input section 301, the gradation conversion processing section 303, the coding section 304, and the data output section 305 in the third embodiment. Therefore, detailed description of the image input section 401, gradation conversion processing section 404, coding section 405, and data output section 406 will be omitted.
The face region detecting section 402 performs processing to detect a face region of the face image from the image supplied from the image input section 401. The face region detecting section 402 supplies to the deviation value judgment section 403 information indicating the face region detected from the image supplied from the image input section 401.
It is to be noted that in the detection processing of the face region in the face region detecting section 402, for example, there can be applied a method described in a document (Kazuhiro FUKUI, Osamu YAMAGUCHI: “Face Characteristic Point Extraction by Combination of Shape Extraction and Pattern Collation”, Papers of the Institute of Electronics, Information and Communication Engineers (D-II), vol. J80-D-II, No. 8, pp. 2170 to 2177 (1997)). In this document, there is described a method in which a correlation value is obtained while moving in an image a template prepared beforehand to thereby regard a place having the highest correlation value as the face region.
Moreover, in the detection processing of the face region in the face region detecting section 402, a method of extracting the face region utilizing an eigenspace method or a subspace method may be used.
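The idea of moving a template over the image and taking the place having the highest correlation value may be sketched as follows. This is a generic normalized cross-correlation search and is not the method of the cited document; the function name `best_match`, the choice of the correlation measure, and the tiny test pattern are assumptions of this sketch.

```python
from math import sqrt


def best_match(image, template):
    """Return (score, (x, y)) of the best-correlating template position."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t = [v for row in template for v in row]
    t_mean = sum(t) / len(t)
    t_dev = [v - t_mean for v in t]
    t_norm = sqrt(sum(d * d for d in t_dev))
    best_score, best_pos = -2.0, None  # any real correlation exceeds -2
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            p_mean = sum(patch) / len(patch)
            p_dev = [v - p_mean for v in patch]
            p_norm = sqrt(sum(d * d for d in p_dev))
            if p_norm == 0 or t_norm == 0:
                continue  # a flat patch has no defined correlation, skip it
            score = sum(a * b for a, b in zip(p_dev, t_dev)) / (p_norm * t_norm)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_score, best_pos


# Hypothetical 2x2 "template" embedded in a flat image at x=2, y=1.
template = [[10, 200],
            [200, 10]]
image = [
    [100, 100, 100, 100, 100],
    [100, 100,  10, 200, 100],
    [100, 100, 200,  10, 100],
    [100, 100, 100, 100, 100],
]
score, pos = best_match(image, template)
print(score, pos)
```

The region of the template size located at the returned position would then be treated as the detected face region.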
The deviation value judgment section 403 judges the deviation value as a threshold value for use in gradation conversion processing by the gradation conversion processing section 404. The deviation value judgment section 403 supplies the judged deviation value as the threshold value to the gradation conversion processing section 404. It is to be noted that the deviation value judgment section 403 functions as a characteristic extracting section which performs characteristic extraction processing to extract the deviation value (threshold value) as the characteristic of the recognition object in the image of the compression object input by the image input section 401.
For example, the deviation value judgment section 403 judges the deviation value based on the luminance value of each pixel forming the face region detected by the face region detecting section 402. The deviation value is judged based on a minimum value (Min) and a maximum value (Max) of the luminance values of the respective pixels forming the face region. That is, when the face region detecting section 402 detects the face region, the deviation value judgment section 403 detects the maximum and minimum values from the luminance values of the respective pixels forming the detected face region. On detecting the minimum value of the face region, the deviation value judgment section 403 sets, as a first threshold value (Th1) serving as a first deviation value, a luminance value which is smaller than the detected minimum value by a predetermined value. On detecting the maximum value of the face region, the deviation value judgment section 403 sets, as a second threshold value (Th2) serving as a second deviation value, a luminance value which is larger than the detected maximum value by a predetermined value.
It is to be noted that as to the maximum and minimum values detected by the deviation value judgment section 403, the maximum and minimum values may be statistically detected in a distribution of pixel values of the respective pixels forming the face region. That is, the deviation value judgment section 403 may detect the minimum and maximum values from the pixel values which are not less than a certain frequency in the distribution of the pixel values of the respective pixels forming the face region. In this case, the pixel value of the pixel which appears by an influence of noises or the like can be prevented from being detected as the minimum or maximum value.
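The judgment of the threshold values (Th1, Th2) described above can be sketched as follows. This is a minimal illustration and not the implementation of the embodiment: the function name judge_thresholds, the parameter margin (standing in for the "predetermined value"), the parameter min_count (standing in for the "certain frequency"), and the assumption of 8-bit luminance values from 0 to 255 are all hypothetical.

```python
from collections import Counter

def judge_thresholds(face_pixels, margin=8, min_count=1):
    """Return (th1, th2) judged from the luminance values of the face region.

    face_pixels: iterable of integer luminance values of the face region.
    margin:      hypothetical "predetermined value" below Min / above Max.
    min_count:   hypothetical minimum frequency; rarer values are treated
                 as noise and ignored when Min and Max are detected.
    """
    histogram = Counter(face_pixels)
    frequent = [value for value, count in histogram.items()
                if count >= min_count]
    if not frequent:
        raise ValueError("no luminance value reaches min_count")
    th1 = max(0, min(frequent) - margin)    # first deviation value (Th1)
    th2 = min(255, max(frequent) + margin)  # second deviation value (Th2)
    return th1, th2

# Hypothetical face-region luminance values; 40 and 250 occur only once.
face = [40, 90, 90, 95, 95, 100, 100, 100, 110, 110, 250]
print(judge_thresholds(face))               # plain Min and Max
print(judge_thresholds(face, min_count=2))  # isolated values ignored
```

With min_count=2, the isolated values 40 and 250 no longer widen the range, which corresponds to the statistical detection of the minimum and maximum values.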
The gradation conversion processing section 404 performs the gradation conversion processing with respect to the image based on the threshold value judged by the deviation value judgment section 403. As described above, the threshold value as the deviation value is judged based on the minimum and maximum values of the face region. Therefore, the gradation conversion processing section 404 can perform the gradation conversion processing while retaining the pixel value of each pixel forming the face region which is the recognition object.
In other words, a region other than the face region detected by the face region detecting section 402 is a region (background region) of a background image of the face image. That is, a pixel having a luminance value larger than the maximum value, or a pixel having a luminance value smaller than the minimum value, is a pixel of the background region. When the image is subjected to the gradation conversion processing using the threshold value (Th1) based on the minimum value, each pixel in the background region whose luminance value is smaller than the threshold value (Th1) is converted into a predetermined luminance value. Similarly, when the image is subjected to the gradation conversion processing using the threshold value (Th2) based on the maximum value, each pixel in the background region whose luminance value is larger than the threshold value (Th2) is converted into the predetermined luminance value.
Therefore, in the image compression device 400, the gradation conversion processing can be performed even with respect to the image including the face image as the recognition object having a complicated distribution of the luminance values without influencing the face image as the recognition object. As a result, in the image compression device 400, redundancy is imparted, and a compression efficiency can be enhanced while retaining characteristics as recognition objects.
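The gradation conversion using the two threshold values can be sketched as follows. The sketch assumes that the "predetermined luminance value" is the threshold value itself unless another value is given; that choice, the function name convert_gradation, and the list-of-rows image representation are assumptions for illustration only.

```python
def convert_gradation(image, th1, th2, low_value=None, high_value=None):
    """Flatten pixels outside [th1, th2]; keep pixels inside unchanged.

    low_value / high_value are the hypothetical predetermined luminance
    values; by assumption they default to th1 and th2 respectively.
    """
    low_value = th1 if low_value is None else low_value
    high_value = th2 if high_value is None else high_value
    converted = []
    for row in image:
        new_row = []
        for value in row:
            if value < th1:
                new_row.append(low_value)    # darker than Th1: background
            elif value > th2:
                new_row.append(high_value)   # brighter than Th2: background
            else:
                new_row.append(value)        # within the face range: retained
        converted.append(new_row)
    return converted

# Hypothetical 2 x 4 image; values 90, 95 and 100 lie in the face range.
image = [[10, 20, 95, 240],
         [15, 90, 100, 250]]
result = convert_gradation(image, th1=82, th2=118)
print(result)
```

Because Th1 lies below the minimum value and Th2 above the maximum value of the face region, applying the conversion to the whole image leaves every face pixel untouched while collapsing the background to a few repeated values.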
Next, an example of the deviation value by the deviation value judgment section 403 will be described.
In the example of the histogram shown in
Moreover, the threshold value (Th1) shown in
Therefore, as to the image gradation-converted using the threshold values (Th1, Th2), the face image as the recognition object included in the image is retained. Moreover, the background region of the image is converted into a certain luminance value based on the threshold value. Therefore, the image of the background region can be provided with redundancy. As a result, in the image compression device 400 of the fourth embodiment, the image of the compression object can be subjected to the conversion processing which imparts the redundancy for raising the compression efficiency while retaining the luminance value of each pixel forming the face region as the recognition object.
As described above, in the image compression device of the fourth embodiment, the face region which is the object of the recognition processing is detected from the acquired image, the threshold value for the gradation conversion processing is judged based on the luminance value of each pixel forming the face region, the image is subjected to the gradation conversion processing by the threshold value, and the image subjected to the gradation conversion processing is coded.
In the image compression device and method of the fourth embodiment, the compression efficiency of the image can be raised while retaining the face image which is the object of the recognition processing in the image which is the object of the compression processing.
Next, a fifth embodiment will be described.
According to the fifth embodiment, as shown in
It is to be noted that the coding section 505 and the data output section 506 are similar to the coding section 103 and the data output section 104 in the first embodiment. Therefore, detailed description of the coding section 505 and data output section 506 will be omitted. Additionally, the coding section 505 performs coding processing with respect to a difference image supplied from the difference value calculating section 504.
The image input section 501 continuously inputs a plurality of images including characteristics which are objects of recognition processing. For example, the image input section 501 continuously inputs a plurality of images each including a face image as the characteristic which is the object of the recognition processing. The image input section 501 comprises, for example, a television camera or the like. The image input section 501 successively supplies each of the continuously input images to the face region detecting section 502.
Moreover, it is assumed that the respective images continuously input by the image input section 501 are digital image data in which values (pixel values) of pixels forming each image are represented by digital values. For example, each of the images continuously input by the image input section 501 is digital image data including 512 pixels in a lateral direction and 512 pixels in a longitudinal direction. Here, description will be given assuming that each image continuously input by the image input section 501 is image data in which the pixel value of each pixel is represented by a luminance value. It is to be noted that each image continuously input by the image input section 501 may be image data in which the value (pixel value) of each pixel is represented by a density value.
The face region detecting section 502 performs processing to detect a region (face region) of the face image with respect to each image successively supplied from the image input section 501. It is to be noted that the face detection processing performed with respect to each image by the face region detecting section 502 is similar to that by the face region detecting section 402 in the fourth embodiment. Therefore, detailed description of the face region detecting section 502 will be omitted.
The positioning section 503 functions as a judgment processing section which judges characteristics of the image input by the image input section 501. The positioning section 503 judges the characteristics of the image input by the image input section 501, and selectively positions each image based on the judged characteristics of the image. That is, the positioning section 503 judges whether or not to perform positioning of the image and the previous image based on the face region of each image detected by the face region detecting section 502.
When it is judged that the positioning is not performed, the positioning section 503 supplies the image as such to the difference value calculating section 504. When it is judged that the positioning is to be performed, the positioning section 503 matches the position of the face region in the image with that of the face region in the immediately previous image. In this case, the positioning section 503 supplies to the difference value calculating section 504 the image in which the face region is positioned depending on the position of the face region in the immediately previous image.
It is to be noted that the positioning section 503 can position the face region, for example, using the position of the center of gravity of the face region detected by the face region detecting section 502. In this case, the face region detecting section 502 supplies to the positioning section 503 information indicating the position of the center of gravity of the face region together with the information indicating the face region.
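The alignment by gravity position described above can be sketched as follows. The sketch assumes that the face region is reported as a rectangle (left, top, width, height), that the shift is rounded to whole pixels, and that pixels uncovered by the shift are filled with a fixed value; these choices and the function names centroid and align_to_previous are hypothetical.

```python
def centroid(region):
    """Center of gravity of a rectangle given as (left, top, width, height)."""
    left, top, width, height = region
    return (left + width / 2.0, top + height / 2.0)

def align_to_previous(image, region, previous_region, fill=0):
    """Shift image so its face centroid lands on the previous centroid."""
    cx, cy = centroid(region)
    px, py = centroid(previous_region)
    dx, dy = round(px - cx), round(py - cy)   # whole-pixel shift (assumption)
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                shifted[ny][nx] = image[y][x]
    return shifted

# Hypothetical toy data: the "face" (value 9) moved one pixel to the right.
current = [[0, 0, 0, 0],
           [0, 0, 9, 0],
           [0, 0, 0, 0]]
aligned = align_to_previous(current, region=(2, 1, 1, 1),
                            previous_region=(1, 1, 1, 1))
print(aligned)
```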
The difference value calculating section 504 calculates an image (difference image) as the difference value between the previous and subsequent images continuously input by the image input section 501. That is, the difference value calculating section 504 calculates the difference image between each image and the immediately previous image.
The difference value calculating section 504 calculates the difference image constituted of a difference value between the respective pixels based on a correspondence of the respective pixels in the previous and subsequent images depending on the characteristics of the image judged by the positioning section 503. The difference value calculating section 504 calculates the difference image between the positioned image and the immediately previous image especially with respect to the image positioned by the positioning section 503. The difference value calculating section 504 supplies the calculated difference image to the coding section 505.
Moreover, for example, the difference value calculating section 504 calculates the luminance value of each pixel in the difference image by the following equation 5:
f′_t(x, y) = f_t(x, y) − f_{t+1}(x, y), (Equation 5)
wherein f′_t(x, y) denotes the value of each pixel in the difference image, f_t(x, y) denotes the luminance value of each pixel in the immediately previous image, and f_{t+1}(x, y) denotes the luminance value of each pixel in the image regarded as a processing object.
According to the equation 5, as to the pixels having the equal luminance value in the same position in the previous and subsequent images, the value of the pixel in the difference image is “0”. When there are many pixels indicating “0” in the difference image, the difference image has high redundancy, and the compression efficiency of the difference image is high. That is, the more pixels having the equal value in the same position exist in the previous and subsequent images, the higher the redundancy is, and the more the compression efficiency is raised.
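The calculation of Equation 5 and the resulting proportion of “0” pixels can be sketched as follows. The function names difference_image and zero_ratio and the toy images are hypothetical; the subtraction order (previous image minus processing-object image) follows Equation 5.

```python
def difference_image(previous, current):
    """Per-pixel difference: previous image minus processing-object image."""
    return [[p - c for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]

def zero_ratio(diff):
    """Fraction of difference-image pixels whose value is 0."""
    values = [v for row in diff for v in row]
    return values.count(0) / len(values)

# Hypothetical toy data: a bright pixel (50) moves on a static background (10).
previous = [[10, 10, 10],
            [10, 50, 10]]
current = [[10, 10, 10],
           [10, 10, 50]]
diff = difference_image(previous, current)
print(diff, zero_ratio(diff))
```

Here the unchanged background pixels become 0 and only the two pixels touched by the moving object carry non-zero values, so a larger static background gives a larger share of zeros.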
Next, an example of the difference image obtained from two previous and subsequent images will be described.
In the images shown in
For example, there is assumed a case where a camera as the image input section 501 continuously photographs the image including the face image. In this case, when a photographing position of the camera as the image input section 501 is fixed, and photographing conditions are the same, a background is photographed as the same image in the same position in a plurality of images photographed by the camera.
Therefore, the larger the region of the background is in the image photographed by the camera, the more the pixels indicating “0” exist in the difference image. It is predicted that mainly the value of each pixel in the background region shown by slant lines in
On the other hand, the larger the face region is in the image photographed by the camera, the smaller the background region relatively becomes. In this case, pixels obtained from the background region and indicating “0” are reduced in the difference image. It is also predicted that the background does not move, but a person moves. Therefore, in a case where the face region is large in the whole image as shown in
Therefore, the positioning section 503 matches the positions of the face regions in the previous and subsequent images in a case where the face region is large in the whole image. When the positions of the face regions in the previous and subsequent images are matched, there are supposedly more pixels indicating “0” in the face region in the difference image between the previous and subsequent images. For example, when the position of the face region in the image shown in
Therefore, the positioning section 503 judges whether or not to position the face regions in the previous and subsequent images depending on whether or not the size of the face region in the whole image is not less than a predetermined size.
Next, the positioning processing by the positioning section 503 will be described.
When the face region detecting section 502 detects the face region in the image supplied from the image input section 501, the positioning section 503 calculates the size of the face region as a ratio of the face region to the whole image input by the image input section 501. It is to be noted that when the whole image input by the image input section 501 has a fixed size, the size of the face region itself may be detected instead.
On calculating the ratio of the face region occupying the whole image, the positioning section 503 judges whether or not the ratio of the face region occupying the whole image is not less than the preset threshold value (Th) (step S51). It is to be noted that when the size of the face region itself is detected, the positioning section 503 may judge whether or not the size of the face region is not less than the preset size.
When it is judged by the judgment that the ratio of the face region occupying the whole image is less than the threshold value (Th) (step S51, NO), the positioning section 503 ends the processing without performing the positioning processing of the face regions in the previous and subsequent images.
Moreover, when it is judged by the above-described judgment that the ratio of the face region occupying the whole image is not less than the threshold value (Th) (step S51, YES), the positioning section 503 performs the positioning processing of the face regions in the previous and subsequent images (step S52). In the positioning processing, the position of the face region in the image regarded as the processing object is matched with the position (e.g., the position of the center of gravity) of the face region in the immediately previous image.
As described above, the positioning section 503 positions the face regions in the previous and subsequent images in a case where the size of the face region is not less than a predetermined threshold value, and does not position any image in a case where the size of the face region is less than the predetermined threshold value.
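The judgment of whether or not to perform the positioning can be sketched as follows. The rectangle format (left, top, width, height), the function name should_position, and the default threshold value of 0.3 are assumptions for illustration; the description does not state a concrete value of the threshold value (Th).

```python
def should_position(face_region, image_width, image_height, th=0.3):
    """Return True when the face occupies at least the ratio th of the image.

    face_region: rectangle (left, top, width, height) (assumed format).
    th:          hypothetical value of the preset threshold value (Th).
    """
    _, _, width, height = face_region
    ratio = (width * height) / (image_width * image_height)
    return ratio >= th   # "not less than" the threshold value

# A 512 x 512 image as in the description; the face sizes are hypothetical.
large_face = should_position((100, 100, 300, 300), 512, 512)
small_face = should_position((200, 200, 100, 100), 512, 512)
print(large_face, small_face)
```

A large face (about 34% of the image here) triggers positioning so that the face region contributes zeros to the difference image, whereas a small face (about 4%) is left unpositioned so that the static background does.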
In the processing by the positioning section 503, mainly the face regions in the previous and subsequent images are regarded as object regions of the difference in a case where the size of the face region is not less than the predetermined threshold value. Mainly the background regions in the previous and subsequent images are regarded as the object regions of the difference in a case where the size of the face region is less than the predetermined threshold value. In other words, the positioning section 503 switches the region regarded as the object of the difference to the face region or the background region depending on the size of the face region.
According to the processing by the positioning section 503, the difference value calculating section 504 can calculate the difference image having a high compression efficiency irrespective of the characteristics of the image input by the image input section 501.
As described above, in the image compression device of the fifth embodiment, the face regions are detected from a plurality of continuously acquired images, respectively, the previous and subsequent images are positioned depending on the size of the detected face region, the difference image is calculated from the positioned previous and subsequent images, and the calculated difference image is coded.
In the image compression device and method of the fifth embodiment, the redundancy in the difference image is improved, and the compression efficiency of the image can be raised based on the characteristics of the respective continuous images.
Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details and representative embodiments shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
2005-015629 | Jan 2005 | JP | national |