1. Field of the Invention
The present invention relates to a scheduling technique for plural processors in a layout analysis.
2. Description of the Background
Conventionally, in the function of scanning a paper document with an MFP to create an electronic document, there is known a technique for analyzing the layout of the scanned image data to extract a character area, a background area, an image area, and the like, and selecting the compression method most suitable for each extracted area, thereby improving both the compression efficiency of the scanned data and its visibility. For example, for an area extracted as a character area by the layout analysis, the shape of the characters is compressed using a binary compression technique such as MMR, JBIG, or JBIG2, and an area extracted as a background area or as an image area such as a photograph or a picture is compressed using a compression technique such as JPEG, JPEG2000, or HD Photo. The respective areas compressed by these different compression systems are then merged. Consequently, it is possible to prevent the deterioration in visibility in high-frequency portions of an image that would result from compressing the character area with JPEG or the like. It is also possible to create an image that generally has high compression efficiency.
There is also known a technique for applying OCR or the like to an area extracted as a character area and converting only the character area into a document.
As a technique related to the present invention, there are known an image processing apparatus that allocates plural colors to a character area, an image processing method for the image processing apparatus, and a storage medium for the image processing method (JP-A-2003-008909).
However, the layout analysis, the image processing for the respective areas, and the OCR described above are heavily loaded and time-consuming kinds of processing. In addition, as the accuracy of the layout analysis and the character recognition and the image quality of the electronic document to be created are improved, the processing amount further increases. As a result, a relatively long time is required until the electronic document is obtained.
To cope with such a problem, there is known a technique for, instead of sequentially performing these kinds of processing, using plural processors or multi-core processors, allocating processing for each of the areas to the respective processors, and parallelizing the processing to reduce processing time.
However, in the parallelization of the processing for each of the areas, processing times for the respective areas are different and are not fixed. In order to efficiently use plural calculation resources, it is necessary to schedule the processing loads on the respective calculation resources with good balance.
It is an object of an embodiment of the present invention to provide a technique that can efficiently allocate processing for respective areas extracted by a layout analysis to plural calculation resources.
In order to solve the problem, an image forming apparatus according to an aspect of the present invention is an apparatus that processes image data using plural processors that operate in parallel. The image forming apparatus includes an image-data receiving section that receives inputted image data, a layout analyzing section that analyzes a layout structure including a predetermined area on the basis of the image data received by the image-data receiving section, a processing-amount calculating section that calculates a processing amount for the predetermined area in the layout structure of the image data analyzed by the layout analyzing section, and a processing-processor determining section that allocates, in processing for all areas in the layout structure analyzed by the layout analyzing section, processing for the predetermined area to any one of the plural processors on the basis of the processing amount calculated by the processing-amount calculating section.
An image processing apparatus according to another aspect of the present invention is an apparatus that processes image data using plural processors that operate in parallel. The image processing apparatus includes an image-data receiving section that receives inputted image data, a layout analyzing section that analyzes a layout structure including a predetermined area on the basis of the image data received by the image-data receiving section, a processing-amount calculating section that calculates a processing amount for the predetermined area in the layout structure of the image data analyzed by the layout analyzing section, and a processing-processor determining section that allocates, in processing for all areas in the layout structure analyzed by the layout analyzing section, processing for the predetermined area to any one of the plural processors on the basis of the processing amount calculated by the processing-amount calculating section.
An image processing method according to still another aspect of the present invention is a method of processing image data using plural processors that operate in parallel. The image processing method includes receiving inputted image data, analyzing a layout structure including a predetermined area on the basis of the received image data, calculating a processing amount for the predetermined area in the analyzed layout structure of the image data, and allocating, in processing for all areas in the analyzed layout structure, processing for the predetermined area to any one of the plural processors on the basis of the calculated processing amount.
Embodiments of the present invention will be hereinafter explained with reference to the accompanying drawings.
As shown in
The processor 10 is a symmetrical multiprocessor including four equivalent PEs (processor elements) 101 to 104. The processor 10 may be a multi-core processor. The multi-core processor may be a heterogeneous processor or may be a homogeneous processor. The processor 10 may be an asymmetrical multiprocessor. The number of PEs of the processor 10 may be any number as long as there are plural PEs.
The processor 10 includes a layout analyzing section 201 (a processing-amount calculating section), an image processing section 202, an OCR processing section 203, a processing-time measuring section 204, and a processing determining section 205 (a processing-processor determining section) shown in
First, the layout analyzing section 201 is explained. The layout analyzing section 201 analyzes a layout structure of image data inputted by the scan IF 40. Specifically, the layout analyzing section 201 analyzes image data including areas of sentences and images shown in
A specific example of an analysis of a character area by the layout analyzing section 201 is described below.
First, the layout analyzing section 201 generates a histogram for image data subjected to luminance conversion and calculates a threshold from the histogram. Then, the layout analyzing section 201 binarizes the image data on the basis of the threshold, identifies characters in the binarized image data using edge extraction and labeling processing, and extracts the characters. Finally, the layout analyzing section 201 discriminates character areas on the basis of intervals among the extracted characters.
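The histogram, threshold, and binarization steps described above can be sketched as follows. This is only an illustrative sketch: the description does not state how the threshold is derived from the histogram, so Otsu's method is assumed here as one common histogram-based rule. The function names and the 8-bit luminance range are likewise assumptions, and the edge extraction, labeling, and character-interval steps are omitted.

```python
def luminance_histogram(pixels, levels=256):
    """Count how many pixels have each luminance value (assumed 0..levels-1)."""
    hist = [0] * levels
    for row in pixels:
        for value in row:
            hist[value] += 1
    return hist


def otsu_threshold(hist):
    """Pick the threshold that maximizes the between-class variance.

    Assumption: the source only says a threshold is calculated from the
    histogram; Otsu's method is swapped in as one possible rule.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level, count in enumerate(hist):
        weight_dark += count
        sum_dark += level * count
        weight_bright = total - weight_dark
        if weight_dark == 0 or weight_bright == 0:
            continue  # one class is empty, so no split at this level
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(pixels, threshold):
    """Mark dark pixels (candidate character strokes) as 1, the rest as 0."""
    return [[1 if value <= threshold else 0 for value in row] for row in pixels]


# Hypothetical 2x3 luminance image: dark strokes on a bright background.
sample = [[10, 10, 200], [200, 12, 210]]
threshold = otsu_threshold(luminance_histogram(sample))
binary = binarize(sample, threshold)
```

Dark pixels are treated as character candidates in this sketch; an image with light text on a dark background would need the comparison inverted.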
After discriminating types of the areas described above, the layout analyzing section 201 further analyzes each of the areas and calculates parameter values for each of the areas. Examples of the parameter values to be calculated include, as shown in
These parameter values are values normalized between 0 and 1 as indicated by remarks in
The image processing section 202 and the OCR processing section 203 are explained. The image processing section 202 applies image processing to each of the areas analyzed by the layout analyzing section 201. Specifically, the image processing section 202 executes compression and filter processing by a system that does not spoil visibility of an area allocated thereto. For example, when a type of the area is an image, the image processing section 202 compresses the area with JPEG. When a type of the area is graphics, the image processing section 202 compresses the area with GIF. When a type of the area is a character and OCR is not executed, the image processing section 202 compresses the area with a binary compression technique such as MMR. The OCR processing section 203 executes OCR on a character area. The image processing section 202 and the OCR processing section 203 execute the processing described above on the basis of an instruction of the processing determining section 205. When processing for all the areas in the image data is completed, the image processing section 202 finally merges all the areas.
The processing-time measuring section 204 is explained. The processing-time measuring section 204 measures time of the processing by the image processing section 202 and the OCR processing section 203 and stores the measured processing time, i.e., a processing amount in each of the PEs 101 to 104, in the HDD 20. In this embodiment, since the PEs 101 to 104 are the same PEs, the processing-time measuring section 204 may measure a processing load. However, when the processor 10 includes different PEs, it is necessary to measure processing time.
The processing-time measuring section 204 calculates the “weight of an object area”, the “maximum weight of an area type”, the “sum of weights of processing for an object area”, and the “sum of maximum weight of processing” shown in
The processing determining section 205 is explained. The processing determining section 205 performs scheduling for processing. Specifically, the processing determining section 205 allocates the various kinds of compression processing and the OCR processing to the respective PEs 101 to 104. Scheduling for the processing is explained below with reference to
The processing A shown in
On the other hand, the processing B is scheduling processing for allocating the processing 1 to the processing 10 taking into account processing times of the processing 1 to the processing 10 to minimize a difference between a sum of processing times in the PE 1 and a sum of processing times in the PE 2. By allocating the processing 1 to the processing 10 to the PE 1 and the PE 2 in this way, it is possible to reduce overall processing time by 0.4 second compared with that in the processing A.
The processing determining section 205 performs scheduling taking into account processing time of each of the processing 1 to the processing 10 to minimize a difference in processing time among the PEs 101 to 104.
Allocation processing according to this embodiment is explained.
First, the processing determining section 205 determines whether all areas are allocated to the PEs 101 to 104 (S101).
When there are areas not allocated to the PEs 101 to 104 (unallocated areas) (S101, NO), the processing determining section 205 selects any one of the unallocated areas (S102).
When the unallocated area is selected by the processing determining section 205, the layout analyzing section 201 calculates an evaluation value of the unallocated area selected by the processing determining section 205 (S103).
When the evaluation value of the unallocated area is calculated by the layout analyzing section 201, the processing determining section 205 selects, among the PEs 101 to 104, the PE for which the sum of evaluation values of the areas already allocated to it is the smallest (S104), allocates the unallocated area to the selected PE (S105), adds the evaluation value of the allocated area to the sum of evaluation values of the areas already allocated to that PE (S106), and determines again whether all the areas are allocated to any one of the PEs 101 to 104 (S101).
When all the areas are allocated to any one of the PEs 101 to 104 in step S101 (S101, YES), the processing determining section 205 finishes the allocation processing for the image data.
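The allocation flow of steps S101 to S106 can be sketched as a simple greedy loop. This is an illustrative sketch rather than the embodiment's actual implementation: the function name, the area names, and the evaluation values are hypothetical, and the evaluation values are assumed to be already calculated instead of being computed on demand in step S103.

```python
def allocate_areas(evaluation_values, num_pes=4):
    """Greedily assign each area to the PE with the smallest running sum.

    evaluation_values: dict mapping a hypothetical area name to its
    evaluation value (assumed precomputed by the layout analysis).
    Returns (assignment, pe_sums), where assignment maps area -> PE index.
    """
    pe_sums = [0.0] * num_pes          # sum of evaluation values per PE
    assignment = {}
    unallocated = list(evaluation_values)
    while unallocated:                 # S101: any unallocated areas left?
        area = unallocated.pop(0)      # S102: select one unallocated area
        value = evaluation_values[area]  # S103: its evaluation value
        # S104: PE with the smallest sum; ties go to the lowest index
        pe = min(range(num_pes), key=lambda index: pe_sums[index])
        assignment[area] = pe          # S105: allocate the area to that PE
        pe_sums[pe] += value           # S106: update the PE's sum
    return assignment, pe_sums


# Hypothetical areas and evaluation values, scheduled onto two PEs.
areas = {"text_1": 0.5, "photo_1": 0.25, "text_2": 0.25,
         "graphic_1": 0.125, "text_3": 0.125}
assignment, pe_sums = allocate_areas(areas, num_pes=2)
# pe_sums is [0.625, 0.625]: both PEs end up with the same total.
```

Because step S102 allows any unallocated area to be selected, the order of selection is a free choice. The sketch takes the areas in input order; taking the areas with the largest evaluation values first is a common variation of this kind of greedy scheduling, but it is not stated in the description.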
As described above, the controller 1 according to this embodiment can perform the processing for the respective areas of the image data at high speed by calculating processing loads on the respective areas and allocating the processing for the respective areas to the PEs 101 to 104 taking into account the calculated processing loads to minimize a difference in a sum of processing loads among the PEs 101 to 104.
A second embodiment of the present invention is explained.
This embodiment is different from the first embodiment in that a degree of importance is added as a weighting coefficient to each of the parameters for the compression and OCR processing for each of the areas, and an evaluation value of processing for the area is calculated by taking the degree of importance into account. Because of this difference, components and operations for functions executed on the processor 10 are different from those in the first embodiment. The components and the operations different from those in the first embodiment are explained below.
As shown in
The degree of importance is explained. The degree of importance is a value added to each of the parameters and normalized to 0 to 1 in the same manner as an evaluation value. The degree of importance is a value for weighting all the parameters indicated by 0 to 1. The degree of importance is determined by the degree-of-importance determining section 206 for each kind of processing for image data. A more appropriate evaluation value of processing for each of the areas is calculated by adjusting the value of the degree of importance.
In this embodiment, operations of the image processing section 202, the OCR processing section 203, the processing-time measuring section 204, and the processing determining section 205 are the same as those in the first embodiment. However, operations of the layout analyzing section 201 are different from those in the first embodiment. Specifically, in calculating an evaluation value of processing for each of the areas, the layout analyzing section 201 multiplies each of the parameters for the processing by its degree of importance and then multiplies the weighted parameters together.
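The weighted evaluation value can be illustrated with a short sketch, assuming that each parameter is weighted by its degree of importance and that the weighted parameters are then multiplied together. The parameter names and values used here are hypothetical and are not taken from the embodiment.

```python
from math import prod


def evaluation_value(parameters, importance):
    """Multiply each parameter by its degree of importance, then multiply
    the weighted parameters together.

    parameters and importance are dicts keyed by the same (hypothetical)
    parameter names, with all values normalized to the range 0 to 1.
    """
    return prod(parameters[name] * importance[name] for name in parameters)


# Hypothetical parameters of one area and their degrees of importance.
params = {"area_size": 0.5, "complexity": 0.5}
weights = {"area_size": 1.0, "complexity": 0.5}
value = evaluation_value(params, weights)  # (0.5 * 1.0) * (0.5 * 0.5)
```

With every degree of importance set to 1, the result reduces to the plain product of the parameters, so lowering a degree of importance only ever reduces the contribution of that parameter to the evaluation value.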
Degree-of-importance determining processing according to this embodiment is explained.
First, the degree-of-importance determining section 206 determines whether all inputted image data have been processed (S201).
When not all of the inputted image data have been processed (S201, NO), the degree-of-importance determining section 206 selects any one of the parameters for the processing for each of the areas and changes the degree of importance of the selected parameter (S202). The parameter may be selected at random or may be selected according to a predetermined order.
When the degree of importance is changed and the processing for the respective areas forming the image data is performed by the image processing section 202 and the OCR processing section 203, the degree-of-importance determining section 206 acquires processing amounts of a PE having a largest processing load and a PE having a smallest processing load in this processing, which are measured in the PEs 101 to 104 by the processing-time measuring section 204, and calculates a difference between the processing amounts as a difference value (S203).
After calculating the difference value, the degree-of-importance determining section 206 compares a difference value in processing of image data inputted immediately before this processing (a difference value in the past) and the difference value calculated in step S203 (a present difference value) and determines whether the difference value in the past is larger than the present difference value (S204). The difference value in the past in this determination does not have to be the difference value in the processing of the image data inputted immediately before this processing and may be a difference value in processing of image data inputted earlier. The degree-of-importance determining section 206 can select a combination of better degrees of importance by referring to records in the past.
When the difference value in the past is larger than the present difference value (S204, YES), the degree-of-importance determining section 206 selects a combination of present degrees of importance (S205).
On the other hand, when the difference value in the past is equal to or smaller than the present difference value (S204, NO), the degree-of-importance determining section 206 selects a combination of degrees of importance in the processing of the image data immediately before this processing (S206).
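Steps S201 to S206 amount to a trial-and-keep adjustment loop, which can be sketched as follows. This is an illustrative sketch under several assumptions: the function names, the fixed step size, the random selection of the parameter, and the `measure_pe_loads` callback (a stand-in for the processing-time measuring section 204) are all hypothetical, and the "difference value in the past" is taken to be the difference value of the combination currently retained.

```python
import random


def difference_value(pe_loads):
    """S203: gap between the most loaded and the least loaded PE."""
    return max(pe_loads) - min(pe_loads)


def tune_importance(image_stream, importance, measure_pe_loads,
                    step=0.1, rng=None):
    """Adjust degrees of importance one parameter at a time.

    measure_pe_loads(image, importance) is a hypothetical callback that
    processes the image with the given degrees of importance and returns
    the measured load of each PE.
    """
    rng = rng or random.Random(0)
    current = dict(importance)
    past_difference = None
    for image in image_stream:                       # S201: data remain?
        candidate = dict(current)
        name = rng.choice(sorted(candidate))         # S202: pick a parameter
        delta = rng.choice((-step, step))            # and change its value,
        candidate[name] = min(1.0, max(0.0, candidate[name] + delta))
        present_difference = difference_value(       # S203
            measure_pe_loads(image, candidate))
        # S204: is the past difference larger than the present one?
        if past_difference is None or past_difference > present_difference:
            current = candidate                      # S205: keep present
            past_difference = present_difference
        # S206 otherwise: the previous combination stays selected
    return current, past_difference
```

The first image has no past difference value to compare against, so the sketch accepts its candidate unconditionally; a real implementation might instead start from a stored record of earlier runs.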
As explained above, degrees of importance are added to the respective parameters for processing of the areas forming image data, the degrees of importance are changed every time processing for the image data is performed, and a combination of degrees of importance giving a smaller difference in processing time among the PEs is selected. Consequently, for example, when image data of the same layout structure are continuously inputted, scheduling is gradually optimized and the processing for the image data can be executed more efficiently.
In the embodiments described above, when processing time for image data is shorter than processing time for scheduling, the scheduling does not have to be performed. As the PEs 101 to 104, PEs specialized for performing specific processing such as binary image processing, color image processing, and bit operation processing may be used. Processing for one area may be shared by the plural PEs 101 to 104. In the embodiments, it is assumed that the operations are executed in the MFP. However, the operations may be executed on, for example, a personal computer that includes a multiprocessor and is connected to a scanner.
The present invention has been explained in detail with reference to the specific embodiments. However, it would be obvious for those skilled in the art that various alterations and modifications of the embodiments can be made without departing from the spirit and the scope of the present invention.
As described above, according to the present invention, it is possible to provide a technique that can efficiently allocate processing for respective areas extracted by a layout analysis to plural calculation resources.
Number | Date | Country
---|---|---
60983424 | Oct 2007 | US