(1) Field of the Invention
The invention relates to a system for differentiating pictures and texts.
(2) Description of the Prior Art
When processing image files, distinguishing picture data from text image data so that the two can be processed separately is an important issue, because their data processing characteristics differ. For instance, in image processing equipment such as copiers, separating the picture and the text in advance improves the picture development result. In some cases the amount of picture data also differs greatly from the amount of text image data (the picture occupies far more data bits than the text). If the picture data and the text image data are separated properly, transmission efficiency improves, especially over the Internet, where bandwidth resources are limited.
Conventional techniques fall into two groups. In the space domain, which means the plane picture itself, the point, line and border searching method and the region growing method directly process the individual pixels of the picture. In the frequency domain, the Fourier transformation and wavelet transformation approaches treat the picture as waves and process it as a signal. The space-domain methods do not produce the desired result when separating the picture and the text. The Fourier transformation and wavelet transformation approaches, on the other hand, place greater demands on hardware, and their cost is higher.
Therefore, the present invention aims to provide a more efficient and simpler system for differentiating pictures and texts that overcomes the aforesaid problems.
Accordingly, the object of the invention is to provide a system and method for differentiating pictures and texts to differentiate picture data and text image data in image files in a more efficient and simpler fashion to facilitate data processing in the image processing equipment at later stages or data transmission through networks.
In one aspect, the invention provides a system and method to differentiate picture data and text image data in an image file. The image file corresponds to a picture which includes a plurality of pixels. Each pixel corresponds to a gray-level. The differentiating system includes a threshold setting module and a picture text differentiating module.
The threshold setting module compiles statistics on all pixels of the picture to generate a relationship between gray-level and quantity, and sets a threshold from that relationship according to a preset rule.
The picture text differentiating module first divides the picture into a plurality of unit areas of the same size, each containing the same number of pixels. It then sequentially examines the gray-levels of the pixels of each unit area, compares each gray-level with the threshold, and totals the number of times the gray-level equals the threshold (the times of equality).
The picture text differentiating module then compares the times of equality with a preset comparison value. When the times of equality is greater than the comparison value, the data corresponding to the unit area is treated as picture data. When the times of equality is smaller than the comparison value, the data corresponding to the unit area is treated as text image data.
In another aspect, the differentiating system further includes an error correction module. When the data type that the picture text differentiating module has assigned to a unit area differs from the data type of at least three of its neighboring unit areas, the error correction module changes the data of that unit area to the other data type.
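As a non-limiting illustration, the error correction may be sketched as follows. The sketch assumes that the neighboring unit areas are the four areas directly above, below, left and right of a unit area, and that every decision is taken from the labels as originally assigned; neither point is fixed by the text. The function name `correct_labels` and the labels `"P"` (picture) and `"T"` (text) are hypothetical.

```python
def correct_labels(labels):
    """Flip a unit area's label when at least three neighbours disagree with it.

    labels: grid (list of rows) of "P" (picture) or "T" (text).
    Assumption: neighbours are the 4-connected unit areas.
    """
    rows, cols = len(labels), len(labels[0])
    other = {"P": "T", "T": "P"}
    corrected = [row[:] for row in labels]  # decisions use the original grid
    for r in range(rows):
        for c in range(cols):
            differing = 0
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    if labels[nr][nc] != labels[r][c]:
                        differing += 1
            if differing >= 3:
                corrected[r][c] = other[labels[r][c]]
    return corrected


# A lone "picture" unit area surrounded by text is treated as an error.
grid = [
    ["T", "T", "T"],
    ["T", "P", "T"],
    ["T", "T", "T"],
]
print(correct_labels(grid))
```

Under the four-neighbor assumption, corner unit areas have only two neighbors and are therefore never changed by this sketch.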
In yet another aspect, the threshold setting module, based on the setting rule, takes the two greatest quantities in the relationship between the gray-level and the quantity and selects, as the threshold, the gray-level corresponding to the smallest quantity between them.
In still another aspect, the threshold setting module, based on the setting rule, selects, as the threshold, the average of the two gray-levels corresponding to the two greatest quantities in the relationship between the gray-level and the quantity.
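As a non-limiting illustration, the two setting rules may be sketched as follows. The sketch assumes that the relationship between gray-level and quantity is given as a list `hist`, where `hist[g]` is the number of pixels at gray-level `g`, and that the "two greatest quantities" are simply the two largest entries of that list. A practical histogram may need the two peaks to be well separated, which is not addressed here. All function names are hypothetical.

```python
def two_peak_levels(hist):
    """Gray-levels of the two largest counts, returned in ascending order."""
    ranked = sorted(range(len(hist)), key=lambda g: hist[g], reverse=True)
    low, high = sorted(ranked[:2])
    return low, high


def threshold_valley(hist):
    """Rule 1 (one reading): gray-level with the smallest count between the peaks."""
    low, high = two_peak_levels(hist)
    return min(range(low, high + 1), key=lambda g: hist[g])


def threshold_average(hist):
    """Rule 2: average of the two peak gray-levels (rounded down)."""
    low, high = two_peak_levels(hist)
    return (low + high) // 2


# Hypothetical 7-level histogram with peaks at gray-levels 1 and 5.
hist = [2, 9, 1, 4, 3, 8, 2]
print(threshold_valley(hist))   # 2
print(threshold_average(hist))  # 3
```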
Hence, by means of the system and method for differentiating pictures and texts of the invention, the threshold setting module generates a threshold, and the picture text differentiating module sequentially compares the gray-levels of a plurality of pixels with the threshold. By comparing the times of equality with the comparison value, the picture data and the text image data in the image file can be differentiated efficiently and simply, which facilitates data processing in the image processing equipment at later stages or data transmission through networks.
The present invention will now be specified with reference to its preferred embodiment illustrated in the drawings, in which
FIG. 1 is a schematic view of the differentiating system of the invention;
FIG. 2 is a schematic view of the image of the invention;
FIG. 3a is a chart showing the times of equality of picture data;
FIG. 3b is a chart showing the times of equality of text image data;
Refer to
Referring to
Referring to
The threshold setting module 40 compiles statistics on all pixels 36 to generate a relationship between the gray-level and quantity; that is, it counts the pixels 36 corresponding to each gray-level. Based on this relationship and a preset rule, the threshold setting module 40 sets one threshold. The threshold is stored in a storage device 46. The preset rule may vary. Two embodiments are discussed below.
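As a non-limiting illustration, the relationship between the gray-level and quantity may be built as a simple histogram. The sketch assumes that the pixels are supplied as a flat sequence of integer gray-levels in the range 0 to `levels - 1`; the function name and the default of 256 levels are hypothetical.

```python
from collections import Counter


def gray_level_histogram(pixels, levels=256):
    """Return a list whose entry g is the number of pixels with gray-level g."""
    counts = Counter(pixels)
    return [counts.get(g, 0) for g in range(levels)]


# Hypothetical 8-pixel image with 4 gray-levels.
print(gray_level_histogram([0, 0, 3, 3, 3, 1, 0, 3], levels=4))  # [3, 1, 0, 4]
```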
Referring to
Referring to
The picture text differentiating module 42 accumulates the times of equality from the comparison between the gray-level and the threshold. The times of equality is the number of times the curve crosses the horizontal line of the threshold in the drawings. For instance, it is nine times in FIG. 3a, and three times in
Because the gray-level of picture data varies in a more complicated way, its times of equality is greater. The gray-level of text image data varies more simply, so its times of equality is smaller. Hence, when the times of equality of the unit area 34 is greater than the comparison value, the data corresponding to the unit area 34 is treated as picture data. When the times of equality of the unit area 34 is less than the comparison value, the data corresponding to the unit area 34 is treated as text image data.
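As a non-limiting illustration, the counting and the comparison may be sketched as follows. The sketch assumes that the times of equality is counted as the number of times consecutive gray-levels in a unit area move from one side of the threshold to the other, which is one reading of the curve crossing the threshold line. The case in which the count exactly equals the comparison value is not specified in the text; the sketch treats it as text image data. All names and sample values are hypothetical.

```python
def count_crossings(gray_levels, threshold):
    """Count how often the gray-level sequence crosses the threshold."""
    sides = [g >= threshold for g in gray_levels]
    return sum(1 for a, b in zip(sides, sides[1:]) if a != b)


def classify_unit(gray_levels, threshold, comparison_value):
    """Return "picture" for many crossings, "text" otherwise.

    Assumption: a count equal to comparison_value is treated as "text".
    """
    crossings = count_crossings(gray_levels, threshold)
    return "picture" if crossings > comparison_value else "text"


busy = [10, 200, 30, 180, 90, 220, 40, 160]    # picture-like variation
flat = [250, 250, 20, 20, 20, 250, 250, 250]   # text-like variation
print(count_crossings(busy, 128), classify_unit(busy, 128, 4))  # 7 picture
print(count_crossings(flat, 128), classify_unit(flat, 128, 4))  # 2 text
```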
For instance, in the enlarged drawing shown in
Referring to
Finally, the picture data and text image data confirmed by the error correction module 44 are stored in the storage device 46. The data stored in the storage device 46 is retrieved for later data processing performed by the image processing equipment, or for data transmission through the networks.
Refer to
Refer to
Refer to
Step S02: Perform statistics of all pixels 36 of the image 32 to generate a relationship between the gray-level and quantity (the number of the pixels 36).
Step S04: After step S02, based on the relationship between the gray-level and quantity, set a threshold according to a preset setting rule. The method for setting the threshold may refer to the embodiments shown in
Step S06: Divide the image 32 into a plurality of unit areas 34 of the same size. Each of the unit areas 34 contains the same number of pixels 36.
Step S08: Perform statistics sequentially of the gray-levels of the pixels 36 in each unit area 34, and compare the gray-level with the threshold, and accumulate the times of equality between the gray-level and the threshold.
Step S10: After step S08, compare the times of equality with a preset comparison value to determine whether the times of equality is greater than the comparison value.
Step S12: Treat the data corresponding to the unit area 34 as picture data when the times of equality is greater than the comparison value.
Step S14: Treat the data corresponding to the unit area 34 as text image data when the times of equality is less than the comparison value.
Step S16: Perform error corrections for step S12 and step S14. Determine whether the data corresponding to the unit area 34 is different from the data corresponding to at least three neighboring unit areas 34.
Step S18: Change the data corresponding to the unit area 34 to the different data when the picture text differentiating module 42 confirms that the data corresponding to the unit area 34 is different from the data corresponding to at least three neighboring unit areas 34.
Step S20: Maintain the data corresponding to the unit area 34 unchanged from the original determination when the picture text differentiating module 42 confirms that the data corresponding to the unit area 34 is not different from the data corresponding to at least three neighboring unit areas 34. Thereby the picture data and the text image data in the image file may be differentiated.
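As a non-limiting illustration, steps S06 to S14 may be sketched as follows, with the threshold of step S04 passed in as a parameter. The sketch assumes that the image is a list of rows of gray-levels, that its dimensions are exact multiples of the unit area size, and that the pixels of each unit area are scanned row by row; the scanning order is not fixed by the text. All names and sample values are hypothetical.

```python
def split_into_units(image, unit_h, unit_w):
    """Step S06: yield (unit_row, unit_col, pixels) for each unit area.

    pixels holds the unit area's gray-levels, scanned row by row.
    """
    rows, cols = len(image), len(image[0])
    if rows % unit_h or cols % unit_w:
        raise ValueError("image size must be a multiple of the unit size")
    for r0 in range(0, rows, unit_h):
        for c0 in range(0, cols, unit_w):
            pixels = [
                image[r][c]
                for r in range(r0, r0 + unit_h)
                for c in range(c0, c0 + unit_w)
            ]
            yield r0 // unit_h, c0 // unit_w, pixels


def label_units(image, unit_h, unit_w, threshold, comparison_value):
    """Steps S08 to S14: label every unit area as "picture" or "text"."""
    grid = [
        [None] * (len(image[0]) // unit_w)
        for _ in range(len(image) // unit_h)
    ]
    for ur, uc, pixels in split_into_units(image, unit_h, unit_w):
        # S08: count threshold crossings within the unit area.
        sides = [g >= threshold for g in pixels]
        crossings = sum(1 for a, b in zip(sides, sides[1:]) if a != b)
        # S10 to S14: compare with the preset comparison value.
        grid[ur][uc] = "picture" if crossings > comparison_value else "text"
    return grid


# Hypothetical 2 x 8 image split into two 2 x 4 unit areas.
image = [
    [10, 200, 30, 180, 250, 250, 250, 250],
    [220, 40, 160, 20, 250, 20, 20, 250],
]
print(label_units(image, 2, 4, threshold=128, comparison_value=3))
# [['picture', 'text']]
```

The resulting grid of labels is the input on which the error correction of steps S16 to S20 would then operate.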
Therefore, by means of the system and method for differentiating pictures and texts of the invention, the threshold setting module 40 generates the threshold, and the picture text differentiating module 42 sequentially compares the gray-levels of a plurality of pixels 36 in the unit area 34 with the threshold. By comparing the times of equality with the comparison value, the picture data and the text image data in the image file may be differentiated efficiently and easily, so that the image processing equipment can process the data at later stages or the data can be transmitted through the networks.
While the preferred embodiments of the present invention have been set forth for the purpose of disclosure, modifications of the disclosed embodiments of the present invention as well as other embodiments thereof may occur to those skilled in the art. Accordingly, the appended claims are intended to cover all embodiments which do not depart from the spirit and scope of the present invention.