1. Field of the Invention
The present invention relates generally to image processing and, more specifically, to analyzing the content of a scanned image.
2. Description of the Related Art
A scanner is a computer peripheral device, or a portion of a multifunction or “all-in-one” machine (e.g., scanner/printer/copier/fax), that digitizes a document placed in it. The resulting image data can then be provided to a computer or otherwise processed, printed, faxed, or e-mailed. Scanners are known that analyze the content of the image data to facilitate operations such as detecting where the image is on the scanner glass and using that information to perform an operation known as “auto-fit” or “fit-to-page,” and optimizing printing settings that may depend upon the content of the document (e.g., text, photograph, business graphics, mixed, etc.) or the document medium (e.g., glossy/reflective paper, transparency, film, or plain paper).
Scanners that have been incorporated into multifunction machines typically perform a copy operation by repeating the steps, in a pipelined fashion, of scanning part of the image, storing it in memory, processing the stored image portion, then printing the processed portion from memory. This pipelining method is used to minimize memory requirements and to perform scanning and printing in parallel to the greatest extent possible. In such multifunction machines, for reasons of economy, there is typically neither a great amount of memory nor a great amount of processing power. Image pipeline processing is typically controlled by essentially a single chip, such as an application-specific integrated circuit (ASIC). The image portion that is scanned, stored and processed may be a band comprising a number (“N”) of scan lines. In other words, the image is broken up into a number of bands, each comprising N scan lines. In
The image processing that has been performed on a band-by-band basis in certain multifunction machines consisted of detecting and counting the number of black pixels, white pixels and color pixels in each band, creating a histogram, and making a decision based upon the histogram as to whether the image is most appropriately classified as text, picture, or graphics. Then, based on the classification of the image, printer settings can be adjusted accordingly.
It would be desirable to provide an image analysis method and system for multifunction machines that facilitates performing a broader variety of operations than is possible with conventional processing and yet does not require an excessive amount of memory or processing power in the image pipeline ASIC or other chip. The present invention addresses this need and others in the manner described below.
The present invention relates to a method and system for analyzing scanned image content. In some embodiments of the invention, the system can be included in an application-specific integrated circuit (ASIC) or other integrated circuit chip. The image data is received from a scanner or other scanning device, such as the scanning subsystem of a multifunction (e.g., scanner/printer/copier) machine. A generally rectangular grid of sub-regions is defined over the pixels of the image data. In each sub-region or, alternatively, a pixel group comprising a plurality of adjacent sub-regions, the number of pixels within each of a number of pixel categories is counted or otherwise quantified. For example, the categories may be black pixels, white pixels, gray pixels and color pixels, or some suitable combination of two or more of these.
The count or, equivalently, a value derived from a count or from which a count is derivable, such as a percentage, is compared with a predetermined pixel distribution. The distribution may be, for example, a threshold percentage of black pixels, a threshold percentage of white pixels, a threshold percentage of gray pixels, and a threshold percentage of color pixels, or some suitable combination of two or more of these. In response to this comparison with the predetermined pixel distribution, the sub-region or the pixel group is characterized as being of one of a plurality of types. For example, the types may include whitespace, non-whitespace, text, graphics and so forth.
An image processing operation is then performed in response to the characterization. For example, if whitespace is found bordering a central area of text or graphics, the image processing operation can include automatically fitting the central area to page-size or detecting a margin. Similarly, for example, if one area of a document is characterized as text and another area is characterized as graphics, image-enhancement parameters can be selected for the text area that are optimal for text, while other image-enhancement parameters can be selected for the graphics area that are optimal for graphics. In addition to or alternatively to these exemplary operations, any other suitable operation of the types commonly performed in scanner systems or multifunction machines can be performed.
Additional embodiments and advantages of the invention will be set forth in part in the description that follows, and in part will be obvious from the description herein, or may be learned by practice of the invention. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The accompanying drawings illustrate one or more embodiments of the invention and, together with the written description, serve to explain the principles of the invention. Wherever possible, the same reference numbers are used throughout the drawings to refer to the same or like elements of an embodiment, and wherein:
As illustrated in
As illustrated in
As illustrated in
At step 48, the number of pixels of each of a number of pixel categories within each sub-region is counted. For example, in some embodiments of the invention, there can be the following four categories or a subset thereof: color pixels, gray pixels, black pixels and white pixels. In an embodiment in which these four categories are counted, in each sub-region, the number of color pixels is counted, the number of gray pixels is counted, the number of black pixels is counted, and the number of white pixels is counted. In some embodiments, the categories can include white pixels and non-white pixels. Black pixels, gray pixels and color pixels are examples of non-white pixels. These teachings and examples will lead persons skilled in the art to consider still other pixel categories that may be useful in other embodiments of the invention.
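By way of illustration only, the counting of step 48 might be sketched in Python as follows. The names `categorize` and `count_band`, the RGB pixel representation, and the cut-off values `WHITE_MIN`, `BLACK_MAX` and `CHROMA_MIN` are assumptions made for this sketch; the specification does not define how an individual pixel is assigned to a category.

```python
from collections import Counter

# Hypothetical cut-offs; the specification does not define them.
WHITE_MIN = 230   # all channels at or above this value -> white
BLACK_MAX = 40    # all channels at or below this value -> black
CHROMA_MIN = 30   # spread between channels at or above this value -> color


def categorize(pixel):
    """Assign one (r, g, b) pixel to 'color', 'white', 'black' or 'gray'."""
    r, g, b = pixel
    if max(r, g, b) - min(r, g, b) >= CHROMA_MIN:
        return "color"
    if min(r, g, b) >= WHITE_MIN:
        return "white"
    if max(r, g, b) <= BLACK_MAX:
        return "black"
    return "gray"


def count_band(band, sub_width):
    """Return one Counter of category counts per sub-region of a band.

    `band` is a list of scan lines, each a list of (r, g, b) tuples.
    Each sub-region is `sub_width` pixels wide and as tall as the band.
    """
    width = len(band[0])
    n_sub = (width + sub_width - 1) // sub_width  # round up for a partial last sub-region
    counts = [Counter() for _ in range(n_sub)]
    for line in band:
        for x, pixel in enumerate(line):
            counts[x // sub_width][categorize(pixel)] += 1
    return counts
```

In this sketch each scan line is visited once and only a handful of counters per sub-region are retained, which is in keeping with the limited memory contemplated for the image pipeline.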
At step 50, the pixel counts are compared to one or more pixel distributions. A pixel distribution characterizes a group of one or more sub-regions as having the characteristics of a certain type of content, such as whitespace, text, graphics, non-whitespace (i.e., text or graphics; anything but whitespace), etc. The term “graphics” includes photographic images, business graphics, drawings, clip art and other similar images. For example, referring to
The pixel distribution can be empirically determined, or can be defined mathematically or algorithmically in any suitable manner. The term “distribution” is used for convenience in this patent specification and is not intended to imply any specific mathematical or algorithmic concept. In the illustrated embodiment, a distribution can be defined by a set of upper and lower threshold values against which the counts or percentages of pixels in the various categories are compared.
At step 52, the group of one or more sub-regions is characterized as representing one of several types of content. As indicated above, the characterization is made based upon or in response to the comparison of the counts or percentages of pixels in each category with the distribution or distributions. Thus, for example, if the counts or percentages fit a distribution associated with whitespace, the group is characterized as whitespace.
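One possible way to express the comparison of step 50 and the characterization of step 52 is sketched below in Python. A distribution is represented here as a mapping from pixel category to a (lower, upper) pair of percentage thresholds. The function names, the `DISTRIBUTIONS` table and its 98% whitespace threshold are hypothetical values chosen for illustration and are not taken from the specification.

```python
def to_percentages(counts):
    """Convert per-category pixel counts into percentages of the group total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


def fits(percentages, distribution):
    """True if every category lies within its (lower, upper) thresholds."""
    return all(
        lower <= percentages.get(category, 0.0) <= upper
        for category, (lower, upper) in distribution.items()
    )


def characterize(counts, distributions, default="non-whitespace"):
    """Return the first content type whose distribution the counts fit."""
    percentages = to_percentages(counts)
    for content_type, distribution in distributions.items():
        if fits(percentages, distribution):
            return content_type
    return default


# Hypothetical distribution: a group is whitespace if at least 98% of it is white.
DISTRIBUTIONS = {"whitespace": {"white": (98.0, 100.0)}}

print(characterize({"white": 99, "gray": 1}, DISTRIBUTIONS))    # whitespace
print(characterize({"white": 50, "black": 50}, DISTRIBUTIONS))  # non-whitespace
```

Additional content types could be supported in such a sketch by adding entries to the table, without changing the comparison logic.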
At step 54, an image processing operation is performed in response to the characterization. In other words, the image processing operation is one that depends upon or uses as one of its inputs the type of content. For example, margin detection is one well-known image processing operation performed in multifunction machines. In margin detection, only the printable region or region containing information is stored in memory and further processed or printed in order to minimize memory requirements and improve performance. In other words, only the image data bounded by the margins (whitespace) is processed. Margin detection is described in further detail below.
Another well-known image processing operation performed in multifunction machines is known as “auto-fit.” Auto-fit scales the image to fit the entire printed output page. As one step in the auto-fit process, a rectangular border that bounds the printable region is defined. The area defined by the border is then subjected to auto-fit or other processes. Still other image processing operations can include processing text differently from graphics. It would be desirable, for example, to use a different color print table for text than for graphics, or to apply different filters to text and graphics regions, or to apply a background removal operation to the text portion and not the graphics portions. All such image processing operations depend upon identifying regions representing such content types and their locations within the scanned image data.
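As a simple illustration of the scaling involved in auto-fit, the following Python sketch computes a single scale factor from a rectangular border and the output page dimensions. The function name `auto_fit_scale`, the (left, top, right, bottom) border representation and the choice to preserve the aspect ratio are assumptions made for this sketch; the specification does not state how the scale factor is derived.

```python
def auto_fit_scale(border, page_width, page_height):
    """Return a uniform scale factor that fits the bordered area to the page.

    `border` is (left, top, right, bottom) in the same units as the page
    dimensions. The smaller of the two axis ratios is used so that the
    scaled area fits on the page without distortion (an assumption).
    """
    left, top, right, bottom = border
    content_width = right - left
    content_height = bottom - top
    if content_width <= 0 or content_height <= 0:
        raise ValueError("border must enclose a non-empty area")
    return min(page_width / content_width, page_height / content_height)
```

For example, a 1000-by-1000 pixel bordered area printed on a 2550-by-3300 pixel page would be scaled by the smaller of 2.55 and 3.3 under these assumptions.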
The step of counting pixels of the various categories in each sub-region is illustrated in further detail in
As described above with regard to the method illustrated in
As illustrated in
As illustrated in
A method for identifying a rectangular border around the printable area of a scanned document is illustrated in
At steps 120 and 122, respectively, the left and right margins of a band of sub-regions are identified as described above with regard to
The process then continues at step 142 (
If it is determined at step 130 that the top margin has already been found, then at step 136 it is determined whether the bottom margin has already been found. If both the top and bottom margins have been found, the process continues at step 133. If the top margin has been found but the bottom margin has not been found, then at step 138 a flag is set that indicates the bottom margin has been found and is located in the current band (band K). Following step 138, the process continues at step 133.
At step 134, it is determined whether the last (bottom-most) band has been processed. If it has, all image borders have been found (block 135), and the process is complete. If it has not, the process returns to steps 120 and 122 to continue with the next band.
As noted above, steps 142 and 146 are performed following step 128. At step 142, it is determined whether the left margin of band K is less than (i.e., to the left of, with respect to the orientation of the document image) the furthest left margin. The furthest left margin is a value that indicates, of all bands, the margin that has thus far in the process been found to be closest to the left edge of the document image. If the left margin of band K is not less than the furthest left margin, the process continues at step 146. If the left margin of band K is less than the furthest left margin, then at step 144 the value for the furthest left margin is set to the value of the left margin of the current band (band K). The process then continues at step 146. At step 146, it is determined whether the right margin of band K is greater than (i.e., to the right of, with respect to the orientation of the document image) the furthest right margin. The furthest right margin is a value that indicates, of all bands, the margin that thus far in the process has been found to be closest to the right edge of the document image. If the right margin of band K is not greater than the furthest right margin, the process continues at step 133, where K is incremented. If the right margin of band K is greater than the furthest right margin, then at step 148 the value for the furthest right margin is set to the value of the right margin of the current band (band K). The process then continues at step 133. Determining the right and left margins of the current band K and determining whether or not the current band K left and right margins are furthest left or right can be reversed or done in parallel.
As described above, following step 133 it is determined whether the last band has been processed at step 134 and thus whether the process is completed or the next band is to be processed.
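The band-by-band border search described above can be approximated by the following Python sketch. It is a simplification under stated assumptions: each band is given as a list of booleans (true meaning a sub-region already characterized as non-whitespace), the top and bottom margins are taken to be the first and last bands containing any non-whitespace, and the function name `find_border` is hypothetical. The flag handling of the illustrated flow is not reproduced exactly.

```python
def find_border(grid):
    """Return (left, top, right, bottom) in sub-region/band indices, or None.

    `grid` is a list of bands; each band is a list of booleans in which
    True marks a sub-region characterized as non-whitespace.
    """
    top = bottom = None
    furthest_left = furthest_right = None
    for k, band in enumerate(grid):
        content = [i for i, is_content in enumerate(band) if is_content]
        if not content:
            continue  # band K is entirely whitespace
        left, right = content[0], content[-1]  # margins of band K
        if top is None:
            top = k  # first band containing content
        bottom = k  # most recent band containing content
        if furthest_left is None or left < furthest_left:
            furthest_left = left
        if furthest_right is None or right > furthest_right:
            furthest_right = right
    if top is None:
        return None  # the whole image is whitespace
    return (furthest_left, top, furthest_right, bottom)
```

Because only four values are carried from band to band in this sketch, each band can be discarded after it is examined.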
In the manner described above with regard to
In addition to margin-detection and auto-fit, there are many other image processing operations that can be performed once sub-region groups have been characterized as being of certain types. For example, the machine can process text regions differently from graphics regions. Text regions will typically have a certain percentage range of white pixels plus a certain percentage range of pixels that are either black or gray. Graphics regions will typically have a certain percentage range of non-white pixels that include a certain percentage range of pixels that are either color or gray. For example, text may be 40%-70% white, 30%-60% gray or black, and 0%-2% color. Similarly, for example, graphics may be less than 10% white or, alternatively, be more than 20% color. Persons skilled in the art are familiar with such characteristics of text, graphics, pictures and other content types and will readily be capable of defining such pixel distributions against which the counts or percentages can be compared to infer content type.
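The example ranges given above can be written out as a small Python sketch. The function name `classify_region`, the order in which the tests are applied and the "unknown" fallback are assumptions made for illustration; only the numeric ranges come from the example in the text.

```python
def classify_region(pct):
    """Classify a group of sub-regions from its category percentages.

    `pct` maps 'white', 'gray', 'black' and 'color' to percentages (0-100).
    Missing categories are treated as 0%.
    """
    white = pct.get("white", 0.0)
    dark = pct.get("gray", 0.0) + pct.get("black", 0.0)
    color = pct.get("color", 0.0)
    # Text: 40%-70% white, 30%-60% gray or black, 0%-2% color.
    if 40 <= white <= 70 and 30 <= dark <= 60 and color <= 2:
        return "text"
    # Graphics: less than 10% white or, alternatively, more than 20% color.
    if white < 10 or color > 20:
        return "graphics"
    return "unknown"  # fits neither example distribution


print(classify_region({"white": 60, "black": 35, "gray": 4, "color": 1}))  # text
print(classify_region({"white": 5, "gray": 45, "color": 50}))              # graphics
```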
It can be appreciated that by analyzing groups of one or more sub-regions, it can be inferred that, for example, the document image includes a text region separated from a graphics region by a horizontal or vertical line. For example, if a group of sub-regions in a band has less than 5% color pixels, and an adjacent group of sub-regions in the band has more than 5% color pixels, it can be inferred that the group with fewer color pixels is text and the other is graphics. Similarly, if a group of sub-regions having a very low percentage of white pixels is adjacent to a group of sub-regions having more white pixels, it can be inferred that the group with fewer white pixels is graphics. The machine can then optimize processing of such an image accordingly. For example, it can use a different color print table for the text and graphics regions, or apply a background removal process to the text region but not the graphics region.
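The 5% color example can be illustrated with the following Python sketch, which locates the positions within a band where adjacent sub-regions fall on opposite sides of the threshold. The function name `color_boundaries` is hypothetical, and treating a value of exactly 5% as text is an assumption, since the example does not address that case.

```python
def color_boundaries(color_pct, threshold=5.0):
    """Return indices where the inferred content type changes within a band.

    `color_pct` lists the percentage of color pixels in each sub-region,
    left to right. An index i in the result means that sub-regions i-1 and
    i were inferred to hold different content types.
    """
    labels = ["graphics" if p > threshold else "text" for p in color_pct]
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Three low-color sub-regions followed by two high-color sub-regions.
print(color_boundaries([0.5, 1.0, 0.0, 30.0, 42.0]))  # [3]
```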
The methods of the present invention can also be used to aid in compensating for localized noise in the image scan. It is known that a scanner may inherently have, for example, noise that results in streaks in the displayed or printed document image. By scanning the (white) cover of a flatbed scanner and counting the color and gray pixels in each sub-region as described above, information can be obtained that describes whether any part of the scan field is inherently noisier than another. The information can then be used in margin-detection, auto-fit and text/graphics detection to treat the noisy areas differently than other areas so as to compensate for the noise. For example, it may be determined that 3% of the pixels in the first three sub-regions in a certain band of the scan field are merely noise, but only 1% of the pixels in the remaining sub-regions of that band are either color or gray. Therefore, a subsequent image processing operation performed upon the first three sub-regions can reduce each of the predetermined thresholds that define the distributions (see
In the manner described above, scanned image content can be analyzed by identifying and quantifying each of a number of pixel categories, such as black, white, gray, color, non-white, etc., in sub-regions of a rectangular grid defined over the scanned-in image data. The counts or quantities derived from the counts (e.g., percentages) are compared with predetermined pixel distributions, and the sub-regions are characterized in response to the comparison. Subsequent image processing operations can then be optimized for the content type or types or to compensate for detected noise.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the scope or spirit of the invention. Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
This application is related to U.S. patent application Ser. No. 10/754,123 filed Jan. 9, 2004, entitled “METHOD AND APPARATUS FOR AUTOMATIC SCANNER DEFECT DETECTION” and assigned to the assignee of the current application.