1. Field of the Invention
The present invention relates generally to methods and systems for image processing, and more particularly to methods and systems for de-screening digitally scanned documents.
2. Description of Related Art
Almost all printed matter, except silver-halide photography, is printed using halftone screens. These halftone screens are traditionally optimized for the printing device, and may cause considerable halftone interference (visible large-area beating) and visible Moire patterns if not properly removed from the original scanned image. The successful removal of such screens without compromising text and line art quality is a fundamental key to quality document scanning and document segmentation and compression.
A method and a system for de-screening an image signal are disclosed. A filter bank filters an image signal and produces a set of filter output signals. The method and system utilize this bank of filters to provide several increasingly blurred versions of the original signal. At any given time, only two of these blurred versions are created, on a pixel-by-pixel basis. The outputs from the selected pair of blurred signals are then blended together to create a variable blending output that can vary smoothly and continuously from no blurring to maximum blurring. In addition, the method provides the capability to enhance text and line art by using a variable un-sharp masking mechanism with independent post-blur sharpening control, and the capability to detect and enhance neutral (no-color) output pixels.
The features and advantages of the present invention will become apparent from the following detailed description of the present invention in which:
A new method and system are described for de-screening digitally scanned documents such that potential halftone interference and objectionable Moire patterns are eliminated or substantially reduced. The objective of the method is to selectively eliminate the halftone screens from the scanned input signal, while preserving or enhancing the sharp edge information of text or line objects. This is accomplished even in cases where the two types of information are not spatially separated (e.g., text in tint).
The improved technique utilizes a bank of filters to provide several increasingly blurred versions of the original signal. At any given time, only two of these blurred versions are created, on a pixel-by-pixel basis. The outputs from the selected pair of blurred signals are then blended together to create a variable blending output that can vary smoothly and continuously from no blurring to maximum blurring.
In addition, the method provides the capability to enhance text and line art by using a variable un-sharp masking mechanism with independent post-blur sharpening control, and the capability to detect and enhance neutral (no-color) output pixels.
The method utilizes complex logic for determining how much to blur and/or sharpen individual pixels and for providing instantaneous enhancement control from one pixel to the next. The method eliminates the need for a second large-size contrast window. Also, the new halftone screen frequency and magnitude require significantly fewer operations.
The method of the present invention can be made fully programmable through the use of piecewise linear control functions and various threshold registers. The de-screening cutoff frequencies, degree of halftone screen removal, and choice of the amount of edge enhancement can all be adjusted and tuned for high-quality output. The present invention is applicable to any document-scanning product. One embodiment of the present invention was implemented in software and demonstrated to deliver excellent image quality across a wide range of screen frequencies and typography sizes.
It should be noted that the method and algorithms described below use the LAB color space. Alternatively, other color spaces such as YCbCr, RGB, etc. can be used where measures of luminance and neutrality are suitably modified. In many cases, linear YCbCr (before gamma correction) can produce better results than (non-linear) LAB.
The purpose of the de-screener system 20 is to detect incoming halftones in the input stream and selectively filter them out. The main objective is to filter out the halftones yet maintain the sharp edges of objects in line art on the page represented by the input image. At the same time, the de-screener system can optionally enhance text or line art objects with sharp edge definitions in order to not significantly compromise the quality of text and line art graphics. The two operations (filtering and enhancement) are tightly but independently controlled.
Referring once again to
The control module DSC 30 relies on two additional input signals from a Screen Frequency Estimate Module SEM (not shown). These are two 8-bit monochrome signals that provide the estimated screen frequency Scf 26 and estimated screen magnitude Scm 24.
The main processing path of the De-Screen Module 20 occurs along the top portion of
Much of the De-Screen work is performed in the Variable Triangular Blur Filter Unit, VTF 50. The Variable Triangular Blur Filter Unit is composed of a Filter Bank unit that can, at any given time, produce two out of five subsequently filtered versions of the input signal, each with an increasingly larger filter span. The top three bits of a bank signal Bnk 48 select which pair of filters is to be used. The outputs of the selected filters are then blended together by the amount of blending specified in the next lower two bits of the bank signal Bnk 48. The selection of which filters to blend and by what amount can change on a pixel-by-pixel basis, depending on the content of the 8-bit bank signal Bnk 48 (the number of bits can change).
The full-color blended output Blv 51 from the Variable Triangular Blur Filter Unit VTF 50 is forwarded to the Variable Sharpening and Neutral Unit VSN 52. The VSN 52 Unit provides capabilities to further enhance the blended signal Blv 51. These capabilities include sharpening the Blv 51 signal and controlling its neutrality on the chroma axes. The unit includes a built-in un-sharp mask filter that uses a heavily blurred version Blr 44 of the source Src 22 signal as a reference signal. The amount of sharpening is controlled by the 8-bit signal Shp 46. In addition, built-in chroma adjustment circuitry can force the output signal Dsv 176 to be neutral (a=b=128, equal to zero) or non-neutral (b=127, equal to −1), depending on the content of an 8-bit control signal Ntl 54.
The left-hand side of
The first F_11 Filter Unit BL5 32 filters the input color signal Src 22 to create the blurred color signal Bl5 58. The Bl5 58 signal is then further filtered through the second F_11 Filter Unit BLA 34 to produce the super-blurred reference color signal Blr 44, which is used in the Pixel Control Module PxC 42. Both Filter Units BL5 32 and BLA 34 apply a 2D separable triangular filter of size 11×11. The details of this filter are given below.
The additional amount of filtering is necessary to ensure that Blr 44 is a stable and relatively noise-free signal. The Blr 44 signal is used in the Pixel Control Module PxC 42, as a reference signal for the un-sharp mask filter inside DSV 40, as well as in a Segmentation module SEG (not shown).
The color output from the first F_11 Filter Unit BL5 32 is also forwarded to the Sparse Contrast Unit SC5 36. The Sparse Contrast Unit SC5 36 calculates the color contrast in a 5×5 window over the current pixel of interest. The resulting 8-bit monochrome contrast value Sc5 92 is further filtered through the third F_11 Filter Unit CLO 38, to produce the signal Clo 94. The filtered contrast signal Clo 94 is delivered to the Pixel Control module PxC 42.
As an implementation optimization, the CLO unit is optionally allowed to operate at ¼ the normal rate producing a ½ scaled version of Clo. The Clo signal is then used everywhere it is needed by doing simple nearest-neighbor 2× up scaling.
The Pixel Control module PxC 42 takes as inputs the blurred signal Blr 44, the filtered contrast value Clo 94, and the screen frequency Scf 26 and magnitude Scm 24 estimates from the Screen Frequency Estimate Module SEM. The Pixel Control module 42 produces an instantaneous decision, on a pixel-by-pixel basis, as to how much blurring is to be applied in the Variable Triangular Blur unit VTF 50. This decision is communicated to the Variable Triangular Blur unit VTF 50 for execution via the control signal Bnk 48. In addition, the Pixel Control module also generates additional enhancement controls in terms of the amount of sharpness Shp 46 and neutrality Ntl 54 for the Variable Sharpening and Neutral Unit VSN 52. The operation of the Pixel Control module PxC 42 is further detailed below.
The blurring filter arrangement is shown in
The input signal to each filter unit is a full-color signal, where the chroma channels are normally sub-sampled by a factor of two in the fast scan direction only. The 24-bit input signal Src 22 is fed to the first filter unit BL5 32 to produce the full-color filtered output labeled Bl5 58. The Bl5 58 signal is then fed to the second Filter Unit BLA 34 to produce the full-color filtered, super-blurred output Blr 44. Both Filter Units operate at the full input data rate, each producing an independent full-color filtered output. Since each filter covers 11×11=121 input pixels, the filter units BL5 32 and BLA 34 are quite compute-intensive. For this reason the filter coefficients have been restricted to simple integers in order to eliminate the need for a large number of multipliers.
The two Filter Units BL5 32 and BLA 34 are identical. Each Unit is composed of an 11×11 2D FIR filter that is symmetric and separable in shape. A 1-D discrete filter response 56 is shown in
In addition, each one of the Filter Units BL5 32 and BLA 34 applies independent 11×11 filtering on each one of the (L, a, b) color components at its input. However, since normally every two subsequent pixels have the same chroma (a, b) values, the chroma filters can be simplified as will be more fully described below. The 1D luminance filter shape is given by equation (1) as:
The overall 2-D response of the luminance F_11 filter is given by equation (2) as:
Different implementations may choose to either normalize each pass back to 8 bits using:
1/36=455/2^14 (using a rounding right shift)
or normalize the final pass only using:
1/36*1/36=809/2^20 (using a rounding right shift).
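The printed forms of equations (1) and (2) are not reproduced in this text. As a minimal sketch, the example below assumes that the 1-D F_11 weights are the integer triangle 1, 2, ..., 6, ..., 2, 1, whose sum of 36 is consistent with the 1/36 normalization quoted above. The function names are illustrative only and are not taken from the described embodiment.

```python
# Assumed F_11 weights: an integer triangle that sums to 36.
F11 = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]

def rounding_shift(value, shift):
    """Right shift with nearest-integer rounding (add half the divisor first)."""
    return (value + (1 << (shift - 1))) >> shift

def normalize_one_pass(tri_sum):
    """Scale an un-normalized 1-D sum by about 1/36 using 455 / 2^14."""
    return rounding_shift(tri_sum * 455, 14)

def normalize_two_pass(tri_sum_2d):
    """Scale an un-normalized 2-D sum by about 1/1296 using 809 / 2^20."""
    return rounding_shift(tri_sum_2d * 809, 20)

def filter_1d(pixels, center):
    """Un-normalized 11-tap triangular sum around `center` (no edge handling)."""
    return sum(w * pixels[center - 5 + i] for i, w in enumerate(F11))

# A flat region should pass through both normalization variants unchanged.
flat = [200] * 11
one_pass = normalize_one_pass(filter_1d(flat, 5))
two_pass = normalize_two_pass(36 * 36 * 200)
```

Normalizing only after the final pass keeps more intermediate precision, at the cost of wider accumulators between the two 1-D passes.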
The brute force method for filtering chroma subsampled images is to first expand the pixels to the same resolution as the luminance, filter using the same weights as the luminance, then average the results of adjacent chroma pairs in X to return to an X-subsampled chroma representation. This method can be simplified by knowing that the source and destination chroma pixels are subsampled in X.
For instance, the chrominance filter can be implemented as two alternating filters as shown in equations (3) and (4):
where x=0 represents the location of unused pixels not involved in the current filtering operation. The two chroma filters alternate every other pixel.
Normally, the output chroma is also subsampled in the same way as the input chroma. Therefore the chroma value for both pixels would be the average of the odd and even cases. This can also be computed in one step using a 12 wide filter with weights: {1,3,5,7,9,11,11,9,7,5,3,1} and a normalizing value of 1/(36*2).
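The relationship between the two alternating chroma filters and the 12-wide combined filter can be checked numerically. The sketch below again assumes the luminance weights 1, 2, ..., 6, ..., 2, 1. The per-pair weights it prints are derived from that assumption; they are not copied from equations (3) and (4), which are not reproduced in this text. All function names are hypothetical.

```python
F11 = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]  # assumed luminance weights (sum 36)

def combined_chroma_weights(taps):
    """Add the filter centered on the even pixel to the same filter centered
    on the odd pixel (shifted by one), giving a kernel one tap wider."""
    wide = [0] * (len(taps) + 1)
    for i, w in enumerate(taps):
        wide[i] += w        # filter centered on the even pixel
        wide[i + 1] += w    # filter centered on the odd pixel
    return wide

def pair_weights(taps, odd):
    """Collapse the taps onto chroma pairs, where each pair shares one value.
    `odd` selects whether the center pixel is the second pixel of its pair."""
    weights = {}
    for i, w in enumerate(taps):
        x = i - len(taps) // 2 + (1 if odd else 0)  # offset from pair start
        pair = x // 2                               # floor groups (2k, 2k+1)
        weights[pair] = weights.get(pair, 0) + w
    return [weights[k] for k in sorted(weights)]

wide12 = combined_chroma_weights(F11)
even_pairs = pair_weights(F11, odd=False)
odd_pairs = pair_weights(F11, odd=True)
print(wide12, sum(wide12))
print(even_pairs, odd_pairs)
```

Under the stated assumption, the combined kernel matches the 12 weights listed above and sums to 72, which agrees with the normalizing value of 1/(36*2).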
One approach to increase the filter efficiency is to increase the vertical context and process many lines in parallel. For example, the largest filter F_11 requires 11 lines of input to produce a single line of output (an efficiency of about 9%). The filter efficiency is improved with more input lines. For example, if the number of input lines is increased from 11 to 20, the filter could now generate 8 lines of output, and the efficiency goes up to 40%=8/20. However, this requires a larger input buffer to hold more lines, which implies a larger pipeline delay.
The Sparse Contrast Unit SC5 36 measures the amount of contrast of the first blurred signal Bl5 58 from the output of the Filter Unit BL5 32. The Bl5 58 signal itself is a blurred version of the full-color input signal Src 22 that was generated by passing Src 22 through the F_11 filter BL5 32. The BL5 32 input is a 24-bit (L,a,b) signal with a, b sub-sampled by a factor of 2× in the fast scan direction. As shown in
A block diagram of the SC5 module is shown in
In order to reduce the number of overall computations, the search is performed on every other pixel location 66 as shown in
The combined contrast measure is defined as the sum 70 of squared contributions 72 from each color component L(80,82), A(84, 86), and B(88,90) shown in equations (5), (6) and (7):
ΔL=Lmax−Lmin (5)
ΔA=Amax−Amin (6)
ΔB=Bmax−Bmin (7)
Where (Lmax, Lmin) 60, (Amax, Amin) 62, and (Bmax, Bmin) 64 are the independent minimum and maximum values found within the sparse 5×5 window of the respective color component, and the output value 76 is defined in equation (8) to be:
Δ=ΔL^2+ΔA^2+ΔB^2 (8)
Additional logic is used to limit the value of the result to the range of 8-bit 78 in case the value of Δ becomes too large.
Note that the output contrast value is a sum-of-squares measure, much like variance. It measures the largest squared contrast inside the sparse 5×5 windows. It does not matter if there is more than one pixel with the same maximum or minimum values inside the window—the contrast would still be the same. Likewise, if a certain color component is constant over the window, its maximum value would be identical to its minimum, and the contrast contribution would be zero.
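Equations (5) through (8) can be sketched as follows. The exact sampling pattern of the sparse window is defined in a figure that is not reproduced here, so the example assumes that every other location of the 5×5 window is visited (offsets −2, 0, +2 in each direction). It also assumes plain saturation at 255 with no scaling before the limit. The function and parameter names are hypothetical.

```python
def sparse_contrast(L, A, B, y, x, step=2, half=2):
    """Contrast at (y, x) from min/max over a sparse window.

    Assumption: the window spans offsets -half..+half and samples every
    `step`-th location, so only 3x3 of the 5x5 positions are visited.
    No edge handling: (y, x) must be at least `half` pixels from the border.
    """
    offsets = range(-half, half + 1, step)
    total = 0
    for plane in (L, A, B):
        samples = [plane[y + dy][x + dx] for dy in offsets for dx in offsets]
        delta = max(samples) - min(samples)   # equations (5) to (7)
        total += delta * delta                # equation (8)
    return min(255, total)                    # limit to the 8-bit range

# Small illustration on a 5x5 patch centered at (2, 2).
flat = [[128] * 5 for _ in range(5)]
lum = [[100] * 5 for _ in range(5)]
lum[0][0] = 103                               # sampled corner differs by 3
contrast = sparse_contrast(lum, flat, flat, 2, 2)
```

Because only minimum and maximum values are tracked per component, repeated extreme values inside the window do not change the result.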
The CLO F_11 filter 38 is used to apply further filtering to the Sc5 92 image from the Sparse Contrast Unit SC5 36. A large amount of filtering is required in order to obtain a stable contrast output signal Clo 94. For this reason, a large filter size of F_11 is used for CLO 38 as shown in
The CLO 38 F_11 filter takes as input the 8-bit output Sc5 92 from the Sparse Contrast Unit SC5 36. It produces a filtered output Clo 94 that is limited to fit the 8-bit range. The type of F_11 filter used is identical to the filters used in the Filter Units BL5 32 and BLA 34 discussed above. The principal difference between this unit and the (BL5, BLA) units is that in this case the filter only operates on a single 8-bit gray component (as opposed to the 3-channel full-color LAB filters in BL5 and BLA).
Referring now to
The Pixel Control module 42 produces an instantaneous decision, on a pixel by pixel basis, as to which pair of filter outputs of the Variable Triangular Blur Filter VTF 50 is to be blended together and by how much. This decision is communicated to the Variable Triangular Blur Filter VTF 50 via the bank control signal Bnk 48. The Bnk 48 output is an 8-bit signal whose top three most significant bits select the base filter, and the two next significant bits provide the amount of blending to apply between this filter output and the subsequent (one size larger) one. The actual blending operation is implemented inside the Variable Triangular Blur Filter VTF 50 using full-color linear interpolation.
In addition, the Pixel Control module 42 also generates additional enhancement controls in terms of the sharpness Shp and pixel neutrality Ntl. The 8-bit signals Ntl 54 and Shp 46 are forwarded to and executed in the Variable Sharpen and Neutral Unit VSN 52.
The Pixel Control module 42 applies two programmable piecewise linear configuration functions: BnkVsFrq 102 and KilVsCon 112 producing the outputs: Bnka 101 and Kill 111. In general, the piecewise linear functions map 8-bits of input to 8-bits of output, and could be implemented using a full 256-entry lookup table. Although BnkVsFrq 102 is relatively complex, these functions (and others found in other modules) are typically quite simple usually involving only two significant points. These can be approximated by y=Ax+B where A is a low precision constant multiplier which can be implemented as a few add/sub operations.
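A piecewise linear control function of this kind can be expanded into a 256-entry lookup table as sketched below. The breakpoints used in the example are hypothetical and do not correspond to any register values of the described embodiment; the helper name is likewise an assumption.

```python
def build_pwl_lut(points):
    """Build a 256-entry table from (x, y) breakpoints.

    `points` must be sorted with strictly increasing x. Inputs below the
    first breakpoint or above the last one are held constant. Outputs are
    rounded to integers and kept within 0..255.
    """
    lut = []
    for x in range(256):
        if x <= points[0][0]:
            y = points[0][1]
        elif x >= points[-1][0]:
            y = points[-1][1]
        else:
            for (x0, y0), (x1, y1) in zip(points, points[1:]):
                if x0 <= x <= x1:
                    y = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
                    break
        lut.append(max(0, min(255, int(round(y)))))
    return lut

# Hypothetical two-point curve: 0 up to input 32, ramping to 255 at input 96.
example_lut = build_pwl_lut([(32, 0), (96, 255)])
```

A two-point curve such as this one reduces to y=Ax+B between the breakpoints, which is why a low-precision multiplier can replace the table in hardware.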
As can be seen in the upper most part of
A linear interpolation unit Bnki 104 is then used to blend Bnkb 103 and Bnka 101 together, producing the Bnki 105 output. The amount of blending is determined by the control signal Kill 111, which is generated from Clo 94 via the piecewise linear function KilVsCon 112. The 8-bit blended output is then multiplied 106 with the 8-bit input signal Scm 24, and the result, divided by 256 108 and clamped with BnkMin 100 as a minimum, becomes the Bnk 48 control. A non-zero BnkMin 100 is used for noisy scanners or when reduction scaling is required later in the pipeline.
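The arithmetic of this path can be sketched as follows. The sense of the interpolation (which input is selected when Kill is 0) and the rounding inside the interpolator are not stated in the text, so both are assumptions here, as are the function names.

```python
def lerp8(a, b, t):
    """Blend two 8-bit values; t = 0 returns a, t = 255 returns b.
    The direction and the rounding are assumed, not taken from the source."""
    return (a * (255 - t) + b * t + 127) // 255

def bank_control(bnka, bnkb, kill, scm, bnk_min):
    """Blend the two candidates, scale by screen magnitude, clamp from below."""
    bnki = lerp8(bnka, bnkb, kill)      # linear interpolation unit
    scaled = (bnki * scm) // 256        # multiply by Scm, then divide by 256
    return max(bnk_min, scaled)         # BnkMin acts as a floor

# With a screen magnitude of zero, only the BnkMin floor remains.
floor_only = bank_control(200, 40, 0, 0, 16)
```

The floor means that a small amount of blurring can be retained even where no halftone screen is detected, which matches the stated use for noisy scanners.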
In the lower portion of
Csq=min(255,(BlrA)^2+(BlrB)^2) (9)
Csq 133 is then compared to the limit Ntl_CsqSmlLim 134 producing the signal CsqSml 135, which indicates when Csq 133 is small. CsqSml 135 is then used as an input to the gates producing the two outputs Ntl 54 and Shplnc 120.
As shown in
The exported Ntl 54 output will be used in a subsequent stage VSN 52 to neutralize pixels by setting the A and B sample values to 128 (equal to 0), or to force them away from 128 (non-zero), as described below.
The other input to the gate producing Shplnc 120 is the product 122 of Scm 24 inverted and the Shp_TxtlncVsLum 116 piece-wise-linear function divided by 256. For implementation reasons, Shp_TxtlncVsLum 116 is constrained to be a sawtooth shaped function (example below). The sharpness increment value Shplnc 120 is added 122 to the default sharp value Shp_Default 114 producing the sharpness control signal Shp 46.
The inputs to the Variable Triangular Blur Filter VTF 50 include the full-color Lab source signal Src 22 and the monochrome 8-bit bank control signal Bnk 48. The output from the Variable Triangular Blur Filter is the full-color output Blv 150, which is a blurred and blended version of the input Src 22. The de-screened output Blv 150 is delivered to the Variable Sharpen and Neutral Unit VSN 52 for further processing and enhancement.
The Variable Triangular Blur Filter VTF is made up of two units: the Filter Bank unit 140 and the Variable Blend unit 142. The Filter Bank unit is the most computationally intensive in the de-screener unit. As shown in
The outputs from a selected pair 144, 146 of blurred signals are blended together to create a variable blended output that can smoothly transition from no blurring (output=input Src) to maximum blurring in a continuous manner. The selection of filters to use and the amount of blending are communicated from the Pixel Control module PxC 42 via the bank control Bnk 48 signal. The top three bits select the bank pair, and the next two define the amount of blending to apply. The use of these bits is captured in Tables 160 and 162 shown in
The Filter Bank 140 is composed of five independent full-color triangular filters: F_3, F_5, F_7, F_9, and F_11. The Filter Bank 140 arrangement is shown in
The input signal to each one of the filters is the full-color Lab source signal Src 22, where the chroma channels (a, b) are normally sub-sampled by a factor of two in the fast scan direction only. Whatever the pair of filters that is selected, those filters are operating at the full input Src 22 data rate, each producing its own independent full-color blurred output, labeled BLR_n, with n being the filter index.
Each filter unit (out of the two that are currently selected) is processing the input data independently for each of the (YCC) color components. Each filter has a symmetric, triangular and separable shape, with integer coefficients. The 1-D discrete response of the filters 152 is shown in
Blv=(Blr_n*(4-Blend)+Blend*Blr_(n+1)+2)/4 (10)
The added 2 provides nearest integer rounding. Each filter output is first only normalized back to an 11-bit range (without rounding). That is, each result is left scaled up by a factor of 8. This preserves 3 extra bits of precision for the blending operation. Equation (11) describes the final blending step in which rounding and the extra factors of 8 are taken into account as follows:
Blv=(8×Blr_n*(4-Blend)+Blend*(8×Blr_(n+1))+16)/32 (11)
The normalization factors and shifts, which leave the intermediate results scaled up by 8, are shown in the Table 162 shown in
In general, when rounding is called for, it is applied by adding in half the divisor prior to performing a shift. Since right shift, performed on 2's complement coded binary numbers is the equivalent of floor (numerator/2^shift), adding half the divisor causes nearest integer rounding for both signed and unsigned numerators. Also it is best to only round once during a final scaling step.
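Equations (10) and (11), together with the bit fields of the bank signal, can be sketched as follows. The mapping from the 3-bit filter index to a particular filter pair is given in tables that are not reproduced here, so the sketch only splits the bits and performs the blend. The function names are illustrative.

```python
def decode_bank(bnk):
    """Split the 8-bit Bnk value: top 3 bits = base filter, next 2 = blend."""
    return bnk >> 5, (bnk >> 3) & 0x3

def blend(blr_n, blr_next, blend_amt):
    """Equation (10): both inputs already normalized to 8 bits."""
    return (blr_n * (4 - blend_amt) + blend_amt * blr_next + 2) >> 2

def blend_scaled(blr_n_x8, blr_next_x8, blend_amt):
    """Equation (11): inputs still scaled up by 8; round once at the end."""
    return (blr_n_x8 * (4 - blend_amt) + blend_amt * blr_next_x8 + 16) >> 5

base, amount = decode_bank(0b01110000)   # base filter index 3, blend 2
halfway = blend(100, 120, amount)        # halfway between the two outputs
```

Since the blend field has only two bits, the largest weight given to the next filter is 3/4; full weight is reached by moving to the next base filter with a blend of zero.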
The overall 2-D response of the smallest filter, F_3, is given in equation (12) as:
The larger filters are similarly described. The F_11 equation is identical to that in (2). Since these filters are separable, it is best to implement them in two orthogonal 1-D steps. For a more efficient implementation of the first step, the larger filters can share partial results with the smaller filters rather than calculate them separately. For example, each of the 1-D un-normalized triangular sums TriN can be computed using the following loop:
N=0
Where pixel index zero is the current pixel of interest, and the positive index axis is in normal raster scan order. One approach to increase the filter efficiency is to increase the vertical context and process many lines in parallel. For example, the largest filter F_11 requires 11 lines of input to produce a single line of output (an efficiency of about 9%). The filter efficiency is improved with more input lines. For example, if the number of input lines is increased from 11 to 20, the filter could now generate 8 lines of output, and the efficiency goes up to 40%=8/20. This requires a larger input buffer to hold more lines and implies a larger pipeline delay.
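The loop itself is not reproduced in this text beyond its initialization. One way to share partial results between the filters, offered here only as a sketch and not as the loop of the described embodiment, is to widen a running box sum by one pixel on each side and accumulate it into a running triangle. The function name and dictionary layout are assumptions.

```python
def triangular_sums(pixels, center, max_half_width=5):
    """Return {size: un-normalized triangular sum} for sizes 3, 5, ..., 11.

    Each step widens a running box sum by one pixel on each side and adds it
    to the running triangle, so larger filters reuse the smaller results.
    No edge handling: `center` must be at least max_half_width from both ends.
    """
    box = pixels[center]      # box sum of width 1
    tri = pixels[center]      # triangle of width 1 (the N = 0 case)
    sums = {}
    for n in range(1, max_half_width + 1):
        box += pixels[center - n] + pixels[center + n]
        tri += box
        sums[2 * n + 1] = tri
    return sums

# On a flat input of ones, each sum equals the total weight of that filter.
weights_total = triangular_sums([1] * 11, 5)
```

In this scheme the total weights are 4, 9, 16, 25 and 36 for sizes 3 through 11, so the F_11 case agrees with the 1/36 normalization used for the larger filter units.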
Referring back to
The block diagram of the Variable Sharpening and Neutral unit is shown in
The blended output Blv from the Variable Triangular Blur Filter VTF is passed through the AbsClrSgn unit, which only modifies the color channels. For the chrominance channels, it outputs the absolute value of the signed chrominance value (abs(clr−128)). AbsClrSgn also outputs the signal ClrNeg, which records the signs of the 2 chrominance components (negative in the range of 1 . . . 127, including end-points) before the operation was performed. The AbsClr unit on the Blr input performs the same absolute value function on the chrominance components of the Blr image.
The operation of the Un-sharp Mask filter is achieved by subtracting a low-frequency version of the source input (the super-blurred signal Blr) from the blended output Blv. The difference is then scaled by a factor that is determined by the 8-bit Shp signal supplied by the PxC module, and then added back to the blended output. Since the Un-sharp Mask filter subtracted some portion of the low-frequency content, the difference contains more of the high-frequency content. By adding more of the high-frequency content back to the original input Blv, the net result is to enhance and sharpen the image. The Shp signal is interpreted as a fixed-point 1.5 number, such that 32 is defined as the sharpening factor 1.0. The Un-sharp Mask filter is independently applied to each of the three (L, a, b) color components of Blv.
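For a single color component, the un-sharp mask step can be sketched as below. The rounding of the scaled difference is an assumption, and the function name is hypothetical. The result is deliberately left unclamped, since limiting is described as a separate step.

```python
def unsharp(blv, blr, shp):
    """Sharpen one component by adding back Shp/32 of (Blv - Blr).

    A Shp value of 32 corresponds to a sharpening factor of 1.0. The rounding
    shift is an assumed detail. The result may fall outside 0..255.
    """
    diff = blv - blr                        # mostly high-frequency content
    return blv + ((diff * shp + 16) >> 5)   # scale by shp / 32, then add back

sharpened = unsharp(150, 100, 32)   # factor 1.0 adds the full difference
unchanged = unsharp(150, 100, 0)    # factor 0 leaves Blv as it is
```

On a flat region, where Blv and Blr agree, the difference is zero and the output equals the input regardless of the Shp value.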
The raw enhanced image is passed to the ClrClamp unit, which limits the enhancement possible in the chrominance channels as a function of the raw enhanced luminance signal. First the raw enhanced chrominance values, which had been constrained to be 0 . . . 128 before sharpening, are clamped between 0 and 127. This primarily prevents negative swings, which would translate to a chrominance sign change. Next this chrominance magnitude is reduced by the amount of luminance overshoot that was measured. Luminance overshoot uses the configuration parameter ClrLumOvrThr as a reference to identify pixels where the luminance signal is being driven very bright. The degree of this overshoot is used to reduce the magnitude of the sharpened chrominance. Finally the sharpened chrominance magnitude is converted back to its normal 8-bit coded value using the ClrNeg bit to determine the original chrominance sign. The luminance channel is simply clamped between 0 and 255.
The whole process of chrominance sharpening can be disabled by clearing the ClrEnable configuration parameter. The remaining clamp unit simply limits the sharpening luminance between 0 and 255. The current plan is to assume Shp_ClrEn is false and use this simpler form of FIG. 1.6.2.
Finally, the Neutral Adjustment Unit controls the chroma component (A, B) values of the final output Dsv. If the Ntl_Enable configuration parameter is true, then the neutral control Ntl 54 supplied by PxC will force the chroma components to zero by setting the output chroma values to A=B=128. Also, if Ntl_EnsureNonNtl is enabled, then if the Ntl control is false but the pixel chroma components after sharpening are both 128, one of the chroma components (B) is forced away from zero by arbitrarily setting it to 127 (equal to −1).
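The neutral adjustment rules can be sketched for a single pixel as follows. The lowercase parameter names stand in for the Ntl_Enable and Ntl_EnsureNonNtl configuration parameters; the function itself is hypothetical.

```python
def neutral_adjust(a, b, ntl, ntl_enable=True, ensure_non_ntl=True):
    """Return the output (A, B) pair for one pixel; 128 encodes zero chroma."""
    if ntl_enable and ntl:
        return 128, 128          # force the pixel to neutral
    if ensure_non_ntl and not ntl and a == 128 and b == 128:
        return a, 127            # nudge B to -1 so the pixel is not neutral
    return a, b

forced_neutral = neutral_adjust(140, 120, ntl=True)
kept_non_neutral = neutral_adjust(128, 128, ntl=False)
```

Keeping non-neutral pixels distinguishable from neutral ones lets later stages rely on A=B=128 as an exact marker for neutral pixels.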
While certain exemplary embodiments have been described in detail and shown in the accompanying drawings, those of ordinary skill in the art will recognize that the invention is not limited to the embodiments described and that various modifications may be made to the illustrated and other embodiments of the invention described above, without departing from the broad inventive scope thereof. It will be understood, therefore, that the invention is not limited to the particular embodiments or arrangements disclosed, but is rather intended to cover any changes, adaptations or modifications which are within the scope and spirit of the invention as defined by the appended claims.
This application is based on a Provisional Patent Application No. 60/393,244 filed Jul. 1, 2002. The present application is related to the following co-pending applications: Ser. No. 10/187,499 entitled “Digital De-Screening of Documents”, Ser. No. 10/188,026 entitled “Control System for Digital De-Screening of Documents”, Ser. No. 10/188,277 entitled “Dynamic Threshold System for Multiple Raster Content (MRC) Representation of Documents”, Ser. No. 10/188,157 entitled “Separation System for Multiple Raster Content (MRC) Representation of Documents”, and Serial No. 60/393,244 entitled “Segmentation Technique for Multiple Raster Content (MRC) TIFF and PDF”, all filed on Jul. 1, 2002 and all commonly assigned to the present assignee, the contents of which are herein incorporated by reference.
Number | Date | Country | |
---|---|---|---|
20040051908 A1 | Mar 2004 | US |
Number | Date | Country | |
---|---|---|---|
60393244 | Jul 2002 | US |