Claims
- 1. A method having scanned intensity information as input for detecting text in a scanned page by observing a very strong contrast in a localized region between a dark side and a light side, the method comprising:
pre-processing for stroke width determination; and contrast-based text detection processing; wherein the localized region comprises a substantially sharp edge between the dark side and the light side; and whereby any of black text on white background, black text on color background, and white or light text on a dark background are detected.
- 2. The method of claim 1, further comprising measuring a color saturation value and using the value to improve detection accuracy, wherein the color saturation value of the dark side is required to be small.
- 3. The method of claim 2, further comprising preliminary single-pixel processing to estimate the color saturation value using prior color information provided by the scanner.
- 4. The method of claim 1, further comprising detecting the presence of half-tone pixels by using a local indicator to improve detection accuracy.
- 5. The method of claim 4, wherein the half-tone detection is obtained through an algorithm for half-tone detection.
- 6. The method of claim 1, wherein the pre-processing further comprises:
detecting a local ramp; identifying an intensity trough; and determining a stroke width.
- 7. The method of claim 1, wherein the contrast-based text detection processing further comprises:
detecting text preliminarily based on local contrast and stroke width; and consistency checking.
- 8. The method of claim 1, wherein the observing a strong contrast further comprises:
detecting text preliminarily based on local contrast; and consistency checking.
- 9. The method of claim 6, further comprising:
detecting a local ramp; identifying an intensity trough; determining a stroke width; detecting text preliminarily based on contrast and stroke width; and consistency checking.
- 10. The method of claim 6, wherein identifying an intensity trough uses a finite state machine algorithm, the algorithm having a sweeping procedure.
- 11. The method of claim 10, the sweeping procedure further comprising:
sweeping the scanned page from left to right for detecting vertical troughs; and sweeping the scanned page from top to bottom for detecting horizontal troughs.
- 12. The method of claim 6, wherein the stroke width determination step further comprises:
determining a width and a skeleton, wherein the width is a distance value and the skeleton is a skeletal line; and detecting closely touching text strokes.
- 13. The method of claim 12, wherein the width and skeleton determining step further comprises:
setting the width value to the smaller of a vertical distance and a horizontal distance between two edges of the stroke; and determining the skeletal line as a roughly equidistant line from the edges.
- 14. The method of claim 12, wherein the detecting closely touching text strokes further comprises detecting a pattern of dark-light-dark (DLD) in a horizontal or a vertical direction within a very small window.
- 15. The method of claim 7, wherein the detecting text further comprises deciding whether a current pixel is a text pixel by using the local contrast present in an N×N window having a center over a set of pixels and centered at the current pixel, and the stroke width at the current pixel.
- 16. The method of claim 15, wherein N=9.
- 17. The method of claim 15, wherein numerous statistics of the pixels within the N×N window are collected by using a set of thresholds.
- 18. The method of claim 17, wherein
the set of thresholds comprises any of:
a first minimum intensity level for text background; a maximum intensity level of text to be detected; a second minimum intensity level for text background around crowded text strokes, wherein the second minimum intensity level is smaller than the first minimum intensity level; a medium threshold value, wherein the medium threshold value is around 50% intensity; a first maximum width of a stroke, wherein the first width is considered thin; and a second maximum width of a stroke, wherein the second width is considered very thin; and wherein the numerous statistics comprise any of:
a number of pixels that are thin; a number of pixels in the center of a 3×3 window that are thin; a number of pixels on a skeleton, wherein the skeleton pixels are very thin; a minimum width among pixels of the center 3×3 pixels; a second smallest width among the pixels of the center 3×3 pixels, wherein the second smallest width is equal to the minimum width among pixels of the center 3×3 pixels if more than 1 pixel has the minimum width; a highest intensity present in the N×N window; a number of light pixels; a number of non-light pixels; a number of non-light pixels detected as half-toned from a half-tone detection module; a number of dark and neutral pixels; a number of dark and colored pixels; a number of colored pixels with medium intensity; a number of dark and neutral pixels after boosting; a number of pixels in the center 3×3 window, wherein the pixels are dark to medium in intensity; a thin flag set to 1 if the stroke is thin, or set to zero otherwise; and a background flag set to 1 if the center 3×3 pixels are all light, or set to zero otherwise.
- 19. The method of claim 17, further comprising determining if the current pixel is in a category of a set of predetermined categories using an associated algorithm and the set of thresholds, wherein the thresholds are chosen empirically.
- 20. The method of claim 19, wherein the predetermined set of categories comprises:
Text Outline; Text Body; Background; and Non-text.
- 21. The method of claim 19, further comprising moving the center of the N×N window by J pixels to obtain a subsampled text tag.
- 22. The method of claim 21, wherein J=3.
- 23. The method of claim 7, wherein the consistency checking further comprises:
accumulating a set of statistics using an N×N window of text tags and a set of thresholds; and deciding by using the set of statistics if each of the text tags is any of:
Text Outline; Text Body; Background; and Non-text.
- 24. The method of claim 23, wherein the N×N window further comprises N×N blocks, each block representing J×J pixels.
- 25. The method of claim 24, wherein N=5 and J=3.
- 26. The method of claim 23, wherein the set of thresholds comprises a maximum number of Non-text blocks threshold.
- 27. An apparatus for receiving scanned intensity information as input for detecting text in a scanned page by observing a very strong contrast in a localized region between a dark side and a light side, the apparatus comprising:
a module for pre-processing for stroke width determination; and a module for contrast-based text detection processing; wherein the localized region comprises a substantially sharp edge between the dark side and the light side; and whereby any of black text on white background, black text on color background, and white or light text on a dark background are detected.
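By way of illustration only, the trough identification of claims 10 and 11 may be sketched as a small finite state machine applied to each scan line. The state names, the function names `find_troughs` and `sweep_page`, and the values `step=20` and `min_depth=60` are assumptions made for this sketch and are not taken from the claims.

```python
from enum import Enum


class State(Enum):
    FLAT = 0  # no ramp in progress
    DOWN = 1  # a falling ramp was seen; scanning the dark region
    UP = 2    # a rising ramp is in progress


def find_troughs(line, step=20, min_depth=60):
    """Return (start, end) index pairs, end exclusive, of troughs in one scan line.

    `step` and `min_depth` are illustrative thresholds, not claimed values.
    """
    troughs = []
    state = State.FLAT
    start = end = top = low = 0
    for i in range(1, len(line)):
        diff = line[i] - line[i - 1]
        if state is State.UP and diff < step:
            # The rising ramp has ended, so the trough is complete.
            if top - low >= min_depth:
                troughs.append((start, end))
            state = State.FLAT
        if state is State.FLAT:
            if diff <= -step:
                # A falling ramp starts a candidate trough at pixel i.
                state, start, top, low = State.DOWN, i, line[i - 1], line[i]
        elif state is State.DOWN:
            low = min(low, line[i])
            if diff >= step:
                # A rising ramp begins; the dark run ends just before pixel i.
                state, end = State.UP, i
    if state is State.UP and top - low >= min_depth:
        troughs.append((start, end))
    return troughs


def sweep_page(page):
    """Sweep rows left to right (vertical troughs) and columns top to bottom
    (horizontal troughs)."""
    vertical = [find_troughs(row) for row in page]
    horizontal = [find_troughs(list(col)) for col in zip(*page)]
    return vertical, horizontal
```

In this sketch a vertical stroke produces one trough per row in the left-to-right sweep and none in the top-to-bottom sweep, since each column is uniform in intensity.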
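The width determination of claim 13 and the dark-light-dark (DLD) test of claim 14 may be sketched as follows. This is a minimal sketch under stated assumptions: the names `run_length`, `stroke_width`, and `has_dld`, the intensity thresholds `dark_max=100` and `light_min=160`, and the window radius of 2 pixels are hypothetical, and the skeleton determination is not shown.

```python
def run_length(line, pos, dark_max):
    """Length of the contiguous dark run in `line` that contains index `pos`."""
    if line[pos] > dark_max:
        return 0
    lo = pos
    while lo > 0 and line[lo - 1] <= dark_max:
        lo -= 1
    hi = pos
    while hi + 1 < len(line) and line[hi + 1] <= dark_max:
        hi += 1
    return hi - lo + 1


def stroke_width(page, y, x, dark_max=100):
    """Smaller of the horizontal and vertical edge-to-edge distances at (y, x)."""
    horizontal = run_length(page[y], x, dark_max)
    vertical = run_length([row[x] for row in page], y, dark_max)
    return min(horizontal, vertical)


def has_dld(line, center, radius=2, dark_max=100, light_min=160):
    """True if a dark-light-dark pattern occurs within a small window of `line`.

    All thresholds and the radius are assumed values for illustration.
    """
    window = line[max(0, center - radius): center + radius + 1]
    labels = []
    for value in window:
        if value <= dark_max:
            label = "D"
        elif value >= light_min:
            label = "L"
        else:
            continue  # mid-tone pixels are ignored in this sketch
        if not labels or labels[-1] != label:
            labels.append(label)  # collapse runs of the same label
    return "DLD" in "".join(labels)
```

The same `has_dld` function may be applied to a row for the horizontal direction or to a column for the vertical direction.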
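The collection of window statistics in claims 15 to 18 and the subsampling of claims 21 and 22 may be illustrated with a sketch that gathers only a few of the listed statistics. The function names, the dictionary keys, and the threshold `light_min=160` are assumptions; the claims state only that thresholds are chosen empirically, and no decision rule for the categories is shown here.

```python
def window_stats(page, cy, cx, n=9, light_min=160):
    """Collect a few statistics over the n x n window centered at (cy, cx).

    Windows are clipped at the page border; `light_min` is an assumed threshold.
    """
    half = n // 2
    height, width = len(page), len(page[0])
    pixels = [page[y][x]
              for y in range(max(0, cy - half), min(height, cy + half + 1))
              for x in range(max(0, cx - half), min(width, cx + half + 1))]
    center = [page[y][x]
              for y in range(max(0, cy - 1), min(height, cy + 2))
              for x in range(max(0, cx - 1), min(width, cx + 2))]
    light = sum(1 for value in pixels if value >= light_min)
    return {
        "highest_intensity": max(pixels),
        "light": light,
        "non_light": len(pixels) - light,
        # 1 if the center 3x3 pixels are all light, 0 otherwise
        "background_flag": int(all(value >= light_min for value in center)),
    }


def subsampled_stats(page, n=9, j=3):
    """Move the window center by j pixels, giving one result per j x j block."""
    return [[window_stats(page, cy, cx, n)
             for cx in range(j // 2, len(page[0]), j)]
            for cy in range(j // 2, len(page), j)]
```

With N=9 and J=3 as in claims 16 and 22, a 9×9 page yields a 3×3 grid of results in this sketch, one per 3×3 block of pixels.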
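The consistency checking of claims 23 to 26 may be sketched over a grid of block tags, each tag standing for one J×J block of pixels. The claims do not specify the decision rule, so the rule used here (a text tag is demoted to Non-text when its N×N neighborhood holds more Non-text blocks than a threshold) is an assumption, as are the tag constants, the function name `check_consistency`, and the value `max_non_text=6`.

```python
TEXT_OUTLINE = "outline"
TEXT_BODY = "body"
BACKGROUND = "background"
NON_TEXT = "non_text"


def check_consistency(tags, n=5, max_non_text=6):
    """Return a copy of `tags` with isolated text tags demoted to NON_TEXT.

    The demotion rule and `max_non_text` are assumptions for illustration.
    """
    half = n // 2
    height, width = len(tags), len(tags[0])
    result = [row[:] for row in tags]
    for y in range(height):
        for x in range(width):
            if tags[y][x] not in (TEXT_OUTLINE, TEXT_BODY):
                continue
            # Count Non-text blocks in the n x n window, clipped at the border.
            non_text = sum(
                1
                for yy in range(max(0, y - half), min(height, y + half + 1))
                for xx in range(max(0, x - half), min(width, x + half + 1))
                if tags[yy][xx] == NON_TEXT
            )
            if non_text > max_non_text:
                result[y][x] = NON_TEXT
    return result
```

Counts are taken from the input grid rather than the partially updated result, so the outcome in this sketch does not depend on the order in which blocks are visited.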
REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. application Ser. No. 09/808,791, filed 14 Mar. 2001, now U.S. Pat. No. ______.
Continuations (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09808791 | Mar 2001 | US |
| Child | 10887940 | Jul 2004 | US |