1. Field of the Invention
This invention relates to a document authentication method by image comparison, and in particular, by image comparison on a block-by-block basis.
2. Description of Related Art
In situations where an original document, either in electronic form or in hardcopy form, is printed or copied to produce a copied document in hardcopy form, and the copied document is distributed and circulated, there is often a need to determine whether a purported true copy (referred to as the target document in this disclosure) is authentic, i.e., whether the copied document has been altered while it was in circulation. A goal in many document authentication methods is to detect what the alterations (additions, deletions) are. Alternatively, some document authentication methods determine whether or not the document has been altered, without determining what the alterations are.
Various types of document authentication methods are known. One type of document authentication method performs a digital image comparison of a scanned image of the target document with an image of the original document. In such a method, the image of the original document is stored in a storage device at the time of printing or copying. Later, the target document is scanned, and the stored image of the original document is retrieved from the storage device and compared with the image of the target document. In addition, certain data representing or relating to the original document, such as a document ID, is also stored in the storage device. The same data is encoded in barcodes which are printed on the copied document when the copy is made, and can be used to assist in document authentication.
Often, the image of the target document (the target image) contains various distortions due to the document having been copied and/or scanned. These distortions may include scaling (size enlargement or reduction), rotation, and/or shift of the image as compared to the image of the original document (the original image). Thus, the target image needs to be corrected for these distortions before image comparison. This process may be referred to as image registration or alignment. Correction for scaling distortion is also referred to as resizing; correction for rotation distortion is also referred to as deskew. One image registration method uses cross-correlation of the target and original images to calculate a global registration. Such calculation can be computationally intensive.
The present invention is directed to an improved image comparison method and related apparatus that substantially obviates one or more of the problems due to limitations and disadvantages of the related art.
An object of the present invention is to provide an improved image comparison method useful for comparing images that represent documents containing text.
Additional features and advantages of the invention will be set forth in the descriptions that follow and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
To achieve these and/or other objects, as embodied and broadly described, the present invention provides a document authentication method implemented in a data processing system, which includes: (a) obtaining an original image representing an original document; (b) segmenting the original image into a plurality of blocks to generate layout information, wherein the layout information includes positions of the plurality of blocks; (c) obtaining a target image representing a target document; (d) segmenting the target image into a plurality of blocks; (e) for a first block among the plurality of blocks of the original image: (e1) searching the target image to identify a first block of the target image corresponding to the first block of the original image; (e2) calculating a position mapping for the first block of the target image; and (e3) detecting any alterations in the first block of the target image; and (f) for each subsequent block among the plurality of blocks of the original image: (f1) identifying a subsequent block of the target image corresponding to the subsequent block of the original image based on a position of the subsequent block of the original image relative to the first block of the original image obtained from the layout information and the position mapping for the first block of the target image calculated in step (e2); (f2) comparing the subsequent block of the original image and the subsequent block of the target image to detect any alterations in the subsequent block of the target image.
In another aspect, the present invention provides a computer program product comprising a computer usable non-transitory medium (e.g. memory or storage device) having a computer readable program code embedded therein that causes a data processing apparatus to perform the above methods or parts thereof.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.
Embodiments of the present invention provide an image comparison method useful for authenticating documents containing text. Both the original image representing the original document and the target image representing the target document are segmented into a number of relatively large blocks, each block being a sub-image of the respective image. For example, the images may be segmented into multiple blocks each corresponding to a paragraph of text in the document. Then, a first block in the original image is used to search the target image to find a corresponding first block, for example, by using a cross-correlation method. In this step, the position mapping for the first block in the target image is calculated, and the two blocks are compared to find any alterations. After the first block is processed, subsequent blocks of the original and target images are identified based on relative position information and may be compared using a method other than cross-correlation.
The document authentication method according to embodiments of the present invention includes a document registration stage and an authentication stage. Note that “document registration” should not be confused with “image registration.” Document registration refers to storing (registering) the images of the original documents with the system for later retrieval; “image registration” refers to aligning one image with another. In the document registration stage, a printer 103 or copier 101 makes a hardcopy (i.e. on a physical medium such as paper) copy of an original document which may be in electronic or hardcopy form. An image of the original document (referred to as the original image) is generated in the document registration stage. If the original document is in electronic form, the original image may be generated from the original electronic document by the server 104 or the printer 103. If the original document is in hardcopy form and a copy is made by a copier 101, the copier scans the original hardcopy document to generate the original image and then prints a copy from the scanned image. The original image is processed by a data processing apparatus and the resulting data is stored in the storage device 106. Later, in the authentication stage, a user may submit a copied document (the target document) for authentication by scanning the target document using a scanner 102 or copier 101, and causing a data processing apparatus to retrieve the stored data from the storage device 106 and to perform image comparison.
The document registration stage is described with reference to
If the original image is a grayscale image (as is typically the case when it is generated by scanning), the image is binarized (step S14). This step is omitted if the original image is already a binary image.
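The binarization of step S14 can be sketched as follows. This is only an illustration under stated assumptions: the function name `binarize` and the fixed global threshold of 128 are hypothetical, since the description does not specify a thresholding rule.

```python
def binarize(gray, threshold=128):
    """Return a binary image: 1 for dark (ink) pixels, 0 for white background.

    `gray` is a list of rows of 0-255 intensities. The fixed threshold is an
    assumption for illustration; the description names no thresholding rule.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


gray_image = [
    [255, 250, 30, 20],
    [240, 10, 15, 245],
]
binary_image = binarize(gray_image)
```

The later sketches in this description use the same convention, in which 1 denotes a non-white pixel.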
Then, the binarized original image is segmented into a number of relatively large blocks (step S15). For example, the original image may be segmented into paragraph blocks each corresponding to a paragraph of text. Each block is defined by its bounding box, which is a box (preferably rectangular) that bounds the corresponding text from all sides. If the document contains image or graphic objects, each such object may be a block. The segmentation result, i.e. the positions of the blocks, may be referred to as layout information as it reflects the general layout of the original document.
Many methods can be used to accomplish image segmentation of a document that includes text. In one method, a horizontal histogram (or horizontal projection) is generated by plotting, along the vertical axis, the number of non-white pixels in each row of pixels. Such a horizontal histogram will tend to have segments of low values corresponding to white spaces between lines of text, and segments (of approximately equal width) of higher values corresponding to lines of text. Such histograms can therefore be used to identify line units for document segmentation. Further, if paragraph spacing is different from line spacing in the document, block (e.g. paragraph) units can be identified from such histograms (where larger gaps in the histogram would indicate paragraph breaks and smaller gaps in the histogram would indicate line breaks). Additional starting and ending information of lines may be helpful for block extraction. Further, in the case of multiple objects and a complicated layout design, the existence of different types of objects in an area can be identified by analyzing the distribution of the histogram, and data blocks can then be extracted by analyzing the vertical projection in that area.
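The histogram-based segmentation described above can be sketched as follows, assuming a binary image stored as a list of rows in which 1 denotes a non-white pixel. The function names and the `paragraph_gap` parameter are hypothetical and chosen only for illustration.

```python
def horizontal_projection(binary):
    """Count the non-white (1) pixels in each row of a binary image."""
    return [sum(row) for row in binary]


def find_runs(profile):
    """Return (start, end) row ranges, end exclusive, where the profile is non-zero."""
    runs, start = [], None
    for y, value in enumerate(profile):
        if value > 0 and start is None:
            start = y
        elif value == 0 and start is not None:
            runs.append((start, y))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def group_lines_into_blocks(lines, paragraph_gap):
    """Merge consecutive line ranges separated by fewer than paragraph_gap blank rows."""
    blocks = []
    for start, end in lines:
        if blocks and start - blocks[-1][1] < paragraph_gap:
            blocks[-1] = (blocks[-1][0], end)
        else:
            blocks.append((start, end))
    return blocks


# Toy page: two text lines one blank row apart, then a three-row gap, then a third line.
rows_with_ink = {0, 1, 3, 4, 8, 9}
page = [[1, 1, 0, 1] if y in rows_with_ink else [0, 0, 0, 0] for y in range(10)]

profile = horizontal_projection(page)
lines = find_runs(profile)
blocks = group_lines_into_blocks(lines, paragraph_gap=2)
```

In this toy page the one-row gap is treated as a line break and the three-row gap as a paragraph break, so three line units are grouped into two blocks.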
In another document segmentation method, a morphological dilation operation is performed on the image, so that nearby characters merge into dark blocks corresponding to word units. Dilation is a well-known technique in morphological image processing which generally results in an expansion of the dark areas of the image. Once the characters are merged into word units, they can be further grouped to form line units and paragraph units.
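A minimal sketch of the dilation operation is given below, assuming a square structuring element and the convention that 1 denotes a dark pixel. The function name `dilate` and the `radius` parameter are hypothetical; a practical implementation would likely rely on an image processing library.

```python
def dilate(binary, radius=1):
    """Morphological dilation with a (2 * radius + 1) square structuring element.

    Every dark pixel (1) darkens all pixels within `radius` of it, so dark
    regions separated by small gaps merge into one region.
    """
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x]:
                for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                        out[ny][nx] = 1
    return out


# Two nearby "characters" (columns 0 and 2) and one distant character (column 6).
row_image = [[1, 0, 1, 0, 0, 0, 1]]
dilated = dilate(row_image, radius=1)
```

After dilation the two nearby characters form one dark run while the distant character remains separate, which is the merging behavior used to form word units.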
In another document segmentation method, connected image components (e.g. connected groups of pixels in the case of a binary image) may be identified as corresponding to characters, and character units are formed from these connected image components. Once character units are formed, they can be grouped to form word units, line units, and paragraph units based on their relative spatial positions.
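The identification of connected image components in a binary image can be sketched as follows. The choice of 8-connectivity, the breadth-first traversal, and the function name `connected_components` are assumptions made for illustration; the description does not prescribe a particular labeling algorithm.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes (left, top, right, bottom), inclusive, of the
    8-connected groups of 1-pixels, in raster-scan order of discovery."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            left, top, right, bottom = x0, y0, x0, y0
            while queue:
                x, y = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                # Visit the 8 neighbors (the pixel itself is already marked seen).
                for ny in range(max(0, y - 1), min(height, y + 2)):
                    for nx in range(max(0, x - 1), min(width, x + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


image = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
boxes = connected_components(image)
```

The resulting bounding boxes can serve as character units, which may then be grouped into word, line, and paragraph units based on their relative spatial positions.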
Other document segmentation methods also exist. Some such methods are knowledge-based, i.e., they use knowledge of the document structure to segment the image.
After segmentation, the binarized original image is stored in a storage device along with the layout information (step S16). The image and related information are stored in association with the document management information, such as the document ID, to facilitate image retrieval during the authentication stage. The stored image along with the associated information may be referred to as the registered document. The hardcopy generated by step S13 is referred to as a copy of the registered document.
In the document registration stage, steps S14 to S16 may be performed by the copier or printer, in which case the copier or printer can transmit the binarized image and layout information to the server or store it directly in the storage device; or they may be performed by the server, in which case the copier or printer will transmit the original image to the server. Step S12 likewise may be performed by either the copier or printer or the server. More generally, the data processing steps S12 and S14 to S15 may be performed in a distributed manner by several devices. It should also be noted that the order of performance of steps S12 and S13 relative to steps S14 to S16 is generally not important.
The authentication stage is described with reference to
Then, the binarized target image is segmented into a number of relatively large blocks (step S25). The segmentation is performed in a similar manner as for the original image. For example, if the original image is segmented into paragraph blocks, then the target image is also segmented into paragraph blocks using the same algorithm. Thus, if the target document contains no alteration or only local alterations (e.g. deletion, insertion or change of words in a relatively isolated manner), the segmentation result for the target image should include the same number of blocks having approximately the same relative positions as in the original image.
Then, an image comparison process is performed on a block-by-block basis to detect any alterations contained in the target image (step S26). In this step, the first pair of blocks of the original and target images is treated differently than subsequent pairs of blocks, and different image comparison methods are used for them. This step is described in more detail with reference to
Referring to
The position mapping calculated in step S31 represents the amounts that the first block of the target image must be shifted and/or rotated in order to be aligned with the first block of the original image. In a preferred embodiment, rotation of the target image has been separately corrected in a deskew process (not shown in
It can be seen that the searching step S31 accomplishes three functions: identifying a corresponding first block in the target image, calculating its position mapping, and detecting any alterations in the first block of the target image.
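The three functions of the searching step can be illustrated with a brute-force search using normalized cross-correlation, sketched below for binary images. This is not presented as the specific search used in the embodiments: the function name `best_match`, the exhaustive scan over all offsets, and the shift-only position mapping (rotation assumed already corrected) are assumptions for illustration.

```python
import math


def best_match(block, target):
    """Slide `block` over `target` and return (dx, dy, score) for the offset
    with the highest normalized cross-correlation score (1.0 = identical)."""
    bh, bw = len(block), len(block[0])
    th, tw = len(target), len(target[0])
    block_energy = sum(v * v for row in block for v in row)
    best = (0, 0, -1.0)
    for dy in range(th - bh + 1):
        for dx in range(tw - bw + 1):
            dot = window_energy = 0
            for y in range(bh):
                for x in range(bw):
                    t = target[dy + y][dx + x]
                    dot += block[y][x] * t
                    window_energy += t * t
            denom = math.sqrt(block_energy * window_energy)
            score = dot / denom if denom else 0.0
            if score > best[2]:
                best = (dx, dy, score)
    return best


# First block of the original image, located at (1, 0) in the original (hypothetical).
original_position = (1, 0)
first_block = [
    [1, 0],
    [1, 1],
]
target_image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

dx, dy, score = best_match(first_block, target_image)
# Position mapping expressed as a shift between original and target coordinates.
shift = (dx - original_position[0], dy - original_position[1])
```

The offset identifies the corresponding block, the shift gives its position mapping, and a peak score noticeably below 1.0 would suggest that the block content differs. The nested loops also show why such a search is computationally intensive and is applied only to the first block.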
After the first block is processed, the subsequent blocks of the original and target images can be compared using a different image comparison method than the method used for the first block. For each subsequent block of the original image (step S32), a corresponding block of the target document is identified based on the position of the subsequent block of the original image relative to the first block of the original image, which is obtained from the layout information, as well as the position mapping for the first block of the target image (step S33). More specifically, this step identifies a block of the target image that has a relative position with respect to the first block of the target image substantially equal to the relative position of the subsequent block of the original image with respect to the first block of the original image, and that has substantially the same size as the subsequent block of the original image. The identification does not require any image comparison. This is based on the reasonable assumption that the relative positions among blocks of the target image are approximately the same as the relative positions among blocks of the original image, even though the target image as a whole is shifted and/or rotated relative to the original image. A suitable tolerance such as half the average size of the characters in the block may be used when comparing the positions and sizes of the blocks.
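The identification in step S33 can be sketched as a comparison of bounding boxes only, with no pixel data involved. The sketch assumes a shift-only position mapping and boxes given as (x, y, width, height); the function name, the box format, and the numeric values are hypothetical.

```python
def find_corresponding_block(orig_first, orig_block, target_first, target_blocks, tolerance):
    """Return the target block whose position relative to the first target
    block, and whose size, match `orig_block` within `tolerance`, else None.

    All boxes are (x, y, width, height) tuples in pixels.
    """
    # Expected position: first target block plus the offset taken from the layout information.
    expected_x = target_first[0] + (orig_block[0] - orig_first[0])
    expected_y = target_first[1] + (orig_block[1] - orig_first[1])
    for candidate in target_blocks:
        if (abs(candidate[0] - expected_x) <= tolerance
                and abs(candidate[1] - expected_y) <= tolerance
                and abs(candidate[2] - orig_block[2]) <= tolerance
                and abs(candidate[3] - orig_block[3]) <= tolerance):
            return candidate
    return None


orig_first = (100, 100, 800, 200)
orig_second = (100, 350, 800, 300)
target_first = (112, 95, 801, 199)
target_blocks = [(112, 95, 801, 199), (113, 346, 799, 301)]

# Tolerance of 6 pixels, e.g. half of an assumed 12-pixel average character size.
match = find_corresponding_block(orig_first, orig_second, target_first, target_blocks, 6)
missing = find_corresponding_block(orig_first, (100, 700, 800, 100), target_first, target_blocks, 6)
```

Here `match` is the second target block, and `missing` is `None` because no target block lies at the expected position with the expected size.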
If a corresponding block satisfying the above conditions is not found in the target image, then the target image may be deemed to have been altered.
Once the corresponding block of the target image is identified, an image comparison is carried out for the pair of blocks (step S34). Because the position mapping for the block of the target image is known (it is assumed to be the same as the position mapping for the first block of the target image), an image registration calculation is omitted, and the blocks may be compared without using a computationally intensive cross-correlation method. Various methods may be suitable for image comparison in step S34. For example, a simple method calculates a difference image (XOR) of the two sub-images.
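The difference image (XOR) comparison can be sketched as follows for two aligned binary sub-images of equal size. The function names are hypothetical, and how many differing pixels should count as an alteration rather than scanning noise is left open here, as it is in the description.

```python
def xor_difference(orig_block, target_block):
    """Return the difference image: 1 where the two aligned binary blocks disagree."""
    return [[a ^ b for a, b in zip(orig_row, target_row)]
            for orig_row, target_row in zip(orig_block, target_block)]


def count_differences(diff):
    """Count the differing pixels in a difference image."""
    return sum(sum(row) for row in diff)


orig_block = [
    [1, 0, 1],
    [0, 1, 0],
]
target_block = [
    [1, 0, 1],
    [0, 0, 1],
]

diff = xor_difference(orig_block, target_block)
changed_pixels = count_differences(diff)
```

Each comparison costs one pass over the block, in contrast to the offset search needed when the position mapping is unknown.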
Another image comparison method, described in commonly owned U.S. Pat. No. 8,000,528, issued Aug. 16, 2011, involves segmenting the original and target documents into paragraph, line, word and character units, and comparing the two images at progressively lower levels. The paragraph level comparison determines whether the target and original images have the same number of paragraphs and whether the paragraphs have the same sizes and locations (this would be comparable to step S33 of
Yet another image comparison method, described in commonly owned U.S. Pat. No. 7,965,894, issued Jun. 21, 2011, involves a two-step comparison. In the first step, the original and target images are divided into connected image components and their centroids are obtained, and the centroids of the image components in the original and target images are compared. Each centroid in the target image that is not in the original image is deemed to represent an addition, and each centroid in the original image that is not in the target image is deemed to represent a deletion. In the second step, sub-images containing the image components corresponding to each pair of matching centroids in the original and target images are compared to detect any alterations.
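The first step of such a centroid comparison can be illustrated as follows. This sketch is not taken from the cited patent: the function name, the per-axis matching tolerance, and the sample coordinates are assumptions made only to show how unmatched centroids map to additions and deletions.

```python
def compare_centroids(orig_centroids, target_centroids, tolerance=2.0):
    """Return (additions, deletions) by matching centroids within `tolerance` pixels.

    A target centroid with no nearby original centroid is treated as an
    addition; an original centroid with no nearby target centroid as a deletion.
    """
    def has_match(point, others):
        return any(abs(point[0] - other[0]) <= tolerance
                   and abs(point[1] - other[1]) <= tolerance
                   for other in others)

    additions = [c for c in target_centroids if not has_match(c, orig_centroids)]
    deletions = [c for c in orig_centroids if not has_match(c, target_centroids)]
    return additions, deletions


orig_centroids = [(10.0, 10.0), (30.0, 10.0), (50.0, 10.0)]
target_centroids = [(10.5, 10.0), (50.0, 9.5), (70.0, 10.0)]

additions, deletions = compare_centroids(orig_centroids, target_centroids)
```

In this example one component was added and one was removed; the remaining matched pairs would proceed to the second step, in which their sub-images are compared.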
Yet another image comparison method, described in commonly owned, co-pending U.S. patent application Ser. No. 13/053618, filed Mar. 22, 2011, involves comparing pairs of text characters by analyzing and comparing their shape features such as their Euler numbers, aspect ratios of their bounding boxes, pixel densities, the Hausdorff distance between the two characters, etc.
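Two of the shape features mentioned above, pixel density and the Hausdorff distance, can be sketched for small binary character images as follows. This is an illustration only and is not taken from the cited application; the function names are hypothetical, and both characters are assumed to contain at least one dark pixel.

```python
import math


def ink_points(char):
    """Return the (x, y) coordinates of the dark (1) pixels of a character image."""
    return [(x, y) for y, row in enumerate(char) for x, value in enumerate(row) if value]


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(source, destination):
        return max(min(math.dist(p, q) for q in destination) for p in source)

    return max(directed(points_a, points_b), directed(points_b, points_a))


def pixel_density(char):
    """Fraction of dark pixels within the character's bounding box."""
    return sum(sum(row) for row in char) / (len(char) * len(char[0]))


# Toy 3x3 glyphs resembling "o" and "c"; they differ in a single pixel.
char_o = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
char_c = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
]

distance = hausdorff_distance(ink_points(char_o), ink_points(char_c))
densities = (pixel_density(char_o), pixel_density(char_c))
```

A non-zero distance or a difference in density between a pair of characters that are expected to be the same could flag the pair for closer inspection.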
Steps S33 and S34 are repeated for the next block of the original image until all blocks are processed (step S32).
At various points of the image comparison flow shown in
Further, although not shown in the drawings, various post-processing steps may be carried out, such as generating a difference map between the original image and the target image if any alteration is detected, displaying the detection result to the user, etc. Again, these steps may be easily implemented by those skilled in the art.
In the authentication stage, steps S24 to S26 may be performed by the scanner, in which case the scanner can request the original image and layout information from the server or retrieve it directly from the storage device; or they may be performed by the server, in which case the scanner will transmit the target image to the server. Step S22 likewise may be performed by either the scanner or the server. More generally, the data processing steps S22 to S23 and S24 to S26 may be performed in a distributed manner by several devices.
In the methods shown in
It will be apparent to those skilled in the art that various modifications and variations can be made in the alteration detection method and related apparatus of the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention cover modifications and variations that come within the scope of the appended claims and their equivalents.