Many offices have documents with sensitive information that need to be kept private and secure. Digital copy guard stamps (with text such as “Confidential”, “Top Secret”, “Classified”, “Do not Copy”, etc.) are one popular technique for protecting them. These stamps with security keywords may indicate the confidential nature of the document and remind people not to make casual photocopies without permission.
Detecting these copy guard stamps during a copying operation would allow an additional layer of security in situations when personnel inadvertently or illicitly attempt a copying operation on stamped documents. This detection is made difficult, however, by the limitless variation of size, shape, font, and style in such stamps. As a result, conventional optical character recognition (OCR) based detection and deep learning based detection have had limited success. To cope with custom stamps, pattern matching based approaches may be a better choice.
“Pattern matching” refers to detecting a known pattern in an image by measuring the similarity between the pattern and regions of the image. This method is particularly useful if the pattern structure is arbitrary. There are two major types of pattern matching approaches: template based, and feature based.
“Template based pattern matching” refers to comparing the template (pattern) image against the query image by sliding the template across it. The template image is moved one pixel at a time, from left to right and from top to bottom, and a numerical measure of similarity to the patch it overlaps is calculated at each position. Both images are first converted into binary (black and white) images, or into edge images by running them through an edge filter. Template matching techniques such as normalized cross-correlation, cross-correlation, and sum of squared differences are then applied. After looping through all template sizes, the region with the largest correlation coefficient is taken as the “matched” region.
“Feature based pattern matching” involves four main steps: feature detection, feature extraction, feature matching, and computing affinity. Image features, such as edges and interest points, provide rich information on the image content. Local features and their descriptors are the building blocks of many computer vision algorithms. Their applications include image registration, object detection and classification, tracking, and motion estimation. These features are distinctive for each image, and thus help in identifying similarities and differences between images. The features of an image may persist through changes in size and orientation, so this approach may further prove useful where the match in a scanned image is transformed in some fashion.
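The sliding comparison described above can be illustrated with a minimal sketch in plain Python. It assumes small grayscale images stored as lists of rows and uses zero-mean normalized cross-correlation as the similarity measure; the function names `ncc` and `match_template` are hypothetical and are not taken from any particular library.

```python
import math

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists."""
    n = len(template)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, template))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch)
                    * sum((t - mean_t) ** 2 for t in template))
    # A flat patch or template has zero variance; treat it as "no match".
    return num / den if den else 0.0

def match_template(image, template):
    """Slide the template one pixel at a time; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    flat_t = [v for row in template for v in row]
    best = (-2.0, (0, 0))  # NCC lies in [-1, 1], so -2 is below any real score
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [image[r + dr][c + dc] for dr in range(th) for dc in range(tw)]
            score = ncc(patch, flat_t)
            if score > best[0]:
                best = (score, (r, c))
    return best

# Toy query image with the template pattern embedded at row 2, column 2.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0, 0],
    [0, 0, 1, 9, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, (row, col) = match_template(image, template)
print(round(score, 3), row, col)
```

The cost of this exhaustive scan grows with both the image area and the template area, which is one reason the approach becomes slow at high resolution.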
However, template based matching is not robustly scale- or rotation-invariant and is very slow at high resolution. In addition, copy stamp detection may need to be fully automated, leaving no place for human confirmation of a result. This prevents reliance on template based pattern matching, because the similarity score used in this approach may be highly dependent on image content. A very well matched pattern in one image may have a lower similarity score than a poorly matched different pattern-query pair. In other words, the similarity score is a “relative measure”. There is no threshold that works universally for most images.
Feature based pattern matching may be invariant to scale, rotation, translation, illumination, and blur. Therefore, when used on high resolution images, feature based pattern matching may robustly detect a large number of key points/features. A constant threshold value (number of consistently matched key points) may be used for a quantitative final judgment, without human confirmation. Feature based pattern matching, however, is also slow at high resolution. Down-scaled images may have too few detectable features, and the features detected may lack the uniqueness needed to perform effective feature based pattern matching.
There is therefore a need for a solution that quickly and accurately detects the presence of a copy guard stamp during a copying operation without need for human intervention.
This disclosure relates to a method of detecting a digital stamp pattern and an apparatus for performing the same. The method includes extracting stamp pattern keypoints and stamp pattern descriptors from an original template pattern image of the digital stamp pattern. The method also includes running a template matching routine between a low resolution original document and at least one lower resolution template pattern image, where match correlation coefficients are determined by regions in the low resolution original document. The method also includes selecting a matched region in the low resolution original document based on the match correlation coefficients. The method also includes cropping out a cropped region in a full resolution original document corresponding to the matched region in the low resolution original document. The method also includes extracting cropped region keypoints and cropped region descriptors in the cropped region. The method also includes matching the cropped region keypoints and the cropped region descriptors in the cropped region with stamp pattern keypoints and stamp pattern descriptors using a feature based pattern matching routine. The method also includes computing a transformation matrix using coordinates for the stamp pattern keypoints and coordinates for the cropped region keypoints to detect at least one of scaling, rotation, and translation of a detected digital stamp pattern in the cropped region relative to the original template pattern image. The method also includes checking a number of qualified matches determined using at least one of the feature based pattern matching and the transformation matrix against a pre-set threshold. On condition the number of qualified matches exceeds the pre-set threshold, the method includes issuing an alert for a possible security issue. On condition the number of qualified matches does not exceed the pre-set threshold, the method includes issuing a signal indicating no security issues.
The apparatus disclosed herein includes a processor and a memory. The memory stores instructions that, when executed by the processor, allow the apparatus to perform the disclosed method.
To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the figure number in which that element is first introduced.
In a copy workflow, the disclosed solution for embedded security stamp detection in a multifunction printer (MFP) may quickly, automatically, and accurately detect pattern(s) matching pre-configured security stamps, with no need for human intervention. Once such a stamp pattern is detected, actions (such as stopping the copy job, reporting to an administrator, printing a blank page, etc.) may be taken to enforce corporate security rules.
In a copy protection application, a scanned page may have a high resolution, such as 600 dots per inch (dpi), in order to provide a high-quality copied output. To perform security stamp detection during a real-time copying job, where the MFP engine speed may exceed 60 pages per minute (ppm), the scanned image may be scaled down in resolution to speed up the calculations needed to perform detection processes. “Copy protection”, also known as content protection, copy prevention, and copy restriction, is any effort designed to prevent the reproduction of a document for security or copyright reasons. To prevent unauthorized copying, security keywords are often overlaid on the document page when it is printed, in the form of digital watermarks, stamps, or special nested patterns.
Based on the unique scope and requirement of stamp copy guard detection, this disclosure uses two pattern matching techniques in a two-stage approach to security stamp detection on a page inserted into an MFP or similar device for copying or scanning. “Copy guard detection” refers to a scanner or MFP automatically examining the content of a scanned-in page using computer vision techniques and checking it against pre-set copy guard patterns, for security purposes, in order to prevent or report unauthorized copying.
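The benefit of scaling down the scanned image can be made concrete with a short calculation. The sketch below assumes a US Letter page (8.5 by 11 inches), which is not stated in this disclosure, and compares the pixel count at 600 dpi with the count at 100 dpi, one example of a reduced resolution. The helper name `pixel_count` is hypothetical.

```python
def pixel_count(width_in, height_in, dpi):
    """Number of pixels in a page of the given physical size at the given dpi."""
    return round(width_in * dpi) * round(height_in * dpi)

# Assumed page size (US Letter); the disclosure does not specify one.
PAGE_W_IN, PAGE_H_IN = 8.5, 11.0

full_px = pixel_count(PAGE_W_IN, PAGE_H_IN, 600)   # full resolution scan
low_px = pixel_count(PAGE_W_IN, PAGE_H_IN, 100)    # downscaled copy
reduction = full_px / low_px                       # equals (600 / 100) ** 2

# At 60 pages per minute the per-page time budget is one second.
seconds_per_page = 60 / 60

print(full_px, low_px, reduction, seconds_per_page)
```

Under these assumptions the downscaled page has 36 times fewer pixels, so any per-pixel work in the detection stage shrinks by roughly that factor within the one-second page budget.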
The first stage uses template based pattern matching at low resolution to provide a relative best match localization, detecting the area where a copy protection stamp may be located. The second stage uses feature based pattern matching on the identified smaller area, at high resolution, for robust automatic detection judgment. This combination may provide the advantages of both template based and feature based approaches, while mitigating the drawbacks encountered using each approach singly.
In the first stage, the inserted page is scanned and downscaled to a fairly low resolution, and template based pattern matching may be performed with a preset copy security stamp scaled to a similar resolution. The rotation variation of the stamp on a scanned page may be small (e.g., less than two degrees). The scale variation may typically be small as well. The small amount of rotation and scaling that may be expected in a scanned page makes template based pattern matching a practical approach, even at low resolution. In cases with greater scale variation than may be handled with a single template, multiple templates at different scales may be applied during template matching. When a match is detected within a region of the scanned document through template based pattern matching, the matched region may be cropped from a full resolution image of the scanned document.
In the second stage, feature based pattern matching between the cropped full resolution image and security stamp may be performed more expeditiously than on the full scanned page, the regions processed being smaller and containing less extraneous data. This disclosed method and apparatus may thus provide a good combination of speed and accuracy and may meet the fully-automated requirement of a copy job.
While the disclosed solution is presented as facilitating document security against unauthorized duplication, its applications are not limited thereto. This solution may be used to detect document patterns and logos that are not specifically copy security stamps. It may also be extended to a general approach for quick and effective pattern detection in documents run through an MFP.
At block 104, features may be detected in the source and template. A Scale-Invariant Feature Transform (SIFT) algorithm may be used to detect features. Detected features may then be extracted at block 106. Feature extraction may result in a keypoint and a descriptor associated with each feature.
Features extracted in block 106 from the source and template of block 102 may be matched in block 108. A Fast Library for Approximate Nearest Neighbors (FLANN) may be used. An affine transformation between the two sets of features may be computed using a Random Sample Consensus (RANSAC) algorithm at block 110. Affine transformation is described with respect to
In block 112, matching between the source and template may be displayed. In some embodiments, matching may be returned for use in additional process steps as a set of match correlation coefficients, a number of qualified matches, or a Boolean value indicating whether or not the number of qualified matches exceeds a pre-set threshold.
The feature keypoints and descriptors 206 are indicated by the circles in this drawing in a greatly simplified manner. While hundreds of features may be detected during feature extraction, many potential feature keypoints and descriptors 206 have been omitted in this illustration for the sake of clarity.
As illustrated in
In block 404, a template matching routine may be run between a low resolution original document and at least one lower resolution template pattern image. Template based pattern matching may be run using a template matching routine such as the template matching function of the Open Source Computer Vision Library (OpenCV), available through its Python bindings.
The low resolution original document may be a down-sampled scan of a page inserted into an MFP. For example, at run time, a page scanned at 600 dpi may be downscaled to 100 dpi. The at least one lower resolution template pattern image may be a downscaled version of the original template pattern image. A number of downscaled versions may be created, ranging, in some embodiments, from one tenth to one third the resolution of the original template pattern image. For example, an original 600 dpi template pattern image may be downscaled to a set of template pattern images of 100, 120, 140, 160, 180, and 200 dpi, as described in further detail with respect to
Template based pattern matching may be robust against differences in scale up to 20%. For instances where the scale difference between the original template pattern image and a stamp pattern present on the low resolution original document exceeds that threshold, more than one of the downscaled, lower resolution template pattern images may be used in multiple loops of template based pattern matching, using additional template matching routine passes to detect scale-varied stamp patterns on the low resolution original document. This may be performed with commands such as:
The template matching routine(s) may result in a number of match correlation coefficients corresponding to different regions of the low resolution original document. In block 406, a matched region is selected in the low resolution original document based on the match correlation coefficients. The region having the largest match correlation coefficient returned by the template matching routine may be determined to be the matched region, corresponding to the region of the low resolution original document that contains a digital stamp pattern.
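As one illustration of looping over several template scales, the following is a minimal sketch in plain Python rather than a reproduction of any specific library commands. It assumes tiny binary images stored as lists of rows, a nearest-neighbour resize as a stand-in for a real downscaler, and normalized cross-correlation as the score. All function names (`resize_nearest`, `best_match`, `multi_scale_match`) and the toy glyph are hypothetical.

```python
import math

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2D list (stand-in for a real downscaler)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

def best_match(image, tmpl):
    """Return (score, row, col) of the best placement of tmpl inside image."""
    th, tw = len(tmpl), len(tmpl[0])
    flat = [v for row in tmpl for v in row]
    best = (-2.0, 0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = ncc(patch, flat)
            if score > best[0]:
                best = (score, r, c)
    return best

def multi_scale_match(image, templates):
    """Run one matching pass per template scale; the highest score wins."""
    results = {size: best_match(image, t) for size, t in templates.items()}
    size = max(results, key=lambda s: results[s][0])
    return (size,) + results[size]

# Toy 6x6 stamp glyph, enlarged to a 12x12 "original template pattern image".
glyph = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
]
original = resize_nearest(glyph, 12, 12)
# Downscaled templates keyed by side length in pixels (illustrative sizes).
templates = {side: resize_nearest(original, side, side) for side in (4, 6, 8)}

# Toy low resolution page with the stamp at row 2, column 3, at the 6-pixel scale.
page = [[0] * 12 for _ in range(10)]
for i in range(6):
    for j in range(6):
        page[2 + i][3 + j] = glyph[i][j]

size, score, row, col = multi_scale_match(page, templates)
print(size, round(score, 3), row, col)
```

Only the template whose scale agrees with the stamp on the page reaches a near-perfect score, which is why the scores from all passes are compared before a region is selected.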
In block 408, the region in the full resolution original document (e.g., the 600 dpi scanned image) corresponding to the matched region in the low resolution original document may be cropped out to form a cropped region. The cropped region may be considered to be likely to contain a digital stamp pattern and may represent a small area of the full resolution original document.
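The mapping from a matched region at low resolution to a crop box at full resolution is a simple rescaling of coordinates. The sketch below assumes the matched region is given as `(x, y, width, height)` in low resolution pixels and that images are lists of rows. The `margin` parameter, which pads the crop slightly, is an illustrative assumption and not part of this disclosure, as are the function names.

```python
def to_full_res_box(x, y, w, h, low_dpi, full_dpi, page_w, page_h, margin=0):
    """Scale a low resolution box to full resolution, clamped to the page.

    Returns (left, top, right, bottom) in full resolution pixels.
    """
    scale = full_dpi / low_dpi
    left = max(0, int(x * scale) - margin)
    top = max(0, int(y * scale) - margin)
    right = min(page_w, int((x + w) * scale) + margin)
    bottom = min(page_h, int((y + h) * scale) + margin)
    return left, top, right, bottom

def crop(page, box):
    """Cut the box out of a page stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in page[top:bottom]]

# Matched region found on a 100 dpi page; full scan is 600 dpi, 5100 x 6600 px.
box = to_full_res_box(100, 200, 150, 50, low_dpi=100, full_dpi=600,
                      page_w=5100, page_h=6600, margin=20)
print(box)

# Small demonstration of the crop itself on a 4 x 6 toy page.
toy_page = [[10 * r + c for c in range(6)] for r in range(4)]
patch = crop(toy_page, (1, 1, 4, 3))
print(patch)
```

Clamping keeps the crop inside the page when the stamp sits near an edge.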
In block 410, cropped region keypoints and cropped region descriptors may be extracted from the cropped region using an Oriented FAST and Rotated BRIEF (ORB) algorithm or a Binary Robust Invariant Scalable Keypoints (BRISK) algorithm. Commands such as the following may be used:
In block 412, the cropped region keypoints and cropped region descriptors in the cropped region may be matched with stamp pattern keypoints and stamp pattern descriptors using a feature based pattern matching routine. The feature based pattern matching routine may be a Fast Library for Approximate Nearest Neighbors (FLANN) algorithm, an ORB algorithm, a Scale-Invariant Feature Transform (SIFT) algorithm, or a Speeded Up Robust Features (SURF) algorithm. The command used may be as follows to invoke a FLANN algorithm:
Matches determined to be “good matches” per Lowe's ratio test may be stored for additional computations.
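Lowe's ratio test keeps a match only when its best candidate is clearly closer than its second-best candidate. The sketch below assumes each keypoint has already been matched to its two nearest neighbours and that only the two descriptor distances are supplied. The ratio of 0.75 is a commonly used value and is an assumption here, not a value given in this disclosure; the function name `ratio_test` is hypothetical.

```python
def ratio_test(knn_distances, ratio=0.75):
    """Return indices of matches that pass Lowe's ratio test.

    knn_distances: list of (best_distance, second_best_distance) per keypoint.
    """
    good = []
    for idx, (best, second) in enumerate(knn_distances):
        # Keep the match only if the best candidate is distinctly better.
        if best < ratio * second:
            good.append(idx)
    return good

# Illustrative descriptor distances for four keypoints.
pairs = [(10.0, 40.0), (30.0, 32.0), (5.0, 50.0), (20.0, 25.0)]
good = ratio_test(pairs)
print(good)
```

Ambiguous matches, where two candidates are almost equally close, are discarded, which reduces false correspondences before the transformation matrix is computed.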
In block 414, a transformation matrix may be computed using coordinates for the stamp pattern keypoints and coordinates for the cropped region keypoints to detect at least one of scaling, rotation, and translation of a detected digital stamp pattern in the cropped region relative to the original template pattern image. This transformation may in one embodiment be run on the above-discussed stored matches. In one embodiment, the transformation matrix may be an affine matrix. Affine transformations are described at a high level with respect to
In some embodiments, the transformation matrix may be a homography matrix. Homography mapping is described at a high level with respect to
In block 416, a number of qualified matches determined using at least one of the feature based pattern matching and the transformation matrix may be checked against a pre-set threshold. If at decision block 418 the number of qualified matches exceeds the pre-set threshold, an alert for a possible security issue may be generated at block 420. If at decision block 418 the number of qualified matches does not exceed the pre-set threshold, a signal indicating no security issues may be generated at block 422.
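One way to count qualified matches with a transformation matrix is to keep only the matches that the matrix maps to within a small distance of their partner keypoint, then compare that count with the threshold. The sketch below assumes a 2x3 affine matrix given as two rows and matches given as pairs of (template point, cropped region point). The pixel tolerance, the threshold values, and the names `count_qualified` and `judge` are illustrative assumptions.

```python
import math

def apply_affine(m, pt):
    """Apply a 2x3 affine matrix ((a, b, tx), (c, d, ty)) to a point (x, y)."""
    (a, b, tx), (c, d, ty) = m
    x, y = pt
    return a * x + b * y + tx, c * x + d * y + ty

def count_qualified(matches, m, tol=3.0):
    """Count matches whose mapped template point lands within tol pixels."""
    count = 0
    for src, dst in matches:
        px, py = apply_affine(m, src)
        if math.hypot(px - dst[0], py - dst[1]) <= tol:
            count += 1
    return count

def judge(num_qualified, threshold):
    """Alert only when the count exceeds the pre-set threshold."""
    return "ALERT" if num_qualified > threshold else "NO_ISSUE"

# Pure translation by (10, 20) as a simple example transformation.
m = ((1.0, 0.0, 10.0), (0.0, 1.0, 20.0))
matches = [
    ((0, 0), (10, 20)),   # exact
    ((5, 5), (15, 26)),   # off by one pixel
    ((2, 3), (40, 40)),   # inconsistent with the transformation
    ((7, 1), (17, 21)),   # exact
]
qualified = count_qualified(matches, m)
print(qualified, judge(qualified, threshold=2), judge(qualified, threshold=3))
```

Because the decision depends on a count of geometrically consistent matches rather than on a content-dependent similarity score, a single constant threshold can be applied without human confirmation.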
In block 504, the original template pattern image may be downscaled to at least one lower resolution template pattern image. In one embodiment, more than one lower resolution template pattern image may be created, and each lower resolution template pattern image may have a distinct resolution lower than that of the original template pattern image. Thus, a number of downscaled versions may be created, ranging, in some embodiments, from one tenth to one third the resolution of the original template pattern image. For example, an original 600 dpi template pattern image may be downscaled to a set of template pattern images of 100, 120, 140, 160, 180, and 200 dpi, as described in further detail with respect to
In block 506, a template matching routine may be run between a low resolution original document and at least one lower resolution template pattern image. Template based pattern matching may be run using a template matching routine such as the template matching function of the Open Source Computer Vision Library (OpenCV), available through its Python bindings. The low resolution original document may be a down-sampled scan of a page inserted into an MFP. For example, at run time, a page scanned at 600 dpi may be downscaled to 100 dpi.
The template matching routine(s) may result in a number of match correlation coefficients corresponding to different regions of the low resolution original document. In block 508, a matched region is selected in the low resolution original document based on the match correlation coefficients. The region having the largest match correlation coefficient returned by the template matching routine may be determined to be the matched region, corresponding to the region of the low resolution original document that contains a digital stamp pattern.
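Selecting the matched region amounts to finding the position of the largest coefficient in the map of match correlation coefficients. The sketch below assumes the map is a list of rows in which entry `[r][c]` is the coefficient for the template placed with its top-left corner at column `c`, row `r`. The function name `select_matched_region` and the sample values are hypothetical.

```python
def select_matched_region(score_map, tmpl_w, tmpl_h):
    """Return ((x, y, w, h), score) for the highest coefficient in the map."""
    best_score, (row, col) = max(
        (score, (r, c))
        for r, row_scores in enumerate(score_map)
        for c, score in enumerate(row_scores)
    )
    return (col, row, tmpl_w, tmpl_h), best_score

# Illustrative coefficients for a 3 x 3 grid of candidate positions.
score_map = [
    [0.10, 0.20, 0.10],
    [0.30, 0.92, 0.40],
    [0.20, 0.50, 0.30],
]
region, coeff = select_matched_region(score_map, tmpl_w=40, tmpl_h=12)
print(region, coeff)
```

The returned box is expressed in low resolution pixels and still has to be rescaled before cropping the full resolution document.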
In block 510, the region in the full resolution original document (e.g., the 600 dpi scanned image) corresponding to the matched region in the low resolution original document may be cropped out to form a cropped region. The cropped region may be considered to be likely to contain a digital stamp pattern and may represent a small area of the full resolution original document.
In block 512, cropped region keypoints and cropped region descriptors may be extracted from the cropped region using an Oriented FAST and Rotated BRIEF (ORB) algorithm or a Binary Robust Invariant Scalable Keypoints (BRISK) algorithm. Commands such as the following may be used:
In block 514, the cropped region keypoints and cropped region descriptors in the cropped region may be matched with stamp pattern keypoints and stamp pattern descriptors using a feature based pattern matching routine. The feature based pattern matching routine may be a Fast Library for Approximate Nearest Neighbors (FLANN) algorithm, an ORB algorithm, a Scale-Invariant Feature Transform (SIFT) algorithm, or a Speeded Up Robust Features (SURF) algorithm. The command used may be as follows to invoke a FLANN algorithm:
Matches determined to be “good matches” per Lowe's ratio test may be stored for additional computations.
In block 516, a transformation matrix may be computed using coordinates for the stamp pattern keypoints and coordinates for the cropped region keypoints to detect at least one of scaling, rotation, and translation of a detected digital stamp pattern in the cropped region relative to the original template pattern image. This transformation may in one embodiment be run on the above-discussed stored matches. In one embodiment, the transformation matrix may be an affine matrix. Affine transformations are described at a high level with respect to
In some embodiments, the transformation matrix may be a homography matrix. Homography mapping is described at a high level with respect to
In block 518, a number of qualified matches determined using at least one of the feature based pattern matching and the transformation matrix may be checked against a pre-set threshold. If at decision block 520 the number of qualified matches exceeds the pre-set threshold, an alert for a possible security issue may be generated at block 522. If at decision block 520 the number of qualified matches does not exceed the pre-set threshold, a signal indicating no security issues may be generated at block 524.
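The scaling, rotation, and translation mentioned for block 516 can be read directly from an affine matrix when that matrix is restricted to uniform scale, rotation, and translation. The sketch below makes that restriction as an explicit assumption (a general affine matrix may also contain shear and non-uniform scale) and uses a hypothetical function name, `decompose_similarity`. The example values of a two degree rotation and a 10% enlargement are illustrative.

```python
import math

def decompose_similarity(m):
    """Split ((a, b, tx), (c, d, ty)) into scale, rotation in degrees, translation.

    Assumes a = s*cos(t), b = -s*sin(t), c = s*sin(t), d = s*cos(t).
    """
    (a, _b, tx), (c, _d, ty) = m
    scale = math.hypot(a, c)
    rotation_deg = math.degrees(math.atan2(c, a))
    return scale, rotation_deg, (tx, ty)

# Build a matrix for a stamp enlarged by 10%, rotated 2 degrees, and shifted.
s, theta = 1.1, math.radians(2.0)
m = (
    (s * math.cos(theta), -s * math.sin(theta), 12.0),
    (s * math.sin(theta),  s * math.cos(theta), -7.0),
)
scale, rotation_deg, translation = decompose_similarity(m)
print(round(scale, 6), round(rotation_deg, 6), translation)
```

Values recovered this way could also serve as a sanity check, since a scanned stamp is expected to show only small rotation and scale changes.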
In this template matching process 900, the lower resolution template pattern image 902 may be compared to the low resolution original document 904 using template based pattern matching 906. Template based pattern matching 906 may identify a matched region 908, where the resemblance of a detected digital stamp pattern 910 on the low resolution original document 904 with a lower resolution template pattern image 902 yields high match correlation coefficients.
Cropping 912 an area in the full resolution original document 800 corresponding to the matched region 908 may yield a cropped region 914. This cropped region 914 at full resolution may represent a richly featured image suitable for comparison with the original template, while still smaller than the entire full resolution original document 800.
In one embodiment, where the digital stamp pattern is for a confidentiality or security stamp, a match may generate an alert for a possible security issue. Such a generated alert may result in a number of actions, such as preventing printing of the document inserted into the MFP, sending a security alert email to personnel responsible for managing document security, etc.
In cases where the number of qualified matches does not meet the pre-set threshold, the full resolution original document may be determined not to contain a digital stamp pattern, and a signal indicating no security issues may be generated. In cases where security stamps are of particular interest and a document does not contain the digital stamp pattern corresponding to a security stamp, the signal indicating no security issues may trigger the MFP to complete copying, scanning, or printing of the full resolution original document.
Relationships between features within an image such as View 1 are not disturbed by affine transformations. Thus the relationships between features within View 3 may be determined to be similar to the relationships among View 1 features, allowing feature based pattern matching to match View 1 and View 3 as patterns having similarity above a certain threshold, by computing an affine matrix as described with respect to
View 1 features and View 2 features may be determined to be similar above a certain threshold by computing a homography matrix as described with respect to
As depicted in
The volatile memory 1412 and/or the nonvolatile memory 1416 may store computer-executable instructions, thus forming logic 1422 that, when applied to and executed by the processor(s) 1406, implements embodiments of the processes disclosed herein.
The input device(s) 1410 include devices and mechanisms for inputting information to the data processing system 1402. These may include a keyboard, a keypad, a touch screen incorporated into the monitor or graphical user interface 1404, audio input devices such as voice recognition systems, microphones, and other types of input devices. In various embodiments, the input device(s) 1410 may be embodied as a computer mouse, a trackball, a track pad, a joystick, wireless remote, drawing tablet, voice command system, eye tracking system, and the like. The input device(s) 1410 typically allow a user to select objects, icons, control areas, text and the like that appear on the monitor or graphical user interface 1404 via a command such as a click of a button or the like.
The output device(s) 1408 include devices and mechanisms for outputting information from the data processing system 1402. These may include the monitor or graphical user interface 1404, speakers, printers, infrared LEDs, and so on as well understood in the art.
The communication network interface 1414 provides an interface to communication networks (e.g., communication network 1420) and devices external to the data processing system 1402. The communication network interface 1414 may serve as an interface for receiving data from and transmitting data to other systems. Embodiments of the communication network interface 1414 may include an Ethernet interface, a modem (telephone, satellite, cable, ISDN), (asymmetric) digital subscriber line (DSL), FireWire, USB, a wireless communication interface such as Bluetooth or Wi-Fi, a near field communication wireless interface, a cellular interface, and the like.
The communication network interface 1414 may be coupled to the communication network 1420 via an antenna, a cable, or the like. In some embodiments, the communication network interface 1414 may be physically integrated on a circuit board of the data processing system 1402, or in some cases may be implemented in software or firmware, such as “soft modems”, or the like.
The computing device 1400 may include logic that enables communications over a network using protocols such as HTTP, TCP/IP, RTP/RTSP, IPX, UDP and the like.
The volatile memory 1412 and the nonvolatile memory 1416 are examples of tangible media configured to store computer readable data and instructions to implement various embodiments of the processes described herein. Other types of tangible media include removable memory (e.g., pluggable USB memory devices, mobile device SIM cards), optical storage media such as CD-ROMs and DVDs, semiconductor memories such as flash memories, non-transitory read-only memories (ROMs), battery-backed volatile memories, networked storage devices, and the like. The volatile memory 1412 and the nonvolatile memory 1416 may be configured to store the basic programming and data constructs that provide the functionality of the disclosed processes and other embodiments thereof that fall within the scope of the present disclosure.
Logic 1422 that implements embodiments of the present disclosure may be stored in the volatile memory 1412 and/or the nonvolatile memory 1416. Said logic 1422 may be read from the volatile memory 1412 and/or nonvolatile memory 1416 and executed by the processor(s) 1406. The volatile memory 1412 and the nonvolatile memory 1416 may also provide a repository for storing data used by the logic 1422.
The volatile memory 1412 and the nonvolatile memory 1416 may include a number of memories including a main random access memory (RAM) for storage of instructions and data during program execution and a read only memory (ROM) in which read-only non-transitory instructions are stored. The volatile memory 1412 and the nonvolatile memory 1416 may include a file storage subsystem providing persistent (non-volatile) storage for program and data files. The volatile memory 1412 and the nonvolatile memory 1416 may include removable storage systems, such as removable flash memory.
The bus subsystem 1418 provides a mechanism for enabling the various components and subsystems of data processing system 1402 to communicate with each other as intended. Although the bus subsystem 1418 is depicted schematically as a single bus, some embodiments of the bus subsystem 1418 may utilize multiple distinct busses.
It will be readily apparent to one of ordinary skill in the art that the computing device 1400 may be a device such as a smartphone, a desktop computer, a laptop computer, a rack-mounted computer system, a computer server, or a tablet computer device. As commonly known in the art, the computing device 1400 may be implemented as a collection of multiple networked computing devices. Further, the computing device 1400 will typically include operating system logic (not illustrated) the types and nature of which are well known in the art.
Terms used herein should be accorded their ordinary meaning in the relevant arts, or the meaning indicated by their use in context, but if an express definition is provided, that meaning controls.
“Circuitry” in this context refers to electrical circuitry having at least one discrete electrical circuit, electrical circuitry having at least one integrated circuit, electrical circuitry having at least one application specific integrated circuit, circuitry forming a general purpose computing device configured by a computer program (e.g., a general purpose computer configured by a computer program which at least partially carries out processes or devices described herein, or a microprocessor configured by a computer program which at least partially carries out processes or devices described herein), circuitry forming a memory device (e.g., forms of random access memory), or circuitry forming a communications device (e.g., a modem, communications switch, or optical-electrical equipment).
“Firmware” in this context refers to software logic embodied as processor-executable instructions stored in read-only memories or media.
“Hardware” in this context refers to logic embodied as analog or digital circuitry.
“Logic” in this context refers to machine memory circuits, non-transitory machine readable media, and/or circuitry which by way of its material and/or material-energy configuration comprises control and/or procedural signals, and/or settings and values (such as resistance, impedance, capacitance, inductance, current/voltage ratings, etc.), that may be applied to influence the operation of a device. Magnetic media, electronic circuits, electrical and optical memory (both volatile and nonvolatile), and firmware are examples of logic. Logic specifically excludes pure signals or software per se (but does not exclude machine memories comprising software and thereby forming configurations of matter).
“Software” in this context refers to logic implemented as processor-executable instructions in a machine memory (e.g. read/write volatile or nonvolatile memory or media).
Herein, references to “one embodiment” or “an embodiment” do not necessarily refer to the same embodiment, although they may. Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to.” Words using the singular or plural number also include the plural or singular number respectively, unless expressly limited to a single one or multiple ones. Additionally, the words “herein,” “above,” “below” and words of similar import, when used in this application, refer to this application as a whole and not to any particular portions of this application. When the claims use the word “or” in reference to a list of two or more items, that word covers all of the following interpretations of the word: any of the items in the list, all of the items in the list and any combination of the items in the list, unless expressly limited to one or the other. Any terms not expressly defined herein have their conventional meaning as commonly understood by those having skill in the relevant art(s).
Various logic functional operations described herein may be implemented in logic that is referred to using a noun or noun phrase reflecting said operation or function. For example, an association operation may be carried out by an “associator” or “correlator”. Likewise, switching may be carried out by a “switch”, selection by a “selector”, and so on.