Marks are applied to a good to uniquely identify the good. A mark is a symbol that encodes information in accordance with a predefined symbology. Counterfeit goods are widely available and often hard to spot. When counterfeiters produce fake goods, they typically copy the associated symbology, including a mark, in addition to the actual goods. To the human eye, a photocopy or counterfeit mark can appear genuine and even yield the appropriate message (e.g., decode to the appropriate message associated with the symbology). Many of the technologies currently available to counter such copying rely on visually comparing an image of a possible counterfeit mark with an image of an original, genuine mark.
An embodiment described herein provides a method that enables photocopy or counterfeit detection in symbologies. The method includes obtaining a set of images, each including a representation of a same mark, and predicting an authenticity of the representation of the same mark in each image to obtain an authenticity prediction corresponding to each image of the set of images. The method also includes consolidating the authenticity predictions to determine an ensemble prediction of authenticity associated with the same mark.
An embodiment described herein provides a system that enables photocopy or counterfeit detection in symbologies. The system includes at least one processor and at least one non-transitory storage media storing instructions that, when executed by the at least one processor, cause the at least one processor to obtain a set of images, each including a representation of a same mark, and predict an authenticity of the representation of the same mark in each image to obtain an authenticity prediction corresponding to each image of the set of images. The instructions also cause the at least one processor to consolidate the authenticity predictions to determine an ensemble prediction of authenticity associated with the same mark.
An embodiment described herein provides at least one non-transitory storage media storing instructions that, when executed by at least one processor, cause the at least one processor to obtain a set of images, each including a representation of a same mark, and predict an authenticity of the representation of the same mark in each image to obtain an authenticity prediction corresponding to each image of the set of images. The instructions also cause the at least one processor to consolidate the authenticity predictions to determine an ensemble prediction of authenticity associated with the same mark.
Marks are used to convey information or data (e.g., a message) associated with goods, and are applied to the goods or packaging of the goods. Marks can be referred to as an “original item identifier” and include quick response (QR) codes and barcodes. In some cases, marks are generated using a thermal transfer or ink-jet process to create highly uniform, solid, black or other printed areas using a predetermined format. The precise printing used to generate marks can produce marks that are not easily replicated. Processes used to replicate a mark tend to produce printed areas in which the black areas are grayer at low resolutions and mottled at high resolutions when compared to a genuine mark. The present systems and techniques enable photocopy or counterfeit detection in symbologies. Subtle differences in marks are detected and evaluated to predict if the mark is a genuine mark.
In the example of
Generally, symbologies such as the QR-codes 102, barcodes 104, and laser print 106 are captured by an image capture device such as a scanner or camera, and are processed to extract information contained in the mark. The image capture device detects an intensity of light reflected by spaces in the pattern created by the mark. In some embodiments, an illumination system outputs infrared light that is reflected by portions of the mark, and the image capture device captures reflected infrared light.
For ease of description, particular symbologies are described. However, the symbologies may include any suitable mark. In some cases, the mark is any mark that represents data in a visual, machine-readable form. In examples, the symbology is continuous or discrete. In examples, a continuous symbology is one in which the entire symbology is read at once; portions of a continuous symbology are invalid on their own. By contrast, a discrete symbology is a symbology with multiple independent portions. The symbology may be two-width or many-width, with several bars or spaces that are multiples of a basic width. In some cases, the symbology includes interleaving. In interleaving, a first character is encoded using black bars of varying width, and a second character is then encoded by varying the width of the white spaces between the black bars of the first character. Characters are encoded in pairs over the same section of the barcode. The present techniques detect photocopy/counterfeit marks that may be otherwise undetectable. Detecting photocopy/counterfeit marks enables increased confidence in symbologies as representing genuine, authentic goods. The identification of photocopy/counterfeit marks can be used to eliminate the sale of unauthorized goods. In this manner, accurate photocopy/counterfeit detection according to the present techniques positively impacts customer trust in a brand as well as brand value.
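The interleaving described above can be sketched using the Interleaved 2 of 5 (ITF) symbology, a canonical interleaved barcode. The digit-to-width table below uses the standard 2-of-5 patterns (each digit maps to five elements, exactly two of which are wide); the table and function names are illustrative and not drawn from this document.

```python
# Illustrative sketch of interleaving, based on the standard
# Interleaved 2 of 5 (ITF) digit patterns (N = narrow, W = wide).
ITF_PATTERNS = {
    "0": "NNWWN", "1": "WNNNW", "2": "NWNNW", "3": "WWNNN", "4": "NNWNW",
    "5": "WNWNN", "6": "NWWNN", "7": "NNNWW", "8": "WNNWN", "9": "NWNWN",
}

def interleave_pair(first_digit: str, second_digit: str) -> list:
    """Encode two digits over the same section of the barcode.

    The first digit is carried by the widths of the five black bars;
    the second digit is carried by the widths of the five white spaces
    between those bars, interleaved bar/space/bar/space/...
    """
    bars = ITF_PATTERNS[first_digit]
    spaces = ITF_PATTERNS[second_digit]
    elements = []
    for bar_width, space_width in zip(bars, spaces):
        elements.append(("bar", bar_width))
        elements.append(("space", space_width))
    return elements
```

Decoding reverses the process: bar widths are read for the first character and space widths for the second over the same physical section of the barcode.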
Generally, a photocopy is a mark that results from copying a genuine mark that represents goods and/or services. A photocopy is not a genuine mark. The photocopy is presented as a genuine representation of a mark and is intended to contain the same information as the mark from which the photocopy is generated. Copying techniques and devices, such as photocopiers, are able to emulate a genuine mark of a symbology with a high accuracy. In examples, the high accuracy refers to a photocopy that is undetectable by the human eye. Generally, a counterfeit mark refers to a mark that is intentionally created to deceive or mislead while being associated with information and/or representing goods of an authentic mark. A counterfeit mark is not a genuine mark. The counterfeit mark is presented as a genuine representation of a mark.
In the workflow 200, image capture 202 is used to obtain multiple image pairs of a mark. Obtaining image pairs includes receiving the images from another source or capturing the images using an image capture device. The other source can be a local or remote database with stored images. In the example of
In operation, two or more images of the mark are captured, generating images that contain a representation of the same mark. In some embodiments, the images are high resolution images. The resolution can vary. In some examples, a high resolution image is an image that has a resolution of greater than 96 dots per inch (dpi). A cell phone, camera, scanner, or other device (e.g., controller 802 of
For example, a person uses a device to capture multiple images of the mark. During the course of image capture, poses of the mark vary. In some cases, the variation in poses among images is minimal, such that the mark is at a substantially similar pose in the multiple images. This can occur, for example, when the images are taken in a short period of time (e.g., less than a couple of seconds) or as the result of a burst mode that automatically captures multiple images in rapid succession. In some cases, the variation in poses creates at least one image with the same mark in a different pose when compared to the multiple images. This can occur, for example, when the images are taken over a longer period of time (e.g., more than a couple of seconds). In some cases, the multiple images vary in quality. For example, the multiple images can have varying levels of focus/blurriness, illumination, percentage of the mark contained within the image capture device field of view, or any combinations thereof.
Multiple images containing a representation of a mark are provided as input to a trained machine learning model 204, and the trained machine learning model 204 makes a prediction of genuine or photocopy/counterfeit for each image of the multiple images. A consolidator 206 consolidates the prediction of genuine or photocopy/counterfeit for each image of the multiple images and outputs a prediction of authenticity for the mark. A mode of predictions associated with the multiple images is used to consolidate the machine learning model results for the multiple images. The mode refers to statistical voting. In examples, if an equal number of images are predicted as genuine or photocopy/counterfeit, then the mark is predicted to be a photocopy/counterfeit. For example, when six images of a mark are obtained at image capture 202 and provided as input to a trained machine learning model 204, predictions for each image are consolidated by determining a mode or statistical voting associated with the predictions. In this example, if the trained machine learning model outputs a prediction for three images of photocopy/counterfeit and a prediction for the three remaining images of genuine, the consolidator labels the mark contained in the images as a photocopy/counterfeit.
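The mode/statistical-voting consolidation described above can be sketched as follows, with the tie rule resolving to photocopy/counterfeit; the label strings are illustrative.

```python
from collections import Counter

def consolidate(predictions: list) -> str:
    """Consolidate per-image predictions into one prediction for the mark.

    Statistical voting (the mode) selects the most frequent label; an
    equal split between "genuine" and "photocopy" resolves conservatively
    to "photocopy", as described above.
    """
    counts = Counter(predictions)
    if counts["genuine"] > counts["photocopy"]:
        return "genuine"
    return "photocopy"  # photocopy majority, or a tie

# Six per-image predictions with a 3-3 split: the tie is treated
# as photocopy/counterfeit.
label = consolidate(["genuine"] * 3 + ["photocopy"] * 3)
```

The strict majority test encodes the tie rule: only when genuine predictions outnumber photocopy predictions is the mark labeled genuine.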
In examples, the present techniques enable high-resolution image capture of six images containing a representation of at least a portion of the same mark. For each image, the prediction of genuine or photocopy/counterfeit is realized using one or more machine learning models 204. The various machine learning models are further described with respect to
The basis for Haralick features is a gray-level co-occurrence matrix (GLCM) G [Ng, Ng]. This matrix is square with dimension Ng, where Ng is the number of gray levels in the image. Element [i,j] of the matrix is generated by counting the number of times a pixel with value i is adjacent to a pixel with value j and then dividing the entire matrix by the total number of such comparisons made. Each entry is therefore considered to be the probability that a pixel with value i will be found adjacent to a pixel of value j. Accordingly, the GLCM of an image is defined as follows:
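The count-then-normalize construction of the GLCM can be sketched directly from the definition above. This minimal sketch considers a single adjacency offset (the horizontal neighbor to the right); in practice multiple offsets and directions are typically accumulated.

```python
import numpy as np

def glcm(image: np.ndarray, num_levels: int, offset=(0, 1)) -> np.ndarray:
    """Gray-level co-occurrence matrix for one adjacency offset.

    Entry [i, j] counts how often a pixel with value i has a neighbor
    with value j at the given offset, then the whole matrix is divided
    by the total number of comparisons so entries are probabilities.
    """
    g = np.zeros((num_levels, num_levels))
    dr, dc = offset
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                g[image[r, c], image[nr, nc]] += 1
    return g / g.sum()
```

Because each entry is divided by the total comparison count, the matrix sums to one and entry [i, j] is the probability of finding value j adjacent to value i.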
At block 304, predefined feature types are illustrated. The Haralick features include the following predefined features, which are computed using the GLCM for each image: 1) Angular Second Moment; 2) Contrast; 3) Correlation; 4) Sum of Squares: Variance; 5) Inverse Difference Moment; 6) Sum Average; 7) Sum Variance; 8) Sum Entropy; 9) Entropy; 10) Difference Variance; 11) Difference Entropy; 12) Information Measures of Correlation; 13) Maximal Correlation Coefficient. Although predefined features are provided herein, other features are available through statistical analysis of the GLCM.
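As an illustrative sketch (not the full set of thirteen features), a few of the listed Haralick features can be computed directly from a normalized GLCM; the function name and the choice of features shown are assumptions for illustration.

```python
import numpy as np

def haralick_subset(g: np.ndarray) -> dict:
    """Compute a subset of Haralick features from a normalized GLCM g.

    Assumes g sums to 1 so each entry is a co-occurrence probability.
    """
    levels = g.shape[0]
    i, j = np.indices((levels, levels))
    asm = np.sum(g ** 2)                    # 1) Angular Second Moment
    contrast = np.sum(((i - j) ** 2) * g)   # 2) Contrast
    nonzero = g[g > 0]
    entropy = -np.sum(nonzero * np.log2(nonzero))  # 9) Entropy
    return {"asm": asm, "contrast": contrast, "entropy": entropy}
```

Each feature summarizes the GLCM differently: ASM measures uniformity, contrast weights co-occurrences by gray-level difference, and entropy measures the randomness of the co-occurrence distribution.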
Generally, a texture feature is determined by analyzing a spatial distribution of gray values in an image by computing local features at each point in the image and inferring a set of statistics from the distributions of the local features using the GLCM. The distribution of gray values for each feature is different for a photocopy/counterfeit mark when compared to an image of a genuine mark. At block 306, the extracted features are input into a classification algorithm (e.g., a machine learning classifier such as XGBoost, Random Forest, or a support vector machine (SVM)-based algorithm) to predict a classification of the images as genuine or photocopy/counterfeit. The predictions from the multiple images are consolidated into a prediction for the captured mark.
At block 308, a confusion matrix is illustrated. The confusion matrix is used to evaluate the effectiveness of the trained classifier at block 306. Machine learning statistical measures are implemented to determine error-based uncertainties. Machine learning statistical measures include, but are not limited to, false positive (FP), false negative (FN), true positive (TP), and true negative (TN) probabilities. Generally, the machine learning statistical measures are based on a ground truth compared with a prediction output by a trained classifier. In evaluating the machine learning statistical measures, a false positive is an error that indicates a condition exists when it actually does not exist. A false negative is an error that incorrectly indicates that a condition does not exist. A true positive is a correctly indicated positive condition, and a true negative is a correctly indicated negative condition. For evaluation of a trained classifier, the actual category of each image is known. By evaluating the performance of the classifier with known data in terms of FP, FN, TP, and TN, the trained classifier can be iteratively updated to provide optimal performance with input images.
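The four confusion-matrix tallies described above can be sketched as follows; treating photocopy/counterfeit as the positive condition is an assumption for illustration.

```python
def confusion_counts(ground_truth, predictions, positive="photocopy"):
    """Tally TP, FP, TN, FN by comparing predictions to ground truth.

    Assumes "photocopy" is the positive condition: a true positive is a
    photocopy correctly flagged; a false positive is a genuine mark
    incorrectly flagged as a photocopy.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(ground_truth, predictions):
        if pred == positive:
            if truth == positive:
                tp += 1  # correctly indicated positive condition
            else:
                fp += 1  # indicated a condition that does not exist
        else:
            if truth == positive:
                fn += 1  # incorrectly indicated no condition
            else:
                tn += 1  # correctly indicated negative condition
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}
```

Derived rates such as accuracy or recall follow from these counts and can drive the iterative updates mentioned above.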
At block 504, the base model is customized. In particular, weights of the selected pre-trained deep learning model are updated in a controlled, predetermined manner. Layers of the pre-trained deep learning model are iteratively frozen during further training. By freezing layers during training, computational time used for training is reduced with a minimal loss in accuracy of the resulting trained model. In examples, customization includes a combination of freezing and unfreezing layers in the base-model and adding custom convolutional blocks and dense layers to the base model. The customized pre-trained model is then used to predict a classification of the images as genuine or photocopy/counterfeit at block 506. A confusion matrix at block 508 is used to evaluate the effectiveness of the customized pre-trained model used to predict classifications at block 506.
The base model is customized at block 604 by freezing and unfreezing layers in the base-model and adding custom convolutional blocks and dense layers. At block 606, the customized pre-trained model is converted to a pre-trained lite deep learning model. In some embodiments, conversion to a lite model facilitates mobile or edge deployment (e.g., execution of the model using mobile platforms or operating systems including iOS, Android, and the like). In examples, the pre-trained models are hosted by a cloud-based infrastructure.
At block 608, quantization is applied to the pre-trained lite deep learning model to enable mobile deployment and integration with current mobile device applications. Quantization reduces the model size without compromising an accuracy of predictions made by the model. Generally, quantization generates an approximation of a machine learning model by representing floating-point numbers with numbers using a lower bit-length. The approximation of the model reduces the memory used to store the model as well as computational resources consumed during the execution of the model.
Quantization can be performed according to one or more quantization techniques. For example, quantization includes, but is not limited to, dynamic range quantization, full integer quantization, float16 quantization, or any combinations thereof. Generally, dynamic range quantization quantizes weights of a trained model based on the dynamic range of the weights of the trained model. In examples, dynamic range quantization transforms weights of the trained model from a floating point numerical representation to an integer numerical representation. Floating point numerical representations can occupy 32-bits of memory, while integer numerical representations occupy 8-bits of memory. Accordingly, dynamic range quantization creates a model with reduced memory use when compared to the original model. Similarly, full integer quantization enables reductions in memory use by quantizing all weights and activation outputs of the trained model to integer numerical representations, each 8-bits in size. In some embodiments, a range (e.g., minimum and maximum) of all floating-point tensors of the trained model are estimated to determine the quantizations of the weights and activation outputs during full integer quantization.
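The float-to-integer mapping underlying dynamic range quantization can be sketched with a simplified symmetric scheme: a single scale maps the observed weight range onto signed 8-bit integers. This is an illustrative approximation, not the exact scheme used by any particular toolchain.

```python
import numpy as np

def dynamic_range_quantize(weights: np.ndarray):
    """Quantize float32 weights to int8 based on their dynamic range.

    A symmetric sketch: one scale maps [-max_abs, max_abs] onto the
    signed 8-bit range [-127, 127], shrinking each value from 32 bits
    to 8 bits of storage.
    """
    max_abs = np.max(np.abs(weights))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights by multiplying back the scale."""
    return q.astype(np.float32) * scale
```

The round trip loses at most about half a quantization step per weight, which is why quantization can reduce model size roughly fourfold without materially compromising prediction accuracy.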
Float16 quantization reduces the size of a trained model that includes floating-point numerical representations by quantizing the weights to float16. Generally, float16 is a 16-bit floating-point representation according to the "IEEE Standard for Floating-Point Arithmetic," IEEE Std 754-2019 (Revision of IEEE Std 754-2008), pp. 1-84, 22 Jul. 2019, doi: 10.1109/IEEESTD.2019.8766229. Float16 quantization reduces the size of the pre-trained model by up to half by reducing weights of the pre-trained model to half of their original size.
Table 2 provides a summary of the accuracy associated with trained models 300, 400, 500, and 600. For each model, input images were provided of marks that include photocopies and genuine marks. Each model demonstrated accurate classification, at nearly 100%, of genuine or photocopy on unseen images. In Table 2, the macro-average is the mean of the genuine and photocopy accuracies; this metric represents model accuracy across the two classes (genuine and photocopy). Training time, in seconds, provides the duration of time for training each model.
The quantized pre-trained deep learning model is then used to predict a classification of the images as genuine or photocopy/counterfeit at block 610. A confusion matrix at block 612 is used to evaluate the effectiveness of the quantized pre-trained deep learning model used to predict classifications at block 610.
At block 704, an authenticity of the representation of the same mark in each image is predicted to obtain an authenticity prediction corresponding to each image of the set of images. The authenticity prediction of the representation of the same mark refers to the mark being classified as a genuine mark or a photocopy/counterfeit mark. A classification of genuine or photocopy/counterfeit is predicted for each image. The classifications are determined by the model 300 of
At block 706, the authenticity predictions are consolidated to determine an ensemble prediction of authenticity associated with the same mark. In some embodiments, a mode of authenticity predictions associated with the multiple images is calculated to determine the ensemble prediction. If an equal number of images have authenticity predictions of genuine and photocopy/counterfeit, then the mark is predicted to be a photocopy/counterfeit. The ensemble prediction of authenticity is genuine when a predetermined number of authenticity predictions indicate the same mark is genuine. The ensemble prediction of authenticity is photocopy/counterfeit when a predetermined number of authenticity predictions indicate the same mark is a photocopy/counterfeit. In examples, the predetermined number of authenticity predictions is half of the authenticity predictions output by a model.
In some embodiments, each of the model 300 of
In some embodiments, the authenticity predictions from the models are consolidated to determine an ensemble prediction of authenticity associated with the same mark. The authenticity predictions from the models may be consolidated according to weighted voting, where predictions from the models are differently weighted to calculate an ensemble prediction. In examples, a weight of 1.5 is applied to an authenticity prediction output by model 300, while a weight of 1 is applied to the authenticity prediction from other model(s). In some embodiments, weights are assigned to the authenticity predictions output by the one or more models based on a respective prior performance of the model or by a user.
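Weighted voting across models can be sketched as follows, using the example weights above (1.5 for model 300, 1 otherwise); the model names and tie rule are illustrative assumptions.

```python
def weighted_vote(predictions: dict, weights: dict) -> str:
    """Weighted voting over per-model authenticity predictions.

    Each model's vote counts with its assigned weight (defaulting to
    1.0); the label with the larger total wins, and a tie resolves to
    photocopy, consistent with the unweighted voting rule.
    """
    totals = {"genuine": 0.0, "photocopy": 0.0}
    for model, label in predictions.items():
        totals[label] += weights.get(model, 1.0)
    return "genuine" if totals["genuine"] > totals["photocopy"] else "photocopy"

# Model 300 votes genuine (weight 1.5); two other models vote
# photocopy (weight 1.0 each), so photocopy wins 2.0 to 1.5.
label = weighted_vote(
    {"model_300": "genuine", "model_400": "photocopy", "model_500": "photocopy"},
    {"model_300": 1.5},
)
```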
The authenticity predictions from the models may also be consolidated according to score averaging. In score averaging, the models output an authenticity prediction that is a likelihood or probability of the same mark being genuine or photocopy/counterfeit. The likelihood or probability is used to generate a score. In an example, a score of one indicates an authenticity prediction of genuine, and a score of zero indicates an authenticity prediction of photocopy/counterfeit. The scores output by the models are averaged, and an ensemble prediction of authenticity is based on the average score. A predetermined threshold is applied to the average score. For example, if the average is greater than a predetermined threshold of 0.5, the ensemble prediction of authenticity is genuine. The authenticity predictions from the models may be consolidated according to weighted score averaging. In weighted score averaging, a differential weight is applied to the score from each model, such as the scores obtained according to score averaging.
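Score averaging and its weighted variant can be sketched in one function: plain averaging when no weights are supplied, a weighted average otherwise, with the 0.5 threshold from the example above.

```python
def score_average(scores, weights=None, threshold=0.5):
    """Consolidate per-model scores (1.0 = genuine, 0.0 = photocopy).

    With no weights this is plain score averaging; with weights it is
    weighted score averaging. The ensemble prediction is genuine only
    when the (weighted) average exceeds the predetermined threshold.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    avg = sum(s * w for s, w in zip(scores, weights)) / sum(weights)
    return "genuine" if avg > threshold else "photocopy"

# Two of three models score the mark genuine: average 2/3 > 0.5.
label = score_average([1.0, 1.0, 0.0])
```

Because the threshold comparison is strict, an average of exactly 0.5 yields photocopy, matching the tie-breaking behavior of the voting schemes.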
In some embodiments, a set of metrics is calculated for a mark with an authenticity prediction of genuine. To calculate the set of metrics, the mark is divided into a grid of cells, each cell representing a portion of the mark. In examples, the metrics include, for example, a deviation in average cell pigmentation or marking intensity, a cell position bias relative to a best-fit grid, the presence or location of extraneous marks or voids in the mark, and the shape (linearity) of long continuous edges of the mark. An electronic signature is generated based on the set of metrics for the genuine mark. The electronic signature is generated in parallel with consolidating the authenticity predictions to determine the ensemble prediction of authenticity associated with the same mark.
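One of the listed metrics, deviation in average cell pigmentation, can be sketched as follows; the grid division and the exact deviation formula are illustrative assumptions, not the document's specific signature computation.

```python
import numpy as np

def cell_pigmentation_deviation(mark: np.ndarray, grid: int) -> np.ndarray:
    """Deviation in average cell pigmentation across a grid of cells.

    The mark image is divided into grid x grid cells; each cell's mean
    intensity is compared against the mean over all cells. The array of
    deviations could serve as one component of an electronic signature.
    """
    rows, cols = mark.shape
    ch, cw = rows // grid, cols // grid
    means = np.array([
        mark[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw].mean()
        for r in range(grid) for c in range(grid)
    ]).reshape(grid, grid)
    return means - means.mean()
```

The per-cell deviations sum to zero by construction, so the metric captures relative pigmentation variation across the mark rather than overall brightness.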
The controller 802 includes a processor 804. The processor 804 can be a microprocessor, a multi-core processor, a multithreaded processor, an ultra-low-voltage processor, an embedded processor, or a virtual processor. In some embodiments, the processor 804 can be part of a system-on-a-chip (SoC) in which the processor 804 and the other components of the controller 802 are formed into a single integrated electronics package.
The processor 804 can communicate with other components of the controller 802 over a bus 806. The bus 806 can include any number of technologies, such as industry standard architecture (ISA), extended ISA (EISA), peripheral component interconnect (PCI), peripheral component interconnect extended (PCIx), PCI express (PCIe), or any number of other technologies. The bus 806 can be a proprietary bus, for example, used in an SoC based system. Other bus technologies can be used, in addition to, or instead of, the technologies above.
The bus 806 can couple the processor 804 to a memory 808. In some embodiments, such as in PLCs and other process control units, the memory 808 is integrated with a data storage 810 used for long-term storage of programs and data. The memory 808 can include any number of volatile and nonvolatile memory devices, such as volatile random-access memory (RAM), static random-access memory (SRAM), flash memory, and the like. In smaller devices, such as programmable logic controllers, the memory 808 can include registers associated with the processor itself. The storage 810 is used for the persistent storage of information, such as data, applications, operating systems, and so forth. The storage 810 can be a nonvolatile RAM, a solid-state disk drive, or a flash drive, among others. In some embodiments, the storage 810 will include a hard disk drive, such as a micro hard disk drive, a regular hard disk drive, or an array of hard disk drives, for example, associated with a distributed computing system or a cloud server.
The bus 806 couples the processor 804 to an input/output interface 812. The input/output interface 812 connects the controller 802 to the input/output devices 814. In some embodiments, the input/output devices 814 include printers, displays, touch screen displays, keyboards, mice, pointing devices, and the like. In some examples, one or more of the I/O devices 814 can be integrated with the controller 802 into a computer, such as a mobile computing device, e.g., a smartphone or tablet computer. The controller 802 also includes an image capture device 816. Generally, the image capture device 816 includes hardware associated with image capture. The image capture device can be, for example, a camera or scanner. In some embodiments, the image capture device 816 automatically captures a representation of a mark. In some embodiments, the image capture device 816 captures a representation of a mark in response to input from a user at an input/output device 814.
The controller 802 also includes machine learning models 818. The machine learning models 818 can be, for example, the model 300 of
In some embodiments, a signature generator 826 measures a set of metrics associated with a characteristic of a genuine mark and generates an electronic signature based on the set of metrics for the genuine mark, wherein the electronic signature is generated in parallel with consolidating the authenticity predictions to determine the ensemble prediction of authenticity associated with the same mark. In some examples, the electronic signature is generated in parallel with consolidating (e.g., consolidator 206 of
In the example of
Other implementations are also within the scope of the following claims.
This application claims priority to U.S. Provisional Application No. 63/292,706, filed Dec. 22, 2021, the entire contents of which are incorporated herein by reference.
Number | Date | Country
--- | --- | ---
63292706 | Dec 2021 | US