Self-Calibration

FIELD

The present invention relates to self-calibration, and in particular, but not exclusively, to self-calibration of a matching algorithm in the context of determining the authenticity of an article.

BACKGROUND

In the field of authenticating physical articles it is known to rely upon an identifier for the article. An identifier based on a physical property may be used; such identifiers can include embedded reflective particles (WO02/50790A1, U.S. Pat. No. 6,584,214) or an unmodified surface of the article (WO2005/088533).

To provide an authentication result based upon such an identifier, it is necessary to compare a reading from the article to be authenticated to a stored reading result. For this comparison, a match finding algorithm is used.

The present invention has been conceived in the light of known drawbacks of existing systems.

SUMMARY

Viewed from a first aspect, the present invention provides mitigation of processing artefacts caused by surfaces with high contrast printing or colouring transitions within a system to compare signatures derived from inherent physical surface properties of different articles to authenticate or validate articles and within a system to generate signatures from inherent physical surface properties of different articles.

Viewed from another aspect, the present invention can provide a method for performing a comparison between fuzzy data signatures, the method comprising performing a cross-comparison between a test signature and each of a plurality of record signatures, and determining whether the test signature matches one of the plurality of record signatures using a self-calibrating method. Use of a self-calibrating method allows high magnitude signal intensity transitions in the signals which were used to create the signatures to be processed so as to mitigate the processing artefacts, and the resulting loss of information from the signals, that such large transitions cause.

In some examples, the self-calibrating method utilises a measure of the randomness of each signature bit. Thus, those bits which are caused to have the same bit value by printing or colouration effects rather than by inherent surface properties of the article material can be accorded less weight in determining whether a match occurs than those bits which are not, or are less, influenced by printing or colouration.

In some examples, the measure of the randomness is derived from a comparison between a best putative match candidate of the record signatures and one or more further putative match candidates of the record signatures. Thus the measure of randomness can be determined without performing a separate detailed analysis of the article surface that gave rise to the signatures.

In some examples, the comparison comprises performing a sliding cross-correlation of each of the one or more further putative match candidates against the best putative match candidate to determine a best correlation location, and wherein the measure of the randomness is derived by determining the number of times that the bit value of each bit of the best putative match candidate is the same as the bit value at the same bit position in each of the one or more further putative match candidates at the best correlation location. Thus the weightings accorded to the particular bits can be derived by checking the number of times that the particular bit value is the same for a number of signatures for similar but non-identical articles.
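By way of illustration only, the per-bit counting described above might be sketched as follows. This is a minimal sketch and not the implementation of the described system: the function names, the representation of signatures as NumPy arrays of 0/1 values, the use of a circular shift for the sliding comparison, the max_shift search window and the final mapping from counts to weights are all assumptions made for this example.

```python
import numpy as np


def best_alignment(reference, candidate, max_shift):
    """Return the shift (in bits) at which candidate agrees best with reference.

    Assumption: the slide is modelled as a circular shift (np.roll) searched
    over -max_shift..+max_shift; a real reader may treat the edges differently.
    """
    best_shift, best_score = 0, -1.0
    for shift in range(-max_shift, max_shift + 1):
        # Fraction of bit positions that agree at this trial shift.
        score = np.mean(np.roll(candidate, shift) == reference)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


def agreement_counts(best_match, further_candidates, max_shift=8):
    """Count, per bit position, how many further candidates carry the same bit
    value as the best putative match once each has been aligned to it."""
    counts = np.zeros(len(best_match), dtype=int)
    for candidate in further_candidates:
        shift = best_alignment(best_match, candidate, max_shift)
        counts += np.roll(candidate, shift) == best_match
    return counts


def bit_weights(counts, n_candidates):
    """Hypothetical weighting (not taken from the source): a bit that agrees
    with every other candidate gets weight 0, a bit that agrees with about
    half of them (as a random bit would) gets weight of about 0.5."""
    return 1.0 - counts / n_candidates
```

In this sketch, bit positions whose value is shared by nearly all of the further candidates, as would be expected where printing rather than surface structure fixes the bit, receive a weight close to zero.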

In some examples, the method further comprises using the measure of randomness to determine a confidence result as to whether the best putative match signature is or is not derived from the same article as the test signature. Thus the matching test can be a confidence result showing a strength of match or non-match for the test signature.

In some examples, each signature is generated from an article by a method comprising: sequentially directing a coherent beam onto each of a plurality of different regions of the article; collecting a set comprising groups of data points from signals obtained when the coherent beam scatters from the different regions of the article, wherein different ones of the groups of data points relate to scatter from the respective different regions of the article; and determining a signature for the article from the set of groups of data points. Thus the signatures are derived from an article surface structure allowing similar but non-identical articles to be individually identified.

In some examples, the determining comprises capping the magnitude of large magnitude intensity signal transitions; and using the capped magnitude data to determine the signature. Thereby, an effect of large magnitude transitions in masking the data describing the article surface structure can be reduced or eliminated.

In some examples, the capping comprises identifying large magnitude transitions and limiting the magnitude of the transition. Thereby a large magnitude transition can be individually identified and capped.

In some examples, the capping comprises: differentiating the intensity data; selecting a differential value at a low percentile; scaling the selected value to determine a threshold; setting all differentials with a value greater than the threshold to zero; and reintegrating the modified differentials. By performing this using a differential process and determining for the data set an appropriate threshold, the technique avoids distorting data where no large transitions occur, and successfully reduces the magnitude of high contrast transitions.
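The differentiate, threshold and reintegrate steps listed above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation of the described system: the function name, the default percentile and scale values, and the use of the absolute value of each differential (so that falling as well as rising transitions are capped) are choices made for this example only.

```python
import numpy as np


def cap_transitions(intensity, percentile=20.0, scale=5.0):
    """Cap large transitions in a 1-D intensity trace.

    percentile and scale are hypothetical defaults; the source only states
    that a low percentile is selected and then scaled to give a threshold.
    """
    intensity = np.asarray(intensity, dtype=float)
    diffs = np.diff(intensity)                 # differentiate the intensity data
    magnitude = np.abs(diffs)                  # assumption: compare magnitudes
    threshold = scale * np.percentile(magnitude, percentile)
    capped = np.where(magnitude > threshold, 0.0, diffs)  # zero large steps
    # Reintegrate the modified differentials, starting from the first sample.
    return np.concatenate(([intensity[0]], intensity[0] + np.cumsum(capped)))
```

Because the threshold is derived from the data set itself, a trace whose differentials are all of similar size is returned unchanged, while an isolated step many times larger than the typical differential is removed.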

Viewed from a further aspect, the present invention provides a method of generating a signature for an article, the method comprising: sequentially directing a coherent beam onto each of a plurality of different regions of the article; collecting a set comprising groups of data points from signals obtained when the coherent beam scatters from the different regions of the article, wherein different ones of the groups of data points relate to scatter from the respective different regions of the article; and determining a signature for the article from the set of groups of data points, the determining comprising capping the magnitude of large magnitude intensity signal transitions and using the capped magnitude data for determining the signature. Thereby, an effect of large magnitude transitions in masking the data describing the article surface structure can be reduced or eliminated.

Viewed from a further aspect, the present invention provides apparatus for comparing fuzzy data signatures operable to carry out, and/or comprising means for carrying out, any of the methods set out above.

Viewed from another aspect, the present invention provides apparatus operable to perform a comparison between fuzzy data signatures, the apparatus comprising: a cross-comparison unit operable to perform a comparison between a test signature and each of a plurality of record signatures; and a determining unit operable to determine whether the test signature matches one of the plurality of record signatures using a self-calibrating approach. Use of a self-calibrating approach allows high magnitude signal intensity transitions in the signals which were used to create the signatures to be processed so as to mitigate the processing artefacts, and the resulting loss of information from the signals, that such large transitions cause.

In some examples, the determining unit is operable to utilise a measure of the randomness of each signature bit to perform the determination. Thus, those bits which are caused to have the same bit value by printing or colouration effects rather than by inherent surface properties of the article material can be accorded less weight in determining whether a match occurs than those bits which are not, or are less, influenced by printing or colouration.

In some examples, the determining unit is operable to derive the measure of the randomness from a comparison between a best putative match candidate of the record signatures and one or more further putative match candidates of the record signatures. Thus the measure of randomness can be determined without performing a separate detailed analysis of the article surface that gave rise to the signatures.

In some examples, the determining unit is operable to carry out the comparison by performing a sliding cross-correlation of each of the one or more further putative match candidates against the best putative match candidate to determine a best correlation location, and to derive the measure of the randomness by determining the number of times that the bit value of each bit of the best putative match candidate is the same as the bit value at the same bit position in each of the one or more further putative match candidates at the best correlation location. Thus the weightings accorded to the particular bits can be derived by checking the number of times that the particular bit value is the same for a number of signatures for similar but non-identical articles.

In some examples, the determining unit is operable to further use the measure of randomness to determine a confidence result as to whether the best putative match signature is or is not derived from the same article as the test signature. Thus the matching test can be a confidence result showing a strength of match or non-match for the test signature.

In some examples, the apparatus further comprises a signature generator operable to generate the test signature from an article, the signature generator comprising: a source operable to sequentially direct a coherent beam onto each of a plurality of different regions of the article; a detector operable to collect a set comprising groups of data points from signals obtained when the coherent beam scatters from the different regions of the article, wherein different ones of the groups of data points relate to scatter from the respective different regions of the article; and a determiner operable to determine a signature for the article from the set of groups of data points. Thus the signatures are derived from an article surface structure allowing similar but non-identical articles to be individually identified.

In some examples, the determiner is operable to: cap the magnitude of large magnitude intensity signal transitions; and use the capped magnitude data to determine the signature. Thereby, an effect of large magnitude transitions in masking the data describing the article surface structure can be reduced or eliminated.

In some examples, the determiner is operable to cap the magnitude of large magnitude intensity signal transitions by identifying large magnitude transitions and limiting the magnitude of the transition. Thereby a large magnitude transition can be individually identified and capped.

In some examples, the determiner is operable to cap the magnitude of large magnitude intensity signal transitions by: differentiating the intensity data; selecting a differential value at a low percentile; scaling the selected value to determine a threshold; setting all differentials with a value greater than the threshold to zero; and reintegrating the modified differentials. By performing this using a differential process and determining for the data set an appropriate threshold, the technique avoids distorting data where no large transitions occur, and successfully reduces the magnitude of high contrast transitions.

Viewed from a further aspect, the present invention provides apparatus for generating a signature for an article, the apparatus comprising: a source operable to sequentially direct a coherent beam onto each of a plurality of different regions of the article; a detector operable to collect a set comprising groups of data points from signals obtained when the coherent beam scatters from the different regions of the article, wherein different ones of the groups of data points relate to scatter from the respective different regions of the article; and a determiner operable to determine a signature for the article from the set of groups of data points, the determiner operable to cap the magnitude of large magnitude intensity signal transitions and to use the capped magnitude data for determining the signature. Thereby, an effect of large magnitude transitions in masking the data describing the article surface structure can be reduced or eliminated.

Further objects and advantages of the invention will become apparent from the following description and the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the invention and to show how the same may be carried into effect reference is now made by way of example to the accompanying drawings in which:

FIG. 1 shows a schematic side view of a reader apparatus;

FIG. 2 shows a block schematic diagram of functional components of the reader apparatus;

FIG. 3 is a microscope image of a paper surface;

FIG. 4 shows an equivalent image for a plastic surface;

FIGS. 5a and 5b show the effect on reflection caused by non-normal incidence;

FIGS. 6 and 7 show the effect of detector numerical aperture on resistance to non-normal incidence;

FIG. 8 shows a flow diagram showing how a signature of an article can be generated from a scan;

FIGS. 9a to 9c show schematically the effect of high contrast transitions on collected data;

FIG. 10 shows schematically the effect of high contrast transitions on bit match ratios;

FIGS. 11a to 11c show schematically the mitigation of the effect of high contrast transitions on collected data by transition capping;

FIG. 12 shows a flow diagram showing how transition capping can be performed;

FIGS. 13a and 13b show the effect of transition capping on data from a surface with a large number of high magnitude transitions;

FIGS. 14a and 14b show the effect of transition capping on data from a surface without high magnitude transitions;

FIG. 15 is a flow diagram showing how a signature of an article obtained from a scan can be verified against a signature database;

FIG. 16 shows schematically how the effects of high contrast transitions on bit match ratios can be mitigated;

FIG. 17 is a flow diagram showing the overall process of how a document is scanned for verification purposes and the results presented to a user;

FIG. 18a is a flow diagram showing how the verification process of FIG. 15 can be altered to account for non-idealities in a scan;

FIG. 18b is a flow diagram showing another example of how the verification process of FIG. 15 can be altered to account for non-idealities in a scan;

FIG. 19A shows an example of cross-correlation data gathered from a scan;

FIG. 19b shows an example of cross-correlation data gathered from a scan where the scanned article is distorted; and

FIG. 19C shows an example of cross-correlation data gathered from a scan where the scanned article is scanned at non-linear speed.

While the invention is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are herein described in detail. It should be understood, however, that drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.

SPECIFIC DESCRIPTION

To provide an accurate method for uniquely identifying an article, it is possible to use a system which relies upon optical reflections from a surface of the article. An example of such a system will be described with reference to FIGS. 1 to 19.

The example system described herein is one developed and marketed by Ingenia Technologies Ltd. This system is operable to analyse the random surface patterning of a paper, cardboard, plastic or metal article, such as a sheet of paper, an identity card or passport, a security seal, a payment card etc to uniquely identify a given article. This system is described in detail in a number of published patent applications, including GB0405641.2 filed 12 Mar. 2004 (published as GB2411954 14 Sep. 2005), GB0418138.4 filed 13 Aug. 2004 (published as GB2417707 8 Mar. 2006), U.S. 60/601,464 filed 13 Aug. 2004, U.S. 60/601,463 filed 13 Aug. 2004, U.S. 60/610,075 filed 15 Sep. 2004, GB 0418178.0 filed 13 Aug. 2004 (published as GB2417074 15 Feb. 2006), U.S. 60/601,219 filed 13 Aug. 2004, GB 0418173.1 filed 13 Aug. 2004 (published as GB2417592 1 Mar. 2006), U.S. 60/601,500 filed 13 Aug. 2004, GB 0509635.9 filed 11 May 2005 (published as GB2426100 15 Nov. 2006), U.S. 60/679,892 filed 11 May 2005, GB 0515464.6 filed 27 Jul. 2005 (published as GB2428846 7 Feb. 2007), U.S. 60/702,746 filed 27 Jul. 2005, GB 0515461.2 filed 27 Jul. 2005 (published as GB2429096 14 Feb. 2007), U.S. 60/702,946 filed 27 Jul. 2005, GB 0515465.3 filed 27 Jul. 2005 (published as GB2429092 14 Feb. 2007), U.S. 60/702,897 filed 27 Jul. 2005, GB 0515463.8 filed 27 Jul. 2005 (published as GB2428948 7 Feb. 2007), U.S. 60/702,742 filed 27 Jul. 2005, GB 0515460.4 filed 27 Jul. 2005 (published as GB2429095 14 Feb. 2007), U.S. 60/702,732 filed 27 Jul. 2005, GB 0515462.0 filed 27 Jul. 2005 (published as GB2429097 14 Feb. 2007), U.S. 60/704,354 filed 27 Jul. 2005, GB 0518342.1 filed 8 Sep. 2005 (published as GB2429950 14 Mar. 2007), U.S. 60/715,044 filed 8 Sep. 2005, GB 0522037.1 filed 28 Oct. 2005 (published as GB2431759 2 May 2007), U.S. 60/731,531 filed 28 Oct. 2005, GB0526420.5 filed 23 Dec. 2005 (published as GB2433632 27 Jul. 2007), U.S. 60/753,685 filed 23 Dec. 2005, GB0526662.2 filed 23 Dec. 2005, U.S. 60/753,633 filed 23 Dec. 2005, GB0600828.8 filed 16 Jan. 2006 (published as GB2434442 25 Jul. 2007), U.S. 60/761,870 filed 25 Jan. 2006, GB0611618.0 filed 12 Jun. 2006 (published as GB2440386 30 Jan. 2008), U.S. 60/804,537 filed 12 Jun. 2006, GB0711461.4 filed 13 Jun. 2007 (published as GB2450131 17 Dec. 2008) and U.S. 60/943,801 filed 13 Jun. 2006 (all invented by Cowburn et al.), the content of each and all of which is hereby incorporated hereinto by reference.

By way of illustration, a brief description of the method of operation of the Ingenia Technologies Ltd system will now be presented.

FIG. 1 shows a schematic side view of a reader apparatus 1. The optical reader apparatus 1 is for measuring a signature from an article (not shown) arranged in a reading volume of the apparatus. The reading volume is formed by a reading aperture 10 which is a slit in a housing 12. The housing 12 contains the main optical components of the apparatus. The slit has its major extent in the x direction (see inset axes in the drawing). The principal optical components are a laser source 14 for generating a coherent laser beam 15 and a detector arrangement 16 made up of a plurality of k photodetector elements, where k=2 in this example, labelled 16a and 16b. The laser beam 15 is focused by a focussing arrangement 18 into an elongate focus extending in the y direction (perpendicular to the plane of the drawing) and lying in the plane of the reading aperture. In one example reader, the elongate focus has a major axis dimension of about 5 mm and a minor axis dimension of about 40 micrometres. These optical components are contained in a subassembly 20. In the illustrated example, the detector elements 16a, 16b are distributed either side of the beam axis offset at different angles from the beam axis to collect light scattered in reflection from an article present in the reading volume. In one example, the offset angles are ±45 degrees, in another example the angles are −30 and +50 degrees. The angles either side of the beam axis can be chosen so as not to be equal so that the data points they collect are as independent as possible. However, in practice, it has been determined that this is not essential to the operation and having detectors at equal angles either side of the incident beam is a perfectly workable arrangement. The detector elements are arranged in a common plane. The photodetector elements 16a and 16b detect light scattered from an article placed on the housing when the coherent beam scatters from the reading volume. 
As illustrated, the source is mounted to direct the laser beam 15 with its beam axis in the z direction, so that it will strike an article in the reading aperture at normal incidence.

Generally it is desirable that the depth of focus is large, so that any differences in the article positioning in the z direction do not result in significant changes in the size of the beam in the plane of the reading aperture. In one example, the depth of focus is approximately ±2 mm which is sufficiently large to produce good results. In other arrangements, the depth of focus may be greater or smaller. The parameters of depth of focus, numerical aperture and working distance are interdependent, resulting in a well-known trade-off between spot size and depth of focus. In some arrangements, the focus may be adjustable and in conjunction with a rangefinding means the focus may be adjusted to target an article placed within an available focus range.
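The trade-off between spot size and depth of focus can be illustrated with the textbook relation for an ideal Gaussian beam, in which the Rayleigh range is π·w0²/λ for a waist radius w0 and wavelength λ. The sketch below is illustrative only: the source does not state the beam profile or the laser wavelength, so the Gaussian model, the function names and the example values in the comments are assumptions.

```python
import math


def rayleigh_range(waist_radius_m, wavelength_m):
    """Rayleigh range of an ideal Gaussian beam: pi * w0**2 / wavelength."""
    return math.pi * waist_radius_m ** 2 / wavelength_m


def depth_of_focus(waist_radius_m, wavelength_m):
    """Full depth of focus, taken here as twice the Rayleigh range."""
    return 2.0 * rayleigh_range(waist_radius_m, wavelength_m)


# Hypothetical values: a 20 micrometre waist radius and a 635 nm wavelength.
# Neither figure is taken from the source.
example = depth_of_focus(20e-6, 635e-9)
```

Under this model, halving the waist radius reduces the depth of focus by a factor of four, which is why a tighter focus makes the reading more sensitive to the position of the article in the z direction.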

In order to enable a number of points on the target article to be read, the article and reader apparatus can be arranged so as to permit the incident beam and associated detectors to move relative to the target article. This can be arranged by moving the article, the scanner assembly or both. In some examples, the article may be held in place adjacent the reader apparatus housing and the scanner assembly may move within the reader apparatus to cause this movement. Alternatively, the article may be moved past the scanner assembly, for example in the case of a production line where an article moves past a fixed position scanner while the article travels along a conveyor. In other alternatives, both article and scanner may be kept stationary, while a directional focus means causes the coherent light beam to travel across the target. This may require the detectors to move with the light beam, or stationary detectors may be positioned so as to receive reflections from all incident positions of the light beam on the target.

FIG. 2 is a block schematic diagram of logical components of a reader apparatus as discussed above. A laser generator 14 is controlled by a control and signature generation unit 36. Optionally, a motor 22 may also be controlled by the control and signature generation unit 36. Optionally, if some form of motion detection or linearization means (shown as 19) is implemented to measure motion of the target past the reader apparatus, and/or to measure and thus account for non-linearities in their relative movement, this can be controlled using the control and signature generation unit 36.

The reflections of the laser beam from the target surface scan area are detected by the photodetector 16. As discussed above, more than one photodetector may be provided in some examples. The output from the photodetector 16 is digitised by an analog to digital converter (ADC) 31 before being passed to the control and signature generation unit 36 for processing to create a signature for a particular target surface scan area. The ADC can be part of a data capture circuit, or it can be a separate unit, or it can be integrated into a microcontroller or microprocessor of the control and signature generation unit 36.

The control and signature generation unit 36 can use the laser beam present incidence location information to determine the scan area location for each set of photodetector reflection information. Thereby a signature based on all or selected parts of the scanned part of the scan area can be created. Where less than the entire scan area is being included in the signature, the signature generation unit 36 can simply ignore any data received from other parts of the scan area when generating the signature. Alternatively, where the data from the entire scan area is used for another purpose, such as positioning or gathering of image-type data from the target, the entire data set can be used by the control and signature generation unit 36 for that additional purpose and then kept or discarded following completion of that additional purpose.

As will be appreciated, the various logical elements depicted in FIG. 2 may be physically embodied in a variety of apparatus combinations. For example, in some situations, all of the elements may be included within a scan apparatus. In other situations, the scan apparatus may include only the laser generator 14, motor 22 (if any) and photodetector 16 with all the remaining elements being located in a separate physical unit or units. Other combinations of physical distribution of the logical elements can also be used. Also, the control and signature generation unit 36 may be split into separate physical units. For example, there may be a first unit which actually controls the laser generator 14 and motor (if any), a second unit which calculates the laser beam current incidence location information, a third unit which identifies the scan data which is to be used for generating a signature, and a fourth unit which actually calculates the signature.

It will be appreciated that some or all of the processing steps carried out by the ADC 31 and/or control and signature generation unit 36 may be carried out using a dedicated processing arrangement such as an application specific integrated circuit (ASIC) or a dedicated analog processing circuit. Alternatively or in addition, some or all of the processing steps carried out by the ADC 31 and/or control and signature generation unit 36 may be carried out using a programmable processing apparatus such as a digital signal processor or multi-purpose processor such as may be used in a conventional personal computer, portable computer, handheld computer (e.g. a personal digital assistant or PDA) or a smartphone. Where a programmable processing apparatus is used, it will be understood that a software program or programs may be used to cause the programmable apparatus to carry out the desired functions. Such software programs may be embodied onto a carrier medium such as a magnetic or optical disc or onto a signal for transmission over a data communications channel.

To illustrate the surface properties which the system of these examples can read, FIGS. 3 and 4 illustrate a paper and plastic article surface respectively.

FIG. 3 is a microscope image of a paper surface with the image covering an area of approximately 0.5×0.2 mm. This figure is included to illustrate that macroscopically flat surfaces, such as from paper, are in many cases highly structured at a microscopic scale. For paper, the surface is microscopically highly structured as a result of the intermeshed network of wood or other plant-derived fibres that make up paper. The figure is also illustrative of the characteristic length scale for the wood fibres which is around 10 microns. This dimension has the correct relationship to the optical wavelength of the coherent beam to cause diffraction and also diffuse scattering which has a profile that depends upon the fibre orientation. It will thus be appreciated that if a reader is to be designed for a specific class of goods, the wavelength of the laser can be tailored to the structure feature size of the class of goods to be scanned. It is also evident from the figure that the local surface structure of each piece of paper will be unique in that it depends on how the individual wood fibres are arranged. A piece of paper is thus no different from a specially created token, such as the special resin tokens or magnetic material deposits of the prior art, in that it has structure which is unique as a result of it being made by a process governed by laws of nature. The same applies to many other types of article.

FIG. 4 shows an equivalent image for a plastic surface. This atomic force microscopy image clearly shows the uneven surface of the macroscopically smooth plastic surface. As can be surmised from the figure, this surface is smoother than the paper surface illustrated in FIG. 3, but even this level of surface undulation can be uniquely identified using the signature generation scheme of the present examples.

In other words, it is essentially pointless to go to the effort and expense of making specially prepared tokens, when unique characteristics are measurable in a straightforward manner from a wide variety of every day articles. The data collection and numerical processing of a scatter signal that takes advantage of the natural structure of an article's surface (or interior in the case of transmission) is now described.

As is shown in FIG. 1 above, focussed coherent light reflecting from a surface is collected by a number of detectors 16. The detectors receive reflected light across the area of the detector. The reflected light contains information about the surface at the position of incidence of the light. As discussed above, this information may include information about surface roughness of the surface on a microscopic level. This information is carried by the reflected light in the form of the wavelength of features in the observed pattern of reflected light. By detecting these wavelength features, a fingerprint or signature can be derived based on the surface structure of the surface. By measuring the reflections at a number of positions on the surface, the fingerprint or signature can be based on a large sample of the surface, thereby making it easier, following re-reading of the surface at a later date, to match the signature from the later reading to the signature from the initial reading.

The reflected light includes information at two main angular wavelength or angular frequency regions. The high angular frequency (short wavelength) information is that which is traditionally known as speckle. This high angular frequency component typically has an angular periodicity of the order of 0.5 degrees. There is also low angular frequency (long wavelength) information which typically has an angular periodicity of the order of 15 degrees.

As mentioned above, each photodetector collects reflected light over a solid angle which will be called θ_n. It is assumed in the present discussion that each photodetector collects light over a square or circular area. The solid angle of light collection can vary between different photodetectors 16. Each photodetector 16 measures reflected light having a minimum angle from the surface which will be called θ_r. Thus the light detected by a given photodetector 16 includes the reflected beams having an angle relative to the surface of between θ_r and θ_r+θ_n. As will be discussed in greater detail below, there can be advantages in making a system resistant to spoofing in having detector channels separated by the largest possible angle. This would lead to making the angle θ_r as small as possible.

As will be appreciated, the solid angle θ_n over which a photodetector 16 detects reflected light may also be represented as a Numerical Aperture (NA) where:

NA=sin(φ)

where φ is the half-angle of the maximum cone of light that can enter or exit the detector. Accordingly, the numerical aperture of the detectors in the present example is:

NA=sin(θ_n/2)
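As a worked illustration of this relation, the short sketch below evaluates NA=sin(θ_n/2) for a given collection angle. The function name and the choice of degrees as the input unit are assumptions made for this example, and θ_n is treated as the full plane angle of the collection cone, consistent with φ being its half-angle.

```python
import math


def numerical_aperture(collection_angle_deg):
    """Numerical aperture NA = sin(theta_n / 2), with theta_n given in degrees.

    Assumption: theta_n is the full angle of the collection cone, so the
    half-angle phi used in NA = sin(phi) is theta_n / 2.
    """
    return math.sin(math.radians(collection_angle_deg) / 2.0)
```

For instance, a detector collecting over a full angle of 60 degrees has a half-angle of 30 degrees and therefore a numerical aperture of 0.5 under this definition.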

Thus, a photodetector having a large numerical aperture will have the potential to collect a greater amount of light (i.e. more photons), but this has the effect of averaging more of the reflected information (speckle) such that the sum of all captured information speckle is weaker. However, the long angular wavelength component is less affected by the averaging than the short angular wavelength (traditional speckle) component, so this has the effect of improving the ratio of long wavelength to short wavelength reflected signal.

Although it is shown in FIG. 1 that the focussed coherent beam is normally incident on the surface, it will be appreciated that in practice it can be difficult to ensure perfectly normal incidence. This is especially true in circumstances where a low cost reader is provided, where positioning is performed by a user with little or no training or where positioning of the article is out of control of a user, such as in a commercial processing environment including, for example, conveyors transporting articles, and any circumstance where the distance from the reader to the article is such that there is no physical contact between reader and article. Thus, in reality it is very likely that the incident focussed coherent light beam will not strike the article from a perfect normal.

It has been found that altering the angle of incidence by only fractions of a degree can have a significant effect on the reflected speckle pattern from a surface. For example, FIG. 5a shows an image of a conventional speckle pattern from a piece of ordinary white paper such as might be used with a conventional printer or photocopier. FIG. 5b shows an image of the speckle pattern of that same piece of paper under identical illumination conditions with the piece of paper tilted by 0.06 degrees relative to its position in the image in FIG. 5a. It is immediately clear to any observer that the speckle pattern has changed significantly as a result of this extremely small angular perturbation in the surface. Thus, if a signature were to be generated from each of the respective data sets from these two images, a cross-correlation between those two signatures would provide a result much lower than would normally be expected from a cross-correlation between two signatures generated from scanning the same target.

It has also been found that when the angle is repeatedly increased by a small amount, with measurements taken and cross-correlations performed between each new measurement and the baseline original measurement (with zero offset angle), the cross-correlation result drops off rapidly as the offset angle starts to increase. However, as the angle increases beyond a certain point, the cross-correlation result saturates, causing a plot of cross-correlation result against offset angle to level off at an approximately constant cross-correlation value. This effect is provided by the low frequency component in the reflected light. What is happening is that the high frequency speckle component of the reflected light quickly de-couples as the perturbation in incident angle increases. However, once the angle increases by a certain amount, the effect of the traditional speckle (high frequency) component becomes less than the effect of the low frequency component. Thus, once the low frequency component becomes the most significant factor in the cross-correlation result, this component (which is much more incident angle tolerant) causes the cross-correlation result to saturate despite further increases in incident angle perturbation.

This phenomenon is illustrated in FIG. 6, where a schematic plot of cross correlation result against offset angle is shown at various different numerical aperture values for the photodetector. As can be seen from FIG. 6, at a numerical aperture of 0.015 (full cone angle of approximately 1.7 degrees) the cross correlation result drops off rapidly with increasing angle until a cross-correlation result of approximately 0.5 is reached. The cross-correlation result saturates at this value.

It has also been found that increasing the numerical aperture of the photodetector causes the low frequency component of the reflected light to take precedence over the high frequency component sooner in terms of incident angle perturbation. This occurs because over a larger solid angle (equivalent to numerical aperture) the effect of the low frequency component becomes greater relative to the high frequency “traditional speckle” component as this high frequency component is averaged out by the large “reading window”.

Thus, as shown in FIG. 6, the curves representing higher numerical aperture saturate at respectively higher cross correlation result values. At a numerical aperture of 0.05 (full cone angle of approximately 5.7 degrees), the graph saturates at a cross correlation result of approximately 0.7. At a numerical aperture of 0.1 (full cone angle of approximately 11.4 degrees), the graph saturates at a cross correlation result of approximately 0.9.

A plot of some experimental results demonstrating this phenomenon is shown in FIG. 7. These results were taken under identical illumination conditions on the same surface point of the same article, with the only alteration for each photodetector being the alteration in the angle of the incident light beam away from normal. The cross correlation result is from a cross-correlation between the collected information at each photodetector at each incident angle perturbation value and information collected with zero incident angle perturbation. As can be seen from FIG. 7, with a photodetector having a numerical aperture of 0.0185 (full cone angle of 2.1 degrees), the cross correlation result rapidly drops to 0.6 with an increase in incident angle perturbation from 0 to 0.5 degrees. However, once this level is reached, the cross correlation result stabilises in the range 0.5 to 0.6.

With a photodetector having a numerical aperture of 0.1 (full cone angle of 11.4 degrees), the cross correlation result almost instantly stabilises around a value of approximately 0.9. Thus at this numerical aperture, the effect of speckle is almost negligible as soon as any deviation from a normal incident angle occurs.

Thus, it is apparent that a reader using a photodetector according to this technique can be made extremely resistant to perturbations in the incident angle of a laser light beam between different readings from the same surface point.

FIG. 8 shows a flow diagram showing how a signature of an article can be generated from a scan.

Step S1 is a data acquisition step during which the optical intensity at each of the photodetectors is acquired at a number of locations along the entire length of scan. Simultaneously, the encoder signal is acquired as a function of time. It is noted that if the scan motor has a high degree of linearisation accuracy (e.g. as would a stepper motor), or if non-linearities in the data can be removed through block-wise analysis or template matching, then linearisation of the data may not be required. Referring to FIG. 2 above, the data is acquired by the signature generator 36 taking data from the ADC 31. The number of data points per photodetector collected in each scan is defined as N in the following. Further, the value a_k(i) is defined as the i-th stored intensity value from photodetector k, where i runs from 1 to N.

Step S2 is an optional step of applying a time-domain filter to the captured data. In the present example, this is used to selectively remove signals in the 50/60 Hz and 100/120 Hz bands such as might be expected to appear if the target is also subject to illumination from sources other than the coherent beam. These frequencies are those most commonly used for driving room lighting such as fluorescent lighting.
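One possible way of realising the optional filtering of Step S2 is a cascade of narrow notch filters, one per mains-related band. The sketch below is an illustration only: the use of second-order (biquad) notch sections, the quality factor q and the sample rate are assumptions and are not specified in the present example.

```python
import math


def notch_filter(samples, notch_hz, sample_rate_hz, q=10.0):
    """Apply one second-order notch centred on notch_hz (assumed filter design)."""
    w0 = 2.0 * math.pi * notch_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    # Coefficients normalised so that the leading denominator coefficient is 1.
    b0, b1, b2 = 1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0
    a1, a2 = -2.0 * cos_w0 / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def remove_mains_pickup(samples, sample_rate_hz,
                        bands_hz=(50.0, 60.0, 100.0, 120.0)):
    """Suppress the 50/60 Hz and 100/120 Hz bands by cascading notch filters."""
    for band in bands_hz:
        samples = notch_filter(samples, band, sample_rate_hz)
    return samples


if __name__ == "__main__":
    rate = 2000.0  # assumed sample rate in Hz
    hum = [math.sin(2.0 * math.pi * 50.0 * i / rate) for i in range(4000)]
    cleaned = remove_mains_pickup(hum, rate)
    print("residual 50 Hz amplitude:", max(abs(v) for v in cleaned[-400:]))
```

Each notch needs a settling period after the start of the data, so in practice the first samples of a scan may need to be discarded or the filter pre-charged.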

Step S3 performs alignment of the data. In some examples, this step uses numerical interpolation to locally expand and contract a_k(i) so that the encoder transitions are evenly spaced in time. This corrects for local variations in the motor speed and other non-linearities in the data. This step can be performed by the signature generator 36.
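The interpolation of Step S3 can be sketched as a resampling of the intensity data onto evenly spaced positions. The sketch assumes that a position for each sample has already been derived from the encoder signal and that the positions are strictly increasing; the function name and the choice of linear interpolation are illustrative assumptions.

```python
def resample_uniform(positions, intensities, n_out):
    """Linearly interpolate intensities onto n_out evenly spaced positions.

    positions: strictly increasing positions (assumed derived from the encoder).
    intensities: a_k(i) values captured at those positions.
    """
    if len(positions) != len(intensities) or len(positions) < 2 or n_out < 2:
        raise ValueError("need matching inputs with at least two points")
    start, stop = positions[0], positions[-1]
    step = (stop - start) / (n_out - 1)
    out = []
    j = 0  # index of the left-hand neighbour of the current output position
    for i in range(n_out):
        x = start + i * step
        while j < len(positions) - 2 and positions[j + 1] < x:
            j += 1
        x0, x1 = positions[j], positions[j + 1]
        t = (x - x0) / (x1 - x0)
        out.append(intensities[j] + t * (intensities[j + 1] - intensities[j]))
    return out


if __name__ == "__main__":
    # The motor ran slowly between positions 1.0 and 3.0, leaving a gap in the data.
    print(resample_uniform([0.0, 1.0, 3.0, 4.0], [0.0, 10.0, 30.0, 40.0], 5))
```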

In some examples, where the scan area corresponds to a predetermined pattern template, the captured data can be compared to the known template and translational and/or rotational adjustments applied to the captured data to align the data to the template. Also, stretching and contracting adjustments may be applied to the captured data to align it to the template in circumstances where passage of the scan head relative to the article differs from that from which the template was constructed. Thus if the template is constructed using a linear scan speed, the scan data can be adjusted to match the template if the scan data was conducted with non-linearities of speed present.

Step S4 applies an optional signal intensity capping to address a particular issue which occurs with articles having, for example, highly printed surfaces, including surfaces with text printing and surfaces with halftone printing. The issue is that there is a tendency for the non-match results to experience an increase in match score, thereby reducing the separation between a non-match result and a match result.

This is caused by the non-random effects of a sudden contrast change on the scanned surface in relation to the randomness of each bit of the resulting signature. In simple terms, the sudden contrast change causes a number of non-random data bits to enter the signature and these non-random bits therefore match one-another across scans of similarly printed or patterned articles. FIG. 10 illustrates this process in more detail.

FIG. 9a shows a scan area 50 on an article; the scan area has two areas 51 which have a first surface colour and an area 52 with a second surface colour. The effect of this surface colour transition is shown in FIG. 9b where the intensity of the reflected signal captured by the scan apparatus is plotted along the length of the scan area. As can be seen, the intensity follows a first level when the first surface colour is present and a second level when the second surface colour is present. At each of the first and second levels, small variations in signal intensity occur. These small variations are the information content from which the signature is derived.

The problem that the step change between the first and second levels in FIG. 9b actually causes in the resulting signature is illustrated by FIG. 9c. FIG. 9c shows the intensity data from FIG. 9b after application of an AC filter (such as the space domain band-pass filter discussed below with respect to step S5). From FIG. 9c it is clear that, even with a high order filter such as a 2nd order filter, after each sudden transition in surface pattern on the scan area a region where the small intensity variation is lost occurs. Thus, for each data bit position in the region 53, the value of the data bit that ends up in the signature will be a zero, irrespective of the small variations in intensity that actually occurred at those positions. Likewise, for each data bit position in the region 54, the value of the data bit that ends up in the signature will be a one, irrespective of the small variations in intensity that actually occurred at those positions.

As two similar articles can be expected to have nominally identical surface printing or patterning over a scan region, all signatures for such articles can be expected to have approximately the same regions of all one and/or all zero data bits within the signature at the positions corresponding to the step changes in the surface pattern/print/colour. These regions therefore cause an artificially increased comparison result value for comparisons between different articles, reducing the separation between a match result and a non-match result. This reduced separation is illustrated in FIG. 10, where it can be seen that the peak for comparisons between different scans of a single article (i.e. a match result) is centred at a bit match ratio of around 99%, whereas the peak for the second best match where a comparison is performed against scans of different articles is centred at a bit match ratio of around 85%. Under normal circumstances, where no such surface patterning effects occur, the non-match peak would be expected to be much closer to 50%.

As is noted above, a first approach to minimising the data loss caused by such transitions involves using a high order filter to minimise the recovery time and thus minimise the number of signature bits that are affected by each scan surface transition.

As will be described hereafter, a more involved approach can be taken to minimising the impact of such scan surface transitions on the bits of a signature derived from a scan of that scan surface. Specifically, a system can be implemented to detect that an intensity variation is occurring that is too large to be one of the small variations that represents the surface texture or roughness which leads to the signature. If such a transition is detected, the magnitude of the transition can be chopped or capped before the AC filter is applied to further reduce the filter recovery time. This is illustrated in FIG. 11. FIG. 11a is identical to FIG. 9a, and shows the scan region with the patterned areas. FIG. 11b shows the capped magnitude of the transitions between the patterned areas, and FIG. 11c shows that the regions 55 and 56 which result in all one and all zero data bits are much smaller relative to the corresponding regions 53 and 54 in FIG. 9c. This then reduces the number of bits in the signature which are forced to adopt a zero or one value as a direct result of a surface pattern transition without any reference to the small variations that the remainder of the signature is based upon.

One of the most straightforward ways to detect such transitions is to know when they are coming such as by having a template against which the scan data can be compared to cap the transitions automatically at certain points along the scan length. This approach has two drawbacks, that the template needs to be aligned to the scan data to allow for mispositioning of the scanner relative to the article, and that the scanner needs to know in advance what type of article is to be scanned so as to know what template to use.

Another way to detect such transitions is to use a calculation based on, for example, the standard deviation to spot large transitions. However, such an approach typically has trouble with long periods without a transition and can thus cause errors to be introduced where a scanned article doesn't have any/many transitions.

To address the defects in such approaches, the following technique can be used to enable a system which works equally well whether or not a scan area includes transitions in printing/patterning and which requires no advance knowledge of the article to be scanned. Thus, in the present example, the approach taken in step S4 is shown in FIG. 12.

Starting at step D1, the intensity values are differentiated to produce a series of differential values. Then, at step D2, the differential values are analysed by percentile to enable a value to be chosen at a low value. In the present example, the 50th percentile may be conveniently used. Other percentile values around or below the 50th may also be used.

Step D3 then creates a threshold by scaling the value at the chosen percentile by a scaling factor. The scaling factor can be derived empirically, although one scaling factor can be applicable to a wide range of surface material types. In the present examples, a scaling factor of 2.5 is used for many different surface material types including papers, cardboards, glossy papers and glossy cardboards.

Then, at step D4, all of the differential values are compared to the threshold. Any differentials with a value greater than the threshold are set to a zero value. Once the differential values have been threshold checked, the modified differentials are reintegrated at step D5.

In the present example, all of these steps are carried out after conversion of the analogue data from the photodetectors to multilevel digital values. In an example where the photodetectors output a digital intensity signal rather than an analogue signal, no digitisation would be necessary.

This system therefore spots the large transitions which are too large to be the surface texture/roughness response and caps those transitions in order to avoid the texture/roughness response data being masked by the large transition.
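Steps D1 to D5 can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function name is hypothetical, the percentile is taken over the magnitudes of the differentials, and the threshold comparison uses the magnitude so that both rising and falling transitions are capped.

```python
def cap_transitions(intensity, percentile=50.0, scale=2.5):
    """Cap large intensity transitions following steps D1 to D5."""
    if len(intensity) < 2:
        return list(intensity)
    # D1: differentiate the intensity values.
    diffs = [b - a for a, b in zip(intensity, intensity[1:])]
    # D2: pick the value at the chosen percentile (magnitudes are an assumption).
    magnitudes = sorted(abs(d) for d in diffs)
    index = int(round(percentile / 100.0 * (len(magnitudes) - 1)))
    reference = magnitudes[index]
    # D3: scale the percentile value to create the threshold.
    threshold = scale * reference
    # D4: set any differential exceeding the threshold to zero.
    capped = [0.0 if abs(d) > threshold else d for d in diffs]
    # D5: reintegrate the modified differentials.
    out = [intensity[0]]
    for d in capped:
        out.append(out[-1] + d)
    return out


if __name__ == "__main__":
    # Small texture variations around 10, then a step to around 100.
    scan = [10, 11, 10, 12, 11, 100, 101, 99, 100, 102]
    print(cap_transitions(scan))
```

Because the threshold is derived from the data itself, a scan without any large transition passes through unchanged, which is the behaviour illustrated for plain surfaces.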

The effects of step S4 on data from a highly printed surface are illustrated in FIGS. 13a and 13b. FIG. 13a shows the data immediately before carrying out step S4, for data retrieved from a surface with a series of high contrast stripes transverse to the scan direction. The same data set, after processing by step S4 is shown in FIG. 13b, where it can be seen that the amount of surface information preserved is high despite the high contrast transitions.

By way of comparison, FIGS. 14a and 14b illustrate that the system implemented in S4 does not cause problems in data without high contrast printed transitions. FIG. 14a shows the data immediately before carrying out step S4, for data retrieved from a plain surface. The same data set, after processing by step S4 is shown in FIG. 14b, where it can be seen that the amount of surface information is not reduced despite the carrying out of the process of S4.

Step S5 applies a space-domain band-pass filter to the captured data. This filter passes a range of wavelengths in the x-direction (the direction of movement of the scan head). The filter is designed to maximise decay between samples and maintain a high number of degrees of freedom within the data. With this in mind, the lower limit of the filter passband is set to have a fast decay. This is required as the absolute intensity value from the target surface is uninteresting from the point of view of signature generation, whereas the variation between areas of apparently similar intensity is of interest. However, the decay is not set to be too fast, as doing so can reduce the randomness of the signal, thereby reducing the degrees of freedom in the captured data. The upper limit can be set high; whilst there may be some high frequency noise or a requirement for some averaging (smearing) between values in the x-direction (much as was discussed above for values in the y-direction), there is typically no need for anything other than a high upper limit. In some examples a 2nd order filter can be used. In one example, where the speed of travel of the laser over the target surface is 20 mm per second, the filter may have an impulse rise distance of 100 microns and an impulse fall distance of 500 microns.

Instead of applying a simple filter, it may be desirable to weight different parts of the filter. In one example, the weighting applied is substantial, such that a triangular passband is created to introduce the equivalent of realspace functions such as differentiation. A differentiation type effect may be useful for highly structured surfaces, as it can serve to attenuate correlated contributions (e.g. from surface printing on the target) from the signal relative to uncorrelated contributions.

Step S6 is a digitisation step where the multi-level digital signal (the processed output from the ADC) is converted to a bi-state digital signal to compute a digital signature representative of the scan. The digital signature is obtained in the present example by applying the rule: a_k(i)>mean maps onto binary ‘1’ and a_k(i)<=mean maps onto binary ‘0’. The digitised data set is defined as d_k(i) where i runs from 1 to N. The signature of the article may advantageously incorporate further components in addition to the digitised signature of the intensity data just described. These further optional signature components are now described.
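The digitisation rule of Step S6 can be sketched as below, with values above the mean mapping onto binary 1 and all other values onto binary 0. The function name is hypothetical.

```python
def digitise(values):
    """Map a_k(i) > mean onto 1 and a_k(i) <= mean onto 0, giving d_k(i)."""
    mean = sum(values) / len(values)
    return [1 if v > mean else 0 for v in values]


if __name__ == "__main__":
    print(digitise([0.2, -0.1, 0.4, -0.3, 0.0, 0.1]))
```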

Step S7 is an optional step in which a smaller ‘thumbnail’ digital signature is created. In some examples, this can be a realspace thumbnail produced either by averaging together adjacent groups of m readings, or by picking every c-th data point, where c is the compression factor of the thumbnail. The latter may be preferable since averaging may disproportionately amplify noise. In other examples, the thumbnail can be based on a Fast Fourier Transform of some or all of the signature data. The same digitisation rule used in Step S6 is then applied to the reduced data set. The thumbnail digitisation is defined as t_k(i) where i runs from 1 to N/c and c is the compression factor.
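The two realspace thumbnail options of Step S7 can be sketched as follows. The function names are hypothetical, and the sketch assumes that the group size used for averaging equals the compression factor c.

```python
def pick_every(values, c):
    """Reduce the data set by keeping every c-th data point."""
    return values[::c]


def average_groups(values, m):
    """Reduce the data set by averaging adjacent groups of m readings."""
    return [sum(values[i:i + m]) / m for i in range(0, len(values) - m + 1, m)]


def realspace_thumbnail(values, c, use_average=False):
    """Create t_k(i) by reducing the data and applying the Step S6 rule."""
    reduced = average_groups(values, c) if use_average else pick_every(values, c)
    mean = sum(reduced) / len(reduced)
    return [1 if v > mean else 0 for v in reduced]


if __name__ == "__main__":
    data = [1, 9, 2, 9, 3, 9, 4, 9]
    print(realspace_thumbnail(data, 2))
    print(realspace_thumbnail(data, 2, use_average=True))
```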

Step S8 is an optional step applicable when multiple detector channels exist (i.e. where k>1). The additional component is a cross-correlation component calculated between the intensity data obtained from different ones of the photodetectors. With 2 channels there is one possible cross-correlation coefficient, with 3 channels up to 3, and with 4 channels up to 6 etc. The cross-correlation coefficients can be useful, since it has been found that they are good indicators of material type. For example, for a particular type of document, such as a passport of a given type, or laser printer paper, the cross-correlation coefficients always appear to lie in predictable ranges. A normalised cross-correlation can be calculated between a_k(i) and a_l(i), where k≠l and k, l vary across all of the photodetector channel numbers. The normalised cross-correlation function is defined as:

$Γ (k, l) = \frac{\sum_{i = 1}^{N} a_{k} (i) a_{l} (i)}{\sqrt{(\sum_{i = 1}^{N} {a_{k} (i)}^{2}) (\sum_{i = 1}^{N} {a_{l} (i)}^{2})}}$
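As an illustration, the normalised cross-correlation coefficient defined above can be computed for every pair of photodetector channels as sketched below. The function names and the dictionary-based return value are illustrative assumptions.

```python
import math
from itertools import combinations


def normalised_cross_correlation(a_k, a_l):
    """Normalised cross-correlation between two channels of intensity data."""
    numerator = sum(x * y for x, y in zip(a_k, a_l))
    denominator = math.sqrt(sum(x * x for x in a_k) * sum(y * y for y in a_l))
    return numerator / denominator


def channel_coefficients(channels):
    """Coefficient for every unordered pair (k, l) of channels, with k != l."""
    return {
        (k, l): normalised_cross_correlation(channels[k], channels[l])
        for k, l in combinations(range(len(channels)), 2)
    }


if __name__ == "__main__":
    channels = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.5], [3.0, 1.0, 2.0], [0.5, 0.1, 0.9]]
    for pair, value in channel_coefficients(channels).items():
        print(pair, round(value, 3))
```

With four channels the sketch yields six coefficients, in line with the counts given for Step S8.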

Another aspect of the cross-correlation function that can be stored for use in later verification is the width of the peak in the cross-correlation function, for example the full width half maximum (FWHM). The use of the cross-correlation coefficients in verification processing is described further below.

Step S9 is another optional step which is to compute a simple intensity average value indicative of the signal intensity distribution. This may be an overall average of each of the mean values for the different detectors or an average for each detector, such as a root mean square (rms) value of a_k(i). If the detectors are arranged in pairs either side of normal incidence as in the reader described above, an average for each pair of detectors may be used. The intensity value has been found to be a good crude filter for material type, since it is a simple indication of overall reflectivity and roughness of the sample. For example, one can use as the intensity value the unnormalised rms value after removal of the average value, i.e. the DC background. The rms value provides an indication of the reflectivity of the surface, in that the rms value is related to the surface roughness.
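The unnormalised rms value after removal of the DC background, mentioned above as one possible intensity value, can be sketched as follows. The function name is hypothetical.

```python
import math


def rms_after_dc_removal(values):
    """Root mean square of a_k(i) after subtracting the average (DC) value."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


if __name__ == "__main__":
    print(rms_after_dc_removal([10.0, 12.0, 9.0, 11.0, 8.0]))
```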

The signature data obtained from scanning an article can be compared against records held in a signature database for verification purposes and/or written to the database to add a new record of the signature to extend the existing database and/or written to the article in encoded form for later verification with or without database access.

A new database record will include the digital signature obtained in Step S6 as well as optionally its smaller thumbnail version obtained in Step S7 for each photodetector channel, the cross-correlation coefficients obtained in Step S8 and the average value(s) obtained in Step S9. Alternatively, the thumbnails may be stored on a separate database of their own optimised for rapid searching, and the rest of the data (including the thumbnails) on a main database.

FIG. 15 is a flow diagram showing how a signature of an article obtained from a scan can be verified against a signature database.

In a simple implementation, the database could simply be searched to find a match based on the full set of signature data. However, to speed up the verification process, the process of the present example uses the smaller thumbnails and pre-screening based on the computed average values and cross-correlation coefficients as now described. To provide such a rapid verification process, the verification process is carried out in two main steps, first using the thumbnails derived from the amplitude component of the Fourier transform of the scan data (and optionally also pre-screening based on the computed average values and cross-correlation coefficients) as now described, and second by comparing the scanned and stored full digital signatures with each other.

Verification Step V1 is the first step of the verification process, which is to scan an article according to the process described above, i.e. to perform Scan Steps S1 to S9. This scan obtains a signature for an article which is to be validated against one or more records of existing article signatures.

Verification Step V2 seeks a candidate match using the thumbnail (derived either from the Fourier transform amplitude component of the scan signal or as a realspace thumbnail from the scan signal), which is obtained as explained above with reference to Scan Step S7. Verification Step V2 takes each of the thumbnail entries and evaluates the number of matching bits between it and t_k(i+j), where j is a bit offset which is varied to compensate for errors in placement of the scanned area. The value of j is determined, and then the thumbnail entry, which gives the maximum number of matching bits. This is the ‘hit’ used for further processing. A variation on this would be to include the possibility of passing multiple candidate matches for full testing based on the full digital signature. The thumbnail selection can be based on any suitable criteria, such as passing up to a maximum number of, for example 10 or 100, candidate matches, each candidate match being defined as the thumbnails with greater than a certain threshold percentage of matching bits, for example 60%. In the case that there are more than the maximum number of candidate matches, only the best candidates are passed on. If no candidate match is found, the article is rejected (i.e. jump to Verification Step V6 and issue a fail result).
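The realspace thumbnail search of Verification Step V2 can be sketched as follows. The function names, the dictionary used to stand in for the thumbnail database and the offset range are hypothetical; the 60% threshold and the limit of 10 candidates are the example values given above. The sketch scores each offset as a fraction of the overlapping bits, which is an assumption made to keep scores comparable across offsets.

```python
def matching_bits(test, record, offset):
    """Count bits where test[i] equals record[i + offset] over the overlap."""
    matched = compared = 0
    for i, bit in enumerate(test):
        j = i + offset
        if 0 <= j < len(record):
            compared += 1
            matched += bit == record[j]
    return matched, compared


def best_match_fraction(test, record, max_offset):
    """Best fraction of matching bits over offsets -max_offset..max_offset."""
    best = 0.0
    for offset in range(-max_offset, max_offset + 1):
        matched, compared = matching_bits(test, record, offset)
        if compared:
            best = max(best, matched / compared)
    return best


def shortlist(test_thumbnail, database, max_offset=2, threshold=0.6,
              max_candidates=10):
    """Return identifiers of the best candidate matches, best first."""
    scored = [(best_match_fraction(test_thumbnail, thumb, max_offset), key)
              for key, thumb in database.items()]
    hits = sorted((s for s in scored if s[0] > threshold), reverse=True)
    return [key for _, key in hits[:max_candidates]]


if __name__ == "__main__":
    records = {"A": [1, 0, 1, 1, 0, 0, 1, 0], "B": [0, 1, 0, 0, 1, 1, 0, 1]}
    # A rescan of article A whose scan area was displaced by one bit.
    rescan = [0, 1, 1, 0, 0, 1, 0, 1]
    print(shortlist(rescan, records))
```

An empty shortlist corresponds to the case in which no candidate match is found and the article is rejected.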

This thumbnail based searching method employed in the present example delivers an overall improved search speed, for the following reasons. As the thumbnail is smaller than the full signature, it takes less time to search using the thumbnail than using the full signature. Where a realspace thumbnail is used, the thumbnail needs to be bit-shifted against the stored thumbnails to determine whether a “hit” has occurred, in the same way that the full signature is bit-shifted against the stored signature to determine a match. The result of the thumbnail search is a shortlist of putative matches, each of which putative matches can then be used to test the full signature against.

Where the thumbnail is based on a Fourier Transform of the signature or part thereof, further advantages may be realised as there is no need to bit-shift the thumbnails during the search. A pseudo-random bit sequence, when Fourier transformed, carries some of the information in the amplitude spectrum and some in the phase spectrum. Any bit shift only affects the phase spectrum, however, and not the amplitude spectrum. Amplitude spectra can therefore be matched without any knowledge of the bit shift. Although some information is lost in discarding the phase spectrum, enough remains in order to obtain a rough match against the database. This allows one or more putative matches to the target to be located in the database. Each of these putative matches can then be compared properly using the conventional real-space method against the new scan as with the realspace thumbnail example.
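The shift-invariance of the amplitude spectrum can be demonstrated with a short sketch. A direct discrete Fourier transform is used here for brevity instead of a Fast Fourier Transform, the function name is hypothetical, and the demonstration uses a cyclic shift, for which the invariance is exact; for a real rescan, where bits enter and leave at the ends of the scan area, the agreement would only be approximate.

```python
import cmath


def amplitude_spectrum(bits):
    """Amplitude of each discrete Fourier component of a bit sequence."""
    n = len(bits)
    return [
        abs(sum(b * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, b in enumerate(bits)))
        for k in range(n)
    ]


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0]
    shifted = bits[5:] + bits[:5]  # cyclic shift by five bits
    original = amplitude_spectrum(bits)
    moved = amplitude_spectrum(shifted)
    print(max(abs(a - b) for a, b in zip(original, moved)))
```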

Verification Step V3 is an optional pre-screening test that is performed before analysing the full digital signature stored for the record against the scanned digital signature. In this pre-screen, the rms values obtained in Scan Step S9 are compared against the corresponding stored values in the database record of the hit. The ‘hit’ is rejected from further processing if the respective average values do not agree within a predefined range. The article is then rejected as non-verified (i.e. jump to Verification Step V6 and issue fail result).

Verification Step V4 is a further optional pre-screening test that is performed before analysing the full digital signature. In this pre-screen, the cross-correlation coefficients obtained in Scan Step S8 are compared against the corresponding stored values in the database record of the hit. The ‘hit’ is rejected from further processing if the respective cross-correlation coefficients do not agree within a predefined range. The article is then rejected as non-verified (i.e. jump to Verification Step V6 and issue fail result).

Another check using the cross-correlation coefficients that could be performed in Verification Step V4 is to check the width of the peak in the cross-correlation function, where the cross-correlation function is evaluated by comparing the value stored from the original scan in Scan Step S8 above and the re-scanned value:

$Γ_{k, l} (j) = \frac{\sum_{i = 1}^{N} a_{k} (i) a_{l} (i + j)}{\sqrt{(\sum_{i = 1}^{N} {a_{k} (i)}^{2}) (\sum_{i = 1}^{N} {a_{l} (i)}^{2})}}$

If the width of the re-scanned peak is significantly higher than the width of the original scan, this may be taken as an indicator that the re-scanned article has been tampered with or is otherwise suspicious. For example, this check should beat a fraudster who attempts to fool the system by printing a bar code or other pattern with the same intensity variations that are expected by the photodetectors from the surface being scanned.
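The peak width check can be sketched as below. The cross-correlation function follows the definition above, evaluated over a limited range of offsets j; the width estimate is a crude one at whole-sample resolution, and both function names and the treatment of samples outside the overlap (ignored) are assumptions.

```python
import math


def sliding_cross_correlation(a_k, a_l, max_shift):
    """Cross-correlation function evaluated for j in -max_shift..max_shift."""
    n = len(a_k)
    denominator = math.sqrt(sum(x * x for x in a_k) * sum(y * y for y in a_l))
    result = {}
    for j in range(-max_shift, max_shift + 1):
        numerator = sum(a_k[i] * a_l[i + j] for i in range(n) if 0 <= i + j < n)
        result[j] = numerator / denominator
    return result


def full_width_half_maximum(correlation):
    """Width, in samples, of the span of offsets at or above half the peak."""
    peak = max(correlation.values())
    above = [j for j, value in correlation.items() if value >= peak / 2.0]
    return max(above) - min(above) + 1


if __name__ == "__main__":
    narrow = [0.0, 0.0, 1.0, 0.0, 0.0]
    broad = [0.0, 1.0, 1.0, 1.0, 0.0]
    print(full_width_half_maximum(sliding_cross_correlation(narrow, narrow, 2)))
    print(full_width_half_maximum(sliding_cross_correlation(broad, broad, 2)))
```

A re-scan whose width is significantly greater than the stored width would then be flagged as suspicious.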

Verification step V5 performs a test to determine whether the putative match identified as a “hit” is in fact a match. In the present example, this test is self-calibrating, such that it avoids signature loss caused by sudden transitions on the scanned surface (such as printed patterns causing step changes in reflected light). This provides simpler processing and avoids the potential for loss of a significant percentage of the data which should make up a signature due to printing or other patterns on an article surface.

As has been described above with reference to step S4 and FIGS. 9 to 14, actions can be taken at the signature generation stage to limit the impact of surface patterning/printing on authentication/validation match confidence. In the present examples, an additional approach can be taken to minimise the impact upon the match result of any data bits within the signature which have been set by a transition effect rather than by the roughness/texture response of the article surface. This can be carried out whether or not the transition capping approach described above with reference to FIGS. 9 to 14 is performed.

Thus, in step V5, after the shortlist of hits has been compiled using the thumbnail search and after the optional pre-screening of V4, a number of actions are carried out.

Firstly, a full signature comparison is performed between the record signature for each of the shortlist signatures and the test signature to select the signature with the best overall match result. This is selected as the best match signature. To aid in establishing whether the best match signature is actually a match result or is just a relatively high scoring non-match, a measure of the randomness of the bits of the signature is used to weight the cross-correlation result for the best match signature.

To establish the measure of the randomness of the bits in the signature, the best match signature is cross-correlated with the record signature for the other signatures in the shortlist identified by the thumbnails. From a sliding cross-correlation of each shortlist signature against the best match signature, a best result position can be found for each of those shortlist signatures against the best match signature. Then, the number of times that each bit value of the best match signature also occurs in the best result position of each of the shortlist signatures is measured.

This measured value is representative of the randomness of each bit within the best match signature. For example, if a given bit value is the same in approximately half of the shortlist signatures, then the bit is probably random, whereas if the given bit value is the same in approximately 90% of the shortlist signatures, then the bit is probably not random. To quantify this measure, the present examples define and use a bit utility ratio.
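The per-bit measurement can be sketched as below. The sketch assumes that each of the other shortlist signatures has already been shifted to its best result position against the best match signature, so that bits can be compared index by index; the function name is hypothetical, and the returned value for each bit is the fraction of shortlist signatures sharing that bit value (referred to as AverageBitBMR in the definition of the bit utility ratio).

```python
def average_bit_bmr(best_match, aligned_shortlist):
    """Fraction of shortlist signatures that agree with each bit of best_match.

    aligned_shortlist: the other shortlist signatures, already aligned to
    their best result position against best_match (assumed done elsewhere).
    """
    count = len(aligned_shortlist)
    return [
        sum(1 for signature in aligned_shortlist if signature[i] == bit) / count
        for i, bit in enumerate(best_match)
    ]


if __name__ == "__main__":
    best = [1, 0, 1, 1]
    others = [[1, 1, 1, 0], [1, 0, 0, 0], [1, 1, 0, 1], [1, 0, 1, 0]]
    # The first bit agrees in every signature, so it is probably not random.
    print(average_bit_bmr(best, others))
```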

$BitUtilityRatio = \begin{cases} 4 {(1 - AverageBitBMR)}^{2} & AverageBitBMR \geq 0.5 \\ 1 & AverageBitBMR < 0.5 \end{cases}$

This provides that for bits exhibiting a good level of randomness, a Bit Utility Ratio of or approaching 1 will be applied, and for bits exhibiting low level of randomness, a Bit Utility Ratio of or approaching zero will be applied. Referring again to the examples above, if a given bit value is the same in approximately half of the shortlist signatures (AverageBitBMR=0.5), then the Bit Utility Ratio=1, whereas if the given bit value is the same in approximately 90% of the shortlist signatures (AverageBitBMR=0.9), then the Bit Utility Ratio is 0.04.
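The Bit Utility Ratio, which equals 4(1 - AverageBitBMR)^2 when AverageBitBMR is at least 0.5 and 1 otherwise, can be sketched as a short function that reproduces the two worked values above. The function name is hypothetical.

```python
def bit_utility_ratio(average_bit_bmr):
    """Weight for one signature bit, given its average bit match ratio."""
    if average_bit_bmr >= 0.5:
        return 4.0 * (1.0 - average_bit_bmr) ** 2
    return 1.0


if __name__ == "__main__":
    for ratio in (0.3, 0.5, 0.7, 0.9, 1.0):
        print(ratio, bit_utility_ratio(ratio))
```

A bit that has the same value in every shortlist signature therefore receives a weight of zero and plays no part in the match decision.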

The Bit Utility Ratio calculated for each bit of the best match signature is then used to weight the cross-correlation result for the comparison between the test signature and the best match signature. Thus, instead of simply summing the comparison result for each bit comparison in the cross-correlation as would conventionally be performed, the Bit Utility Ratio for each bit is used to weight each bit result before the bit results are summed. Thus, whereas the cross-correlation sum result is defined, when no weighting is applied as:

$BMR = \frac{\sum_{i} f (i) \overline{\otimes} g (i)}{\sum_{i} 1}$

where f(i) represents the i^th value of the test signature and g(i) represents the i^th value of the record signature; the cross-correlation sum result is defined, when using the Bit Utility Ratio (BUR) as a weighting, as:

$CorrectedBMR = \frac{\sum_{i} f (i) \overline{\otimes} g (i) \cdot BUR (i)}{\sum_{i} BUR (i)}$

where BUR(i) represents the Bit Utility Ratio for the i^th bit of the record signature.
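By way of illustration, the unweighted and weighted sums can be written as follows. This is a sketch under stated assumptions: signatures are taken to be equal-length lists of 0/1 values that are already aligned, the function names are hypothetical, and raising an error when every weight is zero is a choice made for this sketch rather than something specified above.

```python
def bit_match_ratio(test, record):
    """Unweighted BMR: fraction of positions where the two bits agree."""
    agreements = sum(1 for f, g in zip(test, record) if f == g)
    return agreements / len(test)


def corrected_bit_match_ratio(test, record, bur):
    """Weighted BMR: each bit agreement counts in proportion to its BUR."""
    total_weight = sum(bur)
    if total_weight == 0:
        # Assumption for this sketch: no usable bits means no result.
        raise ValueError("all Bit Utility Ratios are zero")
    weighted = sum(w for f, g, w in zip(test, record, bur) if f == g)
    return weighted / total_weight
```

For example, with test bits [1, 0, 1, 1], record bits [1, 0, 0, 1] and weights [1.0, 1.0, 0.04, 1.0], the unweighted result is 0.75, whereas the weighted result is 3/3.04 (about 0.987), because the one disagreeing bit carries very little weight.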

This corrected Bit Match Ratio can then be used to assess whether the best match record signature is in fact taken from the same article as the test signature. FIG. 16 shows, by way of comparison with FIG. 10, that the peak for comparisons between different scans of a single article (i.e. a match result) is centred at a bit match ratio of around 97%, whereas the peak for the second best match, where a comparison is performed against scans of different articles, is now centred at a bit match ratio of around 55%. Thus the distinction between a non-match and a match is much clearer and more distinct.

As will be clear to the skilled reader, each of the two processes implemented in the present example separately provides a significant contribution to avoiding match results reaching a wrong conclusion due to printing or patterning on an article surface. Implementation of either one (or both) of these techniques can therefore enable a single authentication or verification system to work on a variety of article types without any need to know which article type is being considered or any need to pre-configure a record signature database before population.

Verification Step V6 issues a result of the verification process. In experiments carried out upon paper, it has generally been found that 75% of bits in agreement represents a good or excellent match, whereas 50% of bits in agreement represents no match.

The determination of whether a given result represents a match or a non-match is performed against a threshold or set of thresholds. The level of distinction required between a match and a non-match can be set according to a level of sensitivity to false positives and false negatives in a particular application. The threshold may relate to an absolute BMR value and/or may include a measure of the peak width for a group of non-match results from shortlisted record signatures and/or may include a measure of the separation in BMR between the best result and the second best result.
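A threshold test combining an absolute BMR value with the separation between the best and second best results might be sketched as below. The function name and the separation threshold of 0.15 are purely hypothetical; the default of 0.75 for the absolute value merely echoes the experimental figure for paper mentioned above, and a real application would choose both values according to its sensitivity to false positives and false negatives.

```python
def is_match(best_bmr, second_best_bmr, min_bmr=0.75, min_separation=0.15):
    """Declare a match only if the best BMR is high enough in absolute
    terms AND clearly separated from the second best result."""
    high_enough = best_bmr >= min_bmr
    well_separated = (best_bmr - second_best_bmr) >= min_separation
    return high_enough and well_separated
```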

By way of example, it has been experimentally found that a database comprising 1 million records, with each record containing a 128-bit thumbnail (either derived from the Fourier transform amplitude spectrum or as a realspace thumbnail), can be searched in 1.7 seconds on a standard PC computer of 2004 specification. 10 million entries can be searched in 17 seconds. More modern computers and high-end server computers can be expected to achieve speeds of 10 or more times faster than this.

Thus a method for verification of whether or not a signature generated from an article has been previously included in a database of known articles has been described.

It will be appreciated that many variations are possible. For example, instead of treating the cross-correlation coefficients as a pre-screen component, they could be treated together with the digitised intensity data as part of the main signature. For example the cross-correlation coefficients could be digitised and added to the digitised intensity data. The cross-correlation coefficients could also be digitised on their own and used to generate bit strings or the like which could then be searched in the same way as described above for the thumbnails of the digitised intensity data in order to find the hits.

Thus a number of options for comparing a test signature to record signatures to obtain a match confidence result have been described.

FIG. 17 is a flow diagram showing the overall process of how a document is scanned for verification purposes and the results presented to a user. First the document is scanned according to the scanning steps of FIG. 8. The document authenticity is then verified using the verification steps of FIG. 15. If there is no matching record in the database, a “no match” result can be displayed to a user. If there is a match, this can be displayed to the user using a suitable user interface. The user interface may be a simple yes/no indicator system such as a lamp or LED which turns on/off or from one colour to another for different results. The user interface may also take the form of a point of sale type verification report interface, such as might be used for conventional verification of a credit card. The user interface might be a detailed interface giving various details of the nature of the result, such as the degree of certainty in the result and data describing the original article or that article's owner. Such an interface might be used by a system administrator or implementer to provide feedback on the working of the system. Such an interface might be provided as part of a software package for use on a conventional computer terminal.

It will thus be appreciated that when a database match is found a user can be presented with relevant information in an intuitive and accessible form which can also allow the user to apply his or her own common sense for an additional, informal layer of verification. For example, if the article is a document, any image of the document displayed on the user interface should look like the document presented to the verifying person, and other factors will be of interest such as the confidence level and bibliographic data relating to document origin. The verifying person will be able to apply their experience to make a value judgement as to whether these various pieces of information are self consistent.

On the other hand, the output of a scan verification operation may be fed into some form of automatic control system rather than to a human operator. The automatic control system will then have the output result available for use in operations relating to the article from which the verified (or non-verified) signature was taken.

Thus there have now been described methods for scanning an article to create a signature therefrom and for comparing a resulting scan to an earlier record signature of an article to determine whether the scanned article is the same as the article from which the record signature was taken. These methods can provide a determination of whether the article matches one from which a record scan has already been made to a very high degree of accuracy.

From one point of view, there has thus now been described, in summary, a system in which a digital signature is obtained by digitising a set of data points obtained by scanning a coherent beam over a paper, cardboard, plastic, metal or other article, and measuring the scatter. A thumbnail digital signature is also determined, either in realspace by averaging or compressing the data, or by digitising an amplitude spectrum of a Fourier transform of the set of data points. A database of digital signatures and their thumbnails can thus be built up. The authenticity of an article can later be verified by re-scanning the article to determine its digital signature and thumbnail, and then searching the database for a match. Searching is done on the basis of the thumbnail to improve search speed. Use of a Fourier transform based thumbnail can improve speed, since, in a pseudo-random bit sequence, any bit shift only affects the phase spectrum, and not the amplitude spectrum, of a Fourier transform represented in polar co-ordinates. The amplitude spectrum stored in the thumbnail can therefore be matched without any knowledge of the unknown bit shift caused by registry errors between the original scan and the re-scan.
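The shift-invariance property of the amplitude spectrum can be checked with a small pure-Python discrete Fourier transform. This is a sketch only: the function names are hypothetical, and the invariance is exact for a cyclic shift of the sequence, which is an idealisation of the registry error between two real scans.

```python
import cmath


def amplitude_spectrum(bits):
    """Magnitudes of the discrete Fourier transform of a bit sequence."""
    n = len(bits)
    return [
        abs(sum(b * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, b in enumerate(bits)))
        for k in range(n)
    ]


def cyclic_shift(bits, shift):
    """Rotate the sequence to the right by `shift` positions."""
    shift %= len(bits)
    return bits[-shift:] + bits[:-shift]
```

Comparing amplitude_spectrum(bits) with amplitude_spectrum(cyclic_shift(bits, 3)) gives the same values to within floating-point rounding, since the shift alters only the phase of each Fourier component.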

In some examples, the method for extracting a signature from a scanned article can be optimised to provide reliable recognition of an article despite deformations to that article caused by, for example, stretching or shrinkage. Such stretching or shrinkage of an article may be caused by, for example, water damage to a paper or cardboard based article.

Also, an article may appear to a scanner to be stretched or shrunk if the relative speed of the article to the sensors in the scanner is non-linear. This may occur if, for example the article is being moved along a conveyor system, or if the article is being moved through a scanner by a human holding the article. An example of a likely scenario for this to occur is where a human scans, for example, a bank card using a swipe-type scanner.

In some examples, where a scanner is based upon a scan head which moves within the scanner unit relative to an article held stationary against or in the scanner, then linearisation guidance can be provided within the scanner to address any non-linearities in the motion of the scan head. Where the article is moved by a human, these non-linearities can be greatly exaggerated.

To address recognition problems which could be caused by these non-linear effects, it is possible to adjust the analysis phase of a scan of an article. Thus a modified validation procedure will now be described with reference to FIG. 18a. The process implemented in this example uses a block-wise analysis of the data to address the non-linearities.

The process carried out in accordance with FIG. 18a can include some or all of the steps of time domain filtering, alternative or additional linearisation, transition capping, space domain filtering, smoothing and differentiating the data, and digitisation for obtaining the signature and thumbnail described with reference to FIG. 8; these steps are not shown in FIG. 18a so as not to obscure the content of that figure.

As shown in FIG. 18a, the scanning process for a validation scan using a block-wise analysis starts at step S21 by performing a scan of the article to acquire the data describing the intrinsic properties of the article. This scanned data is then divided into contiguous blocks (which can be performed before or after digitisation and any smoothing/differentiation or the like) at step S22. In one example, a scan area of 1600 mm² (e.g. 40 mm×40 mm) is divided into eight equal length blocks. Each block therefore represents a subsection of the scanned area of the scanned article.

For each of the blocks, a cross-correlation is performed against the equivalent block for each stored signature with which it is intended that the article be compared at step S23. This can be performed using a thumbnail approach with one thumbnail for each block. The results of these cross-correlation calculations are then analysed to identify the location of the cross-correlation peak. The location of the cross-correlation peak is then compared at step S24 to the expected location of the peak for the case where a perfectly linear relationship exists between the original and later scans of the article.
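Locating the cross-correlation peak for one block can be sketched as a sliding bit comparison around the expected position. The function name, the search window parameter and the use of a simple bit agreement fraction as the correlation score are assumptions made for this illustration.

```python
def peak_position(block, record, expected_start, max_shift):
    """Slide `block` over `record` within +/- max_shift of expected_start.
    Returns (start, score) for the position with the highest fraction
    of agreeing bits; the first such position wins any tie."""
    best_start, best_score = None, -1.0
    for shift in range(-max_shift, max_shift + 1):
        start = expected_start + shift
        if start < 0 or start + len(block) > len(record):
            continue  # window would fall outside the record data
        window = record[start:start + len(block)]
        score = sum(1 for a, b in zip(block, window) if a == b) / len(block)
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score
```

The difference between the returned start and expected_start is the displacement of the peak from its expected location, which is the quantity examined at step S24.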

As this block-matching technique is a relatively computationally intensive process, in some examples its use may be restricted to use in combination with a thumbnail search such that the block-wise analysis is only applied to a shortlist of potential signature matches identified by the thumbnail search.

This relationship can be represented graphically as shown in FIGS. 19A, 19B and 19C. In the example of FIG. 19A, the cross-correlation peaks are exactly where expected, such that the motion of the scan head relative to the article has been perfectly linear and the article has not experienced stretch or shrinkage. Thus a plot of actual peak positions against expected peak positions results in a straight line which passes through the origin and has a gradient of 1.

In the example of FIG. 19B, the cross-correlation peaks are closer together than expected, such that the gradient of a line of best fit is less than 1. Thus the article has shrunk relative to its physical characteristics upon initial scanning. Also, the best fit line does not pass through the origin of the plot. Thus the article is shifted relative to the scan head compared to its position for the record scan.

In the example of FIG. 19C, the cross-correlation peaks do not form a straight line. In this example, they approximately fit to a curve representing a y² function. Thus the movement of the article relative to the scan head has slowed during the scan. Also, as the best fit curve does not cross the origin, it is clear that the article is shifted relative to its position for the record scan.

A variety of functions can be test-fitted to the plot of points of the cross-correlation peaks to find a best-fitting function. Thus curves to account for stretch, shrinkage, misalignment, acceleration, deceleration, and combinations thereof can be used. Examples of suitable functions can include straight line functions, exponential functions, trigonometric functions, x² functions and x³ functions.

Once a best-fitting function has been identified at step S25, a set of change parameters can be determined which represent how much each cross-correlation peak is shifted from its expected position at step S26. These compensation parameters can then, at step S27, be applied to the data from the scan taken at step S21 in order substantially to reverse the effects of the shrinkage, stretch, misalignment, acceleration or deceleration on the data from the scan. As will be appreciated, the better the best-fit function obtained at step S25 fits the scan data, the better the compensation effect will be.
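For the simplest case of a straight-line best fit, the fitting and compensation steps might be sketched as follows. This is an illustration only: the function names are hypothetical, an ordinary least-squares fit is assumed, and the other candidate functions (exponential, trigonometric, x², x³) are not covered.

```python
def fit_line(expected, actual):
    """Least-squares straight line: actual ~ gradient * expected + intercept."""
    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(actual) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, actual))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def peak_shifts(expected, gradient, intercept):
    """Fitted displacement of each peak from its expected position."""
    return [gradient * x + intercept - x for x in expected]


def compensate(position, gradient, intercept):
    """Map a position in the later scan back onto the record scan axis."""
    return (position - intercept) / gradient
```

With expected peak positions [0, 50, 100, 150] and actual positions [5, 52.5, 100, 147.5], the fit gives a gradient of 0.95 and an intercept of 5, corresponding to the shrunk and shifted case of FIG. 19B; applying compensate to 52.5 returns 50.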

The compensated scan data is then broken into contiguous blocks at step S28 as in step S22. The blocks are then individually cross-correlated with the respective blocks of data from the stored signature at step S29 to obtain the cross-correlation coefficients. This time the magnitudes of the cross-correlation peaks are analysed to determine the uniqueness factor at step S29. Thus it can be determined whether the scanned article is the same as the article which was scanned when the stored signature was created.

Accordingly, there has now been described an example of a method for compensating for physical deformations in a scanned article, and/or for non-linearities in the motion of the article relative to the scanner. Using this method, a scanned article can be checked against a stored signature for that article obtained from an earlier scan of the article to determine with a high level of certainty whether or not the same article is present at the later scan. Thereby an article constructed from easily distorted material can be reliably recognised. Also, a scanner where the motion of the scanner relative to the article may be non-linear can be used, thereby allowing the use of a low-cost scanner without motion control elements.

An alternative method for performing a block-wise analysis of scan data is presented in FIG. 18b.

This method starts at step S21 with performing a scan of the target surface as discussed above with reference to step S21 of FIG. 13a. Once the data has been captured, this scan data is cast onto a predetermined number of bits at step S31. This consists of an effective reduction in the number of bits of scan data to match the cast length. In the present example, the scan data is applied to the cast length by taking evenly spaced bits of the scan data in order to make up the cast data.
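Casting the scan data onto a fixed number of bits by taking evenly spaced bits can be sketched as below. The function name and the particular index rule (integer division of i times the input length by the cast length) are assumptions; other even-spacing rules would serve equally well.

```python
def cast_bits(scan_bits, cast_length):
    """Reduce scan_bits to cast_length entries by picking evenly spaced bits."""
    n = len(scan_bits)
    if cast_length > n:
        raise ValueError("cast length exceeds the number of scan bits")
    return [scan_bits[(i * n) // cast_length] for i in range(cast_length)]
```

For instance, casting twelve values onto four positions selects indices 0, 3, 6 and 9.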

Next, at step S33, a check is performed to ensure that there is a sufficiently high level of correlation between adjacent bits of the cast data. In practice, it has been found that correlation of around 50% between neighbouring bits is sufficient. If the bits are found not to meet the threshold, then the filter which casts the scan data is adjusted to give a different combination of bits in the cast data.

Once it has been determined that the correlation between neighbouring bits of the cast data is sufficiently high, the cast data is compared to the stored record signature at step S35. This is done by taking each predetermined block of the record signature and comparing it to the cast data. In the present example, the comparison is made between the cast data and an equivalent reduced data set for the record signature. Each block of the record signature is tested against every bit position offset of the cast data, and the position of best match for that block is the bit offset position which returns the highest cross-correlation value.

Once every block of the record signature has been compared to the cast data, a match result (bit match ratio) can be produced for that record signature as the sum of the highest cross-correlation values for each of the blocks. Further candidate record signatures can be compared to the cast data if necessary (depending in some examples upon whether the test is a 1:1 test or a 1:many test).
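The block-by-block search of step S35 and the summation into a single result can be sketched as follows. The names are hypothetical, the cross-correlation value is taken to be a count of agreeing bits, and dividing the summed counts by the total number of bits (so that the result lies between 0 and 1) is an assumption of this sketch.

```python
def best_offset(block, cast):
    """Test `block` at every bit offset of `cast`.
    Returns (offset, agreeing_bits) for the best position; first wins ties."""
    best = (0, -1)
    for offset in range(len(cast) - len(block) + 1):
        window = cast[offset:offset + len(block)]
        agreeing = sum(1 for a, b in zip(block, window) if a == b)
        if agreeing > best[1]:
            best = (offset, agreeing)
    return best


def blockwise_bmr(record_blocks, cast):
    """Sum the best per-block results and normalise by the total bit count."""
    results = [best_offset(block, cast) for block in record_blocks]
    total_bits = sum(len(block) for block in record_blocks)
    bmr = sum(agreeing for _, agreeing in results) / total_bits
    offsets = [offset for offset, _ in results]
    return bmr, offsets
```

The returned offsets are the per-block peak positions, which can be retained for any later step that needs to shift the blocks of the record signature against the scan data.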

After the comparison step is completed, optional matching rules can be applied at step S37. These may include forcing the various blocks of the record signature to be in the correct order when producing the bit match ratio for a given record signature. For example if the record signature is divided into five blocks (block 1, block 2, block 3, block 4 and block 5), but the best cross-correlation values for the blocks, when tested against the cast data returned a different order of blocks (e.g. block 2, block 3, block 4, block 1, block 5) this result could be rejected and a new total calculated using the best cross-correlation results that keep the blocks in the correct order. This step is optional as, in experimental tests carried out, it has been seen that this type of rule makes little if any difference to the end results. This is believed to be due to the surface identification property operating over the length of the shorter blocks such that, statistically, the possibility of a wrong-order match occurring to create a false positive is extremely low.

Finally, at step S39, using the bit match ratio, the uniqueness can be determined by comparing the whole of the scan data to the whole of the record signature, including shifting the blocks of the record signature against the scan data based on the position of the cross-correlation peaks determined in step S35. This time the magnitudes of the cross-correlation peaks are analysed to determine the uniqueness factor at step S39. Thus it can be determined whether the scanned article is the same as the article which was scanned when the stored record signature was created.

The block size used in this method can be determined in advance to provide for efficient matching and high reliability in the matching. When performing a cross-correlation between a scan data set and a record signature, there is an expectation that a match result will have a bit match ratio of around 0.9. A 1.0 match ratio is not expected due to the biometric-type nature of the property of the surface which is measured by the scan. It is also expected that a non-match will have a bit match ratio of around 0.5. The nature of the blocks as containing fewer bits than the complete signature tends to shift the likely value of the non-match result, leading to an increased chance of finding a false positive. For example, it has been found by experiment that a block length of 32 bits moves the non-match to approximately 0.75, which is too high and too close to the positive match result at about 0.9 for many applications. Using a block length of 64 bits moves the non-match result down to approximately 0.68, which again may be too high in some applications. Further increasing the block size to 96 bits shifts the non-match result down to approximately 0.6, which, for most applications, provides more than sufficient separation between the true positive and false positive outcomes. As is clear from the above, increasing the block length increases the separation between non-match and match results as the separation between the match and non-match peaks is a function of the block length. Thus it is clear that the block length can be increased for greater peak separation (and greater discrimination accuracy) at the expense of increased processing complexity caused by the greater number of bits per block. On the other hand, the block length may be made shorter, for lower processing complexity, if less separation between true positive and false positive outcomes is acceptable.

It is also possible to produce a uniqueness measure for individual subsets of the data gathered by the photodetectors and to combine those individual uniqueness values rather than combining the data and then calculating an overall uniqueness. For example, in some examples, the data is broken down into a set of blocks for processing and each block can have a BMR calculated therefor. This can be taken a step further such that a uniqueness measure is created for each block. Likewise, the data from individual photodetectors can be analysed to create a uniqueness therefor.

By taking such an approach, additional information about the overall uniqueness may become apparent. For example if the data is split into 10 blocks and three of those blocks provide a very strong uniqueness and the other seven blocks return a weaker or non-existent uniqueness, then this might provide the same overall uniqueness as if the ten blocks all have a modest uniqueness. Thus tampering of articles, article damage, sensor malfunction and a number of other conditions can be detected.

Such an approach thus involves combining the individual block and/or photodetector uniquenesses to give the overall uniqueness. This can be a straightforward combination of the values, or in some circumstances a weighting may be applied to emphasise the contribution of some values over others. To combine uniquenesses expressed in a logarithmic scale, the individual uniquenesses are summed (e.g. if three blocks each have a uniqueness of 10²⁰, the overall uniqueness would be 10⁶⁰), and the values are multiplied if a logarithmic scale is not used.
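The two combination rules can be illustrated with a short sketch. The function names are hypothetical, uniquenesses on the logarithmic scale are assumed to be base-10 exponents, and no weighting is applied.

```python
import math


def combine_log10(log_uniquenesses):
    """Uniquenesses given as base-10 exponents combine by summation."""
    return sum(log_uniquenesses)


def combine_linear(uniquenesses):
    """Uniquenesses given as plain values combine by multiplication."""
    return math.prod(uniquenesses)
```

Three blocks each with a uniqueness of 10²⁰ thus combine to an exponent of 60 on the logarithmic scale, or equivalently to 10⁶⁰ when the plain values are multiplied.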

Another characteristic of an article which can be detected using a block-wise analysis of a signature generated based upon an intrinsic property of that article is that of localised damage to the article. For example, such a technique can be used to detect modifications to an article made after an initial record scan.

For example, many documents, such as passports, ID cards and driving licences, include photographs of the bearer. If an authenticity scan of such an article includes a portion of the photograph, then any alteration made to that photograph will be detected. Taking an arbitrary example of splitting a signature into 10 blocks, three of those blocks may cover a photograph on a document and the other seven cover another part of the document, such as a background material. If the photograph is replaced, then a subsequent rescan of the document can be expected to provide a good match for the seven blocks where no modification has occurred, but the replaced photograph will provide a very poor match. By knowing that those three blocks correspond to the photograph, the fact that all three provide a very poor match can be used to automatically fail the validation of the document, regardless of the average score over the whole signature.

Also, many documents include written indications of one or more persons, for example the name of a person identified by a passport, driving licence or identity card, or the name of a bank account holder. Many documents also include a place where a written signature of a bearer or certifier is applied. Using a block-wise analysis of a signature obtained therefrom for validation can detect a modification to alter a name or other important word or number printed or written onto a document. A block which corresponds to the position of an altered printing or writing can be expected to produce a much lower quality match than blocks where no modification has taken place. Thus a modified name or written signature can be detected and the document failed in a validation test even if the overall match of the document is sufficiently high to obtain a pass result.

The area and elements selected for the scan area can depend upon a number of factors, including the element of the document which it is most likely that a fraudster would attempt to alter. For example, for any document including a photograph the most likely alteration target will usually be the photograph as this visually identifies the bearer. Thus a scan area for such a document might beneficially be selected to include a portion of the photograph. Another element which may be subjected to fraudulent modification is the bearer's signature, as it is easy for a person to pretend to have a name other than their own, but harder to copy another person's signature. Therefore for signed documents, particularly those not including a photograph, a scan area may beneficially include a portion of a signature on the document.

In the general case therefore, it can be seen that a test for authenticity of an article can comprise a test for a sufficiently high quality match between a verification signature and a record signature for the whole of the signature, and a sufficiently high match over at least selected blocks of the signatures. Thus regions important to assessing the authenticity of an article can be selected as being critical to achieving a positive authenticity result.

In some examples, blocks other than those selected as critical blocks may be allowed to present a poor match result. Thus a document may be accepted as authentic despite being torn or otherwise damaged in parts, so long as the critical blocks provide a good match and the signature as a whole provides a good match.
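A combined whole-signature and critical-block test might be sketched as below. The function name and both threshold defaults are hypothetical, and the overall score is taken as the mean of the per-block bit match ratios, which assumes blocks of equal length.

```python
def is_authentic(block_bmrs, critical_blocks,
                 overall_threshold=0.75, critical_threshold=0.75):
    """Pass only if the signature as a whole matches well AND every
    block designated as critical matches well on its own."""
    overall = sum(block_bmrs) / len(block_bmrs)
    critical_ok = all(block_bmrs[i] >= critical_threshold
                      for i in critical_blocks)
    return overall >= overall_threshold and critical_ok
```

With ten blocks of which blocks 0 to 2 cover a photograph, a document whose photograph has been replaced (those three blocks at 0.5, the rest at 0.95) fails even though its mean score is 0.815, whereas a document with a single damaged non-critical block at 0.5 still passes.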

Thus there have now been described a number of examples of a system, method and apparatus for identifying localised damage to an article, and for rejecting as inauthentic an article with localised damage or alteration in predetermined regions thereof. Damage or alteration in other regions may be ignored, thereby allowing the document to be recognised as authentic.

In some scanner apparatuses, it is also possible that it may be difficult to determine where a scanned region starts and finishes. Of the examples discussed above, this may be most problematic in a processing line type system where the scanner may “see” more than the scan area for the article. One approach to addressing this difficulty would be to define the scan area as starting at the edge of the article. As the data received at the scan head will undergo a clear step change when an article is passed through what was previously free space, the data retrieved at the scan head can be used to determine where the scan starts.

In this example, the scan head is operational prior to the application of the article to the scanner. Thus initially the scan head receives data corresponding to the unoccupied space in front of the scan head. As the article is passed in front of the scan head, the data received by the scan head immediately changes to be data describing the article. Thus the data can be monitored to determine where the article starts and all data prior to that can be discarded. The position and length of the scan area relative to the article leading edge can be determined in a number of ways. The simplest is to make the scan area the entire length of the article, such that the end can be detected by the scan head again picking up data corresponding to free space. Another method is to start and/or stop the recorded data a predetermined number of scan readings from the leading edge. Assuming that the article always moves past the scan head at approximately the same speed, this would result in a consistent scan area. Another alternative is to use actual marks on the article to start and stop the scan region, although this may require more work, in terms of data processing, to determine which captured data corresponds to the scan area and which data can be discarded.
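Detecting the leading edge and taking a fixed number of readings from it can be sketched as follows. The function name and parameters are hypothetical, and the sketch assumes that the article produces readings above some free-space level; the actual form of the step change depends on the scanner.

```python
def extract_scan_area(readings, free_space_level, start_offset, length):
    """Find the first reading above the free-space level (the leading
    edge), then return `length` readings starting `start_offset`
    readings after that edge. Returns None if that is not possible."""
    edge = next((i for i, r in enumerate(readings) if r > free_space_level),
                None)
    if edge is None:
        return None  # no article was seen
    begin = edge + start_offset
    end = begin + length
    if end > len(readings):
        return None  # not enough data captured after the edge
    return readings[begin:end]
```

All readings before the detected edge are simply never included in the returned slice, which corresponds to discarding the free-space data.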

In some examples, a drive motor of the processing line may be fitted with a rotary encoder to provide the speed of the article. Alternatively, a linear encoder of some form may be used with respect to the moving surface of the line. This can be used to determine a start and stop position of the scan relative to a detected leading edge of the article. This can also be used to provide speed information for linearisation of the data, as discussed above with reference to FIG. 8. The speed can be determined from the encoder periodically, such that the speed is checked once per day, once per hour, once per half hour etc.

In some examples the speed of the processing line can be determined from analysing the data output from the sensors. By knowing in advance the size of the article and by measuring the time which that article takes to pass the scanner, the average speed can be determined. This calculated speed can be used to both locate a scan area relative to the leading edge and to linearise the data, as discussed above with reference to FIG. 8.
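The average speed calculation amounts to dividing the known article length by the measured transit time. A minimal sketch, with a hypothetical function name and assuming that the leading and trailing edges have been located as sample indices at a known sample rate:

```python
def average_speed_mm_per_s(article_length_mm, leading_index,
                           trailing_index, sample_rate_hz):
    """Average speed from the known article length and the time taken
    for the article to pass the scan head."""
    transit_time_s = (trailing_index - leading_index) / sample_rate_hz
    if transit_time_s <= 0:
        raise ValueError("trailing edge must follow the leading edge")
    return article_length_mm / transit_time_s
```

For example, an article 85.6 mm long that occupies 428 samples at 1000 samples per second has passed the scanner at approximately 200 mm per second.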

Another method for addressing this type of situation is to use a marker or texture feature on the article to indicate the start and/or end of the scan area. This could be identified, for example using the pattern matching technique described above.

Thus there have now been described a number of techniques for scanning an item to gather data based on an intrinsic property of the article, compensating if necessary for damage to the article or non-linearities in the scanning process, and comparing the article to a stored signature based upon a previous scan of an article to determine whether the same article is present for both scans.

A further optional arrangement for the signature generation will now be described. The technique of this example uses a differential approach to extraction of the reflected signals from the photodetectors 16 (as illustrated in FIG. 1). In this approach, the photodetectors are handled in pairs. Thus if more than two photodetectors are used, some may be included in pairs for a differential approach and some may be considered individually or in a summing sense. The remainder of this example will refer to a situation where two photodetectors 16a and 16b are employed.

In the present example, the output from each photodetector 16 is fed to a separate ADC 31. The outputs of these two ADCs are then differenced (for example whereby the digitised signal from the second photodetector is subtracted from the digitised signal from the first photodetector) to provide the data set that is used for signature generation.

This technique is particularly applicable to situations where the outputs from the two photodetectors are substantially anticorrelated as the differencing then has the effect of up to doubling the signal strength. Examples of situations where a high level of anticorrelation occurs are surfaces with high levels of halftone printing.
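The differencing step, and its effect on anticorrelated inputs, can be illustrated with a short sketch. The function name is hypothetical and the sample values are invented purely for illustration.

```python
def differential_signal(adc_first, adc_second):
    """Subtract the second photodetector's digitised samples from the
    first's, sample by sample."""
    return [a - b for a, b in zip(adc_first, adc_second)]
```

If the first detector yields [3, -2, 1] and the second yields the perfectly anticorrelated [-3, 2, -1], the difference is [6, -4, 2], i.e. twice the amplitude of either input, whereas two identical inputs would cancel to zero.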

Thus an example of a system for obtaining and using a biometric-type signature from an article has been described. Alternative scanner arrangements, and various applications and uses for such a system are set out in the various patent applications identified above. The use of the match result testing approaches disclosed herein with any of the physical scanner arrangements and/or the applications and uses of such technology disclosed in those other patent applications is contemplated by the inventor.
