The invention relates to methods and apparatus for generating an identifier of a computer device, for example using parameters related to and/or received from a software application such as a web browser installed on the computer device.
Patent publications WO2012/122621 and WO2012/122674 describe mechanisms to construct a unique identifier from a fixed number of parameters which may change over a period of time, for use in computing environments. The identifier may be constructed using identifiers of assets such as a motherboard, BIOS, MAC address and hard disk, some of which may change from time to time. Such changes in the parameters can be countered using error correction capabilities, so that a change in a small fraction of the contributing parameters leaves the calculated identifier unchanged. These error correction capabilities can beneficially be added to the process of calculating the identifier without revealing the original or ‘correct’ values of the parameters which have subsequently changed.
WO2012/122674 also describes a variant in which two or more asset parameters are combined using a pre-processing operation to produce an output that is then processed as if a single asset parameter in the process of
The schemes described in WO2012/122674 are examples of more general technologies in which a fixed number of parameters are converted into an identifying message X, wherein the conversion to X is robust to limited changes in the parameters. This is illustrated in
The article “How Unique Is Your Web Browser?” by Peter Eckersley of the Electronic Frontier Foundation, which was published in the Proceedings of the Privacy Enhancing Technologies Symposium 2010, describes the results of an experiment collecting detectable properties of web browsers over a large population of browsers. It shows that there are an extremely large number of browser properties that can be used to identify a particular computer, smart phone, tablet or even an end user. Similar browser properties are reported elsewhere, and the HTML5 W3C specification is expected to feature additional APIs that may expose further client-specific browser properties. Typically, a “fingerprint” JavaScript on a web page may be used to cause a web browser to collect browser-specific parameters. This is described in the above Peter Eckersley publication but also in US2011/099480 in which a web server uses collected browser parameters to identify a computer.
Collected browser parameters can be used as a fingerprint in a variety of fraud prevention applications, for example as discussed in US2011/099480. However, storing web browser parameters for the purposes of future identification of a computer device may be undesirable because of the storage requirements and privacy concerns. At the same time, prior art techniques that robustly derive a compact identifier from a set of parameters are not generally suited to the processing of web browser parameters. The large number of different possible web browser parameters, the small fraction of those parameters actually present in any particular web browser, and the typically frequent changes in the presence and values of these parameters over time are problematic for robust identity determination schemes such as those mentioned above. Similar issues arise in respect of other types of software application installed on a computer device, and indeed in respect of a computer device itself.
The invention addresses these and other problems and limitations of the related prior art.
The invention can be used to convert a sparse and dynamically changing parameter set into a fixed number of parameters that can be input to a robust identity determination module to generate an identifier from the parameter set. In particular, the invention can be used to collect parameters related to an installed web browser or other software application, or computer device, and to process the collected parameters to generate an identifier of the software application or computer device which is more robust to changes in the collected parameters, for example by remaining constant under typical limited changes to the parameters.
One application of the invention is to link a web app to a specific web browser instance. As each installed instance of a web browser is usually unique or nearly so, the invention can be used to achieve such a link. The invention also improves protection of information such as the browser parameters, which there may be an interest in keeping confidential, including by providing an identifier from which it is very difficult to retrieve information about the collected browser parameters from which it is generated.
Accordingly the invention provides a method of generating an identifier of a computer device, for example of an instance of a piece of software, for example a piece of software such as a browser or web browser which is installed on the computer device, comprising: collecting a plurality of parameters of the installed computer device, for example by providing to the computer device a script or other code for execution; forming a permuted extended set of parameters comprising applying a permutation to the collected parameters in combination with a plurality of dummy parameters; and determining an identifier of the computer device from the permuted extended set of parameters.
The computer device could be, for example, a smart phone, a tablet computer, a desktop or laptop computer and so forth. The step of collecting may be a step of collecting parameters related to a software application installed on the computer device, and the generated identifier of the computer device is also then an identifier of the software application, which may be a web browser.
Typically, the method is repeated a number of times using the same permutation, to determine the identifier of the computer device at each of a plurality of different times. These repeated versions of the identifier can then be compared to check for changes in the identity of the computer device, which may be indicated by a change in the identifier.
Typically, as the configuration of the computer device changes over time, the parameters which are available for collection from the computer device will change, irrespective of the values of those parameters, and values of the parameters will also change.
Preferably, the permuted extended parameter set is formed of the same number of parameters at each of the plurality of times, by varying the number of added dummy parameters to compensate for changes in the number of collected parameters. Preferably, the number of dummy parameters is at least as many as the number of collected parameters.
The collected parameters may be compressed and processed in various ways for inclusion in the permuted extended parameter set, and the collected parameters may also be reordered or conformed to a particular ordering scheme (for example alphabetical for strings) for inclusion in the permuted extended parameter set, so that the order of collected parameters in the permuted extended set is unchanged between each of the plurality of times.
The permuted extended parameter set may be transformed or cast into the form of an error correcting code, such as a Reed-Solomon code. The identifier may then be generated by decoding the error correcting code.
The invention also provides apparatus, for example: a collection function or module arranged to collect a plurality of parameters of or relating to a computer device or software application such as a web browser installed on the computer device; a mapping function or module arranged to form a permuted extended set of parameters comprising applying a permutation to the collected parameters in combination with a plurality of dummy parameters; and a determination function or module arranged to determine an identifier of the computer device or installed software application from the permuted extended set of parameters.
The collection function, mapping function and determination function may be installed together on the computer device, or may be installed in part or in whole elsewhere for example on a remote server. The collection function, mapping function and determination function may for example be implemented as a web app for execution by an installed web browser for which an identifier is generated.
The apparatus may therefore comprise a web app or other computer program comprising the above elements, the web app or other computer program being provided on one or more computer readable media, being distributed by a data network, or being provided by a web server to the computer device. A system may include the computer device and any other component or network element providing parts of the apparatus.
The apparatus may further comprise a compression function arranged such that one or more of the collected parameters in the permuted set of parameters are compressed and/or combined, for example using one or more hash functions. The apparatus may also comprise an ordering function arranged such that the order of collected parameters in the permuted extended set is ordered according to a predetermined ordering scheme which does not vary between times at which the browser identifier is re-determined.
The apparatus may also comprise a comparison function arranged to compare identifiers determined by the determination function based on parameters collected from the computer device at a plurality of different times, and to confirm therefrom that the identity of the installed computer device is unchanged between the different times. The determination function may determine the same identifier of the installed computer device even if the set of parameters making up the plurality of collected parameters changes, irrespective of the values of those parameters, or if at least one parameter value changes.
The combined number of collected parameters and dummy parameters used to form the permuted extended set is preferably the same at each of the plurality of different times, for example by extending the collected (and optionally compressed and ordered) parameters by a variable number of dummy parameters.
Embodiments of the invention may be used in node-locking or anchoring to bind a software license to a particular end user so as to ensure that the software is only used by an authorised and paid customer. In particular, the invention can be used for node-locking or anchoring software, such as web applications, to a particular browser.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings of which:
Referring now to
The functional elements include a collection function 72 which is arranged to collect from the web browser at least some of the available parameters of the web browser. The collected parameters are shown as data structure 74. The collection of browser parameters can conveniently be done using JavaScript code 76 provided to the browser by the collection function 72 as part of a web page, assuming that the browser includes a JavaScript engine for the processing of such scripts and a suitable API to obtain various browser specific parameters. Other ways of collecting browser parameters will be apparent to the skilled person.
In many browsers the collection of some browser parameters can be carried out using JavaScript code along the following lines:
The above script uses the standard JavaScript API “navigator.plugins” to obtain a reference to a data structure with details about the currently installed browser plug-in modules. The remaining code converts that into an identifying string for each plug-in. There are thousands of browser plug-in modules, but a single instance of an installed web browser 50 will not usually have more than about 30 different plug-in modules installed.
Similar scripts can be used to collect other browser parameters using available JavaScript APIs, possibly in combination with CSS constructs. With these additional sources, the range of potential parameters increases dramatically.
Note that in any particular installed web browser 50 only a small subset of potential browser parameters will be present, and that the particular combination of parameters present will typically vary widely even between the same browser type (for example Apple Safari, Google Chrome) on comparable platforms (for example Apple iPhone, Microsoft Windows 7 PC), with extensive further variation being found in the actual values of the parameters. The parameters collected at any particular time by the collection function 72 will therefore be a sparse subset of the potential parameters which might in general be collected from the installed web browser, and both the parameters which are available from the web browser 50 and their values will vary over time, for example as plug-in modules are updated, added and deleted, as the font set changes, or as the resolution of the graphical display is changed.
The functional elements also include a mapping function 80 which receives the collected parameters 74 from the collection function 72, and processes them to generate a permuted extended parameter set 90. The mapping function 80 may include a number of different functions, which may operate in various different orders or simultaneously on the collected parameters 74. One such function is a compression function 82, which is arranged to compress some or all of the parameters collected from the web browser, for example using hashing functions, an XOR operation on the characters in a parameter string, and/or other suitable data reduction processes, which may typically vary depending on the nature of the parameter being processed or compressed. Such compression preferably aims to preserve the entropy found in the potential range of values of a particular collected parameter. The compression function may also combine various collected parameters or parts of collected parameters received from the web browser 50 to form other, composite versions of the collected parameters.
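By way of illustration only, a compression function of the kind described above might be sketched as follows. The 8-bit symbol width, the choice of SHA-256, and the names `compress_hash`, `compress_xor` and `combine` are assumptions made for this sketch and are not taken from the described embodiment.

```python
import hashlib
from functools import reduce
from typing import Sequence

SYMBOL_BITS = 8  # assumed symbol width; the description does not fix one


def compress_hash(param: str, bits: int = SYMBOL_BITS) -> int:
    """Hash a parameter string and keep the low `bits` bits of the digest."""
    digest = hashlib.sha256(param.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


def compress_xor(param: str) -> int:
    """XOR the UTF-8 bytes of a parameter string into a single 8-bit value."""
    return reduce(lambda acc, byte: acc ^ byte, param.encode("utf-8"), 0)


def combine(params: Sequence[str], bits: int = SYMBOL_BITS) -> int:
    """Form one composite parameter by hashing several strings together."""
    # The unit separator keeps ("a", "bc") and ("ab", "c") distinct before hashing.
    return compress_hash("\x1f".join(params), bits)
```

A hash tends to spread the possible values of a parameter evenly over the symbol range, whereas a byte-wise XOR is cheaper but ignores character order, which is one reason the reduction might be chosen per parameter.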
The collected parameters 74 may not always be collected in the same order from one collection action of the collection function 72 to another, for example because of the way in which the web browser responds to requests from the collection function 72, and this is particularly likely to be the case when a parameter has been added or removed from the browser parameters 51. The mapping function 80 may therefore also sort the collected parameters (in compressed form if required) using a sorting scheme 84, to ensure consistency in ordering of the collected parameters between repeated operations of the collection and mapping functions. An example sorting scheme 84 could be an alphabetic sort on a list of string parameters.
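A minimal sketch of such a sorting scheme is given below, by way of example only. The function name `canonical_order` and the plug-in strings are hypothetical and serve only to show that two collections returned in different orders yield the same sequence.

```python
from typing import Iterable, List


def canonical_order(collected: Iterable[str]) -> List[str]:
    """Sort parameter strings by Unicode code point.

    Code point order is used here (rather than a locale-aware collation)
    so that the result does not depend on the locale of the device.
    """
    return sorted(collected)


# Hypothetical parameter strings, returned in two different orders.
first = canonical_order(["Flash 11.2", "Acrobat 10.1", "QuickTime 7.7"])
second = canonical_order(["QuickTime 7.7", "Flash 11.2", "Acrobat 10.1"])
```

The same call works on compressed integer values if the sort is applied after compression.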
The mapping function 80 generates the permuted extended set of parameters 90 by applying a permutation 86 to the collected parameters (in sorted and/or compressed forms as appropriate) in combination with a plurality of dummy parameters (denoted in the illustrated permuted extended set of parameters as “D”). The number of parameters in the combined set of collected parameters and dummy parameters to which the permutation is applied will typically be much lower than the potential number of different parameters which could be collected from the web browser, this potential number being closely related to the entropy of the collected parameters across a large population of web browsers. The Peter Eckersley paper referenced above reports typical entropy of collectable browser parameters of at least 18 bits. As most browser parameters have a fairly limited number of different values (say 8 bits of entropy), this suggests that Eckersley found around 2^10 different parameters to be collectable in practice over the population of browsers in his experiments. A typical installed web browser might contain a parameter set with approximately 50 different collectable parameters.
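As a worked check only, and taking the 18-bit and 8-bit figures quoted above at face value, the estimate of the number of distinct collectable parameters follows from subtracting the per-parameter entropy from the total entropy:

```latex
\[
  18\ \text{bits} - 8\ \text{bits} = 10\ \text{bits}
  \qquad\Longrightarrow\qquad
  2^{10} = 1024\ \text{distinct collectable parameters}
\]
```

On this estimate, the approximately 50 parameters of a typical installed web browser amount to about 5% of the potential parameters, which is the sparsity the mapping function has to accommodate.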
The total number of parameters in the combined set of collected parameters and dummy parameters to which the permutation is applied may be predetermined and used by the mapping function consistently between operations on different sets of collected parameters. For example, the total number of parameters to be permuted could be set at around two or three times the typical number of collected parameters, for example, such that the number of dummy parameters is always at least the same as the number of collected parameters.
The dummy parameters D may be allocated default values, for example all being allocated the same default value, for example a zero integer value, or different values such as random values.
The process of permutation of the extended parameter set, including the dummy parameters, may be carried out in various ways, before, after or in combination with the other processes carried out by the mapping function. The permutation 86 may be defined, for example, by a random permutation table or other structure which defines a reordering of the collected parameters in combination with the dummy parameters, in which the dummy parameters will typically be interspersed among the collected parameters (and vice versa). The permutation 86 is maintained without change by the mapping function 80 for operation on multiple different sets of collected parameters over a period of time so that the permuted extended parameter sets 90, 90′, 90″ generated from corresponding sets of collected parameters 74, 74′, 74″ can be used to generate multiple versions of the identifier 60, 60′, 60″ of the browser.
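The formation of the permuted extended parameter set may be sketched, by way of example only, as follows. The total of 12 slots, the zero dummy value, the seeded generator used to build the permutation table, and the names `make_permutation` and `permuted_extended_set` are all assumptions of this sketch. In practice the permutation would be generated from a secure source and stored in obfuscated form, as discussed below.

```python
import random
from typing import List, Sequence

TOTAL_SLOTS = 12  # assumed fixed size of the extended set
DUMMY_VALUE = 0   # assumed default value for dummy parameters


def make_permutation(total: int, seed: int) -> List[int]:
    """Build a permutation table once; it is then reused without change."""
    table = list(range(total))
    random.Random(seed).shuffle(table)  # illustrative only, not a secure source
    return table


def permuted_extended_set(collected: Sequence[int],
                          permutation: Sequence[int],
                          dummy: int = DUMMY_VALUE) -> List[int]:
    """Pad the collected parameters with dummies, then reorder them."""
    total = len(permutation)
    if len(collected) > total:
        raise ValueError("more collected parameters than available slots")
    # A variable number of dummies keeps the total size constant.
    extended = list(collected) + [dummy] * (total - len(collected))
    permuted = [dummy] * total
    for index, value in enumerate(extended):
        permuted[permutation[index]] = value  # element `index` goes to its slot
    return permuted


permutation = make_permutation(TOTAL_SLOTS, seed=7)
at_time_1 = permuted_extended_set([5, 9, 17], permutation)
at_time_2 = permuted_extended_set([5, 9, 17, 33], permutation)
```

Both results have the same length, and because the same permutation is reused, appending one parameter alters exactly one slot of the permuted set.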
The permutation 86 could be generated locally in the web app 70 or otherwise at the device 52, or could be communicated to the device from a remote server. The permutation is preferably stored in an obfuscated form. Without knowledge of the permutation 86 it is hard for an attacker to derive information about the original parameters 51 or collected parameters 74 from the permuted extended parameter set 90, which helps preserve confidentiality.
The permuted extended parameter set 90 is passed to a determination function 100 which is arranged to determine an identifier 60 of the web browser 50 from the permuted extended parameter set. The collection function, mapping function and determination function may repeat their operations at multiple different times to determine the identifier 60, 60′, 60″ at those times. In
In many applications, the generated identifier 60, 60′, 60″ will typically not be stored for extended periods at the computer device 52 itself, to reduce the risk of compromise or attack.
To generate identical identifiers at different times, using collected parameters which are expected to change in both presence within the collected parameters 74 and in value between those times, the determination function 100 preferably implements a robust identity determination based on the permuted extended parameter set 90. Some suitable robust identity determination schemes are taught in WO2012/122621 and WO2012/122674, and can be applied using the permuted extended parameter set 90. The permuted extended parameter set is well suited as input to such schemes and algorithms because it has a fixed number of elements, unlike the parameters collected from the web browser by the collection function 72 which will vary in the number of parameters from time to time. The use of the permuted extended parameter set therefore reduces the propagation of changes in the collected parameters to the identifier 60, allowing the use of a simpler error correction scheme in the determination function 100. The propagation of changes is reduced because replacing or adding an element to the collected parameters does not shift all parameters, but only a subset, and these changes are distributed over the entire permuted extended parameter set.
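The limited propagation of changes can be illustrated with the following sketch, which counts the slots of the permuted extended set that differ after one parameter is added. The 16-slot size, the seed, the sample values and the names `extend_and_permute` and `changed_slots` are hypothetical and chosen only for this illustration.

```python
import random
from typing import List, Sequence


def extend_and_permute(collected: Sequence[int],
                       permutation: Sequence[int],
                       dummy: int = 0) -> List[int]:
    """Sort, pad with dummies to a fixed size, then apply the permutation."""
    padded = sorted(collected) + [dummy] * (len(permutation) - len(collected))
    permuted = [dummy] * len(permutation)
    for index, value in enumerate(padded):
        permuted[permutation[index]] = value
    return permuted


def changed_slots(old: Sequence[int], new: Sequence[int],
                  permutation: Sequence[int]) -> List[int]:
    """Return the slot numbers whose contents differ between two collections."""
    a = extend_and_permute(old, permutation)
    b = extend_and_permute(new, permutation)
    return [slot for slot, (x, y) in enumerate(zip(a, b)) if x != y]


permutation = list(range(16))
random.Random(2013).shuffle(permutation)  # illustrative permutation table

before = [12, 40, 77, 90, 150, 201]
after = [12, 40, 77, 85, 90, 150, 201]    # one parameter (85) has been added
slots = changed_slots(before, after, permutation)
```

Here only four of the sixteen slots change: the inserted value and the three sorted values that follow it, each shifted by one position. The three values that sort before the insertion and the trailing dummies are unaffected, and the permutation scatters the four changed slots across the set instead of leaving them adjacent.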
The teaching of WO2012/122621 can be applied by generating a share corresponding to each parameter of the permuted extended parameter set, applying a secret sharing algorithm to a number of subsets of the plurality of shares to derive a plurality of candidate identifiers, the number of subsets being determined in accordance with a tolerance threshold for differences in the parameters of the permuted extended parameter set as compared to previous or original values of the permuted extended parameter set, and determining a most prevalent of the candidate identifier values as a final identifier of the web browser 50. The secret sharing algorithm could be a (M−k,N)-secret sharing algorithm, where N is the number of the plurality of shares, M<N, and k is a predetermined constant. Other details are provided in WO2012/122621 which is hereby incorporated by reference for this and all other purposes.
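The general idea of deriving candidate identifiers from subsets of shares and selecting the most prevalent can be sketched as follows. This is not the construction of WO2012/122621, whose details are not reproduced here: the sketch assumes a Shamir-style threshold scheme over a prime field, and the helper-data offsets, the modulus `P`, the cap on the number of subsets and all function names are assumptions made purely for illustration.

```python
import hashlib
import random
from collections import Counter
from itertools import combinations
from typing import List, Sequence, Tuple

P = 2**31 - 1  # prime modulus assumed for this sketch


def slot_hash(slot: int, value: int) -> int:
    """Map a (slot, value) pair to a field element."""
    digest = hashlib.sha256(f"{slot}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P


def enroll(params: Sequence[int], threshold: int, secret: int,
           rng: random.Random) -> List[int]:
    """Return helper data binding each original slot value to a share of `secret`."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(threshold - 1)]

    def poly(x: int) -> int:
        return sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P

    return [(poly(i + 1) - slot_hash(i, v)) % P for i, v in enumerate(params)]


def interpolate_at_zero(points: Sequence[Tuple[int, int]]) -> int:
    """Lagrange interpolation of the polynomial value at x = 0 over GF(P)."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total


def identify(params: Sequence[int], helper: Sequence[int],
             threshold: int, max_subsets: int = 200) -> int:
    """Rebuild shares from current values and vote over subsets of shares."""
    shares = [(i + 1, (helper[i] + slot_hash(i, v)) % P)
              for i, v in enumerate(params)]
    votes: Counter = Counter()
    for count, subset in enumerate(combinations(shares, threshold)):
        if count >= max_subsets:  # the cap sets how many subsets are examined
            break
        votes[interpolate_at_zero(subset)] += 1
    return votes.most_common(1)[0][0]


original = [3, 0, 17, 0, 42, 0, 8, 99]   # hypothetical permuted extended set
helper = enroll(original, threshold=3, secret=123456, rng=random.Random(1))
drifted = list(original)
drifted[2], drifted[5] = 18, 7            # two slots have since changed
recovered = identify(drifted, helper, threshold=3)
```

In this example six of the eight slots are unchanged, so every subset drawn only from those slots yields the same candidate, while subsets containing a changed slot yield scattered values. The number of subsets examined plays the role of the tolerance setting: examining more subsets tolerates more changed slots at greater computational cost.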
The teaching of WO2012/122674 can be applied by processing a permuted extended parameter set and a fingerprint in accordance with a pre-determined function to obtain code symbols, the fingerprint being associated with the web browser and being based on an earlier permuted extended parameter set from the mapping function 80. In this way the permuted extended parameter set is transformed into an error correcting code. An error correction algorithm is then applied to the code symbols to obtain the identifier 60. The error correction algorithm could be a Reed-Solomon error correcting code or similar. Other details are provided in WO2012/122674 which is hereby incorporated by reference for this and all other purposes.
The determination function 100 may require initialisation in order to acquire suitable lookup information to transform the permuted extended parameter set into an identifier 60 which is suitably robust to changes in the collected parameters. This may involve sending an earlier generated permuted extended parameter set or set of collected parameters to a remote server which calculates suitable configuration data for use at the computer device, and in particular error correcting data to ensure that the correct identifier can be calculated. For example, suitable error correcting code may be provided by such a server, which may also be a server that provides the web application code to the computer device. Calculation of the error correcting code at the web application will frequently be undesirable because of the increased potential for attacks. To this end, an anonymised version of the collected parameters or permuted extended parameter set (for example using parameters initially collected) may be sent from the computer device to the server which then returns error correcting code capabilities in the form of configuration data. The server then also knows the value of the identifier 60 that the computer device will generate and use in subsequent internal calculations and/or communication protocols.
The flow chart of
The permuted extended parameter set E′ forms the input to the robust identity determination step 260 that has the ability to correct for changes in the collected parameters which result from changes to the web browser configuration. The above mentioned WO2012/122621 and WO2012/122674 publications describe ways to implement such a step.
Note that the order of the steps in
It will be understood that variations and modifications may be made to the described embodiments without departing from the scope of the invention as defined in the appended claims. For example, it is to be understood that any feature described in relation to any one embodiment may be used alone, or in combination with other features described in respect of that or other embodiments.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2013/073393 | 3/28/2013 | WO | 00 |