The present invention relates to an anonymization technology.
Statistical data derived from data that includes personal information, such as age, gender, or address, is published and used. A known technology anonymizes such data by abstraction at the time of publication so that an individual cannot be identified from the published data. Anonymization processes data so that it cannot be determined which individual each record (a set of personal information and the like) in a set of personal information belongs to. A well-known index of anonymization is k-anonymity. K-anonymity assures that data cannot be narrowed down to fewer than k records. Among the attributes included in personal information, an attribute (or an attribute group, which is a set of attributes) that can identify an individual through a combination of attributes is called a quasi-identifier. Basically, anonymization for securing k-anonymity generalizes the attribute values included in the quasi-identifier so that the number of records sharing the same quasi-identifier value is k or more.
For example, patent document 1 and patent document 2 disclose public information privacy preserving devices that process data in order to protect privacy in published information.
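As an illustration of the k-anonymity index described above, the following minimal sketch counts how many records share each combination of quasi-identifier values. The function name `is_k_anonymous`, the record layout, and the sample records are assumptions made for illustration and do not appear in the cited documents.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination is shared by at least k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())

# Hypothetical records whose "age" has already been generalized to decades.
records = [
    {"age": "20-29", "gender": "F", "sickness": "flu"},
    {"age": "20-29", "gender": "F", "sickness": "asthma"},
    {"age": "30-39", "gender": "M", "sickness": "flu"},
    {"age": "30-39", "gender": "M", "sickness": "diabetes"},
]

print(is_k_anonymous(records, ["age", "gender"], 2))
print(is_k_anonymous(records, ["age", "gender"], 3))
```

In this sketch each combination of "age" and "gender" is shared by two records, so the set is 2-anonymous but not 3-anonymous.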
In a privacy preserving apparatus of patent document 1, a setting means sets an order of priority (weighting) to each attribute of data by considering a required condition of a user who uses public information.
A calculating means calculates an evaluation point of each data based on the set order of priority (weighting).
A processing method selection means selects a data processing method by which a decrease of the calculated evaluation point becomes smallest, and next, selects a data processing method by which an increase of the calculated evaluation point becomes biggest.
A data processing means processes data using the selected processing methods. The data processing means processes the data from the lowest priority order (weighting) set by the setting means until k-anonymity is satisfied.
By adopting the above-mentioned configuration, the privacy preserving apparatus of patent document 1 addresses the problem that information required by a data user is lost when all data is dealt with equally.
And, in patent document 3, there is disclosed an information processing device which anonymizes information using a judgment of whether or not anonymization is done as a whole when anonymization is performed to each item of data.
And, in patent document 4, there is disclosed an information processing device which can set a secure level dynamically.
However, in the technologies described in patent documents 1-4, if the data set includes even one record of a person whose request level of anonymization is high, the information value of the whole data set drops. This is because the whole data set is abstracted in order to satisfy the k-anonymity level of the record with the highest request level.
In addition, a technology which considers the request level of each data is described in non-patent document 1. The technology described in non-patent document 1 divides a data set into groups based on the request level. Concretely, it operates as follows. First, the technology described in non-patent document 1 divides a data set into subsets of data having similar request levels k of discriminability (the number of data to which a specific user can be narrowed down within the data set, analogous to k of k-anonymity). Then, the technology described in non-patent document 1 groups the data of each subset according to a semantic similarity degree. Here, the size of each group satisfies the request level. In each group processed for each request level, when the data within a single group is widely dispersed, or when a group is close to a neighboring group, the technology described in non-patent document 1 moves data. However, the technology described in non-patent document 1 divides a data set based on the request level. Therefore, when the number of data having a similar request level is not sufficient, a group does not necessarily consist of similar data. When the technology described in non-patent document 1 is applied for the purpose of preserving statistical values, it is not a particular problem that the data in a group is not necessarily similar. However, it is difficult to apply the technology described in non-patent document 1 to anonymization that requires abstraction which is meaningful as data.
An object of the present invention is to provide an anonymization device and an anonymization method in which all data satisfies its request level of anonymization and which can prevent the decline of information value caused by abstracting a whole data set.
To achieve the above-mentioned object, an anonymization device according to the present invention includes: anonymization means for executing anonymization processing to a data set including two data or over with making each group of the data as a processing unit; anonymous level setting means for setting an adaptive anonymous level to each of the groups of the data set executed the anonymization processing; anonymity judgment means for judging whether or not the group satisfies the set adaptive anonymous level; and further the anonymization means executes anonymization processing to the data set executed the anonymization processing based on the judgment result by the anonymity judgment means.
To achieve the above-mentioned object, an anonymization method according to the present invention includes: executing anonymization processing to a data set including two data or over with making each group of the data as a processing unit; setting an adaptive anonymous level to each of the groups; judging whether or not the group satisfies the set adaptive anonymous level; and further executing anonymization processing to the data set executed anonymization processing based on the judgment result.
To achieve the above-mentioned object, a program according to the present invention causes a computer to execute: executing anonymization processing to a data set including two data or over with making each group of the data as a processing unit; setting an adaptive anonymous level to each of the groups; judging whether or not the group satisfies the set adaptive anonymous level; and further executing anonymization processing to the data set executed anonymization processing based on the judgment result.
An example of the effect of the present invention is that all data satisfies its respective request level of anonymization, and the decline of information value caused by abstracting a whole data set can be prevented.
First, to facilitate understanding of a first exemplary embodiment of the present invention, a technology related to this exemplary embodiment will be described.
To begin with, the terms used in the following description are defined.
Sensitive information is information that a person does not want others to know.
A quasi-identifier is information that can identify a user when combined with background knowledge and other information; that is, it is information that can act as an identifier. A quasi-identifier may include sensitive information.
And, as an example for explanation, it is assumed a case where a provider who discloses data to be used for analysis after applying anonymization processing to the data (hereinafter, referred to as a “disclosure provider”) holds data shown in
In this exemplary embodiment, it is supposed that information about “sickness” is sensitive information. However, in this exemplary embodiment, it is supposed that sensitive information is used for analysis of data. Accordingly, sensitive information (“sickness” in
Therefore, this exemplary embodiment abstracts at least part of the quasi-identifiers other than the sensitive information (“sickness”).
“ki” means a request level of k-anonymity. K-anonymity is an index which requires that the number of data sharing the same combination of quasi-identifiers be k or more. Data is handled in groups. Accordingly, the quasi-identifier information is abstracted so that each group satisfies the request level of k-anonymity. The symbol “i” of “ki” is a number that identifies data. For example, “i” of “ki” of the data of No. 2 is “2”. The request level of the data of No. 2 is expressed as “k2”, and its value is “3” (refer to
The technology related to this exemplary embodiment sets a request level of the highest k-anonymity among data held by the data set to whole data possessed by the disclosure provider as an “optimum k-anonymity level”. In the case of the data set shown in
The technology related to this exemplary embodiment, for example, divides the data shown in
As shown in
Here, the number of data of the group of twenties is “4”. And, the number of data of the group of thirties is “5”. Any group satisfies “4” of the optimum k-anonymity level.
The related technology of this exemplary embodiment does not divide the data any further. This is because the optimum k-anonymity level is set uniformly to the group of twenties and the group of thirties, so neither group would satisfy the optimum k-anonymity level if divided further.
However, none of the data belonging to the group of thirties requests “4”, which is the optimum k-anonymity level. In other words, the quasi-identifier information of the data belonging to the group of thirties is abstracted more than necessary.
That is, because the related technology of this exemplary embodiment executes anonymization processing in compliance with the highest request level within the data set, there is a problem that the information value of the whole data set drops.
The first exemplary embodiment of the present invention described below settles the above mentioned problem of the related technology.
First, with reference to
The anonymization unit 11 receives a set (hereinafter, referred to as a “data set”) including two data or over from an external device or system. The anonymization unit 11 may receive a data set from a storage device which is not illustrated or from a constitution unit which is not illustrated. And, as will be described in detail later, the anonymization unit 11 receives a data set from the anonymity judgment unit 13 and/or the group modification unit 14.
In addition, the anonymization device 10 of this exemplary embodiment has no limitation in particular in a technique of transmission and reception of a data set between each constitution. For example, the anonymization device 10 may store a data set in a memory unit which is not illustrated, and each constitution may read data included in the data set of the memory unit or write data in it. And, each constitution of the anonymization device 10 may transmit a data set to a next constitution directly. Further, each constitution of the anonymization device 10 may transmit partial data (for example, abstracted data, grouped data or data before executed abstraction or grouping) of a data set needed for the next constitution or later constitutions. Hereinafter, these are collectively referred to as outputting a data set or transmitting a data set, or inputting a data set or receiving a data set.
The anonymization unit 11 divides the data into groups to the received data set, and executes anonymization processing which abstracts as making a divided group be a processing unit. When receiving a data set which is already grouped, the anonymization unit 11 may divide a group included in the data set into small groups furthermore. Hereinafter, these are referred to as dividing a data set which includes dividing a group within the data set into small groups furthermore.
However, in division, anonymization processing of the anonymization unit 11 of this exemplary embodiment suppresses abstraction of data as much as possible, and processes (divides/abstracts) data so that an individual cannot be specified from disclosed data.
The anonymization processing of this exemplary embodiment is described using a top-down processing as an example. The top-down anonymization processing of this exemplary embodiment includes division processing and abstraction processing of data. In other words, in this exemplary embodiment, the anonymization unit 11 divides a data set into groups and abstracts data belonging to a group as needed. In addition, the top-down anonymization processing of the anonymization unit 11 has no limitation. This anonymization processing may be a processing which uses a classification tree or a processing which uses clustering, by focusing attention on an optional quasi-identifier, for example.
The anonymization unit 11 outputs a data set divided into groups to the anonymous level setting unit 12.
The anonymous level setting unit 12 receives the data set divided into groups from the anonymization unit 11. Based on the received data set, the anonymous level setting unit 12 sets, to each group, an “adaptive anonymous level”, which is a request level of anonymization. Here, the adaptive anonymous level may be different for each group, or may be the same for some groups. However, as will be described later, this exemplary embodiment operates recursively. In other words, setting of the adaptive anonymous level may be executed several times. Accordingly, this exemplary embodiment does not exclude a case where the anonymous level setting unit 12 sets the same adaptive anonymous level to all groups.
“Adaptive anonymous level” is a request level of anonymity which is set adaptively according to data belonging to a group. The anonymous level setting unit 12 may set the request level of data having the highest request level of anonymization within a group (for example, it corresponds to the optimum k-anonymity level mentioned above) to an adaptive anonymous level.
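The setting rule described above can be sketched as follows. This is a minimal illustration, assuming that each data item is a dictionary holding its request level under a hypothetical key `"k"`; the function name `adaptive_anonymous_level` is likewise an assumption.

```python
def adaptive_anonymous_level(group):
    """Return the highest per-data request level k_i found in the group."""
    return max(record["k"] for record in group)

# Hypothetical group of two data items with request levels 3 and 2.
group = [{"k": 3}, {"k": 2}]
print(adaptive_anonymous_level(group))
```

Because the level is derived from the members of the group, it changes whenever a group is divided, integrated, or modified.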
The anonymous level setting unit 12 outputs a set of data to which an adaptive anonymous level is set to each group to the anonymity judgment unit 13.
The anonymity judgment unit 13 receives the data set to which an adaptive anonymous level is set to each group from the anonymous level setting unit 12. The anonymity judgment unit 13 judges whether or not each group satisfies the adaptive anonymous level. When judging that each group satisfies the adaptive anonymous level, the anonymity judgment unit 13 outputs the data set of each group to the anonymization unit 11.
Hereafter, the anonymization unit 11, the anonymous level setting unit 12 and the anonymity judgment unit 13 repeat the processing recursively until the anonymity judgment unit 13 judges that at least one group does not satisfy the adaptive anonymous level.
When judging that at least one group does not satisfy the adaptive anonymous level, the anonymity judgment unit 13 outputs the data set to the group modification unit 14.
The group modification unit 14 modifies the groups of the data set based on the judgment result of the anonymity judgment unit 13. When the shortage of data in a group judged as not satisfying the adaptive anonymous level can be compensated by the excess of another group, the group modification unit 14 moves the excess data needed for the compensation from the other group to the group which does not satisfy the adaptive anonymous level.
After moving the data and modifying groups, the group modification unit 14 outputs the data set after modification to the anonymization unit 11.
Hereafter, the anonymization unit 11, the anonymous level setting unit 12, the anonymity judgment unit 13 and the group modification unit 14 repeat the described processing recursively until the group modification unit 14 judges that it cannot modify a group any more in a manner of satisfying the adaptive anonymous level in any group.
When judging as a state that it cannot modify a group in a manner that an adaptive anonymous level is satisfied in any group, the group modification unit 14 cancels the division which the anonymization unit 11 has performed finally, and returns a state to the state that all groups satisfy the respective adaptive anonymous levels. The returned data set becomes a data set divided as much as possible in the state that each group satisfies the adaptive anonymous level. Accordingly, this data set may be called a final data set.
The group modification unit 14 outputs the final data set to a display device, for example. The group modification unit 14 may output the final data set to a storage device, an external device or a system which is not illustrated.
In addition, the state in which at least one group cannot be modified so as to satisfy the adaptive anonymous level is, for example, a state in which the shortage of data of at least one group judged as not satisfying the adaptive anonymous level cannot be compensated by the excess data of another group. Alternatively, it is a state in which there is no excess data in the other groups.
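The shortage and excess described above can be expressed as a minimal sketch. It assumes that a group satisfies its adaptive anonymous level when its number of data is at least that level, and, as a simplification, it treats the levels of both groups as fixed while data is moved. The names `shortage`, `excess`, and `can_compensate` and the key `"k"` are hypothetical.

```python
def level(group):
    """Adaptive anonymous level: the highest request level in the group."""
    return max(r["k"] for r in group)

def shortage(group):
    """Number of data the group lacks to reach its adaptive anonymous level."""
    return max(0, level(group) - len(group))

def excess(group):
    """Number of data the group holds beyond its adaptive anonymous level."""
    return max(0, len(group) - level(group))

def can_compensate(short_group, other_group):
    """True if the other group's excess covers the short group's shortage."""
    need = shortage(short_group)
    return need > 0 and excess(other_group) >= need

# Hypothetical groups: one lacks a single data item, the other has one to spare.
short_group = [{"k": 3}, {"k": 2}]
other_group = [{"k": 2}, {"k": 2}, {"k": 2}]
print(can_compensate(short_group, other_group))
```

A fuller implementation would recompute both levels after each move, since the moved data may raise the level of the receiving group or lower the level of the donor group.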
Next, with reference to
As shown in
Next, the anonymous level setting unit 12 sets the adaptive anonymous levels to respective groups (Step S12). In this exemplary embodiment, the anonymous level setting unit 12 sets the request level of data having the highest request level of anonymization within a group as the adaptive anonymous level of the group.
Next, the anonymity judgment unit 13 judges whether or not each group in the data set divided into groups satisfies the adaptive anonymous level (Step S13). When judging that each group satisfies the adaptive anonymous level, the anonymity judgment unit 13 outputs the data set to the anonymization unit 11.
Hereafter, the processing of Step S11, Step S12 and Step S13 are repeated recursively until the anonymity judgment unit 13 judges that at least one group does not satisfy the adaptive anonymous level.
In Step S13, when at least one group is judged that it does not satisfy the adaptive anonymous level, the anonymity judgment unit 13 outputs the data set to the group modification unit 14.
The group modification unit 14 judges whether or not it is possible to modify the groups in a manner that all groups satisfy the respective adaptive anonymous levels (Step S14). Concretely, the group modification unit 14 judges whether or not a shortage of data of a group which is judged by the anonymity judgment unit 13 as not satisfying the adaptive anonymous level can be compensated by the excess of other group.
When judging that it is possible to be compensated, the group modification unit 14 moves the excess data from the other group to the group which does not satisfy the adaptive anonymous level. Based on this move, the group modification unit 14 modifies the groups so that all the groups satisfy the adaptive anonymous levels (Step S15).
After modifying the groups, the group modification unit 14 outputs the data set to the anonymization unit 11. Hereafter, the anonymization device 10 repeats the processing of Step S11, Step S12, Step S13, Step S14 and Step S15 recursively until the group modification unit 14 judges that it is impossible to modify the groups so that every group satisfies the adaptive anonymous level.
In Step S14, when judging as a state that at least one group cannot be modified in a manner of satisfying the adaptive anonymous level, the group modification unit 14 cancels the division of the data set performed by the anonymization unit 11 finally. Then, the group modification unit 14 returns the data set to a state that all the groups satisfy the adaptive anonymous levels (Step S16). The group modification unit 14 outputs the data set (the final data set) which has the state that each group satisfies the adaptive anonymous level to a display device, for example. The group modification unit 14 may output the final data set to a storage device, an external device or a system which is not illustrated.
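The control flow of Steps S11 to S16 can be sketched as the loop below. This is an illustrative sketch only: the callables `divide` and `modify`, the median split on "age", the early exit when no group can be divided, and the sample data are all assumptions, not the implementation of this exemplary embodiment.

```python
def level(group):
    return max(r["k"] for r in group)

def satisfied(groups):
    # Step S13: every group must hold at least as many data as its level.
    return all(len(g) >= level(g) for g in groups)

def anonymize_top_down(groups, divide, modify):
    """divide(groups) returns divided groups, or None if nothing can be divided.
    modify(groups) returns repaired groups, or None if repair is impossible."""
    while True:
        candidate = divide(groups)             # Step S11
        if candidate is None:
            return groups                      # nothing left to divide
        # Step S12 is implicit: level() is recomputed from the group members.
        if not satisfied(candidate):           # Step S13
            candidate = modify(candidate)      # Steps S14 and S15
            if candidate is None:
                return groups                  # Step S16: cancel the last division
        groups = candidate

def split_by_median_age(groups):
    """Hypothetical divider: split the first group holding at least 2 x average k data."""
    for i, g in enumerate(groups):
        avg_k = sum(r["k"] for r in g) / len(g)
        if len(g) >= 2 * avg_k:
            ordered = sorted(g, key=lambda r: r["age"])
            mid = len(ordered) // 2
            return groups[:i] + [ordered[:mid], ordered[mid:]] + groups[i + 1:]
    return None

def no_repair(groups):
    """Hypothetical modifier that always reports that repair is impossible."""
    return None

# Hypothetical data: (age, request level k).
data = [{"age": a, "k": k} for a, k in
        [(21, 2), (23, 2), (25, 2), (28, 2), (31, 2), (34, 3), (35, 2), (38, 2), (39, 2)]]
result = anonymize_top_down([data], split_by_median_age, no_repair)
print([len(g) for g in result])
```

With this data the last division of the thirties group leaves a group of two data whose level is 3, so the division is cancelled and the thirties group stays whole.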
Next, with reference to
And, in the following description, the anonymization device 10 divides data using a top-down processing that uses a classification tree.
In addition, although this exemplary embodiment is described using an anonymization method based on a classification tree, the anonymization technique adopted by the anonymization unit 11 is not limited to this. The anonymization unit 11 may use a general method such as clustering. A general clustering method is, for example, a method using the mean of the quasi-identifier values or the k-means method.
In Step S11 of
In Step S12 of
In Step S13 of
In Step S11 of
Here, the anonymization unit 11 may judge whether or not it is possible to divide the data set, and divide it when judging that it is possible. Alternatively, the anonymization unit 11 may divide data sets further using a classification tree without judging whether or not it is possible to divide it. As a judgment whether or not it is possible to divide, the anonymization unit 11 may judge that it is possible to divide a group when the number of data belonging to the group is “2×ave ki(j)” (hereinafter, referred to as “2ave ki(j)” by omitting “×”) or more. Here, “ave ki(j)” is the average of ki (request level of k-anonymity) of data included in group j. In the following description of this exemplary embodiment, it is supposed that the anonymization unit 11 judges whether or not it is possible to divide using this method.
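The divisibility judgment described above can be written as a short sketch. The function name `can_divide`, the key `"k"`, and the individual request levels in the sample groups are assumptions; only the rule "number of data is 2ave ki(j) or more" comes from the description.

```python
def can_divide(group):
    """Judge that a group can be divided when it holds at least 2 x ave k_i(j) data."""
    avg_k = sum(r["k"] for r in group) / len(group)
    return len(group) >= 2 * avg_k

# Hypothetical group of five data whose average request level is 2.
five_data = [{"k": 2}, {"k": 3}, {"k": 2}, {"k": 1}, {"k": 2}]
# Hypothetical group of four data whose average request level is 2.5.
four_data = [{"k": 4}, {"k": 2}, {"k": 2}, {"k": 2}]

print(can_divide(five_data))
print(can_divide(four_data))
```

The first group has five data against a threshold of 4 and is divisible; the second has four data against a threshold of 5 and is not.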
In the group of j=1 of the data set shown in
In the group of j=2, it is ave ki(j)=2 and it is 2ave ki(j)=4. The number of data of the group of j=2 is 5. Accordingly, the anonymization unit 11 judges that the group of j=2 (here, the group of thirties) can be divided. Then, the anonymization unit 11 divides the group of j=2 (the group of thirties) into two groups (here, a group of ages 30-34 and a group of ages 35-39) based on the top-down processing using a classification tree.
Next, in Step S12 of
Here, the group of j=2 is that the adaptive anonymous level (k(2)) is “3”. However, the number of data belonging to this group is 2. Accordingly, in Step S13 of
In Step S14 of
Referring to
In Step S15 of
For example, the group modification unit 14 considers a one-dimensional space which takes “age” for its axis as a data space. In this space of “age”, the center of gravity of the data of the group of j=2 is “32.5” which is the average of “31” of No. 4 and “34” of No. 9.
Based on the value of this center of gravity, the group modification unit 14 moves the data of No. 8 having a value of “age” of “35” which is the closest to “32.5” which is the “age” of the center of gravity of the group of j=2 within data belonging to the group of j=3 to the group of j=2, and modifies the groups.
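The move described above can be sketched as follows, assuming a one-dimensional data space on "age". The ages of No. 4, No. 9 and No. 8 follow the description; the two other data in the donor group (numbered 101 and 102 here) and the names `centroid` and `move_nearest` are hypothetical.

```python
def centroid(group):
    """Center of gravity of the group on the one-dimensional "age" axis."""
    return sum(r["age"] for r in group) / len(group)

def move_nearest(short_group, donor_group):
    """Move the donor data closest to the short group's center of gravity."""
    c = centroid(short_group)
    nearest = min(donor_group, key=lambda r: abs(r["age"] - c))
    remaining = [r for r in donor_group if r is not nearest]
    return short_group + [nearest], remaining

group_j2 = [{"no": 4, "age": 31}, {"no": 9, "age": 34}]
group_j3 = [{"no": 8, "age": 35},
            {"no": 101, "age": 37},   # hypothetical data
            {"no": 102, "age": 39}]   # hypothetical data

new_j2, new_j3 = move_nearest(group_j2, group_j3)
print(centroid(group_j2), [r["no"] for r in new_j2])
```

The center of gravity of the group of j=2 is 32.5, and the data of No. 8 (age 35) is the closest donor data, so it is the one moved.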
Next, processing of the anonymization device 10 returns to Step S11 of
The anonymization device 10 of this exemplary embodiment may judge whether or not all groups after modification can be divided at the time when the group modification unit 14 modifies groups. Then, at a time point when judging that it is impossible to divide the all groups after group modification by the group modification unit 14, the anonymization device 10 may output a final data set to a display device or the like, and ends processing. However, operations of the anonymization device 10 of this exemplary embodiment are not limited to this.
For example, consider a case where the processing returns to Step S11 and the anonymization unit 11 divides a group even though no group can be divided any further after the group modification unit 14 has modified the groups. In this case, the anonymity judgment unit 13 judges in Step S13 that a group which does not satisfy anonymity exists. Then, the group modification unit 14 judges that modification of the groups is impossible, and the processing of the anonymization device 10 proceeds to Step S16. In Step S16, the group modification unit 14 cancels the last division executed by the anonymization unit 11, and returns the data set to the state in which all groups satisfy the adaptive anonymous levels. Then, the group modification unit 14 outputs the final data set to a display device or the like.
In addition, the group modification unit 14 may be configured not to move data under a predetermined condition, even when moving the excess data of the other of the two divided groups would make the one group satisfy the adaptive anonymous level. For example, data is not moved when, in the data space, the distance between the center of gravity of the data of the group which does not satisfy the adaptive anonymous level and the data closest to that center of gravity among the excess data of the other group is equal to or greater than a predetermined threshold value. In this case, the group modification unit 14 may cancel the last division performed by the anonymization unit 11 without modifying the groups.
When described using specific values, it is a case where, in the above-mentioned example, a threshold value is 5 and the value of data belonging to the group of j=3 which is the closest to “32.5” which is the center of gravity of data belonging to the group of j=2 is “38”. In this case, the group modification unit 14 does not move data and cancels the last division.
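The threshold condition can be checked with the values given above. The function name `should_move` is hypothetical; the rule that data is not moved when the distance is equal to or greater than the threshold follows the description.

```python
def should_move(center_of_gravity, candidate_age, threshold):
    """Move data only when its distance to the center of gravity is below the threshold."""
    return abs(candidate_age - center_of_gravity) < threshold

# Center of gravity 32.5 and threshold 5, as in the description.
print(should_move(32.5, 38, 5))  # distance 5.5: the data is not moved
print(should_move(32.5, 35, 5))  # distance 2.5: the data may be moved
```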
Here, when the data set shown in
In addition, a structure of this exemplary embodiment is not limited to the structure shown in
As above-described, the anonymization device 10 according to the first exemplary embodiment can make all data satisfy the request levels of anonymization, and prevent a decline of an information value based on the whole data being abstracted.
The reason is because the anonymization device 10 sets the adaptive request level of anonymization (adaptive anonymous level) for each divided group. Further, it is also because the anonymization device 10 modifies groups so that the adaptive anonymous level becomes appropriate.
Next, an anonymization device 20 according to a second exemplary embodiment of the present invention will be described. The anonymization device 10 used for description of the first exemplary embodiment adopts top-down processing which uses a classification tree as anonymization processing. In contrast, the anonymization device 20 of this exemplary embodiment is different in the point that it adopts bottom-up processing.
The anonymization unit 21 receives a data set of two data or over from an external device or system. The anonymization unit 21 may receive a data set from a storage device or other constitution unit which is not illustrated. And, the anonymization unit 21 receives a data set or a judgment result from the anonymity judgment unit 23.
The anonymization unit 21 executes anonymization processing on a received data set, using a group of data as a processing unit. The anonymization processing of this exemplary embodiment is bottom-up processing. Bottom-up anonymization processing includes integration processing and abstraction processing of data. First, the anonymization unit 21 of this exemplary embodiment divides a data set into two or more groups so that the number of data in each unit group equals a predetermined minimum value. The minimum value may be set to a specific value in advance, or may be set based on a user's operation whenever the anonymization device 20 operates. Further, after the judgment processing by the anonymity judgment unit 23, the anonymization unit 21 integrates two groups, abstracts data if necessary, and thereby executes anonymization processing. There is no particular limitation on the anonymization processing performed in a bottom-up manner. For example, the anonymization processing may focus on an arbitrary quasi-identifier, integrate the groups whose centers of gravity are closest to each other in the data space, and abstract them, or it may be processing based on NCP (Normalized Certainty Penalty).
The anonymization unit 21 outputs, to the anonymous level setting unit 22, either a data set divided into plural groups each having the predetermined minimum number of data or a data set in which groups have been integrated.
The anonymous level setting unit 22 receives the data set from the anonymization unit 21. The anonymous level setting unit 22 sets the adaptive anonymous level for each group like the anonymous level setting unit 12.
The anonymous level setting unit 22 outputs the data set to which an adaptive anonymous level is set for each group to the anonymity judgment unit 23.
The anonymity judgment unit 23 receives the data set to which an adaptive anonymous level is set for each group from the anonymous level setting unit 22. The anonymity judgment unit 23 judges whether or not each group of the data set satisfies the adaptive anonymous level. When judging that at least one group does not satisfy the adaptive anonymous level, the anonymity judgment unit 23 outputs the data set to the anonymization unit 21.
Hereafter, the anonymization unit 21, the anonymous level setting unit 22 and the anonymity judgment unit 23 repeat processing recursively until the anonymity judgment unit 23 judges that all groups satisfy the adaptive anonymous levels.
When judging that all groups satisfy the adaptive anonymous levels (a data set of this case is a “final data set”), the anonymity judgment unit 23 outputs the final data set to a display device, for example. The anonymity judgment unit 23 may output the final data set to a storage device, an external device or a system which is not illustrated.
Next, with reference to
Next, the anonymous level setting unit 22 sets the adaptive anonymous level to the respective groups (Step S22). In this exemplary embodiment, the anonymous level setting unit 22 sets the request level of data which has the highest request level of anonymization within the group as the adaptive anonymous level of the group.
Next, the anonymity judgment unit 23 judges whether or not all groups of the data set satisfy the adaptive anonymous levels (Step S23). When judging that at least one group does not satisfy the adaptive anonymous level, the anonymity judgment unit 23 outputs the data set to the anonymization unit 21.
The anonymization unit 21 which receives the data set from the anonymity judgment unit 23 integrates a group and one or more other groups so that the group which does not satisfy the adaptive anonymous level satisfies the adaptive anonymous level (Step S24).
Hereafter, the anonymization device 20 repeats the processing of Step S22, Step S23 and Step S24 recursively until the anonymity judgment unit 23 judges that all groups satisfy the adaptive anonymous levels.
In Step S23, when judging that all groups satisfy the adaptive anonymous levels (in this case, the data set is a final data set), the anonymity judgment unit 23 outputs the final data set to a display device, for example. The anonymity judgment unit 23 may output the final data set to a storage device, an external device or system which is not illustrated.
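The flow of Steps S21 to S24 can be sketched as the loop below. It is a minimal illustration under several assumptions: the data space is one-dimensional on "age", the initial groups are formed by cutting the age-sorted data into chunks of the minimum size, the group with the greatest shortfall is integrated first, and its partner is the group with the closest center of gravity. All function names and the sample data are hypothetical.

```python
def level(group):
    return max(r["k"] for r in group)

def shortfall(group):
    """Adaptive anonymous level minus number of data (positive when unsatisfied)."""
    return level(group) - len(group)

def centroid(group):
    return sum(r["age"] for r in group) / len(group)

def anonymize_bottom_up(records, min_size=1):
    ordered = sorted(records, key=lambda r: r["age"])
    # Step S21: divide into groups holding the predetermined minimum number of data.
    groups = [ordered[i:i + min_size] for i in range(0, len(ordered), min_size)]
    while len(groups) > 1:
        # Steps S22 and S23: set the levels and judge each group.
        unsatisfied = [g for g in groups if shortfall(g) > 0]
        if not unsatisfied:
            break
        # Step S24: integrate the worst group with its nearest neighbor.
        selected = max(unsatisfied, key=shortfall)
        others = [g for g in groups if g is not selected]
        target = min(others, key=lambda g: abs(centroid(g) - centroid(selected)))
        groups = [g for g in others if g is not target] + [selected + target]
    return groups

# Hypothetical data: (age, request level k).
data = [{"age": a, "k": k} for a, k in
        [(21, 4), (23, 1), (25, 2), (26, 2), (31, 2), (34, 2)]]
result = anonymize_bottom_up(data)
print(sorted(len(g) for g in result))
```

The loop stops either when all groups satisfy their levels or when only one group remains, so it terminates even if the data set as a whole cannot satisfy the highest request level.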
Next, each step of
And, in the following description, the anonymization device 20 integrates data based on bottom-up processing.
In Step S21 of
In addition, when a predetermined minimum value is “2” tentatively, the data set will be divided into groups such as a group including two data of No. 3 and No. 5 and a group including two data of No. 1 and No. 7 shown in
In Step S21 of
In Step S23 of
In Step S24 of
The anonymization unit 21 selects a group (selected group) of a target of integration processing. For example, the anonymization unit 21 may select an optional group from groups which do not satisfy the adaptive anonymous levels as the target for the processing. Alternatively, the anonymization unit 21 may select a group with the greatest difference between the value of the adaptive anonymous level and the number of data in the group among groups which do not satisfy the adaptive anonymous levels as the target for the processing. A selection technique of a target for processing of this exemplary embodiment is not limited to the method described in this specification. However, in the following description of this exemplary embodiment, description will be made supposing that the anonymization unit 21 selects a group with the greatest difference between the adaptive anonymous level and the number of data as the target for the processing.
Next, the anonymization unit 21 selects another group (the integration target group) to be integrated with the group selected as the target for the processing (the selected group).
Here, the selection of the integration target group is not limited in particular. However, it is desirable for the anonymization unit 21 to select, as the integration target group, the group for which the integration processing causes the smallest loss of information. For example, the anonymization unit 21 selects, as the integration target group, the group whose center of gravity is closest in the data space to the center of gravity of the selected group. The anonymization unit 21 may then integrate the two selected groups (the selected group and the integration target group). Alternatively, using the technique of NCP, the anonymization unit 21 may select as the integration target group the group for which the degree of abstraction (for example, the width of the values that a quasi-identifier takes in the group after integration) is smallest when integrated with the selected group. In the description of this exemplary embodiment, it is supposed that the anonymization unit 21 selects the group with the closest gravity-center distance as the integration target group.
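The two ways of choosing the integration target group can be compared in a short sketch. It assumes a one-dimensional data space with "age" as the only quasi-identifier and represents each group as a list of ages. The function names are hypothetical, and `smallest_merged_width` uses only the raw width of the merged value range as a simplified stand-in for the degree of abstraction, not a full NCP computation.

```python
from typing import List, Sequence


def centroid(ages: Sequence[float]) -> float:
    """Center of gravity of a group on the one-dimensional age axis."""
    return sum(ages) / len(ages)


def closest_by_centroid(groups: List[List[float]], selected: int) -> int:
    """Index of the other group whose centroid is nearest to the selected group's."""
    target = centroid(groups[selected])
    candidates = [j for j in range(len(groups)) if j != selected]
    return min(candidates, key=lambda j: abs(centroid(groups[j]) - target))


def smallest_merged_width(groups: List[List[float]], selected: int) -> int:
    """Index of the other group giving the narrowest age range after merging."""

    def width(j: int) -> float:
        merged = groups[selected] + groups[j]
        return max(merged) - min(merged)

    candidates = [j for j in range(len(groups)) if j != selected]
    return min(candidates, key=width)
```

The two criteria can disagree. With the made-up groups `[[40.0], [30.0, 52.0], [47.0, 48.0]]` and the first group selected, the centroid rule picks the second group (centroid 41), whereas the width rule picks the third (merged range 40 to 48 instead of 30 to 52).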
The anonymization unit 21 selects the group of j=1, to which the data of No. 3 belongs and which has the greatest difference between the adaptive anonymous level and the number of data, as the target for the processing (that is, the group of j=1 is the selected group). This is because the number of data belonging to the group of j=1 is “1” and the adaptive anonymous level k(1) is “4”, so that the difference between them is “3”. The difference “3” of the group of j=1 is the greatest among the differences of all the groups.
The anonymization unit 21 selects the group of j=2, to which the data of No. 5 belongs, as the group to be integrated with the group of j=1 (in other words, the group of j=2 is the integration target group). This is because the group of j=2 is the closest to the group of j=1 in the one-dimensional space that takes “age” as its axis.
The anonymization unit 21 integrates the group of j=1 and the group of j=2.
Then, in Step S22 of
Here, in Step S23 of
Thereafter, the processing of Step S22, Step S23 and Step S24 is repeated recursively until the anonymity judgment unit 23 judges that the adaptive anonymous levels are satisfied in all groups.
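The repetition of Step S22, Step S23 and Step S24 can be sketched as a loop. Several points in this sketch are assumptions rather than statements of the specification: each record is a hypothetical `(age, requested level)` pair, the adaptive anonymous level of a group is taken to be the strictest level requested by any record in it, and the recursion is written as iteration. The selection rules are the ones supposed in this description (greatest shortfall, then closest center of gravity).

```python
from typing import List, Tuple

Record = Tuple[float, int]  # (age, requested anonymity level); hypothetical layout
Group = List[Record]


def adaptive_level(group: Group) -> int:
    # Assumption: the strictest request inside the group becomes its level.
    return max(level for _, level in group)


def centroid(group: Group) -> float:
    return sum(age for age, _ in group) / len(group)


def anonymize_bottom_up(groups: List[Group]) -> List[Group]:
    groups = [list(g) for g in groups]  # work on a copy of the input
    while True:
        # Step S22: set the adaptive anonymous level of every group.
        levels = [adaptive_level(g) for g in groups]
        # Step S23: find the groups whose size is below their level.
        gaps = [level - len(g) for g, level in zip(groups, levels)]
        unsatisfied = [j for j, gap in enumerate(gaps) if gap > 0]
        # Stop when all groups pass, or when nothing is left to merge with
        # (a single remaining group may still fall short of its level).
        if not unsatisfied or len(groups) == 1:
            return groups
        # Step S24: merge the group with the greatest shortfall into the
        # group whose center of gravity is closest to it.
        selected = max(unsatisfied, key=lambda j: gaps[j])
        others = [j for j in range(len(groups)) if j != selected]
        origin = centroid(groups[selected])
        target = min(others, key=lambda j: abs(centroid(groups[j]) - origin))
        merged = groups[selected] + groups[target]
        groups = [g for j, g in enumerate(groups) if j not in (selected, target)]
        groups.append(merged)
```

Each pass removes one group, so the loop always terminates. Starting from four made-up single-record groups, `anonymize_bottom_up([[(30.0, 2)], [(32.0, 1)], [(50.0, 1)], [(52.0, 2)]])` ends with two groups of two records each, both meeting level 2.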
In
Next, the anonymization unit 21 selects, as the selected group, the group of j=4, for which the difference between the adaptive anonymous level and the number of data is greatest. The anonymization unit 21 then selects, as the integration target group, the group of j=3, which includes the data of No. 4, one of the data items closest to the data of No. 9.
The anonymization unit 21 repeats the procedure described above, and repeats integration of groups as shown in
When the data of the data set is integrated into groups up to the state shown in
Here, when comparing the outputted final data set shown in
As mentioned above, the anonymization device 20 according to the second exemplary embodiment can make all data included in the data set satisfy the requested levels of anonymization while preventing the decline in information value that would result from abstracting the whole data set.
This is because the anonymization device 20 sets an adaptive request level of anonymization (adaptive anonymous level) for each group, and because it integrates groups so that the adaptive anonymous levels become appropriate.
Next, the anonymization device 30 according to the third exemplary embodiment of the present invention will be described with reference to a drawing.
The anonymization unit 31 executes anonymization processing on a data set inputted from outside of the anonymization unit 31, using each group of data as a processing unit. The anonymization unit 31 also receives a data set from the anonymity judgment unit 33. The anonymization unit 31 outputs the data set on which the anonymization processing has been executed to the anonymous level setting unit 32.
For each of the groups on which the anonymization unit 31 has executed anonymization processing, the anonymous level setting unit 32 sets the adaptive anonymous level based on the data included in the group. The anonymous level setting unit 32 outputs the data set, in which the adaptive anonymous level is set for each of the groups, to the anonymity judgment unit 33.
The anonymity judgment unit 33 judges whether or not each group satisfies the adaptive anonymous level that has been set for it. Depending on the judgment result, the anonymity judgment unit 33 either outputs the data set to the anonymization unit 31, or ends the processing and outputs the data set to a display device or the like.
Next, the anonymous level setting unit 32 sets the adaptive anonymous level for each group on which the anonymization unit 31 has executed anonymization processing (Step S32).
Next, the anonymity judgment unit 33 judges whether or not each group satisfies the corresponding adaptive anonymous level (Step S33).
According to the judgment result, the anonymity judgment unit 33 outputs the data set to the anonymization unit 31, or ends the processing and outputs the data set to a display device or the like.
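The interaction of the three units can be sketched as a small driver with pluggable steps. This is an illustration under stated assumptions: the names `run_anonymization`, `merge_first_failing` and `max_rounds` do not appear in the specification, groups are plain lists whose elements are requested levels, and the anonymization step shown (merging a failing group with a neighbor) is only one possible choice for the processing of the anonymization unit 31.

```python
from typing import Callable, List

Group = List[int]  # hypothetical: each element is one record's requested level


def run_anonymization(
    groups: List[Group],
    anonymize: Callable[[List[Group], List[int]], List[Group]],
    set_level: Callable[[Group], int],
    max_rounds: int = 100,
) -> List[Group]:
    failing: List[int] = []  # nothing has been judged before the first round
    for _ in range(max_rounds):
        # Anonymization unit 31: process the data set group by group.
        groups = anonymize(groups, failing)
        # Anonymous level setting unit 32 (Step S32).
        levels = [set_level(g) for g in groups]
        # Anonymity judgment unit 33 (Step S33).
        failing = [j for j, (g, k) in enumerate(zip(groups, levels)) if len(g) < k]
        if not failing:
            return groups  # final data set
    raise RuntimeError("adaptive anonymous levels not satisfied within max_rounds")


def merge_first_failing(groups: List[Group], failing: List[int]) -> List[Group]:
    """Example anonymization step: merge the first failing group with a neighbor."""
    if not failing or len(groups) < 2:
        return groups
    j = failing[0]
    neighbor = j + 1 if j + 1 < len(groups) else j - 1
    merged = groups[j] + groups[neighbor]
    return [g for i, g in enumerate(groups) if i not in (j, neighbor)] + [merged]
```

With the made-up input `[[3], [1], [1]]` and `max` as the level-setting rule, `run_anonymization([[3], [1], [1]], merge_first_failing, max)` returns `[[3, 1, 1]]`: the judgment fails twice and sends the data set back before the single merged group of three records meets level 3.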
As mentioned above, the anonymization device 30 according to the third exemplary embodiment can make all data satisfy the requested levels of anonymization while preventing the decline in information value that would result from abstracting the whole data set.
This is because the anonymization device 30 sets an adaptive request level of anonymization (adaptive anonymous level) for each group.
While the invention has been particularly shown and described with reference to exemplary embodiments thereof, the invention is not limited to these embodiments. It will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the claims.
For example, the anonymization device 10 may receive a data set from outside via the communication IF 2.
The CPU 1 runs an operating system and controls the whole of the anonymization device 10. For example, the CPU 1 may read the program and the data set into the memory 3 from a computer-readable recording medium (not illustrated) mounted on a drive apparatus, and execute various kinds of processing based on them.
For example, a part of or all of the functions of the anonymization unit 11, the anonymous level setting unit 12, the anonymity judgment unit 13 and the group modification unit 14 may be realized using the CPU 1 and the program.
The storage device 4 is, for example, an optical disk, a flexible disk, a magneto-optical disk, an external hard disk or a semiconductor memory, and stores a computer program in a form readable by a computer (CPU). The storage device 4 may store, for example, the data set and the computer program for realizing the anonymization device 10. The computer program for realizing the anonymization device 10 may also be downloaded from an external computer (not shown) connected to a communication network.
In addition, the block diagrams used in the exemplary embodiments described so far show blocks of functional units, not the structure of hardware units. These function blocks may be realized using any combination of hardware and software. The means of realizing the constituent units of the anonymization device 10 is not limited to a particular physical device. That is, the anonymization device 10 may be realized as one physically combined device, or it may be realized by connecting two or more physically separated devices by wire or wirelessly and using these plural devices.
A program of the present invention need only be a program that causes a computer to execute each operation described in each of the above-mentioned exemplary embodiments.
And, the anonymization device 20 according to the second exemplary embodiment and the anonymization device 30 according to the third exemplary embodiment may be realized by the computer based on the hardware configuration shown in
This application claims priority based on Japanese Patent Application No. 2011-191355, filed on Sep. 2, 2011, the disclosure of which is incorporated herein in its entirety.
Number | Date | Country | Kind |
---|---|---|---|
2011-191355 | Sep 2011 | JP | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2012/072282 | 8/28/2012 | WO | 00 | 2/25/2014 |