One aspect of the present invention relates to a population calculation system for calculating population that is concealed and a population calculation method thereof.
Conventionally, methods for finding population in a specific area have been known. For example, Patent Literature 1 discloses a population distribution analyzing apparatus including area setting means for setting an area in which population distribution of a plurality of users carrying respective portable terminals is to be analyzed, positional information acquiring means for acquiring positional information of each of the portable terminals, and analyzing means for analyzing the population distribution of the users in the set area on the basis of the acquired positional information.
However, when population is summed up with an apparatus such as that described in Patent Literature 1, an individual can sometimes be identified on the basis of the population data resulting from the summation, which may cause a privacy problem. Accordingly, it is required to hide information by superimposing noise on population data to conceal the population data. However, when unnecessarily large noise is superimposed in concealing population data, deviation from the population data before concealment becomes large, and thus there is a possibility that reliability of information on the population data decreases.
Therefore, one aspect of the present invention aims to provide a population calculation system that can conceal and present population while maintaining the reliability of population data at or above a certain level, and a population calculation method thereof.
A population calculation system according to one aspect of the present invention is a population calculation system for calculating concealed population in a target area, the system including acquisition means for acquiring the number of counted people who are counted as samples in a count area containing the target area; population calculation means for calculating population in the count area on the basis of the number of counted people acquired by the acquisition means and a scaling factor for determining the population in the count area on the basis of the number of counted people and calculating population in the target area on the basis of the population in the count area thus calculated; concealing means for concealing the population in the count area or the population in the target area on the basis of a class interval that is a product of concealment reference that is a reference value of a minimum summation unit and the scaling factor in calculation processing by the population calculation means; and output means for outputting the population in the target area obtained through concealment processing by the concealing means as the concealed population in the target area.
A population calculation method according to one aspect of the present invention is a population calculation method executed by a population calculation system for calculating concealed population in a target area, the method including an acquisition step of, by the population calculation system, acquiring the number of counted people who are counted as samples in a count area containing the target area; a population calculation step of, by the population calculation system, calculating population in the count area on the basis of the number of counted people acquired at the acquisition step and a scaling factor for determining the population in the count area on the basis of the number of counted people and calculating population in the target area on the basis of the population in the count area thus calculated; a concealing step of, by the population calculation system, concealing the population in the count area or the population in the target area on the basis of a class interval that is a product of concealment reference that is a reference value of a minimum summation unit and the scaling factor in calculation processing at the population calculation step; and an output step of, by the population calculation system, outputting the population in the target area obtained through concealment processing at the concealing step as the concealed population in the target area.
According to these aspects, when determining population in the count area from the number of counted people and determining population in the target area on the basis of the population, the population in the count area or the population in the target area is concealed on the basis of the class interval that is the product of the concealment reference and the scaling factor. In this manner, by discretely determining population in the target area using the class interval based on the concealment reference, fractions below the class interval can be properly rounded, and thus it is possible to conceal and present population while maintaining the reliability of population data at or above a certain level.
In a population calculation system according to another aspect, the concealing means may conceal the population in the target area calculated by the population calculation means on the basis of the class interval.
According to this aspect, the population in the target area that is a final result of calculation is concealed on the basis of the class interval in calculation processing, which makes it possible to conceal and present population while maintaining the reliability of population data at or above a certain level.
In a population calculation system according to still another aspect, optionally, the concealing means conceals the population in the count area calculated by the population calculation means on the basis of the class interval, and the population calculation means calculates a product of the population in the count area concealed by the concealing means and a ratio of the population in the target area to the population in the count area before concealment as the population in the target area obtained through the concealment processing.
According to this aspect, in the calculation processing, population in the count area that is an intermediate result of calculation is concealed on the basis of the class interval, and the product of the concealed population in the count area and the ratio of the population in the target area to the population in the count area before concealment is calculated as the population in the target area obtained through the concealment processing. In this manner, by concealing the population in the count area whose population is larger than that in the target area and then multiplying it by the ratio of the population in the target area to the population in the count area to calculate the concealed population in the target area, deviation from the population data before concealment can be reduced compared to the case in which population in the target area is directly concealed.
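The effect described in this aspect can be illustrated with a minimal Python sketch. The function names are hypothetical, and rounding down to an integral multiple of the class interval is only one possible concealment method. The inputs (count-area population 20, target-area population 5, class interval 20) are the attribute 1 figures that appear in the embodiments; the comparison itself is this sketch's own arithmetic, not a result stated in the description.

```python
import math

def quantize_down(population, interval):
    """Round down to an integral multiple of the class interval (one possible concealment)."""
    return math.floor(population / interval) * interval

def conceal_via_count_area(count_pop, target_pop, interval):
    """Conceal the count-area population, then scale by the pre-concealment ratio.

    Assumes count_pop > 0 (illustrative sketch only).
    """
    ratio = target_pop / count_pop  # target/count ratio before concealment
    return quantize_down(count_pop, interval) * ratio

count_pop, target_pop, interval = 20, 5, 20
direct = quantize_down(target_pop, interval)                        # conceal target directly
indirect = conceal_via_count_area(count_pop, target_pop, interval)  # conceal count area first
print(direct, indirect)
```

Under these assumptions, concealing the target area directly loses the whole population of 5, whereas concealing the larger count-area population first and then applying the ratio preserves it.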
In a population calculation system according to still another aspect, optionally, the target area and the count area are the same, the concealing means conceals the population in the count area calculated by the population calculation means on the basis of the class interval, and the population calculation means calculates the population in the count area concealed by the concealing means as the population in the target area obtained through the concealment processing.
According to this aspect, when the target area and the count area are the same, the calculated population in the count area is concealed on the basis of the class interval and is calculated as the population in the target area. In this manner, even when the target area and the count area are the same, it is possible to conceal and present population while maintaining the reliability of population data at or above a certain level.
In a population calculation system according to still another aspect, optionally, the acquisition means acquires the number of counted people for each of a plurality of attributes, the population calculation means calculates the population in the count area and the population in the target area for each of the plurality of attributes on the basis of the scaling factor that is set for each of the plurality of attributes and also calculates the sum of population in the count area and the sum of population in the target area for at least two attributes out of the plurality of attributes, and the concealing means conceals the total population in the count area or the total population in the target area on the basis of the class interval that is a product of the concealment reference and a largest scaling factor among scaling factors each set for the at least two attributes.
According to this aspect, the population in the count area and the population in the target area are calculated for each of the plurality of attributes, and also the sum of population for the at least two attributes out of the plurality of attributes in these two areas is calculated. Total population in the count area or in the target area is concealed on the basis of the class interval that is the product of the concealment reference and the largest scaling factor among scaling factors each set for the at least two attributes. Although a largest scaling factor among scaling factors for a plurality of attributes is a scaling factor for an attribute with which an individual can be most easily identified among the plurality of attributes, by concealing the total population in the target area on the basis of the largest scaling factor, a risk of an individual being identified can be suppressed.
In a population calculation system according to still another aspect, optionally, the concealing means conceals the total population in the count area or the total population in the target area on the basis of not the class interval but another class interval different from the class interval, and the other class interval is the sum of scaling factors from top to n-th (n is the concealment reference) when the scaling factors each set for the at least two attributes are arranged in descending order.
According to this aspect, concealment is performed with the sum of scaling factors from top to n-th (n is the concealment reference) when the scaling factors each set for the at least two attributes are arranged in descending order. Accordingly, it is possible to conceal population while reducing deviation from the population data before concealment.
In a population calculation system according to still another aspect, the concealing means may quantize the population into an integral multiple of the class interval when performing concealment based on the class interval.
According to this aspect, concealment is performed by quantizing the population into an integral multiple of the class interval. Accordingly, it is possible to conceal population while reducing deviation from the population data before concealment.
In a population calculation system according to still another aspect, the concealing means may, when rounding population to a specific class by quantizing the population into an integral multiple of the class interval, round the population to either one of a class whose difference from the population is the smallest and a class whose difference from the population is the subsequently smallest on the basis of the differences between the population and the respective classes.
According to this aspect, when concealing population, the differences between the population and the respective classes as candidates to which the population is to be rounded are considered, and thus it is possible to conceal the population while reducing deviation from the population data before concealment.
In a population calculation system according to still another aspect, the concealing means may quantize population into an integral multiple of the other class interval when performing concealment based on the other class interval.
According to this aspect, concealment is performed by quantizing the population into an integral multiple of the other class interval. Accordingly, it is possible to conceal population while reducing deviation from the population data before concealment.
In a population calculation system according to still another aspect, the concealing means may, when rounding population to a specific class by quantizing the population into an integral multiple of the other class interval, round the population to either one of a class whose difference from the population is the smallest and a class whose difference from the population is the subsequently smallest on the basis of the differences between the population and the respective classes.
According to this aspect, when concealing population, the differences between the population and the respective classes as candidates to which the population is to be rounded are considered, and thus it is possible to conceal population while reducing deviation from the population data before concealment.
In a population calculation system according to still another aspect, the acquisition means may, with respect to each positional information that is registered from each of mobile devices within the count area in a predetermined period of time, calculate each of feature amounts by using two or more out of time when each of the mobile devices registers the positional information, time when each of the mobile devices registers the previous positional information, and time when each of the mobile devices registers the following positional information, estimate the number of mobile devices within the count area on the basis of the sum of the feature amounts, and acquire this number as the number of counted people.
According to this aspect, on the basis of the feature amounts, the more accurate number of the counted people can be acquired.
In a population calculation system according to still another aspect, the acquisition means may, out of pieces of positional information registered by mobile devices, on the basis of pieces of positional information that are within a summation time period in which times when the mobile devices register the pieces of positional information are summed up or an expanded time period to which the summation time period is expanded, extract mobile devices that are presumed to be present in the count area within at least part of the summation time period or one piece of positional information that is generated by these mobile devices within the summation time period or the expanded time period, and on the basis of the number of the mobile devices or the number of the pieces of positional information thus extracted, estimate the number of the mobile devices within the summation time period and acquire this number as the number of counted people.
According to this aspect, because double counting of the mobile devices can be avoided, the more accurate number of the counted people can be acquired.
A population calculation system according to still another aspect is a population calculation system for calculating concealed population in a target area, the system including population calculation means for calculating population in a count area containing the target area and calculating the population in the target area on the basis of the population in the count area thus calculated; concealing means for concealing the population in the count area calculated by the population calculation means; and output means for outputting a product of the population in the count area concealed by the concealing means and a ratio of the population in the target area to the population in the count area before concealment as the concealed population in the target area.
A population calculation method according to another aspect is a population calculation method executed by a population calculation system for calculating concealed population in a target area, the method including a population calculation step of, by the population calculation system, calculating population in a count area containing the target area and calculating the population in the target area on the basis of the population in the count area thus calculated; a concealing step of, by the population calculation system, concealing the population in the count area calculated at the population calculation step; and an output step of, by the population calculation system, outputting a product of the population in the count area concealed at the concealing step and a ratio of the population in the target area to the population in the count area before concealment as the concealed population in the target area.
According to these aspects, the population in the count area and the population in the target area are calculated, and the product of the population in the count area concealed and the ratio of the population in the target area to the population in the count area before concealment is output as the concealed population in the target area. In this manner, by concealing the population in the count area whose population is larger than that in the target area and then multiplying it by the ratio of the population in the target area to the population in the count area to calculate the concealed population in the target area, deviation from the population data before concealment can be reduced compared to the case in which population in the target area is directly concealed.
With the population calculation system and the population calculation method described above, it is possible to conceal and present population while maintaining the reliability of population data at or above a certain level.
FIGS. 4(a) to 4(c) are diagrams illustrating examples of information stored in a database depicted in
FIGS. 11(a) to 11(c) are diagrams illustrating examples of information stored in the database depicted in
Embodiments of the present invention will now be described in detail with reference to the attached drawings. Note that like reference signs are given to like or equivalent elements in descriptions of the drawings, and redundant explanations are omitted.
Referring to
This population calculation system 1 as depicted
Referring back to
The target area is a specific geographical range for which estimated population is to be calculated, and the count area is a communicable range for a specific base station constituting a mobile communication network and is constituted by a plurality of sectors in the present embodiment. An example of the target area and the count area is illustrated in
When a mobile device enters a certain sector of a certain base station, position registration processing is performed by communication between the mobile device and the base station, and positional information indicating that the mobile device is present in the sector is stored in a predetermined database (not depicted) of the mobile communication network. Alternatively, by periodic communication between the mobile device and the base station, the position registration processing is periodically performed and the positional information is stored in the database. Accordingly, zero or more pieces of positional information can be registered in the database for each of the sectors A and B, for example. A database in which pieces of user information on users of mobile devices are registered also exists in the mobile communication network. The acquisition module 10 sums up (counts) the number of users (the number of counted people) present in each sector for each of user attributes by referring to these databases. At this time, the acquisition module 10 counts mobile devices in each sector as the number of users.
The acquisition module 10 may acquire the number of counted people on the basis of pieces of positional information of mobile devices acquired by GPSs, for example, that are built in the mobile devices. The pieces of positional information of the mobile devices acquired by the GPSs, for example, are stored in the predetermined database of the mobile communication network. The acquisition module 10 refers to this database and the database in which the pieces of user information are registered and sums up the number of pieces of positional information that indicate presence in the count area, thereby summing up the number of users present in the count area for each of the user attributes. When the count area and the target area are the same or they indicate almost the same geographic range, the acquisition module 10 may acquire the number of counted people by summing up the number of pieces of positional information that indicate presence in the target area instead of the count area. Note that the summation conditions and the summation method used when the acquisition module 10 acquires the number of counted people are not limited.
The “number of users” column in
In the database 15, information given in
c) indicates the area ratio of a target area contained in a count area to the count area. For example,
The first calculation module 11 is means for calculating population in a count area on the basis of the number of counted people acquired by the acquisition module 10 and the scaling factor for determining population in the count area from the number of counted people. The first calculation module 11 may calculate population in the count area for each of attributes on the basis of the scaling factor for each of the attributes.
The first calculation module 11 calculates, as the population of the attribute 1 in the sector A, 5×2=10 that is a product of the number of users with the attribute 1 in the sector A and the scaling factor for the attribute 1. Similarly, the first calculation module 11 calculates 152×2.5=380 as the population of the attribute 2 in the sector A, 5×2=10 as the population of the attribute 1 in the sector B, and 55×2.5=137.5 as the population of the attribute 2 in the sector B. The first calculation module 11 stores these calculation results in the database 15 as in the “population” column depicted in
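The scaling step above can be sketched in Python as follows. This is only an illustration using the counts and scaling factors quoted in the text; the dictionary layout and the function name `count_area_population` are assumptions of this sketch, not part of the described system.

```python
# Counted users per (sector, attribute), as quoted in the worked example.
counted_users = {("A", 1): 5, ("A", 2): 152, ("B", 1): 5, ("B", 2): 55}
# Scaling factor per attribute, as quoted in the worked example.
scaling_factor = {1: 2.0, 2: 2.5}

def count_area_population(counted, factors):
    """Multiply each counted number of users by the scaling factor of its attribute."""
    return {(sector, attr): n * factors[attr] for (sector, attr), n in counted.items()}

population = count_area_population(counted_users, scaling_factor)
for key in sorted(population):
    print(key, population[key])
```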
The second calculation module 12 is means for calculating population in the target area on the basis of the population in the count area calculated. The second calculation module 12 may calculate population in the target area for each of attributes.
The second calculation module 12 calculates, as the population of the attribute 1 in the mesh M, 10×0.3+10×0.2=5 from the population of the attribute 1 in the sector A, the population of the attribute 1 in the sector B, the area ratio of the mesh M contained in the sector A to the sector A, and the area ratio of the mesh M contained in the sector B to the sector B. Similarly, the second calculation module 12 calculates 380×0.3+137.5×0.2=141.5 as the population of the attribute 2 in the mesh M. In addition, the second calculation module 12 calculates 5+141.5=146.5 that is the sum of population of all attributes in the mesh M as the total population in the mesh M. The second calculation module 12 stores these calculation results in the database 15 as in the “M population” (indicating the population in the mesh M) column of the table depicted in
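The area-ratio apportionment above can be sketched as follows. The data layout and the function name `mesh_population` are assumptions made for illustration; the figures are those quoted in the text.

```python
# Sector populations per attribute (values from the "population" column in the text).
sector_population = {"A": {1: 10.0, 2: 380.0}, "B": {1: 10.0, 2: 137.5}}
# Area ratio of the part of the mesh M contained in each sector to that sector.
area_ratio = {"A": 0.3, "B": 0.2}

def mesh_population(sector_population, area_ratio):
    """Sum, per attribute, each sector population weighted by its area ratio."""
    totals = {}
    for sector, by_attr in sector_population.items():
        for attr, pop in by_attr.items():
            totals[attr] = totals.get(attr, 0.0) + pop * area_ratio[sector]
    return totals

per_attribute = mesh_population(sector_population, area_ratio)
total = sum(per_attribute.values())
print(per_attribute, total)
```

Floating-point arithmetic is used here for brevity, so the values should be compared with a small tolerance rather than for exact equality.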
The quantization module 13 is means for concealing population in the target area on the basis of a class interval that is a product of concealment reference that is a reference value of a minimum summation unit and the scaling factor in calculation processing by the second calculation module 12. In the present embodiment, the quantization module 13 is explained as a module for quantizing the population in the target area into an integral multiple of the class interval, but the method of concealment is not limited to this.
The quantization module 13 calculates the class interval that is a product of the concealment reference and the scaling factor for each attribute. The concealment reference herein is a reference value of the minimum number of people in a summation unit. For example, when summing up the number of users yields only a few users, there is a possibility that individuals are easily identified. Accordingly, by preventing the number of users resulting from summation from being equal to or smaller than a predetermined number of people, it becomes possible to perform summation in which individuals in a group no larger than the predetermined number of people are not identified. This predetermined number of people is the concealment reference. In the present embodiment, the concealment reference is uniformly assumed to be 10 regardless of types of attribute or areas.
When determining population of the attribute 1, the quantization module 13 calculates 10×2=20 as the class interval for the attribute 1. Next, the quantization module 13 quantizes five that is the population of the attribute 1 in the mesh M into an integral multiple of 20 that is the class interval. The quantization module 13, when rounding population to a predetermined class by quantizing the population into an integral multiple of the class interval, rounds the population to either one of a class whose difference from the population is the smallest and a class whose difference from the population is the subsequently smallest. Herein, the class whose difference from the population is the smallest is zero (the difference is five), and the class whose difference from the population is the subsequently smallest is 20 (the difference is 15). Out of these two classes, the class (0) whose value is smaller is defined as a lower value, and the class (20) whose value is larger is defined as an upper value. In the present embodiment, it is assumed that the quantization module 13 rounds population to be rounded to the lower value. Accordingly, the quantization module 13 quantizes five that is the population of the attribute 1 in the mesh M into zero. Similarly, the quantization module 13 quantizes 141.5 that is the population of the attribute 2 in the mesh M into 125 on the basis of an integral multiple of 10×2.5=25 that is the class interval for the attribute 2.
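A minimal sketch of this per-attribute quantization to the lower value is given below. The function names `class_interval` and `quantize_to_lower` are hypothetical, and the concealment reference of 10 is the value assumed in the present embodiment.

```python
import math

CONCEALMENT_REFERENCE = 10  # value assumed in the present embodiment

def class_interval(scaling_factor, reference=CONCEALMENT_REFERENCE):
    """Class interval = concealment reference x scaling factor."""
    return reference * scaling_factor

def quantize_to_lower(population, interval):
    """Round down to the nearest integral multiple of the class interval."""
    return math.floor(population / interval) * interval

attr1 = quantize_to_lower(5, class_interval(2))        # class interval 20
attr2 = quantize_to_lower(141.5, class_interval(2.5))  # class interval 25
print(attr1, attr2)
```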
The quantization module 13 may quantize the total population in the target area into an integral multiple of a class interval that is a product of the concealment reference and the largest scaling factor among scaling factors each set for attributes.
The largest scaling factor among scaling factors set for the attributes 1 and 2 is 2.5 for the attribute 2, and thus the quantization module 13 calculates 10×2.5=25 as a class interval for the total population. Next, the quantization module 13 quantizes 146.5 that is the total population in the mesh M to obtain 125 that is an integral multiple of 25 being the class interval.
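The choice of class interval for the total population can be sketched as follows. The helper names are assumptions of this sketch, and rounding to the lower value follows the present embodiment.

```python
import math

def total_class_interval(scaling_factors, concealment_reference):
    """Class interval for a total: concealment reference x largest scaling factor."""
    return concealment_reference * max(scaling_factors)

def quantize_to_lower(population, interval):
    """Round down to the nearest integral multiple of the class interval."""
    return math.floor(population / interval) * interval

interval = total_class_interval([2, 2.5], 10)         # attributes 1 and 2
concealed_total = quantize_to_lower(146.5, interval)  # total population in the mesh M
print(interval, concealed_total)
```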
In the present embodiment, the quantization module 13 rounds population to the lower value, but the method of rounding is not limited to this. For example, the quantization module 13 may round up population to the upper value. Alternatively, the quantization module 13 may round population unilaterally to one of the upper value and the lower value whose difference from the population is smaller than that of the other, or may round population to either one of the upper value and the lower value in a random manner.
Furthermore, the quantization module 13 may, when rounding population by quantizing the population into an integral multiple of the class interval, round the population to either one of a class whose difference from the population is the smallest and a class whose difference from the population is the subsequently smallest on the basis of the differences between the population and the respective classes.
A method of rounding population in a random manner will be described below in which probability of the population being rounded to the upper value is increased when the population to be rounded is closer to the upper value and probability of the population being rounded to the lower value is increased when the population is closer to the lower value. It is assumed that population to be rounded is e, the lower value is r1, the upper value is r2, the probability of the population e being rounded to r1 is (r2−e)/(r2−r1), and the probability of the population e being rounded to r2 is (e−r1)/(r2−r1). For example, when rounding population of the attribute 1 in the mesh M, e=5, r1=0, and r2=20, and thus the probability that the quantization module 13 rounds the population of the attribute 1 in the mesh M to the lower value becomes (20−5)/(20−0)=75(%) and the probability that the quantization module 13 rounds it to the upper value becomes (5−0)/(20−0)=25(%).
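The probabilistic rounding described above can be sketched as follows. The function names and the use of Python's `random` module as the source of randomness are assumptions of this sketch; the description does not specify how the random choice is made.

```python
import math
import random

def rounding_probabilities(e, interval):
    """Return (r1, r2, P(round to r1), P(round to r2)) for population e."""
    r1 = math.floor(e / interval) * interval  # lower value
    r2 = r1 + interval                        # upper value
    p_lower = (r2 - e) / (r2 - r1)
    p_upper = (e - r1) / (r2 - r1)
    return r1, r2, p_lower, p_upper

def round_randomly(e, interval, rng=random):
    """Round e to r1 with probability p_lower, otherwise to r2."""
    r1, r2, p_lower, _ = rounding_probabilities(e, interval)
    return r1 if rng.random() < p_lower else r2

r1, r2, p_lower, p_upper = rounding_probabilities(5, 20)
print(r1, r2, p_lower, p_upper)
print(round_randomly(5, 20, random.Random(0)))
```

With these probabilities the expected value of the rounded result, r1×(r2−e)/(r2−r1)+r2×(e−r1)/(r2−r1), equals e, so the rounding introduces no systematic bias on average.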
The quantization module 13 may quantize the total population in the target area, without using the above class interval, by using as another class interval the sum of scaling factors from top to n-th (n is the concealment reference) when the scaling factors each set for the attributes are arranged in descending order.
For example, it is assumed that respective scaling factors for 10 people present in a certain area are 10, 5, 2, 5, 3, 1, 2, 4, 6, and 3. When the concealment reference is three herein, the quantization module 13 determines, as the other class interval, 21 that is the sum of 10, 6, and 5, which are the top three scaling factors when the scaling factors for the above 10 people are arranged in descending order. The quantization module 13 quantizes the total population into an integral multiple of the other class interval. Herein, when the quantization module 13 calculates the class interval as a product of the concealment reference and the largest scaling factor, the class interval becomes 3×10=30; accordingly, using 21 that is the other class interval reduces the deviation between the population data before and after the concealment. However, when the number of people with an attribute for which the scaling factor is largest is equal to or larger than the concealment reference, the other class interval becomes the same as that in the case when using the largest scaling factor.
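The other class interval can be computed as in the following sketch. The function name `top_n_class_interval` is hypothetical; the scaling factors and the concealment reference of three are those of the example above, and the second list is an illustrative input added to show the final remark.

```python
def top_n_class_interval(scaling_factors, n):
    """Sum of the n largest scaling factors (n is the concealment reference)."""
    return sum(sorted(scaling_factors, reverse=True)[:n])

factors = [10, 5, 2, 5, 3, 1, 2, 4, 6, 3]
other_interval = top_n_class_interval(factors, 3)  # 10 + 6 + 5
largest_based = 3 * max(factors)                   # 3 x 10
print(other_interval, largest_based)

# Illustrative input: three people share the largest factor, so both intervals coincide.
same = top_n_class_interval([10, 10, 10, 5], 3)
print(same)
```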
The output module 14 is means for outputting the population in the target area obtained through quantization (concealment) processing by the quantization module 13 as concealed population in the target area.
More specifically, the output module 14 stores the quantized population in the database 15 as in the “concealed M population” column of
Referring to
To begin with, the acquisition module 10 acquires the numbers of users in the sectors A and B containing the mesh M (step S11, acquisition step). Next, the first calculation module 11 calculates populations in the sectors A and B on the basis of the number of users acquired at step S11 and the scaling factor, and the second calculation module 12 calculates population in the mesh M on the basis of the populations in the sectors A and B thus calculated (step S12, population calculation step). Next, the quantization module 13 conceals the population in the mesh M calculated at step S12 on the basis of the class interval that is the product of the concealment reference and the scaling factor (step S13, concealing step). Next, the output module 14 outputs the population in the mesh M concealed at step S13 as the concealed population in the mesh M (step S14, output step).
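Steps S11 to S14 can be sketched end to end as follows. This is an illustrative sketch only: the function names, the in-memory dictionaries standing in for the database 15, and rounding to the lower value are assumptions, and the figures are those of the worked example in the first embodiment.

```python
import math

SCALING = {1: 2.0, 2: 2.5}         # scaling factor per attribute
AREA_RATIO = {"A": 0.3, "B": 0.2}  # area ratio of the mesh M within each sector
REFERENCE = 10                     # concealment reference

def acquire_counts():
    """S11: stand-in for the acquisition module; returns counted users."""
    return {("A", 1): 5, ("A", 2): 152, ("B", 1): 5, ("B", 2): 55}

def calculate_mesh_population(counts):
    """S12: scale counts to sector populations, then apportion them to the mesh M."""
    mesh = {}
    for (sector, attr), n in counts.items():
        mesh[attr] = mesh.get(attr, 0.0) + n * SCALING[attr] * AREA_RATIO[sector]
    return mesh

def conceal(population, scaling_factor):
    """S13: round down to an integral multiple of the class interval."""
    interval = REFERENCE * scaling_factor
    return math.floor(population / interval) * interval

def run():
    """Run S11 to S13 and return the concealed populations (S14: output)."""
    mesh = calculate_mesh_population(acquire_counts())
    return {attr: conceal(pop, SCALING[attr]) for attr, pop in mesh.items()}

result = run()
print(result)
```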
As described above, according to the present embodiment, when determining populations in the sectors A and B from the number of counted people and determining the population in the mesh M on the basis of the populations, the populations in the sectors A and B or the population in the mesh M is quantized into an integral multiple of the class interval that is the product of the concealment reference and the scaling factor. In this manner, by discretely determining the population in the mesh M using the class interval based on the concealment reference, fractions below the class interval can be properly rounded, and thus it is possible to conceal and present population while maintaining the reliability of population data at or above a certain level.
In addition, according to the present embodiment, populations in the sectors A and B and population in the mesh M are calculated for each of attributes, and also the sum of the populations of all attributes in these two areas is calculated. Furthermore, the total population in the sectors A and B or the mesh M is quantized into an integral multiple of the class interval that is the product of the concealment reference and the largest scaling factor among scaling factors each set for the attributes. Although the largest scaling factor among scaling factors for the respective attributes is a scaling factor for an attribute with which an individual can be most easily identified among the respective attributes, by concealing the total population in the target area on the basis of the largest scaling factor, a risk of an individual being identified can be suppressed.
In addition, according to the present embodiment, quantization is performed by using as the class interval the sum of scaling factors from top to n-th (n is the concealment reference) when the scaling factors each set for the attributes are arranged in descending order. Accordingly, it is possible to quantize population while reducing deviation from the population data before quantization.
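The following sketch illustrates this alternative class interval under the assumption that each counted person carries the scaling factor of his or her attribute. The person counts below are hypothetical and chosen only to show that the resulting interval can be smaller than the product of the concealment reference and the largest scaling factor.

```python
import heapq


def top_n_class_interval(person_scaling_factors, n: int) -> float:
    """Sum of the n largest per-person scaling factors (n = concealment reference)."""
    return sum(heapq.nlargest(n, person_scaling_factors))


# Hypothetical sample: 4 counted people with factor 2.5 and 30 with factor 2.0.
factors = [2.5] * 4 + [2.0] * 30
print(top_n_class_interval(factors, 10))  # compare with 10 x 2.5 = 25
```

How a sample with fewer than n counted people should be handled is not specified here; this sketch simply sums whatever factors are available.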
In addition, according to the present embodiment, when quantizing population, the differences between the population and the respective classes as candidates to which the population is to be rounded are considered, and thus it is possible to quantize the population while reducing deviation from the population data before quantization.
Not only when quantizing the total population of all attributes in the mesh M but even when quantizing the total population for at least two attributes out of three or more attributes in the mesh M, the largest scaling factor or the other class interval described above can be similarly set. For example, the quantization module 13 may, when determining the total population for three attributes 1, 3, and 5 out of attributes 1 to 5 in the mesh M, perform quantization processing by using the class interval that is the product of the concealment reference and the largest scaling factor among scaling factors for the attributes 1, 3, and 5. In addition, the quantization module 13 may, in a similar case, perform quantization by using as the class interval the sum of scaling factors from top to n-th when scaling factors for persons belonging to the attributes 1, 3, and 5 are arranged in descending order.
Functions and configuration of a population calculation system 1A according to a second embodiment will now be described with reference to
It is assumed similarly to the first embodiment that the population calculation system 1A according to the second embodiment calculates concealed population in the mesh M contained in the sectors A and B depicted in
The first calculation module 11 calculates 10+10=20 by using values in the “population” column in
The quantization module 13A is means for concealing population in the count area on the basis of the class interval that is the product of the concealment reference and the scaling factor in calculation processing by the first calculation module 11. In the present embodiment, explanations are made on the assumption that the quantization module 13A quantizes population in the count area into an integral multiple of the class interval, but the method of concealment is not limited to this.
The quantization module 13A quantizes 20 that is the population of the attribute 1 in the sector A+B into 20 by using 10×2=20 that is the class interval. Similarly, the quantization module 13A quantizes 517.5 that is the population of the attribute 2 in the sector A+B into 500 by using 10×2.5=25 that is the class interval.
The quantization module 13A may quantize the total population of all attributes in the count area into an integral multiple of the class interval that is the product of the concealment reference and the largest scaling factor among scaling factors each set for the attributes.
The largest scaling factor among scaling factors set for the attributes 1 and 2 is 2.5 for the attribute 2, and thus the quantization module 13A calculates 10×2.5=25 as the class interval. Next, the quantization module 13A quantizes 537.5 that is the total population in the sector A+B into an integral multiple of 25 that is the class interval to obtain 525. The population in the sector A+B quantized by the quantization module 13A is stored in the database 15 as in the “quantized A+B population” (indicating the population in the sector A+B thus quantized) column in
Note that the quantization module 13A rounds population down to the lower value in the above-described quantization, but the method of rounding is not limited to this. For example, similarly to the first embodiment, the quantization module 13A may round population up to the upper value, may round population to whichever of the lower and upper values has the smaller difference from the population, or may round population to either class in a random manner.
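The rounding options listed above can be sketched as follows. The mode names, the tie-breaking toward the lower class in the "nearest" mode, and the use of a uniform random choice are assumptions made only for illustration.

```python
import math
import random


def quantize(population, interval, mode="down", rng=None):
    """Round a population to a neighbouring integral multiple of the class interval."""
    lower = math.floor(population / interval) * interval
    # A population already on a class boundary has identical lower and upper classes.
    upper = lower if lower == population else lower + interval
    if mode == "down":
        return lower
    if mode == "up":
        return upper
    if mode == "nearest":
        # Assumption: ties go to the lower class.
        return lower if population - lower <= upper - population else upper
    if mode == "random":
        return (rng or random).choice([lower, upper])
    raise ValueError(f"unknown rounding mode: {mode}")


for mode in ("down", "up", "nearest"):
    print(mode, quantize(517.5, 25, mode))
```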
In addition, similarly to the quantization module 13 of the first embodiment, the quantization module 13A may quantize population of all attributes in the sector A+B into an integral multiple of the class interval that is the sum of scaling factors from top to the number of people as the concealment reference in the order of scaling factors each set for the attributes.
The third calculation module 16 is means for calculating a product of the population in the count area quantized by the quantization module 13A and an area population ratio that is a ratio of the population in the target area to the population in the count area before quantization as the population in the target area obtained through quantization processing.
For the attribute 1, the third calculation module 16 calculates 20×(5/20)=5 as the product of 20 that is the population in the sector A+B quantized by the quantization module 13A and 5/20 that is the area population ratio. Similarly, for the attribute 2, the third calculation module 16 calculates 500×(141.5/517.5)=136.7 as the above-described product. Similarly, with respect to the population of all attributes in the sector A+B, the third calculation module 16 calculates 525×(146.5/537.5)=143.0 as the above-described product. Note that numbers are rounded down to the nearest tenth in the present embodiment.
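The three calculations above can be reproduced with the following sketch. The function names are hypothetical, quantization rounds down as in the present embodiment, and the result is truncated to one decimal place as noted above.

```python
import math


def quantize_down(population, interval):
    """Round down to an integral multiple of the class interval."""
    return math.floor(population / interval) * interval


def truncate_to_tenth(value):
    """Discard everything below the first decimal place."""
    return math.floor(value * 10) / 10


def concealed_target_population(count_area_pop, target_area_pop, interval):
    """Quantized count-area population x area population ratio."""
    quantized = quantize_down(count_area_pop, interval)
    return truncate_to_tenth(quantized * target_area_pop / count_area_pop)


print(concealed_target_population(20, 5, 10 * 2))            # attribute 1
print(concealed_target_population(517.5, 141.5, 10 * 2.5))   # attribute 2
print(concealed_target_population(537.5, 146.5, 10 * 2.5))   # all attributes
```

The sketch assumes a non-zero population in the count area; a zero population would need separate handling.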
The output module 14A outputs the population in the target area obtained through quantization (concealment) processing as the concealed population in the target area.
More specifically, the output module 14A stores the quantized population in the database 15 as in the “concealed M population” column in
When focusing on the values in the “concealed M population” of the table depicted in
Referring to
To begin with, the acquisition module 10 acquires the number of users in the sectors A and B containing the mesh M (step S31, acquisition step). Next, the first calculation module 11 calculates population in the sectors A and B on the basis of the number of users acquired at step S31 and the scaling factor (step S32, population calculation step). Next, the second calculation module 12 calculates population in the mesh M on the basis of the population in the sectors A and B calculated at step S32 (step S33, population calculation step). Next, the quantization module 13A conceals the population in the sector A+B calculated at step S32 on the basis of the class interval that is the product of the concealment reference and the scaling factor (step S34, concealing step). Next, the third calculation module 16 calculates a product of the population in the sector A+B concealed at step S34 and the ratio of the population in the mesh M calculated at step S33 to the population in the sector A+B calculated at step S32 (step S35, third calculation step). Next, the output module 14A outputs the product calculated at step S35 as the concealed population in the mesh M (step S36, output step).
As described above, according to the present embodiment, in the calculation processing, the population in the sectors A and B that is an intermediate result of calculation is quantized into an integral multiple of class interval that is the product of the concealment reference that is a reference value of a minimum summation unit and the scaling factor, and the product of the population in the sectors A and B thus quantized and the area population ratio is calculated as the population in the mesh M obtained through quantization processing. Although it is possible to conceal population while maintaining the reliability of the population data at or above a certain level even when directly concealing the population in the mesh M, by concealing the population in the sectors A and B whose population is larger than that in the mesh M and then multiplying it by the area population ratio to calculate the concealed population in the mesh M in this manner, deviation from the population data before concealment can be reduced compared to the case in which population in the mesh M is directly concealed as described above.
However, when the mesh M and the sector A+B are the same or represent almost the same geographical range, the third calculation module 16 may calculate population in the sector A+B quantized by the quantization module 13A as the population in the mesh M and the output module 14A may output this population in the mesh M as the concealed population. Even when the mesh M and the sector A+B represent the same geographical range in this manner, it is possible to conceal and present population while maintaining the reliability of the population data at or above a certain level. Such processing is substantially the same as the processing of multiplying the quantized population in the count area by an area population ratio of “1” to obtain the concealed population.
In addition, not only when quantizing the total population of all attributes in the sector A+B but even when quantizing the total population for at least two attributes out of three or more attributes in the sector A+B, the largest scaling factor or the other class interval described above can be similarly set. For example, the quantization module 13A may, when determining the total population for three attributes 1, 3, and 5 out of attributes 1 to 5 in the sector A+B, perform quantization processing by using the class interval that is the product of the concealment reference and the largest scaling factor among scaling factors for the attributes 1, 3, and 5. In addition, the quantization module 13A may, in a similar case, perform quantization by using as the class interval the sum of scaling factors from top to n-th when scaling factors for persons belonging to the attributes 1, 3, and 5 are arranged in descending order.
(Modification 1 of Second Embodiment)
Modification 1 of the population calculation system 1A according to the second embodiment will be described below with reference to
The acquisition module 10 acquires five as the number of users with the attribute 1 in the sector C and 152 as the number of users with the attribute 2 in the sector C as depicted in
The first calculation module 11 calculates 5×2=10 that is a product of the number of users with the attribute 1 in the sector C and the scaling factor for the attribute 1 as the population of the attribute 1 in the sector C. Similarly, the first calculation module 11 calculates 152×2.5=380 as the population of the attribute 2 in the sector C. In addition, the first calculation module 11 calculates 10+380=390 as the population of all attributes in the sector C.
The second calculation module 12 calculates 10×0.3=3 as the population of the attribute 1 in the mesh N. Similarly, the second calculation module 12 calculates 380×0.3=114 as the population of the attribute 2 in the mesh N. In addition, the second calculation module 12 calculates 3+114=117 as the population of all attributes in the mesh N.
The quantization module 13A quantizes 10 that is the population of the attribute 1 in the sector C into an integral multiple of 10×2=20 that is the class interval to obtain zero. Similarly, the quantization module 13A quantizes 380 that is the population of the attribute 2 in the sector C into an integral multiple of 10×2.5=25 that is the class interval to obtain 375. Similarly, the quantization module 13A quantizes 390 that is the population of all attributes in the sector C into an integral multiple of 10×2.5=25 that is the class interval calculated with the largest scaling factor to obtain 375.
The third calculation module 16 calculates 0×(3/10)=0 as the population of the attribute 1 in the mesh N, calculates 375×(114/380)=112.5 as the population of the attribute 2 in the mesh N, and calculates 375×(117/390)=112.5 as the total population in the mesh N.
The output module 14A outputs 0, 112.5, and 112.5 that are concealed population of the attribute 1, concealed population of the attribute 2, and concealed total population in the mesh N, respectively.
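For reference, the whole of this modification's example can be traced with the sketch below. Exact fractions are used to avoid floating-point noise; the variable names are hypothetical, and the factor 0.3 is the multiplier applied by the second calculation module 12 in the example.

```python
from fractions import Fraction

CONCEALMENT_REFERENCE = 10
MESH_FACTOR = Fraction(3, 10)  # the 0.3 multiplier from sector C to mesh N
users = {"attribute 1": 5, "attribute 2": 152}
scaling = {"attribute 1": Fraction(2), "attribute 2": Fraction(5, 2)}


def quantize_down(population, interval):
    """Round down to an integral multiple of the class interval."""
    return (population // interval) * interval


sector = {a: users[a] * scaling[a] for a in users}    # populations in sector C
mesh = {a: sector[a] * MESH_FACTOR for a in users}    # populations in mesh N

concealed = {}
for a in users:
    quantized = quantize_down(sector[a], CONCEALMENT_REFERENCE * scaling[a])
    concealed[a] = quantized * mesh[a] / sector[a]

# The total uses the class interval based on the largest scaling factor.
sector_total, mesh_total = sum(sector.values()), sum(mesh.values())
quantized_total = quantize_down(sector_total, CONCEALMENT_REFERENCE * max(scaling.values()))
concealed["total"] = quantized_total * mesh_total / sector_total

print({name: float(value) for name, value in concealed.items()})
```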
Note that the population calculation system 1 of the first embodiment also can calculate the concealed population in the mesh N depicted in
(Modification 2 of Second Embodiment)
Modification 2 of the population calculation system 1A according to the second embodiment will be described below. In the present modification, the population calculation system 1A further includes a concealing module (concealing means).
The concealing module conceals population in a count area calculated by the first calculation module 11. The method for concealment is not limited to a specific method. One example of the method for concealment is the method for quantization by the quantization module 13A in the second embodiment.
Another example of the method for concealment by the concealing module is a method in which a value smaller than the concealment reference or the product of the concealment reference and the largest scaling factor is omitted. For example, when the population in the count area is five, this value is smaller than 10 that is the concealment reference and thus omitted by the concealing module, and finally the concealed population is not output. In contrast, for example, when the population in the count area is 20, this value is larger than 10 that is the concealment reference and thus is not omitted by the concealing module, and the concealed population in the count area becomes 20.
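A minimal sketch of this omission-based concealment follows. Representing an omitted value by `None` and the function name are assumptions; a value equal to the threshold is kept because only smaller values are described as omitted.

```python
from typing import Optional


def suppress_small(population: float, threshold: float) -> Optional[float]:
    """Return None (omitted) when the population is smaller than the threshold."""
    return None if population < threshold else population


print(suppress_small(5, 10))   # omitted
print(suppress_small(20, 10))  # kept as 20
```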
Another example of the method for concealment by the concealing module is a method in which populations are concealed by a specific concealing method for respective unit attributes each of which is a unit defining a scaling factor and then are summed up. In the second embodiment, as depicted in
Another example of the method for concealment by the concealing module is a method in which the class interval in the second embodiment is not set and values are varied by random numbers. In the second embodiment, the quantization module 13A rounds up to the upper value or down to the lower value when rounding population to a specific class by quantizing the population into an integral multiple of the class interval. In contrast, in the concealment method in which values are varied by random numbers, for example, the population is concealed into a random value with a probability based on a probability density function having a variance with a magnitude depending on the class interval.
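One possible reading of this random-number method is sketched below. The normal distribution and the choice of a standard deviation equal to half the class interval are assumptions made purely for illustration; the description above only requires that the variance depend on the class interval.

```python
import random


def conceal_with_noise(population: float, interval: float, rng: random.Random) -> float:
    """Add zero-mean noise whose spread scales with the class interval."""
    sigma = interval / 2  # assumption: standard deviation is half the class interval
    return population + rng.gauss(0.0, sigma)


rng = random.Random(0)  # seeded only to make the illustration repeatable
print(conceal_with_noise(517.5, 25, rng))
```

Such noise can produce negative or non-integer values for small populations, so a practical system would presumably add clamping or rounding.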
Another example of the method for concealment by the concealing module is a method in which population is concealed into a real-number multiple of the class interval that is the product of the concealment reference and the scaling factor. Herein, when it is assumed that the class interval is y and the integer is z, for example, in the first embodiment and the second embodiment, the quantization module 13 and the quantization module 13A conceal population into 0, y, 2y, 3y, . . . , zy, . . . . Instead of this concealment method, the quantization module 13 and the quantization module 13A may conceal the population into 0, 1.1y, 2.2y, 3.3y, . . . , 1.1zy, . . . , for example.
Another example of the method for concealment by the concealing module is a method in which population is concealed into a value obtained by adding a predetermined real number to an integral multiple of the class interval that is the product of the concealment reference and the scaling factor. Herein, when it is assumed that the class interval is y and the integer is z, for example, in the first embodiment and the second embodiment, the quantization module 13 and the quantization module 13A conceal population into 0, y, 2y, 3y, . . . , zy, . . . . Instead of this concealment method, the quantization module 13 and the quantization module 13A may conceal the population into 0, 0.5, y+0.5, 2y+0.5, 3y+0.5, . . . , zy+0.5, . . . , for example.
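The two variants described above (a real-number multiple such as 1.1zy, and an offset such as zy+0.5) can be sketched as follows. How the integer z is chosen is not stated here, so the sketch assumes rounding down; the treatment of populations below one class interval is likewise an assumption.

```python
import math


def conceal_scaled(population: float, interval: float, factor: float = 1.1) -> float:
    """Conceal into factor * z * y, with z assumed to be floor(population / y)."""
    z = math.floor(population / interval)
    return factor * z * interval


def conceal_offset(population: float, interval: float, offset: float = 0.5) -> float:
    """Conceal into z * y + offset, with z assumed to be floor(population / y)."""
    z = math.floor(population / interval)
    return z * interval + offset


print(conceal_scaled(517.5, 25), conceal_offset(517.5, 25))
```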
Note that all of the methods for concealment by the concealing module described in the modification 2 of the second embodiment are applicable also to the first embodiment.
The output module 14A outputs the product of the population in the count area concealed by the concealing module and the area population ratio that is the ratio of the population in the target area calculated by the second calculation module 12 to the population in the count area before concealment as the concealed population in the target area. A specific example of calculating the product of the concealed population in the count area and the area population ratio is similar to the calculations performed by the third calculation module 16 in the second embodiment, and thus explanation thereof is omitted.
In the foregoing, the present invention has been described in detail on the basis of the embodiments thereof. However, the present invention is not limited to the above-described embodiments. In the present invention, various changes may be made without departing from the scope thereof.
The population calculation systems 1 and 1A may be constructed of one computer, or may be constructed of a plurality of computers. When the population calculation systems 1 and 1A are constructed of a plurality of computers, functions of the population calculation systems 1 and 1A may be implemented by sending and receiving input and output of the respective functional components between servers.
In the embodiments described above, the first calculation module 11 and the second calculation module 12 calculate population in the count area on the basis of the number of counted people and the scaling factor, and calculate population in the target area on the basis of the population in the count area thus calculated, but the procedure for calculating population in the target area is not limited to this. For example, the population calculation means may calculate the number of counted people in the target area on the basis of the number of counted people and the area ratio of the target area to the count area, and calculate population in the target area on the basis of the number of counted people in the target area thus calculated and the scaling factor.
In the embodiments described above, the count area is a spatial sector in mobile communication, but is not limited to this. For example, actual households surveyed for ratings in a television program rating survey can be the count area. Thus, the count area only has to be a region in which the number of people can be counted as a sample.
Furthermore, in the population calculation systems 1 and 1A in the respective embodiments described above, temporary data calculated by the functional components is stored in the database 15, but may be stored in a working memory or a database system, for example.
In addition, the population calculation systems 1 and 1A of the respective embodiments described above calculate concealed population, but the present invention can be applied to purposes other than the calculation of population. For example, it may be applied to a computer system in a field where data is concealed and presented while the reliability of the data is maintained at or above a certain level.
Furthermore, in the population calculation systems 1 and 1A of the respective embodiments described above, when acquiring the number of users, the acquisition module 10 refers to the predetermined database in the mobile network to sum up and acquire the number of users, but the acquisition method is not limited to this. For example, the acquisition module 10 may acquire the number of counted people from static compiled data that is compiled in advance such as compiled data from questionnaires.
Furthermore, in the population calculation systems 1 and 1A of the respective embodiments described above, when the acquisition module 10 refers to the database in which positional information and user information are registered and performs summation, the acquisition module 10 may perform a de-identification process including conversion to irreversible codes by a one-way function on user identifiers (i.e., telephone numbers) included in the positional information or the user information. As this one-way function, a keyed hash function based on a hash function recommended by assessment projects or assessment bodies from home and abroad can be used.
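As one hedged illustration of such a de-identification process, a keyed hash can be computed with HMAC. The choice of SHA-256, the function name, and the sample identifier and key are assumptions and are not specified in the present description.

```python
import hashlib
import hmac


def deidentify(user_identifier: str, secret_key: bytes) -> str:
    """Convert a user identifier into an irreversible code with a keyed hash."""
    message = user_identifier.encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


# Hypothetical identifier and key, for illustration only.
print(deidentify("09012345678", b"example-secret-key"))
```

The same identifier and key always yield the same code, so counts per user can still be summed without storing the original identifier.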
Furthermore, in the population calculation systems 1 and 1A of the respective embodiments described above, when acquiring the number of counted people, the acquisition module 10 may estimate and acquire the number of counted people (the number of mobile devices) by presence count estimation or entry count estimation both of which are terminal count estimation described below, for example.
The idea of the presence count estimation and a calculation method thereof will be described hereinafter. As in a model depicted in
In other words, a result obtained by dividing the sum of the staying time ti of each mobile device ai in the sector S within the summation time period by the length T of the summation time period is estimated as the number of mobile devices m. Note that the actual value of the staying time ti of the mobile device ai in the sector S within the summation time period cannot be measured, but a signal that each mobile device ai transmits to register the positional information can be measured.
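The relation stated above, in which the number of mobile devices m is the sum of the staying times ti divided by the length T of the summation time period, can be sketched as follows. The staying times are hypothetical values; as noted above, they cannot be measured directly in practice.

```python
def presence_count(staying_times, period_length: float) -> float:
    """m = (sum of staying times t_i) / T."""
    if period_length <= 0:
        raise ValueError("the summation time period must have a positive length")
    return sum(staying_times) / period_length


# Four hypothetical devices staying 60, 30, 15 and 45 minutes within a 60-minute period.
print(presence_count([60, 30, 15, 45], 60))
```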
Let the signals that the mobile device ai transmits in the sector S within the summation time period be, in time order,
$q_{i1}, q_{i2}, \ldots, q_{ix_i}$ [Formula 2]
where xi is the total number of signals that the mobile device ai transmits in the sector S within the summation time period. The value of m can then be estimated from the observed signals qij (j is an integer that is equal to or larger than 1 and equal to or smaller than xi).
A calculation method for estimating the number of mobile devices will now be described with reference to
$E(t_i) = x_i / p_i$ (2)
Assuming herein that transmission time of the signal qij is uij, the density pij of the signal qij is given by the following formula (3).
$p_{ij} = 2 / (u_{i(j+1)} - u_{i(j-1)})$ (3)
Assuming herein that (ui(j+1)−ui(j−1)) in the above formula (3) is a feature amount wij for the signal qij, the above formula (3) becomes as follows. In other words, the feature amount wij can be calculated in association with the reciprocal of the density pij.
$p_{ij} = 2 / (u_{i(j+1)} - u_{i(j-1)}) = 2 / w_{ij}$ (4)
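Formulas (3) and (4) can be illustrated with the sketch below, which computes the feature amount and the density for each signal that has both a preceding and a following signal. The handling of the first and last signals is not covered by these formulas, so they are skipped here; the transmission times are hypothetical.

```python
def feature_amounts(times):
    """w_ij = u_i(j+1) - u_i(j-1) for each signal with a neighbour on both sides."""
    return [times[j + 1] - times[j - 1] for j in range(1, len(times) - 1)]


def densities(times):
    """p_ij = 2 / w_ij, as in formula (4)."""
    return [2 / w for w in feature_amounts(times)]


# Hypothetical transmission times (in minutes) of one mobile device, in time order.
times = [0, 10, 30, 60]
print(feature_amounts(times), densities(times))
```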
At this time, the density pi is given by
the estimated value E(m) of the number of mobile devices m can be calculated by the following formula (6).
As depicted in
Subsequently, the idea of the entry count estimation and a calculation method thereof will be described hereinafter. Note that in the present specification, the entering terminal count means the number of unique mobile devices that stay in an area (sector) on which summation is to be performed during at least part of the summation time period. The term “unique” herein means that the number of entering terminals is a number after subtracting the duplicate counts of a same mobile device.
One example of a process of estimating the entering terminal count performed by the acquisition module 10 will be described below. In this example, the entering terminal count is determined by using an estimated staying period of each of mobile devices in a sector. To begin with, in pieces of position registration information whose user identifiers are the same, the acquisition module 10 calculates the estimated staying period during which mobile devices stay in a certain given sector for each of the mobile devices on the basis of pieces of in-sector position data in which times at which pieces of position registration information are acquired are within an expanded time period described later and whose sector IDs indicate the certain given sector, and pieces of out-sector position data that are adjacent to the pieces of in-sector position data when pieces of position registration information are arranged in time sequence on the basis of times at which the pieces of position registration information are acquired and whose sector IDs indicate outside of the certain given sector. Note that the above-mentioned “expanded time period” herein means, as one example, a period to which the summation time period is expanded by a predetermined duration (e.g., 1 hour) before and after the summation time period, more specifically, a time period between the time as a start point to which time goes back from the summation starting time t0 by the predetermined duration and the time as an end point to which time proceeds from the summation ending time t1 by the predetermined duration.
As depicted in
Then, the acquisition module 10 extracts mobile devices whose estimated staying periods thus calculated overlap with the summation time period.
However, mobile devices whose estimated staying period represented by the rectangles in
Next, the acquisition module 10 counts the number of mobile devices thus extracted for each sector ID, and estimates the counted number thus obtained as the entering terminal count in each sector within the summation time period. As described above, the acquisition module 10 determines the entering terminal count for each sector.
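The extraction and counting steps can be sketched as follows, taking already-calculated estimated staying periods as input. The tuple layout, the treatment of period boundaries as inclusive, and the sample data are assumptions; any exclusion rules beyond the overlap test are not modeled.

```python
def entering_terminal_count(staying_periods, t0, t1):
    """Count unique devices per sector whose staying period overlaps [t0, t1].

    staying_periods: iterable of (device_id, sector_id, start, end) tuples.
    """
    devices_by_sector = {}
    for device_id, sector_id, start, end in staying_periods:
        if start <= t1 and end >= t0:  # the period overlaps the summation time period
            devices_by_sector.setdefault(sector_id, set()).add(device_id)
    # A set per sector removes duplicate counts of the same mobile device.
    return {sector: len(devices) for sector, devices in devices_by_sector.items()}


periods = [
    ("a", "S", 0, 5), ("a", "S", 8, 12), ("d", "S", 12, 14),
    ("b", "S", 20, 30), ("c", "T", 9, 11),
]
print(entering_terminal_count(periods, t0=10, t1=15))
```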
Note that the method for estimating the entering terminal count by the acquisition module 10 using the estimated staying period is one example, and other methods may be adopted. As another method, in pieces of position registration information on a same mobile device, the acquisition module 10 may extract one piece of position registration information (position registration information that is acquired at the earliest time as an example) out of pieces of in-sector position data in which times when pieces of position registration information are acquired are within the summation time period. When one piece of position registration information is extracted for each mobile device in this manner, accuracy of extraction is a little lower than that of the above-described method using the estimated staying period, but the estimated staying period for each mobile device does not have to be calculated, and thus it is possible to extract one piece of position registration information for each mobile device with a low processing load. In this case, the acquisition module 10 can estimate the entering terminal count by counting the number of pieces of position registration information thus extracted. However, when extracting one piece of position registration information for each mobile device, it is not indispensable to extract the piece of position registration information that is acquired at the earliest time, and another piece of position registration information may be extracted. For example, a piece of position registration information that is acquired at the latest time may be extracted, or a piece of position registration information that is acquired at the time closest to the midpoint of an observation period may be extracted.
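This lighter-weight alternative can be sketched as follows, keeping the earliest in-period record per mobile device and sector and counting the records kept. The record layout and the sample data are hypothetical.

```python
def count_by_first_record(records, t0, t1):
    """Keep one record (the earliest) per device and sector within [t0, t1], then count.

    records: iterable of (device_id, sector_id, acquired_at) tuples.
    """
    earliest = {}
    for device_id, sector_id, acquired_at in records:
        if t0 <= acquired_at <= t1:
            key = (sector_id, device_id)
            if key not in earliest or acquired_at < earliest[key]:
                earliest[key] = acquired_at
    counts = {}
    for sector_id, _device_id in earliest:
        counts[sector_id] = counts.get(sector_id, 0) + 1
    return counts


records = [("a", "S", 11), ("a", "S", 13), ("b", "S", 9), ("c", "S", 12), ("c", "T", 14)]
print(count_by_first_record(records, t0=10, t1=15))
```

Choosing the latest record or the one closest to the midpoint would change only which record is kept, not the resulting count.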
Note that in the above-described embodiments, in addition to positional information acquired by GPS or position registration information, for example, any information from which a position can be determined is usable as positional information of a mobile device.
Examples of other applicable fields of the present invention include a television program rating survey, a political party approval rating survey, a web questionnaire survey, and a census.
1, 1A . . . population calculation system, 10 . . . acquisition module (acquisition means), 11 . . . first calculation module (population calculation means), 12 . . . second calculation module (population calculation means), 13, 13A . . . quantization module (concealing means), 14, 14A . . . output module (output means), 15 . . . database, 16 . . . third calculation module (population calculation means)
Number | Date | Country | Kind
---|---|---|---
2010-206883 | Sep 2010 | JP | national

Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/JP2011/071044 | 9/14/2011 | WO | 00 | 12/17/2012