The present invention relates to a number-of-terminals estimation device and a number-of-terminals estimation method for estimating the number of terminals located in a certain area, using location data about mobile terminals (e.g., cell phones) obtained from network facilities of the mobile terminals. The “location data” in the present specification is a collection of a plurality of pieces of location information including location registration information, and may further include location information other than the location registration information (e.g., GPS positioning information or the like).
There is a conventionally known technology to estimate the number of terminals located in a certain area (e.g., a sector), this number being called a presence count, based on the number of pieces of location registration information generated. However, in addition to being generated at a predetermined location registration period, the location registration information is also generated when a user of a terminal moves to another location area across a boundary between location areas (which will be referred to hereinafter as “LA boundary”). Therefore, in a sector facing an LA boundary, the probability of generation of location registration information becomes relatively higher than in the other sectors, which poses a risk of overestimating the presence count in the sector facing the LA boundary.
Patent Literature 1 below proposes a correction for the presence count in the sector facing the LA boundary, in view of the overestimation of the presence count in the sector facing the LA boundary.
Patent Literature 1: WO2010/116903A1
In the meantime, the location registration period is reset to zero when a terminal user crosses a location registration area boundary. For this reason, the probability of generation of location registration information becomes relatively lower in sectors at some distance from the LA boundary (e.g., in sectors adjacent inside to the sector facing the LA boundary), which poses a risk of underestimating the presence count in the sectors at some distance from the LA boundary. At locations a little away from the region near the center of the horizontal axis (LA boundary) in
However, Patent Literature 1 fails to clearly show a specific technique for solving the above problem, and there is room for improvement to achieve more accurate estimation of the presence count.
The present invention has been accomplished in order to solve the above problem and it is an object of the present invention to realize more accurate estimation of the presence count.
A number-of-terminals estimation device according to an aspect of the present invention is one comprising: a location data acquisition unit for acquiring location data which are a collection of multiple pieces of location information including location registration information; a first presence count estimation unit for estimating the number of terminals located in an observation area during an observation period, as a first presence count, based on the location data acquired by the location data acquisition unit; an extraction unit for extracting location data in accordance with a type of location data, from the location data acquired by the location data acquisition unit; a second presence count estimation unit for estimating the number of terminals located in the observation area during the observation period, as a second presence count, based on the extracted location data; and a third presence count estimation unit for estimating the number of terminals located in the observation area during the observation period, as a third presence count, based on one or both of the first presence count obtained by the first presence count estimation unit and the second presence count obtained by the second presence count estimation unit. Since the number-of-terminals estimation device of this configuration is configured to estimate the number of terminals as the third presence count, based on one or both of the first presence count based on the location data as the collection of multiple pieces of location information including the location registration information and the second presence count based on the location data extracted in accordance with the type of location data, it is able to more accurately estimate the third presence count to be the number of terminals corresponding to somewhere in a numerical range between the first presence count and the second presence count (inclusive of the numerals at the two ends). 
The third presence count estimation unit may estimate an error of the number of terminals caused by a crossing of a terminal across a location registration area boundary, based on the first presence count and the second presence count.
A number-of-terminals estimation device according to another aspect of the present invention is one comprising: a location data acquisition unit for acquiring location data which are a collection of multiple pieces of location information including location registration information; a first presence count estimation unit for estimating the number of terminals located in an observation area during an observation period, as a first presence count, based on the location data acquired by the location data acquisition unit; a signal removal unit for removing location registration information generated due to a crossing of a terminal across a location registration area boundary, from the location data acquired by the location data acquisition unit, to obtain location data after removal; a second presence count estimation unit for estimating the number of terminals located in the observation area during the observation period, as a second presence count, based on the location data after removal obtained by the signal removal unit; and a third presence count estimation unit for estimating the number of terminals located in the observation area during the observation period, as a third presence count, based on one or both of the first presence count obtained by the first presence count estimation unit and the second presence count obtained by the second presence count estimation unit. In this case, the third presence count estimation unit may estimate an error of the number of terminals caused by the crossing of the terminal across the location registration area boundary, based on the first presence count and the second presence count.
In the foregoing number-of-terminals estimation device, the third presence count estimation unit may estimate the third presence count by switching the presence count as a basis of estimation for each observation area, among the first presence count only, the second presence count only, and, both of the first and second presence counts.
Since the number-of-terminals estimation device of this configuration is configured to estimate the number of terminals as the third presence count, based on one or both of the first presence count based on the location data as the collection of multiple pieces of location information including the location registration information and the second presence count based on the location data after removal of the location registration information generated due to the crossing of the terminal across the location registration area boundary, it is able to estimate the third presence count to be the number of terminals corresponding to somewhere in the numerical range between the first presence count and the second presence count (inclusive of the numerals at the two ends). This solves the problem that the presence count is overestimated near the location registration area boundary and the presence count is underestimated at the locations a little away from the location registration area boundary, thus allowing more accurate estimation of the presence count.
A technique of estimating the third presence count can be adopted from a variety of modes. For example, the third presence count estimation unit may estimate, as the third presence count, the number of terminals corresponding to a predetermined proportional division point between the first presence count and the second presence count. The proportional division point herein may be an exact center point between the first presence count and the second presence count, a point of trisection of the numerical range between the first presence count and the second presence count that is closer to the second presence count, or any proportional division point except for these. Furthermore, the third presence count estimation unit may estimate the third presence count as follows: when the first presence count is not less than the second presence count, the third presence count estimation unit estimates the third presence count to be the number of terminals corresponding to a predetermined proportional division point between the first presence count and the second presence count; when the first presence count is less than the second presence count, the third presence count estimation unit estimates the third presence count to be the second presence count.
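The proportional division described above can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the function names and the `weight_to_second` parameter are hypothetical, and the parameter simply expresses how far the division point lies from the first presence count toward the second (0.5 for the exact center point, 2/3 for the point of trisection closer to the second presence count).

```python
def proportional_division(first: float, second: float,
                          weight_to_second: float = 0.5) -> float:
    """Return the division point between the first and second presence counts.

    weight_to_second is a hypothetical parameter: 0.0 returns the first
    count, 1.0 returns the second count, 0.5 returns the center point.
    """
    if not 0.0 <= weight_to_second <= 1.0:
        raise ValueError("weight_to_second must lie in [0, 1]")
    return first + weight_to_second * (second - first)


def third_presence_count(first: float, second: float,
                         weight_to_second: float = 0.5) -> float:
    """Variant with a fallback: use the division point only when the first
    count is not less than the second; otherwise use the second count."""
    if first < second:
        return second
    return proportional_division(first, second, weight_to_second)
```

For instance, with a first presence count of 120 and a second presence count of 90, the center point is 105, while the trisection point closer to the second presence count is 100.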
Furthermore, a technique of estimating the first and second presence counts can also be adopted from a variety of modes. For example, the device may be configured as follows: the location data includes identification information to identify a terminal and location acquisition time information when the location information is acquired, associated with each piece of location information; and both or one of the first presence count estimation unit and the second presence count estimation unit includes: a preceding and following location data acquisition unit for, concerning a target location data, acquiring location acquisition time information of location data immediately preceding the target location data and location acquisition time information of location data immediately following the target location data, out of location data including the same identification information as the target location data; a feature amount calculation unit for calculating a feature amount on the target location data, based on at least two of the location acquisition time information of the immediately-preceding location data, the location acquisition time information of the target location data, and the location acquisition time information of the immediately-following location data; an observation target acquisition unit for acquiring, as observation target location data, one or more pieces of location data including location acquisition time information after an observation start time and before an observation end time about an observation period to be observed and including location information associated with observation area information about an observation area to be observed; and a number-of-terminals estimation unit for estimating the number of terminals located in the observation area during the observation period, based on the feature amount on the observation target location data and a length of the observation period being a difference between the observation start time and the observation end time.
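As a rough illustration of the last two units, the following Python sketch selects the observation target location data and derives a number of terminals from the feature amounts and the observation period length. The exact formula is not given in this passage, so the sketch rests on a clearly labeled assumption: each feature amount is a duration (in seconds) attributed to the terminal's stay, and the estimate is the sum of those durations divided by the observation period length. The `Observation` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Observation:
    time: float      # location acquisition time, in seconds
    area: str        # observation area identifier (hypothetical field)
    feature: float   # feature amount, ASSUMED to be a duration in seconds


def estimate_terminals(records: Iterable[Observation], area: str,
                       start: float, end: float) -> float:
    """Estimate the number of terminals in `area` during (start, end).

    Assumption (not stated in the source passage): the estimate is the sum
    of the feature amounts of the observation target data divided by the
    observation period length.
    """
    period = end - start
    if period <= 0:
        raise ValueError("observation end time must be after the start time")
    # Observation target data: acquired after the start time and before the
    # end time, and associated with the observation area.
    targets = [r for r in records if r.area == area and start < r.time < end]
    return sum(r.feature for r in targets) / period
```

Under this assumption, two records in the observation area that each carry a feature amount of 1800 seconds yield an estimate of one terminal for a 3600-second observation period.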
Incidentally, the feature amount calculation unit may operate as follows: the feature amount calculation unit makes a determination on whether or not the target location data includes location registration information generated due to a crossing across a location registration area boundary and a determination on whether or not the immediately-following location data includes location registration information generated due to a crossing across a location registration area boundary; the feature amount calculation unit calculates the feature amount on the target location data, using at least two of the location acquisition time information of the target location data, the location acquisition time information of the immediately-preceding location data, and the location acquisition time information of the immediately-following location data, according to the result of the determination on whether or not the target location data includes location registration information generated due to a crossing across a location registration area boundary and the result of the determination on whether or not the immediately-following location data includes location registration information generated due to a crossing across a location registration area boundary. In this case, although the detailed principle will be described later, a highly accurate feature amount can be obtained with consideration to the property of generation timing about the location registration information generated due to the crossing across the location registration area boundary. It is noted that “location registration information generated due to a crossing across a location registration area boundary” means location registration information generated due to a crossing of a mobile terminal across a location registration area boundary.
More specifically, the feature amount calculation unit may operate as follows: when the target location data includes the location registration information generated due to the crossing across the location registration area boundary, the feature amount calculation unit sets a location acquisition time of the target location data to a first variable; when the target location data does not include the location registration information generated due to the crossing across the location registration area boundary, the feature amount calculation unit sets a midpoint time between the location acquisition time of the target location data and the location acquisition time of the immediately-preceding location data to the first variable; when the immediately-following location data includes the location registration information generated due to the crossing across the location registration area boundary, the feature amount calculation unit sets a location acquisition time of the immediately-following location data to a second variable; when the immediately-following location data does not include the location registration information generated due to the crossing across the location registration area boundary, the feature amount calculation unit sets a midpoint time between the location acquisition time of the target location data and the location acquisition time of the immediately-following location data to the second variable; and the feature amount calculation unit calculates the feature amount on the target location data, based on a difference between the first variable and the second variable set.
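The setting of the first and second variables can be sketched as follows. This is an illustrative Python sketch only: the `Record` type and its `la_crossing` flag are hypothetical names, and the feature amount is taken here to be the plain difference (second variable minus first variable), whereas the text only states that the feature amount is based on that difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    time: float        # location acquisition time, in seconds
    la_crossing: bool  # True if generated by a crossing of an LA boundary


def feature_amount(prev: Record, target: Record, nxt: Record) -> float:
    """Compute a feature amount for `target` from its neighbouring records."""
    # First variable: the target's own time if it was generated by an
    # LA-boundary crossing, otherwise the midpoint with the preceding record.
    if target.la_crossing:
        first = target.time
    else:
        first = (prev.time + target.time) / 2

    # Second variable: the following record's time if that record was
    # generated by an LA-boundary crossing, otherwise the midpoint between
    # the target and the following record.
    if nxt.la_crossing:
        second = nxt.time
    else:
        second = (target.time + nxt.time) / 2

    # Assumed form: the difference itself is used as the feature amount.
    return second - first
```

With acquisition times of 0, 100, and 300 seconds and no LA-crossing records, the first variable is 50 and the second is 200, giving a feature amount of 150 seconds.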
The feature amount calculation unit may operate as follows: when a difference between the location acquisition time of the target location data and the first variable is larger than a predetermined value, the feature amount calculation unit calculates the feature amount on the target location data, using as the first variable, a time set backward by a predetermined time from the location acquisition time of the target location data. Similarly, when a difference between the location acquisition time of the target location data and the second variable is larger than a predetermined value, the feature amount calculation unit may calculate the feature amount on the target location data, using as the second variable, a time set forward by a predetermined time from the location acquisition time of the target location data. As the feature amount calculation unit is made to operate as described above, when an acquisition time interval of location data becomes abnormally long because of the mobile terminal being located in an out-of-service area or because of the mobile terminal being in a power-off mode, it is feasible to prevent the abnormally long acquisition time interval from excessively affecting the calculation result.
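A minimal Python sketch of this limiting step is shown below. The function name and the parameters `max_gap` (the predetermined value) and `fallback` (the predetermined time) are hypothetical, and a single pair of values is used for both directions purely for brevity.

```python
from typing import Tuple


def limit_variables(target_time: float, first: float, second: float,
                    max_gap: float, fallback: float) -> Tuple[float, float]:
    """Limit the first and second variables around the target's time.

    max_gap  : hypothetical name for the predetermined value (seconds)
    fallback : hypothetical name for the predetermined time (seconds)
    """
    # First variable too far in the past: replace it with a time set
    # backward by `fallback` from the target's acquisition time.
    if target_time - first > max_gap:
        first = target_time - fallback
    # Second variable too far in the future: replace it with a time set
    # forward by `fallback` from the target's acquisition time.
    if second - target_time > max_gap:
        second = target_time + fallback
    return first, second
```

For example, if a terminal was powered off for a long time before a record acquired at 1000 seconds, a first variable of 100 seconds would be replaced by 400 seconds when both `max_gap` and `fallback` are 600 seconds.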
The number-of-terminals estimation device may further comprise a population estimation unit for estimating a population in the observation area during the observation period, based on the third presence count obtained by the third presence count estimation unit and a ratio of a presence count and a population in a predetermined area. In this case, the population in the observation area during the observation period can be estimated more accurately. The population may be estimated by obtaining a ratio of a presence count and a population in the entire country during the observation period or a certain constant. Furthermore, a predetermined ratio may be used instead of the observation period. When (population/presence count) is adopted as the aforementioned ratio of presence count and population, this ratio is also called “scaling factor.” This scaling factor may be derived using the number of terminals (presence count) estimated based on the feature amount and the observation period length and a method of deriving it will be described later. The number-of-terminals estimation device may further comprise: an observation period acquisition unit for acquiring observation period information including a set of an observation start time and an observation end time; and an observation area acquisition unit for acquiring observation area information associated with one or more pieces of location information.
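The population estimation with a scaling factor can be illustrated by the short Python sketch below. The function names are hypothetical, and the figures in the usage note are made-up values chosen only to show the arithmetic.

```python
def scaling_factor(population: float, presence_count: float) -> float:
    """Scaling factor = population / presence count for a predetermined area."""
    if presence_count <= 0:
        raise ValueError("presence count must be positive")
    return population / presence_count


def estimate_population(third_presence_count: float, factor: float) -> float:
    """Scale the third presence count up to an estimated population."""
    return third_presence_count * factor
```

If, say, a predetermined area had a population of 1,000,000 and a presence count of 400,000, the scaling factor would be 2.5, and a third presence count of 120 would correspond to an estimated population of 300.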
The invention of the above-described number-of-terminals estimation devices can also be regarded as the invention of number-of-terminals estimation methods executed by the number-of-terminals estimation devices, with the same operation and effect.
Namely, a number-of-terminals estimation method according to an aspect of the present invention is a number-of-terminals estimation method executed by a number-of-terminals estimation device, comprising: a location data acquisition step of acquiring location data which are a collection of multiple pieces of location information including location registration information; a first presence count estimation step of estimating the number of terminals located in an observation area during an observation period, as a first presence count, based on the location data acquired in the location data acquisition step; an extraction step of extracting location data in accordance with a type of location data, from the location data acquired in the location data acquisition step; a second presence count estimation step of estimating the number of terminals located in the observation area during the observation period, as a second presence count, based on the extracted location data; and a third presence count estimation step of estimating the number of terminals located in the observation area during the observation period, as a third presence count, based on one or both of the first presence count acquired in the first presence count estimation step and the second presence count acquired in the second presence count estimation step.
A number-of-terminals estimation method according to another aspect of the present invention is a number-of-terminals estimation method executed by a number-of-terminals estimation device, comprising: a location data acquisition step of acquiring location data which are a collection of multiple pieces of location information including location registration information; a first presence count estimation step of estimating the number of terminals located in an observation area during an observation period, as a first presence count, based on the location data acquired in the location data acquisition step; a signal removal step of removing location registration information generated due to a crossing of a terminal across a location registration area boundary, from the location data acquired in the location data acquisition step, to obtain location data after removal; a second presence count estimation step of estimating the number of terminals located in the observation area during the observation period, as a second presence count, based on the location data after removal obtained in the signal removal step; and a third presence count estimation step of estimating the number of terminals located in the observation area during the observation period, as a third presence count, based on one or both of the first presence count acquired in the first presence count estimation step and the second presence count acquired in the second presence count estimation step.
According to the present invention, the number of terminals is estimated as the third presence count, based on one or both of the first presence count based on the location data as the collection of multiple pieces of location information including the location registration information and the second presence count based on the location data extracted in accordance with the type of location data, whereby the number of terminals corresponding to somewhere in the numerical range between the first presence count and the second presence count (inclusive of the numerals at the two ends) can be estimated as the third presence count, with higher accuracy. At this time, when the second presence count to be used is the one based on the location data after removal of the location registration information generated due to the crossing of the terminal across the location registration area boundary, the device or method solves the problem that the presence count is overestimated near the location registration area boundary and the presence count is underestimated at the locations a little away from the location registration area boundary, thus allowing more accurate estimation of the presence count.
Embodiments of the present invention will be described below with reference to the accompanying drawings. The same portions will be denoted by the same reference signs as much as possible, without redundant description. In the present specification the term “location data” generally means data including a terminal identifier to identify a mobile terminal, a sector identifier to identify a sector where the mobile terminal is located, location information (e.g., latitude and longitude information) about a location of the mobile terminal, and location acquisition time information when the location information is acquired, and the location data includes, for example, information included in a location registration signal generated by the mobile terminal (which will be referred to hereinafter as “location registration information”), GPS location information to which the aforementioned sector identifier is added, and so on. The GPS location information, when generated, does not include the sector identifier, but the sector identifier is added by a base station having received the GPS location information from the mobile terminal and a below-described number-of-terminals estimation device 10 (
[Configuration of Communication System]
The exchanges 400 collect below-described location information on the mobile terminals 100 through the BTSs 200 and RNCs 300. The RNCs 300 are able to measure locations of the mobile terminals 100 through the use of delay values in RRC connection request signals, during execution of communication connections with the mobile terminals 100. The exchanges 400 are able to receive the location information of the mobile terminals 100 measured as described above, during execution of communication connections by the mobile terminals 100. The exchanges 400 store the received location information and output the collected location information to the management center 500 at predetermined timing or in response to a request from the management center 500.
The various processing nodes 700 acquire the location information of the mobile terminals 100 through the RNCs 300 and exchanges 400, perform re-calculation of location or the like if necessary, and output the collected location information to the management center 500 at predetermined timing or in response to a request from the management center 500.
The location information of a mobile terminal 100 to be employed in the present embodiment can be a sector number indicating in which sector the mobile terminal 100 is located, acquired from location registration information; location positioning data obtained by a location information acquisition system such as the GPS positioning system or PRACH PD; and so on. The location data of a mobile terminal 100 includes identification information to identify the mobile terminal (e.g., information associated with the mobile terminal, such as a line number), and location acquisition time information when the location information is acquired, in addition to the aforementioned location information. When the line number is used as the identification information, it is preferable to use a value associated with the line number (e.g., a hash of the line number or the like), instead of the line number itself.
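One possible way to derive such a value is sketched below in Python. The text only calls for "a hash of the line number or the like"; the use of a keyed hash (HMAC-SHA-256), the function name, and the secret key are assumptions made here for illustration.

```python
import hashlib
import hmac


def pseudonymize_line_number(line_number: str, secret_key: bytes) -> str:
    """Return a stable identifier derived from a line number.

    A keyed hash (HMAC-SHA-256) is an illustrative choice, not something
    the source specifies. The same line number always maps to the same
    identifier, so records of one terminal can still be grouped.
    """
    digest = hmac.new(secret_key, line_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

Because the mapping is deterministic for a given key, the preceding and following location data of one terminal can still be matched by the derived identifier without storing the line number itself.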
The management center 500, as described above, is configured including the social sensor unit 501, peta-mining unit 502, mobile demography unit 503, and visualization solution unit 504, and each unit performs statistical processing using the location data of mobile terminals 100. A below-described number-of-terminals estimation device 10 (
The social sensor unit 501 consists of server apparatus to collect data including the location information of the mobile terminals 100 and the like, either from each exchange 400 and the various processing nodes 700 or off-line. This social sensor unit 501 is configured so as to be able to receive data output at periodic intervals from the exchanges 400 and various processing nodes 700 or to acquire data from the exchanges 400 and various processing nodes 700 in accordance with timing predetermined in the social sensor unit 501.
The peta-mining unit 502 consists of server apparatus to convert data received from the social sensor unit 501 into a predetermined data format. For example, the peta-mining unit 502 performs a sorting process using user IDs as keys or a sorting process for each area.
The mobile demography unit 503 consists of server apparatus to perform a totalization process on the data processed in the peta-mining unit 502, i.e., a counting process of each item. For example, the mobile demography unit 503 is able to count the number of users located in a certain area and to totalize the distributions of users.
The visualization solution unit 504 consists of server apparatus to visualize the data totalized in the mobile demography unit 503. For example, the visualization solution unit 504 is able to perform a mapping process of mapping the totalized data on a map. The data processed by this visualization solution unit 504 is provided to companies, public agencies, individuals, or the like to be used in development of shops, surveys of road traffic, countermeasures against natural disasters, countermeasures against environmental damage, and so on. Such statistically processed information is processed so that individuals or the like cannot be identified therefrom, in order to prevent invasions of privacy, as a matter of course.
Each of the social sensor unit 501, peta-mining unit 502, mobile demography unit 503, and visualization solution unit 504 is composed of the server apparatus as described above, and it is needless to mention that each unit has an ordinary basic configuration of information processing device (i.e., CPU, RAM, ROM, input devices such as keyboard and mouse, a communication device for communication with the outside, a memory device to store information, and output devices such as display and printer), illustration of which is omitted herein.
[Configuration of a Number-of-Terminals Estimation Device]
Next, the number-of-terminals estimation device according to the present embodiment will be described.
The functions of the respective units in the number-of-terminals estimation device 10 in
The first presence count estimation unit 15 retrieves the location data from the storage unit 12 and estimates the number of terminals located in the observation area during the observation period, as a first presence count, based on the location data. A number-of-terminals estimation method by the first presence count estimation unit 15 does not have to be limited to a specific method, but may be selected from various methods. An example of the number-of-terminals estimation method and a configuration of the first presence count estimation unit 15 to execute the estimation method will be described later. The number-of-terminals estimation method to be employed can be one except for the below-described example, e.g., the method described in Japanese Patent Application No. 2010-221456 which was filed by the same Applicant. This method is a method wherein an information analysis device receives, from the outside, point data including location information indicative of a location of each user, positioning time information when the location information is acquired, and a user ID, extracts point data with a positioning time immediately preceding a target time and point data with a positioning time immediately following the target time, from the point data on each user, estimates a location of each user at the target time by supplementing an interval between a location indicated by the point data immediately preceding the target time and a location indicated by the point data immediately following the target time, for each user, and calculates a population distribution in each predetermined calculation target area unit at the target time, based on the estimated locations of respective users, and this method can be applied to the number-of-terminals estimation.
The signal removal unit 16 retrieves the location data from the storage unit 12 and removes the location registration information generated due to a crossing of a terminal across a location registration area boundary (which will be referred to hereinafter as “location registration information due to LA-crossing”), from the location data to obtain location data after removal. The “location data after removal” obtained herein includes the location registration information excluding the location registration information due to LA-crossing, out of the location registration information and it is a matter of course that it may further include the location information (e.g., the GPS positioning information or the like) except for the location registration information. The foregoing signal removal unit 16 corresponds to the “extraction unit” and “signal removal unit” in the scope of claims and in the present embodiment, the signal removal unit 16 that removes the location registration information due to LA-crossing from the location data to obtain the location data after removal will be described as an example of the extraction unit to extract the location data in accordance with a type of location data (e.g., the generation factor of the location data or the like). An extraction method in accordance with a type of location data to be executed herein can be, for example, a method of removing only the LA-crossing location registration information from the location data (i.e., extracting the location data except for the LA-crossing location registration information) with reference to the aforementioned generation factor information included in the location registration information, a method of extracting only the location registration information generated due to the periodic location registration from the location data, or the like. 
The “type of location data” applicable herein can be, for example, a generation time of location data or the like, besides the foregoing generation factor.
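The two extraction methods mentioned above can be sketched in Python as follows. The `LocationRecord` type, its field names, and the string values used for the generation factor are all hypothetical; the source only states that generation factor information is included in the location registration information.

```python
from dataclasses import dataclass
from typing import List, Optional

LA_CROSSING = "la_crossing"  # hypothetical label: generated by an LA crossing
PERIODIC = "periodic"        # hypothetical label: periodic location registration


@dataclass(frozen=True)
class LocationRecord:
    terminal_id: str
    time: float
    sector: str
    # Generation factor of location registration information; None for
    # location information that is not location registration information
    # (e.g., GPS positioning information).
    factor: Optional[str]


def remove_la_crossing(records: List[LocationRecord]) -> List[LocationRecord]:
    """Remove only the location registration information due to LA-crossing."""
    return [r for r in records if r.factor != LA_CROSSING]


def keep_periodic_only(records: List[LocationRecord]) -> List[LocationRecord]:
    """Extract only the periodically generated location registration information."""
    return [r for r in records if r.factor == PERIODIC]
```

The first function keeps GPS-type records in the location data after removal, whereas the second keeps periodic location registration information only, which reflects the difference between the two methods.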
The second presence count estimation unit 17 estimates the number of terminals located in the observation area during the observation period, as a second presence count, based on the location data after removal obtained by the signal removal unit 16. A number-of-terminals estimation method by the second presence count estimation unit 17 is not limited to a specific method, either, and can be selected from various methods including the aforementioned method described in Japanese Patent Application No. 2010-221456; however, it is preferably the same method as the number-of-terminals estimation method by the first presence count estimation unit 15. The present embodiment will be described using an example wherein the second presence count estimation unit 17 adopts the same number-of-terminals estimation method as the first presence count estimation unit 15. A specific example of the number-of-terminals estimation method and a specific configuration of the second presence count estimation unit 17 to execute the estimation method will be described later.
The third presence count estimation unit 18 estimates the number of terminals located in the observation area during the observation period, as a third presence count, based on one or both of the first presence count obtained by the first presence count estimation unit 15 and the second presence count obtained by the second presence count estimation unit 17. In
In the meantime, the number-of-terminals estimation method by the third presence count estimation unit 18 is not limited to a specific method, but can be selected from various methods as described below. For example, the third presence count estimation unit 18 may estimate the third presence count to be the number of terminals corresponding to a predetermined proportional division point between the first presence count and the second presence count. The proportional division point herein may be an exact center point between the first presence count and the second presence count, a point of trisection of the numerical range between the first presence count and the second presence count that is closer to the second presence count, or any proportional division point other than these.
Furthermore, the third presence count estimation unit 18 may estimate the third presence count as follows: when the first presence count is not less than the second presence count, the third presence count estimation unit 18 estimates the third presence count to be the number of terminals corresponding to the proportional division point between the first presence count and the second presence count as described above; on the other hand, when the first presence count is less than the second presence count, the third presence count estimation unit 18 estimates the third presence count to be the second presence count.
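The proportional division rule, together with the variant that falls back to the second presence count, can be sketched in Python as follows. The function name, the parameter weight_to_second and the flag fallback_to_second are hypothetical names introduced for this illustration only.

```python
def third_presence_count(first, second, weight_to_second=0.5,
                         fallback_to_second=False):
    """Proportional division point between the first and second presence counts.

    weight_to_second = 0.5 gives the exact center point; 2/3 gives the point
    of trisection that is closer to the second presence count.
    """
    if not 0.0 <= weight_to_second <= 1.0:
        raise ValueError("weight_to_second must lie in [0, 1]")
    # Variant: when the first count is less than the second count,
    # use the second presence count as it is.
    if fallback_to_second and first < second:
        return second
    return first + weight_to_second * (second - first)


center = third_presence_count(120, 90)             # center point
trisect = third_presence_count(120, 90, 2.0 / 3.0)  # closer to the second count
fallback = third_presence_count(80, 90, fallback_to_second=True)
```

With a first presence count of 120 and a second presence count of 90, the center point is 105 and the trisection point closer to the second count is 100; with 80 and 90 and the fallback enabled, the result is the second presence count, 90.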
The third presence count estimation unit 18 may switch, for each observation area, the basis of the estimation of the third presence count among the first presence count only, the second presence count only, and both the first and second presence counts.
The third presence count estimation unit 18 may estimate an error of the number of terminals caused by a crossing of a terminal across a location registration area boundary, based on the first presence count and the second presence count. For example, the third presence count estimation unit 18 may estimate this error to be the difference between the first presence count and the second presence count. The error of the number of terminals estimated in this manner can be used to evaluate the accuracy of the number-of-terminals estimation. The estimation of the error described above can be executed not only in the embodiment wherein the signal removal unit 16 removes the location registration information due to LA-crossing from the location data to obtain the location data after removal, but also in any embodiment wherein the location data is extracted in accordance with a type of location data.
Returning to
[Configurations of First Presence Count and Second Presence Count Estimation Units]
As shown in
The observation target acquisition unit 31 acquires observation start time information and observation end time information about the observation period to be observed, from the observation period acquisition unit 13, acquires observation area information about the observation area to be observed, from the observation area acquisition unit 14, and acquires as observation target location data, one or more pieces of location data including location acquisition time information after the observation start time and before the observation end time, and location information associated with the observation area information, from the storage unit 12. The observation target location data may be further subjected to a narrowing process by a separately given condition (e.g., age groups of users of mobile terminals or the like).
The preceding and following location data acquisition unit 32 acquires, concerning a piece of location data as a target on which a below-described feature amount is calculated in a procedure of presence count estimation processing (which will be referred to hereinafter as “first location data”), the location acquisition time information of location data immediately preceding the first location data (which will be referred to hereinafter as “second location data”) and the location acquisition time information of location data immediately following the first location data (which will be referred to hereinafter as “third location data”), from the location data including the same identification information as the first location data. It is not essential for the preceding and following location data acquisition unit 32 to acquire the whole of the second or third location data, but it is sufficient for the preceding and following location data acquisition unit 32 to acquire, at least, the location acquisition time information in the location data. The preceding and following location data acquisition unit 32 may retrieve the location acquisition time information of the second and third location data from the storage unit 12 or receive the information from the location data acquisition unit 11. It makes no logical difference if either method is employed.
The feature amount calculation unit 33 calculates the feature amount on each piece of first location data. For example, the feature amount calculation unit 33 calculates a difference between the location acquisition time of the second location data and the location acquisition time of the third location data, as the feature amount on the first location data. When the location acquisition time of the second location data is an abnormal value, e.g., when a difference between the location acquisition time of the first location data and the location acquisition time of the second location data is larger than a predetermined reference value (e.g., one hour) as an example, the feature amount calculation unit 33 uses as the location acquisition time of the second location data, a time set backward by a predetermined time (e.g., one hour) from the location acquisition time of the first location data to calculate the feature amount on the first location data. Similarly, when the location acquisition time of the third location data is an abnormal value, e.g., when a difference between the location acquisition time of the first location data and the location acquisition time of the third location data is larger than a predetermined reference value (e.g., one hour) as an example, the feature amount calculation unit 33 uses as the location acquisition time of the third location data, a time set forward by a predetermined time (e.g., one hour) from the location acquisition time of the first location data to calculate the feature amount on the first location data. 
These processes in the case where the location acquisition time of the second or third location data is an abnormal value are not indispensable processes, but execution of the above processes can prevent such inconvenience that when an acquisition time interval of location data becomes abnormally long because of the mobile terminal 100 being located in an out-of-service area or because of the mobile terminal 100 being in a power-off mode, the abnormally long acquisition time interval excessively affects the calculation result.
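The feature amount calculation with the abnormal-value handling described above can be sketched in Python as follows. This is a minimal illustration rather than the device's implementation; the function name feature_amount, its parameter names, and the use of seconds as the time unit are assumptions introduced here.

```python
ONE_HOUR = 3600.0  # example reference value and example substitute offset, in seconds


def feature_amount(t_prev, t_cur, t_next, reference=ONE_HOUR, offset=ONE_HOUR):
    """Feature amount on the first location data.

    t_prev, t_cur, t_next are the location acquisition times of the second
    (immediately preceding), first, and third (immediately following) data.
    """
    # Preceding time treated as abnormal: substitute a time set backward
    # by `offset` from the first location data.
    if t_cur - t_prev > reference:
        t_prev = t_cur - offset
    # Following time treated as abnormal: substitute a time set forward
    # by `offset` from the first location data.
    if t_next - t_cur > reference:
        t_next = t_cur + offset
    return t_next - t_prev


normal = feature_amount(0.0, 600.0, 1500.0)
clamped = feature_amount(0.0, 7200.0, 7800.0)
```

In the second call the preceding record is two hours old, so it is replaced by a time one hour before the first location data and the feature amount becomes 4200 seconds instead of 7800 seconds.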
The number-of-terminals estimation unit 34 estimates the number of terminals located in the observation area during the observation period, based on the feature amounts on the observation target location data and the length of the observation period which is the difference between the observation start time and the observation end time. The details will be described later, but the number-of-terminals estimation unit 34 estimates the number of terminals to be a numeral obtained by dividing the sum of the feature amounts on the observation target location data by twice the length of the observation period.
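As a small worked illustration of this division, the following Python sketch sums the feature amounts and divides by twice the length of the observation period. The function name estimate_terminals and the sample values are assumptions made for the example.

```python
def estimate_terminals(feature_amounts, t_start, t_end):
    """Sum of feature amounts divided by twice the observation period length."""
    period = t_end - t_start
    if period <= 0:
        raise ValueError("observation end time must be after the start time")
    return sum(feature_amounts) / (2.0 * period)


# Hypothetical case: one terminal stays in the observation area for the whole
# hour and reports every 600 s, so six records fall inside the period and each
# has a feature amount of 1200 s (time from the preceding to the following record).
estimate = estimate_terminals([1200.0] * 6, 0.0, 3600.0)
```

Here the sum of the feature amounts is 7200 seconds and twice the observation period is also 7200 seconds, so the estimate is exactly one terminal.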
[Conception and Calculation Method of Number-of-Terminals Estimation]
Next, the conception and the calculation method of number-of-terminals estimation will be described. Let us assume, like the model shown in
Namely, the number of terminals m is estimated by dividing the sum of the visit durations ti of the respective terminals ai in the sector S during the observation period by the length T of the observation period. The true values of the visit durations ti of the respective terminals ai in the sector S during the observation period are unobservable; however, each terminal ai sends signals (e.g., location registration signals including the location registration information), and these signals are observable.
Let us assume that signals sent in the sector S during the observation period by terminal ai, are defined in chronological order as follows.
qi1, qi2, . . . , qixi
(where xi is a total number of signals sent in the sector S during the observation period by terminal ai). Then the estimation of the number of terminals is nothing but estimating the value of m from the observed signals qij (where j is an integer of not less than 1 and not more than xi).
Now, let us explain the calculation method of number-of-terminals estimation on the basis of
E(ti)=xi/pi (2)
When a transmission time of each signal qij is represented by uij, a density pij of signal qij is given by Equation (3) below.
pij=2/(ui(j+1)−ui(j−1)) (3)
When the signal qij is assumed to be a signal related to the first location data, the signal qi(j−1) corresponds to a signal related to the second location data and the signal qi(j+1) to a signal related to the third location data. In the present embodiment, the difference between the transmission time ui(j−1) of the signal qi(j−1) related to the second location data and the transmission time ui(j+1) of the signal qi(j+1) related to the third location data, i.e., (ui(j+1)−ui(j−1)) in Equation (3) above, is defined as the feature amount wij on the first location data (feature amount wij=ui(j+1)−ui(j−1)). Therefore, Equation (3) above can be rewritten as Equation (4) below.
pij=2/(ui(j+1)−ui(j−1))=2/wij (4)
At this time, since the density pi is given by the following formula:
an estimated value E(m) of the number of terminals m can be calculated according to Equation (6) below.
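The chain from the relations above to the division by twice the observation period can be sketched as follows. The form written here for the density pi (the signal count xi divided by the summed reciprocals of pij) is an assumption chosen to be consistent with the calculation stated for the number-of-terminals estimation unit 34; it is a sketch under that assumption and not a quotation of the specification's equations.

```latex
\begin{align}
m &= \frac{1}{T}\sum_{i} t_i, \qquad
E(t_i) = \frac{x_i}{p_i}, \qquad
p_{ij} = \frac{2}{w_{ij}} \\
p_i &= \frac{x_i}{\sum_{j=1}^{x_i} 1/p_{ij}}
\quad \text{(assumed form)} \\
E(m) &= \frac{1}{T}\sum_{i}\frac{x_i}{p_i}
      = \frac{1}{T}\sum_{i}\sum_{j=1}^{x_i}\frac{1}{p_{ij}}
      = \frac{1}{2T}\sum_{i}\sum_{j=1}^{x_i} w_{ij}
\end{align}
```

Under this assumption each signal contributes half of its feature amount to the estimated visit duration of its terminal, which is why the sum of the feature amounts is divided by 2T rather than by T.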
When it is assumed as shown in the example of
[Number-of-Terminals Estimation Process]
A number-of-terminals estimation process according to the number-of-terminals estimation method of the present invention will be described below. It is assumed herein as an example that the location information in the location data of a mobile terminal includes the sector number of the sector in which the mobile terminal is located. It is also assumed herein that a set of an observation start time T1 and an observation end time T2 is preliminarily acquired as observation period information by the observation period acquisition unit 13 and that a sector number S is preliminarily acquired as observation area information by the observation area acquisition unit 14.
As shown in
Next, the first presence count estimation unit 15 retrieves the location data from the storage unit 12 and estimates the number of terminals located in the observation area during the observation period, as a first presence count, based on the location data (step S2 in
Next, the signal removal unit 16 retrieves the location data from the storage unit 12 and removes the location registration information due to LA-crossing from the location data to obtain location data after removal (step S3 in
Next, the second presence count estimation unit 17 estimates the number of terminals located in the observation area during the observation period, as a second presence count, based on the location data after removal obtained by the signal removal unit 16 (step S4 in
Then the third presence count estimation unit 18 estimates the number of terminals located in the observation area during the observation period, as a third presence count, based on one or both of the first presence count obtained by the first presence count estimation unit 15 and the second presence count obtained by the second presence count estimation unit 17 (step S5 in
Finally, the output unit 19 outputs the number of terminals obtained by the estimation (step S6 in
As described above, the number of terminals is estimated as the third presence count, based on one or both of the first presence count and the second presence count. The first presence count is based on the location data, which is a collection of multiple pieces of location information including the location registration information, and the second presence count is based on the location data after removal of the location registration information due to LA-crossing. The third presence count can therefore be estimated to be a number of terminals somewhere in the numerical range between the first presence count and the second presence count (inclusive of the two end values), whereby the estimated presence count can be made closer to a true value, as illustrated as an example in
[Example of First Presence Count and Second Presence Count Estimation Processes]
The below will describe an example of step S2 in
As shown in
Next, the processes in steps S12 and S13 below are executed for each piece of the acquired observation target location data. In step S12, concerning a piece of location data (first location data) as a target for calculation of the feature amount out of the observation target location data, the preceding and following location data acquisition unit 32 acquires the location acquisition time information of the location data (second location data) immediately preceding the first location data and the location acquisition time information of the location data (third location data) immediately following the first location data, in order of their location acquisition times, from the location data including the same identification information as the first location data. It is not essential for the preceding and following location data acquisition unit 32 to acquire the whole of the second and third location data; it is sufficient for it to acquire the location acquisition time information in the second and third location data. The preceding and following location data acquisition unit 32 may retrieve the location acquisition time information of the second and third location data from the storage unit 12 or receive the information from the location data acquisition unit 11. It makes no logical difference which method is employed.
Then, in step S13 the feature amount calculation unit 33 calculates the feature amount on the first location data. The content of the process will be described using
The feature amount calculation unit 33 calculates the difference Da between the location acquisition times of the first and second location data (i.e., the difference between the times t1 and t2), and the difference Db between the location acquisition times of the first and third location data (i.e., the difference between the times t1 and t3) (step S31 in
The above completes the processes in steps S12 and S13 in
Thereafter, the aforementioned processes in steps S12 and S13 are executed for each piece of the observation target location data, and the flow goes to step S15 after the execution of the processes is completed for all pieces of the observation target location data (with an affirmative judgment in step S14).
In step S15, the number-of-terminals estimation unit 34 calculates the sum of feature amounts wij on the observation target location data and estimates the number of terminals to be a numeral obtained by dividing the resultant sum of the feature amounts wij by twice the length T of the observation period, as in Equation (6) described above. In this manner, the first or second presence count can be estimated.
Since the foregoing example of estimation process involves performing the correction using the acquisition time information of the preceding and following location data in estimating the number of terminals using the location data, the number of terminals can be accurately estimated while correcting the influence of variation in reception intervals. Since the processes in the case where the location acquisition time of the second or third location data is the abnormal value as described above are carried out in the calculation process of feature amount, when the acquisition time interval of location data becomes abnormally long because of the mobile terminal 100 being located in an out-of-service area or because of the mobile terminal 100 being in a power-off mode, it becomes feasible to prevent the abnormally long acquisition time interval from excessively affecting the calculation result.
As apparent from Equation (6), the number-of-terminals estimation unit 34 may estimate the number of terminals to be a numeral obtained by dividing each of the feature amounts wij on the observation target location data by 2, calculating the sum of (feature amounts wij/2), and then dividing the obtained sum by the length T of the observation period. However, the number of divisions is overwhelmingly smaller in the calculation method of dividing the sum of the feature amounts wij on the observation target location data by twice the length T of the observation period as in the present embodiment, which provides the advantage of reduction in processing load.
[Modification of Estimation Processes of First and Second Presence Counts]
The foregoing estimation processes of the first and second presence counts showed an example in which the location data targeted for calculation of the feature amount was narrowed down to the observation target location data. The modification example below describes a case in which the feature amounts are calculated for all pieces of the acquired location data and are thereafter narrowed down to the feature amounts to be used in the estimation.
Namely, as shown in
The observation target acquisition unit 31 in the modification example acquires as the observation target location data, one or more pieces of location data including the location acquisition time information after the observation start time and before the observation end time about the observation period to be observed and including the location information associated with the observation area information about the observation area to be observed, and thereafter it outputs the observation target location data to the number-of-terminals estimation unit 34.
The preceding and following location data acquisition unit 32 defines each of all pieces of the location data acquired by the location data acquisition unit 11, as the first location data and acquires the location acquisition time information of the second location data (immediately-preceding location data) and the third location data (immediately-following location data) about the first location data. The location data acquired by the location data acquisition unit 11 may be data stored in the storage unit 12 after being acquired by the location data acquisition unit 11, or data transmitted from the location data acquisition unit 11 to the preceding and following location data acquisition unit 32 without being stored in the storage unit 12. Namely, the preceding and following location data acquisition unit 32 may retrieve the location acquisition time information of the second and third location data from the storage unit 12 or receive the information from the location data acquisition unit 11. It makes no logical difference which method is adopted.
The feature amount calculation unit 33 defines each of all pieces of the location data acquired by the location data acquisition unit 11, as the first location data and calculates the feature amount on the first location data. Since the result of this calculation becomes a huge amount of data, the feature amount calculation unit 33 is preferably provided with a feature amount storage unit 33A for storage of feature amounts as the calculation result as shown in
The number-of-terminals estimation unit 34 extracts the feature amounts on the observation target location data received from the observation target acquisition unit 31, from the feature amounts on all pieces of location data preliminarily calculated and stored in the feature amount storage unit 33A, and estimates the number of terminals located in the observation area during the observation period, based on the feature amounts on the observation target location data and the difference between the observation start time and the observation end time (the length of the observation period). Specifically, as in the aforementioned embodiment, the number-of-terminals estimation unit 34 estimates the number of terminals to be a numeral obtained by dividing the sum of the feature amounts on the observation target location data by twice the length of the observation period.
The number-of-terminals estimation process in the modification example will be described below. It is assumed herein that the location information in the location data of each mobile terminal given is a sector number of a sector in which the mobile terminal is located.
As shown in
The above completes the processes in steps S21-S23 on a given piece of observation target location data (first location data).
Thereafter, the processes in steps S21-S23 are executed for each of all pieces of the location data. After the processes in steps S21-S23 are completed for all pieces of the location data (with an affirmative judgment in step S24), the feature amounts on all pieces of the location data have been calculated and stored in the feature amount storage unit 33A. In this manner, the feature amounts on all pieces of the location data can be preliminarily calculated and stored before execution of the number-of-terminals estimation.
In the next step S25, the observation period acquisition unit 13 acquires the observation period information including a set of an observation start time and an observation end time, and the observation area acquisition unit 14 acquires the observation area information associated with one or more pieces of location information. It is assumed herein that a set of an observation start time T1 and an observation end time T2 is acquired as the observation period information and that a sector number S is acquired as the observation area information.
Next, the observation target acquisition unit 31 acquires as the observation target location data, one or more pieces of location data including the location acquisition time information after the observation start time T1 and before the observation end time T2 and including the location information associated with the sector number S as the observation area information (e.g., the location information of which is the sector number S), from the storage unit 12 (step S26). Namely, the observation target acquisition unit 31 acquires the location data meeting the following conditions, as the observation target location data.
As shown in Equation (6) above, the number-of-terminals estimation unit 34 then estimates the number of terminals to be a numeral obtained by dividing the sum of the feature amounts wij on the observation target location data by twice the length T of the observation period (step S27). In this manner, the device can estimate the first or second presence count.
Since the foregoing modification example involves calculating and storing the feature amounts on all pieces of location data in advance before the execution of the number-of-terminals estimation, the number-of-terminals estimation device 10 is able to reduce the time from the acquisition of the observation period information and the observation area information and the start of the number-of-terminals estimation process to the acquisition of the number of terminals as the estimation result.
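The order of operations in the modification example (feature amounts first, observation target afterwards) can be sketched in Python as follows. The tuple layout of the records, the function names, and the in-memory list standing in for the feature amount storage unit 33A are assumptions made for illustration; the abnormal-value handling is omitted, and records without both a preceding and a following record are simply skipped in this sketch.

```python
from collections import defaultdict


def precompute_feature_amounts(records):
    """records: iterable of (terminal_id, time, sector).

    Returns a list of (time, sector, feature_amount) for every record that
    has both a preceding and a following record from the same terminal.
    """
    by_terminal = defaultdict(list)
    for terminal_id, time, sector in records:
        by_terminal[terminal_id].append((time, sector))
    stored = []
    for points in by_terminal.values():
        points.sort()  # chronological order per terminal
        for k in range(1, len(points) - 1):
            time, sector = points[k]
            stored.append((time, sector, points[k + 1][0] - points[k - 1][0]))
    return stored


def estimate_from_store(stored, sector, t1, t2):
    """Select stored feature amounts for the observation target and estimate."""
    total = sum(w for time, sector_k, w in stored
                if sector_k == sector and t1 < time < t2)
    return total / (2.0 * (t2 - t1))


# Hypothetical data: one terminal reporting every 600 s in sector "S".
records = [("a1", -900.0 + 600.0 * k, "S") for k in range(10)]
stored = precompute_feature_amounts(records)
estimate = estimate_from_store(stored, "S", 0.0, 3600.0)
```

Because the stored feature amounts do not depend on the observation period or the observation area, the second function can be called repeatedly with different sectors and periods without recomputing them.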
In the processing of
[Modification Example of Number-of-Terminals Estimation Device]
The number-of-terminals estimation device 10 of the aforementioned embodiment may be further provided with a population estimation unit 20 to estimate a population in an observation area during an observation period, as shown in
The ratio of presence count and population may be, for example, a “terminal subscription rate,” which is the ratio of “the number of subscriber terminals of a specific telecommunications carrier from which the location data is acquired” to “a population in an area of a specific range.” The foregoing ratio (including the terminal subscription rate) is preferably obtained for each area, each gender, each age group, and so on before being used in the estimation of population. The population may also be estimated by obtaining a ratio of a presence count and a population in the entire country during the observation period, or by obtaining a certain constant. A predetermined ratio may also be used instead of a ratio obtained for the observation period.
The population in the observation area during the observation period can be estimated and output by taking the number of terminals from which the location data is not acquired (e.g., terminals in a power-off mode, terminals located in areas out of service, etc.), into consideration.
When “population/presence count” is adopted as the aforementioned ratio of presence count and population, this ratio is also called “scaling factor”. The scaling factor may be derived as follows. The scaling factor to be used herein as an example can be a reciprocal of “a product of a presence rate and a terminal penetration rate (i.e., a ratio of a presence count to a population).” The “presence rate” herein means a ratio of a presence count to the number of subscriptions, and the “penetration rate” means a ratio of the number of subscriptions to a population. It is preferable to derive such a scaling factor in each of the aforementioned scaling factor calculation units, but it is not essential. The scaling factor may be derived, for example, using the number of terminals (presence count) estimated based on the feature amounts and the length of the observation period as follows. Namely, the feature amounts are calculated from the location data by the technique as described in the first embodiment, the numbers of terminals in respective scaling factor calculation units are totalized based on the feature amounts and the length of the observation period to obtain user count pyramid data, and population pyramid data in the same scaling factor calculation units preliminarily obtained as statistical data (e.g., the Basic Resident Register or the like) is acquired. Then an acquisition rate of location data in each of the scaling factor calculation units (i.e., presence count/population in each unit) is calculated with the user count pyramid data and the population pyramid data. The “acquisition rate of location data (i.e., presence count/population)” obtained herein corresponds to the aforementioned “product of a presence rate and a terminal penetration rate”. A reciprocal of the “acquisition rate of location data” obtained in this manner can be derived as a scaling factor. 
The scaling factor calculation units used for calculation of the scaling factor may be, for example, prefectures of addresses, age groups at 5-year or 10-year intervals, genders, or time zones of one-hour intervals, or may be combinations of two or more of them. For example, when a scaling factor calculation unit is “men in their twenties residing in Tokyo”, the location data corresponding to men in their twenties residing in Tokyo (namely, location data whose address information in the user attributes is Tokyo) is extracted from the whole of Japan; the number of terminals is counted to obtain user count pyramid data; and population pyramid data about men in their twenties residing in Tokyo is acquired from the statistical data. In obtaining the user count pyramid data, the condition “residing in Tokyo” is applied to the user attributes: the device does not restrict the extraction to location data generated in Tokyo, but extracts the location data whose address information in the user attributes is Tokyo. Then the acquisition rate of the location data (i.e., presence count/population) in the scaling factor calculation unit (here, men in their twenties residing in Tokyo) is calculated from the user count pyramid data and the population pyramid data, and the reciprocal of the obtained “acquisition rate of location data” can be derived as the scaling factor. In the present specification, the description is given on the assumption that the scaling factor calculation units are equal to the population estimation units, but this is only an example and the invention is not limited to it.
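The derivation of the scaling factor as the reciprocal of the acquisition rate can be sketched in Python as follows. The dictionary layout, the unit key, the function names and all numerical values are made-up illustrations; applying the factor by multiplying a presence count follows from the definition “population/presence count” and is shown here as an assumption about how the factor would be used.

```python
def scaling_factors(user_counts, populations):
    """Scaling factor per scaling factor calculation unit.

    user_counts: user count pyramid data, {unit: presence count}
    populations: population pyramid data, {unit: population}
    The factor is the reciprocal of the acquisition rate
    (presence count / population), i.e. population / presence count.
    """
    factors = {}
    for unit, presence in user_counts.items():
        if presence <= 0:
            continue  # no acquisition rate can be formed for this unit
        factors[unit] = populations[unit] / presence
    return factors


def estimate_population(presence_count, factor):
    """Population estimate for an observation area (assumed usage)."""
    return presence_count * factor


# Hypothetical unit and hypothetical figures.
unit = ("Tokyo", "male", "20s")
factors = scaling_factors({unit: 400_000}, {unit: 1_000_000})
population = estimate_population(120, factors[unit])
```

With these made-up figures the acquisition rate is 0.4, the scaling factor is 2.5, and a presence count of 120 corresponds to an estimated population of 300.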
The second embodiment will describe the second technique about the number-of-terminals estimation and a feature amount calculation process based on the same technique. Since the configurations of the communication system and the number-of-terminals estimation device in the second embodiment are the same as in the first embodiment, the description thereof is omitted herein.
Based on this conception, the visit duration ti in which the terminal ai is located in the sector S during the observation period is a duration indicated by a thick solid line in
Furthermore, as shown in
The feature amount calculation process based on the second conception of the number-of-terminals estimation as described above will be described using
As shown in
Next, the feature amount calculation unit 33 determines whether or not the immediately-following location data includes the LA-crossing location registration information, for example, from service class information included in the immediately-following location data (step S44). When the immediately-following location data includes the LA-crossing location registration information, the feature amount calculation unit 33 sets the location acquisition time of the immediately-following location data to a second variable e for calculation of the feature amount (hereinafter referred to as “variable e”) (step S45); when the immediately-following location data does not include the LA-crossing location registration information, the feature amount calculation unit 33 sets the midpoint time between the location acquisition time of the calculation target location data and the location acquisition time of the immediately-following location data to the variable e (step S46). It is not essential to perform the determination processes in steps S41 and S44 above on the basis of the service class information; they may be performed based on other information. For example, it is also possible to adopt a scheme in which area information indicative of the ranges of location registration areas is preliminarily retained and the determination processes are carried out based on the area information and the location information of the calculation target location data and the immediately-following location data.
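The setting of the variable e in steps S44 to S46 can be sketched in Python as follows. The function name and the boolean argument are hypothetical; how the LA-crossing determination itself is made (service class information or area information) is outside this sketch.

```python
def variable_e(t_target, t_next, next_is_la_crossing):
    """Variable e for the feature amount calculation.

    t_target: location acquisition time of the calculation target location data
    t_next:   location acquisition time of the immediately-following location data
    next_is_la_crossing: True when the immediately-following location data
        includes LA-crossing location registration information
    """
    if next_is_la_crossing:
        # Step S45: use the acquisition time of the following data itself.
        return t_next
    # Step S46: use the midpoint between the two acquisition times.
    return (t_target + t_next) / 2.0


e_crossing = variable_e(1000.0, 1600.0, True)
e_midpoint = variable_e(1000.0, 1600.0, False)
```

For acquisition times of 1000 s and 1600 s, the variable e is 1600 s when the following data is an LA-crossing registration and 1300 s otherwise.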
Next, the feature amount calculation unit 33 performs an adjustment process of the variables s and e shown in
The feature amount calculation unit 33 calculates a difference Dc between the variable s and the time t1 and a difference Dd between the variable e and the time t1 (step S51 in
Next, returning to
The second embodiment described above can obtain the feature amount with high accuracy while taking account of the point that when at least one of the calculation target location data and the immediately-following location data includes the LA-crossing location registration information, the entrance into the sector S or the exit from the sector S is determined to have occurred at the time of generation of the LA-crossing location registration information.
The feature amount calculation technique described in the second embodiment is also applicable to “the case where the feature amounts are calculated by narrowing down the location data to the observation target location data and where the number of terminals is estimated by the feature amounts obtained,” and to “the case where the feature amounts are preliminarily calculated for all pieces of the location data and where the number of terminals is estimated using the feature amounts on the observation target location data among them.”
Next, a modification example concerning the feature amount will be described. In the first and second embodiments described above, the feature amount on the location data targeted for feature amount calculation (first location data) was calculated as the time difference between the location data immediately preceding it (second location data, i.e., the immediately-preceding location data) and the location data immediately following it (third location data, i.e., the immediately-following location data). Expressed as an equation, this feature amount is given by Equation (7) below. Equation (7) is a modified form of Equation (4) above and is equivalent to it (namely, the concept of Equation (4) is unchanged).
$$w_{ij} = u_{i(j+1)} - u_{i(j-1)} \tag{7}$$
The present modification example shows another variation of the feature amount calculation method in the feature amount calculation unit 17.
In the present modification example, when the feature amount calculation unit 17 calculates the feature amount on the first location data, it takes account of type information on the second location data and the third location data (e.g., the generation factor (generation timing) of the location data described below). Specifically, the feature amount calculation unit 17 multiplies the time difference between the third location data and the first location data by a correction factor α corresponding to the type information of the third location data (here, the generation factor), and multiplies the time difference between the first location data and the second location data by a correction factor β corresponding to the type information of the second location data (here, the generation factor). Alternatively, the feature amount calculation unit 17 may determine the correction factor α or β according to the type information of the first location data, or may determine the correction factor β according to the type information of the first and second location data and the correction factor α according to the type information of the first and third location data. The feature amount calculation unit 17 then takes the sum of these two products as the feature amount on the first location data. Expressed as an equation, this feature amount calculation process is represented by Equation (8) below.
$$w_{ij} = \alpha\left(u_{i(j+1)} - u_{ij}\right) + \beta\left(u_{ij} - u_{i(j-1)}\right) \tag{8}$$
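As a consistency check, and assuming for illustration only that both correction factors are set to 1, Equation (8) reduces to Equation (7), because the terms in the time of the first location data cancel:

$$
\begin{aligned}
w_{ij} &= 1\cdot\left(u_{i(j+1)} - u_{ij}\right) + 1\cdot\left(u_{ij} - u_{i(j-1)}\right) \\
       &= u_{i(j+1)} - u_{i(j-1)}
\end{aligned}
$$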
For example, when the location data is the location registration information, the type information about the second location data and the third location data can be information about the generation factor of the location registration information, and this information about the generation factor is included in the generated location registration information. Examples of such generation factors of location registration information include a crossing of a terminal across a boundary of a location registration area, generation based on location registration performed at periodic intervals, execution of an attachment process by a power-on operation of a terminal or the like, execution of a detachment process by a power-off operation of a terminal or the like, and so on, and set values of the correction factors α and β are preliminarily defined corresponding to these generation factors. Then the feature amount calculation unit 17 can set the correction factor α on the third location data in accordance with the information about the generation factor of the third location data and set the correction factor β on the second location data in accordance with the information about the generation factor of the second location data. The correction factors α, β both may be preliminarily determined as values of not less than 0 and not more than 2. However, this numerical range is not essential.
For example, for location registration information whose generation timing is independent of the location of the terminal, such as location registration information based on location registration performed at periodic intervals, the expected duration for which the terminal has been located in the current sector is considered to be the same before and after generation of the location registration information. On the other hand, when the location registration information was generated because the terminal crossed a location registration area boundary, it can be determined that the terminal was not yet located in the current sector, at least before generation of that location registration information. For this reason, the duration for which the terminal was located in the current sector before generation of that location registration information can be regarded as 0, and when the type information (generation factor) of the first location data is “a crossing across a location registration area boundary,” the correction factor β in Equation (8) above (i.e., the correction factor β applied to the time difference from the immediately-preceding location data) can be set to 0. This allows the device to calculate a feature amount that agrees better with the actual condition. Calculating the feature amount with the correction factor β set to 0 in this case achieves the same effect as in the second embodiment described above.
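The calculation of Equation (8) with a generation-factor-dependent correction factor can be sketched as follows. This is only an illustrative sketch: the constant names, function names, and the default value of 1.0 for every factor other than β at an LA crossing are assumptions, since the specification states only that the factors are defined in advance and that β can be set to 0 when the first location data was generated by a crossing across a location registration area boundary.

```python
# Hypothetical labels for generation factors of location registration information.
PERIODIC = "periodic"        # location registration performed at periodic intervals
LA_CROSSING = "la_crossing"  # crossing across a location registration area boundary


def correction_factors(first_factor: str) -> tuple[float, float]:
    """Return (alpha, beta) chosen from the generation factor of the first location data."""
    alpha = 1.0  # assumed default; the specification does not give concrete values
    # beta = 0 for an LA crossing: the terminal was not in the current sector beforehand.
    beta = 0.0 if first_factor == LA_CROSSING else 1.0
    return alpha, beta


def feature_amount(u_prev: float, u_cur: float, u_next: float,
                   alpha: float, beta: float) -> float:
    """Equation (8): w = alpha * (u_next - u_cur) + beta * (u_cur - u_prev)."""
    return alpha * (u_next - u_cur) + beta * (u_cur - u_prev)
```

For acquisition times 0, 600, and 1500, this sketch gives a feature amount of 1500 for a periodic registration and 900 when the first location data is an LA crossing, since the interval before the crossing is no longer counted.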
As described above, when the feature amount calculation unit 17 calculates the feature amount on the target location data (first location data), it corrects the time differences from the second location data and from the third location data, which respectively precede and follow the first location data, in accordance with the type information on the second and third location data (for example, the generation factors of the location data), and calculates the feature amount using the corrected time differences. This allows the device to calculate the feature amount more accurately, based on the type information of the location data.
In all of the aforementioned embodiments the device may be configured to perform an unidentifiability securing process for removing information with individual identifiability from the location data and use the location data after the unidentifiability securing process. For example, the configurations of
1: communication system; 10: number-of-terminals estimation device; 11: location data acquisition unit; 12: storage unit; 13: observation period acquisition unit; 14: observation area acquisition unit; 15: first presence count estimation unit; 16: signal removal unit; 17: second presence count estimation unit; 18: third presence count estimation unit; 19: output unit; 21: population estimation unit; 31: observation target acquisition unit; 32: preceding and following location data acquisition unit; 33: feature amount calculation unit; 33A: feature amount storage unit; 34: number-of-terminals estimation unit; 100: mobile terminal; 200: BTS; 300: RNC; 400: exchange; 500: management center; 501: social sensor unit; 502: peta-mining unit; 503: mobile demography unit; 504: visualization solution unit; 700: various processing node.
Number | Date | Country | Kind |
---|---|---|---|
2011-018860 | Jan 2011 | JP | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2012/051455 | 1/24/2012 | WO | 00 | 2/26/2013 |