This application claims the benefit of priority under 35 U.S.C. § 119 to Japanese Patent Application No. 2004-209977, filed on Jul. 16, 2004, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a spatial data analyzing apparatus, a spatial data analyzing method, and a spatial data analyzing program.
2. Related Background Art
In recent years, a large amount of spatial data having position information pieces and attribute information pieces has been accumulated owing to advances in GPS (Global Positioning System), sensor networks, and the like.
When the spatial data is analyzed and a regularity such as “a place where data items satisfying a certain condition (data items having a certain attribute information piece or a certain combination of attribute information pieces) are congested” is found, data items satisfying that regularity can be expected to be congested at the same place in the future, so an effective measure can be taken in advance.
However, in order to find a correlation between attribute information pieces and congestion states, it is necessary to select the data items corresponding to every attribute information piece and every combination of attribute information pieces and to determine the congestion state of each selection, which requires much computation time.
In view of these circumstances, JP-A-2003-256757 has proposed an approach which finds conditions of attribute information pieces under which data items are congested (attribute information pieces and combinations of attribute information pieces), while reducing the number of determinations required.
In the approach, however, in the case where an attribute information piece by which a congestion state is not formed exists, there is a problem that it is difficult to find a classification rule (regulation) including the attribute of the attribute information piece.
According to an aspect of the present invention, there is provided a spatial data analyzing apparatus which, from a plurality of records each record including position information showing a position in a two or more-dimensional space and a plurality of attribute information pieces belonging to a plurality of attributes, finds a condition of attribute information under which records are congested in the space, comprising: a first data selecting unit which selects records having same attribute information piece from the plurality of records for each of attribute information pieces; a first determining unit which stores the attribute information piece corresponding to the records selected in case where the records selected satisfy a predetermined congestion condition based on the position information of the records selected; a combination data generating unit which combines attribute information pieces stored among different attributes to generate a combination of attribute information pieces; a second data selecting unit which selects records having the combination of attribute information pieces from the plurality of records; and a second determining unit which stores the combination of attribute information pieces corresponding to the records selected by the second data selecting unit in case where the records selected satisfy the predetermined congestion condition based on the position information of the records selected.
According to an aspect of the present invention, there is provided a spatial data analyzing method which, from a plurality of records each record including position information showing a position in a two or more-dimensional space and a plurality of attribute information pieces belonging to a plurality of attributes, finds a condition of attribute information under which records are congested in the space, comprising: selecting records having same attribute information piece from the plurality of records for each of attribute information pieces; storing the attribute information piece corresponding to the records selected in case where the records selected satisfy a predetermined congestion condition based on the position information of the records selected; combining attribute information pieces stored among different attributes to generate a combination of attribute information pieces; selecting records having the combination of attribute information pieces from the plurality of records; and storing the combination of attribute information pieces corresponding to the records selected in case where the records selected satisfy the predetermined congestion condition based on the position information of the records selected.
According to an aspect of the present invention, there is provided a spatial data analyzing program which, from a plurality of records each record including position information showing a position in a two or more-dimensional space and a plurality of attribute information pieces belonging to a plurality of attributes, finds a condition of attribute information under which records are congested in the space, the spatial data analyzing program causing a computer to execute: a first data selecting step selecting records having same attribute information piece from the plurality of records for each of attribute information pieces; a first determining step storing the attribute information piece corresponding to the records selected in case where the records selected satisfy a predetermined congestion condition based on the position information of the records selected; a combination data generating step combining attribute information pieces stored among different attributes to generate a combination of attribute information pieces; a second data selecting step selecting records having the combination of attribute information pieces from the plurality of records; and a second determining step storing the combination of attribute information pieces corresponding to the records selected by the second data selecting step in case where the records selected satisfy the predetermined congestion condition based on the position information of the records selected.
As shown in
A spatial data analyzing apparatus shown in
First, basic configuration and operation of the spatial data analyzing apparatus will be explained with reference to
(1) A data selecting unit 11 shown in
The data selecting unit 11 extracts one attribute information piece from the list of the attribute information pieces (Step S13), and selects records including the extracted attribute information piece from the analysis target data (Step S14). That is, the data selecting unit 11 selects records from the analysis target data using the extracted attribute information piece as a selection condition. The data selecting unit 11 performs this processing on all the attribute information pieces. That is, for example, the data selecting unit 11 selects all records including the attribute information piece “female”, selects all records including the attribute information piece “male”, and selects all records including the attribute information piece “young” and so on.
(2) A congestion state determining unit 12 determines whether or not a group of records selected by the data selecting unit 11 satisfies a predetermined congestion condition for each of the attribute information pieces, and it stores the attribute information piece as a candidate of a solution when the attribute information piece satisfies the predetermined congestion condition (Step S15).
Specifically, the congestion state determining unit 12 calculates a center of gravity of the record group based upon the position information of the respective records constituting the record group. Next, the congestion state determining unit 12 calculates a congestion degree (described later) based upon the calculated center of gravity and the position information of the respective records. The congestion state determining unit 12 determines whether or not the calculated congestion degree satisfies a criterion based upon a congestion state determining parameter(s) input from a parameter input unit 15. When the calculated congestion degree satisfies the criterion, the congestion state determining unit 12 selects the attribute information piece as a candidate of a solution. The selected attribute information piece (a candidate of a solution) is stored together with the center of gravity and the congestion degree. The stored attribute information piece, center of gravity, and congestion degree are outputs of the spatial data analyzing apparatus.
(3) A selection condition composing unit 13 combines attribute information pieces satisfying the predetermined congestion condition, which are selected by the congestion state determining unit 12, among different attributes to generate a combination of the attribute information pieces. The selection condition composing unit 13 passes the generated combination of the attribute information pieces to the data selecting unit 11 (Yes in Step S16, Step S17).
(4) The data selecting unit 11 selects records including the passed combination of the attribute information pieces from the analysis target data (Step S14).
(5) The congestion state determining unit 12 determines whether or not the selected records (record group) satisfies the above predetermined congestion condition. When the record group satisfies the above predetermined congestion condition, the congestion state determining unit 12 stores the combination of the attribute information pieces as a candidate of a solution (Step S15).
Specifically, the congestion state determining unit 12 calculates the center of gravity and the congestion degree of the record group including the selected combination of the attribute information pieces, as described above. When the calculated congestion degree satisfies the criterion based upon the above-described congestion state determining parameter(s), the congestion state determining unit 12 stores the combination of the attribute information pieces as a candidate of a solution together with the center of gravity and the congestion degree. The stored combination of attribute information pieces, center of gravity, and congestion degree are outputs of the spatial data analyzing apparatus.
(6) The selection condition composing unit 13 combines the selected combination of the attribute information pieces and the attribute information piece previously selected by the congestion state determining unit 12 such that equal attributes do not overlap with each other to generate a new combination of attribute information pieces (Yes in Step S16, Step S17). The selection condition composing unit 13 passes the new combination of the attribute information pieces to the data selecting unit 11 (Step S17).
Thereafter, the above items (4) to (6) are repeated until the selection condition composing unit 13 can no longer generate a combination of attribute information pieces (No in Step S16).
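The loop formed by items (1) to (6) can be sketched in Python as follows. This is only an illustrative sketch, not the apparatus itself: the function names (select, evaluate, find_congested_conditions), the record layout with "x" and "y" keys, and the four sample records are all assumptions made for the example and are not taken from the analysis target data.

```python
from statistics import fmean

def select(records, condition):
    """Records matching every (attribute, value) pair in the condition (Step S14)."""
    return [r for r in records if all(r[a] == v for a, v in condition)]

def evaluate(group, threshold, min_records):
    """Return (center of gravity, congestion degree) if the group is congested, else None."""
    if len(group) < min_records:
        return None
    cx = fmean(r["x"] for r in group)
    cy = fmean(r["y"] for r in group)
    degree = sum((r["x"] - cx) ** 2 + (r["y"] - cy) ** 2 for r in group) / len(group)
    return ((cx, cy), degree) if degree <= threshold else None

def find_congested_conditions(records, attributes, threshold, min_records):
    # Item (1): every single attribute information piece is a first candidate.
    candidates = {frozenset([(a, r[a])]) for r in records for a in attributes}
    singles, results, first_pass = [], {}, True
    while candidates:  # Step S16: stop when no combination can be generated
        stored = []
        for cond in candidates:
            outcome = evaluate(select(records, cond), threshold, min_records)  # Step S15
            if outcome is not None:
                results[cond] = outcome
                stored.append(cond)
        if first_pass:
            singles, first_pass = list(stored), False
        # Items (3) and (6): add one stored single piece, never repeating an attribute.
        candidates = set()
        for cond in stored:
            used = {a for a, _ in cond}
            for single in singles:
                (piece,) = single
                if piece[0] not in used:
                    candidates.add(cond | single)
    return results

# Hypothetical records for illustration only.
records = [
    {"sex": "female", "weather": "snow", "x": 1.0, "y": 1.0},
    {"sex": "female", "weather": "snow", "x": 2.0, "y": 1.0},
    {"sex": "male", "weather": "fair", "x": 9.0, "y": 9.0},
    {"sex": "male", "weather": "snow", "x": 1.0, "y": 5.0},
]
results = find_congested_conditions(records, ["sex", "weather"], 12.5, 2)
for cond, (center, degree) in results.items():
    print(sorted(cond), center, degree)
```

With these sample records, the search stores "sex=female", "weather=snow", and their combination; the loop ends because every further combination would repeat an attribute.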
A case that the traffic accident occurrence record data shown in
First, the data selecting unit 11 selects records from the traffic accident occurrence record data shown in
That is, records including “sex=female”, “sex=male”, “age=young”, “age=older”, “weather=fair”, “weather=rainy”, and “weather=snow” are selected from the analysis target data shown in
Record groups selected based upon respective attribute information piece (“sex=female”, “sex=male”, “age=young”, “age=older”, “weather=fair”, “weather=rainy”, and “weather=snow”) and aspects where positions of records constituting the respective record groups have been plotted on a two-dimensional map are shown in
Determination is then made in the congestion state determining unit 12 shown in
In this embodiment, the congestion degree is defined as a value obtained by dividing the sum of the squares of distances from the center of gravity of a record group (an average of longitudes and latitudes of respective records constituting the record group) to positions (the longitudes and the latitudes) of respective records by the number of records. For example, a state in which the congestion degree is 12.5 (a threshold input as the parameter by a user) or less is defined as a congestion state. Incidentally, the minimum number of records constituting congestion is defined as 2 (which is also input as the parameter by a user). In other words, regarding one record group, when the congestion degree is 12.5 or less and the number of records is 2 or more, the one record group satisfies the congestion condition.
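The definition above can be written out as a short Python sketch. The function names and the sample points are assumptions introduced for illustration; only the formula, the threshold 12.5, and the minimum record count 2 come from the text.

```python
from statistics import fmean

def center_of_gravity(points):
    """Average of the coordinates (e.g. longitude, latitude) of the records."""
    xs, ys = zip(*points)
    return (fmean(xs), fmean(ys))

def congestion_degree(points):
    """Sum of squared distances to the center of gravity, divided by the record count."""
    cx, cy = center_of_gravity(points)
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points)

def is_congested(points, threshold=12.5, min_records=2):
    """Congestion condition: enough records and a congestion degree at or below the threshold."""
    return len(points) >= min_records and congestion_degree(points) <= threshold

# Hypothetical positions chosen so the result is easy to check by hand.
group = [(1, 1), (3, 1), (1, 3), (3, 3)]
print(center_of_gravity(group))  # (2.0, 2.0)
print(congestion_degree(group))  # 2.0
print(is_congested(group))       # True
```

A smaller congestion degree means the records lie closer to their center of gravity, so the condition is "at or below the threshold" rather than above it.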
Now, regarding respective record groups selected based upon the attribute information pieces of the above-described “sex=female”, “sex=male”, “age=young”, “age=older”, “weather=fair”, “weather=rainy”, and “weather=snow”, the congestion state determining unit 12 first calculates the number of records, the center of gravity, and the congestion degree which correspond to each attribute information piece, and the like. The calculation results are shown in the following Table 1.
From Table 1, the attribute information pieces where the congestion degree is 12.5 or less and the number of records is 2 or more, namely, the attribute information pieces satisfying the congestion condition are four pieces of “sex=female”, “sex=male”, “age=older”, and “weather=snow”, and the four attribute information pieces and data items corresponding thereto (the number of records, center of gravity, sum of squares of distances from the center of gravity, and congestion degree) are stored in the congestion state determining unit 12. The attribute information pieces stored are passed to the selection condition composing unit 13 shown in
The selection condition composing unit 13 combines the attribute information pieces passed from the congestion state determining unit 12 between different attributes to generate a combination(s) of the attribute information pieces. That is, the selection condition composing unit 13 ANDs two attribute information pieces between attributes different from each other.
In other words, in this example, the selection condition composing unit 13 generates five combinations of attribute information pieces, i.e., “sex=female and age=older”, “sex=male and age=older”, “sex=female and weather=snow”, “sex=male and weather=snow”, and “age=older and weather=snow”, based upon the four passed attribute information pieces (“sex=female”, “sex=male”, “age=older”, and “weather=snow”).
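The pairing rule can be checked with a few lines of Python. The representation of an attribute information piece as an (attribute, value) tuple and the function name pair_conditions are assumptions made for this sketch.

```python
from itertools import combinations

# The four attribute information pieces that passed the first determination.
passed = [("sex", "female"), ("sex", "male"), ("age", "older"), ("weather", "snow")]

def pair_conditions(pieces):
    """AND two pieces only when they belong to different attributes."""
    return [(a, b) for a, b in combinations(pieces, 2) if a[0] != b[0]]

pairs = pair_conditions(passed)
for a, b in pairs:
    print(f"{a[0]}={a[1]} and {b[0]}={b[1]}")
print(len(pairs))  # 5
```

Of the six possible pairs, only “sex=female” with “sex=male” is rejected, because both pieces belong to the same attribute.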
The five combinations (selection conditions) of the attribute information pieces generated are passed from the selection condition composing unit 13 to the data selecting unit 11 shown in
The respective record groups selected based upon the selection conditions (“sex=female and age=older”, “sex=male and age=older”, “sex=female and weather=snow”, “sex=male and weather=snow”, and “age=older and weather=snow”) and aspects where positions of records constituting the respective record groups are plotted on a two-dimensional map are shown in
The congestion state determining unit 12 shown in
That is, regarding the respective record groups selected based upon the selection conditions (“sex=female and age=older”, “sex=male and age=older”, “sex=female and weather=snow”, “sex=male and weather=snow”, and “age=older and weather=snow”), the congestion state determining unit 12 first calculates respective numbers of records, respective center of gravity, respective congestion degrees and the like. The results are shown in the following Table 2.
From Table 2, the combinations of attribute information pieces where the congestion degree is 12.5 or less and the number of records is 2 or more, namely, the combinations satisfying the congestion condition, are the four combinations “sex=female and age=older”, “sex=male and age=older”, “sex=female and weather=snow”, and “age=older and weather=snow”. Therefore, the four combinations of attribute information pieces and data items (the number of records, center of gravity, sum of squares of distance from center of gravity, and congestion degree) corresponding thereto are stored as selection conditions in the congestion state determining unit 12. The combinations of attribute information pieces (the selection conditions) stored are passed to the selection condition composing unit 13 shown in
The selection condition composing unit 13 combines each of the four passed combinations of attribute information pieces (the selection conditions) with the previously (first) passed attribute information pieces (“sex=female”, “sex=male”, “age=older”, and “weather=snow”) such that no attribute appears twice in a combination. As a result, two new combinations of attribute information pieces (selection conditions), i.e., “sex=female, age=older, and weather=snow” and “sex=male, age=older, and weather=snow”, are generated.
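The extension step can be sketched in Python as follows. The function name extend_conditions and the use of frozensets to merge duplicate combinations are assumptions for illustration; the input values are the four stored combinations and the four stored attribute information pieces named above.

```python
def extend_conditions(conditions, singles):
    """Add one stored single piece to each stored combination, skipping pieces
    whose attribute already appears in that combination."""
    extended = set()
    for cond in conditions:
        used = {attr for attr, _ in cond}
        for piece in singles:
            if piece[0] not in used:
                # A frozenset makes the same combination reached twice count once.
                extended.add(frozenset(cond) | {piece})
    return extended

singles = [("sex", "female"), ("sex", "male"), ("age", "older"), ("weather", "snow")]
stored_pairs = [
    (("sex", "female"), ("age", "older")),
    (("sex", "male"), ("age", "older")),
    (("sex", "female"), ("weather", "snow")),
    (("age", "older"), ("weather", "snow")),
]
triples = extend_conditions(stored_pairs, singles)
for t in triples:
    print(sorted(t))
print(len(triples))  # 2
```

Several stored pairs lead to the same three-piece combination, so duplicates are merged before the combinations are passed on for selection.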
The generated two combinations of attribute information pieces (selection conditions) are passed from the selection condition composing unit 13 to the data selecting unit 11, and the data selecting unit 11 selects records from the traffic accident occurrence record data shown in
Determination is made in the congestion state determining unit 12 shown in
That is, first, the congestion state determining unit 12 calculates the number of records, the center of gravity, the congestion degree, and the like regarding each of the record groups selected based upon the selection conditions (“sex=female, age=older, and weather=snow” and “sex=male, age=older, and weather=snow”). The results are shown in the following Table 3.
From Table 3, it is understood that the combination of attribute information pieces where the congestion degree is 12.5 or less and the number of records is 2 or more, namely, the combination of attribute information pieces satisfying the predetermined congestion condition is only one of “sex=female, age=older, and weather=snow”. The combination of attribute information pieces (a selection condition) and data pieces (the number of records, the center of gravity, the sum of squares of distance from center of gravity, and the congestion degree) corresponding thereto are stored in the congestion state determining unit 12. The combination of attribute information pieces stored is passed to the selection condition composing unit 13 shown in
The selection condition composing unit 13 determines that there are no more attribute information pieces to be combined with the passed selection condition (combining the selection condition with any further attribute information piece would repeat an attribute already present) and outputs a processing termination instruction to the congestion state determining unit 12.
When the congestion state determining unit 12 receives the termination instruction of processing, it outputs the attribute information pieces, the combinations of attribute information pieces stored and the like. The attribute information pieces, the combinations of attribute information pieces and the like outputted are shown in the following Table 4.
To obtain the results shown in Table 4, the number of attribute information pieces and combinations of attribute information pieces that were tested against the predetermined congestion condition is 14 in total: 7 in the first pass, 5 in the second, and 2 in the third. In the embodiment, therefore, the attribute information pieces and the combinations of attribute information pieces satisfying the congestion condition can be found with less calculation than the conventional method, which makes a determination for 35 cases corresponding to all the attribute information pieces and all the combinations thereof.
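The two counts can be reproduced with a short calculation. In an exhaustive search each attribute is either left unconstrained or fixed to one of its values, and the empty condition is excluded; this counting argument is a reading of "all the attribute information pieces and all the combinations thereof" and is offered as an assumption, although it does give 35 for the three attributes of the example.

```python
from math import prod

# Number of attribute information pieces per attribute in the example:
# sex (female, male), age (young, older), weather (fair, rainy, snow).
values_per_attribute = {"sex": 2, "age": 2, "weather": 3}

# Each attribute: unused, or one of its values. Subtract the empty condition.
exhaustive = prod(n + 1 for n in values_per_attribute.values()) - 1

# Determinations actually made in the three passes of the embodiment.
staged = 7 + 5 + 2

print(exhaustive, staged)  # 35 14
```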
In the embodiment, even if a first attribute information piece which does not satisfy the congestion condition is contained regarding a certain attribute, when a second attribute information piece satisfying the congestion condition is present regarding the certain attribute, the second attribute information piece and the combination of attribute information pieces including the second attribute information piece and satisfying the congestion condition can be found. In this embodiment, for example, there are three information pieces of “fair”, “rainy” and “snow” as the attribute information pieces for the weather. The attribute information pieces “fair” and “rainy” of these information pieces do not satisfy the congestion condition (the congestion degree is larger than 12.5), but the attribute information piece “snow” satisfies the congestion condition, so that the attribute information piece “snow” and the combination of attribute information pieces including the “snow” and satisfying the congestion condition can be found.
The attribute information pieces and the combinations of attribute information pieces, the centers of gravity, the congestion degrees, and the like thus obtained can be used for various applications.
For example, it is understood from Table 4, line 7 (sex=female and weather=snow) that a woman tends to cause a traffic accident during snow in the vicinity of the center of gravity (4, 3.5). Accordingly, it is proposed that a sign board promoting awareness, such as “care to slippage” is provided for women around the center of gravity. Since the congestion degree is as small as “0.25” (tendency of congestion is large), with regard to the range where the sign board is provided, such determination can be made that the sign board can be provided within a distance relatively near the center of gravity (4, 3.5).
In the above explanation, the traffic accident occurrence record data is used as the analysis target data example. However, it is possible to use various spatial data including position information. For example, an effective sales strategy can be planned by analyzing data including a position where a taxi picks up a passenger as the position information to calculate a place where passengers gather for each sex, for each time band, or the like.
In the selection condition composing unit 13 shown in
Here, before a combination of attribute information pieces is generated, a prediction may be made as to whether or not the combination would satisfy the congestion condition. For example, the prediction is made by obtaining the centers of gravity of the record groups corresponding to the respective pieces to be combined and determining whether or not the square of the distance between the centers of gravity satisfies a criterion based upon a predetermined threshold.
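A minimal Python sketch of this prediction is given below. The function names and the sample centers of gravity are hypothetical; the default threshold of 10 and the rule that a squared distance of the threshold or more suppresses the combination follow the example given in this description.

```python
def squared_distance(c1, c2):
    """Square of the distance between two centers of gravity."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))

def worth_combining(center_a, center_b, threshold=10.0):
    """Generate the combination only when the squared distance between the
    centers of gravity is below the threshold."""
    return squared_distance(center_a, center_b) < threshold

# Hypothetical centers of gravity, not values from the tables.
print(worth_combining((1.0, 2.0), (2.0, 3.0)))  # True  (squared distance 2.0)
print(worth_combining((1.0, 2.0), (4.0, 3.0)))  # False (squared distance 10.0)
```

Skipping a combination this way saves both the record selection and the congestion determination for it, at the cost of relying on a heuristic rather than the congestion condition itself.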
For example, in the above-described second processing, before four attribute information pieces of
Therefore, when squares of a distance between centers of gravity are calculated regarding the above four attribute information pieces, the following results are obtained.
Since the square of the distance between centers of gravity is 10 (a predetermined threshold) or more (11.95) regarding the combination of “sex=male and weather=snow”, it is determined that the combination is not generated. As a result, a processing for selecting records including “sex=male and weather=snow” from the traffic accident occurrence record data (refer to
In the embodiment described above, the processing step shown in
In the embodiment described above, explanation has been made using the two-dimensional space as one example, but the present invention is applicable to a three or more-dimensional space.
Number | Date | Country | Kind |
---|---|---|---|
2004-209977 | Jul 2004 | JP | national |