The present invention relates to a data analysis method and a device therefor, and in particular to a data harmonic analysis method suitable for analyzing complex data and a data analysis device that uses the method.
Harmonic analysis techniques, represented by Fourier analysis and wavelet analysis, are used in many fields as practical methods for analyzing grid-like one-dimensional data and grid-like two-dimensional data. Grid-like data means data in which the distance between adjacent data is uniform. Harmonic analysis makes various kinds of data analysis possible, such as the estimation and forecasting of data, data compression, the removal of noise superimposed on data and the classification of data (for example, refer to S. G. Mallat, IEEE Trans. Pattern Anal. Machine Intell., vol. 11, No. 7, pp. 674-693, 1989). Recently, more advanced techniques such as wedgelet and curvelet have also been proposed for two-dimensional data analysis (for example, refer to R. L. Claypoole and R. G. Baraniuk, Proc. SPIE, vol. 4119, pp. 253-262, 2000 and E. J. Candes, D. L. Donoho, IEEE Trans. Image Proc., vol. 11, pp. 670-684, 2002).
In the meantime, analysis methods that are also applicable to data other than grid-like two- or less-dimensional data, that is, three- or more-dimensional data and data which is not arrayed in a grid (hereinafter called complex data), are becoming increasingly important. If a high-precision analysis technique for complex data can be established, the technique can be applied, for example, to the analysis of data acquired from a sensor network and to the classification of data represented in complex feature space (for example, in non-Euclidean space), and it can also be expected to enhance the processing of conventional grid-like two- or less-dimensional data. However, conventional methods developed to analyze grid-like two- or less-dimensional data are difficult to apply to complex data as they are.
Grid-like two- or less-dimensional data and complex data can be interpreted as data having graph structure. Graph structure means structure configured by a set of nodes (vertexes) and a set of edges that connect nodes. When two nodes are joined by one edge, the nodes are called connected. Grid-like two- or less-dimensional data can be regarded as data having two- or less-dimensional grid-like graph structure. To handle complex data, the development of a harmonic analysis technique applicable not only to two- or less-dimensional grid-like graph structure but also to data having more general graph structure is required. Though harmonic analysis methods applicable to data having these graph structures have been proposed, sufficient performance has not been achieved (for example, refer to U.S. Patent No. 2006/0004753 and M. Gavish, B. Nadler, R. R. Coifman, International Conference on Machine Learning, pp. 367-374, 2010).
A main subject of harmonic analysis technique for data having graph structure is the compatibility of performance, computation and versatility. If an applied object is limited to simple graph structure, a harmonic analysis method in which high performance is acquired for data having the graph structure and computation is little may exist. For example, the above-mentioned wedgelet and curvelet are methods in which high-performance and high-speed harmonic analysis can be made for data having two-dimensional grid-like graph structure. However, it is difficult to apply these harmonic analysis methods to more general graph structure as they are. If the more general graph structure is approximated to two-dimensional grid-like graph structure, the application is enabled, however, the performance is deteriorated. Besides, for a method that can be applied to more general graph structure, harmonic analysis technique for data having tree structure is proposed (refer to M. Gavish, B. Nadler, R. R. Coifman, International Conference on Machine Learning, pp. 367-374, 2010). However, the tree structure is graph structure having a very strong constraint that only one node called an uppermost node and having no parent node exists and nodes except the node on an uppermost hierarchy have only one parent node and similarly, it is difficult to apply the harmonic analysis technique for data having tree structure to general complex data.
In the meantime, general purpose technique for arbitrary graph structure is proposed (refer to U.S. Patent No. 2006/0004753). However, though the technique is versatile, much computation is required and operation in an order of the square of the number of nodes Nv to the third power of Nv is generally required. Besides, the technique may be unable to fulfill sufficient performance for data having specific graph structure. For example, it is difficult to apply analysis utilizing the information of the hierarchical structure to data having hierarchical graph structure (that is, a graph in which nodes have membership).
The present invention settles the problems of the prior art and provides such a data harmonic analysis method and such a data analysis device that simultaneously meet performance, computation and versatility in the analysis of data having graph structure.
The present invention can be applied to graph structure in a sufficiently wide class though it cannot be applied to arbitrary graph structure and settles the problems by the following data harmonic analysis method and the following data analysis device as high-performance high-speed technique.
(1) The present invention is based upon a data harmonic analysis method including a data acquisition step for acquiring plural data pieces as objects of analysis, a similarity calculation step for calculating similarity between plural data sources which are sources of data values of the plural data pieces acquired in the data acquisition step, a hierarchical graph generation step for generating a hierarchical graph having a hierarchy of plural child nodes corresponding to the plural data pieces as a lower hierarchy and having a hierarchy of parent nodes having no data as an upper hierarchy as graph structure that represents the plural data pieces acquired in the data acquisition step, a connection rate calculation step for calculating a connection rate between each of the plural child nodes and its parent node in the hierarchical graph generated in the hierarchical graph generation step using the information of similarity acquired in the similarity calculation step and a harmonic analysis step for applying harmonic analysis to data values in the graph based upon the hierarchical graph generated in the hierarchical graph generation step for data analysis, and has a characteristic that, in the harmonic analysis step, harmonic analysis is carried out according to the connection rate calculated in the connection rate calculation step between the child node and the parent node.
In the present invention, harmonic analysis suitable for data in the form of a graph having hierarchical structure can be made. Tree structure is also one type of hierarchical graph structure, however, hierarchical graph structure which is an object in the present invention is not limited to the tree structure. That is, two or more nodes may also exist on an uppermost hierarchy and a node except those on the uppermost hierarchy may also have plural parent nodes. The hierarchical graph structure is graph structure in a wide class including tree structure. Therefore, various data can be exactly represented. Harmonic analysis can be applied to a graph having tree structure by processing called orthogonal transformation, however, as non-orthogonal transformation is required in a hierarchical graph which is not tree structure, a method for tree structure such as that of M. Gavish, B. Nadler, R. R. Coifman (International Conference on Machine Learning, pp. 367-374, 2010) cannot be applied. Besides, as a child node has plural parent nodes, harmonic analysis is required to be carried out in consideration of the strength of connection with the respective parent nodes. In the present invention, harmonic analysis using non-orthogonal transformation is applied. Moreover, a connection rate between each child node and its parent node is calculated and the harmonic analysis method is changed according to the connection rate. In the meantime, performance and computation are made compatible by making harmonic analysis that positively utilizes information of the hierarchical structure of a graph, unlike a general purpose method applicable to arbitrary graph structure. As for computation, high-speed operation is enabled by performing multi-resolution processing in which processing is applied in order from nodes on a lowermost hierarchy to nodes on an upper hierarchy.
(2) Besides, the present invention is based upon the hierarchical graph generation step and has a characteristic that a connection rate of each edge is calculated based upon the similarity and harmonic analysis is carried out based upon the connection rate.
In multiple hierarchical graphs, data structure can be more properly represented when a weighted graph in which a connection rate is assigned to each edge is considered. Similar data values can be strongly related by assigning a higher connection rate to the more similar data values.
(3) Moreover, the present invention is based upon the hierarchical graph generation step and has a characteristic that after the hierarchical graph is generated, the generated hierarchical graph is changed to a hierarchical graph in which all lowermost nodes have a data value, all nodes except the lowermost nodes have no data value and all nodes except an uppermost node have a parent node on an upper hierarchy by one.
As for data having hierarchical graph structure, the node having the data value and the node having no data value exist. Besides, a graph in which the child node is connected to the parent node on the upper hierarchy by two or more via an edge is also conceivable. As the hierarchical graph has multiple variations as described above, it is not easy to uniformly make harmonic analysis. Then, harmonic analysis can be applied to an arbitrary hierarchical graph by relatively simple processing by changing to a hierarchical graph for which processing is easy as preparation for the harmonic analysis.
(4) In addition, the present invention is based upon the harmonic analysis step and has a characteristic that when processing is performed using a node on an “n”th hierarchy from the lowermost hierarchy and a node on an (n+1)th hierarchy from the lowermost hierarchy in the hierarchical graph, processing for equalizing the total of the sum of squares of high resolution transformation coefficients and the sum of squares of the nodes on the (n+1)th hierarchy from the lowermost hierarchy to the sum of squares of data values of the nodes on the nth hierarchy from the lowermost hierarchy is performed.
In each stage of the multi-resolution processing, the sum of squares of the output (that is, the total of the sum of squares of the high resolution transformation coefficients and the sum of squares of the nodes on the (n+1)th hierarchy from the lowermost hierarchy) is equalized to the sum of squares of the input (that is, the sum of squares of the data values of the nodes on the nth hierarchy from the lowermost hierarchy). Hereby, the total of the sum of squares of the high resolution transformation coefficients and the sum of squares of the data values of the uppermost nodes, which are the output of the harmonic analysis, can be equalized to the sum of squares of the data values of each node, which are the input of the harmonic analysis. A property that the sum of squares of data values is kept before and after harmonic analysis is called Parseval's equality and harmonic analysis that meets this property is useful in data processing. For example, as the ratio of the sum of squares of noise included in the input data values to the sum of squares of components (hereinafter called signal components) except noise is also preserved after harmonic analysis, the quantity of noise can be easily estimated using the values after the harmonic analysis. In orthogonal transformation, it is guaranteed that Parseval's equality is met, however, in non-orthogonal transformation, this equality is generally not met. However, in the present invention, processing that meets Parseval's equality is also enabled in non-orthogonal transformation by performing processing for equalizing the sum of squares of the input to the sum of squares of the output in each stage of the multi-resolution processing.
(5) Further, the present invention is based upon the harmonic analysis step and has a characteristic that high resolution transformation coefficients of a number equal to a value acquired by subtracting the number of all nodes from the sum of the number of edges in the hierarchical graph and the number of nodes having a data value are calculated.
In the case of tree structure, the number of all nodes is equal to a value acquired by adding 1 to the number of edges. Therefore, the sum of high resolution transformation coefficients and the number of a data value of an uppermost node (the latter is equal to 1) which are the output of harmonic analysis is equal to the number of nodes having a data value which are the input of the harmonic analysis. That is, the number in the output value and the number in the input value are coincident. In the meantime, in the present invention, graph structure that each node is connected to plural parent nodes can be also represented. At this time, natural processing having little computation is enabled by using harmonic analysis having the characteristic described in (4).
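The count stated in (5) can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name and the small example graphs (four data nodes, two intermediate parents, one top node) are hypothetical and chosen only for illustration.

```python
def count_high_resolution_coefficients(num_edges, num_data_nodes, num_nodes):
    """Number of coefficients per (5): edges + nodes having a data value - all nodes."""
    return num_edges + num_data_nodes - num_nodes


# Hypothetical tree: 4 lowermost nodes with data, 2 intermediate parents, 1 top node.
# A tree with 7 nodes has 6 edges.
tree_count = count_high_resolution_coefficients(num_edges=6, num_data_nodes=4, num_nodes=7)

# Same nodes, but one lowermost node is given a second parent (one extra edge),
# so the graph is no longer a tree.
non_tree_count = count_high_resolution_coefficients(num_edges=7, num_data_nodes=4, num_nodes=7)

print(tree_count, non_tree_count)
```

In the tree case the three coefficients plus the single data value of the uppermost node equal the four input data values, which agrees with the statement above; each additional parent edge adds one more coefficient.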
(6) Furthermore, the present invention is based upon the data harmonic analysis method and has a characteristic that data analysis is carried out by acquiring plural data pieces to be analyzed, calculating similarity between plural data sources which are generation sources of respective values of the acquired plural data pieces, specifying the number of nodes on an uppermost hierarchy out of one or more hierarchies of parent nodes having no data and arranged on the upside of a lowermost hierarchy as a hierarchy of plural child nodes corresponding to the plural data pieces in graph structure representing the acquired plural data pieces, generating a hierarchical graph including the lowermost hierarchy to the uppermost hierarchy on a condition of the specified number of nodes on the uppermost hierarchy, inputting information of a lower limit of the similarity for connecting each of the plural child nodes on the lowermost hierarchy in the generated hierarchical graph to its parent node on the hierarchy one level above the lowermost hierarchy, calculating a connection rate between each of the plural child nodes and its parent node using information of the calculated similarity and the input information of the lower limit of the similarity and applying harmonic analysis to the data values in the graph according to the calculated connection rate based upon the generated hierarchical graph.
As described above, the compression and the estimation of data values of complex data, the removal of noise and the classification of data can be performed at higher performance by utilizing the harmonic analysis method applicable to the hierarchical graph.
According to the present invention, data analysis can be carried out at high performance and at high speed by grasping complex data such as data acquired by plural and different types of sensors and multidimensional data as data having hierarchical graph structure and making harmonic analysis.
The present invention relates to a method of analyzing complex data such as data acquired by plural different sensors and multidimensional data, especially provides a method of regarding data as data having hierarchical structure and making harmonic analysis and its device. Referring to the drawings, embodiments of the present invention will be described below.
As described above, high-performance analysis can be applied to various data which can be represented by a hierarchical graph by generating a hierarchical graph which is not tree structure and making harmonic analysis suitable for data on the hierarchical graph. Besides, multi-resolution processing can be performed by performing processing of a node on a hierarchy on the upside sequentially from a lowermost node and hereby, computation can be reduced.
The details of each step will be described showing concrete examples below.
First, examples of data will be described referring to
Besides, an image can be classified by applying harmonic analysis to a value representing a class. For example, suppose that a numeric value acquired by representing a degree as Class A by a real value 0 to 1 is a data value. Each image is a data source. In this case, each data value of the image 1 and the image 3 is 1 and a data value of the image 2 is 0. As it is unknown to which class the image 4 belongs, its data value shall be 0.5 between 0 and 1. It can be regarded that a problem of the classification of an image lies in estimating a data value of an image the class of which is unknown. A data value is not required to be a scalar and may be also a vector. For example, suppose that each image is classified into three classes of Class A, B, C. In this case, a data value can be represented as a vector value configured by three real numbers showing a degree of any of Class A, B, C. Moreover, a data value is not a scalar or a vector of a real number but may be also a scalar or a vector of a complex number, a quaternion and others.
High-performance analysis can be applied to complex data by generating the hierarchical graph which is not tree structure and making harmonic analysis suitable for data on the hierarchical graph using this device.
Next, a method of calculating similarity between data sources will be described referring to
Finally, in a step S403, similarity 413 is acquired from a minimum value of the differences calculated for each image in the group of images 411. When the minimum value of the difference is dmin, similarity s is acquired, for example, by “s=exp(−k×dmin)”, where k is a constant; however, the present invention is not limited to this. Even if the image A is shifted/rotated/extended/reduced relative to the image B, similarity can be acquired by this method without being affected by the shift/rotation/extension/reduction. Similarity between two arbitrary images is calculated according to this method.
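The example formula s=exp(−k×dmin) can be sketched as follows. The function name, the default value of k and the sample difference values are assumptions made for illustration; how the differences themselves are computed for each transformed image is outside this sketch.

```python
import math


def similarity_from_differences(differences, k=1.0):
    """Return exp(-k * dmin), where dmin is the smallest difference in the list.

    `differences` stands for the differences computed against each image in the
    group of transformed images; k is a positive constant (assumed value).
    """
    d_min = min(differences)
    return math.exp(-k * d_min)


# Hypothetical differences for three transformed copies of an image.
s = similarity_from_differences([2.5, 0.4, 1.7], k=2.0)
print(s)
```

Because only the minimum difference is used, a perfect match in any one transformed copy (a difference of zero) gives a similarity of exactly 1.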
In
In
Finally, similarity 535 is calculated in a step S534 using the finite difference calculated in the step S531 and the feature values calculated in the steps S532, S533. In this example, similarity is calculated using only the finite difference between the differentiations and the feature values, however, more information such as finite difference between the second-order differentials may be also used. As in the case shown in
In
Next, a method of generating a hierarchical graph will be described referring to
Graphs 702, 703 are examples of hierarchical graphs which are not trees. In the graph 702, a node 720 has two parent nodes 721, 722. The same applies to a node 723. In the graph 703, a node having two or more parent nodes exists and, in addition, the graph has the two top nodes 730, 731. As described above, the hierarchical graphs 702, 703 do not meet the requirements of a tree. Complex data structure can be represented by considering hierarchical graphs not limited to trees.
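The tree requirements used in this description (exactly one uppermost node without a parent, and exactly one parent for every other node) can be tested mechanically. The sketch below assumes a hierarchical graph is given as a mapping from each node to the list of its parent nodes; this representation and the node names are hypothetical and do not reproduce the graphs 702, 703. The check assumes the input is already a layered hierarchical graph and does not test for cycles.

```python
def is_tree(parents):
    """Return True if the hierarchical graph meets the tree requirements.

    `parents` maps every node to the list of its parent nodes.
    """
    top_nodes = [node for node, ps in parents.items() if len(ps) == 0]
    others_have_one_parent = all(len(ps) == 1 for ps in parents.values() if len(ps) > 0)
    return len(top_nodes) == 1 and others_have_one_parent


# One top node, every other node has exactly one parent.
tree_like = {"top": [], "a": ["top"], "b": ["top"], "x": ["a"], "y": ["b"]}
# A lowermost node with two parents.
two_parents = {"top": [], "a": ["top"], "b": ["top"], "x": ["a", "b"]}
# Two top nodes.
two_tops = {"t1": [], "t2": [], "x": ["t1", "t2"]}

print(is_tree(tree_like), is_tree(two_parents), is_tree(two_tops))
```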
Next, an example of a generated hierarchical graph will be described.
This is an example of semi-supervised image classification in which each image belongs to either Class A or Class B. In this case, it is taught that the image 1 belongs to Class A and the image 2 belongs to Class B, whereas the image 3 and the image 4 are untaught. A data value that represents the likelihood of Class A is defined and, in this case, the data values of the image 1 and the image 2 are set to 1 and 0 respectively. Data values of the images 3, 4 are estimated by applying harmonic analysis, described later, to the above data values. As shown by a reference numeral 803, suppose that the estimated data values of the image 3 and the image 4 are 0.7 and 0.1. When a data value is equal to or larger than a certain value (for example, 0.5), the image is classified into Class A and if not, the image is classified into Class B. In this example, the image 3 is classified into Class A and the image 4 is classified into Class B.
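The final thresholding of the estimated data values can be sketched as follows. The function name and the dictionary layout are assumptions; the values 0.7, 0.1 and the threshold 0.5 are the ones given in the example above. The estimation by harmonic analysis itself is not part of this sketch.

```python
def classify(value, threshold=0.5):
    """Class A if the estimated likelihood reaches the threshold, otherwise Class B."""
    return "A" if value >= threshold else "B"


# Estimated data values from the example (likelihood of Class A).
estimates = {"image 3": 0.7, "image 4": 0.1}
labels = {name: classify(value) for name, value in estimates.items()}
print(labels)
```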
When plural similar nodes of parent nodes exist, the child node may have the plural parent nodes. For example, the node 904 has three parent nodes. In
As shown in
In all the graphs shown in
In a graph 1101, nodes 1114, 1115 are nodes on a lowermost layer, however, the node 1115 has no data value. Besides, nodes 1111 to 1113 which are not lowermost have a data value. Further, as the node 1112 is a parent node of the nodes 1114, 1115, it is grasped that the node 1112 is located on a second layer from the downside and as the node 1111 is a parent node of the node 1112, it is grasped that the node 1111 is located on a third layer from the downside. Then, the node 1114 has the parent node 1111 on the upside by two layers.
In a graph 1102, the graph 1101 is rearranged to facilitate the understanding of the hierarchies, a node 1121 is added, and further, the lowermost node 1115 having no data value is deleted. By adding the node 1121, which has no data value, the condition that all the nodes except the uppermost node have a parent node on the layer immediately above is met. Besides, as a lowermost node having no data value has no effect on the analysis of data values, such a node may be deleted.
In a graph 1103, instead of replacing the nodes 1111 to 1113, which have a data value but are not on the lowermost layer, with nodes having no data value, nodes on the lowermost layer having the same data values are newly added and each node is connected via an edge. Nodes 1111′ to 1113′ are the nodes on the lowermost layer added in place of the nodes 1111 to 1113. As the nodes 1111, 1113 are nodes on a third layer from the downside, nodes 1131, 1132 on a second layer from the downside are added. When a harmonic analysis method described later is used, it is considered that such a shift of a data value has no effect. The graph can be converted to a graph that meets the above-mentioned conditions by such a change. When harmonic analysis is applied to the data having such graph structure, the nodes except those on the lowermost layer are also made to have a data value as described later.
In data having hierarchical graph structure, the node having a data value and the node having no data value exist. Besides, a graph in which a child node is connected to a parent node on a layer on the upside by two or more via an edge is also conceivable. As the hierarchical graph has multiple variations as described above, it is not easy to uniformly apply harmonic analysis. Then, harmonic analysis can be applied to an arbitrary hierarchical graph by relatively simple processing by changing the current hierarchical graph to a hierarchical graph easy to process as preparation for the harmonic analysis.
In the step S1204, the number Mn+1 of nodes may be also fixed beforehand and may be also changed according to data. For example, in
Besides, when the Mn+1 nodes are selected as representative nodes of the nodes on the (n+1)th layer from the lowermost layer in the step S1205, it is desirable that the representative nodes are not biased. For example, when a hierarchical graph is generated for data for image recognition and the representative nodes are occupied by only a few specific classes, a node that does not belong to those classes cannot be connected to a similar node; it is therefore desirable that the representative nodes cover multiple types of classes. For this purpose, Mn+1 nodes are selected out of the nodes on the nth layer from the lowermost layer so that their mutual similarity is low. Hereby, the representative nodes can be made unbiased.
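One way to select representative nodes with low mutual similarity is a greedy procedure. The sketch below is only an illustration under stated assumptions: the greedy rule, the arbitrary choice of the first node, the function name and the similarity matrix are all hypothetical, and the description does not specify which selection procedure is actually used in the step S1205.

```python
def select_representatives(similarity, m):
    """Greedily pick m node indices whose mutual similarity is low.

    `similarity[i][j]` is the similarity between nodes i and j on the nth layer.
    Starting from node 0 (an arbitrary assumption), each round adds the node
    whose highest similarity to the already selected nodes is smallest.
    """
    n = len(similarity)
    if m <= 0 or n == 0:
        return []
    selected = [0]
    while len(selected) < min(m, n):
        candidates = [i for i in range(n) if i not in selected]
        best = min(candidates, key=lambda i: max(similarity[i][j] for j in selected))
        selected.append(best)
    return selected


# Hypothetical similarities: nodes 0 and 1 are alike, nodes 2 and 3 are alike.
similarity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.15, 0.1],
    [0.1, 0.15, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
representatives = select_representatives(similarity, 2)
print(representatives)
```

With two representatives requested, the sketch picks one node from each of the two groups rather than two nodes from the same group, which is the unbiased behavior described above.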
Moreover, when the representative nodes are selected in the step S1205, plural nodes (for example, nodes v1, v2) on the nth layer from the lowermost layer may be also made to correspond to a representative node (for example, a node u) of one node on the (n+1)th layer from the lowermost layer in place of correlating one node on the nth layer from the lowermost layer to a representative node of one node on the (n+1)th layer from the lowermost layer. In this case, similarity between a node v located on the nth layer from the lowermost layer and the representative node u can be defined using similarity between v and v1 and similarity between v and v2.
Referring to
A connection rate w (v, u) between v and u is calculated as in the following expression for example, however, the present invention is not limited to this.
A (v,v′) denotes similarity between nodes v and v′, R(u) denotes a representative node corresponding to the node u, and Vn denotes a set including the whole nodes on the nth layer from the lowermost layer.
The connection rate defined in the mathematical expression 1 meets the following expression for arbitrary v ∈ V1.
For a child node having only one parent, a connection rate with the parent node shall be 1. An example of a connection rate acquired by calculation is shown in 1303.
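The following sketch illustrates one possible way to turn similarities into connection rates. It is not the mathematical expression 1, which is not reproduced in this text. It assumes that the connection rates of one child node are proportional to its similarity A(v, R(u)) to each candidate parent's representative node, that parents below an input lower limit of similarity are not connected, and that the rates of one child node sum to 1. The function name, the fallback rule and the sample values are hypothetical.

```python
def connection_rates(sim_to_parents, lower_limit=0.0):
    """Connection rates of one child node v to its candidate parent nodes.

    `sim_to_parents` maps a parent node u to the similarity A(v, R(u)).
    Assumption: rates are the kept similarities normalized to sum to 1.
    """
    if not sim_to_parents:
        return {}
    kept = {u: s for u, s in sim_to_parents.items() if s >= lower_limit}
    if not kept:
        # Assumed fallback: keep the most similar parent so that v stays connected.
        best = max(sim_to_parents, key=sim_to_parents.get)
        kept = {best: sim_to_parents[best]}
    if len(kept) == 1:
        # A child node having only one parent has a connection rate of 1.
        return {u: 1.0 for u in kept}
    total = sum(kept.values())
    if total == 0:
        return {u: 1.0 / len(kept) for u in kept}
    return {u: s / total for u, s in kept.items()}


rates = connection_rates({"u1": 0.6, "u2": 0.2, "u3": 0.05}, lower_limit=0.1)
print(rates)
```

In this hypothetical example the parent u3 falls below the lower limit and is dropped, and the remaining two rates are 0.75 and 0.25.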
Such analysis that data sources that belong to the strongly connected parent node have stronger relevance can be made by calculating the connection rate with the parent node as described above and applying harmonic analysis based upon the connection rate, and high-performance data analysis is enabled.
In the example shown in
Next, a method of making multi-resolution harmonic analysis and inverse transformation of it will be described referring to
Details of the steps S1404, S1405 shown in
In the step S1404, high resolution transformation coefficients d1(v), d2(v), - - - , dk(v) and low resolution transformation coefficients a1, a2, - - - , ak are calculated based upon the data values of each node v and its parent node as shown in the following expression.
[Mathematical expression 3]
$(d_1(v), d_2(v), \ldots, d_k(v), a_1, a_2, \ldots, a_k) \leftarrow f(x_v, x_1(v), x_2(v), \ldots, x_k(v);\ w_1(v), w_2(v), \ldots, w_k(v))$
In this case, f denotes a certain function.
The function f is required to have an inverse function so that inverse transformation in the harmonic analysis is possible. “w1(v), w2(v), - - - , wk(v)” are connection rates between each node v and its parent nodes v1, v2, - - - , vk. In the step S1405, ak is assigned to xk(v) as shown in the following expression.
[Mathematical expression 4]
$x_k(v) \leftarrow a_k \quad (k \in \{1, 2, \ldots, K\})$
In the step S1503, calculation in the following expression is carried out as the inverse transformation of the mathematical expression 3.
[Mathematical expression 5]
$(x_v, x_1(v), x_2(v), \ldots, x_k(v)) \leftarrow f^{-1}(d_1(v), d_2(v), \ldots, d_k(v), x_1(v), x_2(v), \ldots, x_k(v);\ w_1(v), w_2(v), \ldots, w_k(v))$
In particular, when the step S1404 is realized by a linear transformation in which dk(v) and ak are acquired from xv, xk(v) and wk(v), the mathematical expression 3 is expressed in the form of a sum of products as shown in the following expressions.
[Mathematical expression 6]
$d_k(v) \leftarrow p_k(v)\, x_v + q_k(v)\, x_k(v)$
[Mathematical expression 7]
$a_k \leftarrow p'_k(v)\, x_v + q'_k(v)\, x_k(v)$
“pk(v), qk(v), p′k(v), q′k(v)” are functions of wk(v). The calculation in the mathematical expressions 6, 7 is carried out for k = 1, - - - , K.
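The linear form of the mathematical expressions 6, 7 and the requirement that an inverse exists can be illustrated for a single pair of a node v and one parent node. This is a minimal sketch, assuming only one parent (K=1); the function names and the coefficient values are hypothetical and are not the coefficients of the embodiment, which are functions of the connection rate.

```python
def forward_step(x_v, x_k, p, q, p2, q2):
    """d <- p*x_v + q*x_k and a <- p2*x_v + q2*x_k for one child/parent pair."""
    d = p * x_v + q * x_k
    a = p2 * x_v + q2 * x_k
    return d, a


def inverse_step(d, a, p, q, p2, q2):
    """Recover (x_v, x_k) from (d, a) by inverting the 2x2 coefficient matrix."""
    det = p * q2 - q * p2
    if det == 0:
        raise ValueError("coefficients do not define an invertible step")
    x_v = (q2 * d - q * a) / det
    x_k = (-p2 * d + p * a) / det
    return x_v, x_k


# Hypothetical coefficients: d is a difference, a is an average.
coeffs = dict(p=1.0, q=-1.0, p2=0.5, q2=0.5)
d, a = forward_step(3.0, 1.0, **coeffs)
restored = inverse_step(d, a, **coeffs)
print(d, a, restored)
```

The step is invertible exactly when p×q′ − q×p′ is not zero, which is the single-parent counterpart of the condition that f has an inverse function.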
A special case in the mathematical expressions 6, 7 will be shown below. In the following example, the sum of w1(v), w2(v), - - - , wk(v) shall be 1.
In this case, data values of nodes except nodes on the lowermost layer are all initialized to zero. “sv,k=wk(v)sv”, and “sv and s1(v), s2(v), - - - , sk(v)” are values (hereinafter called mass) which the node v and its parent nodes v1, v2, - - - , vk have.
The mass of the lowermost node is 1 and the mass of nodes except the nodes on the lowermost layer is initialized to zero.
After the calculation of the mathematical expressions 8, 9 is carried out, the mass sk(v) of the node vk is updated as shown in the following expression.
[Mathematical expression 10]
$s_k(v) \leftarrow s_k(v) + s_{v,k}$
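The mass update (the mass of each parent node is increased by the connection rate times the mass of the child node, starting from a mass of 1 on the lowermost nodes and zero elsewhere) can be sketched as a bottom-up pass over the layers. The data layout, the function name and the small example graph are assumptions made for illustration; only the update rule itself is taken from the description.

```python
def propagate_mass(layers, rates):
    """Accumulate mass from the lowermost layer upward.

    `layers` lists the node names of each layer, lowermost layer first.
    `rates[v][u]` is the connection rate between child v and parent u.
    """
    mass = {node: 0.0 for layer in layers for node in layer}
    for node in layers[0]:
        mass[node] = 1.0  # the mass of a lowermost node is 1
    for layer in layers[:-1]:
        for v in layer:
            for parent, w in rates.get(v, {}).items():
                # s_k(v) <- s_k(v) + s_{v,k}, where s_{v,k} = w_k(v) * s_v
                mass[parent] += w * mass[v]
    return mass


# Hypothetical three-layer graph in which v2 has two parents.
layers = [["v1", "v2", "v3"], ["u1", "u2"], ["top"]]
rates = {
    "v1": {"u1": 1.0},
    "v2": {"u1": 0.5, "u2": 0.5},
    "v3": {"u2": 1.0},
    "u1": {"top": 1.0},
    "u2": {"top": 1.0},
}
mass = propagate_mass(layers, rates)
print(mass)
```

When the connection rates of every child node sum to 1, as assumed in this special case, the total mass of each layer equals the number of lowermost nodes.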
It is known that the mathematical expressions 8 and 9 are such a special case of the mathematical expressions 6, 7 as shown in the following mathematical expression 11.
Inverse transformation corresponding to the harmonic analysis by the mathematical expressions 8, 9 can be realized by the following expression.
After the calculation of the mathematical expressions 12, 13 is carried out, the mass sk(v) of the node vk is updated as shown in the following expression.
[Mathematical expression 14]
$s_k(v) \leftarrow s_k(v) - s_{v,k}$
In this embodiment, the harmonic analysis suitable for data having the graph having the hierarchical structure can be made. Tree structure is also one type of hierarchical graph structure, however, the hierarchical graph structure as an object of this embodiment is not limited to tree structure. That is, the uppermost node may be also two or more and the node except the node on the uppermost layer may also have plural parent nodes. As the hierarchical graph structure is graph structure of a wide class, various data can be exactly represented. Harmonic analysis can be applied to a graph having tree structure by processing called orthogonal transformation, however, as non-orthogonal transformation is required in a hierarchical graph which does not have tree structure, a method for the tree structure cannot be applied.
Besides, as plural parent nodes exist, harmonic analysis is required to be carried out in consideration of the strength of connection between the child node and the parent node. In this embodiment, harmonic analysis using non-orthogonal transformation is applied. Moreover, a connection rate between each child node and its parent node is calculated and a harmonic analysis method is changed according to the connection rate. In the meantime, performance and computation are compatible by making harmonic analysis utilizing information of the hierarchical structure of a graph which is not considered in a general method that can be applied to arbitrary graph structure. As for computation, high speed operation is enabled by performing multi-resolution processing for nodes on an upper layer in order from the lowermost nodes.
When graph structure is tree structure, the harmonic analysis represented in the mathematical expressions 8, 9 is processing called orthogonal transformation and a result similar to that of the transformation described in M. Gavish, B. Nadler, R. R. Coifman (International Conference on Machine Learning, pp. 367-374, 2010) is acquired. However, in the case of a hierarchical graph which is not tree structure, it is very difficult to make harmonic analysis by orthogonal transformation and it is not easy to derive the mathematical expressions 8, 9 from the method described in M. Gavish, B. Nadler, R. R. Coifman (International Conference on Machine Learning, pp. 367-374, 2010). In this embodiment, when graph structure is not tree structure, non-orthogonal transformation is applied.
Computation in the harmonic analysis and the inverse transformation is proportional to the number of nodes Nv or to “Nv×log Nv”. As harmonic analysis generally requires computation at least proportional to the number of nodes Nv, it can be said that the computation required by the processing in this embodiment is sufficiently small.
Another example of the mathematical expressions 6, 7 will be described below. In the following example, the sum of w1(v), w2(v), - - - , wk(v) is not required to be 1.
In this case, data values of nodes other than those on the lowermost layer are all initialized to zero. Here, “sv,k=wk(v)sv”, and “sv” and “s1(v), s2(v), - - - , sk(v)” are the mass of the node v and the masses of its parent nodes v1, v2, - - - , vk, respectively. The mass of each lowermost node is 1, and the mass of nodes other than those on the lowermost layer is initialized to zero.
Besides, a mathematical expression 17 is as follows.
“tv” and “t1(v), t2(v), - - - , tk(v)” are a value (hereinafter called second mass) which the node v has and values (second mass) which its parent nodes v1, v2, - - - , vk have. The second mass of the lowermost node is 1 and the second mass of nodes except nodes on the lowermost layer is initialized to zero.
After the calculation of the mathematical expressions 15, 16 is carried out, the mass sk(v) of the node vk is updated by the mathematical expression 10. Besides, the second mass tk(v) of the node vk is updated as shown in the following expression.
[Mathematical expression 18]
$t_k(v) \leftarrow t_k(v) + t_{v,k}$ (Mathematical expression 18)
Inverse transformation corresponding to the harmonic analysis by the mathematical expressions 15, 16 can be realized by the following expression.
After the calculation of the mathematical expressions 19, 20 is carried out, the mass sk(v) of the node vk is updated as shown in the mathematical expression 14 and the second mass tk(v) is updated as shown in the following expression.
[Mathematical expression 21]
$t_k(v) \leftarrow t_k(v) - t_{v,k}$ (Mathematical expression 21)
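The second-mass updates of the mathematical expressions 18 and 21 can be illustrated with a minimal Python sketch. The graph, the node numbers and the contribution values t_{v,k} below are hypothetical; in the embodiment the contributions would come from the mathematical expression 17, which is not reproduced in this sketch. The sketch only shows that the subtraction of expression 21 undoes the addition of expression 18.

```python
def forward_update(t, contributions):
    """Expression 18: t_k(v) <- t_k(v) + t_{v,k} for every edge (v, k)."""
    for (_child, parent), t_vk in contributions.items():
        t[parent] += t_vk
    return t


def inverse_update(t, contributions):
    """Expression 21: t_k(v) <- t_k(v) - t_{v,k}, undoing the forward update."""
    for (_child, parent), t_vk in contributions.items():
        t[parent] -= t_vk
    return t


# Hypothetical graph: lowermost nodes 1 and 2 start with second mass 1,
# upper nodes 3 and 4 are initialized to zero.
initial = {1: 1.0, 2: 1.0, 3: 0.0, 4: 0.0}
# Hypothetical contributions t_{v,k}, keyed by (child v, parent k).
contributions = {(1, 3): 0.75, (1, 4): 0.25, (2, 4): 1.0}

after_forward = forward_update(dict(initial), contributions)
restored = inverse_update(dict(after_forward), contributions)
```

In this simplified form all contributions are kept in one table; an actual inverse transformation would recompute them node by node while descending the layers.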
It is clear that these examples are also special cases of the mathematical expressions 6, 7. When harmonic analysis is carried out by the mathematical expressions 15, 16, the data value of a parent node is more strongly affected by a child node having a stronger connection rate, so that harmonic analysis suitable for a weighted graph can be carried out.
Graphs 1601, 1602 show states before and after the step S1404 shown in
In each multi-resolution processing, the sum of squares of the output (that is, the total of the sum of squares of the high resolution transformation coefficients and the sum of squares of the data values of nodes on the (n+1)th layer from the lowermost layer) is equalized to the sum of squares of the input (that is, the sum of squares of the data values of nodes on the nth layer from the lowermost layer). Hereby, the total of the sum of squares of the high resolution transformation coefficients and the sum of squares of the data values of the uppermost nodes, which are both the output of the harmonic analysis, can be equalized to the sum of squares of the data values of the nodes which are the input of the harmonic analysis. The property that the sum of squares of data values is kept before and after the harmonic analysis is called Parseval's equality, and harmonic analysis that meets this property is useful in data processing. For example, as the ratio of the sum of squares of the noise included in the input data values to the sum of squares of the components other than the noise (hereinafter called signal components) is also kept after harmonic analysis, the quantity of noise can be easily estimated using the values after the harmonic analysis. This property is also one of the important properties for high-performance removal of noise. Orthogonal transformation is guaranteed to meet Parseval's equality, whereas non-orthogonal transformation generally does not. However, processing that meets Parseval's equality is also enabled in non-orthogonal transformation by equalizing the sum of squares of the input and the sum of squares of the output in each multi-resolution processing, as in this embodiment.
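As a rough illustration of equalizing the sums of squares in one multi-resolution step, the following Python sketch rescales the output of a non-orthogonal step by a single factor. The uniform rescaling, the function names and the toy transform (parent value equal to the mean, coefficients equal to deviations from the mean) are all assumptions made for illustration; the embodiment's own equalization procedure is not reproduced here.

```python
import math


def energy(values):
    """Sum of squares of a sequence of data values."""
    return sum(x * x for x in values)


def rescale_to_input_energy(inputs, coeffs, parents):
    """Scale the outputs so that their total energy equals the input energy.

    Returns the scaled coefficients, the scaled parent values and the scale
    factor, which would have to be kept for the inverse transformation.
    """
    e_in = energy(inputs)
    e_out = energy(coeffs) + energy(parents)
    if e_out == 0.0:
        return list(coeffs), list(parents), 1.0
    scale = math.sqrt(e_in / e_out)
    return [scale * d for d in coeffs], [scale * c for c in parents], scale


# Toy non-orthogonal step (hypothetical): one parent holding the mean,
# one coefficient per child holding the deviation from the mean.
inputs = [3.0, 4.0]
mean = sum(inputs) / len(inputs)
parents = [mean]
coeffs = [x - mean for x in inputs]

scaled_coeffs, scaled_parents, scale = rescale_to_input_energy(inputs, coeffs, parents)
```

Before rescaling, the output energy of this toy step (12.75) differs from the input energy (25), which is the situation the equalization is meant to correct.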
A graph 1701 shows an example of a hierarchical graph before harmonic analysis is carried out. Besides, a graph 1702 shows a hierarchical graph after harmonic analysis is applied to the graph 1701. A high resolution transformation coefficient for an edge that connects a node vj and its parent node vk is represented as dj(k). In the graph 1701, nodes v5, v6, v7 before harmonic analysis have no data value. Therefore, when data values of the nodes v5, v6, v7 are in an initialized state in calculating transformation coefficients with the nodes v5, v6, v7 as a parent node in the step S1404, trivial values are calculated as high resolution transformation coefficients. In the example shown in the mathematical expression 8, as initial values of the mass of the nodes v5, v6, v7 are zero, the high resolution transformation coefficients are necessarily zero. The high resolution transformation coefficients having the trivial value shall be deleted after harmonic analysis. (As such high resolution transformation coefficients include no information, they may be deleted.)
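The deletion of trivial coefficients can be sketched in Python as follows. The edge list, the connection weights and the coefficient values are hypothetical (the actual topology of the graph 1701 is defined by the drawing), and the additive mass update is an assumption. The sketch marks an edge as trivial when its parent node still has zero mass at the moment the edge is processed, and then removes the corresponding coefficients.

```python
def trivial_edges(edges, lowermost):
    """Return (child, parent) pairs whose parent had zero mass when processed."""
    mass = {v: 1.0 for v in lowermost}  # lowermost nodes start with mass 1
    trivial = []
    for child, parent, weight in edges:
        if mass.get(parent, 0.0) == 0.0:
            trivial.append((child, parent))
        # Assumed additive mass update with s_{v,k} = w_k(v) * s_v.
        mass[parent] = mass.get(parent, 0.0) + weight * mass.get(child, 0.0)
    return trivial


def drop_trivial(coeffs, trivial):
    """Delete the coefficients that belong to trivial edges."""
    skip = set(trivial)
    return {edge: d for edge, d in coeffs.items() if edge not in skip}


# Hypothetical hierarchical graph: v1..v4 lowermost, v5 and v6 above, v7 on top.
edges = [
    (1, 5, 1.0), (2, 5, 1.0), (3, 5, 1.0),
    (4, 5, 0.5), (4, 6, 0.5),
    (5, 7, 1.0), (6, 7, 1.0),
]
trivial = trivial_edges(edges, lowermost=[1, 2, 3, 4])

# Hypothetical coefficient values d_j(k), keyed by (j, k).
coeffs = {(c, p): 0.0 if (c, p) in trivial else 1.0 for c, p, _ in edges}
kept = drop_trivial(coeffs, trivial)
```

With this hypothetical edge order the trivial edges are (1, 5), (4, 6) and (5, 7), and four coefficients remain after deletion.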
In the graph 1702, d1(5), d4(6) and d5(7) are high resolution transformation coefficients having the trivial value and they are deleted from
In the case of a tree structure, the number of all nodes is equal to the number of edges plus 1. Therefore, the sum of the number of high resolution transformation coefficients and the number of data values of the uppermost node (the latter is equal to 1), which are the output of harmonic analysis, is equal to the number of nodes having a data value, which are the input of harmonic analysis. That is, the number of output values and the number of input values coincide. In the meantime, in this embodiment, a graph structure in which each node is connected to plural parent nodes can also be represented. At this time, as described referring to
In a step S1810, a degeneration process described later is applied to the high resolution transformation coefficients calculated in the step S104. The degeneration process may be applied not only to the high resolution transformation coefficients but also to the data value of the uppermost node. Data after the removal of noise, shown in an image 1802, is acquired from data before the removal of noise, shown in an image 1801, by performing the inverse transformation shown in a step S1811 after the degeneration process. The image may also consist of plural images, as shown in 1803. In this case, one graph structure representing the plural images is generated using each pixel as a data source, and harmonic analysis and the degeneration process are carried out. When the images are strongly related, a better result can be expected than in a case where noise is removed from each image separately.
An image 1804 shows an example of the image after noise is removed from the image 1803. In addition, as the information volume of the transformation coefficients is reduced by the degeneration process and signal components can be efficiently represented with a small information volume, a similar flow can also be used for data compression.
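To give a feel for how a degeneration step can both suppress noise and reduce information volume, the following Python sketch applies soft thresholding to a list of coefficients. Soft thresholding is a common shrinkage rule chosen here only for illustration; whether the degeneration process of the embodiment takes this form is an assumption, and the coefficient values and the threshold are hypothetical.

```python
def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`; small ones become 0."""
    shrunk = []
    for d in coeffs:
        magnitude = abs(d) - threshold
        if magnitude <= 0.0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if d > 0 else -magnitude)
    return shrunk


# Hypothetical high resolution transformation coefficients.
coeffs = [5.0, -0.25, 0.5, -3.0]
shrunk = soft_threshold(coeffs, 1.0)

# Only nonzero coefficients need to be stored, which is why a similar flow
# can serve for data compression.
nonzero = sum(1 for d in shrunk if d != 0.0)
```

Here the two small coefficients are set to zero and the two large ones are kept with reduced magnitude, so only two values remain to be stored.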
Processing for inverse transformation shown in the step S1811 can be realized by the flow shown in
Steps S101 to S104 are the same as the steps S101 to S104 shown in
As described above, by estimating a missing data value and performing semi-supervised data classification using the harmonic analysis of complex data, performance higher than that of the conventional type method can be expected.
In a step S2101, the data acquired in the step S101 is classified. Next, steps S2102 to S2106 are repeated each time a new data source is acquired. In the step S2102, the next new data source is acquired. In the step S2103, similarity between the new data source and the other data sources is acquired. In the step S2104, the hierarchical graph is updated based upon the similarity acquired in the step S2103. Concretely, a node corresponding to the new data source is added to the hierarchical graph. In the step S2105, data is classified using the updated hierarchical graph. At this time, only the data of the new data source may be classified, or the data of the other data sources may be classified again. In the step S2106, termination is determined.
As described above, by dynamically performing semi-supervised data classification, processing can be performed before all data are acquired, and therefore high-speed classification can be performed. This embodiment is suitable for a case in which a short classification time is required.
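The control flow of the steps S2102 to S2106 can be sketched in Python as below. This is only a skeleton under strong simplifications: a flat neighbor list stands in for the hierarchical graph, and a similarity-weighted vote stands in for the classification based on harmonic analysis. The function name, the similarity function and the sample data are hypothetical.

```python
def classify_stream(labeled, stream, similarity, lower_limit=0.0):
    """Classify data sources one by one as they arrive.

    labeled: dict name -> (value, label) for the data classified in S2101.
    stream:  iterable of (name, value) for newly acquired data sources.
    """
    values = {name: value for name, (value, _) in labeled.items()}
    labels = {name: label for name, (_, label) in labeled.items()}
    graph = {name: [] for name in labeled}
    for name, value in stream:  # S2102: acquire the next data source
        # S2103: similarity between the new source and the existing ones
        sims = {other: similarity(value, values[other]) for other in values}
        # S2104: add a node and connect it to sufficiently similar nodes
        graph[name] = [(o, s) for o, s in sims.items() if s >= lower_limit]
        values[name] = value
        # S2105: classify only the new source (stand-in: weighted vote)
        votes = {}
        for other, s in graph[name]:
            label = labels.get(other)
            if label is not None:
                votes[label] = votes.get(label, 0.0) + s
        labels[name] = max(votes, key=votes.get) if votes else None
    # S2106: the loop terminates when the stream is exhausted
    return labels, graph


def inverse_distance(a, b):
    """Hypothetical similarity: larger when the two values are closer."""
    return 1.0 / (1.0 + abs(a - b))


labeled = {"a": (0.0, "low"), "b": (10.0, "high")}
stream = [("c", 1.0), ("d", 9.0)]
labels, graph = classify_stream(labeled, stream, inverse_distance)
```

In this sketch only the new data source is classified in each pass; reclassifying the other data sources, which the flow also allows, would require an additional pass over all nodes.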
In addition, the area 2240 is provided with an area 2242 for setting values related to similarity. This area has a field 2211 for setting a lower limit of the similarity at which nodes are connected. When the similarity is lower than the value specified in the field 2211, the corresponding nodes can be set not to be connected. Further, the area 2242 has a field 2212 for setting the relation between similarity and a connection rate. In the field 2212, an interface that enables a value to be adjusted visually using a graph and an interface in which a relational expression is directly described can be used. Not all parameters are necessarily independent; some may be mutually related. A function for interlocking parameters which are not independent, and automatically updating the other values as necessary when one value is set, may also be provided.
An area 2243 is an area for setting processing conditions when noise is removed after harmonic analysis is applied to the graph. In the area 2243, parameters related to noise removal processing are set. This area 2243 has a field 2221 for setting noise removal intensity and a field 2222 for setting the number of repetitions when noise is removed by an iterative method.
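The roles of the lower limit of similarity and of the relation between similarity and connection rate can be sketched in Python as follows. The function name, the default identity relation, the optional normalization and the sample values are assumptions for illustration; the embodiment leaves the relational expression to the user's setting.

```python
def connection_rates(similarities, lower_limit, relation=lambda s: s, normalize=True):
    """Map similarities to connection rates for one child node.

    similarities: dict parent name -> similarity to the child node.
    lower_limit:  parents with a similarity below this value are not connected.
    relation:     relational expression from similarity to connection rate.
    normalize:    if True, scale the rates so that they sum to 1.
    """
    rates = {p: relation(s) for p, s in similarities.items() if s >= lower_limit}
    if normalize:
        total = sum(rates.values())
        if total > 0:
            rates = {p: r / total for p, r in rates.items()}
    return rates


# Hypothetical similarities between one child node and three candidate parents.
similarities = {"p1": 0.75, "p2": 0.25, "p3": 0.125}

rates = connection_rates(similarities, lower_limit=0.2)
squared = connection_rates(
    similarities, lower_limit=0.2, relation=lambda s: s * s, normalize=False
)
```

The candidate parent p3 falls below the lower limit and is left unconnected; the second call shows a different relational expression with rates that are not required to sum to 1.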
Moreover, a button “determine” 2233 and a button “clear” 2234 are displayed on the user interface screen 2200. When the setting of each condition is finished in the node setting area 2241, the similarity setting area 2242 and the noise removal processing condition setting area 2243, clicking the button “determine” 2233 applies harmonic analysis to the plural data pieces on the set conditions, and noise removal processing is applied to the result. In the meantime, when a condition set in the node setting area 2241, the similarity setting area 2242 or the noise removal processing condition setting area 2243 is to be changed, an individual condition or all the conditions collectively can be erased by clicking the button “clear” 2234.
In addition, when the noise removal processing is not required, the processing of data is executed by clicking the button “determine” 2233 after each item is set in the area 2240.
An area 2250 of the user interface screen 2200 is an image display area in which an image after noise is removed is displayed. In an example shown in
The present invention made by the present inventors has been concretely described based upon the embodiments. However, the present invention is not limited to the embodiments, and needless to say, various variations are allowed within a scope which does not deviate from its object.
201 . . . linear structure, 202 . . . circular structure, 210 . . . image to noise removal target, 220 . . . sensor network, 221 . . . sensor, 301 . . . data acquisition unit, 302 . . . similarity acquisition unit, 303 . . . data base, 304 . . . input/output unit, 305 . . . control unit, 306 . . . hierarchical graph generation unit, 307 . . . connection rate calculation unit, 308 . . . harmonic analysis unit, 309 . . . data processing unit.
Number | Date | Country | Kind |
---|---|---|---|
2012-189710 | Aug 2012 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2013/068517 | 7/5/2013 | WO | 00 |