Method and Apparatus for Automatic Pattern Analysis

Information

  • Patent Application
  • Publication Number
    20080097991
  • Date Filed
    August 01, 2005
  • Date Published
    April 24, 2008
Abstract
A method and apparatus are disclosed for pattern analysis by arranging given data so that high-dimensional data can be more effectively analyzed. The method allows arrangements of given data so that patterns can be discovered within the data. By utilizing maps that characterize the data and the type or the set it belongs to, the method produces many data items from relatively few input data items, thereby making it possible to apply statistical and other conventional data analysis methods. In the method, a set of maps from the data or part of the data is determined. Then, new maps are generated by combining existing maps or applying certain transformations on the maps. Next, the results of applying the maps to the data are examined for patterns. Optionally, certain strong patterns are chosen, idealized, and propagated backwards to find data reflecting that pattern.
Description

DESCRIPTION OF DRAWINGS


FIG. 1 shows a flow chart of the method to discover patterns in data.



FIG. 2 shows the flowchart of the exploration algorithm.



FIG. 3 schematically shows the data structure FC and substructures used in FC.



FIG. 4 shows the flowchart of the process of idealization.





BEST MODE

In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It may be evident, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate description of the present invention. It is to be understood that the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. Preferably, the present invention is implemented in software as an application program tangibly embodied on a program storage device. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (CPU), a random access memory (RAM), and input/output (I/O) interface(s). The computer platform also includes an operating system and micro instruction code. The various processes and functions described herein may either be part of the micro instruction code or part of the application program (or a combination thereof) which is executed via the operating system. In addition, various other peripheral devices may be connected to the computer platform such as an additional data storage device and a printing device. It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying Figures are preferably implemented in software, the actual connections between the system components (or the process steps) may differ depending upon the manner in which the present invention is programmed. 
Given the teachings of the present invention provided herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present invention.


Data

Here, an embodiment of the present invention to analyze data is presented. For clarity's sake, a level of abstraction is maintained that is common and well-known to those skilled in the related art; for instance, sets and maps are represented as, or approximated by, data on an information system.


To illustrate how frequency or probability is handled in the present invention, a data structure called frequency count is herein disclosed. It is a concrete way to model the simple counting probability measures on a set. In this embodiment, all data is represented as a frequency count on some set.


In the following, for any set A, a frequency count on A means data that keeps track of members of A and their numbers. It is treated as a subset of A×N, where N={1,2,3, . . . } is the set of natural numbers, such that no member of A appears more than once. The set of frequency counts on A is denoted by Freq(A). Thus a frequency count on A, i.e., a member F of Freq(A), is a set of pairs (a,n), where a is a member of A and n is a natural number, such that if (a,n) is in F, no other member of the form (a,m) is in F. These pairs in frequency counts are hereinafter called the particles. For a member a of A and a frequency count F on A, the count of a, denoted by countF(a), is defined to be n, if there is a particle of the form (a,n) in F, and 0 otherwise; mass(F), the mass of F, is defined by the sum of countF(a) for all a in A; and PF(a), the probability of a, is defined by countF(a) divided by mass(F). The support supp(F) of F is defined to be the subset of A that consists of the members a with countF(a)>0. The entropy H(F) of F is defined by the sum of −PF(a) log2PF(a) for all a in supp(F).
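For illustration only (not part of the disclosure), the definitions above can be sketched in Python by representing a frequency count as a dict mapping members to positive integer counts; the sample data and helper names are illustrative.

```python
import math

# A frequency count F on a set A, sketched as a dict mapping members of A
# to positive integer counts (so no member appears more than once).
F = {"red": 6, "green": 2, "blue": 2}

def count(F, a):
    # count_F(a): the count of a, or 0 if no particle (a, n) is in F
    return F.get(a, 0)

def mass(F):
    # mass(F): the sum of count_F(a) over all a
    return sum(F.values())

def prob(F, a):
    # P_F(a) = count_F(a) / mass(F)
    return count(F, a) / mass(F)

def support(F):
    # supp(F): the members with positive count
    return set(F)

def entropy(F):
    # H(F): the sum of -P_F(a) * log2(P_F(a)) over supp(F)
    return -sum(prob(F, a) * math.log2(prob(F, a)) for a in support(F))

print(mass(F), prob(F, "red"))  # 10 0.6
```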


The following should be noted for later reference:


[FC I] From two frequency counts F on A and G on B, another frequency count (the product) F×G on A×B may be generated as follows: F×G is a subset of (A×B)×N that consists of particles ((a,b),nm) for all combinations of particles (a,n) in F and (b,m) in G. This corresponds to the product probability measure.


[FC II] When there is a map f:A→B, a map f*:Freq(A)→Freq(B) of frequency counts is defined as follows: For a frequency count F, f*(F) is a subset of B×N that consists of particles (b,n) such that at least one particle (a,m) in F with b=f(a) exists and n is the sum of m's in all such particles (a,m). In other words, the set f*(F) is made by adding (f(a),m) for all (a,m) in F and then replacing (b,i) and (b,j) of the same b by (b,i+j) until there are no distinct particles that have the same first component. This corresponds to the induced probability measure.


[FC III] If A⊃B, then Freq(A)⊃Freq(B), i.e., a frequency count on B is automatically a frequency count on A. When A⊃B and F is a frequency count on A, the restriction F|B of F to B is a frequency count on B (and therefore on A) that consists of all the particles (a,n) in F such that a is in B.


[FC IV] Two frequency counts F and G on A are said to be equivalent if there is a number m>0 such that countF(a)=m countG(a) for all a in A. If F and G are equivalent, various properties hold: mass(F)=m mass(G), supp(F)=supp(G), PF(a)=PG(a) for all a in A, and H(F) =H(G).


[FC V] For a set A, the standard frequency count St(A) on A is defined as the subset of A×N consisting of one particle (a,1) for each a in A. Note that, according to this definition and [FC I], St(A)×St(B) is identical to St(A×B).
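For illustration only, the operations [FC I], [FC II], [FC III], and [FC V] admit a direct sketch when a frequency count is represented as a Python dict from members to counts; the helper names are illustrative.

```python
from collections import Counter

def product(F, G):
    # [FC I] F x G: a particle ((a, b), n*m) for each (a, n) in F and (b, m) in G
    return {(a, b): n * m for a, n in F.items() for b, m in G.items()}

def pushforward(f, F):
    # [FC II] f*(F): add (f(a), m) for all (a, m) in F, merging counts of
    # particles that share the same first component
    out = Counter()
    for a, m in F.items():
        out[f(a)] += m
    return dict(out)

def restrict(F, B):
    # [FC III] F|B: the particles of F whose member lies in B
    return {a: n for a, n in F.items() if a in B}

def standard(A):
    # [FC V] St(A): one particle (a, 1) for each a in A
    return {a: 1 for a in A}

F = {1: 2, 2: 3}
G = {"x": 1, "y": 4}
print(product(F, G))                    # {(1, 'x'): 2, (1, 'y'): 8, (2, 'x'): 3, (2, 'y'): 12}
print(pushforward(lambda a: a % 2, F))  # {1: 2, 0: 3}
print(restrict(F, {2}))                 # {2: 3}
```

As noted in [FC V], `product(standard(A), standard(B))` coincides with the standard frequency count on A×B.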


Primitive Maps

The set of primitive maps includes all the maps listed in [PM I] and onward.


Derived Data and Maps

Based on the loaded data and the primitive maps, other data and maps are generated to explore the possibilities of various sets that characterize the data. In the beginning, there is the input data represented as a frequency count on sets. Thus the system begins by trying possible maps that can be applied to the sets. The result of applying such maps to existing data is new data. More specifically, the process keeps the following data structures:

    • A data structure FC that stores a representation of frequency counts. It begins with the input data represented as frequency counts; and the standard frequency count St(A) (see [FC V]) for any set A that appears as a component of the set which the input data is on (i.e., if the input data is a frequency count on A×(B→C), the standard frequency counts on A, B, C, B→C, and A×(B→C) would be in FC.) It also includes the standard frequency counts on some standard sets such as bool and unit.
    • A data structure SETS that stores the symbolic representations of sets. It begins with the sets the frequency counts in FC are on.
    • A data structure MAPS that stores the symbolic representations of maps. It begins with the primitive maps in it.


As the process continues, more members are added to FC, SETS and MAPS, in one of the following ways:


[D I] If a pair of frequency counts F and G are already in FC, F×G may be added to FC (see [FC I].) Similarly for three or more frequency counts.


[D II] If any map in MAPS can be applied to some map(s) in MAPS (e.g., [PM III], [PM IV], [PM V], [PM VI], and [PM XII]), the resulting map may be added to MAPS. For instance, some pair of maps may be chosen and either their product or, if applicable, their concatenation may be added to MAPS; or any map may be applied to other maps and the result may be added to MAPS.


[D III] A subset of a set in SETS can be added to SETS. A frequency count may be restricted to a subset. An inverse image of a subset can be added to SETS. For a subset B of A, the subset classifier map subsetB:A→bool (defined by subsetB(a)=true if aεB and false otherwise) may be added to MAPS.


[D IV] If a frequency count F on a set A is in FC and a map f:A→B is in MAPS, f*(F) may be added to FC (see [FC II].) If this rule is used to add a frequency count, FC also records the map that was used.


Note that the sets can be considered to make a directed graph structure by taking sets as nodes and maps as edges. The frequency counts on the sets can also be considered to make a directed graph structure by taking frequency counts as nodes and maps as edges.


These maps and data can be explored and added to the data structures in various orders. For instance, a breadth-first search order could be used on the graph structure mentioned above. In this embodiment, a stochastic search algorithm is used:


Exploration Algorithm


Outline


Stochastically execute one of the actions from 1 to 6 below:

  • 1. Choose a pair of frequency counts F and G in FC and add F×G to FC. Add A×B to SETS, where A and B are the sets F and G are on, respectively.
  • 2. Choose and apply a map in MAPS that can be applied to map(s) according to [D II], add the result to MAPS.
  • 3. Choose a set A in SETS, add a proper subset B of A to SETS and add subsetB:A→bool to MAPS.
  • 4. Choose in FC a frequency count F. Choose a proper subset B of A in SETS, where A is the set F is on. Add F|B to FC.
  • 5. Choose a map f:A→B in MAPS and a proper subset C of B in SETS. Add the inverse image f−1(C) to SETS.
  • 6. Choose a frequency count F in FC and a map f in MAPS from the set that F is on to some other set. Add f*(F) to FC.


Details



FIG. 2 shows the flowchart of the exploration algorithm. The choice of the action taken and the choice of the objects of the action are done stochastically.


Each frequency count, set, and map in FC, SETS, and MAPS is assigned an integral weight. In the beginning, the input data has the weight 1000; all others are given the weight 100.


For each frequency count or map, a set of eligible objects are defined as follows: For a frequency count F on a set A, its set EO(F) of eligible objects consists of all the frequency counts in FC and all proper subsets of A in SETS. For a map f:A→B, its set EO(f) of eligible objects consists of all maps in MAPS to which f can be applied, all proper subsets of B in SETS, and all frequency counts on A.


Each time the exploration algorithm is invoked, a frequency count, a set, or a map is chosen from FC, SETS, and MAPS (201). The probability of a choice is proportional to its weight, except in the case of a set, where it is proportional to 200 divided by the number of members in the set.


If a frequency count F on a set A is chosen, another frequency count G or a proper subset B of A is chosen from EO(F) with a probability proportional to its weight (202). If G on a set C is chosen, F×G is added to FC and A×C to SETS (203). F×G is given the weight equal to the larger of the weights of F and G. A×C is given the weight equal to the larger of the weights of A and C. If B is chosen, F|B is added to FC (204) and given the weight equal to the larger of the weights of F and B.


If a set A is chosen, its subset B is randomly chosen and added to SETS and given the weight of 100. The subset map subsetB:A→bool is also added to MAPS with the weight of 100 (205).


If a map f:A→B is chosen, a frequency count F on A, a proper subset C of B, or a map g is chosen from EO(f) with a probability proportional to its weight (206). If a frequency count F is chosen, f*(F) is added to FC (207), and given a weight equal to the larger of the weights of f and F. If a proper subset C of B is chosen, f−1(C) is added to SETS (208) and given the same weight as C; if a map g is chosen, f(g) is added to MAPS (209), and given the weight equal to the larger of the weights of f and g.
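The weight-proportional choices in steps 201, 202, and 206 amount to sampling from a discrete distribution. A minimal sketch follows; the object names and weights are illustrative only.

```python
import random

# Weight-proportional choice among stored objects, as in steps 201/202/206.
# The object names and weights below are illustrative.
weights = {"Im": 1000, "St(Dom)": 100, "St(Col)": 100}

def choose_weighted(weights, rng):
    # Pick one key with probability proportional to its weight.
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]

rng = random.Random(0)  # seeded only to make the sketch reproducible
picks = [choose_weighted(weights, rng) for _ in range(12000)]
# The input data (weight 1000) is chosen about ten times as often as the others.
```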


Particle Record



FIG. 3 schematically shows the data structure FC and the substructures used in FC. The data structure FC (301) contains a record for each frequency count (302, 303). The record (302) for a frequency count F on a set A contains the information on A (304), the map, the idealization (see below,) or the restriction to a subset that caused F (305), the weight w(F) (an integer) for F (306), and information on the particles in F (307). The particles record (307) keeps track of the particles, stochastically estimating if necessary. It contains the type of the particles record (308), the mass of F (309), and a data structure that stores explicit records of particles (310). The type of the particles record (308) has one of the values: standard, product, or explicit. For a standard frequency count on a set, the particles record has the type standard. For a product frequency count, the type is product. For these types of particles, no explicit record of the particles is kept, since any information can be readily obtained from the definition of these frequency counts. Otherwise, the particles record has the type explicit. This type of particles record stores explicit records of the particles. For a particle (a,n) in a frequency count F on a set A, where a is a member of A and n>0 is an integer, the explicit record for the particle (311) stores a and n in the fields member (312) and count (313), respectively. A constant MAXPARTICLE is used below. Though it should be determined according to factors such as the kind of input data and the available resources, MAXPARTICLE=100000 is given here for the sake of concreteness.


When the input data is received and represented as a frequency count, the system creates a particle record (311) for each particle in the frequency count and stores it in the particles record (310); the type (308) is set to explicit. The sum of the count fields (313) of the particles in the particles record (310) is stored in the mass field (309).


When a result of applying a map f to a frequency count F on a set A is added to FC, in the record (302) that is created in FC for the result, the type is set to explicit. If the number of particles in F is more than MAXPARTICLE, only MAXPARTICLE particles are stochastically chosen with the probability proportional to their count; otherwise, all particles in F are chosen. For each chosen particle (a,n), the member f(a) is computed. If an explicit particle record (311) with the member field (312) containing f(a) is already there, its count field (313) is increased by n; otherwise, an explicit particle record (311) is created with the member field (312) containing f(a) and the count field (313) set to n.
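The map-application step above, including the stochastic sampling when the number of particles exceeds MAXPARTICLE, might be sketched as follows, with frequency counts as dicts from members to counts. The sampling here draws with replacement and counts each drawn particle once, which is one simple reading of the stochastic choice described in the text.

```python
import random
from collections import Counter

MAXPARTICLE = 100000  # as in the text; set according to available resources

def apply_map(f, F, rng=None):
    # Compute f*(F) for a frequency count F (dict: member -> count).
    # If F has more than MAXPARTICLE particles, MAXPARTICLE are chosen
    # stochastically with probability proportional to their counts
    # (with replacement, each draw counted once); otherwise all are used.
    items = list(F.items())
    if len(items) > MAXPARTICLE:
        rng = rng or random
        members = [a for a, _ in items]
        counts = [n for _, n in items]
        items = [(a, 1) for a in rng.choices(members, weights=counts, k=MAXPARTICLE)]
    out = Counter()
    for a, n in items:
        out[f(a)] += n  # merge counts of particles with the same image f(a)
    return dict(out)

F = {i: 1 for i in range(10)}
print(apply_map(lambda a: a % 3, F))  # {0: 4, 1: 3, 2: 3}
```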


Patterns

In this embodiment, the method iterates the Exploration Algorithm and then checks for patterns (data and map) in the frequency counts in FC. This is done by calculating the entropy H(F) for any frequency count F that has been updated in the current iteration, if any. The entropy is normalized by subtracting it from the entropy of the frequency count that is created by sending, by the same map that created F, the standard frequency count on the original set. Thus, if a frequency count F on A is created by sending the frequency count G on B, by a map f:B→A, i.e., F=f*(G), the quantity J(f,F)=H(f*(St(B)))−H(F) is computed. When a frequency count with J(f,F) higher than a threshold value is found, the map f and the frequency count that led to the frequency count are marked as a pattern and used (e.g., output, backtracked) in the later stages; also the map and the frequency count each gets its weight value increased by 100. The threshold value should be determined according to the application and other factors, such as the available resources. As a benchmark of the presence of patterns other than J(f,F), another possibility is the relative entropy (also known as Kullback-Leibler divergence). For two frequency counts F and G, the relative entropy D(F,G) is the sum of PF(a) log2[PF(a)/PG(a)] for all a in supp(F). Instead of finding a high J(f,F), a low D(F,f*(St(B))) may be looked for.
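For illustration only, the two benchmarks, J(f,F) and the relative entropy, can be sketched with frequency counts as dicts from members to counts; the sample data is hypothetical, and the KL divergence below is the standard (non-negated) form.

```python
import math

def entropy(F):
    # H(F) for a frequency count F given as a dict member -> count
    m = sum(F.values())
    return -sum(n / m * math.log2(n / m) for n in F.values())

def pushforward(f, F):
    # f*(F): merge counts of particles mapped to the same member
    out = {}
    for a, n in F.items():
        out[f(a)] = out.get(f(a), 0) + n
    return out

def J(f, G, B):
    # J(f, F) = H(f*(St(B))) - H(F), with F = f*(G) and St(B) the
    # standard frequency count on the original set B
    F = pushforward(f, G)
    St_B = {b: 1 for b in B}
    return entropy(pushforward(f, St_B)) - entropy(F)

def relative_entropy(F, G):
    # Kullback-Leibler divergence: sum of P_F(a) log2[P_F(a)/P_G(a)] over supp(F)
    mF, mG = sum(F.values()), sum(G.values())
    return sum((n / mF) * math.log2((n / mF) / (G[a] / mG)) for a, n in F.items())

# A map whose pushforward concentrates the counts scores a high J:
B = range(8)
G = {0: 97, 1: 1, 2: 1, 3: 1}
print(round(J(lambda a: a % 2, G, B), 4))  # 0.8586
```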


In computing the entropy of various frequency counts, various relationships are employed to reduce the computation cost:

    • For evaluation map ev:(A→B)×A→B, the frequency count ev*(St(A→B)×St(A)) is equivalent to St(B), thus H(ev*(St(A→B)×St(A)))=H(St(B)). This is important for efficiency since sets of maps tend to be large.
    • For any frequency counts F and G, H(F×G)=H(F)+H(G).
    • For any frequency counts F on A and G on B, and maps f:A→B and g:C→D, it holds (f×g)*(F×G)=f*(F)×g*(G), thus H((f×g)*(F×G))=H(f*(F))+H(g*(G)).
    • For a projection map projA:A×B→A and frequency counts F on A and G on B, projA*(F×G) is equivalent to F. Thus H(projA*(F×G))=H(F).
    • For an injection f:A→B, i.e., a map f such that a≠b implies f(a)≠f(b), and a frequency count F on A, it holds H(f*(F))=H(F).
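The product relation H(F×G)=H(F)+H(G), for example, can be checked numerically under a dict representation of frequency counts (illustrative sketch only):

```python
import math

def entropy(F):
    # H(F) for a frequency count F given as a dict member -> count
    m = sum(F.values())
    return -sum(n / m * math.log2(n / m) for n in F.values())

def product(F, G):
    # [FC I] product frequency count F x G
    return {(a, b): n * m for a, n in F.items() for b, m in G.items()}

F = {"x": 1, "y": 3}
G = {0: 2, 1: 2}
lhs = entropy(product(F, G))
rhs = entropy(F) + entropy(G)
print(abs(lhs - rhs) < 1e-9)  # True
```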


Backtrack

When a frequency count F with low entropy is found, a process of idealization takes place. That is a process of creating another frequency count F′ by removing some particles from F so that its entropy would be even lower.



FIG. 4 shows the flowchart of the process of idealization. It takes a frequency count F and returns the idealized frequency count F′. First (401), F is copied to a new frequency count F′. Then, in a loop, the entropy of F′ is computed (402) and if it is lower than a predetermined value, the process terminates and returns F′ as a return value. Otherwise, a particle (a,n) in F′ with the lowest count n is found in F′ (403) and removed (404). Then the loop returns to 402. The predetermined value of entropy should be determined according to the application.
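The idealization loop (steps 401-404) might be sketched as follows, assuming frequency counts as dicts and an illustrative threshold:

```python
import math

def entropy(F):
    m = sum(F.values())
    return -sum(n / m * math.log2(n / m) for n in F.values())

def idealize(F, threshold):
    # 401: copy F; 402: compute entropy of the copy; 403/404: remove a
    # particle with the lowest count; repeat until the entropy drops
    # below the predetermined threshold.
    Fp = dict(F)
    while len(Fp) > 1 and entropy(Fp) >= threshold:
        del Fp[min(Fp, key=Fp.get)]
    return Fp

F = {"a": 50, "b": 30, "c": 2, "d": 1}
print(idealize(F, 1.0))  # {'a': 50, 'b': 30}
```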


Next, the particles still left in F′ are backtracked. Let the map that caused F be f:A→B, i.e., F=f*(G) for some frequency count G on a set A. A particle (b,n) in F′ is made by combining the particles of the form (f(a),ma) (see [FC II].) Let f*−1(F′) be the inverse image of F′ by f, which is the restriction of G to f−1(supp(F′)) (see [FC III].) That is, (a,m) in G belongs to f*−1(F′) if and only if countF′(f(a))>0. If f has been made by concatenating more than one map, e.g., f=f1∘f2∘ . . . ∘fk, there will be a series of frequency counts such as fk*−1(F′), (fk−1∘fk)*−1(F′), and so on. These frequency counts are added to FC along with the information as to how they are created (e.g., the idealization, the taking of inverse image) and the same weight as that of F. They are then treated in the same way as other frequency counts in FC.
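The inverse image f*−1(F′), i.e., the restriction of G to f−1(supp(F′)), can be sketched as follows (dict representation, hypothetical data):

```python
def pushforward(f, G):
    # f*(G): merge counts of particles mapped to the same member
    out = {}
    for a, m in G.items():
        out[f(a)] = out.get(f(a), 0) + m
    return out

def backtrack(f, G, F_ideal):
    # f*^-1(F'): keep (a, m) in G if and only if count_{F'}(f(a)) > 0,
    # i.e., restrict G to f^-1(supp(F')) (see [FC II], [FC III])
    return {a: m for a, m in G.items() if F_ideal.get(f(a), 0) > 0}

f = lambda a: a % 2
G = {0: 5, 1: 1, 2: 7, 3: 1}
F = pushforward(f, G)       # {0: 12, 1: 2}
F_ideal = {0: 12}           # idealization kept only the strong particle
print(backtrack(f, G, F_ideal))  # {0: 5, 2: 7}
```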


Finally, if a frequency count F in FC is on a set of maps, i.e., a set that is of the form A→B for some sets A and B, and if relatively few members of the set have higher counts, one or more members of A→B with high counts may be added to MAPS.


Output

The maps that were found as patterns may be used as indicators of useful characteristics or parameters of the original data. As such, they are the output of the embodiment. The part of the data that causes a specific map to be a pattern is found by backtracking and may also be output.


Mode for Invention

This embodiment can be used to analyze various kinds of data. The following examples are intended to illustrate but not limit the use to which this embodiment may be put.


EXAMPLE 1
Image

Data


In this embodiment, an image is loaded from any available image file format and represented in the following way.


The color space is denoted by Col. For a color image, it is generally a three-dimensional real vector space. If the image is a grayscale image, Col is the set of real numbers. For images with a larger spectrum, Col might be a vector space of higher dimension. Here, the only assumption is that it is a real vector space.


The image domain is denoted by Dom and assumed to be some finite subset of a d-dimensional Euclidean space EDom. For instance, an ordinary bitmap image has a domain of m×n lattice points in a 2-dimensional Euclidean space. For other kinds of images, such as 3D medical image data, the dimension would be higher.


An image generally gives colors at each point in the domain. Thus an image can be considered a map from Dom to Col, that is, a member of the set Dom→Col. This embodiment represents the input image by a frequency count on Dom→Col. That is, the initial data is a frequency count Im in Freq(Dom→Col) that contains one particle (im,1), where im:Dom→Col is the map that sends each pixel position to the color in the image.


Primitive Maps


In addition to the general primitive maps, there may be added primitive maps specifically useful for image data. For instance, if the image is in pixels, as is usually the case, the neighbor relationship between pixels may be useful. This is put in the system as a primitive map Nb:Dom×Dom→bool that gives true whenever two members of Dom are neighboring pixels. Another example would be various kinds of filters that are known in the related art of image processing; e.g., a wavelet filter.


Derived Data and Maps


Some examples of simpler maps and data that the method may add to MAPS and FC are:


A. Color frequency

  • A1. By [D I], a frequency count Im×St(Dom) on (Dom→Col)×Dom is added to FC, based on the two frequency counts, Im on Dom→Col and St(Dom) on Dom.
  • A2. By [D IV], ev*(Im×St(Dom)) is added to FC based on Im×St(Dom) from A1 and the evaluation map ev:(Dom→Col)×Dom→Col (which, as a primitive map, is in MAPS.)


The frequency count ev*(Im×St(Dom)) on Col is a set of particles (c,nc), where nc is the number of pixels that have color c.
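For a toy image, this color-frequency computation reduces to a histogram of pixel colors; the sample image below is hypothetical and for illustration only.

```python
from collections import Counter

# Hypothetical 2x2 image: im maps each pixel position in Dom to a color in Col.
im = {(0, 0): "white", (0, 1): "white", (1, 0): "black", (1, 1): "white"}

# Since Im contains the single particle (im, 1), the pushforward
# ev*(Im x St(Dom)) on Col counts, for each color c, the pixels of color c.
color_freq = dict(Counter(im.values()))
print(color_freq)  # {'white': 3, 'black': 1}
```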

B. Color difference and position difference frequency

  • B1. By [D II], a map (mp∘diag)×diag:(Dom→Col)×(Dom×Dom)→(Dom×Dom→Col×Col)×(Dom×Dom)×(Dom×Dom) is added to MAPS, based on the diagonal map diag:(Dom→Col)→(Dom→Col)×(Dom→Col), the product map mp:(Dom→Col)×(Dom→Col)→(Dom×Dom→Col×Col) and the diagonal map diag:Dom×Dom→(Dom×Dom)×(Dom×Dom).
  • B2. By [D II], a map ev×idDom×Dom:(Dom×Dom→Col×Col)×(Dom×Dom)×(Dom×Dom)→(Col×Col)×(Dom×Dom) is added to MAPS, based on the evaluation map ev:(Dom×Dom→Col×Col)×(Dom×Dom)→Col×Col and the identity map on Dom×Dom.
  • B3. By [D II], a map SubCol×DiffDom:(Col×Col)×(Dom×Dom)→Col×VDom is added to MAPS, based on the subtraction in the color space and the difference map in the image domain.
  • B4. Concatenating the three maps added to MAPS in B1, B2, and B3, (SubCol×DiffDom)∘(ev×idDom×Dom)∘((mp∘diag)×diag):(Dom→Col)×(Dom×Dom)→Col×VDom is added to MAPS by [D II].
  • B5. By [D I], a frequency count Im×St(Dom×Dom) on (Dom→Col)×(Dom×Dom) is added to FC.
  • B6. By [D IV], the result of applying the map in B4 to the frequency count Im×St(Dom×Dom) added in B5 is added to FC.


The frequency count added in B6 on Col×VDom is a set of particles ((d,v),nd,v), where nd,v is the number of occurrences of pairs of pixels that i) have the color difference d, and ii) are separated by the vector v in the image domain.

Patterns


The frequency count ev*(Im×St(Dom)) on Col obtained in A2 would have small entropy when there are not too many colors used. If the whole image is one color, it would have entropy of 0, the lowest possible value.


The frequency count added in B6 on Col×VDom would have small entropy when there are many pairs of pixels that have the same particular color difference and are separated by the same vector. If, for instance, there are horizontal lines of one color, there would be a relatively high concentration of particles (particles with high counts) with color difference 0 and horizontal vectors, giving the frequency count lower entropy.


EXAMPLE 2
Data Matrix

A data matrix is a rectangular array with N rows and D columns, the rows giving different observations or individuals and the columns giving different attributes or variables. Each variable can have a value that is a member of some set, which we call here the value set. For instance, if the variable can only take an integral number, the value set is the set of integers. If the variable can take any number, the value set is the set of real numbers. Or if the variable can take the value of “yes” or “no”, the value set can be the set of Booleans.


Let the D variables be denoted by a1,a2, . . . ,aD and the sets in which the variables take values by X1,X2, . . . ,XD, respectively. Then, each observation gives a member in the set X1×X2× . . . ×XD. The input data in the form of a data matrix is represented in this embodiment as a frequency count on X1×X2× . . . ×XD with each observation contributing a single count in one particle. Thus, the mass of the frequency count is N.
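A sketch of this representation for a hypothetical small data matrix (dict representation of frequency counts; the data is illustrative):

```python
from collections import Counter

# Hypothetical data matrix with N=5 observations and D=2 variables,
# value sets X1 = Booleans and X2 = integers.
rows = [(True, 3), (False, 1), (True, 3), (True, 2), (False, 1)]

# Frequency count on X1 x X2: each observation contributes a single count.
F = dict(Counter(rows))
print(F)                # {(True, 3): 2, (False, 1): 2, (True, 2): 1}
print(sum(F.values()))  # mass = N = 5
```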


INDUSTRIAL APPLICABILITY

Thus a method and apparatus have been disclosed to arrange given data so that high-dimensional data can be more effectively analyzed and better pattern discovery within the data is allowed. It is applicable in a wide variety of industries, where more and more data are collected and it is increasingly important to find the relevant information in a vast pile of data. The areas in which the present invention is useful include the case of the large number of genes and relatively few patients with a given genetic disease and the case of images, which can easily have a million dimensions (pixels).


While only certain preferred features of the invention have been illustrated and described herein, many modifications and changes will occur to those skilled in the art. For instance, the concepts such as sets and maps, which have been used herein to explain the present invention, have many equivalent or similar counterparts in diverse disciplines: e.g., function, type, method, etc. The terminologies such as set and map can be avoided entirely if one wishes; the whole invention can be described in terms of data and subroutines. Such superficial differences are, however, not real differences.


It is, therefore, to be understood that the appended claims are intended to cover all such modifications, changes and differences of terminologies as fall within the true spirit of the invention.

Claims
  • 1. A method of pattern analysis, said method comprising the steps of: receiving at least one first data; deriving at least one second data; and seeking pattern within one or more data.
  • 2. The method of claim 1, wherein said step of deriving at least one second data includes at least one of: applying at least one map to at least one third data; taking a product of one or more sets; taking an inverse image of at least one set; and restricting at least one data.
  • 3. The method of claim 2, wherein said at least one map is chosen according to said at least one third data.
  • 4. The method of claim 3, wherein said at least one map is chosen so that said at least one third data belongs to the domain of said at least one map.
  • 5. The method of claim 4, wherein at least one collection is provided to store at least one of: said first data, said second data, and said at least one map; and wherein said at least one third data is chosen from within said collection.
  • 6. The method of claim 5, wherein said at least one map comprises one or more of: an identity map, a constant map, an equality map, a product map, a map that gives the product map of a plurality of maps, a pullback-operation map, a projection map, a diagonal map, a permutation map, a map-concatenation map, an evaluation map, a map that combines a plurality of lower-order maps to give a higher-order map, a currying map, a logical-operation map, a vector-operation map, an order map, a functional-operation map, and a fixed-point-operation map.
  • 7. The method of claim 6, further comprising the step of: generating an ideal data that corresponds to said pattern.
  • 8. The method of claim 7, wherein said step of generating an ideal data that corresponds to said pattern includes at least one of: creating a data with lower entropy; concentrating a probability measure; creating multiple probability measures corresponding to multiple concentrations in a probability measure; and making an approximately repeating pattern repeat more exactly.
  • 9. The method of claim 2, wherein at least one collection is provided to store at least one of: said first data, said second data, and said at least one map; and wherein said at least one third data is chosen from within said collection.
  • 10. The method of claim 2, further comprising the step of: determining at least one pattern map corresponding to said pattern.
  • 11. The method of claim 2, wherein said at least one map comprises one or more of: an identity map, a constant map, an equality map, a product map, a map that gives the product map of a plurality of maps, a pullback-operation map, a projection map, a diagonal map, a permutation map, a map-concatenation map, an evaluation map, a map that combines a plurality of lower-order maps to give a higher-order map, a currying map, a logical-operation map, a vector-operation map, an order map, a functional-operation map, and a fixed-point-operation map.
  • 12. The method of claim 1, further comprising the step of: generating an ideal data that corresponds to said pattern.
  • 13. The method of claim 12, wherein said step of generating an ideal data that corresponds to said pattern includes at least one of: creating a data with lower entropy; concentrating a probability measure; creating multiple probability measures corresponding to multiple concentrations in a probability measure; and making an approximately repeating pattern repeat more exactly.
  • 14. The method of claim 2, further comprising the step of: generating an ideal data that corresponds to said pattern.
  • 15. The method of claim 11, further comprising the step of: generating an ideal data that corresponds to said pattern.
  • 16. A system for pattern analysis, said system comprising: a memory arrangement including thereon a computer program; and a processing arrangement which, when executing said computer program, is configured to: receive at least one first data; derive at least one second data; and seek pattern within one or more data.
  • 17. The system of claim 16, wherein said processing arrangement, when executing said computer program, is configured to derive said at least one second data in at least one of the following manners: applying at least one map to at least one third data; taking a product of one or more sets; taking an inverse image of at least one set; and restricting at least one data.
  • 18. The system of claim 17, wherein said at least one map is chosen so that said at least one third data belongs to the domain of said at least one map.
  • 19. The system of claim 18, wherein at least one collection is provided to store at least one of: said first data, said second data, and said at least one map; and wherein said at least one third data is chosen from within said collection.
  • 20. The system of claim 19, wherein said at least one map comprises one or more of: an identity map, a constant map, an equality map, a product map, a map that gives the product map of a plurality of maps, a pullback-operation map, a projection map, a diagonal map, a permutation map, a map-concatenation map, an evaluation map, a map that combines a plurality of lower-order maps to give a higher-order map, a currying map, a logical-operation map, a vector-operation map, an order map, a functional-operation map, and a fixed-point-operation map.
  • 21. The system of claim 20, wherein said processing arrangement, when executing said computer program, is further configured to: generate an ideal data that corresponds to said pattern.
  • 22. The system of claim 21, wherein said processing arrangement, when executing said computer program, is configured to generate said ideal data that corresponds to said pattern in at least one of the following manners: creating a data with lower entropy; concentrating a probability measure; creating multiple probability measures corresponding to multiple concentrations in a probability measure; and making an approximately repeating pattern repeat more exactly.
  • 23. The system of claim 17, wherein said processing arrangement, when executing said computer program, is further configured to: generate an ideal data that corresponds to said pattern.
  • 24. A software storage medium which, when executed by a processing arrangement, is configured to perform pattern analysis, said software storage medium comprising a software program including: a first module which, when executed, receives at least one first data; a second module which, when executed, derives at least one second data; and a third module which, when executed, seeks a pattern within one or more data.
  • 25. The software storage medium of claim 24, wherein said second module, when executed, derives said at least one second data in at least one of the following manners: applying at least one map to at least one third data; taking a product of one or more sets; taking an inverse image of at least one set; and restricting at least one data.
  • 26. The software storage medium of claim 25, wherein said second module, when executed, chooses said at least one map so that said at least one third data belongs to the domain of said at least one map.
  • 27. The software storage medium of claim 26, wherein said second module, when executed, provides at least one collection to store at least one of: said first data, said second data, and said at least one map; and wherein said at least one third data is chosen from within said collection.
  • 28. The software storage medium of claim 27, wherein said at least one map comprises one or more of: an identity map, a constant map, an equality map, a product map, a map that gives the product map of a plurality of maps, a pullback-operation map, a projection map, a diagonal map, a permutation map, a map-concatenation map, an evaluation map, a map that combines a plurality of lower-order maps to give a higher-order map, a currying map, a logical-operation map, a vector-operation map, an order map, a functional-operation map, and a fixed-point-operation map.
  • 29. The software storage medium of claim 28, wherein said software program further includes: a fourth module which, when executed, generates an ideal data that corresponds to said pattern.
  • 30. The software storage medium of claim 29, wherein said fourth module, when executed, generates said ideal data that corresponds to said pattern in at least one of the following manners: creating a data with lower entropy; concentrating a probability measure; creating multiple probability measures corresponding to multiple concentrations in a probability measure; and making an approximately repeating pattern repeat more exactly.
  • 31. The software storage medium of claim 25, wherein said software program further includes: a fourth module which, when executed, generates an ideal data that corresponds to said pattern.
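The derivation and idealization steps recited in the claims above can be illustrated concretely. The following is a minimal sketch only, not the claimed implementation: the names `apply_maps`, `entropy`, and `concentrate`, and the power-sharpening rule, are hypothetical choices used to show how applying maps whose domain contains a data item yields derived data, and how concentrating a probability measure produces a data with lower entropy, one of the idealization manners recited in claims 13, 22, and 30.

```python
# Illustrative sketch of map application and entropy-lowering idealization.
# All names and the sharpening rule are hypothetical, not from the patent.
import math


def apply_maps(data, maps):
    """Derive new data items by applying each map to every item in its domain.

    `maps` is a list of (function, domain) pairs; an item is mapped only
    when it belongs to the map's domain, as in claims 3, 18, and 26.
    """
    derived = []
    for f, domain in maps:
        for x in data:
            if x in domain:
                derived.append(f(x))
    return derived


def entropy(p):
    """Shannon entropy (bits) of a discrete distribution {value: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def concentrate(p, factor=2.0):
    """Idealize a distribution by sharpening it: raise each probability to a
    power > 1 and renormalize, which concentrates the measure and lowers
    its entropy."""
    powered = {k: q ** factor for k, q in p.items()}
    total = sum(powered.values())
    return {k: q / total for k, q in powered.items()}
```

As a usage example, `apply_maps([1, 2, 3, 4], [(lambda x: x * x, {1, 2, 3})])` derives the squares of the items lying in the map's domain, and `concentrate({"a": 0.5, "b": 0.25, "c": 0.25})` returns a sharper distribution whose entropy is strictly below the original 1.5 bits.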
PCT Information
Filing Document: PCT/IB05/52570
Filing Date: 8/1/2005
Country: WO
Kind: 00
371c Date: 2/1/2007
Provisional Applications (1)
Number: 60592911
Date: Aug 2004
Country: US