This disclosure may contain information subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent disclosure or the patent as it appears in the U.S. Patent and Trademark Office files or records, but otherwise reserves all copyright rights whatsoever.
1. Field of the Invention
The present invention relates to the field of data trends and analysis, and more specifically, to methods and systems relating to trend extraction and analysis of data located on various computer systems and network(s), for example, the Internet.
2. Description of Related Art
Data extraction and analysis of dynamically changing data compilations, including analysis of relationships in the data, trend analysis, and prediction of the future, is an area of wide application. For example, individuals and organizations often would like to derive useful information from data that will help them with sales, marketing, purchasing, and various operational decisions, so as to improve their own efficiency and effectiveness and that of their organizations. Some examples of dynamically changing data include email messages covering various topics, To-Do lists on people's computers, employee or customer postings to companies' electronic bulletin boards (e.g., on a LAN, an Intranet, or the Internet), development of web sites on computer networks including the Internet, open postings to web sites such as Wikipedia, open postings to Craigslist, open postings to public bulletin boards on the Internet (e.g., weblog web sites), etc. In many cases, this dynamically changing information and data may be user/entity generated content that may be very useful. However, due to the dynamic nature of the information, it is often difficult to draw meaningful information or insights from the data that will prove helpful in improving the efficiency and effectiveness of individuals and organizations.
One particular active area of interest in data analysis is in weblog web sites on the Internet (the accumulation of all weblog web sites (or blog for short) on the Internet or World Wide Web (i.e., the Web) may be referred to as the blogosphere). A blog is a relatively new self-publishing phenomenon on the Web that has quickly become mainstream over the past few years. A blog is a special Web site on which an individual author (a blogger) or a group of collaborating authors periodically publish articles (entries or posts). Usually the entries are posted in reverse chronological order and each entry may include a time stamp indicating the time when the entry was posted.
The world of blogs is growing rapidly. According to Technorati, one of the top blog search engines, more than 1.2 million new blog entries are created every day. In addition, these numbers have been doubling every six months over the past three years. As an arena in which tens of millions of users share the latest information and exchange personal opinions, the blogosphere offers great commercial value and provides new business opportunities in areas such as product surveys, customer relationships, marketing, employee satisfaction, competitive assessments, etc. For example, for businesses to make judicious decisions, it is important for them to track customer opinions and complaints in a timely fashion. Here the blogosphere provides free, large-scale information sources from which businesses can quickly learn the opinions and complaints of their customers, employees, and competitors' customers about their own products and services, as well as those of their competitors. At the same time, as a special part of the Web, the blogosphere has its own unique nature and features and therefore raises many new challenges. One such unique feature is that the blogosphere is much more dynamic than traditional Web pages. For example, an announcement of a new product may instantly trigger intensive discussions in the blogosphere. Very often, it is exactly these dynamic trends that are valuable for businesses to track, understand, and predict the interests of their customers, competitors, and their competitors' customers.
There may be various links among blogs and among the entries in a blog. A blog page may contain links to archives of old entries. It may also contain a blogroll, a sidebar consisting of bookmarks pointing to other blog sites. In the content of an entry, there may be citation links pointing to Web sites (e.g., sources of information discussed in the entry) or other entries (written either by the same author or by other bloggers). At the end of an entry, there may be comments from other bloggers as well as “trackbacks” (i.e., links to other bloggers who are interested in the entry).
Recently, a number of commercial blog and Web search engines have introduced services for temporal trend analysis of the blogosphere. For example, for given keywords, BlogPulse and IceRocket generate trend curves over time in terms of the percentage of blog entries that contain the keywords. For a given tag, Technorati provides curves that show the daily number of entries that adopt the tag. Google has just announced a new service called Google Trends that, for given keywords, plots over time the search volume and news reference volume related to the keywords across all web sites.
There also exists a growing body of literature on trend analysis of dynamically evolving data in blogs and the blogosphere. For example, there have been various studies described in technical articles that include: Q. Mei, C. Liu, H. Su, and C. Zhai, A probabilistic approach to spatiotemporal theme pattern mining on Weblogs, Proc. of the 15th WWW Conference, 2006; J. M. Kleinberg, Authoritative sources in a hyperlinked environment, J. of the ACM, 46(5), 1999; L. De Lathauwer, B. De Moor, and J. Vandewalle, A multilinear singular value decomposition, SIAM J. on Matrix Analysis and Applications, 21(4), 2000; R. Kumar, J. Novak, P. Raghavan, and A. Tomkins, On the bursty evolution of blogspace, Proc. of the 12th WWW Conference, 2003; N. S. Glance, M. Hurst, and T. Tomokiyo, BlogPulse: Automated trend discovery for weblogs, WWW 2004 Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics, 2004; D. Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins, Information diffusion through blogspace, Proc. of the 13th WWW Conference, 2004; J. Leskovec, J. Kleinberg, and C. Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, Proc. of the 11th ACM SIGKDD Conference, 2005; X. Song, B. L. Tseng, C.-Y. Lin, and M.-T. Sun, ExpertiseNet: Relational and evolutionary expert modeling, Int. Conf. on User Modeling, 2005; B. H. Murray, Sizing the internet, White paper, Cyveillance, Inc., 2000; F. Douglis, A. Feldmann, and B. Krishnamurthy, Rate of change and other metrics: a live study of the World Wide Web, Proc. of the USENIX Symposium on Internet Technologies and Systems, 1997; J. Cho and H. Garcia-Molina, Effective page refresh policies for web crawlers, ACM Trans. on Database Systems, 28(4), 2003; D. Fetterly, M. Manasse, M. Najork, and J. L. Wiener, A large-scale study of the evolution of web pages, Proc. of the 12th WWW Conference, 2003; and A. Ntoulas, J. Cho, and C. Olston, What's new on the Web? The evolution of the web from a search engine perspective, Proc. of the 13th WWW Conference, 2004.
Some examples of prior patents in the general area of trend extraction and analysis techniques include those described in U.S. Pat. No. 6,915,009, U.S. Pat. No. 5,559,940, and U.S. Application Publication 2005/0091176. However, none of these approaches provides the analysis and insights that will prove most beneficial for dynamic data, particularly data that changes due to self-publishing by one or more persons or organizations.
The aforementioned systems and methods lack certain useful capabilities. For example, they do not combine the contents of, and the links among, data sets (e.g., blogs). Further, they typically do not include a non-probabilistic approach. Nor do they model the content and linkage changes in graph structures or focus on direct analysis of the data in order to reveal trends and other insights about the data. These approaches also fail to extract trends and patterns from ordered and structured data sets, and fail to form matrices containing higher dimensional structured data, such as the change of a graph structure with time, for analysis. Further, typical trend extraction and analysis methods and systems use no temporal/order information. They also typically fail to include an approach in which one dimension is the time line and the main purpose is to extract the main trend in this dimension.
In addition, the prior approaches cannot handle higher dimensional structured data, such as the change of a graph structure with time, and thus cannot draw out, sort out, identify, or decipher certain characteristics contained in the data sets that may behave differently from the summation or aggregation of the data set. The known techniques and other traditional trend analysis methods typically use simple statistics, such as percentage or total count, to represent temporal trends on the given keywords. Statistics such as total count or average have statistical merit but typically only represent general tendencies. Statistics obtained by traditional methods are aggregations and typically ignore the characteristics of the individual groups of data (e.g., blogs) that published the entries. This distinction becomes important because different groups of data (e.g., blogs) may contribute to the trend differently. For example, considering blogosphere data, some blogs constantly discuss products by a specific company whereas others mention the company name only occasionally (e.g., only when it is acquired by another company). Such differences in activity are not factored in by traditional methods.
Therefore, there is a need for data trend extraction and analysis methods and systems that can extract and analyze trend(s) of data from dynamic data set(s) contained in computer systems and networks in more detail, so that more accurate results and characteristics of the underlying information may be obtained and individuals and organizations can make more efficient and effective use of the data.
The present invention is directed generally to providing methods and systems for trend extraction and analysis. More specifically, embodiments may include methods and systems for trend extraction and analysis of information extracted from dynamically changing data included in computer systems and/or networks. For example, the present invention may be implemented in a personal computer, on ad-hoc networks such as peer-to-peer networks, and/or on a large network of computers such as LANs, Intranets, and the Internet. The techniques may be used to analyze temporal trends in various data sets and various graph structures drawn therefrom, such data sets including the World Wide Web generally, social communities, financial data, political data, legal data, product data, service data, etc. In any case, the present invention includes various embodiments that may generate characteristic indicators for trend(s) and/or distribution(s) for one or more data sources by use of, for example, temporal indicators derived through analysis of the difference in the contribution that separate portions of the data make to the whole data set being considered, the contribution of individual sources, and/or the interaction of the separate portions of the data with one another. Some exemplary approaches may include the use of singular value decomposition (SVD) and higher-order singular value decomposition (HOSVD) data extraction and analysis techniques. One particularly interesting exemplary use of these techniques is in the analysis of the dynamic data contained in the Web and Weblogs. In various embodiments, the dynamically changing information and data may be user/entity generated content and/or self-published information.
In addition, the disclosed techniques can provide information not available through existing methods, for example, by providing the distribution of the occurrence of particular information in separate portions of the data or separate data sets. As an example, the techniques may be used to determine the distribution for the popularity of a product name or the authority of a particular entity. Further, the invention may indicate to what degree a product name is popular with the public based on the aggregate of data analysis for a complete data set (e.g., the blogosphere). In other words, the invention may help determine whether a product name is popular with the general public or only in a small community of blogs that share special interests. The invention may also help determine whether there is an abnormal change in the structure of a data set or separate sections of a data set, for example, an abnormal change in the structure of a product-related community.
In the present description, the term “eigen-trends” may be defined to be temporal indicators derived through singular value decomposition (SVD) and higher-order singular value decomposition (HOSVD) that take into consideration differences among individual data sets or separate portions of a data set (e.g., blogs) and/or relationships among them. Two types of eigen-trends are described: (1) scalar eigen-trends (SVD based) and (2) structural eigen-trends (HOSVD based). In various embodiments, the systems and methods represent the observed data as a combination of information that captures temporal changes of the underlying data (i.e., eigen-trends) and information that captures the characteristics of individual data sources (e.g., bloggers), which may be referred to as the authority and/or hub. This combination may statistically give an optimal estimation of the observed data.
Various embodiments may include methods and systems in which information is partitioned into time windows. Further, some embodiments may include methods and systems in which a feature vector is built to represent the distribution of a term(s) used in a term search of one or more data source(s). Still some embodiments may include, for example, methods and systems in which a matrix(ces) is created by arranging the feature vector(s) in the order of time. Some embodiments may further include methods and systems that apply a singular value decomposition (SVD) to the matrix(ces). Various embodiments may also be directed toward generating a trend based on how a term(s) changes with time among one or more data source(s) from an output of the singular value decomposition (SVD). In various embodiments, the method(s) and system(s) may include generating a distribution vector based on how a term(s) is distributed among one or more data source(s) from an output of the singular value decomposition (SVD).
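The first steps described above (partitioning information into time windows, building a feature vector per window, and arranging the vectors in time order as a matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `build_score_matrix`, the `(source, posted_at, text)` entry layout, and the use of a simple case-insensitive substring count as the score are all hypothetical choices made for the example. The decomposition step itself is not part of this sketch.

```python
from datetime import datetime, timedelta


def build_score_matrix(entries, sources, term, start, window, n_windows):
    """Return an m x n nested list: rows are sources, columns are time windows.

    Assumption: the score of a source in a window is the number of its
    entries in that window whose text contains the term (substring match).
    """
    row = {source: i for i, source in enumerate(sources)}
    X = [[0] * n_windows for _ in sources]
    needle = term.lower()
    for source, posted_at, text in entries:
        # Floor division of two timedeltas gives the integer window index.
        j = (posted_at - start) // window
        if source in row and 0 <= j < n_windows and needle in text.lower():
            X[row[source]][j] += 1
    return X


# Hypothetical sample data: two blogs observed over three one-day windows.
entries = [
    ("blog_a", datetime(2006, 1, 1, 9), "New tax rules announced"),
    ("blog_a", datetime(2006, 1, 1, 18), "More on the tax rules"),
    ("blog_b", datetime(2006, 1, 2, 8), "Weekend hiking trip"),
    ("blog_b", datetime(2006, 1, 3, 12), "Filing my tax return"),
]
X = build_score_matrix(entries, ["blog_a", "blog_b"], "tax",
                       start=datetime(2006, 1, 1),
                       window=timedelta(days=1), n_windows=3)
```

Each column of `X` plays the role of the feature vector for one time window, so reading the columns left to right gives the time-ordered arrangement to which a decomposition could then be applied.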
In various embodiments, a higher-order singular value decomposition (HOSVD) may be applied for trend analysis of data sets, and more particularly to trend analysis of graph structure data extracted from dynamic data. Further, the method(s) and system(s) may include a tensor (a three-dimensional array) created by arranging feature matrix(ces) in the dimension of time. Some embodiments may include methods and systems in which a higher-order singular value decomposition (HOSVD) is applied to the tensor. Still some embodiments may further include, for example, methods and systems in which a trend(s) is generated based on how a term(s) changes with time for relationships among one or more individual data source(s) or separate portions of a data set from an output of the higher-order singular value decomposition (HOSVD). In at least one embodiment, the method(s) and system(s) may include a distribution vector(s) generated based on how a term(s) is distributed among one or more data source(s) from an output of the higher-order singular value decomposition (HOSVD).
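For orientation, the general form of a third-order HOSVD can be written out as follows. This is a sketch based on the standard multilinear singular value decomposition (as in the De Lathauwer et al. article cited in the background), not a formula taken from this summary. The tensor shape and the interpretation of its modes are assumptions made for illustration: $A \in \mathbb{R}^{m \times m \times n}$, with $(A)_{ijk}$ a score for the link from source $i$ to source $j$ in time window $k$.

$$A = S \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}, \qquad (A)_{ijk} = \sum_{p}\sum_{q}\sum_{r} (S)_{pqr}\,(U^{(1)})_{ip}\,(U^{(2)})_{jq}\,(U^{(3)})_{kr}$$

Here each $U^{(d)}$ is an orthogonal matrix whose columns are the left singular vectors of the mode-$d$ unfolding of $A$ (the matrix obtained by laying out all mode-$d$ fibers of $A$ as columns), and $S = A \times_1 U^{(1)T} \times_2 U^{(2)T} \times_3 U^{(3)T}$ is the core tensor. Under the assumed layout, the columns of $U^{(3)} \in \mathbb{R}^{n \times n}$ vary along the time dimension and are therefore the natural candidates for temporal trends, while $U^{(1)}$ and $U^{(2)}$ describe how the sources participate as link origins and link targets.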
In various embodiments, the method(s) and system(s) may include analyzing, generating and/or identifying the temporal trend in a group of blogs with common interests in a way that takes the differences among individual blogs into consideration. Further, some embodiments may include methods and systems in which the observed data is a combination of information that captures temporal changes of the underlying data (i.e., eigen-trends) and information that captures the characteristics of individual bloggers (e.g., authority, hubs, etc.).
In various embodiments, the method(s) and system(s) may utilize singular value decomposition (SVD) to extract multiple scalar eigen-trends. Some embodiments may include methods and systems in which the main scalar eigen-trend best approximates the observed data and has good statistical properties. Still some embodiments may further include, for example, methods and systems in which secondary scalar eigen-trends can be used to represent non-dominating interests in the blogosphere. Further, in various embodiments, the method(s) and system(s) may utilize higher-order singular value decomposition (HOSVD) to extract structural eigen-trends. Some embodiments may include methods and systems in which structural eigen-trend(s) detect(s), for example, structural changes in the blogosphere.
The new data trend analysis and extraction techniques can reveal a lot of interesting trend information and insights for various dynamic data set(s), and as shown herein this is true for blogosphere data. These insights are not obtainable from traditional count-based methods of data trend analysis and extraction. Therefore these new techniques can provide invaluable analysis and may be particularly useful when used along with various traditional methods for trend analysis.
The above summary is intended to provide examples of the present invention and is not all inclusive. As such, the above described features of the invention and still further features included for various embodiments will be apparent to one skilled in the art based on the study of the following disclosure and the accompanying drawings thereto.
The utility, objects, features and advantages of the invention will be readily appreciated and understood by those skilled in the art upon consideration of the following detailed description of the embodiments of this invention, when taken with the accompanying drawings, in which same numbered elements are identical and:
a is an exemplary diagram showing data results used for building a score vector-time matrix containing stacked popularity scores for blogs at different time intervals, according to at least one embodiment;
b is an exemplary chart depicting a score vector-time matrix containing stacked popularity scores for the blogs at different times, according to at least one embodiment;
FIGS. 9a-9d are exemplary graphs depicting experimental results which illustrate what happens when a few blogs dominate the discussion on a topic in the blogosphere and then, at a time point, one of the dominating blogs generates far fewer entries than usual, according to at least one embodiment;
FIGS. 10a-10d are exemplary graphs depicting experimental results which illustrate what happens when one non-dominating blog posts an abnormally large number of entries, according to at least one embodiment;
FIGS. 11a-11f are exemplary graphs depicting experimental results simulating two distinct groups of blogs discussing different aspects of the same term following different temporal patterns, according to at least one embodiment;
FIGS. 12a-12d are exemplary graphs depicting experimental results which illustrate what happens when, at a given time, instead of using the hub and authority scores, all links are generated randomly by selecting any blog to be the source or the target, according to at least one embodiment;
FIGS. 13a-13d are exemplary graphs depicting experimental results which illustrate that, to become a valid hub, a blog must build a track record of consistently pointing to good authorities over time, according to at least one embodiment;
FIGS. 14a-14f are exemplary graphs depicting the scalar eigen-trend analysis for the term “tax,” according to at least one embodiment;
FIGS. 15a-15f are exemplary graphs depicting the scalar eigen-trend analysis for the term “hurricane,” according to at least one embodiment;
FIGS. 16a and 16b are exemplary graphs depicting experimental results which illustrate the authority vectors for two terms, Engadget and Technorati, suggesting that Engadget is popular in a relatively small community of bloggers while Technorati is popular with the more general public, according to at least one embodiment;
The present invention applies generally to methods and systems for trend extraction and analysis. More specifically, embodiments may include methods and systems for trend extraction and analysis of information extracted from dynamically changing data that may typically be stored, processed, and transmitted in computer systems and/or networks. For example, the techniques described herein may be implemented in a personal computer, on ad-hoc networks such as peer-to-peer networks, and/or on a large network of computers such as LANs, Intranets, and the Internet. They may be used to analyze temporal trends in various data set(s) and various graph structures drawn from the data set(s) and related to, for example, the World Wide Web (WWW), social communities, financial data, political data, product data, service data, etc. The various embodiments of the invention may include methods and/or systems that generate characteristic indicators for trend(s) and/or distribution(s) for one or more data sources by use of, for example, temporal indicators derived through analysis of the difference in contribution of separate portions of the data to the whole data set being considered, the contribution of individual sources, and/or the interaction of the separate portions of the data with one another. Some exemplary approaches may include the use of singular value decomposition (SVD) and higher-order singular value decomposition (HOSVD) data extraction and analysis techniques. One particularly interesting exemplary use, which will be used herein to more fully describe the invention, is the analysis of the dynamic data contained in self-publishing inter-person posting sites.
In this detailed description, Web logs and the blogosphere will be used as an example of a particular application for the present invention. In this case, blog(s) will be used for the data set(s) to be analyzed, so that a more focused understanding of the invention may be drawn. However, the invention is equally applicable to other data set(s) including dynamically changing data.
As with other data set(s) and applications, existing approaches for analyzing blog(s) are typically based on simple counts, such as the number of entries or the number of links. However, the present invention introduces a number of new techniques for trend analysis, defined and coined herein as “eigen-trend(s),” that may be applied to various data set(s). With respect to blogs, these techniques may, for example, include representing the temporal trend in a group of blogs with common interests. There are two particular techniques for extracting “eigen-trends” in various data set(s), such as blog(s): one trend analysis technique based on the singular value decomposition (SVD) and another based on the higher-order singular value decomposition (HOSVD). The SVD-extracted eigen-trend(s) may provide, for example, new insights into multiple trends on the same term or keyword. The HOSVD trend analysis technique may analyze the data set(s), such as blog(s), as a dynamic graph structure and may extract eigen-trends that reflect the structural changes of the various data set(s), such as blog(s) in the blogosphere, over time. Experimental results show that the new techniques of the present invention can reveal a lot of interesting trend information and insights about various dynamic data set(s), and particularly with respect to blog(s), that are not presently obtainable from traditional count-based methods.
By summing up the occurrences of entries, traditional methods of analyzing blog(s) typically ignore the individual blog(s) that published those entries. However, different blog(s) may contribute to the trend differently. For example, some blog(s) may constantly discuss one or more products by a specific company, whereas other blog(s) may mention the company name only occasionally (e.g., only when it is acquired by another company). Such differences in activity are not factored in by traditional methods. The data set trend analysis and extraction techniques of the present invention provide a better way to represent the temporal behavior of various blog(s) in the blogosphere by considering such differences among blog(s). Further, for the same term or keyword, different groups of blog(s) may have different interests. Sometimes, a single trend does not make sense for all the interested groups of blogs. For example, there may be some data set(s) or blog(s) that are interested in tax matters from a financial point of view and other data set(s) or blog(s) that are interested in tax matters from a political point of view. Thus, if only a simple count or statistical trend analysis is provided to a “tax” software company, the trend curve, which would be an accumulation of all the interests, will be misleading for purposes such as supporting marketing decisions, because at various times blog activity will be high due to political discussions about tax. Moreover, the blog(s) in the blogosphere usually do not explicitly indicate their interests (e.g., finance vs. politics for tax matters). However, the present invention may be used to detect different data set(s) or blog(s) with different interests and extract meaningful trends related to the corresponding groups so that a more accurate understanding of the data may be obtained, for example, by using a technique including SVD or an equivalent analysis.
Various dynamic data set(s), including blog(s) in the blogosphere, may make up one or more ecosystem(s) in which, for example, the data set(s) such as blogs interact with each other, generating a reference structure. In this sense, the data set(s) or blog(s) in the blogosphere can be considered as a data set graph or blog graph where the nodes are individual data set(s) or blog(s) and the links reflect endorsements and interactions among the data set(s) or blog(s). In addition, such a data set graph or blog graph changes with time as a result of the development of internal relationships (e.g., interactions among the data set(s) or blog(s)) and external events (e.g., breaking news). The present invention can directly analyze and extract meaningful trends from such a dynamically changing data set or blog graph structure, for example, by using a technique including HOSVD or a similar technique.
In at least one embodiment, a key idea of the present invention is to represent the observed data as a combination of information that captures temporal changes of the underlying data (i.e., eigen-trends) and information that captures the characteristics of individual user(s)/entity(ies), such as individual bloggers (e.g., authority). This combination may statistically give an optimal estimation of the observed data. As mentioned above, there may be two types of eigen-trends, further coined herein as “scalar eigen-trends” and “structural eigen-trends,” which are exemplary methods for analyzing the temporal aspects of data set(s). First, the various embodiments may include a method based on the singular value decomposition (SVD) to extract multiple scalar eigen-trends. A main scalar eigen-trend may best approximate the observed data and have good statistical properties. Secondary scalar eigen-trends may be used to represent non-dominating interests in the data set(s), such as blog(s) in the blogosphere. Second, the various embodiments may include a method based on a higher-order singular value decomposition (HOSVD) to extract structural eigen-trends. The structural eigen-trend may detect the structural changes in the data set(s), such as blog(s) in the blogosphere. Although SVD may have been used for time-series analysis in various other areas, it has not been used as is done by the present invention and has not been used for trend analysis of dynamic data set(s) including self-published content and/or blog(s). Further, the present invention is the first use of higher-order singular value decomposition for trend analysis of graph structure data.
The present data set(s) trend analysis techniques can reveal a lot of interesting trend information and insights into the characteristics of the data set(s) such as blog(s) in the blogosphere, which are not obtainable from traditional count-based methods, and it may be particularly useful in supplementing traditional methods for trend analysis.
Referring now to
The data module 110 may be coupled to a Score Vector-Time Matrix module 120. The Score Vector-Time Matrix module 120 may be used to build a score vector-time matrix of one or more characteristics of a data set(s). For example, a popularity or authority score of a desired entity and/or term may be calculated and placed into a score vector-time matrix generated by the Score Vector-Time Matrix module 120. The Score Vector-Time Matrix module 120 may be coupled to a Singular Value Decomposition (SVD) module 130. The SVD module 130 may be used to analyze the score vector-time matrix so as to determine various trends and unique characteristics of the data within the trends and over time. As such, the SVD module 130 may output various indicators such as vectors. These indicators may be used to provide the data Trend(s) 140 and Authority Distribution(s) 150. For example, the Trend(s) 140 may show how the popularity of a term or the occurrence of a term changes over time. Further, the Authority Distribution(s) 150 may show how the contribution (to the total data set) of the entity(ies) that contribute to the data set may change over time.
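First, as background, some mathematical notations and concepts are now provided that will be used in later sections. Herein, scalars are written as lower-case letters ($a, b, \ldots$), vectors as lower-case letters in vector form ($\vec{a}, \vec{b}, \ldots$), and matrices and tensors as capital letters. One exception is made: $I_n$ is used to denote the upper bound for the $n$th index of a tensor. For an $N$th-order tensor $A \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, $(A)_{i_1 \ldots i_N}$ represents the element at index $(i_1, \ldots, i_N)$; in particular, for a matrix $A \in \mathbb{R}^{m \times n}$, $(A)_{ij}$ represents the element at the $i$th row and $j$th column of $A$. For a vector $\vec{v} = (v_1, \ldots, v_n)^T$, its 1-norm is defined as
$$\|\vec{v}\|_1 \equiv \sum_{i=1}^{n} |v_i|$$
and its 2-norm is defined as
$$\|\vec{v}\|_2 \equiv \sqrt{\sum_{i=1}^{n} v_i^2}.$$
The 2-norm of a matrix $A \in \mathbb{R}^{m \times n}$ is defined based on the vector 2-norm as $\|A\|_2 \equiv \max_{\|\vec{v}\|_2 = 1} \|A\vec{v}\|_2$. A matrix $A \in \mathbb{R}^{m \times m}$ is called an orthogonal matrix if $AA^T = A^TA = I$, where $I$ is the identity matrix in $\mathbb{R}^{m \times m}$.
Further, for two tensors $A, B \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, their scalar product is defined as
$$\langle A, B \rangle \equiv \sum_{i_1} \cdots \sum_{i_N} (A)_{i_1 \ldots i_N} (B)_{i_1 \ldots i_N}.$$
Finally, the Frobenius norm of $A$ is defined as $\|A\|_F \equiv \sqrt{\langle A, A \rangle}$.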
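As a quick numerical companion to this notation, the vector norms, the tensor scalar product, and the Frobenius norm can be sketched in a few lines. The sketch assumes the standard definitions (sum of absolute values for the 1-norm, square root of the sum of squares for the 2-norm, and the elementwise sum of products for the scalar product); the function names and the nested-list representation of matrices and tensors are illustrative choices, not part of the disclosure.

```python
import math


def one_norm(v):
    """1-norm of a vector: sum of absolute values of its entries."""
    return sum(abs(x) for x in v)


def two_norm(v):
    """2-norm of a vector: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for x in v))


def scalar_product(A, B):
    """Scalar product <A, B> of two equally shaped tensors.

    Tensors of any order are represented as nested lists of numbers;
    the recursion multiplies matching elements and sums the products.
    """
    if isinstance(A, (int, float)):
        return A * B
    return sum(scalar_product(a, b) for a, b in zip(A, B))


def frobenius_norm(A):
    """Frobenius norm: square root of <A, A>."""
    return math.sqrt(scalar_product(A, A))
```

For a vector the Frobenius norm coincides with the 2-norm, which is one way to sanity-check the recursive scalar product.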
Given that background, for a given term or keyword (e.g., the name of a specific product), trend analysis according to the present invention may be applied to blog(s) to show how the dominance, popularity or authority of the term(s) in blog(s) of the blogosphere changes over time. For example, suppose that the blogosphere consists of $m$ blogs and that the popularity or authority score of a term or keyword $k$ among those blogs within a time window $j$ is given as a dominance, popularity or authority vector $\vec{x}_j = (x_{1j}, \ldots, x_{mj})^T$. This dominance, popularity or authority vector may be observed through $n$ consecutive time windows and stacked into an $m \times n$ matrix $X = (\vec{x}_1, \ldots, \vec{x}_n)$, as illustrated in
The observed data X may be represented by a pair of vectors: a trend vector {right arrow over (t)} that represents the overall trend of a term or keyword over time and an authority vector {right arrow over (a)} that represents the contribution of individual entity(ies), e.g., bloggers, to the trend. The following mathematical formulation may be used to show that this pair of vectors can provide a better statistical estimate of the observed data X than traditional count-based methods. Accordingly, in at least one embodiment of the present invention, a new temporal trend, called a scalar eigen-trend, is proposed.
First, it may be observed how well traditional count-based methods can represent the observed data $X$. A simple count-based method may represent the trend as a vector $\vec{t}_c = (t_1, \ldots, t_n)^T$ where $t_j = \sum_i x_{ij}$. That is, the overall popularity score at time $j$ may be defined as the total number of entries among all data set(s), e.g., blogs, at time $j$ that contain the term or keyword. This count-based score may be a reasonable estimator of the central tendency of the popularity among blogs and is particularly useful in the following sense: if it is assumed that at time $j$, each $x_{ij}$ is an independent sample drawn from a random variable with mean
$$\frac{\mu}{m},$$
then $\hat{\mu} = t_j = \sum_i x_{ij}$ is an unbiased estimator for $\mu$ that has the minimal sample variance
$$\sum_i \left( x_{ij} - \frac{\hat{\mu}}{m} \right)^2.$$
To represent this property in a different way, the vector $\vec{t}_c$ may be the solution to the following equation:
$$\vec{t}_c = \arg\min_{\vec{t}} \left\| X - \vec{a}_o \vec{t}^T \right\|_F \quad (1)$$
where $\|\cdot\|_F$ is the Frobenius norm and $\vec{a}_o$ is a column vector whose entries are all $1/m$.
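By way of a non-limiting illustration, the count-based trend described above may be sketched in Python with NumPy. The matrix values and the helper name `count_based_trend` are hypothetical and chosen only for illustration. The cross-check assumes the least-squares characterization in which the authority vector is fixed to the uniform vector whose entries are all 1/m.

```python
import numpy as np

def count_based_trend(X):
    """Return t_c with t_j = sum_i x_ij (column sums of the m x n score matrix)."""
    return np.asarray(X, dtype=float).sum(axis=0)

# Hypothetical 3-blog, 4-window score matrix (illustrative values only).
X = np.array([[5, 6, 9, 4],
              [1, 0, 2, 1],
              [0, 1, 1, 0]], dtype=float)
t_c = count_based_trend(X)

# Cross-check: least-squares fit of X ~ a0 t^T with a0 = (1/m, ..., 1/m)^T.
m = X.shape[0]
a0 = np.full((m, 1), 1.0 / m)
t_ls = np.linalg.lstsq(a0, X, rcond=None)[0].ravel()
```

In this sketch the column sums and the least-squares solution coincide, which is the sense in which the count-based trend treats every blog as an equal contributor.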
Note however that in the above discussion and trend analysis, differences among individual blogs are ignored and it is assumed that the popularity scores of all blogs follow the same distribution. That is, the count-based score may be a reasonable estimator when the influence of individual entity(ies), such as individual bloggers, is not known a priori. In reality, however, it may be observed that one entity, e.g., a blogger, may publish entries on the term or keyword more frequently than other entity(ies) or blogger(s), constantly contributing to the number of overall occurrences of the term or keyword (i.e., the trend), thus becoming a dominant, popular or authority entity or blogger. For example, for the term or keyword “iPod,” there can be data set(s) or blogs devoted completely to iPod that have tens of entries every day talking about different features of iPod, and there can also be data set(s) or blogs that mention iPod only infrequently. Assume that the contribution to the trend by an individual entity or blogger $i$, $x_{ij}$, is drawn from a distribution with $a_i\mu$ as the mean, where the fractions $a_i$ are given as a unit 2-norm vector $\vec{a} = (a_1, \ldots, a_m)^T$. Under this assumption, a better trend indicator may be given as the $\mu$ that minimizes the error $\sum_i (x_{ij} - a_i\mu)^2$ instead of the error
$$\sum_i \left( x_{ij} - \frac{\mu}{m} \right)^2$$
as used in the count-based method. Then, the trend $\vec{t}$ may be the solution to the following equation:
$$\vec{t} = \arg\min_{\vec{t}} \left\| X - \vec{a}\,\vec{t}^T \right\|_F \quad (2)$$
In fact, the following property may show that under an assumption of equal variance, the solution that minimizes $\sum_i (x_{ij} - a_i\mu)^2$ is the linear unbiased estimator for $\mu$ with the minimal variance.
Property 1. Let $\vec{a} = (a_1, \ldots, a_m)^T$ be a unit vector. If for each $i$, $x_{ij}$ is drawn independently from a distribution with mean $a_i\mu$ and variance $\sigma^2$, then the value $\hat{\mu} = \arg\min_r \sum_i (x_{ij} - a_i r)^2$ is the linear unbiased estimator for $\mu$ with the minimal variance.
By setting the derivative of $\sum_i (x_{ij} - a_i r)^2$ with respect to $r$ to zero and using $\sum_i a_i^2 = 1$, the value that minimizes $\sum_i (x_{ij} - a_i r)^2$ is $\hat{\mu} = \sum_i a_i x_{ij}$. $\hat{\mu}$ is an unbiased estimate of $\mu$ because $E(\hat{\mu}) = E(\sum_i a_i x_{ij}) = \sum_i a_i^2 \mu = \mu$. It remains to show that $\hat{\mu}$ is the linear unbiased estimator for $\mu$ with the minimal variance. An arbitrary linear estimator $\hat{\mu}_1$ for $\mu$ may be written as $\hat{\mu}_1 = \sum_i b_i x_{ij}$, with $\vec{b} = (b_1, \ldots, b_m)^T$. For $\hat{\mu}_1$ to be unbiased, $E(\hat{\mu}_1) = E(\sum_i b_i x_{ij}) = (\sum_i b_i a_i)\mu$ must equal $\mu$, and so $\sum_i b_i a_i = 1$ or equivalently
$$\|\vec{b}\| \cdot \|\vec{a}\| \cdot \cos\theta = 1$$
where $\theta$ is the angle between $\vec{b}$ and $\vec{a}$. The variance of $\hat{\mu}_1$ may be written as
$$\mathrm{var}(\hat{\mu}_1) = \mathrm{var}\Big(\sum_i b_i x_{ij}\Big) = \Big(\sum_i b_i^2\Big)\sigma^2 = \|\vec{b}\|^2 \sigma^2.$$
So it would be desirable to minimize $\|\vec{b}\|^2\sigma^2$ subject to $\|\vec{b}\| \cdot \|\vec{a}\| \cdot \cos\theta = 1$. Because $\|\vec{a}\| = 1$, the constraint gives $\|\vec{b}\| = 1/\cos\theta \ge 1$, with equality only at $\theta = 0$, so the solution is $\vec{b} = \vec{a}$. Therefore, $\hat{\mu} = \sum_i a_i x_{ij}$ may be the linear unbiased estimator for $\mu$ with the minimal variance.
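The argument of Property 1 may be checked numerically with a short sketch such as the following. The random vectors, the seed, and all variable names are assumptions made only for illustration. The sketch confirms that the closed-form estimate equals the least-squares minimizer and that any other weight vector satisfying the unbiasedness constraint has a larger norm, and hence a larger variance.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
a = rng.random(m)
a /= np.linalg.norm(a)          # unit 2-norm authority vector (assumed known here)
x = rng.random(m)               # one column of X, i.e., the scores at a single time j

mu_hat = a @ x                  # closed form from Property 1: sum_i a_i x_ij
# Same value from a generic solver for min_r sum_i (x_i - a_i r)^2.
mu_ls = np.linalg.lstsq(a[:, None], x, rcond=None)[0][0]

# Any other unbiased weight vector b (b . a = 1) is a plus a part orthogonal to a.
d = rng.random(m)
d -= (d @ a) * a                # remove the component of d along a
b = a + d                       # still satisfies b . a = 1, but with a larger norm
```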
Now, we may determine how to best estimate $\vec{a}$. A simple way may be to take the average of $x_{ij}$ over all the time windows. However, this estimation treats all the time windows equally. Similar to the above discussion, if the trend for each time window is known, a better way to estimate may be to find the $\vec{a}$ that minimizes the error $\sum_{i,j} (x_{ij} - a_i t_j)^2$. Note, however, that $\vec{t}$ is itself the desired trend. Then the trend $\vec{t}$, together with $\vec{a}$, may be given by the following equation:
$$(\vec{t}, \vec{a}) = \arg\min_{\vec{t},\ \vec{a}:\ \|\vec{a}\|_2 = 1} \left\| X - \vec{a}\,\vec{t}^T \right\|_F \quad (3)$$
That is, a pair of $\vec{t}$ and $\vec{a}$ may be provided that together best approximate the observed data.
Equation (3) above may be solved by, for example, applying a singular value decomposition (SVD) on $X$.
Theorem 1. Assume $X = U\Sigma V^T$ is the singular value decomposition of $X$, where $U = (\vec{u}_1, \ldots, \vec{u}_m) \in \mathbb{R}^{m \times m}$ and $V = (\vec{v}_1, \ldots, \vec{v}_n) \in \mathbb{R}^{n \times n}$ are orthogonal matrices representing the basis for the column space and the basis for the row space of $X$, respectively; $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_k, 0, \ldots, 0) \in \mathbb{R}^{m \times n}$, in which $k \le \min(m, n)$ is the rank of $X$ and $\sigma_1 \ge \cdots \ge \sigma_k > 0$ are the singular values of $X$. Then $\sigma_1\vec{v}_1$ is a solution for $\vec{t}$ in Equation (3) and the minimal error is achieved at $\vec{a}_1 = \vec{u}_1$.
The theorem may be obtained from the following well-known property of an SVD: with $\sigma_1$, $\vec{u}_1$ and $\vec{v}_1$ being the first singular value and the first left and right singular vectors, respectively, if we define $X_1 = \vec{u}_1 \sigma_1 \vec{v}_1^T$, then $\|X - X_1\|_F = \min_{\mathrm{rank}(Y) = 1} \|X - Y\|_F$. Any product $\vec{a}\,\vec{t}^T$ with $\|\vec{a}\| = 1$ is a rank-1 matrix, so by taking $\vec{t}_1 = \sigma_1\vec{v}_1$ and $\vec{a}_1 = \vec{u}_1$, Equation (3) may be satisfied.
Of course, there may be other methods that may prove equally useful in indicating the dominance, popularity, or authority of an entity(ies) or blogger(s) in the data or information contained in the data set(s) or blog(s).
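One possible way to compute the trend and authority vectors of Theorem 1 is sketched below using NumPy's `numpy.linalg.svd`. The function name `scalar_eigen_trend`, the example matrix, and the sign convention (flipping both vectors so that the scores are non-negative) are assumptions of this sketch rather than details taken from the disclosure.

```python
import numpy as np

def scalar_eigen_trend(X):
    """Return (t, a): t = sigma_1 * v_1 (length n), a = u_1 (length m, unit 2-norm)."""
    U, s, Vt = np.linalg.svd(np.asarray(X, dtype=float), full_matrices=False)
    a, t = U[:, 0], s[0] * Vt[0]
    if a.sum() < 0:      # the SVD sign is arbitrary; flip both so scores are non-negative
        a, t = -a, -t
    return t, a

# Hypothetical 3-blog, 4-window score matrix (illustrative values only).
X = np.array([[5, 6, 9, 4],
              [1, 0, 2, 1],
              [0, 1, 1, 0]], dtype=float)
t, a = scalar_eigen_trend(X)
```

In this example the first blog dominates the counts, so it would be expected to receive the largest authority score, and the trend vector follows the shape of its activity over the four windows.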
The above discussion shows that the pair of vectors, $\vec{t}$ and $\vec{a}$, may be better indicators of the characteristics of the observed data, where the former shows the temporal trend of the popularity of a term or keyword and the latter shows the contribution of individual entity(ies) or blogger(s) to the trend. These are identified herein as an eigen-trend and authority scores, respectively. To distinguish this group of trend indicators from another group of trend indicators discussed later, this group of trend indicators will specifically be called a scalar eigen-trend. These names are particularly appropriate because of the following property.
Property 2. It may be shown that the solutions $\vec{a}$ and $\vec{t}$ from the above procedure satisfy the following recursive relationship (after appropriate normalization):
$$\vec{t} \propto X^T\vec{a}, \qquad \vec{a} \propto X\,\vec{t} \quad (4)$$
This mutual reinforcement relationship between $\vec{t}$ and $\vec{a}$ may be considered similar to the one between hubs and authorities in the HITS algorithm. In at least one embodiment of the present invention, a data set or blog $i$ that has a high score $a_i$ can be seen as an authority in the sense that the entity or blogger better represents the trend. The overall popularity $t_j$ at time $j$ may be high when it is based on the contribution of many good authority data set(s) or blogs, and a good authority data set or blog must contribute to the popularity when the overall popularity $t_j$ is high. The scalar eigen-trend and authority scores may also have the following properties.
Property 3. If all elements of $X$ are non-negative, then the singular value decomposition can be written in such a way that all elements of $\vec{u}_1$ (and therefore $\vec{a}$) and $\vec{v}_1$ (and therefore $\vec{t}$) are non-negative.
Property 3 may guarantee that $\vec{a}$ and $\vec{t}$ will be non-negative. This may be helpful because $\vec{t}$ may be used to represent the temporal trend and $\vec{a}$ to represent the authority score, and it may be difficult to interpret negative values in either of them. It is worth noting that all elements of $\vec{u}_1$ and $\vec{v}_1$ may equally be made non-positive by flipping the signs of $\vec{u}_1$ and $\vec{v}_1$ at the same time.
Property 4. When $\vec{a}\,\vec{t}^T$ is used to approximate $X$, the square error can be derived from the second through the last singular values as $\|X - \vec{a}\,\vec{t}^T\|_F^2 = \sum_{i>1} \sigma_i^2$.
Property 4 can provide a measure of how much information may be captured by the trend 270, e.g., the eigen-trend, and the authority indicator 280, e.g., the authority score.
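The mutual reinforcement between the trend and the authority scores may be illustrated with a simple alternating iteration. The sketch below assumes the recursive relationship takes the HITS-like form in which the trend is proportional to the transpose of X applied to the authority vector and the authority vector is proportional to X applied to the trend; that form, the function name `mutual_reinforcement`, the iteration count, and the example matrix are all assumptions made for illustration.

```python
import numpy as np

def mutual_reinforcement(X, iters=200):
    """Alternate t = X^T a and a = X t / ||X t||, starting from a uniform authority vector."""
    X = np.asarray(X, dtype=float)
    a = np.full(X.shape[0], 1.0 / np.sqrt(X.shape[0]))
    for _ in range(iters):
        t = X.T @ a                  # trend: authority-weighted sum over blogs
        a = X @ t                    # authority: trend-weighted sum over time windows
        a /= np.linalg.norm(a)       # keep the authority vector at unit 2-norm
    return X.T @ a, a

# Hypothetical non-negative score matrix (illustrative values only).
X = np.array([[5, 6, 9, 4],
              [1, 0, 2, 1],
              [0, 1, 1, 0]], dtype=float)
t, a = mutual_reinforcement(X)
```

For a non-negative matrix such as this one, the iteration may be expected to settle on the same non-negative pair that the first singular vectors provide, which is consistent with Properties 2 and 3.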
Compared to traditional count-based trends, the scalar eigen-trend is capable of capturing the main stream of the data set(s) or blog(s) activity more clearly. In the various blog(s) of the blogosphere, entity(ies) or blogger(s) may contribute, post or publish entries that may typically be driven by events (e.g., press releases of new products). If many of the entities or bloggers react to the same events at the same time, their synchronous activity may form a “trend”.
The dominance, popularity or authority score of a data set or blog may serve as a “track record” of the data set or blog over time, to indicate the amount of contribution that the particular data set or blog makes to the main-stream trend. An interested person such as a system user or analyst can focus on such authoritative data set(s) or blog(s) to get deeper insights on the trend. On the other hand, if a particular entity or blogger behaves independently from the main-stream trend, its authority score may be small and its effect on the trend may be discounted. This means that the scalar eigen-trend may be generally less noisy than the count-based trends in extracting the main trend from the observed data. This concept will be demonstrated herein through experiments on various data sets. In addition, {right arrow over (a)}, the first left singular vector of X, may be used to represent the general popularity score distribution of the given term or keyword.
The scalar eigen-trend may also capture multiple trends. When the second singular value is large (i.e., the square error of Property 4 is large), another (secondary) trend may be extracted from the data set by using the second singular vector. For example, the same term or keyword (e.g., tax) may be populated by different groups of data set(s) or blog(s) that have different points of view (e.g., finance vs. politics). There may be latent trends on the same term or keyword, which may be combined into the observed data from the data set(s) or blog(s). The traditional count-based method will not be able to decompose such trends. However, the present invention, using the second singular value from the scalar eigen-trend, may be able to discover these secondary trends from non-dominating interest groups of the data set(s) or blog(s). Examples of such observations and characteristics will be described below when discussing various experimental results.
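As a hedged illustration of extracting a secondary trend, the sketch below builds a small hypothetical matrix from two blog groups with opposite temporal patterns and reads off the first two singular triples. The helper name `eigen_trends`, the group weights, and the energy ratio `captured` are assumptions introduced only for this example.

```python
import numpy as np

def eigen_trends(X, k=2):
    """Return the top-k (sigma, a, t) triples; index 0 is the main trend, 1 the secondary."""
    U, s, Vt = np.linalg.svd(np.asarray(X, dtype=float), full_matrices=False)
    return [(s[i], U[:, i], s[i] * Vt[i]) for i in range(k)]

# Two hypothetical blog groups: one with rising activity, one with falling activity.
rise = np.linspace(0.0, 1.0, 6)
fall = rise[::-1]
X = np.vstack([3 * rise, 2 * rise, 2 * fall, 1 * fall])   # 4 blogs x 6 time windows

(s1, a1, t1), (s2, a2, t2) = eigen_trends(X)
captured = s1**2 / (s1**2 + s2**2)   # share of squared energy explained by the main trend
```

Unlike the first pair, the second pair of singular vectors generally contains entries of both signs, so in such a sketch it is the contrast between positive and negative entries that separates the two groups.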
Referring now to
Referring to
Referring now to
Referring now to
Furthermore, for each time window t1, t2, t3, . . . tn, the data set(s) or blog(s) graphs 600 such as shown in
In the earlier section related to scalar eigen-trends that may include SVD, an element xij of matrix X may represent a dominance, popularity or authority score of blog i at time j. This dominance, popularity or authority may be measured by the number of relevant entries by blog i at time j. However, such a simple definition may have a weak point: it may ignore various characteristics of the community structure, for example the link information of data set(s) or blog(s) in the blogosphere. For example, if relevant entries by a certain data set or blog always attract a lot of links (e.g., references) from other data set(s) or blogs, then that data set or blog may be considered more important than some other data set(s) or blog(s). As another example, because a group of data sets or blogs in a blogosphere is an ecosystem in which people or entities are mutually aware of each other and interact with each other, it can be expected that for a given term or keyword, there may exist related communities that exhibit structural consistency over time.
For a given term or keyword, a graph Gj for time j may be constructed and designated the term or keyword-specific blog graph. The nodes of Gj may be the m data sets or blogs. There exists an edge epq pointing from blog bp to blog bq if at time j, there are k (k≧1) links pointing from entries in bp to entries in bq that are related to the term or keyword. The weight of epq may be set to be k. An entry-to-entry link may be defined to be related to a term or keyword if either the citing entry in bp or the cited entry in bq contains the term or keyword. The term or keyword-specific data set or blog graph may be observed through n consecutive time windows. If each graph is represented as an m×m adjacency matrix, the entire data may be represented as a third-order tensor $X \in \mathbb{R}^{m \times m \times n}$, where the first two dimensions of X may be respectively the rows and columns of the adjacency matrices, and the third dimension is the time line.
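The construction of the third-order tensor from keyword-related links may be sketched as follows. The record layout (source blog, target blog, time window), the 0-based indices, and the function name `build_link_tensor` are hypothetical choices made for this illustration.

```python
import numpy as np

def build_link_tensor(links, m, n):
    """Build X of shape (m, m, n): X[p, q, j] is the number of keyword-related links
    from entries of blog p to entries of blog q in time window j (the edge weight k)."""
    X = np.zeros((m, m, n))
    for p, q, j in links:
        X[p, q, j] += 1.0
    return X

# Hypothetical link records (source blog, target blog, time window), 0-indexed.
links = [(0, 1, 0), (0, 1, 0), (2, 1, 0), (0, 1, 1), (2, 0, 1)]
X = build_link_tensor(links, m=3, n=2)
```

Each slice `X[:, :, j]` is then the weighted adjacency matrix of the keyword-specific graph for time window j.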
As mentioned above, in various embodiments of the present invention, the method(s) and system(s) may be used to directly analyze trends in dynamically changing graph structures or communities of interrelated data sets, e.g., blogs; such a trend has been identified herein as a structural eigen-trend. Higher-order singular value decomposition (HOSVD) may be applied to the observed data X. X may be represented by, for example, three vectors: a trend vector {right arrow over (t)} (e.g., a structural eigen-trend), an authority vector {right arrow over (a)}, and a hub vector {right arrow over (h)}. Whereas the scalar eigen-trends previously introduced may represent the characteristics of individual entities or bloggers with one or more vectors, e.g., a single vector such as an authority vector, this trend analysis technique may provide a pair of vectors {right arrow over (a)} and {right arrow over (h)}. Further, extending this concept, the present invention may capture a community that consists of hub and authority blogs and may track the structure of the community over time. The following gives a more detailed description of one of the methods that may be used for the structural eigen-trend technique.
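Generally, a singular value decomposition may be applied to $X$ for trend analysis on a dynamically changing graph structure. However, unlike the case of a matrix, singular value decomposition may not be uniquely defined on higher-order tensors. Among the various techniques developed, one exemplary technique that may be used adopts a framework like the one proposed by De Lathauwer et al., which is described as follows. First, the singular value decomposition $X = U\Sigma V^T$ may be rewritten by using the n-mode product as:
$$X = \Sigma \times_1 U \times_2 V \quad (5)$$
where in general, for a tensor $A \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ and a matrix $M \in \mathbb{R}^{J_n \times I_n}$, the n-mode product $A \times_n M$ is a tensor in $\mathbb{R}^{I_1 \times \cdots \times I_{n-1} \times J_n \times I_{n+1} \times \cdots \times I_N}$ with elements
$$(A \times_n M)_{i_1 \ldots i_{n-1}\, j_n\, i_{n+1} \ldots i_N} = \sum_{i_n} a_{i_1 \ldots i_n \ldots i_N}\, m_{j_n i_n}.$$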
In other words, an n-mode product $\times_n$ of $A$ may apply a linear transformation (represented by $M$) to all the n-mode vectors of $A$, where an n-mode vector of $A$ is an $I_n$-dimensional vector obtained by varying the $n$th index of $A$ from 1 to $I_n$ while keeping all other indices fixed. Because a matrix is a special case of a tensor, the natural question is whether Equation (5) can be generalized to a singular value decomposition on higher-order tensors. De Lathauwer et al. proposed a way of doing that and called the method a higher-order singular value decomposition (HOSVD). De Lathauwer et al. showed that a tensor $X \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ can be written as
$$X = S \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)} \quad (6)$$
where each $U^{(n)} \in \mathbb{R}^{I_n \times I_n}$ is an orthogonal matrix and $S \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is a core tensor.
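The n-mode product and the rewriting of the matrix SVD in Equation (5) may be checked numerically with a sketch such as the one below. The helper name `mode_n_product` and the use of a 0-based mode index are assumptions of this illustration; the contraction itself relies on NumPy's `tensordot` and `moveaxis`.

```python
import numpy as np

def mode_n_product(A, M, n):
    """n-mode product A x_n M (n is 0-based here): every mode-n vector of A is
    mapped through M, so axis n changes length from M.shape[1] to M.shape[0]."""
    out = np.tensordot(M, A, axes=(1, n))   # the new index lands on axis 0
    return np.moveaxis(out, 0, n)           # move it back to position n

rng = np.random.default_rng(0)
X = rng.random((4, 3))
U, s, Vt = np.linalg.svd(X)                 # full SVD: U is 4x4, Vt is 3x3
S = np.zeros((4, 3))
S[:3, :3] = np.diag(s)                      # Sigma as a 4x3 matrix

# Equation (5): X = Sigma x_1 U x_2 V (modes 0 and 1 in 0-based indexing).
X_rebuilt = mode_n_product(mode_n_product(S, U, 0), Vt.T, 1)
```

The same helper applies unchanged to third-order tensors, for example to transform only the time mode of a blog-by-blog-by-time tensor.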
Based on this power method, the present invention may use a similar method including the following steps to compute the trend in data set(s) or blog(s) in the blogosphere. First, a third-order tensor X as described above may be built to represent the dynamic change of the term or keyword-specific data set(s) or blog graph(s) over time. Then an iterative method (with appropriate normalization) may be used to compute the first left ({right arrow over (h)}), the first right ({right arrow over (a)}), and third-mode ({right arrow over (τ)}) singular vectors.
It may be shown that the above iteration converges to solutions {right arrow over (h)}, {right arrow over (a)}, {right arrow over (τ)}, λ, such that {right arrow over (h)}·{right arrow over (a)}·λ{right arrow over (τ)}, with · being the tensor outer product, is the rank-1 tensor that best approximates X in terms of Frobenius norm (square error). In various embodiments of the present invention, {right arrow over (t)}=λ{right arrow over (τ)} may be used to represent the temporal trend for the term or keyword-specific data set or blog graph(s).
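The following is a minimal sketch of one such iterative method. It assumes the standard alternating (higher-order power) iteration for a rank-1 tensor approximation and is not asserted to be identical to the iteration of the disclosure; the function name `structural_eigen_trend`, the uniform starting vectors, and the fixed iteration count are likewise assumptions.

```python
import numpy as np

def structural_eigen_trend(X, iters=100):
    """Rank-1 approximation of X (shape m x m x n) by alternating power iteration.
    Returns hub scores h and authority scores a (unit 2-norm) and trend t = lambda * tau."""
    m, _, n = X.shape
    h = np.full(m, 1.0 / np.sqrt(m))
    a = np.full(m, 1.0 / np.sqrt(m))
    tau = np.full(n, 1.0 / np.sqrt(n))
    for _ in range(iters):
        h = np.einsum('pqj,q,j->p', X, a, tau)     # update hubs from authorities and trend
        h /= np.linalg.norm(h)
        a = np.einsum('pqj,p,j->q', X, h, tau)     # update authorities from hubs and trend
        a /= np.linalg.norm(a)
        t = np.einsum('pqj,p,q->j', X, h, a)       # project each graph G_j onto h a^T
        tau = t / np.linalg.norm(t)                # lambda is the norm of t
    return h, a, t

# Hypothetical exact rank-1 tensor built from known hub, authority and trend vectors.
h0 = np.array([0.6, 0.8, 0.0])
a0 = np.array([0.0, 0.6, 0.8])
t0 = np.array([1.0, 2.0, 3.0, 2.0])
X = np.einsum('p,q,j->pqj', h0, a0, t0)
h, a, t = structural_eigen_trend(X)
```

With a single time window, the same loop reduces to alternating updates of hub and authority vectors on one adjacency matrix, which is the HITS-style power iteration.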
Thus, as noted above, the trend {right arrow over (t)} may be called herein a structural eigen-trend to distinguish it from the scalar eigen-trend. The first left and right singular vectors, {right arrow over (h)} and {right arrow over (a)}, may be called hub scores and authority scores, respectively, based on the following intuitive interpretations.
In the HITS algorithm mentioned above, for an adjacency matrix X, the hub score, which is the first left singular vector of X, may represent the goodness of the Web pages on summarizing a keyword; the authority score, which is the first right singular vector of X, represents the goodness of the Web pages on being authorities of the keyword. In at least one embodiment of the present invention, because {right arrow over (h)} and {right arrow over (a)} may be extracted from the tensor X, they can be considered as the general hub and authority scores that may capture the main community structure related to a term or keyword in the dynamically changing term or keyword-specific data set or blog graph. From Equation (7) it can be observed that after {right arrow over (h)} and {right arrow over (a)} have converged, the trend at time j is the projection of the keyword-specific blog graph Gj onto the main community represented by the outer product of {right arrow over (h)} and {right arrow over (a)}. Also from Equation (7) the following can be observed: Property 5. The HITS algorithm is a special case of our method obtained by taking a single time window, i.e., taking n to be 1.
Of course, the eigen-trend approach of the present invention is good for analyzing other graph structures. The term or keyword-specific blog graph is illustrated here only as an example; the trend analysis technique presented can be applied to other general graph structures for many other types of data sets, for example, listed open postings to web sites such as Wikipedia, open postings to Craigslist, etc., as well as to analyze various other dynamically changing undirected graph structures. In the cases of undirected graphs, instead of the pair of hub and authority scores, a single eigen vector that represents the main “shape” of the graph structures may be utilized and/or provided.
In addition, the following property for the trend analysis based on HOSVD can be easily verified. Property 6. If all elements of a third-order tensor X are non-negative, the iteration given in Equation (7) will converge to a solution such that {right arrow over (h)}, {right arrow over (a)}, {right arrow over (τ)} and λ are all non-negative.
There are a number of benefits to using the structural eigen-trend techniques of the present invention. Some exemplary ones follow. Compared to the scalar eigen-trends, the structural eigen-trends focus on and exploit the link structure in the data set(s) or blog(s) of a blogosphere. Whereas the scalar eigen-trends may emphasize the main group of data set(s) or blogs that publish entries individually, the structural eigen-trends may depict activity of the main community that consists of, for example, hubs and authorities referencing each other. Rather than just applying the HITS algorithm to individual time windows, various embodiments of the present invention may track the linking behavior of the data set(s) or blogs to find constant hubs and authorities over time. Such embodiments can discount effects from a particular data set or blog that does not follow the main trend on linking behavior (for example, a data set or blog that generates links randomly) even if it looks like a hub within a specific time window. Similar to the scalar eigen-trend, the secondary trend can be useful, for example, to detect another community behaving differently from the main community.
Referring now to
As previously noted, in one of the exemplary applications of the present invention, trend analysis of blogs is performed. The blogosphere is an ecosystem in which blogs interact with each other, generating a reference structure. In this sense, the blogosphere may be considered as a blog graph where the nodes are blogs and the links reflect endorsements and interactions among blogs. In addition, such a blog graph is changing with time as a result of the development of internal relationships (e.g., interactions among blogs) and external events (e.g., breaking news). Various embodiments of the present invention are directed to analyzing and extracting meaningful trends from such a dynamically changing graph structure.
The capability and usefulness of the present invention have been demonstrated through trend analysis and extraction experiments. Experiments were conducted on synthetic data sets to verify the benefits of eigen-trends, according to at least one embodiment of the present invention. Further, case studies on a real blog data set were conducted to show interesting trends that are revealed by the systems and methods of the present invention and that are not available through traditional count-based methods.
The synthetic data sets were generated as follows. To study the SVD-based trend extraction method, entries are generated from 10 blogs over 250 time units. In a time unit, each blog generates a random number of entries where the number follows a uniform distribution. The mean values of the distribution are different for different blogs. For easy viewing, we let the mean values vary with time following a sinusoid trend.
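A synthetic score matrix of the kind described above may be generated, for example, as in the following sketch. The range of per-blog activity levels, the sinusoid amplitude and number of periods, the rounding to whole entries, and the function name `synthetic_scores` are assumptions made for illustration; only the overall recipe (10 blogs, 250 time units, uniformly distributed counts whose means differ by blog and follow a sinusoid) is taken from the description.

```python
import numpy as np

def synthetic_scores(m=10, n=250, periods=2.0, seed=0):
    """Return (X, mean): X[i, j] is the number of entries of blog i in time unit j."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(1.0, 10.0, size=m)          # per-blog activity level (assumed range)
    wave = 1.0 + 0.5 * np.sin(2 * np.pi * periods * np.arange(n) / n)   # sinusoid trend
    mean = np.outer(base, wave)                    # m x n matrix of mean entry counts
    # A uniform draw on [0, 2 * mean] has the desired mean; round to whole entries.
    X = np.rint(rng.uniform(0.0, 2.0 * mean))
    return X, mean

X, mean = synthetic_scores()
```

Because the per-blog means are the outer product of an activity vector and a sinusoid, the first singular pair of such a matrix would be expected to approximate that activity vector and that sinusoid.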
To study the HOSVD-based trend extraction method, links are generated among 10 blogs over 250 time units. The number of links in each time unit follows a uniform random distribution whose mean value varies over time following a sinusoid trend. When a link is generated, unless stated otherwise, a source blog and a target blog are selected at random, following distributions pre-defined by two unit vectors. These two vectors serve as the underlying hub and authority scores. It should be noted that compared with the real blogosphere, the scale of the examples presented herein is small but the results found are indicative.
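The link-generation procedure may be sketched as follows. The conversion of the two unit vectors into selection probabilities (entries rescaled to sum to one), the mean number of links per time unit, the sinusoid parameters, and the function name `synthetic_links` are assumptions of this illustration and are not specified in the description.

```python
import numpy as np

def synthetic_links(hub, auth, n=250, mean_links=20.0, periods=2.0, seed=0):
    """Return X of shape (m, m, n): X[p, q, j] counts links from blog p to blog q at time j."""
    rng = np.random.default_rng(seed)
    m = len(hub)
    p_src = np.asarray(hub, dtype=float) / np.sum(hub)    # assumed: probability proportional to hub score
    p_tgt = np.asarray(auth, dtype=float) / np.sum(auth)  # assumed: probability proportional to authority score
    wave = 1.0 + 0.5 * np.sin(2 * np.pi * periods * np.arange(n) / n)
    X = np.zeros((m, m, n))
    for j in range(n):
        # Uniformly distributed link count whose mean follows the sinusoid.
        k = int(np.rint(rng.uniform(0.0, 2.0 * mean_links * wave[j])))
        src = rng.choice(m, size=k, p=p_src)
        tgt = rng.choice(m, size=k, p=p_tgt)
        np.add.at(X, (src, tgt, j), 1.0)                  # accumulates repeated (src, tgt) pairs
    return X

hub = np.arange(1, 11, dtype=float)
hub /= np.linalg.norm(hub)            # hypothetical underlying hub scores (unit 2-norm)
auth = hub[::-1].copy()               # hypothetical underlying authority scores
X = synthetic_links(hub, auth, n=50)
```

This simple sketch does not exclude links from a blog to itself; a variant could redraw the target whenever it equals the source.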
The experimental results that follow shown in
Referring to
It should be noted that all the figures shown herein for trends (e.g., count-based trend, scalar eigen-trend, and structural eigen-trend) have an x-axis representing the time windows and a y-axis representing the trend values. For other singular vectors, the x-axis denotes the blog number from 1 to m, where m is the total number of blogs. For the singular values, if the top k singular values are shown, then the x-axis denotes the index of the singular values from 1 to k.
As can be seen from
Referring to
Referring to
Referring to
Referring to
Referring now to
Referring to
Referring to
Referring to
As noted above,
to represent the popularity distribution of a fictitious keyword that is popular equally among all bloggers. We then may use, for example, the Kullback-Leibler divergence between the $\vec{a}'$ vector of a keyword and $\vec{a}_o$, i.e.,
$$KL(\vec{a}' \,\|\, \vec{a}_o) = \sum_i a'_i \log \frac{a'_i}{a_{o,i}},$$
to measure how general a keyword is. Intuitively, the lower the divergence for a keyword, the “flatter” the distribution and hence the more popular the keyword is among the general public. In our example, the divergence for Engadget is 7.21 and that for Technorati is 3.68. Applying this measure, we are able to order some representative keywords from more “spiky” distributions to “flatter” distributions as (PowerPC, Engadget, MSDN, iMac, TiVo, Macromedia, RFID, Palm, Netflix, Slashdot, Windows Vista, Xbox, Windows XP, iPod Shuffle, Flickr, MSN, iPod Nano, Technorati, iPod, Google, Yahoo, Network, Internet). This result matches our common sense quite well, because the keywords at the front of the list seem to be the names of products with narrower audiences while those at the end of the list seem to be more general brand or technology names in which more people are interested.
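The divergence-based measure of how general a keyword is may be sketched as follows. The sketch assumes that the authority vector is first rescaled so that its entries sum to one, that the reference distribution is uniform over the blogs, and that the natural logarithm is used; the function name `kl_from_uniform` and the example vectors are hypothetical.

```python
import math

def kl_from_uniform(a):
    """KL divergence between a' = a / sum(a) and the uniform distribution over len(a) blogs.
    Entries equal to zero contribute nothing to the sum."""
    total = sum(a)
    m = len(a)
    return sum((x / total) * math.log((x / total) * m) for x in a if x > 0)

flat = kl_from_uniform([1.0, 1.0, 1.0, 1.0])     # equally popular among all blogs
spiky = kl_from_uniform([1.0, 0.0, 0.0, 0.0])    # concentrated on a single blog
```

Under these assumptions a perfectly flat distribution gives a divergence of zero, and the divergence grows as the authority scores concentrate on fewer blogs.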
Experimental results using real data for structural eigen-trend analysis will now be considered for at least one embodiment of the present invention. In the experiments, structural eigen-trends extracted by using HOSVD generally comply with trends obtained by using other methods. Referring to
The dominating authority for the structural eigen-trend in graph 1725 of
To find the reason for this spike, in
As noted earlier, in at least one embodiment, the system(s) and method(s) provided herein may be implemented using a computing device, for example, a personal computer, a server, a mini-mainframe computer, and/or a mainframe computer, etc., programmed to execute a sequence of instructions that configure the computer to perform operations as described herein. In various embodiments, the computing device may be, for example, a personal computer available from any number of commercial manufacturers such as, for example, Dell Computer of Austin, Tex., running, for example, the Windows™ XP™ and Linux operating systems, and having a standard set of peripheral devices (e.g., keyboard, mouse, display, printer).
Instructions may be read into a main memory from another computer-readable medium, such as a storage device. The term “computer-readable medium” as used herein may refer to any medium that participates in providing instructions to the processing unit 1905 for execution. Such a medium may take many forms, including, but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media may include, for example, optical or magnetic disks, thumb or jump drives, and storage devices. Volatile media may include dynamic memory such as a main memory or cache memory. Transmission media may include coaxial cable, copper wire, and fiber optics, including the connections that comprise the bus 1950. Transmission media may also take the form of acoustic or light waves, such as those generated during Radio Frequency (RF) and Infrared (IR) data communications. Common forms of computer-readable media include, for example, floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, Universal Serial Bus (USB) memory stick™, a CD-ROM, DVD, any other optical medium, a RAM, a ROM, a PROM, an EPROM, a Flash EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to the processing unit(s) 1905 for execution. For example, the instructions may be initially borne on a magnetic disk of a remote computer(s) 1985 (e.g., a server, a PC, a mainframe, etc.). The remote computer(s) 1985 may load the instructions into its dynamic memory and send the instructions over one or more network interface(s) 1980 using, for example, a telephone line connected to a modem, which may be an analog, digital, DSL or cable modem. The network may be, for example, the Internet, an Intranet, a peer-to-peer network, etc. The computing device 1900 may send messages and receive data, including program code(s), through a network of other computer(s) via the communications interface 1910, which may be coupled through network interface(s) 1980. A server may transmit a requested code for an application program through the Internet for a downloaded application. The received code may be executed by the processing unit(s) 1905 as it is received, and/or stored in a storage device 1915 or other non-volatile storage 1955 for later execution. In this manner, the computing device 1900 may obtain an application code in the form of a carrier wave.
The present system(s) and method(s) may reside on a single computing device or platform 1900, or on multiple computing devices 1900, or different applications may reside on separate computing devices 1900. Application executable instructions/APIs 1940 and operating system instructions 1935 may be loaded into one or more allocated code segments of computing device 1900 volatile memory for runtime execution. In one embodiment, computing device 1900 may include system memory 1955, such as 512 MB of volatile memory and 80 GB of nonvolatile memory storage. In at least one embodiment, software portions of the present invention system(s) and method(s) may be implemented using, for example, C programming language source code instructions. Other embodiments are possible.
Application executable instructions/APIs 1940 may include one or more application program interfaces (APIs). The system(s) and method(s) of the present invention may use APIs 1940 for inter-process communication and to request and return inter-application function calls. For example, an API may be provided in conjunction with a database 1965 in order to facilitate the development of, for example, SQL scripts useful to cause the database to perform particular data storage or retrieval operations in accordance with the instructions specified in the script(s). In general, APIs may be used to facilitate development of application programs which are programmed to accomplish some of the functions described herein.
The communications interface(s) 1910 may provide the computing device 1900 the capability to transmit and receive information over the Internet, including but not limited to electronic mail, HTML or XML pages, and file transfer capabilities. To this end, the communications interface 1910 may further include a web browser such as, but not limited to, Microsoft Internet Explorer™ provided by Microsoft Corporation. The user interface(s) 1920 may include a computer terminal display, keyboard, and mouse device. One or more Graphical User Interfaces (GUIs) also may be included to provide for display and manipulation of data contained in interactive HTML or XML pages.
Referring now to
While embodiments of the invention have been described above, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. In general, embodiments may relate to the automation of these and other business processes in which analysis of data is performed. Accordingly, the embodiments of the invention, as set forth above, are intended to be illustrative, and should not be construed as limitations on the scope of the invention. Various changes may be made without departing from the spirit and scope of the invention. Therefore, the scope of the present invention should be determined not by the embodiments illustrated above, but by the claims appended hereto and their legal equivalents.
All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
This application claims the benefit of U.S. Provisional Application No. 60/733,231 filed Nov. 3, 2005, the entire disclosure of which is hereby incorporated by reference as if set forth fully herein.
Number | Date | Country | |
---|---|---|---|
60733231 | Nov 2005 | US |