The various aspects of the subject innovation are now described with reference to the annexed drawings, wherein like numerals refer to like or corresponding elements throughout. It should be understood, however, that the drawings and detailed description relating thereto are not intended to limit the claimed subject matter to the particular form disclosed. Rather, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the claimed subject matter.
To date, several methods have been proposed for the analysis, clustering, and indexing of input objects (e.g., a set of documents). Notable among these methods are Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI) and Latent Dirichlet Allocation (LDA). The LSI approach represents individual objects (e.g., topics) via the leading eigenvectors of AᵀA, where A is an input term-document matrix. By utilizing such leading eigenvectors, LSI can preserve the major associations between words and documents for a given set of data, thereby capturing both the synonymy and polysemy of words. However, while LSI preserves major associations between words and documents for a given set of input data and enjoys the strengths of spectral approaches, the technique has been found to be deficient in that it does not possess strong generative semantics.
In contrast, the Probabilistic Latent Semantic Indexing (PLSI) and Latent Dirichlet Allocation (LDA) modalities both employ probabilistic generative models to analyze, cluster and/or characterize a set of input objects (e.g., documents) and utilize latent variables in an attempt to capture an underlying topic structure. LDA is considered to be a richer generative model than PLSI. Both the PLSI and LDA approaches are superior to the LSI methodology in modeling polysemy and have better indexing power. Nevertheless, in comparison to the LSI technique, the PLSI and LDA perspectives lack the advantages of spectral methods. Moreover, the batch nature of PLSI in particular, since it estimates all aspects together, can lead to drawbacks with respect to model selection, speed, and application to expanding object sets, for example.
The claimed subject matter can incrementally and without supervision discover, cluster, characterize and/or generate objects (e.g., topics) that are meaningful to human perception from a set of input objects (e.g., documents), while preserving major associations between the contents (e.g., words) of the set of input objects so as to capture the synonymy and preserve the polysemy of the words. Additionally, the claimed subject matter can grow dynamically, adaptively and incrementally as the need arises. To this end, in one embodiment of the invention, ideas from density boosting, gradient based approaches and expectation-maximization (EM) algorithms can be employed to incrementally estimate aspect models, and probabilistic and spectral methods can be utilized to facilitate semantic analysis in a manner that leverages the advantages of both spectral and probabilistic techniques, creating an incremental unsupervised learning framework that can discover, cluster and/or characterize objects (topics) from a collection or set of input objects (documents).
Density boosting is a technique for growing probability density models by adding one component at a time, the aim being to estimate the new component and to combine it with the existing components such that a cost function (e.g., F=(1−a)*G+a*h, where F is the new model, G is the old model, h is the new component and a is a combination parameter, with h and a estimated such that F is better than G in some way) is optimized. It should be noted that, prior to the new component being estimated, each data point is weighted by a factor that is small for points well represented by the old model and high for points not well represented by the old model; this is equivalent to giving more importance to points not well represented by the old model.
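By way of a hedged illustration only (and not as part of the claimed subject matter), the following sketch exercises the density-boosting update F=(1−a)*G+a*h described above on a toy one-dimensional data set. The function names, the choice of a Gaussian for the new component h, and the fixed value of a (rather than an estimated one) are assumptions made solely for readability.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def boost_density(x, old_model, a=0.3):
    """One density-boosting step: weight points poorly covered by the old
    model G, fit a new Gaussian component h to the weighted data, and return
    the combined model F = (1 - a) * G + a * h."""
    g = old_model(x)
    weights = 1.0 / np.maximum(g, 1e-12)       # small for well-represented points
    weights /= weights.sum()
    mu = np.sum(weights * x)                   # weighted mean and spread of h
    sigma = np.sqrt(np.sum(weights * (x - mu) ** 2)) + 1e-6
    h = lambda t: gaussian_pdf(t, mu, sigma)
    return lambda t: (1.0 - a) * old_model(t) + a * h(t)

# Toy usage: the old model G covers data near 0, while new data appears near 5.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 0.5, 50)])
G = lambda t: gaussian_pdf(t, 0.0, 1.0)
F = boost_density(x, G)
print(F(np.array([0.0, 5.0])))                 # the new model assigns mass near 5
```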
Preliminary indications are that the proposed incremental unsupervised learning framework can focus on tightly linked sets of input objects (e.g., documents) and the contents (e.g., words, photographs, and the like) of such objects to discover, cluster and/or characterize objects (topics) in a manner that closely correlates to the discoveries, clusters, and characterizations that would be attained by a human intermediary. Additionally, initial results further suggest that such an incremental unsupervised learning framework has advantages in relation to speed, flexibility, model selection and inference, which can in turn lead to better browsing and indexing systems.
Prior to embarking on an expansive discussion of the claimed subject matter, it can be constructive at this juncture to provide a cursory overview of aspect models. An aspect model is a latent variable model in which observed variables can be expressed as a combination of underlying latent variables. Each aspect z models the notion of a topic (e.g., a distribution over words, that is, how often a word occurs in a document containing a particular topic, and in what ratios the words occur), thereby allowing one to treat the observed variables (e.g., documents d) as mixtures of topics. In other words, an aspect model is any model in which objects can be represented as a combination of underlying groupings or themes. To illustrate this point, consider receiving a single text document and being requested to cluster its contents. The document might relate to only one topic, or it might relate to a plethora of issues, though on its face the document has no correlative relationship with any particular categorization; at the outset of the exercise it simply appears to be a collection of associated words. Subsequent perusal of the received document may yield, for example, that the document relates to corporate bankruptcies and crude oil. In this simplistic example, therefore, the latent aspects or topics to which the received document relates are corporate bankruptcies and crude oil, and based on these latent aspects the document can be grouped as being related to corporate bankruptcies and crude oil. It should be noted, however, that in some aspect models, such as PLSI, the observed variables can be considered to be independent of each other given the aspect.
The foregoing illustration can be expanded and represented mathematically as follows. Assume, for instance, that a set of documents D={d1, d2, . . . , dN} is received and that each document d is represented by an M-dimensional vector of words taken from a fixed vocabulary W={w1, w2, . . . , wM}. The frequency of word w in document d is given by nwd, and the entire input data can be represented as a word-document co-occurrence matrix of size M*N. Further, as stated supra, in PLSI an aspect model is a latent variable model in which the observed variables are independent given a hidden class variable z, and the hidden class variable z (or aspect) models the notion of a topic, allowing one to treat documents as mixtures of topics. Such an aspect model can decompose the joint word-document probability as:
P(w,d)=Σk P(zk)P(w|zk)P(d|zk), the sum running over k=1, . . . , K,  (1)

where K represents the number of aspect variables.
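As a hedged, self-contained illustration (not drawn from the specification itself), the decomposition of equation (1) can be realized as a simple array computation; the array names and dimensions below are assumptions chosen only for readability.

```python
import numpy as np

K, M, N = 3, 5, 4                      # aspects, vocabulary size, documents

rng = np.random.default_rng(0)
p_z = rng.dirichlet(np.ones(K))        # P(z_k), shape (K,)
p_w_z = rng.dirichlet(np.ones(M), K)   # P(w|z_k), rows sum to 1, shape (K, M)
p_d_z = rng.dirichlet(np.ones(N), K)   # P(d|z_k), rows sum to 1, shape (K, N)

# Equation (1): P(w, d) = sum_k P(z_k) P(w|z_k) P(d|z_k)
P_wd = np.einsum('k,kw,kd->wd', p_z, p_w_z, p_d_z)
assert np.isclose(P_wd.sum(), 1.0)     # a proper joint distribution over (w, d)
```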
A problem with the foregoing model, however, is that it is static: the number of aspects or topics has to be known, or guessed, in advance, and the general topic areas have to be identified prior to employing the model. Thus, for example, if a stream of input documents or metadata is continuously fed into the model, the number of putative aspects that can be discovered, clustered, and/or characterized is constrained to the number of general topic areas that were identified prior to the employment of the model. Given this problem, it would be beneficial to estimate the aspects incrementally and/or dynamically while the model is being executed and contemporaneously with the data being received. The advantage of such an incremental approach is that the modality requires fewer computational resources and consequently can deal with larger and more expansive datasets. Further, such an incremental technique allows the model to grow to accommodate new data (e.g., data for which no general topic areas have been pre-defined) without having to retrain the entire model ab initio. Additionally, such an incremental approach permits easier model selection, since the model can be stopped and restarted as required without losing topics already extracted, thereby providing continuity to a user.
Once the observed data has been scanned and digitized, the data is passed to an analysis component 120 that assays the digitized representation of the observed data. The analysis component 120 determines whether objects can be grouped together, such as whether words and documents, genes and diseases, or one document and a disparate document (e.g., web pages pointing to each other) can be clustered together. In other words, the analysis component 120 can build aspect models in which objects can be represented as a combination of underlying groupings or themes. For example, for documents, aspects represent topics and documents can be treated as a combination of themes or topics; e.g., in a document about genes and symptoms of disorders, the underlying themes or topics could relate to some property of genes. For instance, where two different genes affect the liver, both could cause similar symptoms, yet in combination with other genes cause different disorders. These underlying groupings or clusters can be referred to as aspects.
Nominally, aspect models can be defined as probabilistic and generative models since such aspect models usually propose some model by which observed objects are generated. For instance, in order to generate a document, one can select topics A, B, and C with probabilities 0.3, 0.5, and 0.2 (note that the probabilities sum to 1), and then from the selected topics one can select words according to some underlying distribution distinct to each.
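Purely as a hedged sketch of the generative process just described (the topic proportions, vocabulary, and function name are illustrative assumptions, not part of the claimed subject matter), a document can be sampled by repeatedly choosing a topic according to its probability and then drawing a word from that topic's word distribution:

```python
import numpy as np

def generate_document(topic_probs, word_dists, vocab, length=10, seed=0):
    """Sample a document: pick a topic per word position, then a word from
    that topic's distribution."""
    rng = np.random.default_rng(seed)
    words = []
    for _ in range(length):
        z = rng.choice(len(topic_probs), p=topic_probs)    # select topic A, B or C
        w = rng.choice(len(vocab), p=word_dists[z])        # select a word from topic z
        words.append(vocab[w])
    return words

vocab = ["oil", "barrel", "court", "debt", "gene", "liver"]
topic_probs = [0.3, 0.5, 0.2]                  # P(A)=0.3, P(B)=0.5, P(C)=0.2
word_dists = [
    [0.4, 0.4, 0.05, 0.05, 0.05, 0.05],        # topic A: crude oil
    [0.05, 0.05, 0.45, 0.35, 0.05, 0.05],      # topic B: corporate bankruptcies
    [0.05, 0.05, 0.05, 0.05, 0.4, 0.4],        # topic C: genes
]
print(generate_document(topic_probs, word_dists, vocab))
```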
The analysis component 120 can estimate a series of aspects denoted as ht (t=1, 2, . . . ), wherein each ht (which can of itself be considered a weak model) captures a portion of the joint distribution P(w,d), such that each aspect ht is learned on a weighted version of the digitized data that emphasizes the parts not well covered by the current (or previously generated) weak model. While each aspect ht by itself comprises a weak model, the combination Ft=(1−αt)Ft−1+αtht is stronger than the previous model Ft−1. In other words, the analysis component 120 additively grows a latent model by finding a new component that is currently underrepresented by the existing latent model (e.g., the analysis component 120 weights the digitized data prior to estimating the new component) and adding the new component to the latent model to provide a new latent model.
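The following hedged sketch shows the additive growth Ft=(1−αt)Ft−1+αtht operating on a word-document probability table, with the data re-weighted by the reciprocal of the current model before each new component is fitted. The estimator for ht is a deliberately simple stand-in, and all names, the fixed αt, and the rank-one form of the component are assumptions made for illustration.

```python
import numpy as np

def grow_model(F_prev, n_wd, estimate_component, alpha=0.2):
    """One incremental step: weight the data by how poorly F_prev covers it,
    estimate a weak component h on the weighted data, and blend it in."""
    weights = n_wd / np.maximum(F_prev, 1e-12)     # emphasize underrepresented pairs
    h = estimate_component(weights)                # placeholder for the aspect estimate
    return (1.0 - alpha) * F_prev + alpha * h      # F_t = (1 - a_t) F_{t-1} + a_t h_t

def rank_one_component(weights):
    """A simple stand-in estimator: a rank-one distribution aligned with the weights."""
    w = weights.sum(axis=1)
    d = weights.sum(axis=0)
    return np.outer(w / w.sum(), d / d.sum())

n_wd = np.random.default_rng(1).integers(0, 5, size=(6, 4)).astype(float)
F = np.full_like(n_wd, 1.0 / n_wd.size)            # uniform initial model
for _ in range(3):
    F = grow_model(F, n_wd, rank_one_component)
print(F.sum())                                      # remains a normalized joint model
```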
Additionally, the analysis component 120 estimates the new model based on an optimization of a function that measures the overall cost after adding each new component to the model, such as a cost function (e.g., the cost of adding a new component and/or the cost of the overall model after the new component is added), a distance function (e.g., the distance between the current model and a pre-existing ideal, for instance the KL-distance), a log cost function, etc. In order to utilize one or more of these cost functions, the analysis component 120 can regularize the cost function as needed to avoid generating trivial or useless solutions. For example, the analysis component 120 can regularize the functions (e.g., through use of an L2 regularizer (not shown) that considers the energy in a probability vector, such as ∥w∥2 and ∥d∥2) by adding cost functions together. For instance, if the cost is f(x) and f(x) tends to 0 as x increases, then the cost can be arbitrarily reduced by letting x tend to infinity; since in practice this may not be useful, the analysis component 120 can provide a regularized version f(x)+b*g(x), where g(x) increases with x and b is a regularization parameter.
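As a hedged toy illustration of the regularized form f(x)+b*g(x) (the particular functions and constants below are assumptions chosen only to make the effect visible), an unregularized cost that keeps improving as x grows is held in check once a penalty that grows with x is added:

```python
import numpy as np

def cost(x):
    return 1.0 / (1.0 + x)             # f(x): tends to 0 as x increases

def regularized_cost(x, b=0.05):
    return cost(x) + b * x             # f(x) + b*g(x) with g(x) = x

xs = np.linspace(0.0, 20.0, 201)
best_unregularized = xs[np.argmin(cost(xs))]             # pushed to the boundary (x = 20)
best_regularized = xs[np.argmin(regularized_cost(xs))]   # a finite, useful minimum
print(best_unregularized, best_regularized)
```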
In general, a model is built by adding a component, represented as: P(w,d)=(1−α)F(w,d)+αh(w,d), where no special assumptions are made on the model h(w,d). For the previously mentioned case of the aspect model, where the underlying objects are independent given the aspect, i.e., h(w,d)=P(w|zK)P(d|zK), the joint word-document probability distribution P(w,d) of equation (1), supra, can be represented as follows:
P(w,d)=(1−α)F(w,d)+αP(w|zK)P(d|zK)
   =(1−α)F(w,d)+αh(w,d),  (2)

where α=P(zK), a mixing ratio, gives the prior probability that any word-document pair belongs to the Kth aspect. Note that the second line in the above equations can alternatively be interpreted as a model where h(w,d) is a joint distribution model in which independence assumptions need not be used. Thus, given the current estimate F(w,d), the analysis component 120 can determine values for h and α. It should be noted that the analysis component 120 can utilize many types of optimization techniques to determine h and α. For example, the analysis component 120 can employ gradient descent, conjugate-gradient descent, functional gradient descent, expectation maximization (EM), generalized expectation maximization, or a combination of the aforementioned. Nevertheless, for the purposes of explication and not limitation, the claimed subject matter is described herein as utilizing a combination of generalized expectation maximization and functional gradient descent optimization techniques.
Additionally, it should be noted that when estimating h (a probability distribution) one can make assumptions about how h should be represented. For example, one can select h as belonging to a particular family of distributions and thus estimate parameters in this manner. Further, it can be assumed that the distribution forms a hierarchical structure for h, in which case other optimization steps can be required. Moreover, it can also be assumed that different kinds of objects are independent, e.g., it can be assumed that words and documents are independent of one another.
Accordingly, in order to ascertain the values for h and α, the analysis component 120 can maximize the empirical log-likelihood Σw,d nwd log P(w,d). Substituting equation (2) into the empirical log-likelihood, it can be written as:
L=Σw,d nwd log[(1−α)F(w,d)+αh(w,d)],  (3)

which is to be maximized over h and α. However, it is difficult to optimize this function directly. Often the optimization is done using a different function, called a "surrogate" function or, in some circles, a Q-function. It is so called because the two functions share properties such that optimizing the latter will lead to optimizing the former. A surrogate function can be constructed in many ways; one way is to use a function that forms a tight lower bound to the optimization function. For example, in this instance one can write the surrogate function as:
Q(h,α;Pwd)=Σw,d nwd[(1−Pwd) log((1−α)F(w,d)/(1−Pwd))+Pwd log(αh(w,d)/Pwd)].  (4)

Having rendered the surrogate function, the analysis component 120 can utilize the following expectation step (E-step):
Pwd=αh(w,d)/((1−α)F(w,d)+αh(w,d)),  (5)

which is obtained by optimizing the surrogate function (e.g., equation (4)) over the Pwd, which are also known as the hidden parameters.
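By way of a hedged sketch only (assuming the E-step takes the standard mixture-responsibility form reconstructed in equation (5); the variable names are illustrative), the hidden parameters Pwd can be computed elementwise from the current model F, the candidate component h, and the mixing ratio α:

```python
import numpy as np

def e_step(F, h, alpha):
    """Responsibility of the new component for each (w, d) pair:
    P_wd = alpha*h / ((1 - alpha)*F + alpha*h)."""
    num = alpha * h
    return num / np.maximum((1.0 - alpha) * F + num, 1e-12)

F = np.full((4, 3), 1.0 / 12)                         # current model, uniform here
h = np.outer([0.7, 0.1, 0.1, 0.1], [0.6, 0.2, 0.2])   # candidate aspect h(w,d)=p(w|z)p(d|z)
P = e_step(F, h, alpha=0.25)
print(P.round(2))             # large where h explains a pair better than F does
```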
The M-step involves estimating the model parameters so that the surrogate function is maximized. In a generalized EM (GEM) approach the surrogate function need not be maximized but merely increased (or decreased, as appropriate). This can be done in many ways, e.g., using a conjugate gradient approach, a functional gradient approach, etc. In this instance, a functional gradient approach is adopted to estimate the function h, and the parameters are estimated such that the expected value of the first order approximation of the difference between the optimization function before and after adding the new model is improved. Specifically, if the old model is F and the new model is F′, the aim is to maximize E{L(F′)−L(F)}, which, when approximated using a first order Taylor expansion, depends only on the functional gradient of L at F.
In other words, estimate h such that this functional derivative is maximized and is at least non-negative. Thus, utilizing the log cost function, the functional derivative can be written as
Σw,d nwd(h(w,d)−F(w,d))/F(w,d),

and if the negative of the log cost function with an L2 regularizer is employed, the quantity to be minimized can be written as
−Σw,d (nwd/F(w,d))h(w,d)+λ∥h∥2.

The above minimization can be done in many ways. Through utilization of a log cost function with a regularizer (e.g., an L2 regularizer) as shown above, one can estimate h such that:
h=argminh{−Σw,d (nwd/F(w,d))h(w,d)+λ∥h∥2},

where λ is a regularization parameter and ∥h∥2 is the norm of h. Further, if one assumes conditional independence, e.g., h(w,d)=p(w|z)p(d|z), then based on this assumption one can use a different form of the regularizer, which depends on the norms of w=p(w|z) and d=p(d|z), such that
w,d=argminw,d{−Σw,d (nwd/F(w,d))p(w|z)p(d|z)+ν∥w∥2+μ∥d∥2},

where ν and μ are two different regularization parameters. If a new matrix V of dimensions w×d, whose entries are
Vwd=nwd/F(w,d),

were created, then the above estimation can be rewritten as w,d=argminw,d{−wᵀVd+νwᵀw+μdᵀd}. This can be solved by finding the derivative of the cost with respect to w and d and setting these to zero, resulting in a pair of iterative updates. Further assuming that ν and μ are both equal to 1, the result is a pair of assignments that are used iteratively to arrive at a solution:
w=Vd and d=Vᵀw (spectral M-step).  (6)
The solutions to this pair of equations are the top left and right singular vectors of the matrix V, leading to a spectral approach (e.g., methods and techniques that identify and employ singular values/eigenvalues and singular vectors/eigenvectors of matrices). Adoption of such a spectral approach facilitates locating tight clusters or groups of objects (e.g., based on how well connected the underlying objects are).
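As a hedged illustration of the spectral M-step of equation (6), the pair of updates w=Vd, d=Vᵀw can be iterated with normalization until the leading singular vectors are reached. The construction of V as the counts weighted by 1/F follows the reconstruction above and, together with the function name, is an assumption made for illustration.

```python
import numpy as np

def spectral_m_step(n_wd, F, iters=50):
    """Iterate w = V d, d = V^T w (with normalization) on V = n_wd / F
    to obtain the leading left/right singular vectors of V."""
    V = n_wd / np.maximum(F, 1e-12)
    M, N = V.shape
    w = np.full(M, 1.0 / M)             # uniform initialization
    d = np.full(N, 1.0 / N)
    for _ in range(iters):
        w = V @ d
        w /= np.linalg.norm(w)
        d = V.T @ w
        d /= np.linalg.norm(d)
    return w, d

n_wd = np.random.default_rng(2).integers(0, 6, size=(8, 5)).astype(float)
F = np.full_like(n_wd, 1.0 / n_wd.size)
w, d = spectral_m_step(n_wd, F)
U, s, Vt = np.linalg.svd(n_wd / F)      # cross-check against a direct SVD
print(np.abs(np.dot(w, U[:, 0])))       # close to 1: same leading left vector (up to sign)
```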
To illustrate a group of objects that are tightly clustered, consider a graph of words and documents wherein a line or link is drawn between a word and a document if and only if the word exists in the document, with the link strength indicated by how often the word occurs in the document. Thus, where many words occur together in a plethora of documents and all the link strengths between these groups of words and documents are strong, this can be perceived as a tight cluster.
A weighted term-document matrix can be viewed as a bipartite graph with words on one side and documents on the other. An illustration of such a bipartite graph is provided in
Alternatively, equation (6) can be viewed as ranking the relevancy of objects based on their links or relations to other objects. For example, for words and documents, the most relevant words are the ones that tend to occur more often in more important documents, and the important documents are the ones that contain more of the key words. This co-ranking can also be implemented using other weighted ranking schemes.
Thus, once h is estimated, α can be estimated using one of many methods. For example, a line search can be utilized to estimate α such that

α=argmaxα Σw,d nwd log[(1−α)F(w,d)+αh(w,d)],

that is, such that the resulting combined model maximizes the empirical log-likelihood of equation (3).
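The following hedged sketch shows one simple way such a line search over α might look; the grid-search form and the objective are assumptions consistent with the log-likelihood reconstructed above, and the names are illustrative only.

```python
import numpy as np

def line_search_alpha(n_wd, F, h, grid=np.linspace(0.01, 0.99, 99)):
    """Pick alpha maximizing the empirical log-likelihood of (1-a)*F + a*h."""
    def log_lik(a):
        mix = (1.0 - a) * F + a * h
        return np.sum(n_wd * np.log(np.maximum(mix, 1e-12)))
    return max(grid, key=log_lik)

n_wd = np.random.default_rng(3).integers(0, 6, size=(8, 5)).astype(float)
F = np.full_like(n_wd, 1.0 / n_wd.size)
h = (n_wd + 1e-9) / (n_wd.sum() + 1e-9 * n_wd.size)   # a component matched to the data
print(line_search_alpha(n_wd, F, h))                   # large alpha: h explains the data well
```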
The analysis component 120 can employ many different criteria to evaluate whether to cease adding more components; for example, the analysis component 120 can ascertain that the digitized data is sufficiently well described by the current model, that the weights for most of the digitized data are too small, and/or that the cost function is not decreasing sufficiently. Additionally, the analysis component 120 can ascertain that a stop or termination condition has been attained where a functional gradient ceases to yield positive values, a pre-determined number of clusters has been obtained, and/or an identified object has been located.
Accordingly, utilization of equation (6) by the analysis component 120 effectuates a convergence of the final scores to the leading left and right singular vectors of a normalized version of T; for irreducible graphs in particular, the final solution is unique regardless of the starting point, and the convergence is rapid. Thus, if a connection matrix has certain properties, such as being irreducible, there will be quick convergence to the same topic/theme regardless of how the model is initialized. Not all word-document sets reduce to irreducible graphs, but this shortcoming can be overcome by the analysis component 120 introducing weak links from every node to every other node in the bipartite graph.
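Purely as a hedged sketch (the uniform smoothing constant is an assumption, and other ways of introducing weak links are possible), adding a small weight between every node pair makes the bipartite connection matrix irreducible before the spectral iteration is run:

```python
import numpy as np

def add_weak_links(V, eps=1e-3):
    """Blend in a tiny uniform connection so every word links weakly to every
    document, making the bipartite graph irreducible."""
    uniform = np.full_like(V, V.sum() / V.size)
    return (1.0 - eps) * V + eps * uniform

V = np.array([[3.0, 0.0], [0.0, 2.0]])   # two disconnected word-document blocks
print(add_weak_links(V))                  # every entry is now strictly positive
```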
By modifying a PLSI algorithm, aspect models can be built incrementally, with each aspect being estimated one at a time. However, before each aspect is estimated, the data should be weighted by 1/F. Further, at the start of the PLSI algorithm, w and d are initialized with normalized unit vectors and the regular M-step is replaced by the spectral M-step of equation (6).
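Tying the pieces together, the following hedged end-to-end sketch estimates aspects one at a time, re-weighting the data by 1/F before each new aspect is fitted. It reuses the E-step, spectral M-step and line search sketched above; all function and variable names, the normalization choices, and the fixed iteration counts are assumptions, and the sketch is not offered as the claimed algorithm itself.

```python
import numpy as np

def estimate_aspect(n_wd, F, inner_iters=30):
    """Fit one aspect h(w,d) = p(w|z) p(d|z) on data weighted by 1/F
    using the spectral M-step, then pick alpha by a line search."""
    V = n_wd / np.maximum(F, 1e-12)
    M, N = V.shape
    w, d = np.full(M, 1.0 / M), np.full(N, 1.0 / N)    # normalized unit vectors
    for _ in range(inner_iters):
        w = V @ d
        w /= w.sum()                                   # keep them as distributions
        d = V.T @ w
        d /= d.sum()
    h = np.outer(w, d)
    alphas = np.linspace(0.01, 0.99, 99)
    ll = lambda a: np.sum(n_wd * np.log(np.maximum((1 - a) * F + a * h, 1e-12)))
    return h, max(alphas, key=ll)

def incremental_plsi(n_wd, num_aspects=3):
    F = np.full_like(n_wd, 1.0 / n_wd.size, dtype=float)   # uniform initial model
    aspects = []
    for _ in range(num_aspects):
        h, alpha = estimate_aspect(n_wd, F)
        F = (1.0 - alpha) * F + alpha * h                  # F_t = (1-a)F_{t-1} + a*h
        aspects.append((h, alpha))
    return F, aspects

n_wd = np.random.default_rng(4).integers(0, 5, size=(10, 6)).astype(float)
F, aspects = incremental_plsi(n_wd)
print(len(aspects), F.sum())
```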
Such an algorithm can thus facilitate a system that can be adapted to accommodate new unseen data as it arrives. To handle streaming data, one should be able to understand how much of the new data is already explained by the existing models; once this is comprehended, one can automatically ascertain how much of the new data is novel. In one embodiment of the claimed subject matter, a "fold-in" approach similar to the one used in regular PLSI can be adopted. Since the function F represents the degree to which a pair (w,d) is represented, one has to estimate this function for every data point, which in turn means that one has to determine how much each point is represented by each aspect h, i.e., one needs to estimate p(w|h) and p(d|h) for all the new (w,d) pairs in the set of new documents X. To this end, one first keeps the p(w|z) values fixed for all the words that have already been seen, estimating only the probabilities of the new words (the p(w|z) vectors are renormalized as needed). Then, using the spectral projections (equation (6)), p(d|z) is estimated while still holding p(w|z) fixed. Using this, one can compute the new F for all of X, which is the end of the "fold-in" process. Once the new documents are folded in, one can use the new F to run more iterations on the data to discover new themes.
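A hedged sketch of the fold-in step described above follows; the data layout and function names are assumptions, and for simplicity the projection uses the raw counts of the new documents rather than a 1/F-weighted version.

```python
import numpy as np

def fold_in(new_n_wd, p_w_z, p_z):
    """Estimate p(d|z) for new documents with p(w|z) held fixed, then
    compute F(w, d) for the folded-in documents."""
    K = len(p_z)
    num_new_docs = new_n_wd.shape[1]
    p_d_z = np.zeros((K, num_new_docs))
    for k in range(K):
        d = new_n_wd.T @ p_w_z[k]            # spectral projection d = V^T w, per aspect
        p_d_z[k] = d / max(d.sum(), 1e-12)
    # F(w, d) = sum_k P(z_k) p(w|z_k) p(d|z_k) for the new documents
    F_new = np.einsum('k,kw,kd->wd', p_z, p_w_z, p_d_z)
    return p_d_z, F_new

rng = np.random.default_rng(5)
p_z = np.array([0.6, 0.4])                               # mixing weights of existing aspects
p_w_z = rng.dirichlet(np.ones(8), 2)                     # fixed word profiles, shape (2, 8)
new_n_wd = rng.integers(0, 4, size=(8, 3)).astype(float) # three unseen documents
p_d_z, F_new = fold_in(new_n_wd, p_w_z, p_z)
print(F_new.shape)                                       # (8, 3): F for the folded-in documents
```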
To provide further context and to better clarify the foregoing, the following exemplary algorithmic discussion of the claimed subject matter is presented. From the foregoing discussion it can be observed that the unsupervised learning framework generated by system 100 has two constituent parts: a restriction element, which, based on already discovered, clustered, grouped and/or characterized topic(s)/theme(s), defines a restricted space; and a discovery feature that employs the restricted space provided by the restriction element to spectrally ascertain new coherent topic(s)/theme(s) and to modify the restriction based on the newly identified coherent topic(s)/theme(s). These two constituent parts loop in lock step until an appropriate topic/theme is identified.
Initially upon invocation of each ULF step a new F is employed to select a restriction equal to 1/F which effectively up-weights data points that are poorly represented by F (e.g. ULF commences with a uniform initialization of F). Having selected an appropriate restriction, ULF invokes D
In addition, and in order to streamline the totality of topics/themes rendered and to curtail the occurrence of redundant topics/themes, the claimed subject matter can perform ancillary post-processing at the completion of each ULF iteration. For example, an analysis of the word distribution of the newest topic/theme identified can be effectuated to ensure that newly identified topics/themes are not correlative with one or more topics/themes that may have been identified during earlier iterations of ULF. Where a correspondence between the newly identified topic/theme and the previously identified and/or clustered topics/themes becomes apparent, post-processing can merge the topics/themes based on a pre-determined threshold. Such merging effectively creates an interpolated version of the model where the new word distribution is, for example,
p(w|z′i)=(p(zi)p(w|zi)+p(zj)p(w|zj))/(p(zi)+p(zj)).  (12)
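As a hedged illustration of the merge in equation (12) (the array names and values are assumptions), two correlated topics' word distributions can be combined into a single interpolated distribution weighted by their priors:

```python
import numpy as np

def merge_topics(p_zi, p_w_zi, p_zj, p_w_zj):
    """Equation (12): prior-weighted interpolation of two word distributions."""
    return (p_zi * p_w_zi + p_zj * p_w_zj) / (p_zi + p_zj)   # still sums to 1

p_w_zi = np.array([0.5, 0.3, 0.1, 0.1])      # topic i word distribution
p_w_zj = np.array([0.45, 0.35, 0.1, 0.1])    # a near-duplicate topic j
print(merge_topics(0.2, p_w_zi, 0.1, p_w_zj))
```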
With reference to
It should be noted that the analysis component 120 can also receive input from a user interface (not shown), wherein certain thresholds can be specified. Alternatively, the analysis component 120 can automatically and selectively ascertain appropriate thresholds for use by the various components incorporated therein. For example, a threshold can be utilized by the weighting component 340 wherein the weighting component 340 compares a score received from the scoring component 310 with the threshold specified or ascertained to determine whether a newly created aspect should be merged, eliminated, down-weighted or up-weighted (i.e., when a newly created aspect has never been, or has rarely been, seen during prior iterations).
While the claimed subject matter is described in terms of generative models, the subject matter as claimed can also find application with respect to non-generative models, for example, where the aspect model does not employ a probability distribution but rather utilizes any non-negative function. Thus, in this non-generative embodiment the non-negative function provides a score for each object such that the score provides a measure of the relevance of the object to a given aspect. Aside from this distinction, the claimed subject matter operates in the same manner provided above for generative models, including utilization of the log cost function and the combination of the expectation maximization and functional gradient approaches.
As will be appreciated, various portions of the disclosed systems above and methods below may include or consist of artificial intelligence, machine learning, or knowledge or rule based components, sub-components, processes, means, methodologies, or mechanisms (e.g., support vector machines, neural networks, expert systems, Bayesian belief networks, fuzzy logic, data fusion engines, classifiers . . . ). Such components, inter alia, can automate certain mechanisms or processes performed thereby to make portions of the systems and methods more adaptive as well as efficient and intelligent. By way of example and not limitation, the interface component 110, analysis component 120, filter component 410 and notification component 430 can as warranted employ such methods and mechanisms to infer context from incomplete information, and learn and employ user preferences from historical interaction information.
In view of the exemplary systems described supra, methodologies that may be implemented in accordance with the disclosed subject matter will be better appreciated with reference to the flow charts of
Referring to
In order to provide a context for the various aspects of the disclosed subject matter,
As used in this application, the terms “component,” “system” and the like are intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an instance, an executable, a thread of execution, a program and/or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
The word “exemplary” is used herein to mean serving as an example, instance or illustration. Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Similarly, examples are provided herein solely for purposes of clarity and understanding and are not meant to limit the subject innovation or portion thereof in any manner. It is to be appreciated that a myriad of additional or alternate examples could have been presented, but have been omitted for purposes of brevity.
Artificial intelligence based systems (e.g. explicitly and/or implicitly trained classifiers) can be employed in connection with performing inference and/or probabilistic determinations and/or statistical-based determinations as in accordance with one or more aspects of the subject innovation as described hereinafter. As used herein, the term “inference,” “infer” or variations in form thereof refers generally to the process of reasoning about or inferring states of the system, environment, and/or user from a set of observations as captured via events and/or data. Inference can be employed to identify a specific context or action, or can generate a probability distribution over states, for example. The inference can be probabilistic—that is, the computation of a probability distribution over states of interest based on a consideration of data and events. Inference can also refer to techniques employed for composing higher-level events from a set of events and/or data. Such inference results in the construction of new events or actions from a set of observed events and/or stored event data, whether or not the events are correlated in close temporal proximity, and whether the events and data come from one or several event and data sources. Various classification schemes and/or systems (e.g., support vector machines, neural networks, expert systems, Bayesian belief networks, fuzzy logic, data fusion engines . . . ) can be employed in connection with performing automatic and/or inferred action in connection with the subject innovation.
Furthermore, all or portions of the subject innovation may be implemented as a system, method, apparatus, or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware or any combination thereof to control a computer to implement the disclosed innovation. The term “article of manufacture” as used herein is intended to encompass a computer program accessible from any computer-readable device or media. For example, computer readable media can include but are not limited to magnetic storage devices (e.g., hard disk, floppy disk, magnetic strips . . . ), optical disks (e.g., compact disk (CD), digital versatile disk (DVD) . . . ), smart cards, and flash memory devices (e.g., card, stick, key drive . . . ). Additionally it should be appreciated that a carrier wave can be employed to carry computer-readable electronic data such as those used in transmitting and receiving electronic mail or in accessing a network such as the Internet or a local area network (LAN). Of course, those skilled in the art will recognize many modifications may be made to this configuration without departing from the scope or spirit of the claimed subject matter.
With reference to
The system bus 1018 can be any of several types of bus structure(s) including the memory bus or memory controller, a peripheral bus or external bus, and/or a local bus using any variety of available bus architectures including, but not limited to, 11-bit bus, Industrial Standard Architecture (ISA), Micro-Channel Architecture (MSA), Extended ISA (EISA), Intelligent Drive Electronics (IDE), VESA Local Bus (VLB), Peripheral Component Interconnect (PCI), Universal Serial Bus (USB), Advanced Graphics Port (AGP), Personal Computer Memory Card International Association bus (PCMCIA), and Small Computer Systems Interface (SCSI).
The system memory 1016 includes volatile memory 1020 and nonvolatile memory 1022. The basic input/output system (BIOS), containing the basic routines to transfer information between elements within the computer 1012, such as during start-up, is stored in nonvolatile memory 1022. By way of illustration, and not limitation, nonvolatile memory 1022 can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM), or flash memory. Volatile memory 1020 includes random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in many forms such as synchronous RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
Computer 1012 also includes removable/non-removable, volatile/non-volatile computer storage media.
It is to be appreciated that
A user enters commands or information into the computer 1012 through input device(s) 1036. Input devices 1036 include, but are not limited to, a pointing device such as a mouse, trackball, stylus, touch pad, keyboard, microphone, joystick, game pad, satellite dish, scanner, TV tuner card, digital camera, digital video camera, web camera, and the like. These and other input devices connect to the processing unit 1014 through the system bus 1018 via interface port(s) 1038. Interface port(s) 1038 include, for example, a serial port, a parallel port, a game port, and a universal serial bus (USB). Output device(s) 1040 use some of the same type of ports as input device(s) 1036. Thus, for example, a USB port may be used to provide input to computer 1012 and to output information from computer 1012 to an output device 1040. Output adapter 1042 is provided to illustrate that there are some output devices 1040 like displays (e.g., flat panel and CRT), speakers, and printers, among other output devices 1040 that require special adapters. The output adapters 1042 include, by way of illustration and not limitation, video and sound cards that provide a means of connection between the output device 1040 and the system bus 1018. It should be noted that other devices and/or systems of devices provide both input and output capabilities such as remote computer(s) 1044.
Computer 1012 can operate in a networked environment using logical connections to one or more remote computers, such as remote computer(s) 1044. The remote computer(s) 1044 can be a personal computer, a server, a router, a network PC, a workstation, a microprocessor based appliance, a peer device or other common network node and the like, and typically includes many or all of the elements described relative to computer 1012. For purposes of brevity, only a memory storage device 1046 is illustrated with remote computer(s) 1044. Remote computer(s) 1044 is logically connected to computer 1012 through a network interface 1048 and then physically connected via communication connection 1050. Network interface 1048 encompasses communication networks such as local-area networks (LAN) and wide-area networks (WAN). LAN technologies include Fiber Distributed Data Interface (FDDI), Copper Distributed Data Interface (CDDI), Ethernet/IEEE 802.3, Token Ring/IEEE 802.5 and the like. WAN technologies include, but are not limited to, point-to-point links, circuit-switching networks like Integrated Services Digital Networks (ISDN) and variations thereon, packet switching networks, and Digital Subscriber Lines (DSL).
Communication connection(s) 1050 refers to the hardware/software employed to connect the network interface 1048 to the bus 1018. While communication connection 1050 is shown for illustrative clarity inside computer 1012, it can also be external to computer 1012. The hardware/software necessary for connection to the network interface 1048 includes, for exemplary purposes only, internal and external technologies such as modems (including regular telephone grade modems, cable modems, power modems and DSL modems), ISDN adapters, and Ethernet cards or components.
The system 1100 includes a communication framework 1150 that can be employed to facilitate communications between the client(s) 1110 and the server(s) 1130. The client(s) 1110 are operatively connected to one or more client data store(s) 1160 that can be employed to store information local to the client(s) 1110. Similarly, the server(s) 1130 are operatively connected to one or more server data store(s) 1140 that can be employed to store information local to the servers 1130. By way of example and not limitation, the systems as described supra and variations thereon can be provided as a web service with respect to at least one server 1130. This web service server can also be communicatively coupled with a plurality of other servers 1130, as well as associated data stores 1140, such that it can function as a proxy for the client 1110.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the disclosed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims. Furthermore, to the extent that the terms “includes,” “has” or “having” or variations in form thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising” as “comprising” is interpreted when employed as a transitional word in a claim.