The computers that are used by people in a company are typically connected to a server and/or one another over a network. The way that each person in a company uses his or her computer could provide valuable information for others in the organization. Unfortunately, a lot of business knowledge that could be inferred and shared by monitoring the computer activities of users within the company gets lost each day.
Various technologies and techniques are disclosed for aggregating and using data collected from multiple computers to modify a later behavior of those computers. In one implementation, a data aggregation system is described. A data collector is operable to collect behavior data over a network from one or more applications used by the computers, and to save the behavior data to a data store. A data installer is operable to access the behavior data in the data store and convert the behavior data into a format that will modify a future operation of at least one of the applications that is used on at least one of the computers.
In one implementation, a method for creating and distributing a custom dictionary is described. Term data is received from computers over a network. The term data includes terms that have been collected from applications running on the computers. The term data that was received from the computers is analyzed to determine which terms should be marked for distribution to the computers. The terms marked for distribution are sent to at least one of the computers for inclusion in a custom dictionary that is used by one or more of the applications.
In another implementation, a method for identifying related documents is described. Document correlation data is received from a plurality of computers over a network. The document correlation data includes information about documents that are opened at similar points in time. Alternatively or additionally, the document correlation data can include information about documents that are referenced together in an email or other document. The document correlation data that was received from the computers is then analyzed to create a database of related documents. A query request is received from one of the computers over the network. The query request contains a request for any documents that are related to a particular document. In response to the query request, result information is returned regarding one or more documents that are contained in the database of related documents that were previously determined to be related to the particular document.
This Summary was provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
The technologies and techniques herein may be described in the general context as a framework for collecting behavior data from computers over a network and then using the behavior data to alter the operation of those computers, but the technologies and techniques also serve other purposes in addition to these. In one implementation, one or more of the techniques described herein can be implemented as features within a content management application such as MICROSOFT® Office SharePoint Server, or from any other type of program or service that monitors the behavior of one or more computers or that utilizes the behavior data that has been collected from multiple computers.
In one implementation, behavior data is collected from computers over a network, such as an intranet. The term “behavior data” as used herein is meant to include data that is related to actions that happen while a computer is being used, such as what files are opened around the same time, what content actually gets typed into the programs that are open, and so on. Once that behavior data is collected from multiple computers over a network, the behavior data can be analyzed in the aggregate and used to determine interesting updates to make to the client computers.
As one non-limiting example, a custom dictionary can be created and then propagated back down to the computers on the network after analyzing the behavior data to create or revise the custom dictionary. In such a scenario, the behavior data can be in the form of “term data”, which includes terms that are used by end users within documents. For example, term data can include commonly used words, entries from the user's custom lexicon, words that were ignored, etc. As another non-limiting example, documents that have been determined to be related to each other upon collecting data from multiple computers can be shared with other computers in the network. These are just a few examples of how the aggregated behavior data can be used to then update other computers in the network. Turning now to
In one implementation, data collector 102 resides on a server and is connected with computers 108 over a network, such as an intranet, the Internet, or another network. When data collector 102 is contained on a server, data collector 102 is responsible for collecting behavior data from multiple computers 108 that participate in the network, and then storing the collected behavior data in data store 106. In other implementations, a separate data collector 102 can be installed on each of computers 108, with each data collector 102 then being responsible for recording the data to the data store 106. Data that is collected by each data collector 102 is stored in data store 106 with unique IDs that allow the data to be retrieved later.
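As a non-limiting illustration, the collection step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the described implementation: the class names `DataStore` and `DataCollector`, the in-memory dictionary standing in for data store 106, the record fields, and the use of UUIDs as the unique IDs are all hypothetical choices made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DataStore:
    """Hypothetical in-memory stand-in for data store 106."""
    records: dict = field(default_factory=dict)

    def save(self, computer_id, kind, payload):
        # Assumption: a random UUID serves as the unique ID for each record.
        record_id = str(uuid.uuid4())
        self.records[record_id] = {
            "computer": computer_id,
            "kind": kind,
            "payload": payload,
        }
        return record_id

    def get(self, record_id):
        # Retrieve a previously stored record by its unique ID.
        return self.records[record_id]


class DataCollector:
    """Hypothetical stand-in for data collector 102."""

    def __init__(self, store):
        self.store = store

    def collect(self, computer_id, kind, payload):
        # Record one piece of behavior data and return its unique ID.
        return self.store.save(computer_id, kind, payload)


store = DataStore()
collector = DataCollector(store)
rid = collector.collect("pc-17", "file_opened", {"path": "budget.xlsx"})
```

The same collector object could sit on a server or on each client; only the location of the store it writes to would differ.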
One non-limiting example of behavior data that can be collected by data collector 102 includes what files are opened around the same time. If users tend to open a word processing document at the same time as a spreadsheet, then that gives a good indication that these documents may be related or have some other connection to one another. Another non-limiting example of behavior data includes what content actually gets typed into the programs that are open. For example, if an email or word processing document frequently includes hyperlinks or embedded attachments to certain documents or resources together, then there is a good chance that those documents are related.
Another non-limiting example of behavior data that could be gathered by data collector 102 includes the words that get typed into a word processing or other document that are flagged as incorrect by a proofing tool and then indicated as “correct” by the user. Examples of proofing tools can include a grammar checker, contextual spell checker, etc. Whether the user indicates that something is correct, indicates that it is incorrect, or does nothing, this information can be useful. For example, it could evidence a company-specific or industry standard term that may not appear in a general dictionary. These are just a few non-limiting examples to illustrate the types of behavior data that could be collected by data collector 102 from computers 108. Any other actions that can be monitored and collected from computers 108 for use (such as in the aggregate or on an individual user basis) could also be gathered by data collector 102.
When gathered in the aggregate from multiple computers 108 over a network, this behavior data can be used for various scenarios to provide enhanced functionality to some or all of the computers 108 participating in the network. Data collector 102 is responsible for analyzing the behavior data contained in the data store 106. Data installer 104 then converts the behavior data into a format that will modify a future operation of at least one of the applications on one or more of computers 108. For example, this can include creating data for a custom dictionary, making recommendations on documents that are related to one another, providing a list of related people (like on a same team), distributing content and/or application updates, and so on.
In another implementation, behavior data can be collected over one network for use as a training set. The result of the analysis of the training data can then be used to alter the operation of one or more computers on another network (that is separate from the network on which the data was collected).
Various usage examples are described in further detail in
One of ordinary skill in the computer art will appreciate that data collector 102 and/or data installer 104 can be located on one of many varying computers and/or arrangements and still perform some or all of the techniques described herein. For example, data collector 102 and/or data installer 104 can be located on one or more client computers, server computers, and/or both.
Turning now to
In this example, behavior data gets collected from both the server side and the client side (by data collectors 130 and 136, respectively). For example, behavior data can be captured by data collector 130 from the way that users interact with one or more programs that run on the server computer 122, such as browser-based applications. Then, on the client computer 124, the data collector 136 can collect behavior data from applications 140 that are running locally on the machine, such as a word processor, spreadsheet, etc.
In the example shown, the data installers (132 and 138, respectively) are each responsible for accessing data store 126 and making use of the aggregated data on the respective computer. In the case of the server computer 122, data installer 132 is responsible for creating or modifying the operation of one or more programs that run on the server computer 122, such as a web application. On client computer 124, the data installer 138 is responsible for modifying the operation of one or more of applications 140 based upon the aggregated data that was retrieved from the data store 126. As noted in the discussion of
In one implementation, a custom dictionary could be created from this data gathered from multiple client computers. In the implementation shown in
A dictionary creator 244 (which is a data collector) on the server side then analyzes the terms that have been collected from both the client side and the server side to create a list of terms that are marked for distribution to a custom dictionary. This analysis can include analyzing how frequently those terms were used by multiple users across the network, and/or other analysis. The analysis can also include identifying and storing synonyms of those words that are marked for distribution.
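One way the frequency analysis described above might look is sketched below. This is an illustrative sketch only: the function name `mark_terms_for_distribution`, the `min_users` threshold, the shape of the term reports, and the case-folding of terms are assumptions made for the example, and the sample terms and computer names are made up.

```python
from collections import defaultdict


def mark_terms_for_distribution(term_reports, min_users=2):
    """Return terms reported by at least `min_users` distinct computers.

    term_reports: iterable of (computer_id, list_of_terms) pairs.
    Assumption: terms are compared case-insensitively, which is a
    simplification that loses the capitalization of proper nouns.
    """
    users_per_term = defaultdict(set)
    for computer_id, terms in term_reports:
        for term in terms:
            users_per_term[term.lower()].add(computer_id)
    # Keep only terms seen on enough distinct computers.
    return sorted(
        term for term, users in users_per_term.items()
        if len(users) >= min_users
    )


# Hypothetical term data collected from three computers.
reports = [
    ("pc-1", ["Contoso", "widgetizer"]),
    ("pc-2", ["contoso", "teh"]),
    ("pc-3", ["Widgetizer", "contoso"]),
]
marked = mark_terms_for_distribution(reports)
print(marked)  # ['contoso', 'widgetizer']
```

Counting distinct computers rather than raw occurrences keeps a single user's repeated typo (here, "teh") from being marked for distribution.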
In one implementation, dictionary creator 244 simply identifies the terms that need to be distributed across one or more custom group dictionaries on the respective computers and then allows each respective computer to add those terms to its local dictionary. In another implementation, dictionary creator 244 actually creates a revised custom dictionary and distributes an actual custom dictionary file to the respective computers that request it. In this latter example, a custom dictionary installer 242 requests from the data store 240 the terms that have been sent to the data store 240 for inclusion in a custom dictionary. The custom dictionary installer 242 then takes the data and converts it into a custom dictionary that the word processor can load. Then, the next time the client user starts a word processing session, the custom dictionary that is loaded contains terms that were aggregated from across many machines over the network.
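The conversion step performed by the custom dictionary installer can be illustrated with the sketch below. The file format is an assumption: the example writes a plain-text file with one term per line, which may differ from the format a given word processor actually loads. The function name `install_custom_dictionary` and the file name `group.dic` are hypothetical.

```python
import tempfile
from pathlib import Path


def install_custom_dictionary(terms, dictionary_path):
    """Merge distributed terms into a local custom dictionary file.

    Assumption: the dictionary is a UTF-8 text file, one term per line.
    Returns the merged, sorted list of terms written to the file.
    """
    path = Path(dictionary_path)
    existing = set()
    if path.exists():
        # Keep whatever terms the local dictionary already contains.
        existing = {
            line.strip()
            for line in path.read_text(encoding="utf-8").splitlines()
            if line.strip()
        }
    merged = sorted(existing | set(terms))
    path.write_text("\n".join(merged) + "\n", encoding="utf-8")
    return merged


# Usage with a temporary directory standing in for the client machine.
with tempfile.TemporaryDirectory() as tmp:
    dic = Path(tmp) / "group.dic"
    dic.write_text("intranet\n", encoding="utf-8")
    installed = install_custom_dictionary(["widgetizer", "contoso"], dic)

print(installed)  # ['contoso', 'intranet', 'widgetizer']
```

Merging with the existing file, rather than overwriting it, corresponds to the variant in which each computer adds the distributed terms to its own local dictionary.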
Turning now to
A query request is later received for any documents that are related to a particular document (stage 306). For example, a word processing application or other application may request information about any other documents that are related to a document that the user is currently accessing. This can be requested specifically by the user who wants to see related documents, or this can be requested automatically by an application so that the application can display those related documents automatically. The result information regarding any related documents is returned to the application that requested the information (stage 308). An example of this will be described in further detail in
A word processor can have a document open detector 358 which tracks which documents get opened around a similar time. This data is also sent to data manager 362 for inclusion as a possibly related document. This data is then saved in a data store 364. A related documents analyzer 368 then analyzes this collected behavior data and determines in the aggregate which of the documents are actually related to one another. Various techniques can be used to create a web of related documents, such as using temporal analysis, frequency analysis, and/or other heuristics. The data store 364 is then updated with the results of the analysis so the related documents can later be retrieved.
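The temporal and frequency analysis mentioned above can be sketched as follows. This is a simplified, hypothetical heuristic rather than the analyzer itself: the function name `related_pairs`, the five-minute window, the `min_count` threshold, and the event tuple layout are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def related_pairs(open_events, window_seconds=300, min_count=2):
    """Find document pairs that are often opened close together in time.

    open_events: list of (computer_id, timestamp_seconds, doc_id).
    A pair is counted once each time both documents are opened on the
    same computer within `window_seconds` of each other, and is kept
    if that count reaches `min_count` across all computers.
    """
    by_computer = {}
    for computer, ts, doc in open_events:
        by_computer.setdefault(computer, []).append((ts, doc))

    counts = Counter()
    for events in by_computer.values():
        for (t1, d1), (t2, d2) in combinations(sorted(events), 2):
            if d1 != d2 and abs(t2 - t1) <= window_seconds:
                counts[tuple(sorted((d1, d2)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical document-open events from three computers.
events = [
    ("pc-1", 0, "plan.docx"),
    ("pc-1", 60, "budget.xlsx"),
    ("pc-1", 4000, "notes.txt"),
    ("pc-2", 100, "budget.xlsx"),
    ("pc-2", 130, "plan.docx"),
    ("pc-3", 10, "plan.docx"),
    ("pc-3", 20, "notes.txt"),
]
print(related_pairs(events))  # {('budget.xlsx', 'plan.docx'): 2}
```

Requiring the same pair to recur across computers is what makes this an aggregate signal: a single coincidental co-opening (such as `plan.docx` and `notes.txt` on one machine) does not mark two documents as related.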
When an application such as word processor 356 requests the related documents 360 that are related to a particular document, then a related documents service 370 is called. The request can include the name or other identifier of a particular document that related document information is being requested for. Related documents service 370 can be implemented as a web service, as an executable, or in any other format that allows the related document data to be accessed from one or more client computers. The related documents service 370 then processes the related information 374 that it accesses from the data store 364 using the document identifier.
The related documents service 370 then submits that information back to the client computer 374 and then to the word processor 356 for display. The result information that is returned back to the word processor 356 can be in the format of one or more identifiers that can then be used to retrieve the actual underlying related documents when desired. For example, these identifiers can be a file path and/or a URL to where that document is located. As another non-limiting example, the result information can include the contents of the related documents themselves (i.e. the actual document itself).
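The lookup performed when a related documents query arrives can be sketched as below, returning identifiers such as file paths or URLs rather than document contents. The function names `build_related_index` and `query_related`, the dictionary-based index standing in for the data store, and the sample paths and URL are hypothetical.

```python
def build_related_index(pairs):
    """Build a symmetric lookup table from pairs of related documents.

    pairs: iterable of (doc_id_a, doc_id_b), where each ID is assumed
    to be a file path or URL identifying the document.
    """
    index = {}
    for a, b in pairs:
        index.setdefault(a, set()).add(b)
        index.setdefault(b, set()).add(a)
    return index


def query_related(index, doc_id):
    """Return sorted identifiers of documents related to `doc_id`."""
    return sorted(index.get(doc_id, ()))


# Hypothetical results of an earlier related-documents analysis.
index = build_related_index([
    ("//share/plan.docx", "//share/budget.xlsx"),
    ("//share/plan.docx", "https://intranet.example/notes"),
])
result = query_related(index, "//share/plan.docx")
print(result)  # ['//share/budget.xlsx', 'https://intranet.example/notes']
```

A document with no recorded relations yields an empty list, so the requesting application can simply display nothing.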
In another implementation, some or all of the techniques described herein can be used for distributing updates to multiple computers over a network.
As shown in
Additionally, device 500 may also have additional features/functionality. For example, device 500 may also include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in
Computing device 500 includes one or more communication connections 514 that allow computing device 500 to communicate with other computers/applications 515. Device 500 may also have input device(s) 512 such as keyboard, mouse, pen, voice input device, touch input device, etc. Output device(s) 511 such as a display, speakers, printer, etc. may also be included. These devices are well known in the art and need not be discussed at length here.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. All equivalents, changes, and modifications that come within the spirit of the implementations as described herein and/or by the following claims are desired to be protected.
For example, a person of ordinary skill in the computer software art will recognize that the examples discussed herein could be organized differently on one or more computers to include fewer or additional options or features than as portrayed in the examples.