Over approximately the last 30 years, dramatic advances in technology (for example, the development of the minicomputer, the rise of the personal computer, and the emergence of the Internet) have revolutionized the way information is created, stored, shared, and used. Today, as technology continues to advance and improve, new breakthroughs are transforming the world once again. The foundation for the current transformation is the combination of an increasing diversity of ever more powerful devices and the expanding data storage capacity in large-scale networked data centers (“the cloud”) that are accessed through the increasingly ubiquitous broadband networks that make up the Internet. The capabilities of such technologies are supporting the movement of computing resources, including both consumer and business-oriented applications, from the desktop or enterprise environment out to the Internet as hosted services.
Under such a cloud-computing model, locally installed software on a client platform may be replaced, supplemented, or blended with a service component that is delivered over a network. Such models can often give customers more choices and flexibility by delivering software solutions and user experiences that can typically be rapidly deployed and accompanied by value-added services. In addition to providing application services, cloud-based computing can also provide data sharing and storage capabilities for users to access, collaborate in, and share rich data that leverages the global cloud-computing footprint. Data is synchronized through the service so that files and folders, for example, are commonly replicated across the devices.
While service platforms in the cloud are expected to provide attractive, feature-rich solutions to customers that are well managed, robust, and cost-effective, it is desirable to have effective and efficient ways to synchronize data between client devices and the cloud-based service. In particular, it would be desirable to be able to perform synchronization quickly without consuming so many resources that the client device becomes unresponsive to the user. However, these requirements often conflict with each other, particularly when a device is starting up. If the device has been offline for some time and a lot of data needs to be replicated, the synchronization process can slow the startup (i.e., boot) process, which lengthens the time that the device is unusable before applications become available and the user can begin to work.
This Background is provided to introduce a brief context for the Summary and Detailed Description that follow. This Background is not intended to be an aid in determining the scope of the claimed subject matter nor be viewed as limiting the claimed subject matter to implementations that solve any or all of the disadvantages or problems presented above.
Synchronization of data across multiple devices, which function as endpoints in a mesh network that supports a data sharing service, is throttled in response to user activity in the network. The activity is monitored using a component in a mesh operating environment (“MOE”) runtime that is instantiated on each endpoint. The monitoring may include the collection of data that can be used to infer user activity in the mesh network, as well as data that explicitly indicates activity. State information is maintained so that data can be synchronized across the endpoints even when a user goes offline from the service. When the user logs on to the service, makes changes to a shared file, or the endpoint device starts up upon connection to a mesh network, for example, throttling is performed by prioritizing work items associated with synchronization operations so that resources on the endpoint are not consumed so heavily that the quality of the user experience is reduced.
In various illustrative examples, higher priority is assigned to synchronization operations for users who are currently logged on to the service on the mesh network, while lower priority is assigned to users who are not logged on. When no users are logged on, low priority is maintained for all synchronization operations. For currently logged on users, the synchronization operations can be throttled up or down depending on the monitored user activities. For example, calls to the MOE runtime and file system operations may be tracked and used as hints to identify other data, such as files in a common folder, that may be needed by the user and that can then be given higher priority for synchronization. In this case, operations like data fetching from a peer endpoint will be given priority to ensure that the folder is kept up to date.
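By way of illustration only, the log-on based prioritization described above may be sketched as follows. The names `SyncPriority`, `base_priority`, and `throttle`, and the three-level priority scale, are hypothetical and are introduced solely for this example; they are not part of the MOE runtime.

```python
from enum import IntEnum


class SyncPriority(IntEnum):
    """Hypothetical three-level priority scale (an assumption of this sketch)."""
    LOW = 0
    NORMAL = 1
    HIGH = 2


def base_priority(owner, logged_on_users):
    """Return the starting priority for a synchronization operation owned by `owner`."""
    if not logged_on_users:
        # No users are logged on: every operation stays at low priority.
        return SyncPriority.LOW
    # Logged-on users get higher priority than users who are not logged on.
    return SyncPriority.HIGH if owner in logged_on_users else SyncPriority.LOW


def throttle(priority, step):
    """Throttle up (step > 0) or down (step < 0), clamped to the scale."""
    clamped = max(SyncPriority.LOW, min(SyncPriority.HIGH, priority + step))
    return SyncPriority(clamped)
```

In this sketch, `throttle` would be applied only to operations of currently logged-on users, in response to the monitored activity.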
Conversely, when monitored system processes indicate a high level of resource use but such processes are not associated with the data sharing service (for example, a user may be performing a task such as editing a video that is computationally intensive), then synchronization operations can be given lower priority to free up resources to maintain a good user experience at the endpoint. In other examples, certain synchronization processes that are computationally intensive, such as hash calculations used to maintain data security, may be delayed to allow the user to complete activities such as edits to a file. Synchronization operations can then be performed later, rather than attempting to keep up with the user with each change to the file.
Synchronization may be more heavily throttled during startup of an endpoint, when resources are typically consumed at peak rates. However, by using historical user activity patterns that may be persisted in a data store, the data that is most likely to be needed first by the user after startup is completed can be synchronized with the highest priority. By reducing the consumption of resources through throttling, the startup can complete more quickly, which reduces the time that the endpoint is unusable. When the startup is completed and the user's desktop applications become ready for use, the data and files that the user needs to begin work will already be synchronized and current on the endpoint.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Like reference numerals indicate like elements in the drawings.
Cloud service 120 may replace, supplement, or blend with features and capabilities provided by applications and software that run locally. Offerings may include, for example, one or more of identity and directory services, device management and security, synchronized storage and data services across multiple devices or platforms, and services pertaining to activities and news. The cloud service 120 may be provided under a variety of different business models including free, advertising-supported, and subscription-based models.
As shown in
The endpoints 122 shown in
It is emphasized that the endpoints shown in
As shown in
Meshes can overlap as shown in
The user 305 makes changes to a file in the folder 302, in this example, when the laptop computer 122-3 is offline (as indicated by reference numeral 321), perhaps during an airplane flight in which network connectivity is normally unavailable. As shown in
Returning to
Illustrative synchronization operations 500 that are generally performed by an endpoint 122 are shown in
Data is uploaded and/or downloaded (520) as required to replicate data identically across the endpoints 122. In this example, synchronization is implemented using a collection of two-way Atom feeds in which data in the user's mesh 208 (e.g., file, folder, message, etc.) is rendered as a piece of information in the feed. Alternatives to the Atom feed include feeds expressed using RSS (Really Simple Syndication), JSON (JavaScript Object Notation), Microsoft FeedSync (formerly known as Simple Sharing Extensions or “SSE”), WBXML (WAP Binary XML), and POX (plain old XML). The Atom feeds provide a common synchronization infrastructure for the mesh 208 between applications on the endpoints and the service 120. Such feeds are also commonly used to support synchronization of non-mesh applications such as e-mail and news readers, for example. Accordingly, upload and download of Atom feeds are also performed (525). Data describing synchronization state may also be loaded and/or saved to databases as required to perform the synchronization process (530). It is noted that the synchronization operations 500 are illustrative and that the particular operations and the order in which they are performed may vary to meet the needs of a given implementation. To cite just one example of such variation, the data uploading/downloading (520) and the Atom feed uploading/downloading (525) may be performed in reverse order to that shown in
An illustrative software architecture 600 for a representative endpoint 122 is shown in
The MOE runtime 635 is generally configured to expose services to help the applications 606 running on endpoints 122 to create cached or offline-based experiences to reduce round trip interactions with the cloud service 120 or to enable the endpoints to morph data into a more consumable form. More specifically, as shown in
The work item manager 707 interacts with work items 710-1, 710-2 . . . 710-N that represent events that occur in association with synchronization operations in the endpoint 122. Such operations may include, for example, those shown in
The synchronization throttling manager 711 is configured to interoperate with the work item manager 707 by monitoring user activity at the endpoint 122 (as indicated by reference numeral 725), as well as tracking user activity (728) in the form of historical statistics that are persisted to a store 731. In response to the monitoring and tracking performed by the synchronization throttling manager 711, the work item manager 707 will assign a synchronization priority 718 to the work items 710 (735).
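The cooperation between the two components may be sketched, under stated assumptions, as follows. The class and method names (`WorkItem`, `WorkItemManager`, `SynchronizationThrottlingManager`, `assign_priority`, `on_activity`) and the integer priority values are hypothetical and chosen only for illustration; the actual interfaces of the work item manager 707 and the synchronization throttling manager 711 are not specified here.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    operation: str     # e.g. "feed upload/download" or "file/folder scanning"
    target: str        # mesh object the operation relates to (assumed field)
    priority: int = 0  # larger value is served first (assumed convention)


class WorkItemManager:
    """Holds pending work items and orders them by synchronization priority."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def assign_priority(self, matches, priority):
        # Set the priority of every work item selected by the predicate.
        for item in self._items:
            if matches(item):
                item.priority = priority

    def in_order(self):
        # sorted() is stable, so equal priorities keep arrival order.
        return sorted(self._items, key=lambda item: -item.priority)


class SynchronizationThrottlingManager:
    """Turns monitored user activity into priority changes."""

    def __init__(self, work_item_manager):
        self._manager = work_item_manager

    def on_activity(self, target):
        # Activity on a mesh object raises the priority of its work items.
        self._manager.assign_priority(lambda item: item.target == target, 10)
```

A usage pattern consistent with the description would add work items as synchronization events occur, call `on_activity` whenever monitoring observes the user touching a mesh object, and then process items in the order returned by `in_order`.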
The monitoring 725 may comprise collecting data that may be used to infer activity of a user 305, as well as data that explicitly indicates such activity. In the first case, by monitoring the API calls from the applications 606 to the MOE runtime 635 (740), and tracking the data that the applications access, hints as to what other data will be accessed by the user 305 may be ascertained. That is, the calls and data will implicitly indicate which object in the user's mesh 208 is currently being used by the user 305.
For example, if the user is browsing the files in the “My Project” folder 302 on his desktop, the shell 610 is the application that will be making calls to the MOE runtime 635. The user 305 may then start up a word processing application and begin to make edits to a particular file in the folder 302. By monitoring this user activity, the synchronization throttling manager 711 may infer that other files in the folder 302 will also be accessed and/or edited by the user 305. Accordingly, the work item manager 707 may raise the priority of work items 710 associated with the synchronization of the files in the folder 302 so that synchronization of the files takes priority over other operations.
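The folder-based inference in this example may be sketched as follows. The function name `boost_folder`, the mapping of file paths to integer priorities, and the size of the boost are assumptions made for illustration only.

```python
from pathlib import PurePosixPath


def boost_folder(priorities, accessed_path, boost=10):
    """Return new priorities in which every file that shares a folder with
    `accessed_path` is raised by `boost`; other files are unchanged."""
    folder = PurePosixPath(accessed_path).parent
    return {
        path: (priority + boost) if PurePosixPath(path).parent == folder else priority
        for path, priority in priorities.items()
    }
```

Here an access to one file in the “My Project” folder raises the priority of every pending file in that folder, while files in other folders keep their existing priority.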
Another example of implicit indications of user activity may come from the monitoring of activity at the system level on the endpoint 122 (743). This may be implemented, for example, by using the APIs 622 provided by the operating system on the endpoint 122 to expose system processes, such as hard disk activity, to the synchronization throttling manager 711. Operations and actions of the file system 616 may be similarly monitored. In this case, if there are many disk accesses or file system operations that are not related to an object in the user's mesh 208, such activity may be used by the synchronization throttling manager 711 to infer that the user 305 is performing a task that is computationally expensive.
For example, the user 305 may be doing some video editing or recalculating a large spreadsheet using one or more applications 606. The work item manager 707 could then lower the priority of work items 710 so that synchronization operations do not put additional pressure on system resources that may already be heavily consumed.
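One simplified way to express this inference is sketched below. The function name, the representation of observed operations as `(path, mesh_related)` pairs, and the threshold value are all hypothetical; how operations are actually sampled through the operating system APIs 622 is outside the scope of the sketch.

```python
def sync_priority_under_load(disk_ops, busy_threshold=1000):
    """Pick a synchronization priority from operations seen in one sampling window.

    disk_ops: iterable of (path, mesh_related) pairs, where mesh_related is
    True when the operation touches an object in the user's mesh.
    """
    # Count only activity that is unrelated to the data sharing service.
    unrelated = sum(1 for _path, mesh_related in disk_ops if not mesh_related)
    # Heavy unrelated activity suggests a computationally expensive task,
    # so synchronization yields resources to it.
    return "low" if unrelated >= busy_threshold else "normal"
```

Heavy activity on mesh objects does not lower the priority in this sketch, because only operations unrelated to the mesh count toward the threshold.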
With regard to explicit indications of user activity, the synchronization throttling manager 711 may also monitor Atom feeds (747). Atom feeds, as described above, support the underlying synchronization infrastructure for the resources on the user's mesh 208 in this implementation. The monitoring of the Atom feeds may provide, for example, information about other users' activities on other remote endpoints by an application firing an event across the mesh 208 to explicitly indicate that another user has entered the folder 302 (typically with permission of the owner) and is now browsing the files contained therein. The work item manager can use this explicit information to increase the priority of work items 710 associated with synchronization of the files in this case.
Turning now to
For the signed-in users, their activities will be monitored (830), and synchronization throttled (835) in response to such monitoring. Examples of the synchronization throttling include the delaying of computationally intensive hash calculations in some cases. If the user is actively editing a document in the folder 302, which results in a quick succession of events from the file system 616, the hash calculations may be delayed so that a set of the user's changes is hashed together in a batch process, for example. Another example is that the synchronization throttle may be opened when a user is actively browsing the folder 302. In this case, the priority of the Atom feed upload/download operation 525 may be increased as well as the priority for any associated peer-to-peer data fetching from remote endpoints 122 (i.e., upload/download data operation 520).
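The delayed, batched hashing described above may be sketched as follows. The class name `DeferredHasher`, the quiet-period parameter, and the choice of SHA-256 are assumptions made for illustration; the hash function and timing policy actually used are not specified here. Timestamps are passed in explicitly so that the sketch remains deterministic.

```python
import hashlib


class DeferredHasher:
    """Delays hashing of a file until it has been unchanged for a quiet period."""

    def __init__(self, quiet_seconds=2.0):
        self.quiet_seconds = quiet_seconds
        self._pending = {}  # path -> (latest content, time of last change)
        self.hash_runs = 0  # number of hash calculations actually performed

    def on_change(self, path, content, now):
        # A new file system event replaces the pending content and restarts
        # the quiet period, so rapid edits never trigger a hash on their own.
        self._pending[path] = (content, now)

    def flush_ready(self, now):
        """Hash, in one batch, every file whose quiet period has elapsed."""
        ready = {}
        for path, (content, changed_at) in list(self._pending.items()):
            if now - changed_at >= self.quiet_seconds:
                ready[path] = hashlib.sha256(content).hexdigest()
                self.hash_runs += 1
                del self._pending[path]
        return ready
```

With this approach, several edits made in quick succession result in a single hash calculation over the final content rather than one calculation per change.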
Throttling may also be performed during startup of an endpoint 122, particularly as there can be many pending changes in data that need to be synchronized from when the endpoint was offline. But as system resources are typically consumed at peak rates during startup, performing synchronization operations without throttling can often be expected to increase startup (i.e., boot) time which can undesirably lengthen the period of time before applications become available and the endpoint device becomes usable.
The solution in this case is to throttle all synchronization operations at startup including loading/saving data to databases 530, file/folder scanning 510, etc., in view of the user's historical activity that is persisted as statistics in the store 731. This enables early synchronization of data that, according to historical trends, is more likely to be accessed by the user upon log-on while also reducing the potential for conflicts that may be generated when the user attempts to modify data that has yet to be synchronized.
For example, historical activity might indicate that the user 305 has been working out of the My Project folder 302 consistently over the course of several days, and perhaps starts and ends a given work session by editing files in the folder 302. In this case, the folder 302 will be given the highest priority for synchronization so that the user's files will be current at whatever endpoint device the user employs for the next log-on. In cases where the activity history is not as clear cut, known statistical analyses may be applied to identify the most likely data candidates to be given high priority for synchronization.
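As one simple stand-in for the statistical analyses mentioned above, folders may be ranked by how often their files appear in the persisted history. The function name and the representation of the history as a list of accessed file paths are hypothetical; a frequency count is used here only because it is the simplest candidate, not because it is the analysis contemplated.

```python
from collections import Counter
from pathlib import PurePosixPath


def rank_folders_by_history(access_log):
    """Order folders from most to least frequently accessed.

    access_log: iterable of file paths the user accessed in past sessions.
    """
    counts = Counter(str(PurePosixPath(path).parent) for path in access_log)
    return [folder for folder, _count in counts.most_common()]
```

The first folder in the returned list would be synchronized with the highest priority at startup. A weighting that favors recent sessions could be substituted where the history is less clear cut.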
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.