1. Field of the Invention
The present invention is related to the area of protecting data in a predefined setting (e.g., an enterprise environment), and more particularly, related to processes, systems, architectures and software products for providing pervasive security and synchronization to digital assets at all times without the content of the digital assets leaving the predefined setting.
2. Description of Related Art
The Internet is the fastest growing telecommunications medium in history. This growth and the easy access it affords have significantly enhanced the opportunity to use advanced information technology for both the public and private sectors. It provides unprecedented opportunities for interaction and data sharing among businesses and individuals. However, the advantages provided by the Internet come with a significantly greater element of risk to the confidentiality and integrity of information. The Internet is a widely open, public and international network of interconnected computers and electronic devices. When a project is undertaken by a group of users in several locations, a file related to the project could be altered unknowingly beyond its original intent or may be even accessed by an unauthorized person or machine from the Internet when being shared among the users.
There are many efforts in progress aimed at protecting proprietary information traveling across the Internet and controlling access to the proprietary information. For example, Dropbox (www.dropbox.com) is a free service that lets a user bring all his/her data files (e.g., photos, documents, and videos) anywhere. Any file saved to a Dropbox folder is automatically saved to all computers associated with the user. In some cases, one can log into the Dropbox website to access his/her files. This means that one can start working on a computer at school or office, and finish on a home computer. There is no need to email oneself a file or to carry the file on a USB drive.
Some business entities, however, prefer not to have a file outside their control. Although Dropbox provides a solution to synchronize a file so that it can be accessed anywhere at any time, it requires a user to keep a copy of the file on a server controlled by a service provider, not by the business that has created the file. For some highly sensitive files, it would be difficult for a business to rely on services similar to Dropbox. There is a need for solutions that can synchronize files while not releasing the files beyond the control of a business.
This section is for the purpose of summarizing some aspects of the present invention and to briefly introduce some preferred embodiments. Simplifications or omissions may be made to avoid obscuring the purpose of the section. Such simplifications or omissions are not intended to limit the scope of the present invention.
The present invention is related to processes, systems, architectures and software products for providing pervasive synchronization services to digital assets at all times and is particularly suitable in an enterprise environment. In general, the pervasive synchronization services mean that digital assets are synchronized at all times regardless of where, when or by whom a change is made to a digital asset (e.g., a document or a music file). As a result, copies of the digital asset, regardless of where they are physically located, are kept identical as long as they are in a predefined or registered storage area (e.g., a folder or repository). In other words, when one has made a change to a file on one computer, copies of the file distributed on other computers are all updated with the change.
According to one aspect of the present invention, the pervasive synchronization services are performed by using metadata sets of the digital assets. A metadata set includes some or all of header information of a file holding a digital asset and supplemental information generated about the file, wherein the supplemental information includes at least location information to indicate where there are copies of the file. The location information facilitates a computing device to synchronize all the copies of the file with a change made by a user to one of the copies.
According to another aspect of the present invention, a computing device configured to manage the metadata sets and perform the synchronization services does not have to have a copy of the digital assets. In other words, the content of the digital assets stays where they are or within an organization (e.g., on a private network). By deploying a client module running on client devices on the private network, the registered storage area on a client device is watched for any change to a digital asset therein. Whenever there is a change to the digital asset, the client module communicates with the server to cause the corresponding metadata to be updated.
Various embodiments may be implemented as a method, a software product, a service and a part of a system. According to one embodiment, the present invention is a method for synchronizing digital assets via a server device, the method comprises: updating a metadata set of a digital asset when there is a change made to a status of the digital asset, and synchronizing the digital asset via the metadata set without releasing a copy of the digital asset to the server device. The change is detected by monitoring the status of the digital asset. The metadata set includes some or all of header information of the digital asset and supplemental information including at least a location of the digital asset.
According to another embodiment, the present invention is a system for synchronizing digital assets, the system comprises a server configured to update a metadata set of a digital asset when there is a change made to a status of the digital asset. The change is detected by monitoring the status of the digital asset. The metadata set includes some or all of header information of the digital asset and supplemental information including at least a location of the digital asset. The server is configured to maintain the metadata set but does not have access to the digital asset, and is further configured to communicate with a computing device to receive an update to the metadata therefrom. The server is coupled to a public network while the computing device is coupled to a private network. Depending on implementation, the computing device may be a storage device, a local server or a client device.
According to yet another embodiment, the present invention is a software product stored in a memory space and executed on a server, for synchronizing digital assets, the software product comprises: program code for updating a metadata set of a digital asset when there is a change made to a status of the digital asset, and program code for causing a first computing device to retrieve a latest version of the digital asset from a second computing device when a user attempts to access the digital asset on the first computing device. The change is detected by monitoring the status of the digital asset, and the metadata set includes some or all of header information of the digital asset and supplemental information including at least a location of the digital asset. Further, the server does not have access to the digital asset.
According to yet another embodiment, the present invention is a service provided by a service provider to synchronize digital assets owned by a plurality of entities. By offering an interface to a server operated by the service provider, the entities may define their own storage policy modules and access privileges for their users, register predefined storage areas and procedures when and how to effectuate the synchronization when a change is made to one of the digital assets.
One of the objects in the present invention is to provide a synchronization mechanism for all managed digital assets without releasing the content of the digital assets.
Other objects, features, and advantages of the present invention will become apparent upon examining the following detailed description of an embodiment thereof, taken in conjunction with the attached drawings.
These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
The present invention is related to processes, systems, architectures and software products for synchronizing files that may be accessed anywhere at any time without releasing the content of the files. In one perspective, the content of a file stays where it is specified (e.g., a server within an organization or enterprise) but the file itself is synchronized via a sync service provided by an internal or external server.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will become obvious to those skilled in the art that the present invention may be practiced without these specific details. The description and representation herein are the common means used by those experienced or skilled in the art to most effectively convey the substance of their work to others skilled in the art. In other instances, well-known methods, procedures, components, and circuitry have not been described in detail to avoid unnecessarily obscuring aspects of the present invention.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Further, the order of blocks in process flowcharts or diagrams representing one or more embodiments of the invention does not inherently indicate any particular order, nor does it imply any limitations in the invention.
Embodiments of the present invention are discussed herein with reference to
To facilitate the description of the present invention, it is deemed necessary to provide definitions for some terms that will be used throughout the description herein. It should be noted that the definitions in the following are to facilitate the understanding and description of the present invention according to one embodiment. Although the definitions may appear to include some limitations with respect to the embodiment, the actual meaning of the terms has applicability well beyond the embodiment, as can be appreciated by those skilled in the art.
Digital Asset—defines a type of electronic data that includes, but is not limited to, various types of documents, multimedia data, streaming data, dynamic or static data, executable code, images and texts.
File or document—interchangeably used herein, indicates one type of digital asset and can be accessed by one or more applications without a priori knowledge, wherein an access to a file or document is a request that results in the file or document being opened, viewed, edited, played, listened to, printed, or in a format or outcome desired by a user who has requested the access to the file or document.
Managed file or managed document—interchangeably used herein, defines a type of digital asset that is managed for synchronization across a network. Such a file or document can be updated from anywhere at any time under a defined or managed environment, while an authorized user can always access the latest version (or revisions) from anywhere at any time within the managed environment. A defined or managed environment refers to an environment in which all files are managed for synchronization. An example of the environment is a plurality of computers, each of which has at least one folder and executes a module monitoring what files are disposed in the folder, hence a managed folder. All files in the folder shall be automatically managed by the module for synchronization.
Access privilege—is one or more rights a user may have with respect to a managed file or folder. A user may only be able to access a managed file or folder from a designated location if his/her access privilege limits him/her to do so. Optionally, access privilege may specify other limitations on a specific managed file or folder, for example, a file transfer protocol, an access application (model and/or version), a permit to grant access privilege to others (e.g. a consultant) or membership in other groups, etc.
Client device, computer, or machine—interchangeably used herein, is a terminal device typically used by a user to access managed documents.
Server device, computer, or machine—interchangeably used herein, is a computing device. In one embodiment, such a computing device is configured to provide synchronization service for managed files that are created in or accessible from a client machine.
Client module—generally means a version of one embodiment of the present invention and typically is loaded in a client device to deliver functions, features, benefits and advantages contemplated in the present invention. In the context of the present invention, the client module is configured to monitor or track the status of a managed file and any revisions that may have happened to the file. To facilitate the description of the present invention, the client module is also referred to herein as an agent or a monitoring module.
Server module—generally means a version of one embodiment of the present invention and typically is loaded in a server device to deliver functions, features, benefits and advantages contemplated in the present invention. In the context of the present invention, the server module is configured to communicate with the client module to consolidate or synchronize the statuses of a managed file and any revisions that may have happened to the file on one or more client devices. To facilitate the description of the present invention, the server module is also referred to herein as a sync module. A server device is configured to execute the sync module to provide file synchronization management, hence a sync manager.
Server and Client—unless otherwise specifically or explicitly stated, a server may mean either a server machine or a server module, a client may mean either a client machine or a client module, and in either case, the particular meaning shall be evident in the context.
According to one embodiment, the client computer 100 is loaded with a client module that is a linked and compiled, or interpreted, version of one embodiment of the present invention and is capable of communicating with a server 104 or 106 over a data network (e.g., the Internet or a local area network). According to another embodiment, the client computer 100 is coupled to the server 104 through a private link. As will be further explained below, a document created by an authoring tool is managed by the client module. The client module, when executed, is configured to monitor the status of a managed document as long as it is disposed in a predefined storage area (e.g., a pre-allocated data repository), or simply a managed folder.
By virtue of the present invention, documents stored in a managed folder can be accessed by users as usual, but any changes or updates to any of the managed documents are synchronized. In other words, a file being edited at one place can be continued at another place without having to carry a version of it. In one embodiment, access privilege or access privileges may also be placed on a file or a folder. The access privilege for a user may include, but not be limited to, a viewing permit, a copying permit, a printing permit, an editing permit, a transporting permit, an uploading/downloading permit, and a location permit.
In one embodiment, a document may be created on the client computer 100 and shared as a copy on a client computer 102. The copy is stored in a managed folder in the client computer 102, resulting in the copy of the document being managed. The copy is further uploaded from the client computer 102 via the network 110 to a computing or storage device 104 that may serve as a central repository managed by an IT department at an enterprise. Upon arriving in the repository, the document becomes managed, namely its status, revisions and whereabouts and other information about the document (but not the content thereof) are tracked or managed at a central location (e.g., the sync manager running on a server 106). Should the document be copied to another computer running a similar client module, the location of the computer and any changes or revisions made to the document are also tracked to allow authorized users to know what has happened to the document and where copies of the document have been distributed.
Although not necessary, the network 110 is preferably a private or secured link. Such a link may be provided by an internal network within an enterprise, or by a secured communication protocol (e.g., VPN or HTTPS) used over the Internet. Alternatively, such a link may be simply provided by a TCP/IP link. As such, managed files on any of the computing devices on the network 110 may be remotely accessed and synchronized via the sync manager.
According to one embodiment, the storage device 104 is coupled to or part of a server executing a client module or a local version of a server module of one embodiment of the present invention. One of the key features, advantages and objectives in the present invention is to synchronize all managed files via a server without having to release the files to the server. The storage device 104 is provided to keep a latest version of a file, although there are copies of the file distributed on several computers on the private network 110. It can be appreciated that not every copy of the file is updated immediately, depending upon whether a particular computer is up and running at the time a change is made to the file. The storage device 104 is configured to be up and running and to communicate with the server 106 to ensure all managed files are synchronized.
Coupled to a public network 108, the server 106 executes a server module (i.e., a sync manager) to provide file synchronization management for one or more organizations or business entities. As will be further explained below, the server module in the server 106 maintains or interfaces to a database that includes, but is not limited to, a list of organizations, corresponding access privileges for an entire organization, rules for folders or files. One of the key features, advantages and objectives in the present invention is that the sync manager is configured to synchronize the managed files using their respective metadata thereof. In any event, managed files are kept behind a firewall or always under the control of an entity (e.g., an enterprise). In other words, the content of the managed files will not be released to anyone other than the authorized users within an enterprise.
Main memory 132, such as random access memory (RAM), is also interfaced to data bus 120 to provide CPU 122 with instructions and access to memory storage 136 for data and other instructions. In particular, when executing stored application program instructions, such as a synchronizing module in the present invention, CPU 122 is caused to manipulate the data to achieve results contemplated by the present invention. Read-Only Memory (ROM) 134 is provided for storing executable instructions, such as a basic input/output operation system (BIOS) for operation of input interface 140, display 126 and pointing device 142, if there are any.
Referring now to
Once the file is disposed into the store, the header information of the file is extracted. Many file types can be identified by using what is known as a file header. A header is a unit of information that precedes a data object (content), so that an operating system and other software know what to do with the following contents. It is a region of each file where bookkeeping information is kept and may include, depending on a file format, a file name, the date the file was created, the date it was last updated, the file size, geographic information or other relevant information. At 206, a data set, referred to as metadata, of the file is produced or updated, where the metadata includes some or all of the header information and supplemental information prepared by the module. Depending on the implementation, the supplemental information may include where the file was initially located, who created it or has updated it, how many copies it has had and where the distributed copies are located.
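By way of illustration only, the production of a metadata set from header-style information and supplemental information may be sketched as follows. The sketch is not the client module itself: the class `MetadataSet`, the function `build_metadata`, the field names and the `folder_id` parameter are hypothetical, and file-system attributes obtained with `os.stat` stand in for a format-specific file header. The content of the file is never read.

```python
import os
import socket
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetadataSet:
    """Header-style information plus supplemental information for one file."""
    # Bookkeeping information (a stand-in for a format-specific header).
    file_name: str
    size_bytes: int
    modified_at: str
    # Supplemental information prepared by the hypothetical client module.
    origin_location: str
    updated_by: str
    copy_locations: list = field(default_factory=list)


def build_metadata(path: str, user: str, folder_id: str) -> MetadataSet:
    """Produce a metadata set for a managed file without reading its content."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat()
    # Assumed location format: "<host name>:<managed folder identifier>".
    location = f"{socket.gethostname()}:{folder_id}"
    return MetadataSet(
        file_name=os.path.basename(path),
        size_bytes=info.st_size,
        modified_at=modified,
        origin_location=location,
        updated_by=user,
        copy_locations=[location],
    )
```

In such a sketch only the `MetadataSet` would be sent to the server, which is consistent with the content of the file staying on the device where it resides.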
Essentially, as shown in
Alternatively, a file in a managed folder appears substantially similar but not identical to a regular (unmanaged) file. For example, icons or file names of managed documents may appear in a different color or with a visual indication to distinguish them from non-managed files. According to one embodiment, such an indication is controlled by the client module. When a managed file or a copy thereof ends up in a machine or readable medium outside the managed environment (e.g., an unmanaged folder, CD or USB drive), the synchronization of the managed file or the copy would be lost.
According to one embodiment, sets of files/folders are logically kept in collections and referred to as datasets. These datasets are in essence the container (e.g., managed folder) of a particular set of files and folders.
Exemplary Information for a Dataset (or a Folder) in a Set of Metadata:
It shall be noted that a dataset may have versions. Managed files and folders may have versions as well. A revision history of a particular file or folder is maintained via the metadata. Depending on the storage policy in place for the dataset, all revisions of a file, only a fixed number of the latest versions (e.g. last 3 versions) or just the latest version can be kept. In one embodiment, a record of the entire history of a file is kept so that it is possible to tell who has modified the file, when and where the file has been modified.
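As a minimal sketch of how such a storage policy might select which revisions to keep, the following assumes a hypothetical `Revision` record and a hypothetical `keep_last` setting, where `None` keeps all revisions, `3` keeps the last three and `1` keeps only the latest. None of these names come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Revision:
    """One entry of a revision history (fields are illustrative assumptions)."""
    number: int
    modified_by: str
    modified_at: str
    location: str


def apply_retention(revisions: List[Revision],
                    keep_last: Optional[int]) -> List[Revision]:
    """Return the revisions retained under the policy, oldest first."""
    ordered = sorted(revisions, key=lambda r: r.number)
    if keep_last is None:
        return ordered  # policy: keep all revisions
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    return ordered[-keep_last:]  # policy: keep only the newest N revisions
```

Where the entire history of a file is to be recorded, the history entries could be kept in the metadata even after the older file versions themselves are pruned; the sketch leaves that choice to the caller.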
Exemplary Information that is Maintained for Revisions:
For image files, according to one embodiment, preview versions and thumbnail versions of the files may be maintained. These files are generated by services on the server or a client device in the cloud associated with the metadata storage.
According to one embodiment, file sharing is tracked so that the synchronization management knows exactly how a managed file has been shared. Links to locations of the managed file or copies thereof are tracked in the metadata. Such a link allows us to specify for a user a component which is really a symbolic link to another component that could actually be owned by another user. These links between components allow the sync manager to share items (files/folders) between users.
In one embodiment, there are 2 special datasets for a user:
i) Shared By Me Dataset—holds the components that the user has shared with other users; and
ii) Shared To Me Dataset—holds the components that other users have shared with this user.
For example, when a photo is shared with another user
i) A new entry in the Shared By Me Dataset will be created with:
At 302, one or more managed storage locations are allocated to keep files therein synchronized. According to one embodiment, at least one folder is allocated on one or more of the computing devices on a private network so that files in the managed folder may be cooperatively worked and synchronized. Depending on implementation, certain users may be authorized to access some of the managed folders while a group of users may be authorized to access all managed folders. The access controls of the users are managed by respective granted privileges.
Referring now to
It can be noted that the computer 334 is allocated an extra managed folder named “Price” in view of the computer 332. In one embodiment, storage allocation is done based on the granted access privileges for each of the users in an enterprise. User B uses the folder “Price” to synchronize files therein with other users authorized to access the folder “Price”. In one embodiment, each of the managed folders is assigned an identifier (e.g., using an alphanumeric code or characters) that is included in the metadata of a file therein so that the server knows where the file is located.
Referring now back to
The process 300 goes to 314, waiting for the file to be accessed if there is no change to the file at 306. It is assumed now that there is a change to a managed file at 306. The process 300 now goes to 308 to record the change. As described above, the managed file is processed by the client module to produce the metadata thereof by extracting some or all of the header information along with supplemental information generated about the file. Depending on implementation, the metadata may be sent to the server in its entirety or only a difference between the updated metadata and a previous version thereof is sent to the sync manager running on the server at 310. It shall be noted that the server on a private network or the Internet receives the metadata, not the managed file itself. In one embodiment, a latest version of the file is kept on a storage device on the private network in an enterprise.
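The option of sending only a difference between the updated metadata and a previous version thereof may be sketched as follows, assuming for illustration that a metadata set is represented as a flat dictionary. The function names `metadata_diff` and `apply_diff` and the `changed`/`removed` layout are hypothetical.

```python
from typing import Any, Dict


def metadata_diff(previous: Dict[str, Any],
                  updated: Dict[str, Any]) -> Dict[str, Any]:
    """Compute the fields that were added, changed or removed."""
    changed = {key: value for key, value in updated.items()
               if key not in previous or previous[key] != value}
    removed = sorted(key for key in previous if key not in updated)
    return {"changed": changed, "removed": removed}


def apply_diff(previous: Dict[str, Any],
               diff: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild the updated metadata from the previous version and a diff."""
    result = dict(previous)
    result.update(diff["changed"])
    for key in diff["removed"]:
        result.pop(key, None)
    return result
```

In this sketch the client would call `metadata_diff` before transmission and the sync manager would call `apply_diff` on receipt; in either case only metadata, never the managed file, is exchanged.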
At 312, the server is configured to determine the location of the managed file. There are cases in which several users are cooperatively working on the managed file. Any changes to the managed file shall be reflected on the copies thereof distributed on other computers. It is assumed herein that the server has already been invoked to synchronize the file, hence a metadata set is already there and constantly updated. Given the updated metadata, the server knows where the file or copies thereof reside. The updated metadata or changes thereto are distributed to those locations (i.e., computing devices each running a client module) to incorporate the changes made by the user or users to the file or the copies.
The functions are performed based on the assumption that the network is up and running. In the event that the connection to one of the computing devices is not available, the server is configured to record the event, accumulate the changes and any subsequent changes to the managed file, and deliver the accumulated changes in metadata to the computing device when the connection is available again, so that the changes are effectuated all together on the managed file.
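A minimal sketch of this accumulate-and-deliver behavior is given below. It assumes a hypothetical `PendingUpdates` class on the server, a caller-supplied `send` function and an `online` flag supplied by whatever connectivity check an implementation uses; none of these are specified in the description.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Change = Dict[str, Any]


class PendingUpdates:
    """Accumulates metadata changes for devices that are unreachable."""

    def __init__(self) -> None:
        self._pending: Dict[str, List[Change]] = defaultdict(list)

    def publish(self, device_id: str, change: Change, online: bool,
                send: Callable[[str, List[Change]], None]) -> int:
        """Queue a change and, if the device is reachable, deliver the
        whole accumulated batch in order. Returns the number delivered."""
        self._pending[device_id].append(change)
        if not online:
            return 0  # keep accumulating until the device is reachable
        batch = self._pending.pop(device_id)
        send(device_id, batch)
        return len(batch)
```

Delivering the batch in the order the changes were recorded lets the client module effectuate them all together when the connection returns.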
It is often the case in which a copy of the file in a managed folder allocated on another computer is not being worked on. During the period that the copy is not being worked on, there are changes made to the file on other computers. When the copy of the file is accessed at 314, a corresponding application is invoked to access the file. For example, xxx.doc would invoke Microsoft WORD application to open the file. At 318, changes made to the file on other computers are incorporated by the corresponding application before the file becomes available to the user at 320. Alternatively, a latest copy of the file is transferred from a storage device (e.g., the storage device 104 of
Referring now to
It is assumed that the user of computer B did not do anything with the document 342 until a time T. Before the time T, the user of computer A has updated the document a few times to reach a version 346. Accordingly, the corresponding metadata has been updated several times. At the time T, the user of computer B decides to access the copy of the document in computer B. Before the document can be opened for use, the corresponding metadata (e.g., received from the server at that time or already received in computer B) is checked and indicates that the document has been updated several times. Instead of opening the version 342, the latest version 346 is opened. Depending on implementation, the latest version 346 may be obtained from a central repository (e.g., the storage device 104 of
The metadata has been updated from computer B. In one embodiment, the central repository has the updated version that may be obtained by computer A when computer A is caused to access the document. After the moment T, both computers A and B now have the identical versions 348 and 350, namely the documents 348 and 350 are synchronized without the server receiving any of the document 340 or its revisions 344, 346 or 348. As the user of computer B continues to work on the document 348 to update it to version 352, updated metadata is sent to the server. Under the supervision of the server, computer A is caused to update the document 350 to the updated version 354. At the moment, both documents 352 and 354 on two different computers are synchronized.
The above description indicates that a central repository or storage server is used to keep the latest version of a document for a client device to retrieve it when the client device is caused to access a local version of the document. It can be appreciated by those skilled in the art that any of the client devices may be designated as a repository to keep the latest version of a document.
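The check performed when a client device is caused to access a local version of a document may be sketched as follows. The names `LocalCopy`, `open_managed` and `fetch_from_repository` are hypothetical, and an integer version number is assumed as the means of comparing the local copy with the metadata; the description does not prescribe one.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LocalCopy:
    """A managed file as held on one client device (illustrative fields)."""
    version: int
    content: bytes


def open_managed(copy: LocalCopy, latest_version: int,
                 fetch_from_repository: Callable[[int], bytes]) -> bytes:
    """Return up-to-date content, refreshing the local copy if it is stale.

    latest_version comes from the metadata; the content itself comes from
    a repository on the private network, never from the sync server.
    """
    if latest_version > copy.version:
        copy.content = fetch_from_repository(latest_version)
        copy.version = latest_version
    return copy.content
```

Because `fetch_from_repository` is supplied by the caller, the same sketch covers a central repository and a client device designated as the repository.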
Referring now to
As the name suggests, the administration interface 406 facilitates a service provider to register users and grant respective access privileges to the users and is an entry point to the server module. In one embodiment, the administration interface 406 is provided for a system administrator of an enterprise to set up hierarchical access levels for various managed folders, storage locations, users or groups of users. For example, one user may be an executive or a branch supervisor who has the access privileges to any managed folders or storage locations. Others have limited access privileges and can access only certain managed digital assets. The privileges may include, but not be limited to: open, edit, write, print, copy, download and others. Examples of the other privileges may include: altering access privileges for other users, accessing managed files from one or more locations, and setting up a set of access rules for a managed folder for a group of users. According to one embodiment, the administration interface 406 is a graphical user interface showing options for various tasks that an authenticated system administrator or operator may need to perform in order to synchronize managed digital assets within an enterprise.
Essentially, the account manager is a database or an interface to a database 407 (e.g., an Oracle database) maintaining all the registered organizations subscribing to the synchronization services provided by a service provider operating the server 400, and their respective access privileges, and perhaps corresponding links to policy modules. In one embodiment, the account manager 408 authenticates a user when the user tries to access a managed digital asset and also determines if the user may access managed digital assets from the location the user is currently at. In general, the account manager 408 is provided for an enterprise to control its users.
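One possible representation of such privileges, offered only as a sketch, is a set of combinable flags. The class `Privilege`, the function `is_permitted` and the two example roles are hypothetical; only the privilege names open, edit, write, print, copy and download are taken from the description.

```python
from enum import Flag, auto


class Privilege(Flag):
    """Privileges named in the description; the encoding is an assumption."""
    OPEN = auto()
    EDIT = auto()
    WRITE = auto()
    PRINT = auto()
    COPY = auto()
    DOWNLOAD = auto()


def is_permitted(granted: Privilege, requested: Privilege) -> bool:
    """Return True when every requested privilege is among those granted."""
    return (granted & requested) == requested


# Hypothetical access levels: a supervisor holds every privilege,
# while a contractor may only open and print.
SUPERVISOR = (Privilege.OPEN | Privilege.EDIT | Privilege.WRITE
              | Privilege.PRINT | Privilege.COPY | Privilege.DOWNLOAD)
CONTRACTOR = Privilege.OPEN | Privilege.PRINT
```

Flags combine with `|`, so an administrator-defined access level is simply the union of the privileges granted to it.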
This is a predefined storage area to maintain metadata sets for all managed digital assets. Depending on implementation, metadata sets may be grouped and maintained for all managed digital assets in one folder and updated all together whenever there is a change to anything in the folder or the metadata sets may be individually managed and updated only when there is a change to a corresponding digital asset (e.g., a WORD file). In one embodiment, the metadata set is maintained in a markup language (e.g., XML). In another embodiment, the metadata is kept in a database.
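For the embodiment in which the metadata set is maintained in a markup language, a minimal sketch using Python's standard `xml.etree.ElementTree` module is shown below. The element names (`metadata`, `fileName`, `version`, `locations`, `location`) and both function names are illustrative assumptions, not a schema defined by the description.

```python
import xml.etree.ElementTree as ET
from typing import List


def metadata_to_xml(file_name: str, version: int,
                    locations: List[str]) -> str:
    """Serialize a small metadata set to an XML string."""
    root = ET.Element("metadata")
    ET.SubElement(root, "fileName").text = file_name
    ET.SubElement(root, "version").text = str(version)
    locations_element = ET.SubElement(root, "locations")
    for location in locations:
        ET.SubElement(locations_element, "location").text = location
    return ET.tostring(root, encoding="unicode")


def locations_from_xml(xml_text: str) -> List[str]:
    """Read back where copies of the file reside."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall("./locations/location")]
```

The location entries are what would allow a server holding only this document to determine which computing devices need to be notified of a change.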
The user monitor 412 is configured to monitor an access request from a user and his/her access location. In one embodiment, a user is permitted to access the managed digital assets from one or more designated locations or networked computers. Should the user access a managed file from an unknown location (e.g., an unrecognized IP address or an unauthorized computer), the user monitor 412 may be configured to deny the access.
This manager is provided to keep a storage policy module for an organization. For example, when a user requests access to a file while several devices have such a file, the storage policy of one organization may specify that the device closest to the requesting computer in terms of geographic distance is caused to service the access request, which is particularly efficient when an organization operates in several countries. Similarly, the storage policy of one organization may also specify that the device having the most available resources in terms of bandwidth and computing power at the moment shall service the access request, which is particularly efficient when an organization has a large number of employees at a location.
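A location check of this kind may be sketched with Python's standard `ipaddress` module. The function name `is_access_allowed` and the idea of expressing designated locations as a list of network ranges are assumptions made for illustration.

```python
import ipaddress
from typing import Iterable


def is_access_allowed(client_ip: str, allowed_networks: Iterable[str]) -> bool:
    """Return True only if the request comes from a designated network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # an unparseable address is treated as unknown
    return any(address in ipaddress.ip_network(network)
               for network in allowed_networks)
```

For example, with `allowed_networks = ["10.0.0.0/8"]`, a request from `10.1.2.3` would be accepted and a request from `203.0.113.5` would be denied.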
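The two example policies may be sketched as a selection among candidate devices holding the file. The `Candidate` fields, the policy names `"nearest"` and `"most_resources"`, and the use of available bandwidth alone as the measure of resources are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A device holding a copy of the requested file (illustrative fields)."""
    device_id: str
    distance_km: float
    available_bandwidth_mbps: float


def choose_source(candidates: List[Candidate], policy: str) -> Candidate:
    """Pick the device that services an access request under a policy."""
    if not candidates:
        raise ValueError("no device holds the requested file")
    if policy == "nearest":
        return min(candidates, key=lambda c: c.distance_km)
    if policy == "most_resources":
        return max(candidates, key=lambda c: c.available_bandwidth_mbps)
    raise ValueError(f"unknown policy: {policy}")
```

An organization could register either policy name, or a further policy of its own, with the manager for its datasets.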
This module is designed to be responsible for distributing an appropriate local module for a local server servicing a predetermined location or a predetermined group of users. According to one embodiment, the local server manager 414 replicates or customizes some of the server module 402 and distributes a localized version to an enterprise, where the localized version is executed on a server on a private network. For example, it can be loaded on the server 104 of
The access report manager 418 is configured to record or track all access activities (successful or attempted) and primarily works with respective client modules running on client machines. The access report manager 418 is preferably activated by a system administrator, and the content gathered in the access report manager 418 shall be accessed only by the system administrator or by others with proper authority.
It should be pointed out that the server module 402 in
As described above in one embodiment, a localized version 402 of the server module may be obtained and loaded on one of the computers (e.g., the server/storage device 104 of
The present invention may be implemented as a method, a system, a computer readable medium, a computer product and other forms that achieve what is desired herein. Those skilled in the art will understand that the description could be equally applied to or used in other various different settings with respect to various combinations, embodiments or settings provided in the description herein.
The processes, sequences or steps and features discussed above are related to each other and each is believed independently novel in the art. The disclosed processes, sequences or steps and features may be performed alone or in any combination to provide a novel and unobvious system or a portion of a system. It should be understood that the processes, sequences or steps and features in combination yield an equally independently novel combination as well, even if combined in their broadest sense, i.e., with less than the specific manner in which each of the processes, sequences or steps and features has been reduced to practice.
The foregoing description of embodiments is illustrative of various aspects/embodiments of the present invention. Various modifications to the preferred embodiments can be made by those skilled in the art without departing from the true spirit and scope of the invention as defined by the appended claims. Accordingly, the scope of the present invention is defined by the appended claims rather than the foregoing description of embodiments.