The present disclosure relates generally to software systems, and in particular, to systems and methods for parallel transport of data between computer systems.
Computer systems require data to produce useful and meaningful results. Data preparation and analytics can involve complex, time consuming preparation of relations, visualizations, and compilations of data. When data is prepared in such a way, it may be beneficial to share the data across multiple computer systems. However, moving complex data structures across different systems can be a challenge. One particular challenge is to ensure that the latest version of data is transported. In some cases, data being transported may be an older version. Accordingly, new data can be overwritten with old data and/or old data may be retrieved from a system when newer data was available.
The present disclosure addresses these and other challenges and is directed to techniques for moving data between computer systems.
Described herein are techniques for moving data between computer systems. In the following description, for purposes of explanation, numerous examples and specific details are set forth in order to provide a thorough understanding of some embodiments. Various embodiments as defined by the claims may include some or all of the features in these examples alone or in combination with other features described below and may further include modifications and equivalents of the features and concepts described herein.
In various embodiments, a user may use the techniques described herein to move data from content management system 102 to server 103 (e.g., data import to server) or to move data from server 103 to content management system 102 (e.g., data export from server). Content management system 102 may similarly import data from a server or export data to a server, for example. In some embodiments, users of multiple different application servers 103 may store data generated locally on server 103 in a content management system 102 to share data with other users on other application servers (not shown). Accordingly, it may be desirable to import and export data between content management system 102 and server 103. In some embodiments, the data may comprise data packages. Data packages passed between computer systems may include a plurality of data objects, metadata describing the plurality of data objects and dependencies between the data objects, and information summarizing the data package. Example objects in an object data store that may be shared include objects corresponding to complex analytics content (e.g., models, stories, visualizations, dimensions, connections, etc. of data) that may be used to discover unseen data patterns and boost business productivity, for example.
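The three parts of a data package named above (data objects, metadata describing the objects and their dependencies, and a summary) can be pictured with a minimal sketch. The class names, field names, and the per-type count used as a summary are illustrative assumptions, not structures defined by the present disclosure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    """One shareable object (hypothetical shape)."""
    object_id: str    # e.g. "Story1"
    object_type: str  # e.g. "story", "model", "dimension"


@dataclass
class DataPackage:
    """A package: objects, dependency metadata, and a summary (hypothetical shape)."""
    name: str
    objects: list[DataObject]
    # Maps an object id to the ids of the objects it depends on.
    dependencies: dict[str, list[str]] = field(default_factory=dict)

    def summary(self) -> dict[str, int]:
        """Count objects per type, standing in for the package overview."""
        counts: dict[str, int] = {}
        for obj in self.objects:
            counts[obj.object_type] = counts.get(obj.object_type, 0) + 1
        return counts


package_a = DataPackage(
    name="PackageA",
    objects=[
        DataObject("Story1", "story"),
        DataObject("Model1", "model"),
        DataObject("Model2", "model"),
    ],
    dependencies={"Story1": ["Model1", "Model2"]},
)
```

Keeping the dependency map separate from the objects themselves lets a receiving system inspect what a package contains before any object is written.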
To facilitate movement of data between systems, content management system 102 includes an automated move process 110. In various embodiments, data movement between the systems may include initiating automated process 110 to move data between a first computer system and a second computer system. Advantageously, data may be moved in parallel. For example, automated process 110 may instantiate a plurality of jobs 111a-n to move data units between the first computer system and the second computer system. In some embodiments, the data units are data packages comprising a plurality of data objects, metadata describing the plurality of data objects and dependencies between the data objects, and information summarizing the data package. Each job 111a-n may assemble and move one data package, for example. However, because the data is shared, in various scenarios it may be desirable to ensure that the data being moved is the latest version and/or that other users are not modifying the data while it is being moved, for example. Thus, prior to execution of each job 111a-n, the system may determine if an overlap exists between data units of the jobs. For instance, if one job is moving data off the content management system while another job is moving a modified version of the same data off the content management system, the overlap may be detected. A job may be executed when no overlap exists between data units moved by any of the currently executing jobs 111a-n and a particular job to be executed, for example. Further examples and advantages of the present techniques are described below.
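One way to picture the check performed before each job executes is a gate that admits a job only when its data units are disjoint from those of every currently executing job. The following is a minimal sketch under stated assumptions: the names `OverlapGate` and `run_jobs` are hypothetical, data units are modeled as sets of string identifiers, and a job that overlaps is assumed to wait until the conflicting job finishes (the disclosure states only that such a job is not executed while the overlap exists).

```python
from __future__ import annotations

import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class OverlapGate:
    """Admits a job only if its data units overlap no executing job's units."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._active: list[frozenset[str]] = []  # units of executing jobs

    def acquire(self, units: frozenset[str]) -> None:
        with self._cond:
            # Assumption: an overlapping job waits rather than being rejected.
            while any(units & other for other in self._active):
                self._cond.wait()
            self._active.append(units)

    def release(self, units: frozenset[str]) -> None:
        with self._cond:
            self._active.remove(units)
            self._cond.notify_all()  # let waiting jobs re-check for overlap


def run_jobs(
    jobs: dict[str, frozenset[str]],
    move: Callable[[str], None],
    max_workers: int = 4,
) -> None:
    """Run jobs in parallel; jobs with overlapping units never run together."""
    gate = OverlapGate()

    def worker(name: str, units: frozenset[str]) -> None:
        gate.acquire(units)
        try:
            move(name)  # stand-in for assembling and moving one package
        finally:
            gate.release(units)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(worker, n, u) for n, u in jobs.items()]
        for future in futures:
            future.result()  # surface any exception raised by a job
```

In this sketch, jobs whose units are disjoint proceed concurrently, while a job that shares any unit with an executing job is held back until that job releases its units.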
The present disclosure may be used in the context of an analytics content network in a cloud computer system that combines business intelligence (BI) and planning and predictive capabilities, for example. In any business intelligence application, the analytics content (model, story, visualizations, etc.) plays the central role in discovering unseen patterns to boost business productivity. Hence, sharing analytics content across users is very helpful for better collaboration. Also, a standard content template can be reused by all users by plugging in their corresponding data. This infrastructure for sharing the analytics content is sometimes referred to as an “Analytical Content Network” (ACN). The content entity that contains content to be shared is called a “package.”
ACN may be arranged across landscapes as a central component, with all of the landscapes connected to it. An application landscape is a coherent set of interconnected applications, often within an enterprise, business, or organization, which are often associated with different geographical regions, for example. Logically, ACN is “one global content network” which can provision or share any content with servers and users across landscapes. ACN may support the following end-user workflows. A content creator creates content in the form of stories, models, dimensions, connections, Value-Driver Trees (VDT), etc. If authorized, the user can then export this content from a tenant (a portion of system resources securely assigned to a particular group) to ACN by creating a “content package,” which can contain any number of these content items, and share this with multiple other tenants, for example. Another SAC Content user can view all available content packages in their listing and import those packages relevant for their analytic workflows. This includes Public Content (Templates or Demo content) and Private Content (shared privately with them). To achieve sharing across tenants, the content is bundled in what is referred to as a “package.” A package may contain the details of each object present in the package, the dependency information between those objects, and an overview which summarizes the content details, for example. Data objects are stored in datastore 312, and data describing the data objects and packages may be stored in database 313, for example. An example package is shown in
Embodiments of the present disclosure may include three steps for transportation of data content and objects across tenants as shown in
One issue with importing packages from a CMS to a server is that packages may contain common objects. If Tenant1 has packages:
Therefore, if more than one user triggers import of packages containing the same set of objects at the same time, data may be overwritten. For example, the following scenario may occur. UserC creates Story1, adds a table view, and creates (exports) PackageA containing Story1 and its dependencies. UserC imports PackageA (e.g., at 4 PM, taking an hour to complete) with an import order of: Dimension1, Connection1, Model1, Model2, Story1. UserD modifies Story1, adds a pie chart and a line chart (in addition to the table view added above), and creates (exports) PackageB containing Story1 and its dependencies. UserD imports PackageB at 4:30 PM with an import order of: Currency3, Model1, Story1. In this scenario, if the PackageA and PackageB imports execute at non-overlapping times, there is no conflict. However, if the PackageA import is ongoing when the PackageB import is initiated, the Story1 and Model1 objects may be overwritten. PackageB imports Story1 (table view, pie chart, line chart). PackageA then imports Story1 (table view) and may overwrite the changes made by UserD, for example. Note that, generally, the exported object structure is maintained so it can be recreated in another system. If elements are subsequently added to or removed from a story, the package in the CMS will not be affected.
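The scenario above can be traced in a few lines. The sketch below uses the import orders given in the text; the particular interleaving of writes (PackageB's Story1 landing before PackageA's) is an assumed ordering chosen to illustrate the overwrite, and the dictionary standing in for the target system is hypothetical.

```python
# Import orders as given in the scenario.
package_a_order = ["Dimension1", "Connection1", "Model1", "Model2", "Story1"]
package_b_order = ["Currency3", "Model1", "Story1"]

# Objects present in both packages are the ones at risk.
shared = sorted(set(package_a_order) & set(package_b_order))

# Content of Story1 as captured in each package.
story_versions = {
    "PackageA": ["table view"],
    "PackageB": ["table view", "pie chart", "line chart"],
}

# Assumed interleaving: PackageB's shorter import writes Story1 first,
# then PackageA's longer import reaches Story1 and writes last.
write_sequence = ["PackageB", "PackageA"]

target_system = {}  # hypothetical stand-in for the importing server's store
for package_name in write_sequence:
    target_system["Story1"] = story_versions[package_name]

print(shared)                   # ['Model1', 'Story1']
print(target_system["Story1"])  # ['table view']
```

The last writer wins, so the pie chart and line chart added by UserD are lost even though PackageB was the newer export.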
Features and advantages of the present disclosure allow packages and/or content to be transported in parallel after analyzing the jobs that are already executing. This provides faster movement of data between systems without overlap or conflict, avoiding the data loss described above.
Using the techniques described in the above example, an ACN supported package transport may be performed in parallel to reduce the time taken to export or import multiple packages, improve usage of resources (e.g., DB connection pool, number of instances and threads), and reduce time taken during tenant migration, which may involve bulk package transport.
In some systems, computer system 510 may be coupled via bus 505 to a display 512 for displaying information to a computer user. An input device 511 such as a keyboard, touchscreen, and/or mouse is coupled to bus 505 for communicating information and command selections from the user to processor 501. The combination of these components allows the user to communicate with the system. In some systems, bus 505 represents multiple specialized buses for coupling various components of the computer together, for example.
Computer system 510 also includes a network interface 504 coupled with bus 505. Network interface 504 may provide two-way data communication between computer system 510 and a local network 520. Network 520 may represent one or multiple networking technologies, such as Ethernet, local wireless networks (e.g., WiFi), or cellular networks, for example. The network interface 504 may be a wireless or wired connection, for example. Computer system 510 can send and receive information through the network interface 504 across a wired or wireless local area network, an Intranet, or a cellular network to the Internet 530, for example. In some embodiments, a frontend (e.g., a browser), for example, may access data and features on backend software systems that may reside on multiple different hardware servers on-prem 531 or across the network 530 (e.g., an Extranet or the Internet) on servers 532-534. One or more of servers 532-534 may also reside in a cloud computing environment, for example.
Each of the following non-limiting features in the following examples may stand on its own or may be combined in various permutations or combinations with one or more of the other features in the examples below. In various embodiments, the present disclosure may be implemented as a system, method, or computer readable medium.
Embodiments of the present disclosure may include systems, methods, or computer readable media. In one embodiment, the present disclosure includes a computer system comprising: at least one processor and at least one non-transitory computer readable medium (e.g., memory) storing computer executable instructions that, when executed by the at least one processor, cause the computer system to perform a method as described herein and in the following examples. In another embodiment, the present disclosure includes a non-transitory computer-readable medium storing computer-executable instructions that, when executed by at least one processor, perform a method as described herein and in the following examples.
In some embodiments, the method of moving data comprises: initiating an automated process to move data between a first computer system and a second computer system; instantiating a plurality of jobs to move data units between the first computer system and the second computer system; prior to execution of each job, determining if an overlap exists between data units of the jobs; and executing each particular job when no overlap exists between data units moved by a currently executing job and each particular job.
In some embodiments, the first computer system is an application server and the second computer system is a content management system.
In some embodiments, the content management system instantiates the plurality of jobs, and the data units are data packages comprising a plurality of data objects, metadata describing the plurality of data objects and dependencies between the data objects, and information summarizing the data package.
In some embodiments, the automated process to move data is a data import operation.
In some embodiments, determining if an overlap exists between data units of the jobs comprises determining if a currently executing job and a new job share a same set of objects, wherein when the currently executing job and the new job share the same set of objects, the new job is not executed.
In some embodiments, the automated process to move data is a data export operation.
In some embodiments, determining if an overlap exists between data units of the jobs comprises determining if a currently executing job and a new job operate on a same data package, wherein when the currently executing job and the new job operate on a same data package, the new job is not executed.
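The two overlap tests described above differ by operation: an import is compared by the objects it touches, while an export is compared by the package it operates on. The following is a minimal sketch of that distinction. The `Job` and `Operation` types are hypothetical, all jobs are assumed to belong to the same automated process (and therefore the same operation), and sharing "a same set of objects" is interpreted here as having at least one object in common, consistent with the Story1 and Model1 scenario.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Operation(Enum):
    IMPORT = "import"
    EXPORT = "export"


@dataclass(frozen=True)
class Job:
    """One transport job (hypothetical shape)."""
    operation: Operation
    package_name: str
    object_ids: frozenset[str]


def overlaps(running: Job, new: Job) -> bool:
    """Return True if the new job conflicts with a currently executing job."""
    if new.operation is Operation.IMPORT:
        # Assumption: any common object counts as sharing a set of objects.
        return bool(running.object_ids & new.object_ids)
    # Export: conflict only when both jobs operate on the same package.
    return running.package_name == new.package_name


def may_execute(new: Job, running_jobs: Iterable[Job]) -> bool:
    """A new job executes only when it overlaps no currently executing job."""
    return not any(overlaps(running, new) for running in running_jobs)


import_a = Job(Operation.IMPORT, "PackageA",
               frozenset({"Dimension1", "Connection1", "Model1", "Model2", "Story1"}))
import_b = Job(Operation.IMPORT, "PackageB",
               frozenset({"Currency3", "Model1", "Story1"}))

print(may_execute(import_b, [import_a]))  # False: Model1 and Story1 are shared
```

Under these assumptions, two exports of different packages may proceed in parallel even if the packages contain common objects, whereas two imports may not.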
The above description illustrates various embodiments along with examples of how aspects of some embodiments may be implemented. The above examples and embodiments should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of some embodiments as defined by the following claims. Based on the above disclosure and the following claims, other arrangements, embodiments, implementations, and equivalents may be employed without departing from the scope hereof as defined by the claims.
This application is related to, and concurrently filed with, U.S. Patent Application Ser. No. (Unassigned; Attorney Docket No. 000005-105500US), entitled “SYSTEMS AND METHODS FOR AUTHORIZED MOVEMENT OF INFORMATION BETWEEN COMPUTER SYSTEMS”, naming Sahana Durgam Udaya as inventor, filed on Dec. 6, 2023, the disclosure of which is hereby incorporated herein by reference. This application is related to, and concurrently filed with, U.S. Patent Application Ser. No. (Unassigned; Attorney Docket No. 000005-105600US), entitled “SYSTEMS AND METHODS FOR SCHEDULING PACKAGES TO SYNCHRONIZE CONTENT ACROSS COMPUTER SYSTEMS”, naming Sahana Durgam Udaya and Pranav Kumar as inventors, filed on Dec. 6, 2023, the disclosure of which is hereby incorporated herein by reference. This application is related to, and concurrently filed with, U.S. Patent Application Ser. No. (Unassigned; Attorney Docket No. 000005-106300US), entitled “SYSTEMS AND METHODS FOR COPYING DATA BETWEEN COMPUTER SYSTEMS”, naming Sahana Durgam Udaya, Soumya Basavaraju, Abhishek Nagendra, Ashokkumar Kandasamy Narayanan, and Mickey Wong as inventors, filed on Dec. 6, 2023, the disclosure of which is hereby incorporated herein by reference. This application is related to, and concurrently filed with, U.S. Patent Application Ser. No. (Unassigned; Attorney Docket No. 000005-106400US), entitled “SYSTEMS AND METHODS FOR STORING AND RETRIEVING PUBLIC DATA”, naming Sahana Durgam Udaya as inventor, filed on Dec. 6, 2023, the disclosure of which is hereby incorporated herein by reference.