Hypothetical analysis is used in a range of business areas and business functions. For example, enterprises planning future actions may want to use Data Science capabilities to determine which actions could benefit them the most. Data Science functionalities might include mining results of past actions, learning likely correlations, predicting trends, and simulating effects of the hypothetical actions. These functionalities could help both to suggest reasonable business strategies and to evaluate the effects of pursuing those strategies. Such evaluation is quite difficult in complex environments where competitors, customers, and economies dynamically evolve.
System 100 includes a service 105 that may be developed and delivered to one or more entities such as, for example, client devices 120, 125, and 130. In some aspects, service 105 may be a cloud-based service provided by a service provider thereof. Service 105 may be supported and facilitated by one or more backend systems 110 and 115. Backend systems 110 and 115 may include hardware, software, and combinations thereof configured to deliver and make service 105 available to client devices 120, 125, and 130. Backend systems 110 and 115 may include, alone and/or in combination, processors, different types of memory, software applications and operating systems, communication devices, and interface mechanisms to facilitate communication between the different components and a framework or platform to deliver service 105.
Service 105 may comprise an application server, an enterprise application, a messaging service (e.g., a mail service), a social networking service, a data center to provide resources from data sources (not shown), and other systems, devices, components, and resources. In some aspects, service 105 may include an enterprise application used for conducting enterprise planning and analysis. Some such enterprise applications may include, for example, Enterprise Performance Management, Financial Planning, Sales and Opportunity Planning, and Business Warehouse Planning applications provided by the assignee hereof, SAP SE. These, and other, planning applications may be used for decision support and report generation transactions, as well as for warehouse analysis.
In some instances, service 105 may include a cloud-based application, system, service, or resource that provides a service, resource, and/or access to a service or resource to client devices 120, 125, and 130. Service 105 may be delivered by a service provider remotely located from client devices 120, 125, and 130. Communication between the client devices, service provider, and the backend systems may be accomplished using any communication protocol now known or that becomes known.
In some aspects, an enterprise application and/or an administrator, user, or other entity may have a desire to analyze one or more “what if” results in parallel to an on-going operation of a data management system. In some regards, it may be desirable to analyze the different “what if” scenarios for one or more different hypothetical actions while also allowing other applications and processes to concurrently perform actual updates and other operations on the underlying data in a manner such that the hypothetical (i.e., “what if”) operations do not interfere with each other or with actual updates and other changes to the data.
Some database systems offer a form of snapshot isolation capabilities so that transactions can operate on data as the data existed as of particular (logical) times, usually when each transaction or a statement within a transaction started. With a basic form of snapshot isolation, changes of the same data by different transactions would interfere with each other, even if those changes were hypothetical modifications in transactions that support analysis but would never be committed.
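By way of non-limiting illustration, the snapshot visibility rule described above may be sketched as follows. The names Version, snapshot_read, and history are hypothetical and are introduced only for this sketch; they are not taken from any particular database system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    commit_ts: int   # logical time at which this version was committed
    value: object    # the row content as of that commit


def snapshot_read(versions, snapshot_ts):
    """Return the newest value committed at or before snapshot_ts, else None."""
    visible = [v for v in versions if v.commit_ts <= snapshot_ts]
    if not visible:
        return None
    return max(visible, key=lambda v: v.commit_ts).value


# Two committed versions of one row, at logical times 5 and 12.
history = [Version(5, 100), Version(12, 140)]
```

A transaction whose snapshot was taken at logical time 10 would read the version committed at time 5, regardless of the later commit at time 12.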
In some embodiments, a “Branching Deltas” approach to perform hypothetical data changes is disclosed herein. The disclosed approach uses an innovative and efficient computation process to leverage visibility semantics that may already be used for implementing snapshot isolation by, for example, multiversion concurrency control (MVCC) systems such as SAP HANA. Herein, the disclosed “Branching Deltas” approach may be described primarily in the context of and based on SAP HANA's in-memory database management system design. However, the approach disclosed herein, including a variation of the “Branching Deltas” approach, may be extended to encompass and be implemented for other database management systems using snapshot isolation (e.g., MVCC systems) and other isolation levels (notwithstanding different semantics) such as Read Consistency.
Table 300 T′ is illustrated in
If there were multiple simultaneous hypothetical transactions, they would have separate timestamps. In some regards, actual transactions do not see changes made by hypothetical transactions. Modifications to an actual table committed after a hypothetical transaction started would not be visible to the hypothetical transaction. Furthermore, hypothetical transactions herein do not commit and do not hold locks on data, unless specifically noted herein for a particular instance or circumstance.
Whereas table T 200 illustrates the data as it existed at the start of a hypothetical modification of that data and table T′ 300 illustrates the hypothetical changes to be made to the data of table T,
For the example of
In some aspects, hypothetical transactions including deletions of data may be handled by a number of different methods. One method to handle deletions is to enter a “tombstone” indicating that a given row has been deleted as of a specific logical time (or during an in-flight transaction). Another method that may be used is to enter a null or other default value (e.g., zero) for the non-key columns of a row that is deleted. The former method may be used in some embodiments herein, although the latter method may also be used.
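The two deletion methods described above may be sketched as follows, as one possible illustration only. TOMBSTONE, overwrite, and delete_with_default are hypothetical names, and a table is modeled here as a plain dictionary keyed by row key, which is an assumption of the sketch.

```python
TOMBSTONE = object()  # hypothetical sentinel marking a row as deleted


def overwrite(base, delta):
    """Apply delta on top of base; a tombstone removes the row from the result."""
    result = dict(base)
    for key, row in delta.items():
        if row is TOMBSTONE:
            result.pop(key, None)
        else:
            result[key] = row
    return result


def delete_with_default(delta, key, non_key_columns, default=None):
    """Alternative method: keep the row but blank out its non-key columns."""
    delta[key] = {column: default for column in non_key_columns}


base = {"A": {"Profit": 10}, "B": {"Profit": 20}}

tombstone_delta = {"B": TOMBSTONE}
after_tombstone = overwrite(base, tombstone_delta)

default_delta = {}
delete_with_default(default_delta, "B", ["Profit"], default=0)
after_default = overwrite(base, default_delta)
```

With the tombstone method the deleted row disappears from the visible result, whereas with the default-value method the row remains present with zeroed non-key columns.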
While the processes herein may be described in relation to SAP HANA in some embodiments, aspects of the present disclosure may find applicability in other (SQL) databases. That is, embodiments of the present disclosure are not limited to SAP HANA, neither specifically nor completely, and may be generically applicable to other database management systems.
The present disclosure will now describe how the results shown in
In the following example, it will be assumed that modifications/updates change only a single column, in this case, the column labeled “Profit”. In some embodiments, changes may occur in at least one column of at least one table of a collection or set of data related to hypothetical planning herein.
In some embodiments, prior to operation 505, tables T and T′ including data persisted in a database may be generated. Storage of data in the tables (e.g., T and T′) may comprise two data structures—a Main storage (i.e., Main) and a Delta storage (i.e., Delta) in terms of SAP HANA. In some aspects, the Main storage may be defined and optimized for read performance and memory consumption (i.e., dictionary compression and other memory compression techniques) and may be immutable or read-only. The Delta storage may be defined and optimized for modifications to be made to the data stored in the Main. An online Delta Merge may be run periodically on tables, based on triggers such as Delta size or time, to produce a new, read efficient Main. Since Delta Merge is a background task, database work continues during Delta Merge. Other details and specific characteristics of the Main and Delta exist and are not discussed herein in great detail.
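As a simplified, non-limiting sketch of the Main and Delta arrangement described above, the following models a read-only Main, a write-optimized Delta, and a merge that produces a new Main. The class and method names are hypothetical, and compression, background execution, and merge triggers are deliberately omitted.

```python
from types import MappingProxyType


class Table:
    """Toy table with a read-only Main and a mutable Delta (illustrative only)."""

    def __init__(self, rows):
        self.main = MappingProxyType(dict(rows))  # read-only view of Main
        self.delta = {}                           # receives all modifications

    def write(self, key, row):
        self.delta[key] = row                     # Main is never modified in place

    def read(self, key):
        # The latest entry wins: Delta overrides Main for the same key.
        return self.delta.get(key, self.main.get(key))

    def delta_merge(self):
        # Fold the Delta into a new Main and start with an empty Delta.
        merged = dict(self.main)
        merged.update(self.delta)
        self.main = MappingProxyType(merged)
        self.delta = {}


table = Table({"A": {"Profit": 10}})
table.write("A", {"Profit": 12})
read_before_merge = table.read("A")
table.delta_merge()
read_after_merge = table.read("A")
```

Reads return the same result before and after the merge; only the location of the data changes.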
In some aspects, a principal property of the “Branching Deltas” approach disclosed herein to be implementable in a database system is the capability that all new modifications to a table go into a separate partition, with visibility semantics determined by “latest entry” for rows even when, for example, they are in different partitions. This property may have value for other reasons; for example, bulk inserts or streams might be pre-processed to create a new partition (or a new part of a partition) in an efficient (perhaps binary) form. This disclosure is applicable to database systems that do not have Main and Delta as described for HANA. For example, it could be embodied in database systems that support visibility semantics for modifications across multiple stores, or even (as will be shown in [0040]) in systems with a single store.
For systems like SAP HANA that have Main and Delta, or possibly multiple deltas, the idea of “Branching Deltas” may be readily employed. Namely, an additional delta may be generated and used for each hypothetical transaction, similar to the hypothetical modifications (T) presented in
The term “Branching Deltas” is used herein because a new “branching” Hypothetical Delta is created for each hypothetical transaction, in some embodiments herein. Instead of defining visibility based on a Main overwritten by the standard Delta, the data management system defines visibility for a hypothetical transaction based on (Main OverWritten by Delta) OverWritten by the hypothetical transaction's Hypothetical Delta.
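The layered visibility rule described above may be illustrated, under simplifying assumptions, with Python's collections.ChainMap, which resolves a lookup in the first mapping that contains the key. The row keys and values below are hypothetical, and timestamps are ignored in this sketch, so the standard Delta is treated as fixed at the time the hypothetical transaction started.

```python
from collections import ChainMap

main = {"A": {"Profit": 10}, "B": {"Profit": 20}}
delta = {"B": {"Profit": 25}}               # committed actual modification
hypothetical_delta = {"A": {"Profit": 15}}  # uncommitted "what if" modification

# Actual transactions: Main overwritten by Delta.
actual_view = ChainMap(delta, main)

# Hypothetical transaction: (Main overwritten by Delta) overwritten by its
# own Hypothetical Delta.
hypothetical_view = ChainMap(hypothetical_delta, delta, main)
```

The hypothetical view sees its own change to row A together with the committed change to row B, while the actual view is unaffected by the hypothetical change.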
Further illustrated in
Returning to
Operation 510 of process 500 includes generating, in response to the request to initiate the hypothetical transaction, a Hypothetical Delta storage data structure to include any changes made by the hypothetical transaction.
Process 500 proceeds to operation 515 that includes applying the changes of the hypothetical transaction included in the hypothetical Delta to the actual table to obtain a hypothetical result data structure that includes the data of the actual table existing at the first logical time and the changes made by the hypothetical transaction.
In some aspects, the “Branching Deltas” approach disclosed herein involving hypothetical transactions does not hold locks and does not commit data to storage. Accordingly, the hypothetical transactions do not interfere with each other or with actual transactions. While some memory is required for hypothetical modifications, the original data is not copied. In some embodiments, a programmer must specify or otherwise indicate that a transaction is hypothetical to invoke some of the aspects disclosed herein. In some embodiments, an underlying system may be changed so that a Hypothetical Delta is created for any table that is modified and modifications are placed into that Hypothetical Delta, not the (actual) Delta used by actual transactions. In some regards, the rolling back of a hypothetical transaction may only require that the Branching or Hypothetical Deltas of the hypothetical transaction be dropped.
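The following is a minimal sketch, under stated assumptions, of how a hypothetical transaction with its own Hypothetical Delta might behave. Store and HypotheticalTxn are hypothetical names, the store is modeled as an append-only version list per key, and the sketch is not a description of any actual system's implementation.

```python
class Store:
    """Toy versioned store: each key maps to an append-only list of versions."""

    def __init__(self):
        self.clock = 0        # logical time of the latest commit
        self.versions = {}    # key -> [(commit_ts, value), ...] in commit order

    def commit(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))

    def read_as_of(self, key, ts):
        value = None
        for commit_ts, candidate in self.versions.get(key, []):
            if commit_ts <= ts:
                value = candidate
        return value

    def read(self, key):
        return self.read_as_of(key, self.clock)


class HypotheticalTxn:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.clock  # logical time when the transaction starts
        self.hyp_delta = {}          # operation 510: private Hypothetical Delta

    def write(self, key, value):
        self.hyp_delta[key] = value  # no lock taken, nothing committed

    def read(self, key):
        # Operation 515: snapshot data overwritten by this transaction's changes.
        if key in self.hyp_delta:
            return self.hyp_delta[key]
        return self.store.read_as_of(key, self.start_ts)

    def rollback(self):
        self.hyp_delta.clear()       # dropping the delta is the whole rollback


store = Store()
store.commit("A", 10)
t1 = HypotheticalTxn(store)
t2 = HypotheticalTxn(store)
t1.write("A", 15)
store.commit("A", 11)                # actual update after both have started
```

In this sketch neither hypothetical transaction sees the other's changes or the later actual commit, and the original data is referenced rather than copied.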
In some embodiments, instead of entering a hypothetical transaction's modifications in a Hypothetical Delta, the modifications could be entered (with the TransactionId of the hypothetical transaction) into the standard Delta. Since hypothetical transactions never commit and hold no locks, hypothetical transactions of this example would still not interfere with each other or with actual transactions. When such a hypothetical transaction rolls back, the modifications entered on its behalf into the standard Delta are irrelevant, just as they would be for an actual transaction that rolled back. If a Delta Merge occurs before the hypothetical transaction is rolled back, its changes will be in the new Main but once again they are irrelevant just as they would be for an actual transaction and they will be discarded the next time the Delta Merge occurs.
In some respects, an advantage of putting hypothetical modifications into the standard Delta, rather than a Hypothetical (Branching) Delta, is that there is a single OverWrite, rather than two OverWrites. Moreover, no special check needs to be made to see if a given transaction is hypothetical or not when determining visibility. Conversely, entering hypothetical modifications into the standard Delta may fill up the standard Delta more quickly and require additional work by an actual transaction to scan the hypothetical modifications and ignore them.
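The alternative of entering hypothetical modifications into the standard Delta may be sketched as follows. The function visible_value, the tuple layout of a Delta entry, and the transaction identifiers are assumptions made for this illustration only.

```python
def visible_value(main, delta, committed, reader_txn, key):
    """Return the value of key as seen by reader_txn.

    delta is a list of (txn_id, key, value) entries in arrival order. An entry
    is visible if its transaction committed or if the reader wrote it itself.
    """
    value = main.get(key)
    for txn_id, entry_key, entry_value in delta:
        if entry_key != key:
            continue
        if txn_id in committed or txn_id == reader_txn:
            value = entry_value
    return value


main = {"A": 10}
committed = {1}                   # transaction 1 is an actual, committed one
delta = [
    (1, "A", 11),                 # actual modification
    (99, "A", 15),                # hypothetical modification, never committed
]

seen_by_hypothetical = visible_value(main, delta, committed, 99, "A")
seen_by_actual = visible_value(main, delta, committed, 2, "A")
```

The loop also shows the cost noted above: a reader running an actual transaction must scan past the hypothetical entry and ignore it.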
In some embodiments, a principle needed for “Branching Deltas” to be implementable is the ability for multiple “partitions” to store versions of the same row, with the right visibility semantics. However, Applicants have realized that one can implement “Branching Delta” capabilities while performing modifications in the partition assigned to the row (e.g., by hash or range partitioning). This methodology may work for many database systems providing snapshot isolation, particularly those based on MVCC, as well as other isolation methods, such as Read Committed.
Processor 1105 may communicate with a storage device 1130. Storage device 1130 may comprise any appropriate information storage device, including combinations of magnetic storage devices (e.g., a hard disk drive), optical storage devices, solid state drives, and/or semiconductor memory devices. In some embodiments, storage device 1130 may comprise a database system, including in some configurations an in-memory database.
Storage device 1130 may store program code or instructions for a data processing engine 1135 that may provide processor executable instructions for hypothetical planning processes, in accordance with processes herein. Processor 1105 may perform the instructions of the data processing engine 1135 to thereby operate in accordance with any of the embodiments described herein. The program instructions for data processing engine 1135 may be stored in a compressed, uncompiled and/or encrypted format. Program instructions for data processing engine 1135 may furthermore include other program elements, such as an operating system, a database management system, and/or device drivers used by the processor 1105 to interface with other devices and systems (not shown in
All systems and processes discussed herein may be embodied in program code stored on one or more tangible, non-transitory computer-readable media. Such media may include, for example, a floppy disk, a CD-ROM, a DVD-ROM, a Flash drive, magnetic tape, and solid state Random Access Memory (RAM) or Read Only Memory (ROM) storage units. Embodiments are therefore not limited to any specific combination of hardware and software.
In some embodiments, aspects herein may be implemented by an application, device, or system to manage recovery of an entity or other application in a consistent manner across different devices, effectively across an entire domain.
In some aspects and embodiments, the processes and techniques disclosed herein may be extended to allow hypothetical transactions to persist updates to special tables. Instead of throwing away results, a hypothetical transaction may write results to its hypothetical modification tables and perhaps to other special “Hypothetical Analysis” files or tables, supporting comparison and analysis of results across multiple planning exercises and simulations. Modifications to standard tables would be treated as described earlier, with neither locks nor persistence.
In some embodiments, additional or alternative extensions may allow hypothetical transactions to commit changes to standard tables, as well as special tables. If there are no conflicts, this is relatively simple. If there are conflicts, they could be resolved in various ways including, for example: last one wins; when updates are commutative or otherwise composable, compose them together; show the conflicts to the planner and have the planner decide how to resolve them; roll back the effects of the hypothetical transaction; or determine the intent of the hypothetical transaction and apply that intent to the current state of the database.
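Two of the conflict resolution options listed above may be sketched as follows. The function names are hypothetical, and the sketch assumes that "last one wins" means the later-committing hypothetical value replaces the current value and that the composable case is limited to additive changes to a numeric column.

```python
def last_one_wins(current_value, hypothetical_value):
    """The hypothetical transaction commits last, so its value replaces the current one."""
    return hypothetical_value


def compose_increments(base_value, current_value, hypothetical_value):
    """Treat both updates as additive changes relative to the common base value."""
    hypothetical_change = hypothetical_value - base_value
    return current_value + hypothetical_change


base_profit = 100          # value when the hypothetical transaction started
current_profit = 110       # an actual transaction has since added 10
hypothetical_profit = 130  # the hypothetical transaction added 30 to the base

resolved_last = last_one_wins(current_profit, hypothetical_profit)
resolved_composed = compose_increments(base_profit, current_profit, hypothetical_profit)
```

Under the first rule the actual update of 10 is lost, whereas under the second rule both changes are preserved, for a result of 140.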
In some embodiments, additional or alternative extensions may allow hypothetical planners to see their proposed changes applied to the current state of the database (always or on request), so that they have a more current view of effects of their plans. Conflicts could be resolved as described for the previous item.
In some embodiments, additional or alternative extensions may allow recursive hypothetical planning, where after initial planning a hypothetical plan might consider hypothetical alternatives, choose the best, and continue planning.
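One possible way to picture recursive hypothetical planning is as nested layers of changes, where each alternative is a child layer over the current plan. The sketch below uses collections.ChainMap and its new_child method; the column name, the values, and the choice of "highest Profit" as the selection criterion are assumptions made for illustration.

```python
from collections import ChainMap

actual = {"Profit": 100}

# Initial hypothetical plan: an empty layer of changes over the actual data.
plan = ChainMap({}, actual)
plan["Profit"] = 120                 # writes always go to the top layer only

# Two nested hypothetical alternatives branching from the plan.
alternative_a = plan.new_child()
alternative_a["Profit"] = 130
alternative_b = plan.new_child()
alternative_b["Profit"] = 125

# Choose the best alternative, fold its changes into the plan, and continue.
best = max((alternative_a, alternative_b), key=lambda world: world["Profit"])
plan.update(best.maps[0])
```

After the best alternative is adopted, the plan reflects its changes while the actual data remains untouched.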
In some embodiments, additional or alternative extensions may generalize hypothetical transactions to hypothetical processes, or other planning scenarios involving multiple transactions. One variation allows a planner to create alternative worlds (e.g., with different supplies and prices) and play actual transactions against those alternative worlds, observing a subset of the financial consequences in each alternative world. In some embodiments, a hypothetical table herein does not have to be materialized.
Although some embodiments have been described with respect to cloud-based entities, some embodiments may be associated with other types of entities that need not be cloud-based, either in part or whole, without any loss of generality.
The embodiments described herein are solely for the purpose of illustration. Those in the art will recognize other embodiments which may be practiced with modifications and alterations.