A typical relational database system is capable of calculating a formula specified by a single relational statement (e.g., a Structured Query Language statement). However, relational algebra can only express formulas in which each formula operand is located in a same table row. For example, SQL distinguishes between AGGREGATION, an operation on one single column that involves all rows (e.g., SUM, MIN, MAX, AVG . . . ), and CALCULATION, an operation on a row that can involve all columns, but can only involve values within the same row. Therefore, a single relational statement may be used to perform an aggregation operation on values located in disparate rows of a same table column, but cannot be used to perform a calculation in which two or more formula operands are located in different table rows.
The following description is provided to enable any person in the art to make and use the described embodiments and sets forth the best mode contemplated for carrying out some embodiments. Various modifications, however, will remain readily apparent to those in the art.
In some embodiments, data source 110 is implemented in Random Access Memory (e.g., cache memory for storing recently-used data) and one or more fixed disks (e.g., persistent memory for storing their respective portions of the full database). Alternatively, data source 110 may implement an “in-memory” database, in which volatile (e.g., non-disk-based) memory (e.g., Random Access Memory) is used both for cache memory and for storing its entire respective portion of the full database. In some embodiments, the data of data source 110 may comprise one or more of conventional tabular data, row-based data, column-based data, and object-based data. Data source 110 may also or alternatively support multi-tenancy by providing multiple logical database systems which are programmatically isolated from one another. Moreover, the data of data source 110 may be indexed and/or selectively replicated in an index to allow fast searching and retrieval thereof.
Query server 130 generally provides data of data source 110 to reporting clients such as client 120. The data may be provided based on metadata 140, which provides information regarding the structure, relationships and meaning of the data of data source 110.
In some embodiments, metadata 140 describes database tables of data source 110. The descriptions may include table names, technical names and data types of each table column, and column names.
Client 120 may comprise one or more devices executing program code of a software application for presenting user interfaces to allow interaction with query server 130. Presentation of a user interface may comprise any degree or type of rendering, depending on the type of user interface code generated by query server 130. For example, client 120 may execute a Web Browser to receive a Web page (e.g., in HTML format) from query server 130, and may render and present the Web page according to known protocols. Client 120 may also or alternatively present user interfaces by executing a standalone executable file (e.g., an .exe file) or code (e.g., a JAVA applet) within a virtual machine.
According to some embodiments, client 120 may generate MultiDimensional eXpressions (MDX) statements based on user interaction with its user interfaces. Such a client 120 may comprise an “analysis client” typically used to execute calculations upon stored multi-dimensional data and to view the results of such calculations. Client 120 may also or alternatively generate SQL statements based on received user interactions. SQL-based clients are typically used to access and view stored relational (i.e., “tabular”) data. As will be described below, client 120 may generate MDX statements to execute calculations upon relational data stored in data source 110. System 100 may include any number of clients 120 of one or more types according to some embodiments.
A calculation is received prior to process 200. The calculation may be received from client 120 by query server 130. The calculation includes two or more operands, with each operand being associated with a row of a table and a value.
For purposes of providing an example of process 200 according to some embodiments, it will be assumed that a calculation is received prior to process 200 in the form of the following MDX statement received from client 120:
The above MDX statement is intended to generate a “calculated member”, which is a “new” table row including values derived by formula from values located in other rows.
Table 300 includes two columns, CALYEAR and SOLD_QUANTITY. Each column includes two rows and a value is associated with each row. The above MDX statement is intended to calculate the difference “Diff 2003/2002” between the number of sold entities (SOLD_QUANTITY) of year 2003 (CALYEAR=2003) and year 2002 (CALYEAR=2002). Stated symbolically, Diff 2003/2002=(SOLD_QUANTITY, CALYEAR=2003)−(SOLD_Quantity, CALYEAR=2002). This calculation is performed within a column (SOLD_QUANTITY), in that the values used in the calculation (10, 20) are located in different rows of the column. Accordingly, this calculation cannot be expressed in a single relational statement.
Accordingly, process 200 begins at S210, in which operands of the calculation are identified. Also identified at S210 is a column of the table including values associated with the operands and used in the calculation. In the present example, the operands are CALYEAR—2003 and CALYEAR—2002, and the column is SOLD_QUANTITY.
Next, at S220, a column is created for each operand. Each created column includes the value associated with its operand, in a row associated with the operand. Table 400 of
Similarly, the column SOLD_QUANTITY, CALYEAR=2002 was created for the operand CALYEAR, 2002. This column includes the value (i.e., 20) which is associated with its operand (i.e., CALYEAR, 2002) in the identified column (i.e., SOLD_QUANTITY), in a row of the newly-created column which is associated with the operand. The remaining rows of the newly-created columns are not populated.
Table 400 may simply comprise a logical structure to illustrate the intermediate states of process 200. That is, table 400 need not be instantiated at any time during process 200. If it were to be instantiated, the selections reflected therein could be expressed by the following SQL statement:
According to some embodiments, query server 130 supports restricted key figures, and therefore the selection of S220 could be logically visualized in an SQL-like notation as follows:
Next, at S230, the created columns are column-wise aggregated into a single row in order to have the values of all calculation operands in a single row. In the example above, the following columns remain after S230:
SOLD_QUANTITY, CALYEAR=2003
SOLD_QUANTITY, CALYEAR=2002
Only a single row remains after aggregation, because the aggregation of a value and NULL results in a single value (value+NULL=value).
Next, at S240, the calculation is executed using the values of the single row and a result column is created. The result column includes a result of the calculation. As mentioned above, location of the values in a single row facilitates the generation and execution of a single SQL statement which expresses the calculation.
Since it is known that the aggregation at S230 will result in a single row, the aggregation and the calculation at S240 may be executed simultaneously. The following SQL statement may be used in some embodiments to implement this functionality:
The first SQL statement presented herein is nested in the above SQL statement. Accordingly, the above SQL statement represents execution of S220-S240 of process 200.
In the case of a query server supporting restricted key figures, the following SQL-like notation may visualize S220-S240:
The result column is added as a row to the original table at S250. Table 700 illustrates this addition and represents the final result set. A constant value “Diff 2003/2002” has been added to fill the new row of column CALYEAR.
According to some embodiments, the result column is added at S250 by adding the relation of S240 (i.e., table 600) to the initial relation (i.e., table 300) using a UNION operation.
A suitable SQL statement for this union follows, which nests the first and third SQL statements set forth above:
Accordingly, the above SQL statement may implement S220-S250 in some embodiments.
Similarly, the following notation may visualize S220-S250 in conjunction with a query server supporting restricted key figures.
The above calculation is merely an example, and some embodiments may support any calculations in which operand values are located in disparate table rows.
According to some embodiments, query server 130 generates an execution plan based on the above SQL statements, as is known in the art, and passes the plan to data source 110 to generate the requested relation.
MDX interface 835 typically receives MDX statements from analysis client 825, which may specify calculations upon multi-dimensional data stored in multi-dimensional database 815, and interacts with multi-dimensional database 815 to provide corresponding data to MDX client 825. SQL interface 830, on the other hand, receives SQL statements from relational client 820 and retrieves corresponding data from relational database 810.
According to some embodiments, and as illustrated in
Apparatus 900 includes processor 910 operatively coupled to communication device 920, data storage device 930, one or more input devices 940, one or more output devices 950 and memory 960. Communication device 920 may facilitate communication with external devices, such as a reporting client, or a data storage device. Input device(s) 940 may comprise, for example, a keyboard, a keypad, a mouse or other pointing device, a microphone, knob or a switch, an infra-red (IR) port, a docking station, and/or a touch screen. Input device(s) 940 may be used, for example, to enter information into apparatus 900. Output device(s) 950 may comprise, for example, a display (e.g., a display screen) a speaker, and/or a printer.
Data storage device 930 may comprise any appropriate persistent storage device, including combinations of magnetic storage devices (e.g., magnetic tape, hard disk drives and flash memory), optical storage devices, Read Only Memory (ROM) devices, etc., while memory 960 may comprise Random Access Memory (RAM).
Query server 932 may comprise program code executed by processor 910 to cause apparatus 900 to perform any one or more of the processes described herein. Embodiments are not limited to execution of these processes by a single apparatus. Data source 934 may implement data source 110 as described above. As also described above, data source 110 may be implemented in volatile memory such as memory 960. Data storage device 930 may also store data and other program code for providing additional functionality and/or which are necessary for operation of apparatus 900, such as device drivers, operating system files, etc.
The foregoing diagrams represent logical architectures for describing processes according to some embodiments, and actual implementations may include more or different components arranged in other manners. Other topologies may be used in conjunction with other embodiments. Moreover, each system described herein may be implemented by any number of devices in communication via any number of other public and/or private networks. Two or more of such computing devices may be located remote from one another and may communicate with one another via any known manner of network(s) and/or a dedicated connection. Each device may comprise any number of hardware and/or software elements suitable to provide the functions described herein as well as any other functions. For example, any computing device used in an implementation of system 100 may include a processor to execute program code such that the computing device operates as described herein.
All systems and processes discussed herein may be embodied in program code stored on one or more non-transitory computer-readable media. Such media may include, for example, a floppy disk, a CD-ROM, a DVD-ROM, a Flash drive, magnetic tape, and solid state Random Access Memory (RAM) or Read Only Memory (ROM) storage units. Embodiments are therefore not limited to any specific combination of hardware and software.
Embodiments described herein are solely for the purpose of illustration. Those in the art will recognize other embodiments may be practiced with modifications and alterations to that described above.