The present disclosure relates generally to the field of data processing and, more specifically, to computer-implemented systems and methods for processing a multi-dimensional data structure.
Data processing technology has evolved to the point where data can be stored and processed in multi-dimensional data structures. In a multi-dimensional data structure, data is often represented as multi-dimensional cubes. Each dimension of a multi-dimensional cube represents a different type of data. For example, a three-dimensional cube can be used to store travel expense data of an enterprise. A first dimension of the cube may specify travel costs, a second dimension of the cube may specify the months, and a third dimension of the cube may specify divisions of the enterprise. Hence, data in a particular cell of the cube may indicate the travel costs incurred by a particular division of the enterprise during a particular month.
As disclosed herein, computer-implemented systems and methods are provided for processing a multi-dimensional data structure. For example, systems and methods are provided for processing the multi-dimensional data structure and allowing cell selection rules related to the multi-dimensional data structure to be resolved efficiently, so that the computational cost associated with processing the multi-dimensional data structure can be reduced.
As another example, data in a multi-dimensional data structure is represented as multi-dimensional cubes. Data regarding a multi-dimensional cube that includes a plurality of multi-dimensional cells are received. A cell selection rule that defines one or more cells to be identified for computer-based operations is received. Cell indices are calculated for the one or more cells defined by the cell selection rule. The calculated cell indices are used to identify the one or more cells in the cube for performing the computer-based operations upon data associated with the identified cells.
As additional examples, a dimension of the cube includes a plurality of dimension members, each dimension member having an offset value that represents a position of the dimension member in the dimension. A cell of the cube includes at least one dimension member from each dimension of the cube. A cell of the cube is identifiable by a cell index associated with the cell. A determination is made as to whether the one or more cells defined by the cell selection rule form a sub-cube, where a cell in a sub-cube shares the same dimension members in all but one dimension of the cube with at least one other cell in the sub-cube. For the one or more cells forming a sub-cube, a first cell index for a starting cell of the sub-cube is calculated based on the offset values of dimension members of the starting cell. Cell indices for the remaining cells of the sub-cube are calculated based on the first cell index. The calculated cell indices are used to identify the one or more cells in the cube for performing the computer-based operations upon data associated with the sub-cube.
The multi-dimensional data structure processing system 104 can perform different processes for identifying the cells, such as sub-cube identification 112 and random cell identification 114, based on a determination of whether the cells defined by the cell selection rules form a sub-cube. For example, if the cells to be identified are determined not to form a sub-cube, the random cell identification 114 is performed to generate cell indices for the cells to be identified independently. On the other hand, if the cells to be identified are determined to form a sub-cube, the sub-cube identification 112 is performed to generate the cell indices for the cells to be identified using a dependent cell index calculation method.
The multi-dimensional cube 202 includes a plurality of cells 204 which are identifiable by cell indices associated with the cells. The cell selection rules 206 define certain cells in the cube 202 to be identified for the computer-based operations 212. Data regarding the cube 202 and the cell selection rules 206 are input to the cell identification process 208. The data regarding the cube 202 includes one or more of the following: financial data, sales data, enterprise performance data, inventory data, budget planning data, business intelligence data, and human resources data. Indices of the defined cells 210 are generated based on the data regarding the cube 202 and the cell selection rules 206. The cells defined by the cell selection rules 206 are identified using the generated indices 210. The computer-based operations 212 are performed upon data associated with the identified cells.
The number of dimension members in a particular dimension is called a cardinality of the dimension. For example, the dimension D1 includes three dimension members {M1, M2, M3}, the dimension D2 includes three dimension members {M4, M5, M6}, and the dimension D3 includes three dimension members {M7, M8, M9}. Thus, each dimension (e.g., D1, D2, and D3) of the cube 300 has a cardinality of 3.
A multiplier of a particular dimension (e.g., D1) represents the difference in cell indices between two cells which have adjacent dimension members in the particular dimension and share the same dimension members in all other dimensions. The multiplier of a dimension (e.g., D1) of a multi-dimensional cube can be determined based on the number of dimension members contained in the dimension and an order of dimensions in the multi-dimensional cube.
Further, each dimension member (e.g., M6, etc.) has an offset value that represents a position of the dimension member in the dimension. For example, if a cell 302 which can be represented by three dimension members (M1, M4, M7) is set to be a starting cell for the cube 300, the dimension member M1 has an offset value of 0 in the dimension D1, M4 has an offset value of 0 in the dimension D2, and M7 has an offset value of 0 in the dimension D3. Then, the dimension members M2 and M3 have offset values of 1 and 2 in the dimension D1, respectively. Similarly, the dimension members M5 and M6 have offset values of 1 and 2 in the dimension D2, respectively. Additionally, the dimension members M8 and M9 have offset values of 1 and 2 in the dimension D3, respectively.
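The relationship between dimension members, offset values, and cardinalities can be illustrated with a short Python sketch. The dictionary layout and the names `dimension_members`, `member_offsets`, and `cell_offsets` are assumptions made for illustration only; the disclosure does not prescribe a particular data structure.

```python
# Illustrative only: dimension members of the cube 300, listed in offset order.
dimension_members = {
    "D1": ["M1", "M2", "M3"],
    "D2": ["M4", "M5", "M6"],
    "D3": ["M7", "M8", "M9"],
}

# Cardinality of a dimension = number of dimension members it contains.
cardinalities = {dim: len(members) for dim, members in dimension_members.items()}

# Offset value of a member = its position within its dimension, starting at 0.
member_offsets = {
    dim: {member: offset for offset, member in enumerate(members)}
    for dim, members in dimension_members.items()
}


def cell_offsets(cell):
    """Translate a cell given as one member per dimension into offset values."""
    return tuple(
        member_offsets[dim][member]
        for dim, member in zip(dimension_members, cell)
    )


print(cardinalities)                     # {'D1': 3, 'D2': 3, 'D3': 3}
print(cell_offsets(("M1", "M4", "M7")))  # (0, 0, 0)
print(cell_offsets(("M3", "M5", "M9")))  # (2, 1, 2)
```

Keeping the members of each dimension in an ordered list means the offset value is simply the list position, so no separate offset table has to be maintained by hand.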
Among a plurality of cells 404 in the cube 402, the cell selection rules 406 may define certain cells to be identified for the computer-based operations 412 by specifying one or more members for every dimension in the cube. For example, with reference to
Data regarding the cube 402, dimensional metadata 416 (e.g., cardinalities, multipliers, etc.), and the cell selection rules 406 are input to the cell identification process 408. The cell identification process 408 generates cell indices of the cells defined by the cell selection rules 406 based on a determination of whether these cells form a sub-cube. If the cells defined by the cell selection rules 406 do not form a sub-cube, a random cell identification is performed in the cell identification process 408 by implementing an independent index calculation method in which a cell index for each cell is calculated independently. On the other hand, if the cells defined by the cell selection rules 406 do form a sub-cube, a sub-cube identification is performed in the cell identification process 408 by implementing a dependent index calculation method in which a cell index for each cell is calculated based on a cell with a known cell index.
Cell indices 410 are output for identifying the cells that are defined by the cell selection rules 406. Depending on whether the cells defined by the cell selection rules 406 form a sub-cube, an independent cell index calculation method or a dependent cell index calculation method may be implemented during the cell identification process 408 for generating the cell indices 410. Computer-based operations 412, including overwrite protection 418 and visibility protection 420, are performed upon data associated with the cells that are identified based on the cell indices 410.
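As a rough illustration of how the cell indices 410 might drive the overwrite protection 418, the following Python sketch guards writes with a set of protected indices. The `ProtectedCube` class, its flat list storage, the `authorized` flag, and the index values used are all hypothetical and are not taken from the disclosure.

```python
class ProtectedCube:
    """Hypothetical flat cell store with overwrite protection by cell index."""

    def __init__(self, cell_count, protected_indices):
        self.values = [None] * cell_count
        # A frozenset gives constant-time membership checks on every write.
        self.protected = frozenset(protected_indices)

    def write(self, flat_index, value, authorized=False):
        """Store a value unless the cell is protected and the caller is not authorized."""
        if flat_index in self.protected and not authorized:
            raise PermissionError(f"cell {flat_index} is protected from overwriting")
        self.values[flat_index] = value


# Illustrative indices only; in practice they would come from cell identification.
cube = ProtectedCube(cell_count=27, protected_indices={0, 6, 18, 24})

cube.write(1, 1200.0)                   # unprotected cell: accepted
try:
    cube.write(6, 999.0)                # protected cell: rejected
except PermissionError as error:
    print(error)
cube.write(6, 999.0, authorized=True)   # authorized caller: accepted
```

Because the protected cells are held as plain integers, the check on each write does not need to re-evaluate the cell selection rules.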
From an overall processing perspective, the system operations can be configured such as shown in
Particularly, the cells to be identified can be determined to form a sub-cube if a cell to be identified shares the same dimension members in all but one dimension of the cube with at least one other cell to be identified. If the cells to be identified are determined not to form a sub-cube, cell indices for the cells to be identified are calculated independently at 512. On the other hand, if the cells to be identified are determined to form a sub-cube, cell indices for the cells to be identified are calculated using a dependent cell index calculation method. A first cell index for a starting cell of the sub-cube is calculated based on the offset values of dimension members of the starting cell at 508. Then cell indices for the remaining cells of the sub-cube are calculated based on the first cell index at 510. The remaining cells of the sub-cube can be traversed sequentially based on an order of cell indices in the multi-dimensional cube.
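One possible reading of the sub-cube determination is sketched below in Python. The function name `forms_sub_cube`, the representation of a cell as a tuple of offset values, and the choice to require the neighbor condition for every selected cell (and to treat a single cell as not forming a sub-cube) are assumptions of this sketch rather than requirements stated in the disclosure.

```python
def differs_in_one_dimension(cell_a, cell_b):
    """True if two cells share members in all but exactly one dimension."""
    return sum(a != b for a, b in zip(cell_a, cell_b)) == 1


def forms_sub_cube(cells):
    """Check that every selected cell has a neighbor differing in one dimension."""
    unique_cells = list(set(cells))
    if len(unique_cells) < 2:
        return False  # assumption: a lone cell is handled independently
    return all(
        any(differs_in_one_dimension(cell, other)
            for other in unique_cells if other != cell)
        for cell in unique_cells
    )


# Cells given as offset values in (D1, D2, D3).
print(forms_sub_cube([(0, 0, 0), (0, 0, 2), (0, 2, 0), (0, 2, 2)]))  # True
print(forms_sub_cube([(0, 0, 0), (1, 1, 1)]))                        # False
```

The pairwise comparison is quadratic in the number of selected cells; a production system would more likely test the rule itself (for example, whether it is a cross product of per-dimension member lists) than compare cells one by one.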
Each dimension (e.g., D1, D2, and D3) of the cube 600 has a multiplier which can be determined based on the number of dimension members contained in the dimension and an order of dimensions in the cube 600. For example, the multipliers for the dimensions (e.g., D1, D2 and D3) can be calculated using the following example program, where an input parameter is “dimCardinalities” that represents the cardinalities of the dimensions of a multi-dimensional cube, and an output parameter is “dimMultipliers” which represents the multipliers of the dimensions of the multi-dimensional cube.
For the cube 600, the multipliers are determined to be 1 for the dimension D1, 9 for the dimension D2, and 3 for the dimension D3.
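A multiplier calculation of this kind could look roughly like the following Python sketch, which is an illustration and not the example program referred to above. The function name `compute_dim_multipliers` and the `fastest_first` argument, which lists dimension positions from the fastest-varying to the slowest-varying dimension, are assumptions; the ordering D1, D3, D2 is chosen here only because it reproduces the multipliers stated for the cube 600.

```python
def compute_dim_multipliers(dim_cardinalities, fastest_first):
    """Return one multiplier per dimension.

    dim_cardinalities: number of members in each dimension, e.g. [3, 3, 3].
    fastest_first: dimension positions ordered from fastest- to slowest-varying
                   (hypothetical way of expressing the "order of dimensions").
    """
    dim_multipliers = [0] * len(dim_cardinalities)
    stride = 1
    for dim in fastest_first:
        # Moving one member along this dimension skips `stride` cell indices.
        dim_multipliers[dim] = stride
        stride *= dim_cardinalities[dim]
    return dim_multipliers


# Positions 0, 1, 2 stand for D1, D2, D3; assumed order: D1, then D3, then D2.
print(compute_dim_multipliers([3, 3, 3], fastest_first=[0, 2, 1]))  # [1, 9, 3]
```

With this scheme the multiplier of a dimension is the product of the cardinalities of all dimensions that vary faster than it, which is why both the cardinalities and the dimension order are needed.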
If the cells to be identified are determined not to form a sub-cube, based on the cardinalities and the multipliers, the cell indices of the cells to be identified can be determined using an independent calculation method as follows:
flatIndex = c1*d1 + c2*d2 + c3*d3 + ... + cn*dn (1)
where flatIndex represents a cell index for a cell to be identified, n represents the number of dimensions, c1 ... cn are offset values of dimension members of the cell to be identified, and d1 ... dn are multipliers for the dimensions of a multi-dimensional cube that contains the cell.
An example program may be used to implement Equation 1 as follows, where input parameters are “dimCardinalities” and “dimMultipliers,” an output parameter is “flatIndex” which represents a cell index of a cell to be identified, and “cv” represents the offset values of dimension members of the cell.
Accordingly, the cell indices of the cells in the cube 600 may be determined using Equation 1 and the related program based on the cardinalities and the calculated multipliers (e.g., 1 for the dimension D1, 9 for the dimension D2, and 3 for the dimension D3). The cell 602 is a starting cell whose dimension members (e.g., M1, M4 and M7) all have offset values of zero. Table 1 shows the calculated cell indices of the cells in the cube 600.
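For illustration, Equation 1 can be written as a few lines of Python. This is a minimal sketch rather than the program referred to above; the function name `flat_index` and the range check on the offset values are assumptions, while `cv`, `dim_multipliers`, and `dim_cardinalities` mirror the parameter names used in the description.

```python
def flat_index(cv, dim_multipliers, dim_cardinalities):
    """Equation 1: sum of (offset value * multiplier) over all dimensions."""
    if not (len(cv) == len(dim_multipliers) == len(dim_cardinalities)):
        raise ValueError("need one offset, multiplier and cardinality per dimension")
    for offset, cardinality in zip(cv, dim_cardinalities):
        if not 0 <= offset < cardinality:
            raise ValueError(f"offset {offset} is outside 0..{cardinality - 1}")
    return sum(c * d for c, d in zip(cv, dim_multipliers))


dim_cardinalities = [3, 3, 3]   # D1, D2, D3 of the cube 600
dim_multipliers = [1, 9, 3]     # multipliers stated for D1, D2, D3

print(flat_index((0, 0, 0), dim_multipliers, dim_cardinalities))  # 0
print(flat_index((0, 0, 2), dim_multipliers, dim_cardinalities))  # 6
print(flat_index((0, 2, 0), dim_multipliers, dim_cardinalities))  # 18
print(flat_index((2, 2, 2), dim_multipliers, dim_cardinalities))  # 26
```

Each call performs one multiplication and one addition per dimension, which is the per-cell cost that the dependent calculation method aims to avoid when the selected cells form a sub-cube.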
On the other hand, if the cells to be identified are determined to form a sub-cube, then the cell indices of the cells to be identified can be calculated using a dependent cell index calculation method as follows:
flatIndex2 = flatIndex1 + dimMultipliers[dim] × (rank2 − rank1) (2)
where flatIndex2 represents a cell index of a cell to be identified, flatIndex1 represents a known cell index of a base cell, dim represents the single dimension in which the two cells differ, rank1 represents an offset value of the dimension member of the base cell in that dimension, rank2 represents an offset value of the dimension member of the cell to be identified in the same dimension, and dimMultipliers[ ] represents the multipliers of the dimensions of a multi-dimensional cube.
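Equation 2 amounts to a single multiply-and-add, as the following Python sketch suggests. The function name `dependent_flat_index` and the use of list positions 0, 1, 2 for D1, D2, D3 are assumptions of this illustration; the multipliers are those stated for the cube 600.

```python
def dependent_flat_index(flat_index1, dim_multipliers, dim, rank1, rank2):
    """Equation 2: derive a cell index from a base cell that differs in one dimension."""
    return flat_index1 + dim_multipliers[dim] * (rank2 - rank1)


dim_multipliers = [1, 9, 3]  # D1, D2, D3

# Base cell (M2, M5, M8) has offsets (1, 1, 1); Equation 1 gives 1*1 + 1*9 + 1*3 = 13.
base_index = 13

# Target cell (M2, M5, M9) differs only in D3 (position 2): offset 1 -> 2.
print(dependent_flat_index(base_index, dim_multipliers, dim=2, rank1=1, rank2=2))  # 16
```

The result agrees with Equation 1 applied directly to offsets (1, 1, 2), that is, 1×1 + 1×9 + 2×3 = 16, but it is obtained without revisiting the dimensions that did not change.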
As an example, a cell selection rule defines cells to be identified as cells with the dimension member M1 in the dimension D1, the dimension members M4 or M6 in the dimension D2, and the dimension members M7 or M9 in the dimension D3. As shown in
The cell indices of these four cells can be determined using the dependent cell index calculation method. With reference to
For example, the cell 604 is different from the cell 602 only in the dimension D3, where the cell 604 has the dimension member M9, and the cell 602 has the dimension member M7. As discussed above, the dimension D3 has a multiplier of 3. The dimension member M7 has an offset value of 0, and the dimension member M9 has an offset value of 2 in the dimension D3. Thus, according to Equation 2, the cell index of the cell 604 can be determined, using the cell 602 as the base cell, to be:
Cell index of cell 604 = Cell index of cell 602 + dimMultipliers[D3] × (offset of M9 − offset of M7) = 0 + 3 × 2 = 6.
Similarly, the cell 606 is different from the cell 602 only in the dimension D2, where the cell 606 has the dimension member M6, and the cell 602 has the dimension member M4. The dimension D2 has a multiplier of 9. The dimension member M4 has an offset value of 0, and the dimension member M6 has an offset value of 2 in the dimension D2. Thus, according to Equation 2, the cell index of the cell 606 can be determined, using the cell 602 as the base cell, to be:
Cell index of cell 606 = Cell index of cell 602 + dimMultipliers[D2] × (offset of M6 − offset of M4) = 0 + 9 × 2 = 18.
Additionally, the cell 608 is different from the cell 606 only in the dimension D3, where the cell 608 has the dimension member M9, and the cell 606 has the dimension member M7. The dimension D3 has a multiplier of 3. The dimension member M7 has an offset value of 0, and the dimension member M9 has an offset value of 2 in the dimension D3. Thus, according to Equation 2, the cell index of the cell 608 can be determined, using the cell 606 as the base cell, to be:
Cell index of cell 608 = Cell index of cell 606 + dimMultipliers[D3] × (offset of M9 − offset of M7) = 18 + 3 × 2 = 24.
As such, the cell indices of the cells defined by the cell selection rules are calculated for identifying these cells for the computer-based operations.
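The walk from the cell 602 to the cells 604, 606, and 608 can be generalized as in the following Python sketch, which computes the starting cell index once with Equation 1 and derives every other index with Equation 2. The function name `sub_cube_indices`, the input format (one list of selected offset values per dimension), and the dimension-by-dimension expansion order are assumptions of this illustration, not the traversal prescribed by the disclosure.

```python
def sub_cube_indices(selected_offsets, dim_multipliers):
    """Map each cell of a sub-cube (as a tuple of offsets) to its cell index.

    selected_offsets: for each dimension, the offsets picked by the selection rule.
    """
    # Starting cell: the first selected offset in every dimension (Equation 1, once).
    start = tuple(offsets[0] for offsets in selected_offsets)
    indices = {start: sum(c * d for c, d in zip(start, dim_multipliers))}

    # Expand one dimension at a time; each new cell differs from an already
    # known cell in exactly that dimension, so Equation 2 applies.
    for dim, offsets in enumerate(selected_offsets):
        base_rank = offsets[0]
        for cell, index in list(indices.items()):
            for rank in offsets[1:]:
                new_cell = cell[:dim] + (rank,) + cell[dim + 1:]
                indices[new_cell] = index + dim_multipliers[dim] * (rank - base_rank)
    return indices


# Rule from the example: M1 in D1, M4 or M6 in D2, M7 or M9 in D3.
result = sub_cube_indices([[0], [0, 2], [0, 2]], dim_multipliers=[1, 9, 3])
for cell, index in sorted(result.items(), key=lambda item: item[1]):
    print(cell, index)
# (0, 0, 0) 0
# (0, 0, 2) 6
# (0, 2, 0) 18
# (0, 2, 2) 24
```

Only the starting cell needs the full sum over all dimensions; each remaining cell costs one multiplication and one addition, which is the saving the dependent method is intended to provide.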
Each dimension of the cube 800 includes a plurality of dimension members. For example, the “time” dimension includes three dimension members: January, February, and March. The “product” dimension includes dimension members: product A, product B, and product C. Additionally, the “region” dimension includes dimension members: Midwest, Eastern, and Western. For example, the sales data stored in the cell 802 indicates the sales of the product C in the Western region in March amounts to $15,000.
Computer-based operations often need to be performed upon data associated with certain cells of a multi-dimensional cube.
Four cells 902, 904, 906 and 908 are defined by one or more cell selection rules to be identified for overwriting protection. That is, data contained in these four cells are to be protected from unauthorized overwriting, while data contained in all other cells of the cube 900 can be overwritten. As shown in
This written description uses examples to disclose the invention, including the best mode, and also to enable a person skilled in the art to make and use the invention. The patentable scope of the invention may include other examples. For example, a computer-implemented system and method can be configured for reducing computational cost associated with resolving a cell selection rule applied to one or more multi-dimensional data structures. As another example, a computer-implemented system and method can be configured for reducing computational cost associated with resolving a number of cell selection rules applied to one or more multi-dimensional data structures. As another example, a computer-implemented system and method can be configured such that a multi-dimensional data structure processing system 1102 can be provided on a stand-alone computer for access by a user, such as shown at 1100 in
As another example, the systems and methods may include data signals conveyed via networks (e.g., local area network, wide area network, internet, combinations thereof, etc.), fiber optic medium, carrier waves, wireless networks, etc. for communication with one or more data processing devices. The data signals can carry any or all of the data disclosed herein that is provided to or from a device.
Additionally, the methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
The systems' and methods' data (e.g., associations, mappings, data input, data output, intermediate data results, final data results, etc.) may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices and programming constructs (e.g., RAM, ROM, Flash memory, flat files, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
The systems and methods may be provided on many different types of computer-readable media including computer storage mechanisms (e.g., CD-ROM, diskette, RAM, flash memory, computer's hard drive, etc.) that contain instructions (e.g., software) for use in execution by a processor to perform the methods' operations and implement the systems described herein.
The computer components, software modules, functions, data stores and data structures described herein may be connected directly or indirectly to each other in order to allow the flow of data needed for their operations. It is also noted that a module or processor includes but is not limited to a unit of code that performs a software operation, and can be implemented for example as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code. The software components and/or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.
It should be understood that as used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise. Finally, as used in the description herein and throughout the claims that follow, the meanings of “and” and “or” include both the conjunctive and disjunctive and may be used interchangeably unless the context expressly dictates otherwise; the phrase “exclusive or” may be used to indicate a situation where only the disjunctive meaning may apply.
| Number | Date | Country |
|---|---|---|
| 20130054608 A1 | Feb 2013 | US |