Claims
- 1. A method for improving performance for SQL queries, comprising:
obtaining multidimensional metadata associated with a cube model metadata object; automatically identifying one or more summary tables to be built based on the obtained multidimensional metadata; and automatically identifying one or more indexes to create based on the obtained multidimensional metadata.
- 2. The method of claim 1, wherein identifying the one or more summary tables is based on at least one of database statistics, sampling, query type, diskspace limitations, time limitations, and multidimensional regions to be optimized.
- 3. The method of claim 1, wherein identifying the one or more indexes is based on at least one of database statistics, sampling, query type, diskspace limitation, time limitations, and the generated one or more summary tables.
- 4. The method of claim 1, further comprising:
generating statements to be executed to create the one or more summary tables and the one or more indexes.
- 5. The method of claim 1, wherein at least one of the SQL queries is submitted by an on-line analytical processing system.
- 6. The method of claim 1, further comprising:
consolidating multiple hierarchies within each dimension; and replacing one or more of the consolidated multiple hierarchies with one or more virtual hierarchies that contain a set of attributes from one or more of the consolidated multiple hierarchies.
- 7. The method of claim 1, further comprising:
defining a virtual hierarchy for each network hierarchy within the cube model metadata object, wherein each virtual hierarchy includes attributes from an associated network hierarchy in a selected order.
- 8. The method of claim 1, wherein identifying the one or more summary tables further comprises:
identifying the one or more summary tables without derivative measures.
- 9. The method of claim 1, wherein identifying the one or more summary tables further comprises:
rating effectiveness of a slice based on what coverage is provided by the slice and how much performance would improve with the use of the slice.
- 10. The method of claim 1, wherein identifying the one or more summary tables further comprises:
rating a cost of creating a slice within one of the summary tables based on one or more of an estimated number of rows in the slice, an estimated size of the slice, a maximum row width of the slice, a number of columns in the slice, and an amount of time required to build the slice.
- 11. The method of claim 10, further comprising:
rating a subset of slices by including at least one of zero or more non-crossing slices and two or more crossing slices from a set of candidate slices.
- 12. The method of claim 1, further comprising:
determining cost effective slices for drilldown queries by calculating cost effectiveness of a current slice starting from a highest level of each hierarchy in one or more dimensions and rating one or more combinations of slices one level below a current level from each of the hierarchies until a predefined limitation has been reached, wherein rating gives preference to higher level slices.
- 13. The method of claim 1, further comprising:
determining cost effective slices for report queries by calculating cost effectiveness of a current slice starting from a highest level of each hierarchy in one or more dimensions and rating one or more combinations of slices one level below a current level from each of the hierarchies until a predefined limitation has been reached, wherein rating gives preferences to lower level slices.
- 14. The method of claim 1, further comprising optimizing a subset of the cube model metadata object:
determining cost effective slices for drillthrough queries by calculating cost effectiveness of a current slice starting from a highest level of each hierarchy in one or more dimensions and rating one or more combinations of slices one level below a current level from each of the hierarchies a predefined limitation has been reached, wherein rating gives preferences to lower level slices.
- 15. The method of claim 1, further comprising:
receiving one or more regions to optimize for extract queries; for each region, identifying a bottom slice; and consolidating bottom slices of zero or more regions based on cost of the bottom slice identified for each region.
- 16. The method of claim 1, wherein the cube model metadata object includes one or more sets of asymmetric measures, wherein each set of asymmetric measures includes one or more slices, and further comprising:
for each set of compatible asymmetric measures, selecting one or more slices that match an inner-most select statement of a multi-step select statement.
- 17. The method of claim 1, further comprising:
building a clustering index for a summary table by using columns of one or more hierarchies and factoring in one or more of estimated index cardinality, average number of rows per index entry, estimated data distribution of index entries, the inclusion of time attributes, and the query type being optimized.
- 18. The method of claim 1, further comprising:
building non-clustering indexes for a summary table by using columns of one or more hierarchies and selecting indexes based on one or more of estimated index cardinality, estimated data distribution of index entries, the inclusion of time attributes, and the query type being optimized until the combined sizes of the indexes for the summary table exceed a user specified size limit or the number of indexes for the summary table reaches a defined limit.
- 19. The method of claim 1, further comprising:
when at least one measure in a set of measures uses a non-distributive function, determining cost effective slices by evaluating the set of measures to determine whether a set of slices is needed in the summary table to optimize a region of the cube model metadata object.
- 20. The method of claim 1, wherein identifying the one or more summary tables further comprises:
rating effectiveness of a region based on what coverage is provided by the region and how much performance would improve with the use of the region.
- 21. The method of claim 1, wherein identifying the one or more summary tables further comprises:
rating a cost of creating a region within one of the summary tables based on one or more of an estimated number of rows in the region, an estimated size of the region, a maximum row width of the region, a number of columns in the region, and an amount of time required to build the slice.
- 22. An article of manufacture including a program for improving performance for SQL queries, wherein the program causes operations to be performed, the operations comprising:
obtaining multidimensional metadata associated with a cube model metadata object; automatically identifying one or more summary tables to be built based on the obtained multidimensional metadata; and automatically identifying one or more indexes to create based on the obtained multidimensional metadata.
- 23. The article of manufacture of claim 22, wherein identifying the one or more summary tables is based on at least one of database statistics, sampling, query type, diskspace limitations, time limitations, and multidimensional regions to be optimized.
- 24. The article of manufacture of claim 22, wherein identifying the one or more indexes is based on at least one of database statistics, sampling, query type, diskspace limitation, time limitations, and the generated one or more summary tables.
- 25. The article of manufacture of claim 22, the operations further comprising:
generating statements to be executed to create the one or more summary tables and the one or more indexes.
- 26. The article of manufacture of claim 22, the operations further comprising:
consolidating multiple hierarchies within each dimension; and replacing one or more of the consolidated multiple hierarchies with one or more virtual hierarchies that contain a set of attributes from one or more of the consolidated multiple hierarchies.
- 27. The article of manufacture of claim 22, wherein the operations for identifying the one or more summary tables further comprise:
rating effectiveness of a slice based on what coverage is provided by the slice and how much performance would improve with the use of the slice.
- 28. The article of manufacture of claim 22, wherein the operations for identifying the one or more summary tables further comprise:
rating a cost of creating a slice within one of the summary tables based on one or more of an estimated number of rows in the slice, an estimated size of the slice, a maximum row width of the slice, a number of columns in the slice, and an amount of time required to build the slice.
- 29. The article of manufacture of claim 28, the operations further comprising:
rating a subset of slices by including at least one of zero or more non-crossing slices and two or more crossing slices from a set of candidate slices.
- 30. The article of manufacture of claim 22, the operations further comprising:
determining cost effective slices for drilldown queries by calculating cost effectiveness of a current slice starting from a highest level of each hierarchy in one or more dimensions and rating one or more combinations of slices one level below a current level from each of the hierarchies until a predefined limitation has been reached, wherein rating gives preference to higher level slices.
- 31. The article of manufacture of claim 22, wherein the cube model metadata object includes one or more sets of asymmetric measures, wherein each set of asymmetric measures includes one or more slices, the operations further comprising:
for each set of compatible asymmetric measures, selecting one or more slices that match an inner-most select statement of a multi-step select statement.
- 32. The article of manufacture of claim 22, the operations further comprising:
building a clustering index for a summary table by using columns of one or more hierarchies and factoring in one or more of estimated index cardinality, average number of rows per index entry, estimated data distribution of index entries, the inclusion of time attributes, and the query type being optimized.
- 33. The article of manufacture of claim 22, the operations further comprising:
building non-clustering indexes for a summary table by using columns of one or more hierarchies and selecting indexes based on one or more of estimated index cardinality, estimated data distribution of index entries, the inclusion of time attributes, and the query type being optimized until the combined sizes of the indexes for the summary table exceed a user specified size limit or the number of indexes for the summary table reaches a defined limit.
- 34. The article of manufacture of claim 22, wherein the operations for identifying the one or more summary tables further comprise:
rating effectiveness of a region based on what coverage is provided by the region and how much performance would improve with the use of the region.
- 35. The article of manufacture of claim 22, wherein the operations for identifying the one or more summary tables further comprise:
rating a cost of creating a region within one of the summary tables based on one or more of an estimated number of rows in the region, an estimated size of the region, a maximum row width of the region, a number of columns in the region, and an amount of time required to build the slice.
- 36. A system for improving performance for SQL queries, comprising:
a computer system having at least one program for: (i) obtaining multidimensional metadata associated with a cube model metadata object; (ii) automatically identifying one or more summary tables to be built based on the obtained multidimensional metadata; and (iii) automatically identifying one or more indexes to create based on the obtained multidimensional metadata.
- 37. The system of claim 36, wherein identifying the one or more summary tables is based on at least one of database statistics, sampling, query type, diskspace limitations, time limitations, and multidimensional regions to be optimized.
- 38. The system of claim 36, wherein identifying the one or more indexes is based on at least one of database statistics, sampling, query type, diskspace limitation, time limitations, and the generated one or more summary tables.
- 39. The system of claim 36, wherein for identifying the one or more summary tables, the at least one program further comprises:
rating effectiveness of a slice based on what coverage is provided by the slice and how much performance would improve with the use of the slice.
- 40. The system of claim 1, wherein for identifying the one or more summary tables, the at least one program further comprises:
rating a cost of creating a slice within one of the summary tables based on one or more of an estimated number of rows in the slice, an estimated size of the slice, a maximum row width of the slice, a number of columns in the slice, and an amount of time required to build the slice.
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is related to commonly owned and co-pending application Ser. No. 10/341,763 entitled “METHOD, SYSTEM, AND PROGRAM FOR SPECIFYING MULTIDIMENSIONAL CALCULATIONS FOR A RELATIONAL OLAP ENGINE,” by N. Colossi, et al., Docket No. SVL920020072US1, filed on Jan. 13, 2003, and which is incorporated by reference herein in its entirety.