The invention relates generally to data access middleware, and in particular to a system and method of summary filter transformation.
A typical data access environment has a multi-tier architecture. For description purposes, it can be separated into three distinct tiers:
The web server contains a firewall and one or more gateways. All web communication is performed through a gateway. A gateway is responsible for passing on requests to the application server, in tier 2, for execution.
The applications tier contains one or more application servers. The application server runs requests, such as reports and queries that are forwarded by a gateway running on the web server. Typically, one of the components of the applications tier is a query engine, which is data access middleware that provides universal data access to a variety of heterogeneous database systems. The query engine formulates queries (typically SQL) and passes them on to the data tier, through a native database API (such as ODBC) for execution.
The data tier contains database management systems (DBMS), which manage raw data stored in a database. Examples of such systems include Oracle, DB2, and Microsoft SQL Server.
Although a multi-tier architecture can be configured in several different ways, a typical configuration places each tier on a separate computer (server). A database server is typically a “high end” server, and thus can process queries at a relatively fast speed. An application server cannot generally process queries as quickly as a database server.
In order to solve many business questions, a query engine may generate SQL queries that utilize the SQL/OLAP technology introduced in the SQL-99 standard. However, many database systems do not support this technology. Thus, the SQL queries would have to be performed on the report server, which is generally slower than the database server. It is desirable to have as much processing as possible performed on the database server.
There is a need to prevent or reduce the amount of local (application server) processing required to process a summary filter.
One way of overcoming this problem is for the query engine to generate a basic query to retrieve the data required to process the filter and all post-filter aggregates. Unfortunately, this solution requires processing time on the report server. It is desirable to have a way of transferring the SQL queries to the database server with minimal processing on the report server.
It is an object of the present invention to provide a method of summary filter transformation in a database system that does not support the SQL-99 standard.
In accordance with an embodiment of the present invention, there is provided a system for summary filter transformation. The system comprises a summary filter analysis module for analysing a multidimensional query that is not supported by a target database system, and a summary filter transformation module for transforming the multidimensional query into a semantically equivalent query that is supported by the target database system.
In accordance with another embodiment of the present invention, there is provided a method of summary filter transformation. The method comprises the steps of analysing a multidimensional query that is not supported by a target database system, and transforming the multidimensional query into a semantically equivalent query that is supported by the target database system.
In accordance with an embodiment of the present invention, there is provided a method of summary filter transformation. The method comprises the steps of analysing a summary filter transformation to determine an overall filter grouping level, analysing a transformation select list to determine if a transformation is to be performed, creating a derived table, traversing the transformation select list to move PREFILTER aggregates and aggregates computed at the filter grouping level into the derived table, and extracting and moving aggregates from the summary filter into a derived table select list.
In accordance with an embodiment of the present invention, there is provided a computer data signal embodied in a carrier wave and representing sequences of instructions which, when executed by a processor, cause the processor to perform a method of summary filter transformation. The method comprises the steps of analysing a multidimensional query that is not supported by a target database system, and transforming the multidimensional query into a semantically equivalent query that is supported by the target database system.
In accordance with an embodiment of the present invention, there is provided a computer-readable medium having computer readable code embodied therein for use in the execution in a computer of a method of summary filter transformation. The method comprises the steps of analysing a multidimensional query that is not supported by a target database system, and transforming the multidimensional query into a semantically equivalent query that is supported by the target database system.
In accordance with an embodiment of the present invention, there is provided a computer program product for use in the execution in a computer of a group query transformation system for summary filter transformation. The computer program product comprises a summary filter analysis module for analysing a multidimensional query that is not supported by a target database system, and a summary filter transformation module for transforming the multidimensional query into a semantically equivalent query that is supported by the target database system.
In order to solve many business questions, a query engine 15 generates SQL queries that utilize the SQL/OLAP (Online Analytical Processing) technology introduced in the SQL-99 standard. These SQL queries include SQL/OLAP functions (windowed aggregates). However, many database systems 12 do not support this technology. In order to prevent or reduce the amount of local (application server) processing required to process these types of queries, the query engine 15 attempts to generate semantically equivalent queries that can be processed on the database server 12 by the target database system. These semantically equivalent queries include standard aggregate functions and the GROUP BY operator.
The summary filter transformation system 20 is implemented as a sub-system of the query engine 15 in the data access environment 10. This transformation 20 may generate queries that can be processed in their entirety on the database server 12, or queries that require processing on both the application server 13 and the database server 12.
Advantageously, the summary filter transformation system 20 reduces processing that might otherwise be required on an application server, thereby improving performance in many cases. Furthermore, the summary filter transformation system 20 takes advantage of functionality provided by a target database system.
There are two types of OLAP functions: framed functions and report functions. Framed OLAP functions contain a window frame specification (ROWS or RANGE) and an ORDER BY clause. Through window frames, capabilities such as cumulative (running) sums and moving averages can be supported. Report functions do not contain a window frame specification, and produce the same value for each row in a partition.
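The difference between the two kinds of function can be sketched in a few lines of Python. This is only an illustration: the sample rows and the QTY column are assumptions made for the example, and the helper names `report_sum` and `framed_running_sum` are hypothetical rather than part of the described system.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical sample rows: (SNO, PNO, QTY). QTY is an assumed column name.
ROWS = [
    ("S1", "P1", 100),
    ("S1", "P2", 200),
    ("S2", "P1", 300),
]


def report_sum(rows):
    """Report function, roughly SUM(QTY) OVER (PARTITION BY SNO)."""
    totals = defaultdict(int)
    for sno, _pno, qty in rows:
        totals[sno] += qty
    # Every row in a partition receives the same partition total.
    return [totals[sno] for sno, _pno, _qty in rows]


def framed_running_sum(rows):
    """Framed function, roughly SUM(QTY) OVER (ORDER BY SNO, PNO
    ROWS UNBOUNDED PRECEDING): a cumulative (running) sum."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    return list(accumulate(qty for _sno, _pno, qty in ordered))


print(report_sum(ROWS))          # same value for each row of a partition
print(framed_running_sum(ROWS))  # value grows row by row
```

The report function yields one value per partition repeated on each row, which is why it can often be rewritten with a standard aggregate and GROUP BY; the framed function depends on row order and has no such direct equivalent.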
The SQL language is extended to include a FILTER clause that allows the specification of a summary filter (note that this clause is not part of the current SQL standard). Unlike the WHERE clause, which is applied before any OLAP functions in the select list are computed, the FILTER clause is applied before some OLAP functions are computed, and after others are computed.
The SQL language is also extended to include a PREFILTER keyword in an OLAP function specification to allow control of when the function is computed in the presence of a FILTER clause. Any OLAP function with PREFILTER specified is computed before the FILTER clause is applied, while all others are computed after.
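A minimal Python sketch of the intended evaluation order follows. It assumes a summary filter of the form SUM(QTY) per (SNO, PNO) greater than a threshold; the rows, the QTY column, and the function names are hypothetical and serve only to show when each aggregate is computed.

```python
from collections import defaultdict

# Hypothetical sample rows: (SNO, PNO, QTY).
ROWS = [
    ("S1", "P1", 100),
    ("S1", "P2", 20),
    ("S2", "P1", 300),
]


def partition_sum(rows, key):
    """Sum QTY per partition, where `key` extracts the partition columns."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[2]
    return totals


def apply_summary_filter(rows, threshold):
    # Aggregate used by the assumed FILTER condition, grouped on (SNO, PNO).
    filter_totals = partition_sum(rows, key=lambda r: (r[0], r[1]))
    # PREFILTER aggregate: computed over all rows, before the filter is applied.
    pre = partition_sum(rows, key=lambda r: r[0])
    # Apply the summary filter.
    kept = [r for r in rows if filter_totals[(r[0], r[1])] > threshold]
    # Aggregate without PREFILTER: computed only over the surviving rows.
    post = partition_sum(kept, key=lambda r: r[0])
    return [(sno, pno, pre[sno], post[sno]) for sno, pno, _qty in kept]


for row in apply_summary_filter(ROWS, threshold=50):
    print(row)
```

For supplier S1 the two totals differ, because the row for (S1, P2) contributes to the PREFILTER total but is removed before the post-filter total is computed.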
The summary filter transformation generates a derived table and standard WHERE clause to apply the filter condition. Before describing this transformation, a couple of definitions are provided:
As described above, the first step in performing the summary filter transformation (40) is to analyze the summary filter condition to determine an overall filter grouping level (if any) (41). Preferably, step (41) is accomplished by first enumerating all groups using the following rules:
To determine how to perform the transformation (42), all enumerated groups are compared to determine an overall grouping level. If all groups are compatible, the group with the lowest level of granularity (group with the most columns) is chosen as the overall filter group. For instance, if the enumerated groups are (SNO), and (SNO, PNO), the filter group is (SNO, PNO). If the groups are not compatible, the filter group is NULL, and no optimization can be performed.
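The selection of the overall filter group might be sketched as below. The notion of compatibility is taken here to mean that, when the groups are ordered by size, each group is a leading prefix of the next; that reading is an assumption made for illustration, and `overall_filter_group` is a hypothetical helper name.

```python
def overall_filter_group(groups):
    """Return the most detailed group, or None when the groups are
    incompatible (or when no groups were enumerated).

    Assumption: groups are compatible when, ordered by size, each one
    is a leading prefix of the next.
    """
    if not groups:
        return None
    ordered = sorted((tuple(g) for g in groups), key=len)
    for smaller, larger in zip(ordered, ordered[1:]):
        if larger[:len(smaller)] != smaller:
            return None  # incompatible: no optimization can be performed
    # The group with the most columns is the overall filter group.
    return ordered[-1]


print(overall_filter_group([("SNO",), ("SNO", "PNO")]))
print(overall_filter_group([("SNO",), ("PNO",)]))
```

The first call reproduces the case described above, where (SNO) and (SNO, PNO) yield (SNO, PNO); the second returns None, corresponding to a NULL filter group.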
Some examples are given in the following table:
If no optimization can be performed, a simple transformation is performed. Otherwise, aggregates in the select list are analyzed and replaced with equivalent expressions in an effort to avoid introducing detail information into the inner select. This might involve replacing the aggregate altogether, or replacing the aggregate operand with another aggregate (a nested aggregate) computed at the same level as the FILTER group.
The basic steps in performing the transformation are as follows:
Assuming the FILTER group is (SNO, PNO), the action taken for various aggregates is described below:
The following examples are provided to illustrate the functionality of the summary filter transformation system (20) and methods (30), (40):
In this example, a simple summary filter is illustrated.
Explanation
The FILTER condition is first analyzed, and the group is determined to be (SNO, PNO). A derived table is then constructed whose select list contains the required detail information (SNO, PNO) and the aggregate appearing in the condition. The first SUM in the main select list is computed based on the SUM in the derived table. Since its group is (SNO), an AT clause is added to its specification to eliminate double counting. The second SUM is identical to the SUM in the derived table, so it is replaced accordingly.
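The double-counting issue can be illustrated with a small Python sketch. It assumes the filter compares SUM(QTY) per (SNO, PNO) with a threshold; the data, the QTY column, the alias C0, and the function names are hypothetical, and the de-duplication step only approximates the effect described for the AT clause.

```python
from collections import defaultdict

# Hypothetical detail rows: (SNO, PNO, QTY). Note two rows share (S1, P1).
ROWS = [
    ("S1", "P1", 60),
    ("S1", "P1", 50),
    ("S1", "P2", 30),
    ("S2", "P1", 200),
]


def derived_table(rows, threshold):
    """Inner select: SNO, PNO and C0 = SUM(QTY) per (SNO, PNO),
    followed by a standard WHERE C0 > threshold."""
    c0 = defaultdict(int)
    for sno, pno, qty in rows:
        c0[(sno, pno)] += qty
    # One output row per detail row, so C0 repeats within a group.
    return [(sno, pno, c0[(sno, pno)])
            for sno, pno, _qty in rows
            if c0[(sno, pno)] > threshold]


def naive_sum_per_sno(derived):
    """SUM(C0) per SNO with no correction: repeated C0 values are re-added."""
    totals = defaultdict(int)
    for sno, _pno, c0 in derived:
        totals[sno] += c0
    return dict(totals)


def sum_per_sno_at_group(derived):
    """SUM(C0) per SNO, counting each (SNO, PNO) group exactly once."""
    seen = set()
    totals = defaultdict(int)
    for sno, pno, c0 in derived:
        if (sno, pno) not in seen:
            seen.add((sno, pno))
            totals[sno] += c0
    return dict(totals)


table = derived_table(ROWS, threshold=100)
print(naive_sum_per_sno(table))     # S1 is counted twice
print(sum_per_sno_at_group(table))  # each group contributes once
```

Because the derived table still carries one row per detail row, summing C0 directly inflates the total for S1; restricting each (SNO, PNO) group to a single contribution gives the intended result.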
In this example, use of the PREFILTER keyword in an OLAP function is illustrated.
After applying the GROUP query transformation on the derived table, the query becomes:
Explanation
The FILTER condition is first analyzed, and the group is determined to be (SNO, PNO). A derived table is then constructed whose select list contains the required detail information (SNO, PNO) and the aggregate appearing in the condition. The first SUM in the main select list is computed based on the SUM in the derived table. Since its group is (SNO), an AT clause is added to its specification to eliminate double counting. The second SUM has a group of (PNO), which does not match the group of the FILTER condition, but the PREFILTER keyword is specified, so it is moved into the derived table.
In this example, the effect the presence of the AVG function has on the transformation is illustrated.
After applying the GROUP query transformation on the derived table, the query becomes:
Explanation
The FILTER condition is first analyzed, and the group is determined to be (SNO, PNO). A derived table is then constructed whose select list contains the required detail information (SNO, PNO) and the aggregate appearing in the condition. The MAX function has a group of (SNO, PNO), which matches the group of the FILTER condition, so it is added to the derived table. The AVG function has a group of ( ), which does not match the group of the FILTER condition, so it must be replaced by an expression that involves aggregates computed at the same grouping level as the FILTER condition. Hence, a SUM and a COUNT aggregate are added to the derived table, and the AVG function is replaced. The AT clauses in the two SUM functions in the outer select eliminate double counting.
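The replacement of AVG by a SUM and a COUNT computed at the filter grouping level can be checked with a short Python sketch. The rows, the QTY column, the threshold, and the helper names are assumptions made for the example, not part of the described queries.

```python
from collections import defaultdict

# Hypothetical detail rows: (SNO, PNO, QTY).
ROWS = [
    ("S1", "P1", 10),
    ("S1", "P1", 20),
    ("S1", "P2", 5),
    ("S2", "P1", 100),
]


def group_sum_and_count(rows):
    """SUM(QTY) and COUNT(QTY) per (SNO, PNO), the filter grouping level."""
    acc = defaultdict(lambda: [0, 0])
    for sno, pno, qty in rows:
        acc[(sno, pno)][0] += qty
        acc[(sno, pno)][1] += 1
    return {group: (s, c) for group, (s, c) in acc.items()}


def overall_avg_after_filter(rows, threshold):
    """Overall AVG(QTY) after the filter, rebuilt from group-level aggregates."""
    groups = group_sum_and_count(rows)
    # Assumed summary filter: keep groups whose SUM(QTY) exceeds the threshold.
    kept = [(s, c) for s, c in groups.values() if s > threshold]
    total = sum(s for s, _c in kept)
    count = sum(c for _s, c in kept)
    # Sum of group sums divided by sum of group counts; each group once.
    return total / count


print(overall_avg_after_filter(ROWS, threshold=25))
```

Dividing the sum of group sums by the sum of group counts reproduces the average over the surviving detail rows, whereas averaging the per-group averages would weight small and large groups equally and generally give a different answer.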
In this example, the effect the presence of the DISTINCT keyword has on the transformation is illustrated.
The query above can then be reformulated as follows:
Explanation
The FILTER condition is first analyzed, and the group is determined to be (SNO, PNO). A derived table is then constructed whose select list contains the required detail information (SNO, PNO) and the aggregate appearing in the condition. Because of the presence of the DISTINCT keyword, and the fact that the required detail information consists of columns in the FILTER condition group, a GROUP BY can be introduced into the derived table. The first SUM can be computed based on the SUM in the derived table; no AT clause is required since the GROUP BY eliminates the possibility of duplicates. The second SUM is the same as the SUM in the derived table, so it is replaced accordingly. Finally, the DISTINCT keyword can be eliminated since the GROUP BY inside the derived table ensures that there will be no duplicate rows.
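A rough Python sketch of this case is given below, assuming a filter on SUM(QTY) per (SNO, PNO). The rows, the QTY column, the alias C0, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows: (SNO, PNO, QTY).
ROWS = [
    ("S1", "P1", 60),
    ("S1", "P1", 50),
    ("S1", "P2", 120),
    ("S2", "P1", 40),
]


def grouped_derived_table(rows, threshold):
    """Inner select with GROUP BY SNO, PNO and C0 = SUM(QTY),
    followed by a standard WHERE C0 > threshold."""
    c0 = defaultdict(int)
    for sno, pno, qty in rows:
        c0[(sno, pno)] += qty
    # GROUP BY yields exactly one row per (SNO, PNO).
    return [(sno, pno, total)
            for (sno, pno), total in sorted(c0.items())
            if total > threshold]


def outer_select(derived):
    """Outer select: SNO, PNO, SUM(C0) per SNO, and C0 itself."""
    per_sno = defaultdict(int)
    for sno, _pno, c0 in derived:
        per_sno[sno] += c0  # plain sum is safe: no duplicate groups
    return [(sno, pno, per_sno[sno], c0) for sno, pno, c0 in derived]


result = outer_select(grouped_derived_table(ROWS, threshold=100))
for row in result:
    print(row)
```

Since the grouped derived table holds one row per (SNO, PNO), the per-supplier total needs no correction for repeated values and the output rows are already unique, which is the reason the DISTINCT keyword becomes redundant.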
The systems and methods according to the present invention may be implemented by any hardware, software or a combination of hardware and software having the functions described above. The software code, either in its entirety or a part thereof, may be stored in a computer readable memory. Further, a computer data signal representing the software code that may be embedded in a carrier wave may be transmitted via a communication network. Such a computer readable memory and a computer data signal are also within the scope of the present invention, as well as the hardware, software and the combination thereof.
While particular embodiments of the present invention have been shown and described, changes and modifications may be made to such embodiments without departing from the true scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2429910 | May 2003 | CA | national |
Number | Date | Country |
---|---|---|
20050038782 A1 | Feb 2005 | US |