1. Technical Field
The present invention relates to a system and method for building an XQuery using a model-based XQuery building tool. More particularly, the present invention relates to a system and method for optimizing a query model that is built from existing node structures and generating an XQuery from the optimized query model.
2. Description of the Related Art
Extensible Markup Language (XML) is a versatile markup language that labels information content over diverse data sources, including structured documents, semi-structured documents, relational databases, and object repositories.
As increasing amounts of information are stored, exchanged, and presented using XML, the ability to intelligently query XML data sources becomes increasingly important. One of XML's strengths is its flexibility in representing many different information types from diverse sources. To exploit this flexibility, an XML query language, called XQuery, has been developed.
XQuery provides concise and easily understood queries, and is also flexible enough to query a broad spectrum of data sources, including both databases and documents. Some of these data sources may be available when a user writes and tests an XQuery, while other data sources may not be available, such as a new, unpopulated database. In addition, some data sources may only be available to the query at run-time, such as when a query executes within a running program (e.g., web-service request). When a data source is not available, a user must pass the data to the XQuery via parameter bindings when testing an XQuery. A challenge found with this approach is that manually coding these parameters is a cumbersome process.
Another challenge found for a user is the process of creating an XQuery. Typically, a user starts with a source schema and a target schema. The user analyzes the target schema, identifies an instance document structure adhering to that schema, and builds an XQuery using node constructors corresponding to the instance document structure, as well as new node constructors where applicable. As expected, this approach is a time consuming and error prone process, especially for XQuery novice users who must first learn nesting node constructors and query logic syntaxes.
Furthermore, another challenge found with creating an XQuery is generating relative paths for XPath elements. When a user selects an XML node from an XML document, the node may be represented in an XQuery model as an “absolute” path, which is a path anchored at the root node of the XML document and extending to a selected node. Although the node path is represented as an absolute path in the XQuery model, the node may be represented in the resulting XQuery code as a relative path that starts at a variable and includes a path from the variable to the document's root node. Thus, a node may be identified in an XQuery by the combination of an absolute path (contained in the variable) and a relative path (from the variable to the node). If this path combination is itself held in a variable, another node in the XQuery model may be represented in the XQuery code as a path relative to that variable, thus minimizing the size and complexity of the paths in the query. In order to have sensible and efficient relative paths, an XQuery tool must generate them from the absolute paths described in the model. Existing art, however, requires a user to identify the relative paths during XQuery development. Again, because XQuery is a fairly complicated scripting language, this approach is a difficult and error prone process, especially for novices.
Finally, another challenge found with XQuery development is the ability to convert For-Let-Where-Orderby-Return (FLWOR) expressions, which are useful when writing an XQuery, to XPath expressions during XQuery execution, which increases the readability of the XQuery. The difference between an XPath expression and a FLWOR expression is that a FLWOR expression is more verbose and explicit, which allows a user to view the logic behind its corresponding XPath expression. Thus, when writing an XQuery, a user may prefer to view FLWOR expressions instead of XPath expressions. XPath expressions, however, are easier to read when viewing the XQuery as a whole. Therefore, it would be desirable to automatically identify appropriate FLWOR expressions and convert the FLWOR expressions to XPath expressions.
What is needed, therefore, is a system and method for automating XQuery generation steps in order to alleviate the challenges discussed above.
It has been discovered that the aforementioned challenges are resolved using a system, method, and program product that generates an XQuery by identifying an object in a query model and determining that the object is an XPath object. The system, method, and program product then create a relative path for the XPath object that corresponds to a hierarchical location of the XPath object relative to other objects included in the query model. The system, method, and program product then include the relative path in the query model. Once the relative path is included in the query model, the system, method, and program product generate an XQuery using the query model.
Finally, the system, method, and program product execute the generated XQuery using a query engine.
In one embodiment, the system, method, and program product annotate, for the XPath object, an absolute path in the query model, a parent's absolute path that corresponds to a parent object of the XPath object, and a model tree level. In this embodiment, the system, method, and program product include the annotated absolute path, parent's absolute path, and the model tree level in the relative path. When the XPath object represents a variable, the system, method, and program product annotate a variable name and a variable type that corresponds to the variable, and include the variable name and variable type in the relative path.
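The annotation described in this embodiment may be pictured with the following minimal sketch. The `ModelObject` class, its field names, and the `annotate` function are hypothetical and are introduced here only for illustration. The sketch further assumes that an object represents a variable when its parent is an object of type "FOR" or "LET".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelObject:
    """Hypothetical query-model object; all field names are illustrative."""
    kind: str                          # e.g. "FOR", "LET", "XPATH"
    path: Optional[str] = None         # absolute path, set on XPath objects
    var: Optional[str] = None          # variable bound by a FOR/LET object
    children: List["ModelObject"] = field(default_factory=list)
    annotations: Dict[str, object] = field(default_factory=dict)


def annotate(obj, parent=None, level=0):
    """Walk the model tree and annotate every XPath object in place."""
    if obj.kind == "XPATH":
        obj.annotations["absolute_path"] = obj.path
        obj.annotations["parent_absolute_path"] = parent.path if parent else None
        obj.annotations["level"] = level
        # Assumed test: a child of a FOR or LET object represents a variable.
        if parent is not None and parent.kind in ("FOR", "LET"):
            obj.annotations["variable_name"] = parent.var
            obj.annotations["variable_type"] = parent.kind
    for child in obj.children:
        annotate(child, obj, level + 1)


# Example model: a FOR object binding $item, with a nested price path.
price = ModelObject("XPATH", path="/purchaseOrder/items/item/price")
item = ModelObject("XPATH", path="/purchaseOrder/items/item", children=[price])
loop = ModelObject("FOR", var="item", children=[item])
annotate(loop)
```

In this example the `item` object receives a variable name and type because its parent is a "FOR" object, while the `price` object receives only path and level annotations.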
In one embodiment, the system, method, and program product select a nearest ancestor to the XPath object. In this embodiment, the system, method, and program product identify a nearest ancestor variable path offset that corresponds to the selected nearest ancestor and include the nearest ancestor variable path offset in the relative path.
In one embodiment, the system, method, and program product identify a For-Let-Where-Orderby-Return (FLWOR) object in the query model. In this embodiment, the system, method, and program product determine whether the FLWOR object is convertible to an XPath object based upon one or more convertibility rules. When the system, method, and program product determine that the FLWOR object is convertible, the system, method, and program product convert the FLWOR object to the XPath object using one or more conversion rules.
In one embodiment, the system, method, and program product identify an XML instance document and select a node included in the XML instance document. Once selected, the system, method, and program product replicate the node into the query model using a graphical user interface, which results in the object. In another embodiment, the system, method, and program product determine that the XML instance document is used during query execution and the XML instance document is bound to a file. In this embodiment, the system, method, and program product generate a runtime parameter that corresponds to the XML instance document and include the runtime parameter in the XQuery.
In another embodiment, in response to determining that the XML instance document is used during query execution and the XML instance document is bound to a file, the system, method, and program product pass the XML document as a runtime parameter to the query engine during the query execution.
The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.
The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.
The following is intended to provide a detailed description of an example of the invention and should not be taken to be limiting of the invention itself. Rather, any number of variations may fall within the scope of the invention, which is defined in the claims following the description.
User 105 uses GUI 108 to identify instance documents 120, which are located in documents store 110, that include node examples that are similar to results (results 190) that user 105 wishes to receive from an XQuery. XQuery builder 100 retrieves user 105's identified documents (instance documents 120) from documents store 110 and stores them in instance store 130. Documents store 110 and instance store 130 may be stored on a nonvolatile storage area, such as a computer hard drive.
User 105 then uses GUI 108 to replicate the nodes included in instance documents 120 that are similar to the results that user 105 wishes to receive. GUI 108 may include two window areas, whereby the first area includes the nodes included in instance documents 120 and the other area is a replication area that shows replicated nodes (see
When user 105 is finished replicating nodes, XQuery builder 100 retrieves query model 155 from query model store 150, and proceeds through a series of actions to optimize query model 155 prior to generating an XQuery. These optimization steps include generating relative paths for XPath objects in order to minimize the size and complexity of the paths in the query. In addition, XQuery builder 100 analyzes For-Let-Where-Orderby-Return (FLWOR) objects and converts the FLWOR objects to XPath objects if appropriate (see
Once XQuery builder 100 optimizes query model 155, XQuery builder 100 generates XQuery 160 using the optimized query model, and stores XQuery 160 in XQuery store 170. During XQuery 160 generation, XQuery builder 100 identifies documents that the query uses, which are bound to a file, and generates corresponding runtime parameters. These runtime parameters are included in XQuery 160 (see
XQuery 160 is now ready for execution. XQuery builder 100 retrieves XQuery 160 from XQuery store 170, and passes it to query engine 180. In addition, in order to test XQuery 160, XQuery builder 100 passes documents 175 to query engine 180, which are the XML documents that XQuery builder 100 identified earlier as being used in the query and bound to a file (discussed above). Query engine 180 uses XQuery 160 and documents 175 to generate results 190 for user 105 to view (see
At step 220, processing uses a GUI to replicate nodes from the instance documents into a query model that is located in query model store 150. The GUI provides a flexible query building tool as context and lets a user select and replicate nodes that are associated with a query output (see
Processing automatically generates sensible and efficient relative path information from absolute path information that describes non-replicated nodes whose associated path is a path in the source document (pre-defined process block 230, see
Once relative paths are in place and FLWOR objects are converted to XPath objects, processing generates an XQuery using the query model, and stores the XQuery in XQuery store 170 (pre-defined process block 250, see
Processing executes the generated XQuery using query engine 180, which produces results 190. To test the XQuery, processing passes documents that the XQuery uses, which are bound to a file, as input parameters to query engine 180 (pre-defined process block 260, see
Processing commences at 300, whereupon processing identifies an object in the query model located in query model store 150 (step 310). Query model store 150 is the same as that shown in
A determination is made as to whether the object represents a variable (decision 340). For example, the object may be a child of an Object of type “FOR” or “LET.” If the object does not represent a variable, decision 340 branches to “No” branch 342 bypassing variable annotation steps. On the other hand, if the object represents a variable, decision 340 branches to “Yes” branch 348 whereupon processing annotates the object's corresponding variable name and type, and stores it back in the query model located in query model store 150 (step 350) (see
A determination is made as to whether the query model includes more objects (decision 360). If the query model includes more objects, decision 360 branches to “Yes” branch 362 whereupon processing identifies (step 370) and processes the next object. This looping continues until there are no more objects to process, at which point decision 360 branches to “No” branch 368 whereupon processing generates a relative path for the XPath objects and stores the relative paths back into the query model located in query model store 150 (pre-defined process block 380, see
Processing commences at 400, whereupon processing sorts the annotated objects included in query model store 150 by level (e.g., ascending order) at step 410. At step 420, processing selects the first object from the sorted objects and, at step 430, processing performs a path intersection for the full path values of the other objects in order to locate a common ancestor in the object's corresponding XML source document. For example, processing may compare strings starting from a root node to find overlapping paths. At step 435, processing sorts the list using the intersection point with the nearest intersection path that is higher in the list.
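The string comparison described above, in which paths are compared starting from the root node to find overlapping portions, may be sketched as follows. The function names `path_intersection` and `rank_ancestors` are hypothetical, and ranking candidates by the depth of the shared path is only one assumed reading of the sort performed at step 435.

```python
def split_steps(path):
    """Split an absolute path such as "/a/b/c" into its steps."""
    return [step for step in path.split("/") if step]


def path_intersection(path_a, path_b):
    """Return the longest leading run of steps shared by two absolute paths."""
    common = []
    for a, b in zip(split_steps(path_a), split_steps(path_b)):
        if a != b:
            break
        common.append(a)
    return "/" + "/".join(common)


def rank_ancestors(target_path, candidate_paths):
    """Order candidates so the deepest intersection with the target is first."""
    def depth(candidate):
        return len(split_steps(path_intersection(target_path, candidate)))
    # sorted() is stable, so ties keep their original (level-sorted) order.
    return sorted(candidate_paths, key=lambda c: -depth(c))


ranked = rank_ancestors(
    "/po/items/item/price",
    ["/po", "/po/items/item", "/po/shipTo"],
)
```

Here `ranked` begins with "/po/items/item" because it shares three steps with the target path, whereas the other candidates share only the "/po" step.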
At step 440, processing applies “nearest ancestor decision rules” and selects the most appropriate nearest ancestor variable. For example, the rules may include:
Processing, at step 450, identifies a nearest ancestor variable path offset. For example, the intersection between the absolute path to a target node and the absolute path to its nearest ancestor node may be used to calculate the relative path of an object to the target node using a formula such as:
Relative path=$+variablename+(count_subpaths (relevant variable path−Intersection path)) * “..”+(target object path−Intersection path)
where:
At step 460, processing appends the variable name to the object in query model store 150, which creates a relative path for the object (see
A determination is made as to whether there are more objects to process (decision 470). If there are more objects to process, decision 470 branches to “Yes” branch 472 whereupon processing selects (step 480) and processes the next object. This looping continues until there are no more objects to process, at which point decision 470 branches to “No” branch 478 whereupon processing returns at 490.
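As an illustration of the relative path formula used at step 450, the following sketch computes a relative path from a variable's absolute path and a target object's absolute path. The function name `relative_path` and its parameter names are hypothetical. The sketch assumes that the "relevant variable path" is the absolute path held by the nearest ancestor variable, and that subtracting the intersection path means dropping the shared leading steps.

```python
def split_steps(path):
    """Split an absolute path such as "/a/b/c" into its steps."""
    return [step for step in path.split("/") if step]


def relative_path(variable_name, variable_path, target_path):
    """Express target_path relative to the variable bound at variable_path."""
    var_steps = split_steps(variable_path)
    target_steps = split_steps(target_path)

    # Length of the intersection path (shared leading steps).
    shared = 0
    for a, b in zip(var_steps, target_steps):
        if a != b:
            break
        shared += 1

    # One ".." for each step of the variable path below the intersection.
    ups = [".."] * (len(var_steps) - shared)
    # Steps of the target path that lie below the intersection.
    downs = target_steps[shared:]
    return "/".join(["$" + variable_name] + ups + downs)


below = relative_path("item", "/po/items/item", "/po/items/item/price")
same = relative_path("item", "/po/items/item", "/po/items/item")
sibling = relative_path("item", "/po/items/item", "/po/shipTo/name")
```

With these inputs, `below` is "$item/price", `same` is "$item", and `sibling` is "$item/../../shipTo/name", the last showing two ".." steps because the variable path extends two steps below the intersection "/po".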
When writing an XQuery, it is therefore helpful to view XPath objects as FLWOR objects because the logic is more easily understood and manipulated. However, a concise XPath object is typically easier to read in the context of the entire query than the verbose FLWOR object. A FLWOR object may include other FLWOR objects that, in turn, may also include more FLWOR objects. In order to convert FLWOR objects to XPath objects, an XQuery builder recursively traverses a query model's hierarchy, starting from the innermost FLWOR object.
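The innermost-first traversal may be sketched as follows. The classes `XPath` and `Flwor`, the functions `is_convertible` and `convert`, and the convertibility test itself are assumptions made for illustration only. In this sketch a FLWOR object is treated as convertible when it has no "where" or "order by" clause and its return expression is a path starting at its own loop variable. The rewrite is intended for simple child steps, where the order and duplicate handling of a path expression match those of the loop.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class XPath:
    expr: str                              # path text, e.g. "/po/items/item"


@dataclass
class Flwor:
    var: str                               # loop variable name without "$"
    source: Union["Flwor", XPath]          # expression the loop iterates over
    ret: Union["Flwor", XPath]             # return expression
    where: Optional[str] = None
    order_by: Optional[str] = None


def is_convertible(f):
    """Assumed rule set; an actual rule set may differ."""
    prefix = "$" + f.var
    return (
        f.where is None
        and f.order_by is None
        and isinstance(f.source, XPath)
        and isinstance(f.ret, XPath)
        and (f.ret.expr == prefix or f.ret.expr.startswith(prefix + "/"))
    )


def convert(obj):
    """Convert nested FLWOR objects to XPath objects, innermost first."""
    if isinstance(obj, XPath):
        return obj
    # Recurse into children before examining this object.
    inner = Flwor(obj.var, convert(obj.source), convert(obj.ret),
                  obj.where, obj.order_by)
    if not is_convertible(inner):
        return inner
    # Replace the leading "$var" of the return path with the source path.
    tail = inner.ret.expr[len("$" + inner.var):]
    return XPath(inner.source.expr + tail)


# for $o in /po/items return (for $i in $o/item return $i/price)
nested = Flwor("o", XPath("/po/items"),
               Flwor("i", XPath("$o/item"), XPath("$i/price")))
ordered = Flwor("i", XPath("/po/items/item"), XPath("$i/price"),
                order_by="$i/price")
```

Calling `convert(nested)` first collapses the inner loop to "$o/item/price" and then the outer loop to "/po/items/item/price", while `convert(ordered)` returns the FLWOR object unchanged because of its "order by" clause.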
Conversion processing commences at 500, whereupon processing identifies an object in the query model located in query model store 150 (step 510). Query model store 150 is the same as that shown in
A determination is made as to whether the child object is a FLWOR object (decision 540). If the child object is not a FLWOR object, decision 540 branches to “No” branch 542 bypassing XPath conversion steps. On the other hand, if the object is a FLWOR object, decision 540 branches to “Yes” branch 548 whereupon a determination is made as to whether the FLWOR object is convertible based upon convertibility rules (decision 550). For example, the convertibility rules may include:
If the object is not convertible based upon the convertibility rules, decision 550 branches to “No” branch 552 bypassing conversion steps. On the other hand, if the object is convertible, decision 550 branches to “Yes” branch 558 whereupon processing converts the FLWOR object to an XPath object at step 560 using conversion rules (see
A determination is made as to whether processing should continue to recursively search for, and convert, FLWOR objects (decision 570). If processing should continue, decision 570 branches to “Yes” branch 572 which loops back and moves up to the parent object (step 580), and selects the next child object (step 530) (see
A determination is made as to whether the document is bound to a file (decision 630). In the above example, processing checks whether the table and column attributes are empty. If they are empty, processing knows that the document is going to be passed into the query. If the document is not bound to a file, decision 630 branches to “No” branch 632 bypassing code generation steps. On the other hand, if the document is bound to a file, decision 630 branches to “Yes” branch 638 whereupon processing generates code for a runtime parameter in the query, and stores it in XQuery store 170 (step 640). XQuery store 170 is the same as that shown in
A determination is made as to whether there are more objects to process in the query model that is located in query model store 150 (decision 650). If there are more objects to process, decision 650 branches to “Yes” branch 652 whereupon processing selects (step 660) and processes the next object. This looping continues until there are no more objects to process, at which point decision 650 branches to “No” branch 658 whereupon processing returns at 670.
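The runtime parameter generation loop may be sketched as follows. The `DocumentRef` class and the function names are hypothetical. The sketch assumes that a document is bound to a file, and is therefore passed into the query, when its table and column attributes are both empty, and it emits one external variable declaration in XQuery 1.0 prolog syntax for each such document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DocumentRef:
    """Hypothetical description of a document used by the query model."""
    name: str
    table: str = ""       # empty when the document is not stored in a table
    column: str = ""      # empty when the document is not stored in a column


def is_bound_to_file(doc):
    """Assumed test: empty table and column attributes mean file-bound."""
    return not doc.table and not doc.column


def declare_runtime_parameters(docs):
    """Return one external variable declaration per file-bound document."""
    declarations: List[str] = []
    for doc in docs:
        if is_bound_to_file(doc):
            declarations.append(
                f"declare variable ${doc.name} as document-node() external;"
            )
    return declarations


docs = [
    DocumentRef("purchaseOrder"),
    DocumentRef("catalog", table="CATALOG", column="DOC"),
]
prolog = declare_runtime_parameters(docs)
```

In this example only `purchaseOrder` yields a declaration, since `catalog` has table and column attributes and would be read from a database rather than passed in at run time.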
A determination is made as to whether the document is bound to a file (decision 740). If the document is not bound to a file, decision 740 branches to “No” branch 742 bypassing document passing steps. On the other hand, if the document is bound to a file, decision 740 branches to “Yes” branch 748 whereupon processing passes the document as a runtime parameter to query engine 180 for use during the prepared statement's execution (step 750).
A determination is made as to whether there are more objects in the XQuery to process (decision 760). If there are more objects to process, decision 760 branches to “Yes” branch 762 which loops back to select (step 770) and process the next object. This looping continues until there are no more objects to process, at which point decision 760 branches to “No” branch 768.
At step 780, processing executes the query using the prepared statement and the runtime parameters. In turn, query engine 180 produces results 190, which is the same as that shown in
Window 800 allows a user to indicate nodes of a target XML structure to replicate by dragging the node at the root of that structure to a query design area. A user selects a particular document using selection window 810, such as “purchaseOrder.xml.” The user may then “right click” a pointer, which displays menu 820 on window 800. In turn, the user positions pointer 830 over “replicate node structure.” This action instructs an XQuery builder to add query logic to a query model that corresponds to the node.
Window 1050 includes code lines 1060 through 1090 that correspond to code lines 1010 through 1040, respectively, shown in
Diagram 1100 includes FLWOR objects 1110 through 1135. To begin the recursive process, a query builder identifies and converts FLWOR objects 1125, 1115, 1120, and 1135 to XPath objects. The query builder then proceeds up the hierarchy to convert the other FLWOR objects (see
Diagram 1140 shows that an XQuery builder converted FLWOR objects 1125, 1115, 1120, and 1135 shown in
Since FLWOR object 1130 passes particular conversion rules, the XQuery builder may convert FLWOR object 1130 to an XPath object. FLWOR object 1110, however, may not be converted because it includes an ORDERBY object with two “returns” (see
Diagram 1200 shows that an XQuery builder converted FLWOR object 1130 shown in
PCI bus 1514 provides an interface for a variety of devices that are shared by host processor(s) 1500 and Service Processor 1516 including, for example, flash memory 1518. PCI-to-ISA bridge 1535 provides bus control to handle transfers between PCI bus 1514 and ISA bus 1540, universal serial bus (USB) functionality 1545, power management functionality 1555, and can include other functional elements not shown, such as a real-time clock (RTC), DMA control, interrupt support, and system management bus support. Nonvolatile RAM 1520 is attached to ISA Bus 1540. Service Processor 1516 includes JTAG and I2C busses 1522 for communication with processor(s) 1500 during initialization steps. JTAG/I2C busses 1522 are also coupled to L2 cache 1504, Host-to-PCI bridge 1506, and main memory 1508 providing a communications path between the processor, the Service Processor, the L2 cache, the Host-to-PCI bridge, and the main memory. Service Processor 1516 also has access to system power resources for powering down information handling device 1501.
Peripheral devices and input/output (I/O) devices can be attached to various interfaces (e.g., parallel interface 1562, serial interface 1564, keyboard interface 1568, and mouse interface 1570) coupled to ISA bus 1540. Alternatively, many I/O devices can be accommodated by a super I/O controller (not shown) attached to ISA bus 1540.
In order to attach computer system 1501 to another computer system to copy files over a network, LAN card 1530 is coupled to PCI bus 1510. Similarly, to connect computer system 1501 to an ISP to connect to the Internet using a telephone line connection, modem 15155 is connected to serial port 1564 and PCI-to-ISA Bridge 1535.
While
One of the preferred implementations of the invention is a client application, namely, a set of instructions (program code) in a code module that may, for example, be resident in the random access memory of the computer. Until required by the computer, the set of instructions may be stored in another computer memory, for example, in a hard disk drive, or in a removable memory such as an optical disk (for eventual use in a CD ROM) or floppy disk (for eventual use in a floppy disk drive), or downloaded via the Internet or other computer network. Thus, the present invention may be implemented as a computer program product for use in a computer. In addition, although the various methods described are conveniently implemented in a general purpose computer selectively activated or reconfigured by software, one of ordinary skill in the art would also recognize that such methods may be carried out in hardware, in firmware, or in more specialized apparatus constructed to perform the required method steps.
While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that, based upon the teachings herein, changes and modifications may be made without departing from this invention and its broader aspects. Therefore, the appended claims are to encompass within their scope all such changes and modifications as are within the true spirit and scope of this invention. Furthermore, it is to be understood that the invention is solely defined by the appended claims. It will be understood by those with skill in the art that if a specific number of an introduced claim element is intended, such intent will be explicitly recited in the claim, and in the absence of such recitation no such limitation is present. For non-limiting example, as an aid to understanding, the following appended claims contain usage of the introductory phrases “at least one” and “one or more” to introduce claim elements. However, the use of such phrases should not be construed to imply that the introduction of a claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an”; the same holds true for the use in the claims of definite articles.