The present invention relates to a join query technique used with respect to a database and is suited for use in a search system capable of conducting a cross search.
In recent years, attention has been drawn up to a technique searching both non-structural data such as texts, sounds and images, and structural data in a field making use of data (hereinafter referred to as “cross search”). As one realized aspect of the cross search, there is a method of using a table function, which is a standard for SQL (Structured Query Language) 2003, in a system where a relational database management system (hereinafter referred to as “RDBMS [Relational DataBase Management System]), for managing the structural data, works with a search engine, graph database and social network for managing and searching the non-structural data (hereinafter referred to as the external data source).
The table function herein mentioned which is the standard for the SQL 2003 (hereinafter simply referred to as the “table function”) is a standard for referring to a function described by a user with the SQL and invoking it upon the SQL execution and its execution result is a row set. Since the execution result of the table function is the row set, it is possible to refer to the execution result by using a SELECT statement of the SQL and combine the execution result with another table.
Regarding the cross search using a table function of the SQL, a non-structural data search request is issued to the external data source, a function which describes processing for receiving the search result on the RDBMS side is defined, and the search of the non-structural data is conducted by executing that function when executing the SQL. The cross search realizes utilization of both the structural data and the non-structural data by joining the search results of the structural data and the non-structural data.
Incidentally, PTL 1 discloses a technique that optimizes a query for the cross search by executing cost calculation at the time of compilation of the query with respect to a table of the RDBMS and the query to search the external data source.
Moreover, NPL 1 discloses a technique that executes a query by embedding statistic information inside the RDBMS, which is used when generating a plan of query processing, into the plan, acquiring the statistic information during execution of the query processing, and switching the plan to an optimum plan if there is a gap between the acquired statistic information and the statistic information used when generating the plan.
Furthermore, NPL 2 discloses a technique that executes a query by switching the plan to an optimum plan as in the same manner as in NPL 1 if there is a gap between the statistic information inside the RDBMS, which is used when generating the plan of the query processing, and the statistic information acquired during execution of the query.
[PTL 1] U.S. Pat. No. 5,943,666
[NPL 1] Kabra, N., DeWitt, D.: “Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution) ans”, SIGMOD, pp. 106-117, 1998, (ISBN: 0-89791-995-5)
[NPL 2] Eurviriyanukul, K et al.: “Adaptive Join Processing in Pipelined Plans.”, ACM, EDBT '10 Proceedings of the 13th International Conference on Extending Database Technology, pp. 183-194, 2010, (ISBN: 978-1-60558-945-9)
Meanwhile, regarding the cross search, there is a high demand to combine the search result of the non-structural data and the structural data in a short amount of time.
However, according to the technique disclosed in PTL 1, a cost of executing the query processing is estimated when compiling before execution of the query processing, and the query is executed in a short amount of time by optimizing the query execution. Thus, in case there is a gap between the cost estimated at the time of the compilation and an actual query cost, the problem of incapability to execute the query in an optimum plan and lengthening of the query time occurs. Furthermore, when the cross search is conducted by using the table function, the query result will vary considerably depending on an argument of the table function. Therefore, the query execution cost will vary greatly. However, since it is impossible to predict the result of the query and estimate the cost at the time of the query compilation, the query plan, including determination of an optimum joining method, cannot be optimized.
Furthermore, regarding the cross search, the statistic information of the table function which issues a search request to the external data source does not exist at the time of the compilation. Therefore, it is impossible to apply the techniques like those disclosed in NPL1 and NPL2 based on the existence of the statistic information regarding the query at the time of the compilation.
The present invention was devised in consideration of the above-described circumstances and it is an object of the present invention to achieve a database technique enabling to execute processing of join query at high speed.
In order to solve the above-described problems, the present invention provides a join execution method executed by a join query execution device for searching an external data source connected to a data management system by means of a table function and executing joining processing against a join query regarding a search result of the table function and a table managed by the data management system, wherein the join execution method includes: a first step executed by the join query execution device executing the joining processing regarding the search result of the table function and the table managed by the data management system by a first joining method; a second step executed by the join query execution device determining whether the first joining method is an appropriate joining method as a joining method for the joining processing or not, by using information acquired by search processing on the external data source by means of the table function and/or information found during the joining processing; and a third step executed by the join query execution device, when it is determined that the first joining method is not the appropriate joining method as the joining method for the joining processing, recognizing a speed of a second joining method to be higher than that of the first joining method and switching the joining method for the joining processing from the first joining method to the second joining method.
Furthermore, in the present invention, a join query execution device searches an external data source connected to a data management system by means of a table function, and executes joining processing against a join query regarding a search result of the table function and a table managed by the data management system, wherein the join query execution device comprises: a table function unit that searches the external data source by issuing a search request to the external data source; a joining execution unit that executes the joining processing regarding the search result of the table function and the table managed by the data management system by a first joining method or a second joining method whose processing speed is faster than that of the first joining method; a switch command unit that determines whether the first joining method is an appropriate joining method as a joining method for the joining processing or not, by using information acquired by search processing on the external data source by the table function unit and/or information found during the joining processing, and gives a command to the joining execution unit to switch the joining method for the joining processing to the second joining method when it is determined that the first joining method is not the appropriate joining method as the joining method for the joining processing; and a control unit that controls activation of the table function unit, the joining execution unit, and the switch command unit.
Furthermore, the present invention is created so that a program is stored for comprising a join query execution device search an external data source connected to a data management system by means of a table function and execute joining processing against a join query regarding a search result of the table function and a table managed by the data management system, wherein the processing includes: a first step of executing the joining processing regarding the search result of the table function and the table managed by the data management system by a first joining method; a second step of determining whether the first joining method is an appropriate joining method as a joining method for the joining processing or not, by using information acquired by search processing on the external data source by means of the table function and/or information found during the joining processing; and a third step, which is executed when it is determined that the first joining method is not the appropriate joining method as the joining method for the joining processing, of recognizing a speed of a second joining method to be higher than that of the first joining method and switching the joining method for the joining processing from the first joining method to the second joining method.
According to the present invention, since the joining method can be switched to an appropriate joining method during the joining processing, it is possible to increase the speed of join query execution processing.
An embodiment of the present invention will be described in detail below with reference to drawings.
Referring to
The client 2 is a computer device used by a user and is configured of, for example, a personal computer, a workstation, or a mainframe. The client 2 sends a query to the RDBMS server 4 via the network 3 as operated by the user.
The RDBMS server 4 is a server system equipped with an RDBMS 10 and is configured by including a CPU (Central Processing Unit) 11, a main storage device 12, a communications interface 13, an input device 14, and an output device 15.
The CPU 11 is a processor that controls the operation of the entire RDBMS server 4. Also, the main storage device 12 is composed of, for example, a volatile semiconductor memory and is mainly used to store and retain various kinds of software. Various processing is executed by the RDBMS server 4 as a whole as described later by the CPU executing the software stored in the main storage device 12. The RDBMS 10 is also stored and retained in this main storage device 12.
The communications interface 13 is an interface for protocol control during communications with the client 2, the secondary storage device 5, or the external data source 6 and is composed of, for example, an NIC (Network Interface Card).
The input device 14 is a device used by a system administrator or the like to perform various operations and is composed of, for example, a keyboard and a mouse. Also, the output device 15 is a device used to present various pieces of information to the system administrator and is composed of, for example, a liquid crystal monitor and a speaker.
The secondary storage device 5 is a mass storage system used to save various data managed by the RDBMS server 4 and is composed of, for example, hard disk drives and a RAID (Redundant Arrays of Independent Disks) system. The secondary storage device 5 stores an index 16 for a database managed by the RDBMS 10, tabular data 17 of tables managed by the RDBMS 10, threshold information 18 described later, and so on.
The external data source 6 is a program for managing and searching the non-structural data and, for example, a search engine or a graph database is applied.
Next, the structure of the RDBMS 10 mounted in the RDBMS server 4 will be explained. Referring to
The pre-processing unit 20 is a program having a function that compiles a query from the client 2. Furthermore, the query processing unit 21 is a program having a function that executes the query compiled by the pre-processing unit 20. When the query from the client 2, for example, as in a case of a cross search, requires search results of the external data source 6, the table function unit 22 is a program having a function that issues a search request to the external data source 6 in response to this query, acquiring the search results in response to the search request from the external data source 6, and registering it in a temporary table (not shown in the drawing) of the RDBMS 10. When the table function unit 22 acquires the search results from the external data source 6, it also acquires the number of the search results (hereinafter simply referred to as the “search result count”).
The data management unit 23 is a program that has a function managing the tables managed by the RDBMS 10; and the general control unit 24 is a program that has a function controlling the pre-processing unit 20, the query processing unit 21, the table function unit 22, and the data management unit 23. Incidentally, the temporary table area 25 is a storage area for storing the temporary table which stores the search results of the external data source 6.
Under this circumstance, the pre-processing unit 20 is configured of: a query analyzer 30 for analyzing a query from the client 2; a query optimization unit 31 for optimizing the query analyzed by the query analyzer 30; and a query plan generator 32 for generating a plan 33 for processing on the query.
Furthermore, the query processing unit 21 is configured of: an insertion processing unit 40 for inserting data into a table managed by the RDBMS 10; an update processing unit 41 for updating the table managed by the RDBMS 10; a deletion processing unit 42 for deleting data from the table managed by the RDBMS 10; a search processing unit 43 for searching the table managed by the RDBMS 10 for data which satisfy designated conditions; a joining processing unit 44 for joining tables managed by the RDBMS 10, joining a table managed by the RDBMS 10 and the search result of the external data source 6, or joining the search results of the external data source 6 in response to, for example, a request from the client 2; and a query processing control unit 45 for analyzing a plan generated by the query plan generator 23 and invoking the relevant insertion processing unit 40, update processing unit 41, deletion processing unit 42, search processing unit 43, or joining processing unit 44 according to the content of the query.
Then, when the query from the client 2 is a join query to join the table of the RDBMS 10 and the temporary table which stores the search result of the external data source 6 (that is, a cross search), the joining processing unit 44 for the query processing unit 21 uses a Nested-Loop joining method to execute joining processing on the table managed by the RDBMS 10 with respect to each one search result of the external data source 6 registered in the temporary table. Moreover, when the joining processing unit 44 is activated, it also activates the table function unit 22. Consequently, the joining processing and the processing of the table function unit 22 are executed in parallel.
Furthermore, the joining processing unit 44 continues the processing until the joining processing on all the search results of the external data source 6 terminates. Under this circumstance, every time the joining processing unit 44 executes the joining processing on one search result of the external data source 6 registered in the temporary table and the table managed by the RDBMS 10, it calculates a remaining quantity of data of the temporary table on which the joining processing should be executed (hereinafter referred to as the “remaining join quantity”). At the same time, the joining processing unit 44 makes a query to the table function unit 22 about the search result count of the external data source 6 and then calculates a progress rate of the joining processing based on the thus-acquired search result count. Then, when the remaining join quantity is equal to or more than a predetermined threshold (hereinafter referred to as the “remaining join quantity threshold”) and the progress rate of the joining processing is less than a predetermined threshold (hereinafter referred to as the “progress rate threshold”), the joining processing unit 44 switches the joining method to a hash joining method, whose joining processing speed is faster than that of the Nested-Loop joining method, and then executes the remaining joining processing.
Next, the details of the joining processing according to this embodiment described above will be explained. In this section, an explanation will be given about a case where the RDBMS server 4 receives a join query, like the one illustrated in
When the RDBMS server 4 receives the join query, the general control unit 24 for the RDBMS server 4 invokes the pre-processing unit 20. When the pre-processing unit 20 is invoked by the general control unit 24, it has the query analyzer 30 compile the join query, has the query optimization unit 31 optimize the compiled join query, and then has the query plan generator 32 generate a plan 33 which is the plan used when executing the optimized join query. Specifically speaking, the query plan generator 32 generates the plan 33 as illustrated in
Subsequently, when the general control unit 24 receives a query execution request from the client 2, it invokes the query processing unit 21. Then, the query processing unit 21 has the query processing control unit 45 analyze the plan 33 in
Then, if the query processing control unit 45 determines that the query execution request requests for data insertion into the table of the RDBMS 10, it invokes the insertion processing unit 40 (SP3); and if the query processing control unit 45 determines that the query execution request requests for an update of the table of the RDBMS 10, it invokes the update processing unit 41 (SP4).
Moreover, if the query execution processing is to request deletion of data from the table of the RDBMS 10, the query processing control unit 45 invokes the deletion processing unit 42 (SP5); and if it is determined that the query execution request is to request a data search, the query processing control unit 45 invokes the search processing unit 43 (SP6).
Furthermore, if the query execution request is to request joining between, for example, the table of the RDBMS 10 and the search result of the external data source 6, the query processing control unit 45 invokes the joining processing unit 44 (SP7). Then, the query processing control unit 45 invokes the insertion processing unit 40, the update processing unit 41, the deletion processing unit 42, the search processing unit 43, or the joining processing unit 44 as described above.
Accordingly, when the query sent from the client 2 to the RDBMS server 4 is a join query like the one as illustrated in
Then, when the joining processing unit 44 is invoked in step SP7 of the query execution control processing described above with reference to
When the table-function-and-base-table joining unit 52 is invoked in step SP7 of the query execution control processing, the control unit 63 starts the activation processing illustrated in
Subsequently, the control unit 63 waits for an activation completion notice (hereinafter referred to as the “activation completion notice”) to be transmitted from the switch command unit 61 (SP11); after eventually receiving the activation completion notice, the control unit 63 sequentially invokes the joining execution unit 60 and the table function invoking unit 62 (SP12, SP13) and then terminates this activation processing.
Meanwhile, the joining execution unit 60 which is activated by the control unit 63 in step SP12 of the activation processing (
As means for executing such joining processing, the joining execution unit 60 includes a joining operation unit 74 configured of: a Nested-Loop joining unit 70 for executing processing for joining the joining targets (between the tables managed by the RDBMS 10, between the table managed by the RDBMS 10 and the search result of the external data source 6 or between the search results of the external data source 6) as illustrated in
Furthermore, the joining execution unit 60 includes: a count query unit 75 for acquiring the search result count of the external data source 6 from the table function unit 22 (
Practically, when the joining execution unit 60 is activated by the control unit 63 in step SP12 of the activation processing described above with reference to
Next, the joining execution unit 60: acquires one piece of data from the temporary table which stores data of the joining target stored in the temporary table area 25 (
Obtaining a negative result in this judgment means that the search result by the search processing on the external data source 6, which is being executed in parallel processing, has not been registered in the temporary table, yet. Therefore, when this happens, the joining execution unit 60 returns to step SP21 and then repeats a loop of step SP21, and then step SP22, and back to step SP21 until the temporary table search unit 72 successfully acquires the data from the temporary table.
Then, if the joining execution unit 60 obtains an affirmative result in step SP22 as the temporary table search unit 72 eventually acquires one piece of data from the temporary table, it joins this data and the joining target table managed by the RDBMS 10 from the base table search unit 73 by the then-designated joining method (hereinafter referred to as the “designated joining method”) (SP23).
Subsequently, the joining execution unit 60 determines whether the current designated joining method is Nested-Loop joining or not (SP24); and if the joining execution unit 60 obtains a negative result, it proceeds to step SP31. Meanwhile, if the joining execution unit 60 obtains an affirmative result in the judgment of step SP24, it has the remaining quantity information calculation unit 76 calculate the remaining join quantity (SP25). Specifically speaking, the remaining quantity information calculation unit 76 calculates the remaining join quantity by subtracting the number of pieces of data, on which the joining processing has already been executed, from the number of pieces of data registered in the temporary table.
Then, the joining execution unit 60 has the count query unit 75 make a query to the table function unit 22 about the search result count of the external data source 6 (SP26) and has the progress rate calculation unit 78 calculate the progress rate of the joining processing based on the thus-acquired search result count of the external data source 6 (SP27). Specifically speaking, the progress rate calculation unit 78 calculates the progress rate of the joining processing by dividing the number of pieces of data, on which the joining processing has already been executed, among the data registered in the temporary table by the search result count of the external data source 6 acquired in step SP26.
Furthermore, the joining execution unit 60 has the remaining quantity information communication unit 77 report the remaining join quantity, which was calculated in step SP25, to the switch command unit 61 and also has the progress rate communication unit 79 report the progress rate, which was calculated in step SP27, to the switch command unit 61 (SP28), and then determines whether the switch receiver 80 has received a command, which is transmitted from the switch command unit 29 when necessary as described later, to switch the designated joining method to the hash joining method (hereinafter referred to as the “switch command”) or not (SP29).
Then, if the switch receiver 80 has not received the switch command from the switch receiver 80, the joining execution unit 60 proceeds to step SP31. Meanwhile, if the switch receiver 80 has received the switch command, the joining execution unit 60 switches the designated joining method to the hash joining method (SP30). Furthermore, the joining execution unit 60 then determines whether the progress rate calculated in step SP27 is 100% or not (SP31).
If the joining execution unit 60 obtains a negative result in this judgment, it returns to step SP21 and then repeats the processing from step SP21 to step SP31 while switching the data to be acquired from the temporary table in step SP21 sequentially from one data to another.
Then, if the joining execution unit 60 obtains an affirmative result in step SP31 as the progress rate of the joining processing becomes 100%, it terminates this joining execution processing.
Meanwhile, the switch command unit 61 activated by the control unit 63 in step SP10 of the activation processing described earlier with reference to
As means for the above-described purpose, the switch command unit 61 includes, as illustrated in
Under this circumstance, the threshold information 18 stored in the secondary storage device 5 is composed of information representing a function name, a remaining join quantity threshold, and a progress rate threshold, respectively, as indicated in
Practically, when the switch command unit 61 is activated by the control unit 63 in step SP10 of the activation processing described earlier with reference to
Subsequently, the switch command unit 61 reports the activation completion notice to the control unit 63 (SP41) and then wait for the remaining quantity information receiver 82 and the progress rate receiver 83 to receive the corresponding remaining join quantity and progress rate, respectively, from the joining execution unit 60 (SP42).
Then, if the switch command unit 61 obtains an affirmative result in step SP42 as the remaining quantity information receiver 82 and the progress rate receiver 83 eventually receive the remaining join quantity and the progress rate, respectively, it compares the received remaining join quantity and progress rate with their corresponding remaining join quantity threshold and progress rate threshold of the threshold information 18 (SP43).
Next, as a result of the comparison in step SP42, the switch command unit 61 has the determination unit 84 determine whether or not the remaining join quantity is equal to or more than the remaining join quantity threshold and the progress rate is less than the progress rate threshold (SP44). Then, if the switch command unit 61 obtains a negative result in this judgment, it returns to step SP42 and then waits for the next remaining join quantity and progress rate to be transmitted (SP42).
Meanwhile, if the switch command unit 61 obtains an affirmative result in the judgment of step SP44, it has the switch information communication unit 85 report the switch command to the joining execution unit 60 to switch the joining method (SP45) and then terminates this switch command processing.
Furthermore, the table function invoking unit 62 activated by the control unit 63 in step SP13 of the activation processing described earlier with reference to
Under this circumstance, the table function unit 22 includes, as illustrated in
Practically, when the table function unit 22 is invoked by the function invoking unit 62, it starts the query processing illustrated in
Subsequently, the table function unit 22: has the count receiver 93 receive the search result count which is transmitted from the external data source 6 in response to the search request (SP51); and then determines whether the query about the search result count of the external data source 6 transmitted from the joining execution unit 60 (
Next, the table function unit 22 has the query result receiver 91 receive the search result in response to the search request transmitted from the external data source 6 in step SP50 (SP54) and registers the received search result in the temporary table of the RDBMS 10 (SP55), and then determines whether it has finished receiving all the search results in response to the search request in step SP50 (SP56). Incidentally, this judgment is made by determining whether as many search results as the search result count received in step SP51 have been received or not, or by determining whether a request for termination of transfer of the search results has been received from the external data source or not.
Then, if the table function unit 22 obtains a negative result in this judgment, it returns to step SP54 and then repeats the processing from steps SP54 to step SP56. Then, if the table function unit 22 obtains an affirmative result in step SP56 by eventually receiving as many search results as the search result count received in step SP52, it terminates this query processing.
this execution sequence is started by activation of the table-function-and-base-table joining unit 52 (
Then, when the table-function-and-base-table joining unit 52 is activated, the control unit 63 for the table-function-and-base-table joining unit 52 firstly transmits an activation request to the switch command unit 61 (SP60). Then, the switch control unit 61 which has received this activation request executes the activation processing, reads the threshold information 18 from the secondary storage device 5 (SP61), and then transmits an activation completion notice to the control unit 63 upon completion of the activation processing (SP62).
Furthermore, when the control unit 63 receives the activation completion notice from the switch command unit 61, it transmits an activation request to the joining execution unit 60 (SP63). Also at the same time, the control unit 63 transmits an activation request to the table function invoking unit 62. As a result, the table function unit 22 is invoked by the table function invoking unit 62 (SP64).
Then, the table function unit 22 invoked by the table function invoking unit 62 transmits a search request to the external data source 6 in response to the join query from the client 2 (SP65). Subsequently, when the table function unit 22 receives the search result count and the search result, respectively, transmitted from the external data source 6 in response to this search request (SP66, SP67), it registers the received search result in the temporary table of the RDBMS 10 (SP68).
Moreover, when the activation processing is completed, the joining execution unit 60 which has received the aforementioned activation request transmitted from the control unit 63 reads one piece of data from the temporary table of the RDBMS 10 (SP69) and joins the read data and the table managed by the RDBMS 10 by the Nested-Loop joining method (SP70). Then, the joining execution unit 60 calculates the current remaining join quantity with respect to the search results of the external data source 6 registered in the temporary table (SP71).
Furthermore, the joining execution unit 60 makes a query to the table function unit 22 about the search result count (SP72); and after the search result count is reported from the table function unit 22 (SP73), the joining execution unit 60 calculates the progress rate of the query based on the reported search result count (SP74). Then, the joining execution unit 60 reports the thus-calculated remaining join quantity and progress rate to the switch command unit 61 (SP75).
After receiving this remaining join quantity and the progress rate, the switch command unit 61 determines whether or not the remaining join quantity found during the joining processing is equal to or more than the remaining join quantity threshold and the progress rate of the joining processing is less than the progress rate threshold (SP76); and if the switch command unit 61 obtains an affirmative result in this judgment, it transmits a switch command to the joining execution unit 60 to switch the joining method (SP77).
Consequently, the joining execution unit 60 which has received this switch command switches the joining method used during the joining processing to the hash joining method (SP78) and then thereafter executes the processing for joining the search result of the external data source 6 and the table managed by the RDBMS 10 by the hash joining method.
Regarding the search system 1 according to this embodiment described above, when the client 2 makes a join query (cross search request) which requires to join the search result of the external data source 6 and the table managed by the RDBMS 10, whether the joining method is appropriate or not is determined every time the processing for joining one search result of the external data source 6 registered in the temporary table and the table managed by the RDBMS 10 is executed by the Nested-Loop joining method; and if it is determined that the joining method is not appropriate, the joining method is switched to the hash joining method and the remaining joining processing is then executed, so that the cross search can be processed at high speed.
Furthermore, since this search system 1 executes the joining processing upon execution of the cross search and the search processing on the external data source 6 in parallel, the speed of the cross search can be further increased.
In the first embodiment, every time one piece of data in the temporary table, which stores the search results of the external data source 6, and the table managed by the RDBMS 10 are joined, the remaining join quantity and the progress rate of the joining processing are calculated, respectively, and whether the joining method should be switched or not is determined based on the remaining join quantity and the progress rate.
Meanwhile, this embodiment is characterized in that every time one piece of data in the temporary table, which stores the search results of the external data source 6, and the table managed by the RDBMS 10 are joined, not the remaining join quantity, but a hit rate and the progress rate of the joining processing are found and whether the joining method should be switched or not is determined based on the hit rate and the progress rate.
The hit rate used herein in this embodiment means a rate of data, which satisfy joining conditions, among the entire data of the joining target table managed by the RDBMS 10 with respect to the one piece of data, on which the joining processing is then executed, in the temporary table which stores the search results of the external data source 6.
A search system according to this embodiment is configured in the same manner as the search system 1 according to the first embodiment, except that the configuration of the table-function-and-base-table joining unit 52 for the joining processing unit 44 described earlier with reference to
Under this circumstance, the hit rate information calculation unit 102 has a function calculating the hit rate of one piece of data in the temporary table which stores the search results of the external data source 6 every time the processing for joining the one piece of data and the joining target table managed by the RDBMS 10. Furthermore, the hit rate information communication unit 103 has a function reporting the hit rate calculated by the hit rate information calculation unit 102 to the switch command unit 104.
Practically, when the joining execution unit 101 is activated by the control unit 63 in step SP 12 of the activation processing described earlier with reference to
Subsequently, the joining execution unit 101 has the hit rate information calculation unit 102 calculate the hit rate of data which is a target at that time (SP85). Specifically speaking, the joining execution unit 101 calculates a rate of data, which satisfy the joining conditions, among data of the joining target table managed by the RDBMS 10 as the hit rate with respect to the data acquired in step SP81.
Next, the joining execution unit 101 processes step SP86 and step SP87 in the same manner as in step SP26 and step SP27 of the joining execution processing according to the first embodiment; and then has the hit rate information communication unit 103 report the hit rate calculated in step SP85 to the switch command unit 104 and also has the progress rate communication unit 79 report the progress rate calculated in step SP87 to the switch command unit 104 (SP88).
Then, the joining execution unit 101 processes step SP89 to step SP91 in the same manner as in step SP29 to step SP31 of the joining execution processing according to the first embodiment; and if the joining execution unit 101 eventually obtains an affirmative result in step SP91, it terminates this joining execution processing.
Meanwhile, the switch command unit 104 for the table-function-and-base-table joining unit 100 according to this embodiment is configured in the same manner as the switch command unit 61 (
Furthermore, in the case of this embodiment, the secondary storage device 5 (
Practically, when the switch command unit 104 is activated by the control unit 63 in step SP10 of the activation processing described earlier with reference to
Subsequently, the switch command unit 104 informs the control unit 63 of an activation completion notice to report the completion of activation (SP101) and then waits for the hit rate information receiver 105 and the progress rate receiver 83 to respectively receive the corresponding hit rate and progress rate to be transmitted from the joining execution unit 101 (SP102).
Then, if the switch command unit 104 obtains an affirmative result in step SP103 as the hit rate information receiver 105 and the progress rate receiver 83 eventually receive the hit rate and the progress rate respectively, it has the determination unit 106 compare the received hit rate and progress rate with the threshold information 107 acquired in step SP100 (SP103).
Subsequently, as a result of the comparison in step SP103, the switch command unit 104 has the determination unit 106 determine whether or not the hit rate is equal to or more than the hit rate threshold and the progress rate of the joining processing is less than the progress rate threshold (SP104). Then, if the switch command unit 104 obtains a negative result in this judgment, it returns to step SP102 and then waits for the next hit rate and progress rate to be transmitted.
On the other hand, if the switch command unit 104 obtains an affirmative result in the judgment of step SP104, it has the switch information communication unit 85 transmit a switch command to the joining execution unit 101 to switch the joining method (SP105) and then terminates this switch command processing.
Regarding the search system according to this embodiment as described above, every time one piece of data in the temporary table, which stores the search results of the external data source 6, and the table managed by the RDBMS 10 are joined, the hit rate and the progress rate of the joining processing are calculated and whether the joining method should be switched or not is determined based on the hit rate and the progress rate; and if it is determined that the joining method is not appropriate, the joining direction is switched to the hash joining method and then the remaining joining processing is executed. Therefore, the cross search can be processed at high speed in the same manner as in the first embodiment.
this embodiment is characterized in that whether the joining method should be switched or not is determined based only on the remaining join quantity. Specifically speaking, every time one piece of data in the temporary table, which stores the search results of the external data source 6, and the table managed by the RDBMS 10 are joined, the remaining join quantity is calculated and whether the joining method should be switched or not is determined based on the calculated remaining join quantity.
A search system according to this embodiment is configured in the same manner as the search system 1 according to the first embodiment, except that the configuration of the table-function-and-base-table joining unit 52 for the joining processing unit described earlier with reference to
Practically, when the joining execution unit 111 is activated by the control unit 63 in step SP13 of the activation processing described earlier with reference to
Subsequently, the joining execution unit 111 has the remaining quantity information communication unit 77 report the remaining join quantity calculated in step SP135 to the switch command unit 112 (SP136). How to calculate the remaining join quantity here is the same as the first embodiment.
Then, the joining execution unit 111 processes step SP137 to step SP139 in the same manner as in step SP29 to step SP31 of the joining execution processing according to the first embodiment; and if the joining execution unit 111 obtains an affirmative result in step SP139, it terminates this joining execution processing.
Meanwhile, the switch command unit 112 for the table-function-and-base-table joining unit 110 according to this embodiment is configured in the same manner as the switch command unit 61 (
Furthermore, in the case of this embodiment, the secondary storage device 5 (
Practically, when the switch command unit 112 is activated by the control unit 63 in step SP10 of the activation processing described earlier with reference to
Subsequently, the switch command unit 112 waits for the remaining quantity information receiver 82 to receive the remaining join quantity transmitted from the joining execution unit 111 (SP142). Furthermore, if the switch command unit 112 obtains an affirmative result in step SP142 as the remaining quantity information receiver 82 eventually receives the remaining join quantity, it has the determination unit 113 compare the received remaining join quantity with the threshold information 114 (
Then, as a result of the comparison in step SP143, the switch command unit 112 has the determination unit 113 determine whether or not the remaining join quantity is equal to or more than the remaining join quantity threshold (SP144). Then, if the switch command unit 112 obtains a negative result in this judgment, it returns to step SP142 and then waits for the next remaining join quantity to be transmitted.
On the other hand, if the switch command unit 112 obtains an affirmative result in the judgment of step SP144, it has the switch information communication unit 85 transmit a switch command to the joining execution unit 111 to switch the joining method (SP145) and then terminates this switch command processing.
Furthermore, the table function unit 120 according to this embodiment is configured in the same manner as the table function unit 22 (
Then, the table function unit 120 processes step SP151 to step SP153 in the same manner as in step SP54 to step SP56 of the query processing according to the first embodiment described earlier with reference to
When the RDBMS server 4 receives the join query (cross search request in this example) transmitted from the client 2, this execution sequence is started by activation of the table-function-and-base-table joining unit 110 (
Then, once the table-function-and-base-table joining unit 110 is activated, step SP160 to step SP164 are processed in the same manner as step SP60 to step SP64 of the execution sequence according to the first embodiment described earlier with reference to
Then, the table function unit 120 transmits a search request to the external data source 6 in response to the join query from the client 2 (SP165). Subsequently, when the table function unit 120 receives the search result transmitted from the external data source 6 in response to this search request (SP166), it registers the received search result in the temporary table of the RDBMS 10 (SP167).
Furthermore, when the activation processing is completed, the joining execution unit 111 activated by the control unit 63 reads one piece of data from the temporary table (SP168) and joins the read data and the table managed by the RDBMS 10 by the Nested-Loop joining method (SP169). Then, the joining execution unit 111 calculates the current remaining join quantity with respect to the search results of the external data source 6 registered in the temporary table (SP170) and reports the calculated remaining join quantity to the switch command unit 112 (SP171).
After receiving this remaining join quantity, the switch command unit 112 determines whether or not the remaining join quantity calculated during the joining processing is equal to or more than the remaining join quantity threshold (SP172); and if the switch command unit 112 obtains an affirmative result in this judgment, it transmits a switch command to the joining execution unit 111 to switch the joining method (SP173).
Consequently, the joining execution unit 111 which has received this switch command switches the joining method to be used for the joining processing to the hash joining method (SP174) and then executes the processing for joining the search result of the external data source 6 and the table managed by the RDBMS 10 by the hash joining method.
Regarding the search system according to this embodiment as described above, every time one piece of data in the temporary table, which stores the search results of the external data source 6, and the table managed by the RDBMS 10 are joined, the remaining join quantity is calculated and whether the joining method should be switched or not is determined based on the remaining join quantity; and if it is determined that the joining method is not appropriate, the joining direction is switched to the hash joining method and then the remaining joining processing is executed. Therefore, the cross search can be processed at high speed in the same manner as in the first embodiment even if the search result count cannot be acquired from the external data source 6.
Incidentally, the aforementioned first to third embodiments have described the case where the data management system is the RDBMS 10; however, the present invention is not limited to this example and a wide variety of other data management systems can be applied.
Furthermore, the aforementioned first to third embodiments have described the case where the joining methods applied to the processing for joining the search result of the external data source 6 and the table managed by the RDBMS 10 are the Nested-Loop joining method and the hash joining method; however, the present invention is not limited to this example and a wide variety of other joining methods can be applied. In this case, the processing for joining the search result of the external data source 6 and the table managed by the RDBMS 10 may be started by a first joining method and then the joining method may be switched to a second joining method, whose processing speed is faster than that of the first joining method, when the need arises.
Furthermore, the aforementioned first to third embodiments have described the case where the remaining join quantity and the progress rate of the joining processing, and the hit rate are applied when determining whether the joining method applied to the processing for joining the search result of the external data source 6 and the table managed by the RDBMS 10 is an appropriate joining method or not; however, the present invention is not limited to this example and a wide variety of information other than the remaining join quantity, the progress rate, and/or the hit rate (information acquired by the search processing on the external data source 6 and/or information found during the joining processing) can be applied. In this case, a standard for determining whether the joining method applied to the processing for joining the search result of the external data source 6 and the table managed by the RDBMS 10 is an appropriate joining method or not may be decided according to the information applied at that time.
Furthermore, the aforementioned first to third embodiments have described the case where a storage medium which is a semiconductor memory is applied as a storage medium for storing the RDBMS 10 to which the present invention is applied; however, the present invention is not limited to this example and it is possible to apply a wide variety of storage media, for example, disc-type storage media such as CDs (Compact Discs), DVDs (Digital Versatile Discs), BD (Blu-ray [registered trademark] Discs), hard disk drives, or optical discs, nonvolatile semiconductor memories, or other storage media.
The present invention enables to be applied to a search system capable of conducting a cross search.
1 Search system; 2 client; 4 RDBMS server; 5 secondary storage device; 6 external data source; 11 CPU; 12 main storage device; 16, 107, 114 threshold information; 21 query processing unit; 22, 120 table function unit; 44 joining processing unit; 52, 100, 110 table-function-and-base-table joining unit; 63 control unit; 60, 101, 111 joining execution unit; 61, 104, 112 switch command unit; 62 table function invoking unit; 70 Nested-Loop joining unit; 71 hash joining unit; 73 base table search unit; 74 joining operation unit; 75 count query unit; 76 remaining quantity information calculation unit; 77 remaining quantity information communication unit; 78 progress rate calculation unit; 79 progress rate communication unit; 81 threshold information acquisition unit; 84, 113 determination unit; 90 query unit; 91 query result receiver; 92 temporary table registration unit; 93 count receiver; 102 hit rate information calculation unit; and 103 hit rate information communication unit.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2013/064323 | 5/23/2013 | WO | 00 |