1. Field of the Invention
This invention relates generally to continuous processing systems that process streaming data, and, more specifically, to distributing query processing in such systems over a cluster of servers.
2. Description of the Background Art
A continuous processing system processes streaming data. It includes queries which operate continuously on input data streams and which publish output data streams. As processing the queries can be quite complex, it is desirable to distribute processing over a cluster of servers. While clustering for performance is known for processing static data (i.e., not streaming data), different methods are required to process streaming data. Consequently, there is a need for specific methods for distributing processing of queries that operate continuously on streaming data.
The present invention provides a method for processing a set of registered queries over a cluster of servers in a continuous computation system. Prior to processing the queries, the continuous computation system creates an execution plan for processing the queries over a cluster of servers. In creating the execution plan, the continuous computation system analyzes the semantics and requirements of the queries to determine how to distribute processing across the cluster. In one embodiment, if a system administrator or developer has inputted instructions or manual “hints” as to how to distribute processing, the system will also factor in such instructions/hints.
The semantics and requirements of each query may be analyzed to determine whether input messages for the query can be processed independent of each other, whether input messages for the query can be partitioned into groups that can be processed independent of each other, whether the query includes an aggregator function, and/or whether the query includes a subquery.
If input messages for a query can be processed independent of each other, then logic for processing the query can be duplicated on two or more servers and the input messages can be divided up in a number of ways, such as a round-robin fashion, randomly, to balance the load, etc. If input messages for a query can be partitioned into groups that can be processed independent of each other (where messages within a group need to be processed on the same server), then logic for processing the query can be duplicated on two or more servers, and input messages can be distributed to the servers in accordance with the group in which the messages belong.
If the query includes an aggregator function, then logic for processing the query can be duplicated on two or more servers and input messages can be distributed in any manner to such servers. Partial outputs for the aggregator are generated on each server and combined to create the final output.
If a query includes a subquery, the subquery can be processed on a different server than the query by replacing the subquery in the original query with a new input data stream and creating a new query, based on the subquery, which publishes to such new data stream.
Coral8, Inc.'s “In-Motion Processing System” is an example of a continuous processing system. Also, one embodiment of a continuous processing system is described in U.S. patent application Ser. No. 11/015,963, filed on Dec. 17, 2004 with Mark Tsimelzon as the first-named inventor, and titled “Publish and Subscribe Capable Continuous Query Processor for Real-time data streams,” the contents of which are incorporated by reference as if fully disclosed herein.
Queries may be written in a continuous-processing software language (CPL), which is sometimes also referred to as a continuous correlation language (CCL). An example of such a language is described in U.S. patent application Ser. No. 11/346,119, filed on Feb. 2, 2006, and titled “Continuous Processing Language for Real-time Data Streams,” the contents of which are incorporated by reference as if fully disclosed herein.
The method also comprises processing the set of queries over the cluster of servers in accordance with the execution plan (step 220). Such processing may include generating partial outputs for at least one query in the set and using such partial outputs to generate an output data stream for such query. The term “server”, as used herein, means any processing entity. One machine can have one server or multiple servers.
In creating the execution plan, the compiler 120 parses a query, extracts the semantics, and analyzes what is required. Analysis of a query can include determining if, during query processing, input messages can be processed independently of each other (step 210a) or partitioned into groups that can be processed independently of each other (step 210b). In the latter case, each group consists of input messages that need to be processed together on the same server. Furthermore, analysis of a query can include determining whether the query includes an aggregator function (step 210c) or a subquery (step 210d).
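The checks in steps 210a through 210d can be pictured as a small decision procedure. The following Python sketch is illustrative only: the `QueryTraits` fields, the `Strategy` names, the idea that several options may be reported for one query, and the single-server fallback are all assumptions made for this example and are not taken from the compiler 120.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional


class Strategy(Enum):
    SPLIT_ANY_WAY = auto()       # step 210a: messages are independent
    PARTITION_BY_GROUP = auto()  # step 210b: each group stays on one server
    PARTIAL_AGGREGATE = auto()   # step 210c: combine per-server partial outputs
    SPLIT_SUBQUERY = auto()      # step 210d: run the subquery on another server
    SINGLE_SERVER = auto()       # assumed fallback when no check succeeds


@dataclass
class QueryTraits:
    """Hypothetical result of parsing and analyzing one query."""
    independent_messages: bool = False
    group_key: Optional[str] = None
    has_aggregator: bool = False
    has_subquery: bool = False


def candidate_strategies(traits: QueryTraits) -> List[Strategy]:
    """List every distribution option the analysis found for a query."""
    found = []
    if traits.independent_messages:
        found.append(Strategy.SPLIT_ANY_WAY)
    if traits.group_key is not None:
        found.append(Strategy.PARTITION_BY_GROUP)
    if traits.has_aggregator:
        found.append(Strategy.PARTIAL_AGGREGATE)
    if traits.has_subquery:
        found.append(Strategy.SPLIT_SUBQUERY)
    return found or [Strategy.SINGLE_SERVER]
```

In this sketch a query with both a grouping clause and an aggregator reports two options, and choosing between them (or combining them) is left to the execution plan.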
If input messages for a query can be processed independent of each other (i.e., processing of one message does not depend on another message), then processing of such query can be distributed over a server cluster by: (1) installing identical logic for processing the query on two or more servers in the cluster and (2) dividing the input messages among such servers, where the input messages can be divided up in any way. In one embodiment, the continuous processing system determines if messages can be processed independent of each other by performing data flow and semantic analysis that indicates whether or not messages need to interact with each other during query processing.
Each of servers x, y, and z generates a partial output for the query. The output is “partial” because each server only processes a portion of the input messages. The partial outputs are then merged to create the output data stream (320). In some cases, the partial outputs and final output may be generated in the context of time- or row-based windows (windows are described in U.S. patent application Ser. No. 11/346,119, which is referenced above). For example, to generate a partial output, a query may operate on input messages received within a 5 second window, or it may operate on the last 5 rows it received.
An example of a query in which messages can be processed independent of each other is a filter operation, which may be expressed as follows:
Insert into Stream2
Select *
From Stream1
Where cost>10.0
The above query, as well as the other example queries disclosed herein, is written in Coral8's CCL language, which is described in U.S. patent application Ser. No. 11/346,119 (referenced above). In a filter operation, messages do not need to interact with each other. Instead, they can be filtered independently, and those input messages satisfying the filter expression can be merged into the output data stream.
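As a rough illustration of how such a filter could be spread over servers, the Python sketch below deals messages out in round-robin fashion, applies the same filter on each simulated server, and merges the partial outputs. The function names, the dictionary message format, and the use of in-memory lists in place of real servers are assumptions made for this example.

```python
from itertools import cycle


def passes_filter(message):
    """Filter expression from the query above: cost > 10.0."""
    return message["cost"] > 10.0


def distribute_round_robin(messages, server_count):
    """Deal input messages to servers in turn; any other split would also work."""
    inboxes = [[] for _ in range(server_count)]
    for inbox, message in zip(cycle(inboxes), messages):
        inbox.append(message)
    return inboxes


def run_filter_on_cluster(messages, server_count=3):
    """Simulate identical filter logic on each server, then merge the results."""
    inboxes = distribute_round_robin(messages, server_count)
    # Each server filters only its own share, producing a partial output.
    partial_outputs = [[m for m in inbox if passes_filter(m)] for inbox in inboxes]
    # Merge step: concatenate the partial outputs into one output stream.
    return [m for partial in partial_outputs for m in partial]
```

The merged stream in this sketch is ordered by server rather than by arrival, so a deployment that needs the original order would have to carry a sequence number or timestamp on each message.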
If input messages for a query can be partitioned into groups that can be processed independent of each other (where each group consists of messages that need to be processed together), then processing of the query can be distributed over a server cluster by (1) installing identical logic for processing the query on two or more servers in the cluster and (2) dividing the input messages among such servers by such groups, where a single group is processed on the same server (i.e., the group is not divided up among servers).
Each of servers x, y, and z generates a partial output for the query, where each partial output corresponds to one of the groups. The output data stream is then a union of these partial outputs (420). In some cases, the partial outputs, as well as the final output, may be generated in the context of time-based or row-based windows.
Groups can be based on clauses in the query that define groupings. The “Group By” clause in the below query is an example:
Insert into AvgPrices
Select AVG(Trades.Price)
From Trades KEEP 10 minutes
Group By Trades.Symbol
This query calculates a 10 minute moving average of stock prices. In this query, the input data stream is “Trades,” where the query specifies that data in the “Price” column of the “Trades” stream should be averaged over a moving 10 minute period and the averages should be outputted to the output data stream “AvgPrices.” The “Group By” clause specifies that, in calculating the averages and generating the output data stream, the messages should be grouped by stock symbol. Consequently, all messages for the same symbol should go to the same server, as they are processed in the context of each other, but messages for different symbols do not interact with each other and thus can be processed on different servers. In determining how to distribute query processing over a cluster, the continuous processing system may examine the queries for clauses like the “Group By” clause.
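One way to picture group-based distribution is to hash the grouping key to a server so that every message for a given symbol lands in the same place. The Python sketch below is an assumption-laden illustration: hashing with `zlib.crc32` is a choice made for this example (the text does not prescribe how groups are assigned to servers), the function names and message format are hypothetical, and the 10 minute window is omitted for brevity.

```python
from collections import defaultdict
from zlib import crc32


def server_for(symbol, server_count):
    """Map a group key to one server index; crc32 gives the same answer every run."""
    return crc32(symbol.encode("utf-8")) % server_count


def partition_by_symbol(trades, server_count):
    """Send each trade to the server that owns its symbol."""
    inboxes = [[] for _ in range(server_count)]
    for trade in trades:
        inboxes[server_for(trade["Symbol"], server_count)].append(trade)
    return inboxes


def average_by_symbol(trades):
    """What one server computes: the average Price for each symbol it owns."""
    prices = defaultdict(list)
    for trade in trades:
        prices[trade["Symbol"]].append(trade["Price"])
    return {symbol: sum(p) / len(p) for symbol, p in prices.items()}


def run_grouped_average(trades, server_count=3):
    """Simulate the cluster and return the union of the partial outputs."""
    inboxes = partition_by_symbol(trades, server_count)
    partial_outputs = [average_by_symbol(inbox) for inbox in inboxes]
    # No symbol is owned by two servers, so the union has no key conflicts.
    output = {}
    for partial in partial_outputs:
        output.update(partial)
    return output
```

Because a group never spans servers, the final step is a plain union and no per-group recombination is needed.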
If a query includes an aggregator function (such as MAX, MIN, SUM, etc.), then processing of such query can be distributed over a server cluster by (1) installing the aggregator functionality on two or more servers in the cluster and (2) dividing the input messages among such servers, where the input messages are divided up in any way.
Each of servers x, y, and z generates a partial output for the query. The output is “partial” because each server only processes a portion of the input messages. The partial outputs are then combined to create the output data stream (520). In some cases, the partial outputs, as well as the final output, may be generated in the context of time-based or row-based windows.
An example of a query that includes an aggregator is as follows:
Insert into MaxPrices
Select Max(Trades.Price)
From Trades KEEP 10 minutes
This query calculates a 10 minute moving maximum trade price. The input data stream is “Trades” and the output data stream is “MaxPrices.” The aggregator function is the “MAX” function, which calculates the maximum value within a data set. As illustrated in
In this case, this technique allows one to scale both CPU and memory usage of computing “MAX” across a number of servers. The same technique can apply to MIN (i.e., a function that calculates a minimum), COUNT (i.e., a function that calculates a count), AVERAGE (i.e., a function that calculates averages), SUM (i.e., a summation function), STD DEVIATION (i.e., a function that calculates a standard deviation), and many other aggregators (e.g., EVERY, ANY, SOME). Note that in some cases intermediate servers need to send extra information to the merging server. For example, when computing AVERAGE, not only PartialAVERAGEs, but also PartialCOUNTs must be communicated to enable the computation of CombinedAVERAGE.
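The combining step can be sketched in Python as follows. The function names and the representation of each server's share as a list of prices are assumptions made for illustration, and windows are again left out. The sketch shows why AVERAGE needs the partial counts: the combined average is the count-weighted mean of the partial averages, not the mean of the partial averages.

```python
def partial_max(prices):
    """PartialMAX for one server's share (None if the share is empty)."""
    return max(prices) if prices else None


def combine_max(partials):
    """CombinedMAX is the maximum of the non-empty partial maxima."""
    present = [p for p in partials if p is not None]
    return max(present) if present else None


def partial_average(prices):
    """Return (PartialAVERAGE, PartialCOUNT) for one server's share."""
    count = len(prices)
    average = sum(prices) / count if count else 0.0
    return average, count


def combine_average(partials):
    """CombinedAVERAGE: weight each partial average by its partial count."""
    total_count = sum(count for _, count in partials)
    if total_count == 0:
        return None
    weighted_sum = sum(average * count for average, count in partials)
    return weighted_sum / total_count
```

For shares of sizes 2, 1, and 3, for instance, averaging the three partial averages directly would give each server equal weight and generally produce the wrong result.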
If a query includes a subquery, then processing of such query can be distributed over a server cluster by: (1) replacing the subquery with a new input data stream, (2) creating a new query based on the subquery that generates the new data stream, and (3) processing the query (with the new data stream) and the new query on separate servers.
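The three steps above can be sketched in Python with a deliberately simplified query model. The `Query` class, its fields, the `split_subquery` helper, and the stream name chosen by the caller are all hypothetical; a real CCL query has far more structure than a single operation string.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Query:
    """Hypothetical query: read `source`, apply `operation`, publish to `output`."""
    output: Optional[str]          # None for a subquery with no named output
    operation: str
    source: Union[str, "Query"]    # a stream name or a nested subquery


def split_subquery(query: Query, new_stream: str) -> Tuple[Query, Optional[Query]]:
    """Return (rewritten query, new query), or (query, None) if nothing to split."""
    if not isinstance(query.source, Query):
        return query, None
    subquery = query.source
    # Step (2): a new query, based on the subquery, publishes to the new stream.
    new_query = Query(output=new_stream, operation=subquery.operation, source=subquery.source)
    # Step (1): the original query now reads the new stream instead of the subquery.
    rewritten = Query(output=query.output, operation=query.operation, source=new_stream)
    return rewritten, new_query
```

Step (3) then amounts to assigning the two returned queries to different servers, with the new data stream carrying messages between them.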
As will be understood by those familiar with the art, the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. Accordingly, the above disclosure of the present invention is intended to be illustrative and not limiting of the invention.
This application claims the benefit of U.S. Provisional Application No. 60/700,139 filed on Jul. 18, 2005 with first-named inventor Mark Tsimelzon, and titled “Clustering Options for a Continuous-Processing System,” the contents of which are incorporated by reference as if fully disclosed herein.