1. Technical Field
The description generally relates to stream processing and, more particularly, to systems and methods for optimal component composition in a stream processing system.
2. Description of the Related Art
Emerging applications such as trade surveillance for security fraud, network traffic monitoring for intrusion detection, sensor data analysis, audio/video surveillance, and value-added voice-over-IP services, call for sophisticated real-time processing on data streams. In these applications, data streams are continuously pushed to stream processing servers, where they are processed by self-contained stream processing elements called “components”. Each component provides an atomic stream processing function such as filtering, aggregation, and correlation. Since stream applications are inherently distributed, stream processing should operate in a distributed fashion. Moreover, distributed stream processing systems provide better scalability and availability for resource-intensive and quality-sensitive stream processing applications. Thus, a challenging problem is to optimally compose distributed stream processing components into dynamically required stream processing applications.
Component composition has been studied under different contexts, such as service composition and systems software composition. The work on service composition is described, e.g., in the following articles, all of which are included by reference herein in their entireties: Raman et al., “Load Balancing and Stability Issues in Algorithms for Service Composition”, Proc. of IEEE INFOCOM 2003, San Francisco, Calif., pp. 1477-1487, April 2003; Gu et al., “QoS-Assured Service Composition in Managed Service Overlay Networks”, Proc. of IEEE 23rd International Conference on Distributed Computing Systems (ICDCS 2003), Providence, R.I., 194-201, May 2003; and Gu et al., “SpiderNet: An Integrated Peer-to-Peer Service Composition Framework”, Proc. of IEEE International Symposium on High-Performance Distributed Computing (HPDC 2004), Honolulu, Hi., 110-119, June 2004. The work on systems software composition is described, e.g., in the following article which is included by reference herein in its entirety: Kohler et al., “The Click Modular Router”, ACM Transactions on Computer Systems, 18(3), pp. 263-297, August 2000. Disadvantageously, the previous work falls short in addressing the optimization requirements in component composition, which is especially important for stream processing systems.
Previous work on stream processing has addressed problems such as load shedding and load migration. Load shedding is described, e.g., in the following article which is incorporated by reference herein in its entirety: Tatbul et al., “Load Shedding in a Data Stream Manager”, Proc. of the 29th International Conference on Very Large Data Bases (VLDB'03), Berlin, Germany, 309-320, September 2003. Further, load migration is described, e.g., in the following article which is incorporated by reference herein in its entirety: Balazinska et al., “Contract-based Load Management in Federated Distributed Systems”, Proc. of 1st Symposium on Networked Systems Design and Implementation (NSDI), San Francisco, Calif., 197-210, March 2004. Disadvantageously, the previous work does not address the optimal component composition problem.
Given the current state of the prior art, it would be beneficial and highly advantageous to have a system and method for optimal component composition in distributed stream processing environments.
The present invention is directed to systems and methods for optimal component composition in distributed stream processing environments.
According to an aspect of the present invention, there is provided a system for optimizing component composition in a distributed stream-processing environment having a plurality of nodes capable of being associated with one or more of a plurality of stream processing components. The system includes an adaptive composition probing (ACP) module and a hierarchical state manager. The ACP module probes a subset of the plurality of stream processing components to determine the optimal component composition in response to a stream processing request. The hierarchical state manager manages local and global information for use by said ACP module in determining the optimal component composition.
According to another aspect of the present invention, there is provided a method for optimizing component composition in a distributed stream-processing environment having a plurality of nodes capable of being associated with one or more of a plurality of stream processing components. An adaptive composition probing (ACP) is performed which probes a subset of the plurality of stream processing components to determine the optimal component composition in response to a stream processing request. Local and global information are managed for use by the performing step in determining the optimal component composition.
According to yet another aspect of the present invention, there is provided a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for optimizing component composition in a distributed stream-processing environment having a plurality of nodes capable of being associated with one or more of a plurality of stream processing components. An adaptive composition probing (ACP) is performed which probes a subset of the plurality of stream processing components to determine the optimal component composition in response to a stream processing request. Local and global information are managed for use by the performing step in determining the optimal component composition.
These and other objects, features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:
The present invention is directed to a system and method for optimal component composition in distributed stream processing environments.
In contrast to the prior art, the present invention provides an adaptive, real-time system and method capable of solving the constrained optimization problem in component composition. Of course, given the teachings of the present invention provided herein, other applications therefor may also be employed with respect to the present invention while maintaining the scope of the present invention.
The present description illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The invention as defined by such claims resides in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
A description will now be given regarding a distributed stream processing environment.
In
The distributed stream processing system 100 includes a set 111 of stream processing server nodes (vi), each of which can be one or more computers (and thus are hereinafter interchangeably referred to as “computer”, “node”, or “server”) 115. For failure resilience, distributed nodes 115 are connected using application-level overlay links (ei) 104 into an overlay mesh. Each node 115 provides a few stream processing components {c1, . . . , ck}. Each component provides an atomic stream processing function (Fi) such as filtering, aggregation, correlation, and audio/video analysis. Due to the constraints of security, software licenses, and hardware requirements, it is not presumed that each node can provide all stream processing components. The component composition process selects and connects currently deployed components into user-required stream processing applications. Components can be dynamically migrated among nodes. The component composition operates based on the current component placement.
Each component receives continuous data units (e.g., data tuples, audio samples, video frames, and so forth) via input queues 103 from its preceding components, as illustrated in
A description will now be given regarding a composite stream processing service.
Distributed components can be dynamically composed into composite stream processing services. Generally, composition topology can be a directed acyclic graph (DAG), as illustrated in
A description will now be given regarding the problem of optimal component composition.
The problem of optimal component composition is to map a function graph into a component graph in the distributed stream processing system based on resource and QoS constraints and an optimization objective function. The function graph 190 illustrated
Thus, the optimal component composition problem can be formally defined as follows: Given a distributed stream processing system G=(V,E) where V denotes the set of |V| stream processing nodes (vi) and E denotes the set of |E| overlay links (ei). Given a stream processing request denotes the set of |C| stream processing components (ci) and L denotes the set of |L| inter-component virtual links (li) such that
Generally speaking, the optimal component composition problem is a multi-constrained optimization problem, which optimizes a global system metric subject to a set of constraints. Herein, the optimization goal is to minimize the congestion aggregation metric φ(λ) defined by Equation 1 for balanced load distribution. The smaller the congestion aggregation metric is, the better the load distribution of the composition, since the stream processing application is instantiated on nodes and virtual links with larger residual resources. The constraints include the function requirements of the user, and the QoS and resource constraints for the composed stream processing application. Equation 2 specifies that the component graph λ should provide all stream processing functions specified in the function graph ξ. Equation 3 specifies that the QoS of the composed stream processing application [qlλ, . . . , qmλ] (e.g., processing time, loss rate) should satisfy the user QoS requirements [qlreq, . . . , qmreq]. Equation 4 and Equation 5 specify that the user-required system resources and bandwidth resources should be satisfied (i.e., residual resources should not be negative).
A general description of the present invention will now be provided, followed by more detailed descriptions of various aspects thereof.
Herein, an optimal component composition system and method are provided that use adaptive composition probing (ACP). ACP can efficiently discover a set of “good” candidate component compositions among which the best composition is selected. ACP addresses two key decision-making problems: (1) how many candidate components to probe; and (2) which candidate components to probe. To accommodate dynamic stream processing systems, ACP adaptively adjusts the probed component number to maintain target composition performance with minimal probing overhead. For scalability, ACP selects good candidate components to probe under the guidance of a coarse-grain global state.
A description will now be given regarding system Application Programming Interfaces (APIs).
The user can specify the stream processing request in terms of: (1) function requirements described by a function graph (ξ); (2) QoS requirements (Qreq); and (3) resource requirements (Rreq). The function graph 190 illustrated in
The first interface, namely sessionId=Find (ξ,Qreq,Rreq), invokes the optimal component composition algorithm to find the best component graph λ. If the composition is successful, then the middleware creates a session record with a session identifier (sessionId) for the user request. Otherwise, a null sessionId is returned to indicate composition failure.
The second interface, namely Process (sessionId, data streams), starts the continuous data stream processing using the composed component graph. The middleware can map the session identifier to the component graph composed by the previous step.
The third interface, namely Close (sessionId), tears down the stream processing session when the application finishes its task. The corresponding session information is deleted from the session table.
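The three interfaces above can be pictured with the following minimal Python sketch. It is illustrative only: the class name `CompositionMiddleware`, the `composer` callback standing in for the composition algorithm, and the in-memory dictionary used as the session table are assumptions and are not part of the described system.

```python
import itertools
from typing import Callable, Dict, Iterable, List, Optional


class CompositionMiddleware:
    """Hypothetical in-memory stand-in for the session-handling middleware."""

    def __init__(self, composer: Callable[[object, object, object], Optional[object]]):
        # `composer` stands in for the optimal component composition algorithm.
        # It returns a component graph, or None when composition fails.
        self._composer = composer
        self._sessions: Dict[int, object] = {}  # the session table
        self._ids = itertools.count(1)

    def find(self, function_graph, q_req, r_req) -> Optional[int]:
        graph = self._composer(function_graph, q_req, r_req)
        if graph is None:
            return None  # a null sessionId signals composition failure
        session_id = next(self._ids)
        self._sessions[session_id] = graph
        return session_id

    def process(self, session_id: int, data_streams: Iterable) -> List:
        graph = self._sessions[session_id]  # map sessionId to its component graph
        # Placeholder: a real system would push each data unit through the graph.
        return [(graph, unit) for unit in data_streams]

    def close(self, session_id: int) -> None:
        self._sessions.pop(session_id, None)  # delete the session record
```

A caller would invoke `find` once, pass the returned identifier to `process` for as long as data arrives, and finally call `close`.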
A description will now be given regarding an overview of adaptive composition probing (ACP).
The basic idea of the ACP approach is to use a number of probing messages, called probes, to dynamically discover a set of good candidate compositions among which the best composition is selected. The probing process concurrently examines different compositions and collects precise state information from good candidate components. For scalability, ACP avoids brute-force exhaustive probing by performing adaptive selective probing. ACP addresses two key decision-making problems in composition probing: (1) how many candidate components should be probed and (2) which candidate components should be probed.
Intuitively, the more candidate components that are probed, the better the component composition that can be discovered. However, the probing overhead also grows as more candidate components are probed. A probing ratio α∈(0,1] is introduced to define the fraction of the candidate components probed for each function. For example, if there are ten candidate components for the function Fi and the probing ratio α=0.3, then 0.3×10=3 candidate components can be probed. ACP adaptively adjusts the probing ratio based on the target composition performance and current system conditions, which are described further herein.
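As a small worked sketch of this arithmetic, the function below computes the number of candidates to probe for one function. The function name and the choice to pass the ratio as a decimal string are assumptions made for illustration. Rounding up follows the probe-count definition given later in this description, and exact rational arithmetic is used so that a product such as 0.3×10 is never pushed above 3 by binary floating-point error.

```python
import math
from fractions import Fraction


def probes_per_function(alpha: str, candidate_count: int) -> int:
    """Return ceil(alpha * candidate_count) for a probing ratio in (0, 1].

    `alpha` is given as a decimal string (e.g. "0.3") so that it can be
    converted to an exact fraction.
    """
    ratio = Fraction(alpha)
    if not (0 < ratio <= 1):
        raise ValueError("probing ratio must lie in (0, 1]")
    return math.ceil(ratio * candidate_count)


# Ten candidates at a probing ratio of 0.3 gives three probes.
print(probes_per_function("0.3", 10))
```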
Next, it is to be decided which candidate components to probe for maximizing the probability of finding the best composition. A hierarchical state management scheme is described herein to assist ACP in optimal component composition. The hierarchical state management maintains precise local state at each node and a coarse-grain global state, which is described further herein. Hence, ACP can select good components to probe under the guidance of the coarse-grain global state and select optimal component composition based on precise states collected by the probes. The candidate component selection is described further herein.
A description will now be given regarding ACP protocol.
When a stream processing request is submitted, the request is redirected to a node that is closest to the client based on a predefined proximity metric (e.g., geographical location, and so forth). The selected node, called the deputy node 202 (see
Regarding step 1 (initialization) in
Regarding step 2 (per-hop probe processing) in
Regarding step 3 (optimal composition selection) in
The qualified composition with the smallest φ(λ) value is the optimal component composition. If two components are located on the same node (e.g., c1 and c2), then the residual bandwidth is set to ∞, since the virtual link between two co-located components does not consume any network bandwidth.
Regarding step 4 (application session setup), the deputy node 202 finally establishes the stream processing application session by sending confirmation messages along the selected component composition. The confirmation message makes transient resource allocation permanent on the selected nodes and virtual links. If no qualified component composition can be found, then the deputy node 202 returns a failure message.
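The selection performed by the deputy node in steps 3 and 4 can be sketched as follows. This is a simplified illustration under stated assumptions: each returned probe is summarized by a hypothetical `ProbedComposition` record, the congestion aggregation metric is assumed to have been computed already from the collected states, and every QoS metric is assumed to be of the smaller-is-better kind (such as processing time or loss rate).

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ProbedComposition:
    components: List[str]
    qos: Sequence[float]   # accumulated QoS values collected by the probe
    congestion: float      # precomputed congestion aggregation metric


def select_composition(
    probed: Sequence[ProbedComposition], qos_req: Sequence[float]
) -> Optional[ProbedComposition]:
    """Pick the qualified composition with the smallest congestion metric."""
    qualified = [
        p for p in probed
        if all(q <= req for q, req in zip(p.qos, qos_req))
    ]
    if not qualified:
        return None  # the deputy node would return a failure message
    return min(qualified, key=lambda p: p.congestion)
```

Returning `None` corresponds to the failure case in which no qualified composition is found.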
A description will now be given regarding hierarchical state management.
The hierarchical state management includes fine-grain local state update and coarse-grain global state maintenance. The local state of a node includes the QoS/resource states of its neighbor nodes in the overlay mesh, and its adjacent overlay links. Each node keeps its local state with high precision using frequent proactive measurement at short time intervals (e.g., 10 seconds, although other time intervals may be used). For scalability, the precise local state is not disseminated to other nodes.
The global state includes: (1) the QoS and resource states of all nodes; and (2) the QoS and resource states of all virtual links between all pairs of nodes. Since each node can provide multiple stream processing components, the QoS states of each node include the QoS states of all the stream processing components it provides. For scalability, the global state update is performed at a coarse-grain level. The global state update is triggered only when the change in the QoS or resource state of a node or an overlay link exceeds a pre-specified (e.g., large) threshold. Thus, many insignificant state variations are filtered out to reduce the global state maintenance overhead. For example, a node updates its available memory state in the global state only when the available memory variation is larger than 100 KB or some other pre-specified amount. Similarly, a node updates the available bandwidth of its adjacent overlay link in the global state only when the available bandwidth variation is more than 200 kbps or some other pre-specified amount. It is to be appreciated that the preceding parameters are merely illustrative and, thus, other values may be used for available memory variation and available bandwidth variation, while maintaining the scope of the present invention.
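The threshold-triggered update can be illustrated with the short sketch below. The class name `CoarseGrainReporter` and the `publish` callback are hypothetical, and the sketch assumes that the variation is measured against the last value written to the global state, which the description does not state explicitly.

```python
from typing import Callable, Optional


class CoarseGrainReporter:
    """Publishes a metric to the global state only on significant change."""

    def __init__(self, threshold: float, publish: Callable[[float], None]):
        self._threshold = threshold
        self._publish = publish
        self._last_published: Optional[float] = None

    def observe(self, value: float) -> bool:
        """Record a fresh local measurement; return True if it was published."""
        if (
            self._last_published is None
            or abs(value - self._last_published) > self._threshold
        ):
            self._publish(value)
            self._last_published = value
            return True
        return False  # insignificant variation, filtered out


# Available memory in KB with the illustrative 100 KB threshold.
published = []
memory = CoarseGrainReporter(threshold=100, publish=published.append)
for sample in (1000, 1050, 1101, 1150):
    memory.observe(sample)
```

With these four samples only the first and third measurements reach the global state; the other two differ from the last published value by less than the threshold.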
One complication in the global state update is that each virtual link is an overlay path that includes several overlay links. Thus, the virtual link state update requires further aggregation from the states of its constituent overlay links. For example, suppose it is desired to update the available bandwidth of a virtual link (li) in the global state, where li is mapped to an overlay path li=<e1, . . . , ek>. Then, the available bandwidth of li is calculated from the available bandwidth of its constituent overlay links: $ba^{l_i}=\min(ba^{e_1}, \ldots, ba^{e_k})$, i.e., the available bandwidth of the most constrained overlay link on the path.
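A minimal sketch of this aggregation follows. It assumes that the available bandwidth of an overlay path is limited by its most constrained (bottleneck) overlay link; the function and parameter names are hypothetical.

```python
from typing import Mapping, Sequence


def virtual_link_bandwidth(
    overlay_path: Sequence[str], link_bandwidth_kbps: Mapping[str, float]
) -> float:
    """Aggregate per-link available bandwidth into a virtual-link value.

    Assumption: the path can carry no more than its bottleneck link.
    """
    if not overlay_path:
        raise ValueError("an overlay path needs at least one overlay link")
    return min(link_bandwidth_kbps[e] for e in overlay_path)


links = {"e1": 800.0, "e2": 350.0, "e3": 600.0}
print(virtual_link_bandwidth(["e1", "e2", "e3"], links))
```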
The distributed hash table (DHT) system is used to store the global state for scalability and failure resilience. Thus, the global state storage load and query computation load can be distributed among all nodes. The global state hash table records the coarse-grain QoS/resource states of all nodes and virtual links in the form of <node identifier, coarse-grain QoS/resource states> and <link identifier, coarse-grain QoS/resource states>. If a node wants to acquire the coarse-grain QoS/resource states about a specific node or a virtual link, then the node can query the global state hash table using the node identifier or link identifier. The DHT system can evenly partition the global state hash table and distribute hash table entries among all nodes.
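The partitioning idea can be sketched with a toy, single-process table. This is not a real DHT: there is no routing, replication, or node churn, and the simple hash-modulo placement is a stand-in for whatever partitioning the DHT system actually uses. The class name `GlobalStateTable` and the key format are assumptions.

```python
import hashlib
from typing import Dict, List, Optional


class GlobalStateTable:
    """Toy hash-partitioned store for coarse-grain node and link states."""

    def __init__(self, node_ids: List[str]):
        self._nodes = sorted(node_ids)
        # One shard per node; in a real DHT each shard lives on that node.
        self._shards: Dict[str, Dict[str, dict]] = {n: {} for n in self._nodes}

    def owner(self, key: str) -> str:
        """Map a node or link identifier to the node storing its entry."""
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self._nodes[int.from_bytes(digest, "big") % len(self._nodes)]

    def put(self, key: str, state: dict) -> None:
        self._shards[self.owner(key)][key] = state

    def get(self, key: str) -> Optional[dict]:
        return self._shards[self.owner(key)].get(key)


table = GlobalStateTable(["v1", "v2", "v3"])
table.put("node:v2", {"mem_kb": 2048})
table.put("link:v1-v3", {"bw_kbps": 350})
```

Any node can then look up the coarse-grain state for `"node:v2"` or `"link:v1-v3"` by hashing the same identifier.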
A description will now be given regarding adaptive probe number selection.
The approach for deciding the number of probes used for each request will now be described. The concept of probing ratio will now be defined. If a function Fi has ki candidate components and the probing ratio is α, then ACP will use ⌈α·ki⌉ probes to examine ⌈α·ki⌉ candidate components for the function Fi. Intuitively, the larger the probing ratio is, the better the composition performance that ACP can achieve, since more candidate compositions are examined. However, a larger probing ratio also means larger probing overhead, since more probes are generated. Thus, the probing ratio represents a tuning knob to control the trade-off between composition performance and probing overhead. Composition success rate μ(t) is used to characterize the composition performance of the system at sampling time (t). The composition success rate is calculated as follows:
μ(t) = SuccessNum(t) / RequestNum(t)
where SuccessNum(t) denotes the number of successful compositions during the last sampling period [t-Δt,t] and RequestNum(t) denotes the total number of composition requests during [t-Δt,t]. A higher composition success rate μ(t) indicates better optimal component composition performance at time t. The probing ratio value at time t is denoted by α(t).
Ideally, ACP should always use the minimal probing ratio α(t) for achieving a target composition success rate μ(t) under different system conditions (e.g., workload, QoS/resource requirements). For example,
To address the problem, ACP performs on-line profiling to dynamically derive the current mapping function from probing ratio to composition success rate. Based on the profiling results, ACP can predict the minimal probing ratio α(t) given a target success rate μ(t). It is presumed that the target success rate is achievable. ACP stops increasing the probing ratio if the probing overhead already reaches its limit. The on-line profiling is triggered when the prediction error exceeds a certain threshold δ (e.g., δ=2%), which means that the system conditions have changed. For example, if the target success rate is μ(t), then ACP predicts the minimal probing ratio α(t) based on the current profiling results. At the end of this sampling period, ACP gets the measured success rate μ′(t). If |μ(t)−μ′(t)|>δ, then profiling is triggered to derive the new mapping function from the probing ratio to the composition success rate.
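The control loop described above can be sketched as three small functions. This is an illustration under assumptions: the profiling result is represented as a list of (probing ratio, measured success rate) pairs, the overhead limit is modeled as a hypothetical `max_alpha` cap, and a sampling period with no requests is reported as a success rate of 0.0. None of these representations is prescribed by the description.

```python
from typing import Optional, Sequence, Tuple


def success_rate(success_num: int, request_num: int) -> float:
    """Composition success rate over one sampling period."""
    # Assumption: a period with no requests is reported as 0.0.
    return success_num / request_num if request_num else 0.0


def minimal_probing_ratio(
    profile: Sequence[Tuple[float, float]], target: float, max_alpha: float = 1.0
) -> Optional[float]:
    """Smallest profiled ratio whose success rate meets the target."""
    feasible = [a for a, mu in profile if mu >= target and a <= max_alpha]
    return min(feasible) if feasible else None


def needs_reprofiling(target: float, measured: float, delta: float = 0.02) -> bool:
    """True when the prediction error exceeds the threshold delta."""
    return abs(target - measured) > delta


profile = [(0.1, 0.62), (0.2, 0.81), (0.3, 0.93), (0.4, 0.95)]
alpha = minimal_probing_ratio(profile, target=0.90)
```

With this sample profile, a target of 0.90 is first met at a probing ratio of 0.3; if the success rate measured at the end of the period then deviates from the target by more than δ, profiling would be repeated.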
Since ACP is highly efficient, the success rate increases very quickly and reaches the saturation value (i.e., the highest achievable success rate) at a small probing ratio, illustrated by
A description will now be given regarding per-hop component selection.
After the deputy node decides the probing ratio, the deputy node executes the ACP protocol to send out composition probes. When a node vi receives a probe, the node vi needs to decide which next-hop candidate components to examine under the probing ratio constraint. For example, if the next-hop function Fi has ki candidate components and the probing ratio is α, then the node vi is allowed to probe M=⌈α·ki⌉ candidate components for Fi. Under the guidance of the coarse-grain global state, vi selects the M best candidate components as described in the following three steps.
In the first step, vi queries the global state to retrieve the coarse-grain QoS/resource states of all candidate components c1, . . . , ck and virtual links l1, . . . , lk to the candidate components. The QoS states of the candidate component ci and the virtual link li are described by [qlc
In the second step, vi filters out unqualified candidate components by checking the input/output stream rate compatibility between the current-hop component and candidate next-hop components. Then, vi further removes unqualified candidate components according to the QoS/resource requirements of the user and the state information retrieved from the global state. The user's QoS requirements for the composed stream processing application are denoted using [qlreq, . . . , qmreq]. It is presumed that the accumulated QoS values of the probed partial component composition are qlξ, . . . , qmξ. A candidate component ci is unqualified if any of the following inequalities is true:
qiλ+qic
raic
bal
Equation 6 means that the user's QoS constraints cannot be satisfied. Equation 7 means that the candidate component ci cannot meet the end-system resource requirements. Equation 8 means the virtual link to ci cannot meet the bandwidth requirement.
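The three checks can be sketched as follows. Because the inequalities themselves are incomplete in this text, the exact form used here is an assumption: QoS metrics are treated as additive and smaller-is-better, a candidate is kept when the accumulated QoS plus the component and virtual-link QoS stays within each requirement, and available resources and bandwidth are compared directly with the required amounts. The `Candidate` record and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    name: str
    qos: Sequence[float]              # coarse-grain QoS of the component
    link_qos: Sequence[float]         # coarse-grain QoS of the virtual link to it
    avail_resources: Sequence[float]  # available end-system resources
    link_avail_bandwidth: float       # available bandwidth of the virtual link


def is_qualified(c, accumulated_qos, qos_req, res_req, bw_req) -> bool:
    # QoS check: accumulated value plus this hop must stay within the requirement.
    qos_ok = all(
        acc + qc + ql <= req
        for acc, qc, ql, req in zip(accumulated_qos, c.qos, c.link_qos, qos_req)
    )
    # End-system resource check.
    res_ok = all(avail >= need for avail, need in zip(c.avail_resources, res_req))
    # Bandwidth check on the virtual link to the candidate.
    bw_ok = c.link_avail_bandwidth >= bw_req
    return qos_ok and res_ok and bw_ok


def filter_qualified(cands, accumulated_qos, qos_req, res_req, bw_req) -> List[Candidate]:
    return [c for c in cands if is_qualified(c, accumulated_qos, qos_req, res_req, bw_req)]
```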
In the third step, vi further selects good candidate components from the above derived qualified components. It is presumed that vi finds Z qualified candidate components. If Z≤M, then vi can probe all the qualified candidate components while satisfying the probing ratio constraint. Otherwise, vi needs to select the M best qualified components from the Z qualified ones. The virtual link from the current component to ci is denoted using li. To meet the multi-constrained QoS requirements [qlreq, . . . , qmreq], a risk function D(ci) is defined for a candidate component ci as follows:
The larger the ratio value
is, the more closely the QoS accumulation (qiξ+qic
A component with a smaller W(ci) value is a better candidate to probe since it is less loaded. The candidate component selection approach described above essentially ensures that probes traverse along “good” candidate compositions. The fine-grain QoS/resource states collected by the probes will be used to select the best component composition. Thus, ACP can efficiently find the optimal component composition although ACP only examines a subset of all candidate compositions.
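The final selection step can be sketched as a top-M choice by ascending score. The scoring function is left as a caller-supplied parameter (hypothetically named `load_score`) because the definitions of D(ci) and W(ci) are not fully recoverable from this text; only the rule that a smaller value marks a better candidate is taken from the description.

```python
import heapq
from typing import Callable, List, Sequence


def select_candidates_to_probe(
    qualified: Sequence[str], load_score: Callable[[str], float], m: int
) -> List[str]:
    """Return at most m qualified candidates, preferring smaller scores."""
    if len(qualified) <= m:
        return list(qualified)  # Z <= M: probe every qualified candidate
    return heapq.nsmallest(m, qualified, key=load_score)


scores = {"c1": 0.7, "c2": 0.2, "c3": 0.5, "c4": 0.9}
chosen = select_candidates_to_probe(list(scores), scores.get, m=2)
```

Here four candidates qualify but only two may be probed, so the two with the smallest scores are chosen.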
These and other features and advantages of the invention may be readily ascertained by one of ordinary skill in the pertinent art based on the teachings herein. It is to be understood that the teachings of the invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof.
Most preferably, the teachings of the present invention are implemented as a combination of hardware and software. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPU”), a random access memory (“RAM”), and input/output (“I/O”) interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.
It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the present invention is programmed. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the present invention.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the invention. All such changes and modifications are intended to be included within the scope of the invention as set forth in the appended claims.
This application is a continuation of U.S. application Ser. No. 11/068,785, filed on Mar. 1, 2005 now U.S. Pat. No. 7,562,355, which is incorporated by reference herein in its entirety.
This invention was made with Government support under Contract No.: TIA H98230-04-3-0001 awarded by the U.S. Dept. of Defense. The Government has certain rights in this invention.
Number | Date | Country | |
---|---|---|---|
20080188987 A1 | Aug 2008 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 11068785 | Mar 2005 | US |
Child | 12061284 | US |