In a publish-subscribe system, a publisher running on a computer system sends out one or more messages when a specified event occurs, and a subscriber running on another computer system, or perhaps the same computer system as the publisher, receives published messages to which it has subscribed. A message sent by the publisher in response to some specific event is referred to as a published event. When a subscriber initiates a subscription, it sends out a message to indicate what type of published events it is subscribing to. This type of message sent by the subscriber is referred to as a subscription event.
A simple example of a publish-subscribe system may be given in the context of providing timely stock quotes for publicly traded corporations. Suppose publisher P is to generate a published event whenever the stock price of corporation A, B, or C has changed, where the published event gives the particular corporate name and its current stock price. Suppose subscriber S has subscribed to any published event for corporation B that indicates a change in its stock price. Then, when the event of a new stock price for corporation B occurs, a published event is sent by publisher P and is received by subscriber S. Other examples include news feeds, and auction and trading systems, to name just two.
Publish-subscribe technology allows processes to communicate with each other asynchronously across multiple machines, as well as between multiple executing processes running on the same machine. It is an asynchronous paradigm because there need not be any synchronization process set up between a publisher and a subscriber. This may be desirable for enterprise web applications. For example, a web farm may be viewed as a virtualized system, where system resources are shared among one or more processes. As a particular example, it may be useful for a cache to span multiple applications running in the web farm, and to span multiple machines in the web farm. A distributed cache is an example where a publish-subscribe event system may be of utility to synchronize the contents of the multiple caches across machines.
Software applications are also becoming more virtualized, meaning that the front-end part of applications may be dynamically instantiated across many servers, so as to scale in order to support usage demands. In a web farm, for example, there may be front-end components residing on web servers that send published events to a back-end component to correlate and process events. As a specific example, the behavior of a dynamic web site may be driven by a back-end recommendation system. In this case, the front-end web components send published events to the back-end recommendation engine which provides the recommendations back to the front-end components.
Another common scenario for web applications is metric collection. In this case, it is desirable to track all user activity at a web site. The activity metrics feed into a so-called business intelligence system, and are used for web site personalization, campaign management, and web site improvements.
A publish-subscribe system may be useful in the above discussed scenarios. Often, a publish-subscribe system is implemented by utilizing a broker to handle subscriptions and to deliver published events to the appropriate subscribers, where a broker is an intermediary program. However, the use of a broker in some instances may not provide high enough performance, and may introduce unacceptable latency.
An embodiment implements a publish-subscribe event process on a machine, where a machine may be a router, for example. According to an embodiment, when a subscription event is received at a machine, or initiated by the machine, the subscription event is published locally in the machine by an inter-process communication in the machine, so that the subscription event is made available to other processes on the machine. This inter-process communication may be implemented as shared memory in some embodiments, so that a subscription event may be published locally by placing it in shared memory.
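As a minimal sketch of publishing locally by placing an event in shared memory, the following uses Python's standard `multiprocessing.shared_memory` module. The function names `publish_locally` and `read_local`, the JSON encoding, and the 4-byte length header are assumptions made for illustration only; the description does not specify how an event is laid out in shared memory.

```python
import json
import struct
from multiprocessing import shared_memory

# Assumed layout: a 4-byte little-endian length, followed by a JSON payload.
HEADER = struct.Struct("<I")


def publish_locally(event: dict, size: int = 1024) -> shared_memory.SharedMemory:
    """Place an event in a new shared memory block and return the block."""
    data = json.dumps(event).encode("utf-8")
    if HEADER.size + len(data) > size:
        raise ValueError("event does not fit in the shared memory block")
    shm = shared_memory.SharedMemory(create=True, size=size)
    shm.buf[:HEADER.size] = HEADER.pack(len(data))
    shm.buf[HEADER.size:HEADER.size + len(data)] = data
    return shm


def read_local(name: str) -> dict:
    """Attach to an existing block by name and decode the event stored in it."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        (length,) = HEADER.unpack(bytes(shm.buf[:HEADER.size]))
        payload = bytes(shm.buf[HEADER.size:HEADER.size + length])
        return json.loads(payload.decode("utf-8"))
    finally:
        shm.close()
```

Any process on the same machine that knows the block name could call `read_local`; the creator remains responsible for calling `close()` and `unlink()` on the block when it is no longer needed.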
A subscription event identifies published events to which it is subscribing by using an event type, referred to as a subscribed-to event type. If the subscription event was received over a communication channel, a routing table is updated to associate the subscribed-to event type of the subscription event with the communication channel over which it was received, and the subscription event is forwarded over other communication channels, excluding the communication channel over which it was received. If the subscription event was initiated by the machine itself, it is forwarded over all communication channels unless the subscription is marked as local-only.
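The handling just described may be sketched as follows. This is an illustrative model only: the class names `SubscriptionEvent` and `Machine`, the use of strings to name channels, and the list standing in for shared memory are all assumptions, not part of the described embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class SubscriptionEvent:
    subscribed_to: str        # the subscribed-to event type
    local_only: bool = False  # if set, a locally initiated event is not forwarded


@dataclass
class Machine:
    channels: List[str]  # hypothetical names for the communication channels
    routing_table: Dict[str, Set[str]] = field(default_factory=dict)
    local_store: List[SubscriptionEvent] = field(default_factory=list)

    def handle_subscription(
        self, event: SubscriptionEvent, received_on: Optional[str] = None
    ) -> List[str]:
        """Publish the event locally and return the channels it is forwarded over."""
        self.local_store.append(event)  # stands in for local publication
        if received_on is not None:
            # Received from another machine: record the channel, then forward
            # over every other channel.
            self.routing_table.setdefault(event.subscribed_to, set()).add(received_on)
            return [c for c in self.channels if c != received_on]
        # Initiated on this machine: forward everywhere unless local-only.
        return [] if event.local_only else list(self.channels)
```

For example, a machine with channels `parent`, `childA`, and `childB` that receives a subscription on `childA` records `childA` against the subscribed-to event type and forwards the subscription over `parent` and `childB`.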
According to an embodiment, when a published event is received at the machine, it is published locally by the inter-process communication in the machine so that all subscribing applications running on the machine that subscribe to the published event may have access to it. The published event is forwarded to other machines according to the routing table built up during the subscription process.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In the description that follows, the scope of the term “some embodiments” is not to be so limited as to mean more than one embodiment, but rather, the scope may include one embodiment, more than one embodiment, or perhaps all embodiments.
In describing the embodiments, it is pedagogically useful to refer back and forth to two drawings, one illustrating a system of modules running on a single machine (
A system of modules on a single machine according to an embodiment of the present invention is illustrated in
A machine may be a general purpose or special purpose computer system. A router is an example of a machine, in which a computer system is optimized in some sense for routing. Because routing tables are maintained on a machine (to be discussed later), it is sometimes convenient to refer to a machine as a router, but it should be understood that this is done merely for convenience, and that a machine may or may not be optimized for routing.
The various modules illustrated in
Event Router Process 114 includes a number of modules: Receiver 104, Forwarder 106, Republisher 108, Listener 110, Subscriptions 112, Persister 116, and Memory 118. Some embodiments may not include Persister 116 and Memory 118. There are also queues, as indicated in
Event Router Process 114 is responsible for routing events, whether subscription events or published events. Published events are routed according to routing tables managed by Subscriptions 112, and subscription events are routed according to a subscription process, unless the subscription event is marked local-only. In the Background, the term “event” denoted the actual occurrence that triggers the publication of a published event. In describing the embodiments, the term “event” may instead mean either a published event or a subscription event. It will be clear from context how the term “event” should be interpreted.
Receiver 104 and Forwarder 106 are the communication interfaces to other machines (e.g., routers) in communication with the machine of
The lines connecting the various routers in
Various protocols may be used for the communication channels, and they may follow a connection-oriented paradigm or a connectionless paradigm. For example, UDP/IP (User Datagram Protocol over Internet Protocol) or TCP/IP may be used. In setting up a communication channel, sockets (e.g., UDP or TCP sockets) are set up between the communicating routers. For some embodiments, these sockets are kept open for the duration of the publish-subscribe event process to improve performance and reduce latency. Generally, a router is said to be connected to another router if there is a communication channel set up between the two routers.
Every event has a unique identifier, which may be termed an event type. An event type is a GUID (Globally Unique Identifier) that allows any creator of an event, whether a published event or a subscription event, to define their own event type without conflicting with event types created by other publishers or subscribers. A subscription event also has a separate property to define the event type to which it subscribes. Such an event type will be referred to as a subscribed-to event type.
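One possible way to model these two kinds of events is sketched below, using Python's standard `uuid` module to generate GUIDs. The class names, field names, and the example payload are hypothetical and chosen only to mirror the terminology above.

```python
import uuid
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PublishedEvent:
    event_type: uuid.UUID  # GUID chosen by whoever defines the event
    payload: Any = None


@dataclass(frozen=True)
class SubscriptionEvent:
    event_type: uuid.UUID     # the subscription event's own event type
    subscribed_to: uuid.UUID  # the subscribed-to event type
    local_only: bool = False


# Each creator generates its own GUIDs, so no central registry is needed.
STOCK_QUOTE = uuid.uuid4()
SUBSCRIBE = uuid.uuid4()

# Illustrative values only.
quote = PublishedEvent(STOCK_QUOTE, {"corporation": "B", "price": 101.5})
subscription = SubscriptionEvent(SUBSCRIBE, subscribed_to=STOCK_QUOTE)
```

Note that the subscription event carries two GUIDs: its own event type, and the separate subscribed-to event type that is later matched against published events.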
Subscriptions are handled by propagating subscription events throughout the topology of routers. Routing tables are built and maintained dynamically as subscription events are received. Once a subscription event is published locally, Listener 110 forwards the subscription event to Subscriptions 112 to update the routing table stored in the local machine by associating the subscribed-to event type with the communication channel (e.g., TCP socket) of the router from which it came.
For example, if router 202A receives a subscription event from one of its children, say router 204B, then Subscriptions 112 within router 202A updates the routing table in router 202A to associate the GUID of the published event that the received subscription event is subscribing to (the subscribed-to event type or GUID) with the TCP socket by which router 204B communicates with router 202A. Because router 202A also has a parent, Listener 110 hands off the subscription event to Forwarder 106 to send the subscription event to its parent, router 200. Router 200 then handles the subscription event in similar fashion, updating its routing table to associate the subscribed-to GUID with the TCP socket connected to the child that sent it the subscription event, namely router 202A. Note that other embodiments may utilize a protocol other than TCP.
Subscription events propagate from child to parent, and from parent to other children. When router 202A receives a subscription event from its child, router 204B, it also sends the subscription event to its other child, router 204A, so that router 204A can update its routing table. When router 200 receives the subscription event from its child, router 202A, it also sends the subscription event to its other children, routers 202B through 202E. Router 202D, because it is a parent, also sends the subscription event to its child, router 204C. More generally stated, each router that receives a subscription event from its child sends that subscription event to its other children, if any, as well as to its parent if it has one; and each router that receives a subscription event from its parent also sends that subscription event to all of its children, if any. In this way, routing tables in both parents and children are updated.
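The propagation rule may be simulated with a short sketch. The topology below is reconstructed from the router names used in the text (router 200 as the root, routers 202A through 202E as its children, and so on), and a channel is represented simply by the name of the neighboring router; both are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set

# Parent -> children, reconstructed from the router names in the text.
CHILDREN: Dict[str, List[str]] = {
    "200": ["202A", "202B", "202C", "202D", "202E"],
    "202A": ["204A", "204B"],
    "202D": ["204C"],
}


def neighbors(children: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build, for each router, the list of routers it is connected to."""
    adj: Dict[str, List[str]] = {}
    for parent, kids in children.items():
        for kid in kids:
            adj.setdefault(parent, []).append(kid)
            adj.setdefault(kid, []).append(parent)
    return adj


def propagate(
    children: Dict[str, List[str]], origin: str, event_type: str
) -> Dict[str, Dict[str, Set[str]]]:
    """Send a subscription from `origin` and return every router's routing table."""
    adj = neighbors(children)
    tables: Dict[str, Dict[str, Set[str]]] = {router: {} for router in adj}
    queue = deque((n, origin) for n in adj[origin])  # (receiver, sender) pairs
    while queue:
        router, sender = queue.popleft()
        # Associate the subscribed-to event type with the channel it arrived on.
        tables[router].setdefault(event_type, set()).add(sender)
        # Forward to parent and children, but never back to the sender.
        for n in adj[router]:
            if n != sender:
                queue.append((n, router))
    return tables
```

Calling `propagate(CHILDREN, "204B", "T")` leaves router 202A pointing at 204B, router 200 pointing at 202A, and every other router pointing at the neighbor that leads back toward 204B. Because the topology is a tree, each router receives the subscription exactly once.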
When a published event is received and has been published locally, Event Router Process 114 routes this published event according to its routing table, but does not send the published event back to the router from which it came. In this way, published events are routed to the appropriate subscribers according to the routing tables created during the subscription process.
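A minimal sketch of this forwarding decision follows. The function name `forward_targets` and the representation of the routing table as a mapping from event type to a set of channel names are assumptions for illustration.

```python
from typing import Dict, List, Optional, Set


def forward_targets(
    routing_table: Dict[str, Set[str]],
    event_type: str,
    received_on: Optional[str] = None,
) -> List[str]:
    """Return the channels over which a published event should be forwarded.

    `received_on` is the channel the event arrived on, or None if the event
    was published by an application on this machine.
    """
    channels = routing_table.get(event_type, set())
    # Never send the event back over the channel it came from.
    return sorted(c for c in channels if c != received_on)


# Hypothetical routing table built up by earlier subscription events.
table = {"quote-B": {"parent", "childA"}}
targets = forward_targets(table, "quote-B", received_on="parent")
```

An event type with no entry in the routing table yields an empty list, so a published event that nobody has subscribed to is not forwarded at all.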
Returning to
To initiate a publication (that is, a published event) on a router, Publication Application 124 uses Publish Manager 126 to publish the published event locally. (That is, the published event is made available to other modules on the router by using Inter-Process Communication 102.) If Listener 110 determines that the published event has an event type (GUID) in the routing table managed by Subscriptions 112, then Forwarder 106 sends the published event to those TCP sockets (or other types of sockets if a protocol other than TCP is used) in the routing table that match the event type.
Listener 110 may also make available events (published events or subscription events) to Persister 116 for the purpose of maintaining various logs, where such logs are stored in Memory 118. As mentioned previously, some embodiments may not utilize Persister 116 and Memory 118, so that logs are not kept for some embodiments.
Subscription Manager 128 keeps track of event types of interest to Subscribing Application 130. When a published event is published locally, whether by Publish Manager 126 or Republisher 108, Subscription Manager 128 hands off to Subscribing Application 130 the event if it has been subscribed to by Subscribing Application 130.
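The hand-off performed by a subscription manager may be sketched as a simple dispatch table. The class below is a hypothetical stand-in, not the module described above: the method names and the use of callbacks to represent a subscribing application are assumptions.

```python
from typing import Any, Callable, Dict, List


class SubscriptionManager:
    """Tracks event types of interest and hands matching events to subscribers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Any], None]]] = {}

    def subscribe(self, event_type: str, handler: Callable[[Any], None]) -> None:
        """Register interest in an event type on behalf of an application."""
        self._handlers.setdefault(event_type, []).append(handler)

    def on_local_publish(self, event_type: str, payload: Any) -> int:
        """Called for each locally published event; returns how many handlers ran."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


received: List[Any] = []
manager = SubscriptionManager()
manager.subscribe("quote-B", received.append)
delivered = manager.on_local_publish("quote-B", {"corporation": "B"})
ignored = manager.on_local_publish("quote-A", {"corporation": "A"})
```

The manager does not care whether the event was published locally by an application on the same machine or republished after arriving from another router; it only matches on event type.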
To initiate a subscription process, that is, when there is a new subscriber, Subscription Manager 128 causes an instantiation of a Publish Manager, and this instantiation publishes the subscription event locally by using Inter-Process Communication 102. If the newly initiated subscription event is not marked as local-only, then Event Router Process 114 will route the subscription event to the parent, if there is one, and also to all children, if any. Thereafter, the subscription process follows that of the previous discussion with respect to routers that receive subscription events from other routers.
From the above description, it is seen that by using Inter-Process Communication 102 in a router to publish locally, for example by putting a published event into physical memory that is shared by the various modules, subscribing applications running on the router will be able to retrieve those published events to which they have subscribed. Published events will also be sent to other routers according to the routing table kept in the router, so that other subscribing applications running on other routers may have access to the published events.
Other types of router topologies may be used. One such example is a completely connected mesh in which each router has a network connection with all other routers. A simple example is illustrated in
The above description regarding the routing process may be summarized in the flow diagram of
In block 407, Subscription Manager (128) listens to all published events, and any event that is subscribed to by a subscribing application is passed on to that subscribing application. In block 408, the routing table is updated (if needed) if the event is a received subscription event. In block 410, a subscription event is forwarded to all connected routers if it is not marked as local-only, except that a subscription event is not forwarded to the router that sent it. In block 412, a published event is forwarded according to the routing table.
Instructions stored in Memory (computer readable media) 506 cause the computer system of
Although the subject matter has been described in language specific to structural features and methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. Accordingly, various modifications may be made to the described embodiments without departing from the scope of the invention as claimed below.